Dynamic visual effect generation method and apparatus
The method receives a user interaction instruction and generates a dynamic visual effect by combining procedural rules with background-layer fusion. This addresses the high storage and power consumption of existing technologies and enables efficient, highly interactive generation of dynamic visual effects.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- HUAWEI TECH CO LTD
- Filing Date
- 2025-05-27
- Publication Date
- 2026-06-25
AI Technical Summary
In existing technologies, the generation of dynamic visual effects requires a lot of storage space and high computing power, which leads to increased power consumption and difficulty in optimization, making it difficult to meet the needs of real-time response to user interaction.
On receiving a user interaction instruction, dynamic visual effects are generated using procedural rules; the effects include the geometric, material, and motion features of visual elements. These effects are then fused with a background layer to respond to user interactions in real time, reducing storage requirements and computing power consumption.
It achieves the goal of saving storage space and reducing power consumption on terminal devices, while improving the generation efficiency and interactive fun of dynamic visual effects, and meeting the user's interactive needs for real-time response.
Description
Method and apparatus for generating dynamic visual effects
Technical Field
[0001] This application relates to visual effects generation technology, and more particularly to a method and apparatus for generating dynamic visual effects.
Background Technology
[0002] The theme of a terminal device's (e.g., a mobile phone) operating system is the first thing users see when they turn on the screen, and therefore it is receiving increasing attention. A theme with excellent visual effects can be pleasing to the eye, interesting, and engaging, thus creating user stickiness and a differentiated competitive advantage. More broadly, it can further drive the evolution of various visual effects, such as the development of visual effects in applications like media players, wallpapers, and browsers.
[0003] In related technologies, dynamic visual effects (referred to as motion effects) of themes have evolved from simple static visual effects to realistic dynamic visual effects. For example, weather themes can present weather phenomena such as rain and snow, including effects such as overlapping dark clouds and falling raindrops and snowflakes. Some 3D themes also present character images based on trigger conditions through pre-rendered videos or pre-set animation sequences. However, such presentation methods may require a lot of storage space and high computing power to meet real-time requirements, which can easily lead to increased power consumption and difficulties in optimization.
Summary of the Invention
[0004] This application provides a method and apparatus for generating dynamic visual effects, which saves storage space and reduces power consumption of terminal devices, while meeting the needs of real-time user response and increasing interactive enjoyment.
[0005] In a first aspect, this application provides a method for generating dynamic visual effects, comprising: receiving an interaction instruction, the interaction instruction being generated by a user's interaction with a visual element within a screen space; acquiring a procedural rule corresponding to the interaction instruction, the procedural rule being used to generate a dynamic visual effect after the interaction is applied to the visual element; generating a first layer with a dynamic visual effect according to the procedural rule, the dynamic visual effect including visual features of the visual element, the visual features including geometric features, material features, and motion features; and fusing the first layer with a background layer to obtain an image containing a target visual effect.
[0006] This application uses the procedural rules corresponding to the interaction instruction input by the user to generate dynamic changes in the corresponding visual elements, such as pose changes, shape changes, and qualitative changes, and renders these changes as real-time dynamic visual effects presented to the user. Compared to real-time rendering techniques, this application eliminates the need to store materials during the intermediate process, saving storage space. Furthermore, the computational power required by procedural generation is far less than that of real-time rendering, which improves the efficiency of dynamic visual effect generation and reduces the power consumption of terminal devices. Compared to other rendering techniques based on procedural generation, this application can respond to user interactions in real time and promptly present the dynamic visual effects produced by the user's interactive operations on visual elements, enhancing the interactive experience.
[0007] Interactive commands can be generated by a user's interaction with visual elements within the screen space. The screen space can refer to the screen space of a terminal device, including planar images displayed on the screen, or images with perspective effects. Visual elements are one or more digital elements within the aforementioned screen space, such as raindrops, snowflakes, or particles.
[0008] Users can perform interactive operations on visual elements in the screen space, such as clicking raindrops, smearing snowflakes, blowing away particles, etc. The aforementioned interactive operations can trigger the generation of corresponding interactive instructions, which may include coordinate information in the screen space, touch pressure (size and direction), volume (determined by the strength of the blowing), smearing range, etc.
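The fields carried by an interaction instruction can be pictured as a small record. The following is a minimal sketch, assuming a hypothetical `InteractionInstruction` container; the class, its field names, and the two helper constructors are illustrative and do not come from this application.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InteractionInstruction:
    """Hypothetical container for one multi-modal interaction instruction."""
    kind: str                                        # e.g. "tap", "smear", "blow"
    position: Optional[Tuple[float, float]] = None   # screen-space coordinates
    pressure: Optional[Tuple[float, float]] = None   # (magnitude, direction in radians)
    volume: Optional[float] = None                   # normalized blow strength, 0..1
    smear_radius: Optional[float] = None             # smearing range in pixels


def from_touch(x: float, y: float, force: float, angle: float) -> InteractionInstruction:
    """Build a tap instruction from raw touch values (argument names are assumptions)."""
    return InteractionInstruction(kind="tap", position=(x, y), pressure=(force, angle))


def from_blow(volume: float) -> InteractionInstruction:
    """Build a blow instruction, clamping the measured volume to [0, 1]."""
    return InteractionInstruction(kind="blow", volume=max(0.0, min(1.0, volume)))
```

Fields that do not apply to a given interaction stay `None`, so one record type can cover the different modes.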
[0009] It should be noted that users of this application can perform various interactive operations on visual elements. In addition to the aforementioned interactive operations, other operations may also be included, without specific limitations. Correspondingly, the interactive commands generated by the interactive operations are also multi-modal, without specific limitations. Diverse interactive operations can increase the fun of user interaction, and the resulting multi-modal interactive commands can enrich the content of dynamic visual effects.
[0010] For example, the above interactive operations include at least one of the following:
[0011] (1) Flipping the terminal device;
[0012] For example, the water flow surface is locked to the upper short side of the phone. The water flow then changes direction as the phone is tilted, so that the flow direction follows the actual direction of gravity. When the phone is raised from lying flat to being held up, the water flow speed increases; in the reverse process, the flow speed decreases until the water is still. When the water flow surface changes with the tilt of the phone (i.e., the water flow surface always stays vertical), the water flow direction is always the vertical direction of the world coordinate system.
[0013] (2) Clicking, pressing, or swiping on the screen;
[0014] For example, touch operations detected by the phone's touchscreen sensor, including clicking, pressing, and smearing, cause the blocked water flow to branch, and cause the touched water droplets to flow faster or disperse.
[0015] (3) Blowing into the microphone;
[0016] For example, when a user blows into the phone, the phone's microphone records the real-time change in volume, and the strength of the blowing effect is controlled based on that volume. For instance, the volume can control the speed at which water droplets disperse from a fixed point, or the speed at which fog dissipates from a fixed point. If the phone has multiple microphones, the position at which the blowing force is applied can also be controlled based on the volumes picked up by the individual microphones.
[0017] (4) Eye gaze.
[0018] For example, the location of the water source can be controlled by capturing changes in the user's gaze point through the phone's camera.
[0019] (5) Hand gestures.
[0020] For example, the spatial position changes of a user's hand gestures can be captured by the phone's camera, thereby controlling the water flow density and speed.
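Two of the interaction modes listed above, tilting the device and blowing into the microphone, can be sketched as simple mappings from sensor readings to effect parameters. This is a minimal illustration under stated assumptions: the axis convention (x toward the right of the screen, y toward its top), the linear speed mapping, the volume-weighted centroid for locating the blow, and all function names are hypothetical and are not taken from this application.

```python
import math
from typing import Sequence, Tuple

G = 9.81  # standard gravity in m/s^2


def flow_from_gravity(gx: float, gy: float,
                      max_speed: float = 1.0) -> Tuple[Tuple[float, float], float]:
    """Map the in-screen-plane gravity components to a flow direction and speed.

    Assumed axes: x toward the right of the screen, y toward its top.
    """
    in_plane = math.hypot(gx, gy)      # share of gravity lying in the screen plane
    if in_plane < 1e-9:                # phone lying flat: the water is still
        return (0.0, 0.0), 0.0
    direction = (gx / in_plane, gy / in_plane)
    speed = max_speed * min(1.0, in_plane / G)   # faster as the phone is raised
    return direction, speed


def rms_volume(samples: Sequence[float]) -> float:
    """Root-mean-square level of one block of microphone samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def blow_origin(mic_positions: Sequence[Tuple[float, float]],
                volumes: Sequence[float]) -> Tuple[float, float]:
    """Estimate where the blow acts as the volume-weighted centroid of the microphones."""
    total = sum(volumes)
    if total <= 0.0:
        raise ValueError("at least one microphone must report a positive volume")
    x = sum(p[0] * v for p, v in zip(mic_positions, volumes)) / total
    y = sum(p[1] * v for p, v in zip(mic_positions, volumes)) / total
    return x, y
```

With the phone held upright, `flow_from_gravity(0.0, -G)` yields a flow pointing toward the bottom of the screen at full speed, while a phone lying flat yields zero speed, which matches the behaviour described for the tilt interaction.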
[0021] In this application, procedural rules can generate dynamic visual effects after interactive operations are applied to visual elements in real time by controlling parameters (this is deterministic), or by combining random expressions to generate dynamic visual effects after interactive operations are applied to visual elements in real time (this is diverse). The purpose is to generate corresponding dynamic changes in visual elements, including pose changes, shape changes, and qualitative changes, after the user performs an interactive operation on a visual element, and to render these dynamic changes as a real-time dynamic visual effect presented to the user. Interactive instructions can indicate the procedural parameters in the procedural rules, and / or, interactive instructions can instruct the matching of corresponding expressions from pre-set procedural rules. Therefore, based on the interactive instructions, the procedural rules required for subsequent processing can be determined, thereby generating the corresponding dynamic visual effects.
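The distinction between parameter-controlled (deterministic) rules and rules that combine random expressions (diverse) can be sketched as follows. The rule table, the rule functions, and the scaling constants are assumptions made for illustration only; the application does not specify them.

```python
import random
from typing import Callable, Dict

# A rule maps an interaction strength (0..1) to a droplet speed.
Rule = Callable[[float, random.Random], float]


def deterministic_speed(strength: float, rng: random.Random) -> float:
    """Parameter-controlled rule: the same input always gives the same output."""
    return 2.0 * strength


def randomized_speed(strength: float, rng: random.Random) -> float:
    """Rule combined with a random expression: output varies by up to +/-20 %."""
    return 2.0 * strength * rng.uniform(0.8, 1.2)


# Hypothetical preset table matching an instruction kind to a rule.
RULES: Dict[str, Rule] = {
    "tap": deterministic_speed,
    "blow": randomized_speed,
}


def select_rule(kind: str) -> Rule:
    """Return the preset rule that matches the interaction instruction kind."""
    try:
        return RULES[kind]
    except KeyError:
        raise ValueError(f"no procedural rule registered for {kind!r}") from None
```

Passing a seeded `random.Random` keeps the randomized rule reproducible, which is convenient for testing while still giving varied results across sessions.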
[0022] Dynamic visual effects include the visual characteristics of visual elements, which include geometric features (e.g., the shape and form of the visual element), material features (e.g., the texture, surface, and fill of the visual element), and motion features (e.g., at least one of speed, acceleration, or direction). It should be noted that, in addition to the aforementioned features, the visual characteristics of visual elements may also include other features, such as reflection, refraction, and light transmission. This application does not specifically limit the information included in the visual characteristics.
[0023] In one possible implementation, the method for generating a first layer with dynamic visual effects according to procedural rules may include: obtaining a first procedural parameter; defining procedural primitives corresponding to visual elements on a second layer according to the first procedural parameter, the second layer corresponding to the first layer; and performing procedural animation processing on the procedural primitives according to procedural rules to obtain the first layer (also known as a mask layer).
[0024] Optionally, the first procedural parameter can be set to the default value. For example, it can be set to the first procedural parameter corresponding to a rain motion effect, or the first procedural parameter corresponding to a particle occlusion motion effect.
[0025] Optionally, the first procedural parameter can be set by the user on the interactive interface. For example, the interactive interface includes preset map settings, rainwater density settings, gravity sensor switches, and pressure feedback switches, which the user can set according to their preferences to generate the corresponding first procedural parameter.
[0026] Optionally, the first procedural parameter can be obtained through system services. For example, the weather service in the system can provide real-time weather data, based on which the first procedural parameter corresponding to rain, cloudy, sunny, snow, etc., can be obtained.
[0027] It should be noted that, in addition to the methods described above, this application can also obtain the first procedural parameters through other means, without specific limitations. Multiple parameter setting methods allow for more flexible generation of dynamic visual effects, better meeting user needs.
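The three parameter sources described above (default value, user setting, system service) can be sketched as a single resolution function. The `ProceduralParams` fields, the preset table, and the priority order (user setting first, then weather service, then default) are assumptions for illustration; the application does not prescribe a priority.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ProceduralParams:
    """Hypothetical first procedural parameter set."""
    effect: str = "rain"          # which motion effect to generate
    density: float = 0.5          # e.g. rainwater density, 0..1
    gravity_enabled: bool = True  # gravity sensor switch


# Hypothetical presets keyed by the condition a weather service might report.
WEATHER_PRESETS: Dict[str, ProceduralParams] = {
    "rain": ProceduralParams(effect="rain", density=0.7),
    "snow": ProceduralParams(effect="snow", density=0.4),
    "sunny": ProceduralParams(effect="clear", density=0.0),
}


def resolve_params(user: Optional[ProceduralParams] = None,
                   weather: Optional[str] = None) -> ProceduralParams:
    """Pick the parameter source: user setting, then weather service, then default."""
    if user is not None:
        return user
    if weather in WEATHER_PRESETS:
        return WEATHER_PRESETS[weather]
    return ProceduralParams()
```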
[0028] In this application, the basic geometric primitives of the visual effect can be procedurally defined on the second layer (which can be the initial state of the first layer or the state of the first layer before the completion of this animation rendering, without specific limitations). This results in procedural primitives, which can use basic normal maps to express the geometric features of the corresponding visual elements (e.g., raindrops, fog, frost, etc.). That is, the screen space is divided into multiple layers of meshes in different ways. Each mesh carries several basic primitives based on local mesh coordinates. Different types of effects are expressed using one or more meshes. The mesh shape can be rectangular, polygonal, or any irregular shape, defined by function parameters. The smallest mesh can be a single pixel.
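Dividing the screen space into mesh cells that each carry primitives in local coordinates can be sketched as below. This is a simplified illustration assuming rectangular cells and point-like primitives; the function name and its parameters are hypothetical, and the normal-map representation mentioned above is not modelled.

```python
import random
from typing import List, Tuple


def grid_primitives(width: int, height: int, cell: int,
                    per_cell: int = 1, seed: int = 0) -> List[Tuple[float, float]]:
    """Place primitives on a rectangular mesh, one random local offset per primitive.

    Each primitive is positioned in local cell coordinates (u, v) in [0, 1)
    and then mapped to screen-space pixels.
    """
    if cell <= 0:
        raise ValueError("cell size must be positive")
    rng = random.Random(seed)
    primitives: List[Tuple[float, float]] = []
    for top in range(0, height, cell):
        for left in range(0, width, cell):
            # Cells on the right and bottom edges may be narrower than `cell`.
            w = min(cell, width - left)
            h = min(cell, height - top)
            for _ in range(per_cell):
                u, v = rng.random(), rng.random()   # local cell coordinates
                primitives.append((left + u * w, top + v * h))
    return primitives
```

Calling the function several times with different cell sizes gives several mesh layers, for example a coarse layer for large drops and a fine layer for small ones; a cell size of 1 corresponds to the single-pixel mesh mentioned above.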
[0029] In this application, the method of performing procedural animation processing on procedural primitives according to procedural rules to obtain a first layer may include: obtaining a first visual feature corresponding to the procedural primitive at a first moment; obtaining a second visual feature corresponding to the procedural primitive at a second moment based on the first visual feature and in combination with procedural rules; and obtaining the first layer based on the second visual feature.
[0030] For example, the procedural geometric features, material features, and motion features at time T (the first moment) are received; these correspond to the first visual feature. Combined with physical laws, the procedural geometric features, material features, and motion features at time T+1 (the second moment) are calculated; these correspond to the second visual feature. The first and second visual features are as described for visual features above and differ only in the moment to which they correspond. Then, based on the procedural geometric features, material features, and motion features at time T+1, a series of time-series functions f(t) is provided to drive mesh movement and / or geometric primitive deformation for the procedural primitives represented by one or more mesh layers, thereby obtaining the mask layer.
[0031] The aforementioned physical laws can be, for example, raindrops sliding along the direction of gravity, or icons falling along the direction of gravity. These laws conform to the motion characteristics of real elements in nature. Therefore, dynamic visual effects based on physical laws on the mask layer can enhance the user's sense of realism.
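The step from the visual feature at time T to the visual feature at time T+1 under a physical law such as gravity can be sketched with a simple integrator. This is a minimal illustration, assuming a hypothetical `DropState` record, screen-space coordinates with y increasing downward, and semi-implicit Euler integration; the application does not state which integration scheme it uses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DropState:
    """Visual feature of one raindrop primitive at a given moment."""
    x: float        # position (geometric feature)
    y: float
    vx: float       # velocity (motion feature)
    vy: float
    radius: float   # size (geometric feature)
    opacity: float  # stand-in for a material feature


def step(state: DropState, dt: float,
         gravity: Tuple[float, float] = (0.0, 9.8)) -> DropState:
    """Advance a drop from time T to T + dt using semi-implicit Euler integration."""
    vx = state.vx + gravity[0] * dt   # update velocity first ...
    vy = state.vy + gravity[1] * dt
    x = state.x + vx * dt             # ... then move with the new velocity
    y = state.y + vy * dt
    return DropState(x, y, vx, vy, state.radius, state.opacity)
```

Passing the gravity vector as an argument lets the same step follow the device tilt, so a drop slides along whatever direction gravity currently has on the screen.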
[0032] Optionally, random noise can be added to the definition of procedural primitives. This noise can be used for the procedural generation of irregular graphics, primitive distributions, and irregular motion patterns, so that procedural primitives and their motion or deformation have the randomness characteristics of real elements in nature.
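As one way to picture noise in a primitive definition, the sketch below perturbs the radius of a circular outline so that each drop has an irregular shape. Uniform random jitter is used here as a simple stand-in; the function name, its parameters, and the choice of noise are assumptions, not the application's method.

```python
import math
import random
from typing import List, Tuple


def irregular_outline(cx: float, cy: float, radius: float,
                      jitter: float = 0.2, points: int = 16,
                      seed: int = 0) -> List[Tuple[float, float]]:
    """Return outline vertices of a circle whose radius is randomly perturbed.

    `jitter` is the maximum relative deviation of the radius (0.2 means +/-20 %).
    """
    rng = random.Random(seed)
    outline: List[Tuple[float, float]] = []
    for i in range(points):
        angle = 2.0 * math.pi * i / points
        r = radius * (1.0 + rng.uniform(-jitter, jitter))  # noisy radius
        outline.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return outline
```

The same seeded generator could also jitter primitive positions or velocities to obtain irregular distributions and motion.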
[0033] Optionally, the background layer can be pre-set. For example, a default image provided by the system can be selected as the background layer, or a previously used background image can be selected as the background layer; there are no specific limitations on this. Optionally, the background layer can be entered by the user on the interactive interface. In addition, the user can also upload a local image as the background layer on the interactive interface; there are no specific limitations on this.
[0034] By merging, dynamic visual effects can be displayed on the background layer, such as a rainy street scene or a frosty view outside a window. This makes the dynamic visual effects no longer monotonous and repetitive, but rather have a certain degree of randomness, making them more in line with real-world scenarios. Furthermore, by combining user interactions and displaying the real-time impact of those interactions on procedural primitives, the interactivity of the visual effects can be enhanced.
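The fusion of the first layer with the background layer can be illustrated with ordinary alpha compositing. This is only a sketch under that assumption: the application does not state that fusion is plain alpha blending, and the type aliases and function names are hypothetical. Colour channels and alpha are floats in the range 0..1.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[float, float, float]
RGBA = Tuple[float, float, float, float]


def fuse_pixel(fg: RGBA, bg: RGB) -> RGB:
    """Blend one first-layer pixel over one background pixel using its alpha."""
    r, g, b, a = fg
    return (r * a + bg[0] * (1.0 - a),
            g * a + bg[1] * (1.0 - a),
            b * a + bg[2] * (1.0 - a))


def fuse_layers(first_layer: Sequence[Sequence[RGBA]],
                background: Sequence[Sequence[RGB]]) -> List[List[RGB]]:
    """Fuse two layers of equal size row by row."""
    if len(first_layer) != len(background):
        raise ValueError("layers must have the same number of rows")
    return [[fuse_pixel(f, b) for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(first_layer, background)]
```

Where the first layer is fully transparent, the background shows through unchanged, so only the regions covered by visual elements are altered.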
[0035] In one possible implementation, this application may also obtain the depth information of the background layer; obtain a second procedural parameter that includes the depth information; and, according to the second procedural parameter, simulate the three-dimensional masking effect of the corresponding visual element on the layer in the corresponding screen space to obtain the mask layer.
[0036] In this application, the depth information of the background layer can be used as a parameter in the procedural visual effect masking to participate in the generation and fusion of volumetric visual effects, further generalizing the scenarios for visual effect generation and enhancing the dimensionality of layer fusion. During the procedural rendering stage, volumetric visual effects can utilize multi-frequency noise and depth to simulate the effect of 3D volumetric media in image space. In this process, depth information serves as the carrier of image space information, expanding the 2D image into a 2.5D space, allowing the generation of the volumetric visual effect mask and its fusion with the image to occur in 2.5D space. Since noise- and depth-based volumetric visual effects do not require structured primitives, this embodiment adopts field-based deformation and motion for its interaction method and logic. During interaction, users can not only change the overall movement direction and speed of the volumetric visual effect through clicks or drags, but also influence the rendering results of the volumetric visual effect within a certain distance based on the trajectory of the clicks or drags.
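A depth-driven volumetric mask can be sketched per pixel as a depth-dependent base density modulated by multi-frequency noise. The sketch below is illustrative only: the exponential depth falloff and the sum of sinusoids at doubling frequencies (used as a dependency-free stand-in for a proper noise function) are assumptions, as are the function names.

```python
import math


def multi_frequency_noise(x: float, y: float, octaves: int = 3) -> float:
    """Sum sinusoids at doubling frequencies and normalize the result to [0, 1].

    A simple stand-in for gradient or value noise, chosen to avoid dependencies.
    """
    total, norm = 0.0, 0.0
    amplitude, frequency = 1.0, 1.0
    for _ in range(octaves):
        total += amplitude * math.sin(x * frequency) * math.cos(y * frequency)
        norm += amplitude
        amplitude *= 0.5      # higher frequencies contribute less
        frequency *= 2.0
    return 0.5 + 0.5 * total / norm


def fog_alpha(depth: float, x: float, y: float, density: float = 1.0) -> float:
    """Fog opacity for a pixel: thicker with depth, broken up by noise."""
    base = 1.0 - math.exp(-density * depth)   # assumed exponential depth falloff
    return base * multi_frequency_noise(x, y)
```

Because the opacity depends on the per-pixel depth of the background, distant parts of the image receive more fog than near ones, which is the 2.5D behaviour described above. A click or drag could then be modelled by offsetting `x` and `y` or by locally scaling `density`.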
[0037] In a second aspect, this application provides a dynamic visual effect generation device, comprising: a receiving module for receiving an interaction instruction, the interaction instruction being generated by a user's interaction with a visual element within a screen space; a processing module for acquiring a procedural rule corresponding to the interaction instruction, the procedural rule being used to generate a dynamic visual effect after the interaction is applied to the visual element, and for generating a first layer with a dynamic visual effect according to the procedural rule, the dynamic visual effect including visual features of the visual element, the visual features including geometric features, material features, and motion features; and a fusion module for fusing the first layer with a background layer to obtain an image containing the target visual effect.
[0038] In one possible implementation, the processing module is specifically used to obtain a first procedural parameter; define a procedural primitive corresponding to the visual element on a second layer according to the first procedural parameter, the second layer corresponding to the first layer; and perform procedural animation processing on the procedural primitive according to the procedural rules to obtain the first layer.
[0039] In one possible implementation, the processing module is specifically configured to obtain a first visual feature corresponding to the procedural primitive at a first time step, the first visual feature including a first geometric feature, a first material feature, and a first motion feature; obtain a second visual feature corresponding to the procedural primitive at a second time step based on the first visual feature and in conjunction with the procedural rules, the second visual feature including a second geometric feature, a second material feature, and a second motion feature; and obtain the first layer based on the second visual feature.
[0040] In one possible implementation, the first procedural parameter is set by default; or, the first procedural parameter is set by the user on the interactive interface; or, the first procedural parameter is obtained through a system service.
[0041] In one possible implementation, the interactive operation includes at least one of the following: flipping the terminal device; clicking, pressing, or swiping on the screen; blowing into the microphone; eye gaze; or hand gestures.
[0042] In one possible implementation, the motion feature includes at least one of velocity, acceleration, or direction.
[0043] In one possible implementation, the background layer is preset; or, the background layer is entered by the user on the interactive interface.
[0044] In one possible implementation, the processing module is further configured to: acquire depth information of the background layer; acquire a second procedural parameter, the second procedural parameter including the depth information; and simulate a three-dimensional masking effect of the corresponding visual element on the second layer according to the second procedural parameter to obtain the first layer.
[0045] In a third aspect, this application provides a terminal device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any implementation of the first aspect.
[0046] In a fourth aspect, this application provides a computer-readable storage medium that includes a computer program which, when executed on a computer, causes the computer to perform the method according to any implementation of the first aspect.
[0047] In a fifth aspect, this application provides a computer program product that includes computer program code which, when run on a computer, causes the computer to perform the method according to any implementation of the first aspect.
Attached Figure Description
[0048] Figure 1 shows a schematic diagram of the structure of the terminal device 100;
[0049] Figure 2 is a system structure block diagram of the terminal device 100 of this application;
[0050] Figure 3 is a schematic diagram of the architecture of the network model acquisition system 300 according to an embodiment of this application;
[0051] Figure 4 is a schematic diagram of the CNN structure according to an embodiment of this application;
[0052] Figure 5 is a flowchart of the dynamic visual effect generation method 500 provided in the embodiment of this application;
[0053] Figure 6a is a schematic diagram of the gravity interaction operation of this application;
[0054] Figure 6b is a schematic diagram of the click-touch interaction operation of this application;
[0055] Figure 6c is a schematic diagram of the finger smearing touch interaction operation of this application;
[0056] Figure 6d is a schematic diagram of the air blowing interaction operation of this application;
[0057] Figures 7a-7e are schematic diagrams of the interactive interface of this application;
[0058] Figure 8a is a schematic diagram of the fast irregular primitive representation method based on regular primitive normal UV sampling offset of this application;
[0059] Figure 8b is a schematic diagram of the fast primitive random irregular distribution generation algorithm based on nesting in this application;
[0060] Figure 8c is a schematic diagram of parallel fast intersection detection within the fragment shader of this application;
[0061] Figure 8d is a schematic diagram of the procedural lighting and shadow generation of this application;
[0062] Figure 9 is a schematic diagram of layer fusion in this application;
[0063] Figures 10a-10c are schematic diagrams of this application generating a raindrop mask effect using procedural techniques;
[0064] Figure 11 is a flowchart of this application generating an interactive mask layer using procedural techniques;
[0065] Figure 12 is a flowchart of the theme application of this application;
[0066] Figure 13 is a flowchart of the animation visual effect generation process for the raindrop theme in this application;
[0067] Figures 14a and 14b are schematic diagrams of continuous playback of motion visual effects in this application;
[0068] Figure 15 is a flowchart of the procedural volumetric visual effect control generation and fusion based on image depth in this application;
[0069] Figures 16a and 16b are schematic diagrams of the fog effect interaction of this application;
[0070] Figure 17 is a structural schematic diagram of the dynamic visual effects generation device 1700 of this application;
[0071] Figure 18 shows a schematic block diagram of an apparatus 1800 according to an embodiment of this application.
Detailed Implementation
[0072] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0073] The terms "first," "second," etc., used in the specification, embodiments, claims, and drawings of this application are for distinguishing purposes only and should not be construed as indicating or implying relative importance or order. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion, such as including a series of steps or units. A method, system, product, or apparatus is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to these processes, methods, products, or apparatuses.
[0074] It should be understood that in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And / or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and / or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.
[0075] Before describing the technical solutions of the embodiments of this application, the application scenarios of the embodiments are first described with reference to the accompanying drawings. This application can be applied to the generation of dynamic visual effects for system themes on terminal devices. For example, a weather theme can present weather phenomena such as rain and snow, including dynamic visual effects such as overlapping dark clouds and falling raindrops and snowflakes (hereinafter referred to as motion effects). Alternatively, this application can be applied to applications (APPs) such as media players and weather applications, for example, displaying colored lights that flash in sync with the rhythm of the music on the playback interface of a music player. It can also be applied to other scenarios that require dynamic visual effects, which this application does not specifically limit. The aforementioned scenarios can be implemented as system services on terminal devices such as mobile phones, tablets, and headphones, or they can be added to relevant APPs, such as weather APPs, through application programming interfaces (APIs).
[0076] Figure 1 shows a schematic diagram of the terminal device 100. It should be understood that the terminal device 100 shown in Figure 1 is merely an example, and the terminal device 100 may have more or fewer components than shown in the figure, may combine two or more components, or may have different component configurations. The various components shown in Figure 1 can be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and / or application-specific integrated circuits.
[0077] Terminal device 100 may include: processor 110, external memory interface 120, internal memory 121, universal serial bus (USB) interface 130, charging management module 140, power management module 141, battery 142, antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone jack 170D, sensor module 180, button 190, motor 191, indicator 192, camera 193, display screen 194, and subscriber identification module (SIM) card interface 195, etc. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an accelerometer sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
[0078] Processor 110 may include one or more processing units, such as: application processor (AP), modem processor, graphics processing unit (GPU), image signal processor (ISP), controller, memory, video codec, digital signal processor (DSP), baseband processor, and / or neural network processing unit (NPU), etc. Different processing units may be independent devices or integrated into one or more processors.
[0079] The controller can serve as the central nervous system and command center of the terminal device 100. The controller can generate operation control signals based on the instruction opcode and timing signals to control the fetching and execution of instructions.
[0080] The processor 110 may also include a memory for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. This memory can store instructions or data that the processor 110 has just used or that are used repeatedly. If the processor 110 needs to use the instruction or data again, it can retrieve it directly from the memory. This avoids repeated accesses, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
[0081] In some embodiments, the processor 110 may include one or more interfaces. Interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver / transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input / output (GPIO) interface, a subscriber identity module (SIM) interface, and / or a universal serial bus (USB) interface, etc.
[0082] The I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may include multiple I2C buses. The processor 110 can couple to the touch sensor 180K, charger, flash, camera 193, etc., through different I2C bus interfaces. For example, the processor 110 can couple to the touch sensor 180K through the I2C interface, enabling the processor 110 and the touch sensor 180K to communicate through the I2C bus interface, thereby realizing the touch function of the terminal device 100.
[0083] The I2S interface can be used for audio communication. In some embodiments, the processor 110 may include multiple I2S buses. The processor 110 can be coupled to the audio module 170 via the I2S bus to enable communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 can transmit audio signals to the wireless communication module 160 via the I2S interface to enable the function of answering phone calls through a Bluetooth headset.
[0084] The PCM interface can also be used for audio communication, sampling, quantizing, and encoding analog signals. In some embodiments, the audio module 170 and the wireless communication module 160 can be coupled via the PCM bus interface. In some embodiments, the audio module 170 can also transmit audio signals to the wireless communication module 160 via the PCM interface, enabling the function of answering phone calls through a Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
[0085] The UART interface is a universal serial data bus used for asynchronous communication. This bus can be a bidirectional communication bus. It converts the data to be transmitted between serial and parallel communication. In some embodiments, the UART interface is typically used to connect the processor 110 and the wireless communication module 160. For example, the processor 110 communicates with the Bluetooth module in the wireless communication module 160 via the UART interface to implement Bluetooth functionality. In some embodiments, the audio module 170 can transmit audio signals to the wireless communication module 160 via the UART interface to enable music playback through Bluetooth headphones.
[0086] The MIPI interface can be used to connect the processor 110 to peripheral devices such as the display screen 194 and the camera 193. The MIPI interface includes a camera serial interface (CSI) and a display serial interface (DSI). In some embodiments, the processor 110 and the camera 193 communicate via the CSI interface to enable the shooting function of the terminal device 100. The processor 110 and the display screen 194 communicate via the DSI interface to enable the display function of the terminal device 100.
[0087] The GPIO interface can be configured via software. It can be configured as a control signal or a data signal. In some embodiments, the GPIO interface can be used to connect the processor 110 to a camera 193, a display screen 194, a wireless communication module 160, an audio module 170, a sensor module 180, etc. The GPIO interface can also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, etc.
[0088] The USB interface 130 is an interface that complies with the USB standard, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, etc. The USB interface 130 can be used to connect a charger to charge the terminal device 100, and can also be used for data transfer between the terminal device 100 and peripheral devices. It can also be used to connect headphones for audio playback, or to connect other terminal devices, such as AR devices.
[0089] It is understood that the interface connection relationships between the modules illustrated in the embodiments of this application are merely illustrative and do not constitute a structural limitation on the terminal device 100. In other embodiments of this application, the terminal device 100 may also adopt different interface connection methods or a combination of multiple interface connection methods as described in the above embodiments.
[0090] The charging management module 140 receives charging input from a charger. The charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 receives charging input from the wired charger via the USB interface 130. In some wireless charging embodiments, the charging management module 140 receives wireless charging input via the wireless charging coil of the terminal device 100. While charging the battery 142, the charging management module 140 can also supply power to the terminal device via the power management module 141.
[0091] The power management module 141 connects the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and / or the charging management module 140, providing power to the processor 110, internal memory 121, external memory, display screen 194, camera 193, and wireless communication module 160, etc. The power management module 141 can also monitor parameters such as battery capacity, battery cycle count, and battery health status (leakage current, impedance). In some other embodiments, the power management module 141 may also be located within the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may be located in the same device.
[0092] The wireless communication function of the terminal device 100 can be implemented through antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, modem processor and baseband processor, etc.
[0093] Antennas 1 and 2 are used to transmit and receive electromagnetic wave signals. Each antenna in terminal device 100 can be used to cover one or more communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization. For example, antenna 1 can be multiplexed as a diversity antenna for a wireless local area network. In some other embodiments, the antennas can be used in conjunction with a tuning switch.
[0094] The mobile communication module 150 can provide solutions for wireless communication, including 2G / 3G / 4G / 5G, applied to the terminal device 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc. The mobile communication module 150 can receive electromagnetic waves via antenna 1, and perform filtering, amplification, and other processing on the received electromagnetic waves before transmitting them to a modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation via antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be housed in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 and at least some modules of the processor 110 may be housed in the same device.
[0095] The modem processor may include a modulator and a demodulator. The modulator modulates the low-frequency baseband signal to be transmitted into a mid-to-high frequency signal. The demodulator demodulates the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After processing by the baseband processor, the low-frequency baseband signal is transmitted to the application processor. The application processor outputs sound signals through an audio device (not limited to speaker 170A, receiver 170B, etc.) or displays images or videos through the display screen 194. In some embodiments, the modem processor may be a separate device. In other embodiments, the modem processor may be independent of the processor 110 and may be housed in the same device as the mobile communication module 150 or other functional modules.
[0096] The wireless communication module 160 can provide solutions for wireless communication applications on the terminal device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technologies. The wireless communication module 160 can be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via antenna 2, performs frequency modulation and filtering of the electromagnetic wave signals, and sends the processed signal to processor 110. The wireless communication module 160 can also receive signals to be transmitted from processor 110, perform frequency modulation and amplification, and convert them into electromagnetic waves for radiation via antenna 2.
[0097] In some embodiments, antenna 1 of terminal device 100 is coupled to mobile communication module 150, and antenna 2 is coupled to wireless communication module 160, enabling terminal device 100 to communicate with networks and other devices via wireless communication technology. The wireless communication technology may include Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time-Division Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and / or IR technologies, etc. The GNSS may include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS), and / or satellite-based augmentation systems (SBAS).
[0098] Terminal device 100 implements display functions through a GPU, display screen 194, and application processor. The GPU is a microprocessor for image processing, connected to the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations and for graphics rendering. Processor 110 may include one or more GPUs, which execute program instructions to generate or modify display information.
[0099] The display screen 194 is used to display images, videos, etc. The display screen 194 includes a display panel. The display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a quantum dot light-emitting diode (QLED), etc. In some embodiments, the terminal device 100 may include one or N display screens 194, where N is a positive integer greater than 1.
[0100] Terminal device 100 can perform shooting functions through ISP, camera 193, video codec, GPU, display 194 and application processor.
[0101] The ISP (Image Signal Processor) is used to process data fed back from the camera 193. For example, when taking a picture, the shutter is opened, and light is transmitted through the lens to the camera's photosensitive element. The light signal is converted into an electrical signal, and the camera's photosensitive element transmits the electrical signal to the ISP for processing, transforming it into an image visible to the naked eye. The ISP can also perform algorithmic optimization of image noise, brightness, and skin tone. The ISP can also optimize parameters such as exposure and color temperature of the shooting scene. In some embodiments, the ISP can be set in the camera 193.
[0102] Camera 193 is used to capture still images or videos. An object generates an optical image through the lens, and the optical image is projected onto a photosensitive element. The photosensitive element can be a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the light signal into an electrical signal, which is then passed to an ISP for conversion into a digital image signal. The ISP outputs the digital image signal to a DSP for processing. The DSP converts the digital image signal into image signals in standard RGB, YUV, or other formats. In some embodiments, the terminal device 100 may include one or N cameras 193, where N is a positive integer greater than 1.
[0103] A digital signal processor (DSP) is used to process digital signals. Besides digital image signals, it can also process other digital signals. For example, when terminal device 100 selects a frequency point, the DSP can perform a Fourier transform on the energy of that frequency point.
[0104] Video codecs are used to compress or decompress digital video. Terminal device 100 may support one or more video codecs. Thus, terminal device 100 can play or record videos in various encoding formats, such as Moving Picture Experts Group (MPEG) 1, MPEG 2, MPEG 3, MPEG 4, etc.
[0105] NPU stands for Neural Network (NN) Computing Processor. By borrowing the structure of biological neural networks, such as the transmission patterns between neurons in the human brain, it can rapidly process input information and continuously learn on its own. NPUs enable intelligent cognitive applications in terminal devices, such as image recognition, facial recognition, speech recognition, and text understanding.
[0106] The external storage interface 120 can be used to connect an external storage card, such as a Micro SD card, to expand the storage capacity of the terminal device 100. The external storage card communicates with the processor 110 through the external storage interface 120 to perform data storage functions. For example, music, video, and other files can be saved on the external storage card.
[0107] Internal memory 121 can be used to store computer executable program code, which includes instructions. Processor 110 executes various functional applications and data processing of terminal device 100 by running the instructions stored in internal memory 121. Internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system, at least one application program required for a function (such as sound playback, image playback, etc.), etc. The data storage area may store data created during the use of terminal device 100 (such as audio data, phonebook, etc.). Furthermore, internal memory 121 may include high-speed random access memory and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc.
[0108] Terminal device 100 can implement audio functions, such as music playback and recording, through audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone jack 170D, and application processor.
[0109] The audio module 170 is used to convert digital audio information into analog audio signals for output, and also to convert analog audio input into digital audio signals. The audio module 170 can also be used for encoding and decoding audio signals. In some embodiments, the audio module 170 may be located in the processor 110, or some functional modules of the audio module 170 may be located in the processor 110.
[0110] The speaker 170A, also known as a "loudspeaker," is used to convert audio electrical signals into sound signals. A user of the terminal device 100 can listen to music or take hands-free calls through the speaker 170A.
[0111] The receiver 170B, also known as the "earpiece," is used to convert audio electrical signals into sound signals. When the terminal device 100 answers a phone call or voice message, the receiver 170B can be brought close to the listener's ear to hear the voice.
[0112] Microphone 170C, also known as a "mic" or "sound transducer," is used to convert sound signals into electrical signals. When making a phone call or sending a voice message, the user can speak with their mouth close to microphone 170C, inputting the sound signal into microphone 170C. Terminal device 100 may be equipped with at least one microphone 170C. In some embodiments, terminal device 100 may be equipped with two microphones 170C, which, in addition to collecting sound signals, can also perform noise reduction. In other embodiments, terminal device 100 may be equipped with three, four, or more microphones 170C, which can collect sound signals, reduce noise, identify the sound source, and perform directional recording, etc.
[0113] The headphone jack 170D is used to connect wired headphones. The headphone jack 170D can be the USB interface 130, a 3.5 mm Open Mobile Terminal Platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
[0114] Pressure sensor 180A is used to sense pressure signals and convert them into electrical signals. In some embodiments, pressure sensor 180A can be disposed on display screen 194. There are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates with conductive material. When force is applied to pressure sensor 180A, the capacitance between the electrodes changes. Terminal device 100 determines the pressure intensity based on the change in capacitance. When a touch operation is applied to display screen 194, terminal device 100 detects the intensity of the touch operation based on pressure sensor 180A. Terminal device 100 can also calculate the touch position based on the detection signal from pressure sensor 180A. In some embodiments, touch operations applied to the same touch position but with different touch operation intensities can correspond to different operation commands. For example: when a touch operation with an intensity less than a first pressure threshold is applied to the SMS application icon, a command to view an SMS is executed. When a touch operation with an intensity greater than or equal to the first pressure threshold is applied to the SMS application icon, a command to create a new SMS is executed.
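The threshold-based mapping from touch intensity to operation command described above can be sketched as follows. This is a minimal illustration only: the normalized intensity scale, the threshold value, and the command names are hypothetical and are not specified in this application.

```python
# Hypothetical values for illustration; the application does not specify them.
FIRST_PRESSURE_THRESHOLD = 0.5  # assumed normalized intensity in the range 0..1


def sms_icon_command(intensity: float,
                     threshold: float = FIRST_PRESSURE_THRESHOLD) -> str:
    """Map the intensity of a touch on the SMS icon to an operation command."""
    if intensity < threshold:
        # Intensity below the first pressure threshold: view the SMS.
        return "view_sms"
    # Intensity greater than or equal to the threshold: create a new SMS.
    return "create_sms"
```

The same touch position thus yields different commands depending only on the measured intensity.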
[0115] The gyroscope sensor 180B can be used to determine the motion attitude of the terminal device 100. In some embodiments, the gyroscope sensor 180B can determine the angular velocity of the terminal device 100 around three axes (i.e., the x, y, and z axes). The gyroscope sensor 180B can be used for image stabilization. For example, when the shutter is pressed, the gyroscope sensor 180B detects the angle of the terminal device 100's shake, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to counteract the shake of the terminal device 100 through reverse movement, thus achieving image stabilization. The gyroscope sensor 180B can also be used in navigation and motion-sensing game scenarios.
[0116] The barometric pressure sensor 180C is used to measure air pressure. In some embodiments, the terminal device 100 calculates altitude using the air pressure value measured by the barometric pressure sensor 180C to assist in positioning and navigation.
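One possible way to derive altitude from a measured air pressure value is the commonly used international barometric formula, sketched below. The application does not state which formula the terminal device 100 uses, so the formula, the sea-level reference pressure, and the function name are assumptions made for illustration.

```python
# Assumed standard sea-level reference pressure in hectopascals.
SEA_LEVEL_PRESSURE_HPA = 1013.25


def altitude_m(pressure_hpa: float,
               sea_level_hpa: float = SEA_LEVEL_PRESSURE_HPA) -> float:
    """Estimate altitude in meters from air pressure (barometric formula)."""
    if pressure_hpa <= 0 or sea_level_hpa <= 0:
        raise ValueError("pressures must be positive")
    # Lower pressure relative to the reference gives a higher altitude.
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))
```

In practice the reference pressure varies with weather, which is one reason the result is used to assist positioning rather than replace it.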
[0117] The magnetic sensor 180D includes a Hall sensor. The terminal device 100 can use the magnetic sensor 180D to detect the opening and closing of a flip cover. In some embodiments, when the terminal device 100 is a flip phone, the terminal device 100 can detect the opening and closing of the flip using the magnetic sensor 180D. Features such as automatic unlocking upon flipping open can then be set based on the detected open or closed state of the cover or the flip.
[0118] The acceleration sensor 180E can detect the magnitude of acceleration of the terminal device 100 in various directions (typically three axes). When the terminal device 100 is stationary, it can detect the magnitude and direction of gravity. It can also be used to identify the attitude of the terminal device, and can be applied to applications such as landscape / portrait switching and pedometers.
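Landscape / portrait switching from the measured direction of gravity can be sketched as follows. This is a simplified illustration: the axis convention (x along the short edge of the screen, y along the long edge) and the function name are assumptions, and a practical implementation would typically also filter the signal and add hysteresis.

```python
def screen_orientation(ax: float, ay: float) -> str:
    """Classify orientation from gravity components along the device axes.

    ax: acceleration along the short edge of the screen (assumed x axis).
    ay: acceleration along the long edge of the screen (assumed y axis).
    """
    # Gravity mostly along the long edge means the device is held upright.
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```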
[0119] A distance sensor 180F is used to measure distance. The terminal device 100 can measure distance via infrared or laser. In some embodiments, during a shooting scene, the terminal device 100 can utilize the distance sensor 180F to measure distance for rapid focusing.
[0120] The proximity sensor 180G may include, for example, a light-emitting diode (LED) and a light detector, such as a photodiode. The LED may be an infrared LED. The terminal device 100 emits infrared light outward through the LED. The terminal device 100 uses the photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the terminal device 100. When insufficient reflected light is detected, the terminal device 100 can determine that there is no object near the terminal device 100. The terminal device 100 may use the proximity sensor 180G to detect when a user holds the terminal device 100 close to their ear for a call, so as to automatically turn off the screen to save power. The proximity sensor 180G can also be used in holster mode and pocket mode for automatic unlocking and screen locking.
[0121] The ambient light sensor 180L is used to sense the ambient light intensity. The terminal device 100 can adaptively adjust the brightness of the display screen 194 based on the sensed ambient light intensity. The ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures. The ambient light sensor 180L can also work with the proximity sensor 180G to detect whether the terminal device 100 is in a pocket to prevent accidental touches.
[0122] The fingerprint sensor 180H is used to collect fingerprints. The terminal device 100 can use the collected fingerprint characteristics to achieve fingerprint unlocking, accessing application locks, taking photos with fingerprints, answering calls with fingerprints, etc.
[0123] Temperature sensor 180J is used to detect temperature. In some embodiments, terminal device 100 uses the temperature detected by temperature sensor 180J to execute a temperature handling strategy. For example, when the temperature reported by temperature sensor 180J exceeds a threshold, terminal device 100 reduces the performance of the processor located near temperature sensor 180J to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is below another threshold, terminal device 100 heats battery 142 to prevent abnormal shutdown of terminal device 100 due to low temperature. In still other embodiments, when the temperature is below yet another threshold, terminal device 100 boosts the output voltage of battery 142 to prevent abnormal shutdown due to low temperature.
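The temperature handling strategy above can be sketched as a set of threshold checks. The threshold values and action names below are hypothetical, and combining the three embodiments in a single function is an assumption made for illustration; the application describes them as separate embodiments.

```python
from typing import List

# Hypothetical thresholds in degrees Celsius; not specified in the application.
HIGH_TEMP_C = 45.0        # above this, reduce processor performance
LOW_TEMP_HEAT_C = 0.0     # below this, heat the battery
LOW_TEMP_BOOST_C = -10.0  # below this, boost the battery output voltage


def thermal_actions(temp_c: float) -> List[str]:
    """Return the actions triggered by a reported temperature."""
    actions: List[str] = []
    if temp_c > HIGH_TEMP_C:
        actions.append("throttle_processor")
    if temp_c < LOW_TEMP_HEAT_C:
        actions.append("heat_battery")
    if temp_c < LOW_TEMP_BOOST_C:
        actions.append("boost_battery_output_voltage")
    return actions
```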
[0124] Touch sensor 180K, also known as a "touch panel," can be located on display screen 194. The touch sensor 180K and display screen 194 together form a touchscreen. Touch sensor 180K detects touch operations applied to or near it. The touch sensor can transmit the detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation can be provided through display screen 194. In other embodiments, touch sensor 180K may also be located on the surface of terminal device 100, in a different position than display screen 194.
[0125] The bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire vibration signals from the bone that vibrates with the human vocal part. The bone conduction sensor 180M can also contact the human pulse to receive blood pressure pulse signals. In some embodiments, the bone conduction sensor 180M can also be incorporated into headphones to form bone conduction headphones. The audio module 170 can parse voice signals from the vibration signals of the vocal-part bone acquired by the bone conduction sensor 180M to realize voice functionality. The application processor can parse heart rate information from the blood pressure pulse signals acquired by the bone conduction sensor 180M to realize heart rate detection functionality.
[0126] Buttons 190 include a power button, volume buttons, etc. Buttons 190 can be mechanical buttons or touch-sensitive buttons. Terminal device 100 can receive button input and generate key signal inputs related to user settings and function control of terminal device 100.
[0127] Motor 191 can generate vibration alerts. Motor 191 can be used for incoming call vibration alerts or for touch vibration feedback. For example, different vibration feedback effects can correspond to touch operations performed on different applications (such as taking photos, playing audio, etc.). Motor 191 can also correspond to different vibration feedback effects for touch operations performed on different areas of the display screen 194. Different application scenarios (such as time reminders, receiving messages, alarm clocks, games, etc.) can also correspond to different vibration feedback effects. The touch vibration feedback effect can also be customized.
[0128] Indicator 192 can be an indicator light, used to indicate charging status, power changes, or to indicate messages, missed calls, notifications, etc.
[0129] The SIM card interface 195 is used to connect a SIM card. The SIM card can be inserted into or removed from the SIM card interface 195 to make contact with and separate from the terminal device 100. The terminal device 100 can support one or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, etc. Multiple cards can be inserted into the same SIM card interface 195 simultaneously. The multiple cards can be of the same or different types. The SIM card interface 195 is also compatible with different types of SIM cards. The SIM card interface 195 is also compatible with external memory cards. The terminal device 100 interacts with the network through the SIM card to realize functions such as calls and data communication. In some embodiments, the terminal device 100 uses an eSIM, i.e., an embedded SIM card. The eSIM card can be embedded in the terminal device 100 and cannot be separated from the terminal device 100.
[0130] The software system of terminal device 100 can adopt a layered architecture, event-driven architecture, microkernel architecture, microservice architecture, or cloud architecture. The embodiments of this application take an Android system with a layered architecture as an example to illustrate the system structure of terminal device 100.
[0131] Figure 2 is a system architecture block diagram of the terminal device 100 of this application. As shown in Figure 2, the layered architecture of the terminal device 100 divides the system into several layers, each with a clear role and division of labor. Layers communicate with each other through software interfaces. In some embodiments, the system is divided into three layers, from top to bottom: the application layer, the Android runtime and system libraries, and the kernel layer. Dynamic visual effects engine (also known as motion visual effects engine)
[0132] The application layer can include a series of application packages.
[0133] As shown in Figure 2, the application package can include applications such as themes, media, games, weather, browsers, cameras, calendars, galleries, calls, maps, and text messaging.
[0134] The Android Runtime consists of core libraries and a virtual machine. The Android runtime is responsible for the scheduling and management of the Android system.
[0135] The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the Android core library.
[0136] The application layer runs in a virtual machine. The virtual machine executes the Java files in the application layer as binary files. The virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
[0137] The system library can include multiple functional modules. For example: surface manager, physics engine, 3D rendering engine (e.g., OpenGL ES), 2D rendering engine (e.g., SGL), procedural rendering engine, etc.
[0138] The Surface Manager is used to manage the display subsystem and provides the blending of 2D and 3D layers for multiple applications.
[0139] A 3D rendering engine is a graphics engine used for 3D drawing, image rendering, compositing, and layer processing.
[0140] A 2D rendering engine is a drawing engine for 2D graphics, used to implement two-dimensional graphics drawing, image rendering, compositing, and layer processing.
[0141] The procedural rendering engine can call the underlying graphics driver and GPU to provide general dynamic visual rendering results for upper-layer applications.
[0142] The kernel layer is the layer between hardware and software. The kernel layer includes at least display drivers, sensor drivers, graphics drivers, sensors, and GPUs.
[0143] It is understood that the components included in the system framework layer, system library, and runtime layer shown in Figure 2 do not constitute a specific limitation on the terminal device 100. In other embodiments of this application, the terminal device 100 may include more or fewer components than shown in the figure, or combine some components, or split some components, or have different component arrangements.
[0144] This application relates to the application of procedural generation technology. To facilitate understanding, some terms or nouns used in procedural generation technology will be explained first, and these terms or nouns are also part of the content of the invention.
[0145] Procedural generation is a technology that creates digital content based on algorithms rather than manual methods. Its core is the functional mathematical expression and parameterized control of digital content. It can generate deterministic content in real time by controlling parameters, or it can combine random functions to generate diverse results.
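The two modes described above, deterministic generation from control parameters and diverse generation by combining random functions, can be sketched as follows. The ring-of-points example, the function name, and its parameters are hypothetical and serve only to illustrate the idea; they are not the generation rules of this application.

```python
import math
import random
from typing import List, Optional, Tuple


def ring_points(count: int, radius: float, jitter: float = 0.0,
                seed: Optional[int] = None) -> List[Tuple[float, float]]:
    """Generate points on a ring from parameters instead of stored assets.

    With jitter == 0 the result depends only on count and radius.
    With jitter > 0 a random function perturbs each radius; a fixed seed
    makes the varied result reproducible.
    """
    rng = random.Random(seed)
    points = []
    for i in range(count):
        angle = 2.0 * math.pi * i / count
        r = radius + (rng.uniform(-jitter, jitter) if jitter else 0.0)
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points
```

Only the parameters need to be stored; the content itself is recomputed on demand.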
[0146] Procedural rendering: Procedural rendering is a process based on procedural generation technology that expresses all modules of the traditional pipeline, including geometry, materials, lighting, physics, animation, camera, post-processing, etc., in the form of functions, and controls the input and output results of each module in real time through parameters to finally obtain the rendering output.
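As a rough illustration of expressing pipeline modules as functions controlled by parameters, the sketch below renders a moving circle in screen space. Only geometry, animation, and material are modeled; the module signatures, parameter names, and the choice of a signed distance function are assumptions made for illustration and do not represent the rendering engine of this application.

```python
import math


def geometry(x, y, params):
    """Geometry module: signed distance to a circle (negative inside)."""
    cx, cy = params["center"]
    return math.hypot(x - cx, y - cy) - params["radius"]


def animation(params, t):
    """Animation module: move the circle center along x as time t advances."""
    moved = dict(params)
    cx, cy = params["center"]
    moved["center"] = (cx + params["speed"] * t, cy)
    return moved


def material(distance, params):
    """Material module: shape color inside, background color outside."""
    return params["color"] if distance <= 0.0 else params["background"]


def render(width, height, params, t):
    """Evaluate the modules at every pixel center to produce one frame."""
    frame_params = animation(params, t)
    return [[material(geometry(x + 0.5, y + 0.5, frame_params), frame_params)
             for x in range(width)]
            for y in range(height)]
```

Changing a parameter such as `radius` or `speed` changes the output frame directly, without any pre-rendered image resources.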
[0147] Screen Space: A 2D planar coordinate space constructed using screen pixels as the unit.
[0148] Material: In computer graphics, material specifically refers to the optical properties of an object. Based on the interaction characteristics between light and the object, materials are mainly divided into surface materials (such as metal, cloth, wood, plastic, etc.), intervening materials (such as glass, wax, water), and volumetric materials (clouds, fog, atmosphere, etc.).
[0149] This application also relates to the application of artificial intelligence (AI). For ease of understanding, some terms or nouns used in AI technology will be explained below, and these terms or nouns are also part of the invention.
[0150] (1) Neural Network
[0151] Neural Networks (NNs) are machine learning models. A neural network can be composed of neural units, which are computational units that take $x_s$ and an intercept of 1 as input. The output of such a computational unit can be: $h_{W,b}(x) = f(W^{T}x) = f\left(\sum_{s=1}^{n} W_s x_s + b\right)$
[0152] Where s = 1, 2, ..., n, where n is a natural number greater than 1, $W_s$ is the weight of $x_s$, and b is the bias of the neural unit. f is the activation function of the neural unit, used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into the output signal. The output signal of this activation function can be used as the input of the next convolutional layer. The activation function can be the sigmoid function. A neural network is a network formed by connecting many of the above-mentioned individual neural units together, that is, the output of one neural unit can be the input of another neural unit. The input of each neural unit can be connected to the local receptive field of the previous layer to extract the features of the local receptive field, which can be a region composed of several neural units.
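The output of a single neural unit, an activation function applied to the weighted sum of the inputs plus the bias, can be sketched as follows. The function names are chosen for illustration; the sigmoid is used because it is the activation function named above.

```python
import math
from typing import Callable, Sequence


def sigmoid(z: float) -> float:
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(xs: Sequence[float], ws: Sequence[float], b: float,
                f: Callable[[float], float] = sigmoid) -> float:
    """Compute f(sum_s W_s * x_s + b) for one neural unit."""
    if len(xs) != len(ws):
        raise ValueError("xs and ws must have the same length")
    return f(sum(w * x for w, x in zip(ws, xs)) + b)
```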
[0153] (2) Deep Neural Networks
[0154] Deep neural networks (DNNs), also known as multilayer neural networks, can be understood as neural networks with many hidden layers, though there is no specific metric for "many." The layers of a DNN fall into three categories according to their position: the input layer, the hidden layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and the layers in between are hidden layers. Adjacent layers are fully connected, meaning that any neuron in the i-th layer is connected to any neuron in the (i+1)-th layer. Although DNNs appear complex, the operation of each layer is actually quite simple and can be written as the following linear relationship followed by an activation: $\vec{y} = \alpha(W\vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, W is the weight matrix (also called coefficients), and α() is the activation function. Each layer simply applies this operation to the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because DNNs have many layers, the number of coefficients W and offset vectors $\vec{b}$ is also large. These parameters are defined in a DNN as follows, taking the coefficient W as an example. Assuming a three-layer DNN, the linear coefficient from the 4th neuron in the second layer to the 2nd neuron in the third layer is defined as $W^{3}_{24}$. The superscript 3 represents the layer in which the coefficient W resides, while the subscript corresponds to the output index 2 in the third layer and the input index 4 in the second layer. In summary, the coefficient from the k-th neuron in layer L-1 to the j-th neuron in layer L is defined as $W^{L}_{jk}$. Note that the input layer does not have a W parameter. In deep neural networks, more hidden layers allow the network to better represent complex real-world situations. Theoretically, the more parameters a model has, the higher its complexity and "capacity," meaning it can perform more complex learning tasks. Training a deep neural network is essentially the process of learning the weight matrices, with the ultimate goal of obtaining the weight matrices of all layers in the trained deep neural network (the weight matrices W of the many layers).
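The per-layer operation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the activation is taken to be ReLU purely as an example (this application does not fix a particular α()), and the names `relu` and `dense_layer` are hypothetical. The entry `W[j][k]` plays the role of the coefficient from the k-th neuron of the previous layer to the j-th neuron of the current layer.

```python
def relu(z):
    """Example activation function (an assumption; any activation could be used)."""
    return max(0.0, z)


def dense_layer(W, b, x, activation=relu):
    """Compute y = activation(W x + b) for one fully connected layer.

    W[j][k] is the coefficient from input neuron k to output neuron j,
    b[j] is the offset of output neuron j, and x is the input vector.
    """
    if len(b) != len(W) or any(len(row) != len(x) for row in W):
        raise ValueError("shape mismatch between W, b and x")
    return [
        activation(sum(W[j][k] * x[k] for k in range(len(x))) + b[j])
        for j in range(len(W))
    ]
```

Stacking several such calls, with the output of one layer used as the input of the next, gives the forward pass of a small DNN.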
[0155] (3) Convolutional Neural Network
[0156] A convolutional neural network (CNN) is a deep neural network with convolutional structures. It is a deep learning architecture, which refers to learning at multiple levels of abstraction using machine learning algorithms. As a deep learning architecture, CNN is a feed-forward artificial neural network, where each neuron responds to an input image. A CNN contains a feature extractor consisting of convolutional layers and pooling layers. This feature extractor can be viewed as a filter, and the convolution process can be seen as performing convolution with a trainable filter and an input image or a convolutional feature map.
[0157] A convolutional layer is a layer of neurons in a convolutional neural network that performs convolution processing on the input signal. A convolutional layer can contain multiple convolution operators, also called kernels. In image processing, these operators act as filters, extracting specific information from the input image matrix. Essentially, a convolution operator can be a weight matrix, which is usually predefined. During the convolution operation, the weight matrix typically processes the input image pixel by pixel (or two pixels by two pixels, depending on the stride) along the horizontal direction, thus extracting specific features from the image. The size of the weight matrix should be related to the image size. It's important to note that the depth dimension of the weight matrix is the same as the depth dimension of the input image; during the convolution operation, the weight matrix extends to the entire depth of the input image. Therefore, convolving with a single weight matrix produces a single-depth convolutional output. However, in most cases, multiple weight matrices of the same size (rows × columns) are used instead of a single weight matrix. The outputs of each weight matrix are stacked to form the depth dimension of the convolutional image. This dimension can be understood as being determined by the "multiple" factors mentioned above. Different weight matrices can be used to extract different features from the image. For example, one weight matrix can be used to extract edge information, another to extract specific colors, and yet another to blur unwanted noise. These multiple weight matrices have the same size (rows × columns), and the feature maps extracted by these weight matrices also have the same size. These extracted feature maps are then merged to form the output of the convolution operation. The weight values in these weight matrices need to be obtained through extensive training in practical applications. 
The weight matrices formed by these trained weight values can be used to extract information from the input image, enabling the convolutional neural network to make correct predictions. When a convolutional neural network has multiple convolutional layers, the initial convolutional layers often extract more general features, which can also be called low-level features. As the depth of the convolutional neural network increases, the features extracted by later convolutional layers become increasingly complex, such as high-level semantic features. Features with higher semantic levels are more suitable for the problem being solved.
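A minimal Python sketch of the convolution operation described above is given below. It assumes a single-channel input image, no padding, and the sliding-window (cross-correlation) form commonly used in CNNs; the function names `conv2d` and `conv_layer` are hypothetical. Each weight matrix produces one feature map, and the feature maps are stacked to form the depth dimension of the output.

```python
def conv2d(image, kernel, stride=1):
    """Slide one weight matrix over a single-channel image (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(0, len(image) - kh + 1, stride):
        row = []
        for j in range(0, len(image[0]) - kw + 1, stride):
            # Weighted sum of the window whose top-left corner is (i, j).
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out


def conv_layer(image, kernels, stride=1):
    """Apply several same-size weight matrices and stack their feature maps."""
    return [conv2d(image, k, stride) for k in kernels]
```

For example, a kernel such as `[[1, 0], [0, -1]]` responds to diagonal intensity changes, while an all-equal kernel averages (blurs) the window.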
[0158] Because it's often necessary to reduce the number of training parameters, pooling layers are frequently introduced periodically after convolutional layers. This can be a single convolutional layer followed by a pooling layer, or multiple convolutional layers followed by one or more pooling layers. In image processing, the sole purpose of pooling layers is to reduce the spatial size of the image. Pooling layers can include average pooling and / or max pooling operators to sample the input image to obtain a smaller image size. Average pooling calculates the average value of pixel values within a specific range as the result of average pooling. Max pooling takes the pixel with the largest value within a specific range as the result of max pooling. Furthermore, just as the size of the weight matrix in a convolutional layer should be related to the image size, the operators in a pooling layer should also be related to the image size. The size of the output image after pooling can be smaller than the size of the input image of the pooling layer. Each pixel in the output image represents the average or maximum value of the corresponding sub-region of the input image of the pooling layer.
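The average pooling and max pooling operators described above can be sketched as follows. The sketch assumes square, non-overlapping windows on a single-channel image; the function name `pool2d` and its parameters are hypothetical.

```python
def pool2d(image, size=2, mode="max"):
    """Downsample an image with non-overlapping size x size windows.

    mode="max" keeps the largest value in each window;
    mode="avg" keeps the average value of each window.
    """
    if mode == "max":
        reducer = max
    elif mode == "avg":
        reducer = lambda values: sum(values) / len(values)
    else:
        raise ValueError("mode must be 'max' or 'avg'")
    out = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            window = [image[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reducer(window))
        out.append(row)
    return out
```

With a 4 × 4 input and 2 × 2 windows, the output is 2 × 2, so each output pixel summarizes one sub-region of the input.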
[0159] After processing by convolutional / pooling layers, a convolutional neural network (CNN) is still insufficient to output the required information. As mentioned earlier, convolutional / pooling layers only extract features and reduce the parameters introduced by the input image. However, to generate the final output information (the required class information or other relevant information), the CNN needs to utilize neural network layers to generate one or a set of desired classes in the output. Therefore, the neural network can include multiple hidden layers, the parameters of which can be pre-trained based on relevant data for a specific task type, such as image recognition, image classification, image super-resolution reconstruction, etc.
[0160] Optionally, after the multiple hidden layers in the neural network, there is also an output layer of the entire convolutional neural network. This output layer has a loss function similar to the classification cross-entropy, which is specifically used to calculate the prediction error. Once the forward propagation of the entire convolutional neural network is completed, the backpropagation will begin to update the weight values and biases of the aforementioned layers to reduce the loss of the convolutional neural network and the error between the result output by the convolutional neural network through the output layer and the ideal result.
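As a hedged illustration of a loss of the classification cross-entropy type computed at the output layer, the following Python sketch applies a softmax to the raw outputs and then takes the negative log-probability of the correct class. The softmax step and the names `softmax` and `cross_entropy` are assumptions made for this example, not details stated in this application.

```python
import math


def softmax(logits):
    """Convert raw output-layer values into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Prediction error: negative log-probability of the correct class."""
    return -math.log(softmax(logits)[target_index])
```

The loss is small when the network assigns high probability to the correct class and large otherwise, which is the quantity backpropagation then tries to reduce.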
[0161] (4) Recurrent Neural Network
[0162] Recurrent neural networks (RNNs) are used to process sequential data. In traditional neural network models, the layers from the input layer to the hidden layer and then to the output layer are fully connected, but the nodes within each layer are unconnected. While this type of neural network has solved many difficult problems, it remains inadequate for many others. For example, predicting the next word in a sentence generally requires using the preceding words because words in a sentence are not independent. RNNs are called recurrent neural networks because the current output of a sequence is related to the outputs of previous sequences. Specifically, the network memorizes previous information and applies it to the calculation of the current output; that is, nodes within the same hidden layer are no longer unconnected but connected, and the input to a hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous time step. Theoretically, RNNs can process sequential data of any length. Training an RNN is similar to training a traditional CNN or DNN. This algorithm also uses the backpropagation algorithm, but with one key difference: when an RNN is expanded, its parameters, such as W, are shared; however, this is not the case with traditional neural networks as illustrated above. Furthermore, in gradient descent, the output at each step depends not only on the network at the current step but also on the states of the network in previous steps. This learning algorithm is called Backpropagation Through Time (BPTT).
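The recurrence described above can be sketched with a scalar hidden state as follows. This is a simplified illustration: the tanh activation, the scalar state, and the names `rnn_forward`, `w_x`, `w_h` are assumptions. The same parameters are reused at every time step, which corresponds to the parameter sharing mentioned above.

```python
import math


def rnn_forward(xs, w_x, w_h, b, h0=0.0):
    """Run a one-unit recurrent network over the sequence xs.

    At each step the hidden state depends on the current input and on the
    hidden state of the previous step; w_x, w_h and b are shared by all steps.
    """
    h = h0
    states = []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states
```

Because each state feeds into the next, an input seen early in the sequence can still influence later outputs even when the later inputs are zero.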
[0163] Since convolutional neural networks (CNNs) already exist, why are recurrent neural networks (RNNs) needed? The reason is simple. CNNs rely on the fundamental assumption that elements are independent of each other, and that inputs and outputs are also independent, such as a cat and a dog. However, in the real world, many elements are interconnected. For example, stock prices fluctuate over time. Or consider completing the sentence: "I love traveling, and my favorite place is Yunnan. I definitely want to go to ___ someday." Humans know that the blank should be filled with "Yunnan" because they can infer it from context, but how can machines do the same? This is where RNNs come in. RNNs aim to give machines the ability to remember, just like humans. Therefore, the output of an RNN depends on both the current input information and historical memory information.
[0164] (5) Loss Function
[0165] In training a deep neural network, to ensure the output closely approximates the desired predicted value, we compare the network's prediction with the target value. Based on the difference, we update the weight vector of each layer (usually pre-configuring parameters before the initial update). For example, if the prediction is too high, the weight vector is adjusted to predict a lower value. This adjustment continues until the deep neural network predicts the target value or a value very close to it. Therefore, we need to predefine "how to compare the difference between the predicted and target values," which is the loss function or objective function. These are important equations used to measure the difference between the predicted and target values. Taking the loss function as an example, a higher output value (loss) indicates a greater difference, and training the deep neural network becomes a process of minimizing this loss.
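The idea of adjusting weights to minimize a loss can be illustrated with the smallest possible case: a model with a single weight w that predicts w · x, trained with a squared-error loss and plain gradient descent. The choice of squared error, the learning rate, and the name `train_weight` are assumptions made for this sketch.

```python
def train_weight(x, target, w=0.0, learning_rate=0.1, steps=50):
    """Fit prediction = w * x to target by minimizing (prediction - target)**2.

    Returns the final weight and the loss recorded before each update.
    """
    losses = []
    for _ in range(steps):
        prediction = w * x
        error = prediction - target
        losses.append(error ** 2)
        # Gradient of the loss with respect to w is 2 * error * x.
        # If the prediction is too high (error > 0, x > 0), w is lowered.
        w -= learning_rate * 2.0 * error * x
    return w, losses
```

With x = 2 and a target of 6, the weight approaches 3 and the recorded loss shrinks toward zero.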
[0166] (6) Backpropagation algorithm
[0167] Convolutional neural networks can employ backpropagation (BP) to correct the parameters in the initial super-resolution model during training, thereby reducing the reconstruction error loss. Specifically, forward propagation of the input signal to the output generates an error loss; this error loss information is then propagated back to update the parameters in the initial super-resolution model, leading to convergence of the error loss. The backpropagation algorithm is an error-loss-driven backpropagation process aimed at obtaining the optimal parameters of the super-resolution model, such as the weight matrix.
[0168] (7) Generative Adversarial Networks
[0169] Generative adversarial networks (GANs) are a type of deep learning model. This model comprises at least two modules: a generative model and a discriminative model. These two modules learn from each other through a game-like interaction, resulting in better outputs. Both the generative and discriminative models can be neural networks, specifically deep neural networks or convolutional neural networks. The basic principle of GANs is as follows. Taking an image-generating GAN as an example, suppose there are two networks, G (Generator) and D (Discriminator). G is a network that generates images: it receives random noise z and generates an image from this noise, denoted as G(z). D is a discriminative network used to determine whether an image is "real." Its input parameter is x, representing an image, and its output D(x) represents the probability that x is a real image. A value of 1 indicates that the image is certainly real, while a value of 0 indicates that the image cannot be real. During the training of this GAN, the goal of the generative network G is to generate images realistic enough to deceive the discriminator network D, while the goal of the discriminator network D is to distinguish the images generated by G from real images as reliably as possible. Thus, G and D constitute a dynamic "game," which is the "adversarial" aspect of the GAN. Ideally, the game results in G generating images G(z) that are sufficiently realistic, while D struggles to determine whether the images generated by G are real or not, i.e., D(G(z)) = 0.5. This yields a superior generative model G that can be used to generate images.
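For reference, the game between G and D is commonly written as the following minimax objective. This is the standard formulation from the GAN literature, given here as an illustration; it is not stated in this application. The symbols $p_{\text{data}}$ (distribution of real images) and $p_z$ (distribution of the noise z) are introduced only for this formula.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

D is trained to increase V by assigning high probability to real images and low probability to generated ones, while G is trained to decrease V; at the ideal equilibrium the discriminator can no longer tell the two apart, which corresponds to D(G(z)) = 0.5.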
[0170] Regardless of which AI technology is used, the network model in this embodiment can be pre-trained. For example, Figure 3 is a schematic diagram of the architecture of the network model acquisition system 300 in this embodiment. As shown in Figure 3, the data acquisition device 360 collects data and stores it in the database 330, and the training device 320 generates a target model / rule 301 based on the data maintained in the database 330. The following will describe in more detail how the training device 320 obtains the target model / rule 301 based on the data. The target model / rule 301 can perform related functions and output the required data or graphics.
[0171] The work of each layer in a deep neural network can be described by the mathematical expression $\vec{y} = a(W \cdot \vec{x} + b)$. From a physical perspective, the work of each layer in a deep neural network can be understood as transforming the input space (the set of input vectors) to the output space (i.e., from the row space to the column space of a matrix) through five operations on the input space. These five operations include: 1. Dimensionality increase / decrease; 2. Magnification / scaling; 3. Rotation; 4. Translation; 5. "Bending". Operations 1, 2, and 3 are completed by $W \cdot \vec{x}$, operation 4 is completed by +b, and operation 5 is implemented by a(). The term "space" is used here because the objects being classified are not individual things, but a class of things; space refers to the set of all individuals within this class of things. Here, W is the weight vector, where each value represents the weight of a neuron in that layer of the neural network. This vector W determines the spatial transformation from the input space to the output space, as described above; that is, the weights W of each layer control how the space is transformed. The purpose of training a deep neural network is to ultimately obtain the weight matrix of all layers of the trained neural network (a weight matrix formed by the vectors W from many layers). Therefore, the training process of a neural network is essentially learning how to control spatial transformation, more specifically, learning the weight matrix.
[0172] Because the goal is for the output of a deep neural network to be as close as possible to the value that is actually desired, the current network's prediction can be compared with the desired target value, and the weight vector of each layer can be updated based on the difference. (There is usually an initialization process before the first update, in which parameters are pre-configured for each layer in the deep neural network.) For example, if the network's prediction is too high, the weight vector is adjusted to predict a lower value, and this adjustment continues until the neural network can predict the actual target value. Therefore, it is necessary to predefine "how to compare the difference between the predicted value and the target value," which is the loss function or objective function. These are important equations used to measure the difference between the predicted and target values. Taking the loss function as an example, a higher output value (loss) indicates a greater difference, so training a deep neural network becomes a process of minimizing this loss as much as possible.
[0173] The target model / rule 301 obtained from training device 320 can be applied to different systems or devices.
[0174] The execution device 310 is equipped with an I / O interface 312 for data interaction with external devices. The "user" can input data to the I / O interface 312 through the terminal device 340.
[0175] The execution device 310 can call data, code, etc. in the data storage system 350, and can also store data, instructions, etc. in the data storage system 350.
[0176] The calculation module 311 uses the target model / rule 301 to process the input data in order to realize the function of the network model in this application embodiment.
[0177] The associated function module 313 and associated function module 314 can respectively implement related functions in the training process, such as preprocessing and filtering.
[0178] Finally, the I / O interface 312 returns the processing result to the terminal device 340 for the user.
[0179] At a deeper level, the training device 320 can generate corresponding target models / rules 301 based on different data for different objectives, in order to provide users with better results.
[0180] In the scenario shown in Figure 3, the user can manually specify the data to be input into the execution device 310, for example, by operating through the interface provided by the I / O interface 312. Alternatively, the terminal device 340 can automatically input data into the I / O interface 312 and obtain the results. If the terminal device 340 requires user authorization to automatically input data, the user can set the corresponding permissions in the terminal device 340. The user can view the results output by the execution device 310 on the terminal device 340, which can be presented in various ways such as display, sound, or animation. The terminal device 340 can also act as a data acquisition terminal, storing the acquired data into the database 330.
[0181] It is worth noting that Figure 3 is merely a schematic diagram of a network model acquisition system provided in an embodiment of this application. The positional relationships between the devices, components, modules, etc. shown in the figure do not constitute any limitation. For example, in Figure 3, the data storage system 350 is an external memory relative to the execution device 310. In other cases, the data storage system 350 can also be placed in the execution device 310. As another example, in Figure 3, the terminal device 340 and the execution device 310 are two devices. In other cases, the terminal device 340 and the execution device 310 can also be integrated into one device.
[0182] A convolutional neural network (CNN) is a deep neural network with convolutional structures. It is a deep learning architecture, which refers to learning at multiple levels of abstraction using machine learning algorithms. As a deep learning architecture, CNN is a feed-forward artificial neural network where each neuron responds to overlapping regions in the input image.
[0183] For example, Figure 4 is a schematic diagram of the structure of a CNN according to an embodiment of this application. As shown in Figure 4, the CNN 400 may include an input layer 410, a convolutional layer / pooling layer 420, wherein the pooling layer is optional, and a neural network layer 430.
[0184] Convolutional / pooling layers 420:
[0185] Convolutional layers:
[0186] As shown in Figure 4, the convolutional / pooling layer 420 may include, for example, layers 421-426. In one implementation, layer 421 is a convolutional layer, layer 422 is a pooling layer, layer 423 is a convolutional layer, layer 424 is a pooling layer, layer 425 is a convolutional layer, and layer 426 is a pooling layer. In another implementation, layers 421 and 422 are convolutional layers, layer 423 is a pooling layer, layers 424 and 425 are convolutional layers, and layer 426 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
[0187] Taking convolutional layer 421 as an example, it can include multiple convolution operators, also known as kernels. In image processing, a convolution operator acts as a filter, extracting specific information from the input image matrix. Essentially, a convolution operator can be a weight matrix, which is usually predefined. During the convolution operation, the weight matrix processes the input image pixel by pixel (or two pixels by two pixels, depending on the stride) along the horizontal direction, thus extracting specific features. The size of the weight matrix should be related to the image size. It's important to note that the depth dimension of the weight matrix is the same as the depth dimension of the input image; during convolution, the weight matrix extends to the entire depth of the input image. Therefore, convolution with a single weight matrix produces a single-depth convolutional output. However, in most cases, multiple weight matrices of the same dimension are applied instead of a single weight matrix. The outputs of each weight matrix are stacked to form the depth dimension of the convolutional image. Different weight matrices can be used to extract different features from an image. For example, one weight matrix can be used to extract image edge information, another weight matrix can be used to extract specific colors of the image, and yet another weight matrix can be used to blur unwanted noise in the image. These multiple weight matrices have the same dimension, and the feature maps extracted by these multiple weight matrices also have the same dimension. The extracted feature maps with the same dimension are then merged to form the output of the convolution operation.
[0188] The weight values in these weight matrices need to be obtained through extensive training in practical applications. The weight matrices formed by the weight values obtained through training can extract information from the input image, thereby helping CNN 400 to make correct predictions.
[0189] When a CNN 400 has multiple convolutional layers, the initial convolutional layers (e.g., 421) tend to extract more general features, which can also be called low-level features. As the depth of the CNN 400 increases, the features extracted by later convolutional layers (e.g., 426) become more and more complex, such as high-level semantic features. Features with higher semantic levels are more suitable for the problem to be solved.
[0190] Pooling layer:
[0191] Because it is often necessary to reduce the number of training parameters, pooling layers are frequently introduced periodically after convolutional layers, as illustrated by layers 421-426 within 420 in Figure 4. This can be a convolutional layer followed by a pooling layer, or multiple convolutional layers followed by one or more pooling layers. In image processing, the sole purpose of pooling layers is to reduce the spatial size of the image. Pooling layers can include average pooling and / or max pooling operators to sample the input image to obtain a smaller image size. Average pooling calculates the average value of pixel values within a specific range as the result of average pooling. Max pooling takes the pixel with the largest value within a specific range as the result of max pooling. Furthermore, just as the size of the weight matrix in a convolutional layer should be related to the image size, the operators in a pooling layer should also be related to the image size. The size of the output image after pooling can be smaller than the size of the input image of the pooling layer. Each pixel in the output image represents the average or maximum value of the corresponding sub-region of the input image of the pooling layer.
[0192] Neural network layer 430:
[0193] After processing by the convolutional / pooling layers 420, the CNN 400 is still insufficient to output the required information. As mentioned earlier, the convolutional / pooling layers 420 only extract features and reduce the parameters introduced by the input image. However, to generate the final output information (the required class information or other relevant information), the CNN 400 needs to utilize neural network layers 430 to generate one or a set of required class numbers of output. Therefore, neural network layers 430 can include multiple hidden layers (431, 432 to 43n as shown in Figure 4) and an output layer 440. The parameters contained in these hidden layers can be pre-trained based on relevant data for a specific task type, such as image recognition, image classification, or image super-resolution reconstruction.
[0194] After the multiple hidden layers in neural network layer 430, that is, as the last layer of the entire CNN 400, comes the output layer 440. The output layer 440 has a loss function similar to the classification cross-entropy, which is used to calculate the prediction error. Once the forward propagation of the entire CNN 400 is completed (as shown in Figure 4, the propagation from 410 to 440 is the forward propagation), the back propagation (as shown in Figure 4, the propagation from 440 to 410 is the back propagation) begins to update the weight values and biases of the aforementioned layers, in order to reduce the loss of CNN 400 and the error between the result output by CNN 400 through the output layer and the ideal result.
[0195] It should be noted that the CNN 400 shown in Figure 4 is only an example of a convolutional neural network. In specific applications, convolutional neural networks can also exist in the form of other network models. For example, as shown in Figure 4, multiple convolutional / pooling layers can be parallelized, and the extracted features can be input into the full neural network layer 430 for processing.
[0196] Based on this, this application provides a method for generating dynamic visual effects to save storage space. In addition, by providing an efficient method for generating dynamic visual effects, it can not only meet the requirements of real-time performance, but also reduce power consumption.
[0197] Figure 5 is a flowchart of process 500 of the dynamic visual effects generation method provided in this application embodiment. Process 500 can be executed by the terminal device 100 described above. Process 500 is described as a series of steps or operations. It should be understood that process 500 can be executed in various orders and / or occur simultaneously, and is not limited to the execution order shown in Figure 5. Process 500 may include:
[0198] Step 501: Receive interactive instructions.
[0199] Interactive commands can be generated by a user's interaction with visual elements within the screen space. The screen space can refer to the screen space of a terminal device, including planar images displayed on the screen, or images with perspective effects. Visual elements are one or more digital elements within the aforementioned screen space, such as raindrops, snowflakes, or particles.
[0200] Users can perform interactive operations on visual elements in the screen space, such as clicking raindrops, smearing snowflakes, blowing away particles, etc. The aforementioned interactive operations can trigger the generation of corresponding interactive instructions, which may include coordinate information in the screen space, touch pressure (size and direction), volume (determined by the strength of the blowing), smearing range, etc.
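One possible way to represent such an interactive instruction in software is sketched below. The class name, field names, units, and the helper `instruction_from_click` are all hypothetical and serve only to make the listed information (coordinates, touch pressure, volume, smearing range) concrete; this application does not prescribe a data format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InteractiveInstruction:
    """Hypothetical container for the information carried by an instruction."""
    kind: str                                       # e.g. "click", "smear", "blow"
    position: Optional[Tuple[float, float]] = None  # screen-space coordinates
    pressure: Optional[Tuple[float, float]] = None  # (size, direction in radians)
    volume: Optional[float] = None                  # strength of blowing
    smear_radius: Optional[float] = None            # smearing range


def instruction_from_click(x, y, pressure_size, pressure_direction):
    """Build the instruction triggered by a click at screen position (x, y)."""
    return InteractiveInstruction(
        kind="click",
        position=(float(x), float(y)),
        pressure=(float(pressure_size), float(pressure_direction)),
    )
```

Fields that do not apply to a given interactive operation simply remain unset, which allows one structure to carry instructions of several modes.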
[0201] It should be noted that, in this application, users can perform various interactive operations on visual elements. In addition to the aforementioned interactive operations, other operations may also be included, without specific limitation. Correspondingly, the interactive instructions generated by the interactive operations are also multimodal, without specific limitation.
[0202] For example, the above interactive operations include at least one of the following:
[0203] (1) Flipping (tilting) the terminal device;
[0204] For example, as shown in Figure 6a (Figure 6a is a schematic diagram of the gravity interaction operation of this application), the water flow emission surface is locked to the upper short side. In this case, the water flow changes direction as the phone is tilted, and the flow direction conforms to the actual direction of gravity. When the phone is moved from lying flat to being picked up, the water flow speed increases; in the reverse process, the water flow speed decreases until the flow stagnates. The water flow emission surface can also change with the tilt of the phone (i.e., the water flow emission surface is always in the vertical direction), in which case the water flow direction is always along the vertical direction of the world coordinate system.
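A minimal sketch of how the flow direction and speed could be derived from the device's gravity reading is given below. It rests on several assumptions that are not part of this application: the gravity vector is available in device coordinates with x and y lying in the screen plane, only this in-plane component drives the flow, and the speed scales linearly with it. The name `flow_from_gravity` and the constant `STANDARD_GRAVITY` are hypothetical.

```python
import math

STANDARD_GRAVITY = 9.81  # m/s^2, used only to normalize the speed


def flow_from_gravity(gx, gy, max_speed=1.0):
    """Map the in-plane gravity components (gx, gy) to a flow direction and speed.

    The component perpendicular to the screen is deliberately ignored: when the
    phone lies flat, gx and gy are near zero and the flow stagnates.
    """
    in_plane = math.hypot(gx, gy)
    if in_plane < 1e-6:
        return (0.0, 0.0), 0.0
    direction = (gx / in_plane, gy / in_plane)
    speed = max_speed * min(in_plane / STANDARD_GRAVITY, 1.0)
    return direction, speed
```

Under these assumptions the speed rises smoothly as the phone is raised from flat to upright and falls back to zero in the reverse process, matching the behavior described for Figure 6a.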
[0205] (2) Clicking, pressing, or swiping on the screen;
[0206] For example, as shown in Figure 6b (Figure 6b is a schematic diagram of the click-touch interaction operation of this application), when touching the touch screen sensor of the mobile phone, including clicking, pressing, smearing, etc., the blocked water flow branches and the water droplets that are touched accelerate or disperse. As shown in Figure 6c (Figure 6c is a schematic diagram of the finger smearing touch interaction operation of this application), the area passed through becomes clear (eliminating fogging).
[0207] (3) Blowing into the microphone;
[0208] For example, as shown in Figure 6d (Figure 6d is a schematic diagram of the blowing interaction operation of this application), the user blows into the mobile phone, and the phone's microphone records the real-time changes in volume. Control is then applied based on the volume of the blow, for example, controlling the speed at which water droplets disperse from a fixed point, or controlling the speed at which fog dissipates from a fixed point. If the mobile phone is equipped with multiple microphones, the position of the blowing point can also be controlled based on the volume of the sounds picked up by the multiple microphones.
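The volume-based control described above can be sketched as follows. Both mappings are assumptions chosen for illustration: a thresholded linear mapping from volume to dispersal speed, and a volume-weighted average of microphone positions as the estimated blowing point. The function names, the threshold, the gain, and the coordinate convention are hypothetical.

```python
def dispersal_speed(volume, threshold=0.1, gain=2.0, max_speed=5.0):
    """Map the measured blowing volume to a dispersal speed.

    Volumes at or below the threshold are treated as background noise.
    """
    if volume <= threshold:
        return 0.0
    return min(gain * (volume - threshold), max_speed)


def blow_position(mic_positions, volumes):
    """Estimate the blowing point as the volume-weighted mean of mic positions."""
    total = sum(volumes)
    if total <= 0:
        return None
    x = sum(p[0] * v for p, v in zip(mic_positions, volumes)) / total
    y = sum(p[1] * v for p, v in zip(mic_positions, volumes)) / total
    return (x, y)
```

With two microphones at (0, 0) and (10, 0) and volumes 1 and 3, the estimate lies closer to the louder microphone, at (7.5, 0).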
[0209] (4) Eye gaze;
[0210] For example, the location of the water source can be controlled by capturing changes in the user's gaze point through the phone's camera.
[0211] (5) Hand gestures.
[0212] For example, the spatial position changes of a user's hand gestures can be captured by the phone's camera, thereby controlling the water flow density and speed.
[0213] It should be noted that in this application, users can input interactive commands through any one of the above interactive operations or any combination of two or more of the above interactive operations. In addition, other interactive operations can be used to input interactive commands, and no specific limitation is made in this regard.
[0214] Step 502: Obtain the procedural rules corresponding to the interactive instructions.
[0215] Procedural generation technology is a technique for creating digital content based on algorithms rather than manual methods. Its core lies in the functional mathematical expression and parameterized control of digital content (which can be referred to as visual elements in this application, such as raindrops, snowflakes, fog, particles, quicksand, and gravity-sensing phenomena). It can generate deterministic content in real time by controlling parameters, or combine random functions to generate diverse results. Procedural rendering, based on procedural generation technology, expresses all modules of the traditional pipeline—including geometry, materials, lighting, physics, animation, camera, and post-processing—in functional form, and controls the input and output results of each module in real time through parameters, ultimately obtaining the rendered output.
[0216] In this application, procedural rules are used to generate the dynamic visual effects that result when interactive operations are applied to visual elements. Procedural rules can be expressed as mapping relationships, such as functions. A function typically includes parameters and expressions: controlling the parameters yields deterministic results, while random expressions yield diverse results. Based on these characteristics, procedural rules can generate deterministic dynamic visual effects in real time by controlling the parameters after interactive operations are applied to visual elements, or generate diverse dynamic visual effects in real time by incorporating random expressions. This is the main principle of the procedural generation technology in this application. Its purpose is to generate corresponding dynamic changes in a visual element, including pose changes, shape changes, and qualitative changes, after a user interacts with that visual element, and to render these dynamic changes as a real-time dynamic visual effect presented to the user. Interactive instructions can indicate the procedural parameters in the procedural rules, and / or indicate which expressions to match from the preset procedural rules. Therefore, based on the interactive instructions, the procedural rules required for subsequent processing can be determined, thereby generating the corresponding dynamic visual effects.
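The relationship between interactive instructions and procedural rules can be illustrated with the following Python sketch. It is a simplified example under stated assumptions: two preset rules are expressed as functions of time, one deterministic (controlled only by a parameter) and one that incorporates a random expression; the instruction kinds "tilt" and "click" and all function names are hypothetical.

```python
import math
import random


def fall_rule(speed):
    """Deterministic rule: the element moves straight down at a fixed speed."""
    def position(t):
        return (0.0, -speed * t)
    return position


def scatter_rule(speed, rng):
    """Diverse rule: the element moves away in a randomly chosen direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)

    def position(t):
        return (speed * t * math.cos(angle), speed * t * math.sin(angle))
    return position


def select_rule(kind, strength, rng=None):
    """Match a preset rule to the instruction kind and set its parameter."""
    rng = rng or random.Random()
    if kind == "tilt":
        return fall_rule(strength)
    if kind == "click":
        return scatter_rule(strength, rng)
    raise ValueError(f"no preset rule for instruction kind {kind!r}")
```

Calling `select_rule("tilt", 3.0)` always yields the same trajectory, whereas `select_rule("click", 1.0)` yields a different direction each time unless a seeded random generator is supplied.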
[0217] Compared to real-time rendering technology, this application eliminates the need to store materials during the intermediate process, saving storage space. Furthermore, the computational power required by procedural generation technology is far less than that of real-time rendering, which can improve the generation efficiency of dynamic visual effects and reduce the power consumption of terminal devices. Compared to other rendering technologies based on procedural generation technology, this application can respond to user interaction operations in real time, promptly presenting the dynamic visual effects generated by the user's interaction operations on visual elements, increasing the interactive enjoyment.
[0218] Step 503: Generate a first layer with dynamic visual effects according to procedural rules.
[0219] Dynamic visual effects include the visual characteristics of visual elements, which include geometric features (e.g., the shape and form of the visual element), material features (e.g., the texture, surface, and fill of the visual element), and motion features (e.g., at least one of speed, acceleration, or direction). It should be noted that, in addition to the aforementioned features, the visual characteristics of visual elements may also include other features, such as reflection, refraction, and light transmission. This application does not specifically limit the information included in the visual characteristics.
[0220] In one possible implementation, the method for generating a first layer with dynamic visual effects according to procedural rules may include: obtaining a first procedural parameter; defining procedural primitives corresponding to visual elements on a second layer according to the first procedural parameter, the second layer corresponding to the first layer; and performing procedural animation processing on the procedural primitives according to procedural rules to obtain the first layer (also known as a mask layer).
[0221] Optionally, the first procedural parameter can be set to the default value. For example, it can be set to the first procedural parameter corresponding to a rain motion effect, or the first procedural parameter corresponding to a particle occlusion motion effect.
[0222] Optionally, the first procedural parameter can be set by the user on the interactive interface. For example, Figures 7a-7e are schematic diagrams of the interactive interface of this application. As shown in Figures 7a and 7b, the interactive interface includes preset image settings, rainwater density settings, a gravity sensor switch, and a press feedback switch. Users can set these items according to their preferences on the interactive interface, thereby generating the corresponding first procedural parameter. Based on the user's settings, the effects shown in Figures 7c-7e can be obtained, with changes in the size and density of raindrops.
[0223] Optionally, the first procedural parameter can be obtained through system services. For example, the weather service in the system can provide real-time weather data, based on which the first procedural parameter corresponding to rain, cloudy, sunny, snow, etc., can be obtained.
[0224] It should be noted that, in addition to the methods described above, this application may also obtain the first procedural parameters through other means, without making any specific limitations.
[0225] In this application, the basic geometric primitives of the visual effect can be procedurally defined on the second layer (which can be the initial state of the first layer or the state of the first layer before the completion of this animation rendering, without specific limitations). This results in procedural primitives, which can use basic normal maps to express the geometric features of the corresponding visual elements (e.g., raindrops, fog, frost, etc.). That is, the screen space is divided into multiple layers of meshes in different ways. Each mesh carries several basic primitives based on local mesh coordinates. Different types of effects are expressed using one or more meshes. The mesh shape can be rectangular, polygonal, or any irregular shape, defined by function parameters. The smallest mesh can be a single pixel.
[0226] Optionally, this application can employ an efficient irregular primitive representation method based on regular primitive UV offset sampling. This method first calculates the normal vectors of regular primitives (e.g., a standard sphere) that are similar to the target irregular primitive (e.g., raindrop head, raindrop tail) using procedural generation techniques. Then, based on specific rules and noise, the UV sampling of the regular primitive normal vectors is offset during screen-space rendering, thereby directly obtaining the normal vectors of the irregular primitives and avoiding the recalculation of the irregular primitive normal vectors. For example, as shown in Figure 8a (Figure 8a is a schematic diagram of the fast irregular primitive representation method based on regular primitive normal UV sampling offset of this application), when generating a raindrop head, the algorithm samples the circle, keeping its lower half unchanged, while the UV of the upper half undergoes a secondary nonlinear stretching in the V direction, making the upper half of the circle approximately conical to simulate the deformation during the raindrop's fall.
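The UV offset sampling described above can be sketched as follows. This is only an illustration under stated assumptions: the regular primitive is taken to be a unit sphere seen as a unit disc, "secondary nonlinear stretching" is read as a quadratic remapping of V, and the function names and the `tip_height` parameter are hypothetical. The lower half (v <= 0) is sampled unchanged, while the upper half samples the sphere normal at a remapped V so that the outline tapers toward a tip.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sphere_normal(u: float, v: float) -> Optional[Vec3]:
    """Normal of a unit sphere at disc coordinates (u, v); None outside the disc."""
    r2 = u * u + v * v
    if r2 > 1.0:
        return None
    return (u, v, math.sqrt(1.0 - r2))


def raindrop_head_normal(u: float, v: float,
                         tip_height: float = 2.0) -> Optional[Vec3]:
    """Normal of a raindrop head obtained by offsetting the V sample.

    tip_height is the assumed height of the tip above the centre, in radii.
    The quadratic map v -> v + b*v*v keeps slope 1 at v = 0 (smooth join with
    the lower half) and sends v = tip_height to the top of the sphere.
    """
    if not 1.0 <= tip_height <= 2.0:
        raise ValueError("tip_height must lie in [1, 2] for a monotonic map")
    if v > tip_height:
        return None                      # above the tip: outside the primitive
    if v > 0.0:
        b = (1.0 - tip_height) / (tip_height * tip_height)
        v = v + b * v * v                # offset sample in the V direction
    return sphere_normal(u, v)
```

Because only the sampling coordinate changes, the normal of the irregular shape is read directly from the regular primitive and is not recomputed for the stretched geometry.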
[0227] Optionally, this application can employ a fast, random, irregular distribution generation algorithm based on nested primitives, with basic primitives calibrated in local grid coordinates. This method is based on regular grid division of the screen plane, combined with random offset of the local coordinate system of primitives within a single grid and random local merging of multiple grids, to achieve randomized generation of primitive distribution and size, simulating the random distribution found in nature (e.g., raindrops). For example, as shown in Figure 8b (Figure 8b is a schematic diagram of the fast, random, irregular distribution generation algorithm based on nested primitives of this application), the left side shows the raindrop generation result based on regular grids, where the white circular area represents the raindrop area, and the right side shows the raindrop generation result based on multi-layer nesting. That is, in the grid generation process, regular grids are first generated, and then, based on specific noise, it is determined whether to further divide some regular grids into smaller grids or merge them into larger grids, thereby simulating the irregularity of raindrop distribution in nature.
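One possible reading of the nested grid scheme is sketched below. The hash function, the split and merge probabilities, and the three cell sizes (half, base, and double) are assumptions chosen for illustration. Each pixel determines its own cell from noise alone, so no list of raindrops is stored.

```python
import math
from typing import Tuple

Cell = Tuple[float, float, float]  # (origin_x, origin_y, size)


def hash01(ix: int, iy: int, salt: int) -> float:
    """Deterministic pseudo-random value in [0, 1) for an integer key."""
    h = (ix * 374761393 + iy * 668265263 + salt * 974634581) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 4294967296.0


def cell_for_point(x: float, y: float, base: float = 100.0,
                   p_merge: float = 0.2, p_split: float = 0.3) -> Cell:
    """Return the nested cell (merged, regular, or split) containing (x, y)."""
    big = 2.0 * base
    bx, by = math.floor(x / big), math.floor(y / big)
    if hash01(bx, by, 1) < p_merge:            # merge 2x2 regular cells
        return (bx * big, by * big, big)
    cx, cy = math.floor(x / base), math.floor(y / base)
    if hash01(cx, cy, 2) < p_split:            # split into 2x2 smaller cells
        small = base / 2.0
        sx, sy = math.floor(x / small), math.floor(y / small)
        return (sx * small, sy * small, small)
    return (cx * base, cy * base, base)        # keep the regular cell


def drop_in_cell(cell: Cell) -> Tuple[float, float, float]:
    """Randomly offset drop (centre_x, centre_y, radius) in local cell coordinates."""
    ox, oy, size = cell
    kx, ky, ks = round(ox), round(oy), round(size)
    cx = ox + size * (0.25 + 0.5 * hash01(kx, ky, ks))
    cy = oy + size * (0.25 + 0.5 * hash01(ky, kx, ks + 1))
    radius = size * (0.10 + 0.15 * hash01(kx, ky, ks + 2))
    return (cx, cy, radius)


def is_raindrop(x: float, y: float) -> bool:
    """True if pixel (x, y) lies inside the drop carried by its cell."""
    cx, cy, r = drop_in_cell(cell_for_point(x, y))
    return math.hypot(x - cx, y - cy) <= r
```

Since the drop radius scales with the cell size, merging and splitting vary both the distribution and the size of the drops.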
[0228] Optionally, this application can employ a procedural, fast, low-power primitive intersection detection algorithm based on screen space. A serial method performs intersection detection (determining whether a finger touches a primitive) by traversing the primitives on the screen one at a time, which is inefficient and power-consuming. Moreover, the procedural method does not store the position and state information of individual primitive instances in list or numerical form, so traversing primitives to find intersections is not applicable. This application instead utilizes the mesh generated during the procedural primitive generation process to transform intersection detection into a distance mask calculation problem from a point to the mesh plane containing the primitive. This allows the intersection detection process to be parallelized in the fragment shader, avoiding traversal of a primitive list, and its computational cost is independent of the number of primitives. The algorithm logic is simple and easy to implement, with fast running speed and low power consumption. For example, as shown in Figure 8c (Figure 8c is a schematic diagram of parallel fast intersection detection in the fragment shader of this application), the left figure shows a screen with a width of 500 pixels and a height of 700 pixels, whose coordinate origin is located at the lower left corner. Based on the grid generated for the primitives, the algorithm divides the screen into a 5x7 grid of 100x100-pixel cells. Assuming that the screen coordinates of the finger touch are (340, 310), its grid coordinates are (340 // 100, 310 // 100) = (3, 3), where "//" denotes division rounded down. In the parallel calculation over screen pixels, the algorithm first converts each pixel's coordinates to grid coordinates, and then calculates the distance to the grid coordinates (3, 3) to generate a grid distance heatmap, as shown in the middle figure. Finally, the mask map of the clicked primitive can be obtained by a simple threshold calculation (the area with a value of 1 is the area where the interaction is applied, and other areas are unchanged), and this mask is used to control the deformation of the clicked raindrop.
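The worked example above can be reproduced with a short per-pixel function that mirrors what a fragment shader would evaluate for each pixel independently. The Euclidean distance in grid units and the threshold of 0.5 are assumptions, since this application only refers to a distance and a simple threshold; the function names are illustrative.

```python
import math
from typing import Tuple

CELL = 100  # grid cell size in pixels, as in the 500x700 example


def to_grid(px: int, py: int, cell: int = CELL) -> Tuple[int, int]:
    """Convert pixel coordinates to grid coordinates by floor division."""
    return (px // cell, py // cell)


def grid_distance(px: int, py: int, touch: Tuple[int, int],
                  cell: int = CELL) -> float:
    """Heatmap value: distance between the pixel's cell and the touched cell."""
    gx, gy = to_grid(px, py, cell)
    tx, ty = to_grid(touch[0], touch[1], cell)
    return math.hypot(gx - tx, gy - ty)


def touch_mask(px: int, py: int, touch: Tuple[int, int],
               cell: int = CELL, threshold: float = 0.5) -> int:
    """Mask value: 1 where the interaction applies, 0 elsewhere."""
    return 1 if grid_distance(px, py, touch, cell) < threshold else 0
```

Each call depends only on the pixel and the touch point, so the cost per pixel does not grow with the number of primitives on the screen.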
[0229] Optionally, this application may employ a procedural lighting and material visual effect algorithm. Unlike physically based methods, procedural lighting and material processing does not calculate the complex reflection characteristics of light and materials based on optical physical properties. Instead, it calculates the brightness at different positions and the transitions between bright and dark regions based on primitive geometric information, so that the lighting layer can be calculated quickly. This layer is then overlaid with other layers in different ways to obtain lighting effects. During the overlay process, operations such as UV sampling offset can be added to obtain material effects such as refraction and reflection. For example, as shown in Figure 8d (Figure 8d is a schematic diagram of the procedural lighting generation of this application), the left image is a schematic diagram of raindrop highlights. In order to simulate the physical highlights of raindrops, the algorithm efficiently approximates the highlight distribution of PBR (Physically Based Rendering) based on the distance from the pixel to the center of the raindrop. The middle image simulates the influence of the raindrop head and tail on the glass fog effect layer. The algorithm controls the effect of this layer by calculating the area mask of the raindrop and setting the transparency map. The greater the transparency, the clearer the background pixels are; the smaller the transparency, the larger the proportion of the fog effect layer. The right image is the fusion result of the left and middle images.
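A simplified sketch of the lighting and material step follows. The falloff curve, the `sharpness` exponent, the binary transparency map, and the additive combination are illustrative assumptions and are not the approximation used in this application; colors are assumed to be RGB triples in the range 0 to 1.

```python
import math
from typing import Tuple

Color = Tuple[float, float, float]


def highlight(px: float, py: float, cx: float, cy: float,
              radius: float, sharpness: float = 6.0) -> float:
    """Highlight intensity from the distance between a pixel and the drop centre."""
    d = math.hypot(px - cx, py - cy) / radius
    return 0.0 if d >= 1.0 else (1.0 - d) ** sharpness


def transparency(px: float, py: float, cx: float, cy: float,
                 radius: float, fog_clear: float = 0.2) -> float:
    """Transparency map: clear inside the drop area mask, foggy elsewhere."""
    inside = math.hypot(px - cx, py - cy) <= radius
    return 1.0 if inside else fog_clear


def shade(background: Color, fog: Color, alpha: float, spec: float) -> Color:
    """Blend background and fog by transparency, then add the highlight."""
    return tuple(min(1.0, alpha * b + (1.0 - alpha) * f + spec)
                 for b, f in zip(background, fog))
```

A transparency of 1 returns the background pixel unchanged, and a transparency of 0 returns the fog color, which matches the behavior described for the middle image of Figure 8d.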
[0230] In this application, the method of performing procedural animation processing on procedural primitives according to procedural rules to obtain a first layer may include: obtaining a first visual feature corresponding to the procedural primitive at a first moment; obtaining a second visual feature corresponding to the procedural primitive at a second moment based on the first visual feature and in combination with procedural rules; and obtaining the first layer based on the second visual feature.
[0231] For example, the procedural geometric features, material features, and motion features (corresponding to the first visual features) at time T (the first moment) are received. Combined with physical laws, the procedural geometric features, material features, and motion features (corresponding to the second visual features) at time T+1 (the second moment) are calculated. For the first and second visual features, refer to the description of visual features above; the difference between the two is that they correspond to different moments. Then, based on the procedural geometric features, material features, and motion features at time T+1, a series of time-series functions f(t) is provided to drive mesh movement and/or geometric primitive deformation for procedural primitives represented by one or more mesh layers, thereby obtaining the mask layer.
[0232] The aforementioned physical laws can be, for example, raindrops sliding along the direction of gravity, or icons falling along the direction of gravity. These laws conform to the motion characteristics of real elements in nature. Therefore, dynamic visual effects based on physical laws on the mask layer can enhance the user's sense of realism.
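The update from time T to time T+1 under a physical law such as gravity can be sketched with a simple integration step. The `MotionFeature` type, the semi-implicit Euler scheme, and the default gravity value in pixel units are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MotionFeature:
    """Hypothetical motion feature of one primitive (origin at lower left)."""
    x: float
    y: float
    vx: float
    vy: float


def step(state: MotionFeature,
         gravity: Tuple[float, float] = (0.0, -980.0),
         dt: float = 1.0 / 60.0) -> MotionFeature:
    """Compute the motion feature at T+1 from the feature at T.

    gravity is an acceleration vector in pixels per second squared; on a
    device it could be derived from IMU data (assumption).
    """
    vx = state.vx + gravity[0] * dt      # update velocity first
    vy = state.vy + gravity[1] * dt
    return MotionFeature(state.x + vx * dt, state.y + vy * dt, vx, vy)
```

For example, a drop at rest at height 100 with gravity (0, -10) and a step of 0.5 s moves to height 97.5 with a vertical velocity of -5.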
[0233] Optionally, random noise can be added to the definition of procedural primitives. This noise can be used for the procedural generation of irregular graphics, primitive distributions, and irregular motion patterns, so that procedural primitives and their motion or deformation have the randomness characteristics of real elements in nature.
[0234] Step 504: Blend the first layer with the background layer to obtain an image that includes the target visual effect.
[0235] Optionally, the background layer can be preset. For example, a default image provided by the system can be selected as the background layer, or a previously used background image can be selected as the background layer; there are no specific limitations on this. Optionally, the background layer can be entered by the user on the interactive interface. For example, as shown in Figure 7a, the user can select an image as the background layer on the interactive interface. In addition, the user can also upload a local image as the background layer on the interactive interface; there are no specific limitations on this.
[0236] For example, Figure 9 is a schematic diagram of layer blending in this application. As shown in Figure 9, the image containing the target visual effect is composed of a background layer, a mask layer, and a control layer. The control layer is optional. In place of the control layer, the image may also include icons, floating windows, and other content, without specific limitations.
[0237] By merging, dynamic visual effects can be displayed on the background layer, such as a rainy street scene or a frosty view outside a window. This makes the dynamic visual effects no longer monotonous and repetitive, but rather have a certain degree of randomness, making them more in line with real-world scenarios. Furthermore, by combining user interactions and displaying the real-time impact of those interactions on procedural primitives, the interactivity of the visual effects can be enhanced.
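The layer fusion of step 504 can be illustrated with standard alpha compositing. This application does not specify the blend operator, so the "over" operator, the pixel representation (nested lists of RGB or RGBA tuples with components from 0 to 1), and the function names are assumptions.

```python
from typing import List, Optional, Tuple

RGB = Tuple[float, float, float]
RGBA = Tuple[float, float, float, float]


def over(top: RGBA, bottom: RGB) -> RGB:
    """Composite one RGBA pixel over an opaque RGB pixel."""
    r, g, b, a = top
    return (a * r + (1.0 - a) * bottom[0],
            a * g + (1.0 - a) * bottom[1],
            a * b + (1.0 - a) * bottom[2])


def fuse(background: List[List[RGB]], mask: List[List[RGBA]],
         control: Optional[List[List[RGBA]]] = None) -> List[List[RGB]]:
    """Stack background, mask layer, and optional control layer, bottom to top."""
    out = []
    for y, row in enumerate(background):
        out_row = []
        for x, bg in enumerate(row):
            px = over(mask[y][x], bg)
            if control is not None:
                px = over(control[y][x], px)
            out_row.append(px)
        out.append(out_row)
    return out
```

All three layers are assumed to have the same dimensions; on a device this per-pixel operation would typically run on the GPU.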
[0238] For example, Figures 10a-10c are schematic diagrams of the raindrop masking effect generated by procedural generation technology in the theme application of this application, respectively showing the raindrop masking effect on the always-on display (AOD) interface, the lock screen interface, and the desktop interface. The background layer is a street view image entered by the user in the interactive interface.
[0239] In one possible implementation, this application may also obtain the depth information of the background layer; based on this, a second procedural parameter is obtained, the second procedural parameter including the depth information; and the three-dimensional masking effect of the corresponding visual element is simulated on the layer in the corresponding screen space according to the second procedural parameter to obtain the masking layer.
[0240] In this application, the depth information of the background layer can be used as a parameter in the procedural visual effect masking to participate in the generation and fusion of volumetric visual effects, further generalizing the scenarios for visual effect generation and enhancing the dimensionality of layer fusion. During the procedural rendering stage, volumetric visual effects can utilize multi-frequency noise and depth to simulate the effect of 3D volumetric media in image space. In this process, depth information serves as the carrier of image space information, expanding the 2D image into a 2.5D space, allowing the generation of the volumetric visual effect mask and its fusion with the image to occur in 2.5D space. Since noise- and depth-based volumetric visual effects do not require structured primitives, this embodiment adopts field-based deformation and motion for its interaction method and logic. During interaction, users can not only change the overall movement direction and speed of the volumetric visual effect through clicks or drags, but also influence the rendering results of the volumetric visual effect within a certain distance based on the trajectory of the clicks or drags.
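A minimal sketch of a depth- and noise-based volumetric mask is given below. It assumes a normalized depth in the range 0 (near) to 1 (far), an exponential distance term, and value noise summed over several octaves as the multi-frequency noise; none of these specific choices, names, or constants come from this application.

```python
import math


def hash01(ix: int, iy: int) -> float:
    """Deterministic pseudo-random value in [0, 1) for a lattice point."""
    h = (ix * 374761393 + iy * 668265263) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 4294967296.0


def value_noise(x: float, y: float) -> float:
    """Smoothly interpolated lattice noise."""
    ix, iy = math.floor(x), math.floor(y)
    fx, fy = x - ix, y - iy
    fx, fy = fx * fx * (3.0 - 2.0 * fx), fy * fy * (3.0 - 2.0 * fy)
    a, b = hash01(ix, iy), hash01(ix + 1, iy)
    c, d = hash01(ix, iy + 1), hash01(ix + 1, iy + 1)
    top = a + (b - a) * fx
    bottom = c + (d - c) * fx
    return top + (bottom - top) * fy


def fbm(x: float, y: float, octaves: int = 4) -> float:
    """Multi-frequency noise: sum of octaves with halving amplitude."""
    total, norm, amp, freq = 0.0, 0.0, 0.5, 1.0
    for _ in range(octaves):
        total += amp * value_noise(x * freq, y * freq)
        norm += amp
        amp *= 0.5
        freq *= 2.0
    return total / norm


def fog_density(x: float, y: float, depth: float, t: float = 0.0,
                thickness: float = 1.5, scale: float = 0.01) -> float:
    """Fog mask value at pixel (x, y): farther pixels receive more fog."""
    distance_term = 1.0 - math.exp(-thickness * depth)
    noise_term = 0.5 + 0.5 * fbm(x * scale + t, y * scale)
    return distance_term * noise_term
```

The time argument `t` shifts the noise field, which is one simple way to make the fog drift; the resulting density can then be used as the blend weight between the background pixel and a fog color.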
[0241] This application generates dynamic changes in corresponding visual elements—such as pose changes, shape changes, and qualitative changes—through procedural rules corresponding to user-input interactive commands, and renders these dynamic changes as real-time dynamic visual effects presented to the user. Compared to real-time rendering techniques, this application eliminates the need to store materials during the intermediate process, saving storage space. Furthermore, the computational power required by procedural generation technology is far less than that of real-time rendering, improving the efficiency of dynamic visual effect generation and reducing the power consumption of terminal devices. Compared to other rendering techniques based on procedural generation, this application can respond to user interactions in real time, promptly presenting the dynamic visual effects generated by the user's interactive operations on visual elements, increasing the interactive experience.
[0242] The technical solution of this application will be explained below using interactive raindrops as an example.
[0243] Figure 11 is a flowchart of generating an interactive mask layer using procedural generation technology in this application. As shown in Figure 11, the process involves a preprocessing module, a noise module, a procedural geometry module, an interaction module, a physics module, a procedural motion effects module, and a rendering module. The background layer is input by the user, and the interactive raindrop effect generates interactive masks frame by frame at runtime and overlays them on the background layer to output the final result.
[0244] Preprocessing Module: The input data for the preprocessing module is the background layer, and the output mainly consists of three items: 1) a depth map and the normal map calculated from it; 2) color correction or albedo mapping; 3) the blurred image. The results of depth, normal, and color correction will be used in subsequent lighting and material calculations.
[0245] Noise Module: This module provides the ability to generate continuous random noise in 2D or 3D space. It can be used for the procedural generation of irregular graphics, primitive distributions, and irregular motion patterns, so that the generated primitives and motion patterns have the randomness characteristics of real elements in nature.
[0246] Procedural geometry module: This module procedurally defines the basic geometric primitives that complete the visual effects in screen space, including raindrops, fog, frost, etc. It generates basic normal maps to express the geometric features of the primitives. The screen space is divided into multiple layers of grids in different ways. Each grid carries several basic primitives based on local grid coordinates. Different types of effects are expressed using one or more grids. The grid shape can be rectangular, polygonal, or any irregular shape, and is defined by function parameters. The smallest grid is one pixel.
[0247] Interaction module: Receives external interaction signals, such as IMU data and finger touches, and translates them into commands such as collision, merging, disturbance, and smearing according to predefined logic. It converts the commands into executable function parameters that act on specific grids and primitives in screen space, controlling grid movement, primitive deformation, and other animation effects in real time.
[0248] Physics module: Receives the procedural geometry and motion information at time T, combines it with the defined physical laws, calculates the procedural geometry and motion parameters at time T+1, and outputs them to the procedural motion effects module to drive the motion of the graphics according to the physical laws.
[0249] Procedural motion effects module: Provides a series of time-series functions f(t) based on the input multi-layer mesh and primitives to drive mesh movement and geometric primitive deformation animation. It also accepts input from the interaction module to execute specific interactive effects (such as smearing, collision, splitting, etc.).
[0250] Rendering module: Obtains screen space geometry information for each frame or several frames, and combines it with specific lighting and material input parameters to generate corresponding pixel-by-pixel rendering information and complete the shading of the mask layer; overlays the motion mask layer with the background image for rendering and outputs the final result.
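The translation performed by the interaction module described above can be sketched as a small dispatch function. The signal dictionary layout, the mapping of taps to collision, drags to smearing, and IMU readings to disturbance, and the parameter names are all hypothetical; this application only states that the translation follows predefined logic.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Command:
    """Hypothetical command passed on to the procedural motion effects module."""
    name: str
    params: Dict[str, float] = field(default_factory=dict)


def translate(signal: dict, cell: int = 100) -> Command:
    """Translate an external interaction signal into a command with parameters."""
    kind = signal["kind"]
    if kind == "tap":
        # Locate the affected grid cell by floor division of the touch point.
        return Command("collision", {"gx": signal["x"] // cell,
                                     "gy": signal["y"] // cell})
    if kind == "drag":
        return Command("smearing", {"gx0": signal["x0"] // cell,
                                    "gy0": signal["y0"] // cell,
                                    "gx1": signal["x1"] // cell,
                                    "gy1": signal["y1"] // cell})
    if kind == "imu":
        return Command("disturbance", {"gravity_x": signal["ax"],
                                       "gravity_y": signal["ay"]})
    raise ValueError(f"unsupported signal kind: {kind}")
```

The returned parameters are plain numbers so that they could be forwarded as function parameters or shader uniforms without storing per-primitive state.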
[0251] An example is the process of applying procedural generation technology to a mobile operating system theme. The mobile theme application resides in the system's application layer. By calling the procedural rendering engine service, it completes the rendering and drawing of the theme wallpaper and ultimately presents it to the system interface. As shown in Figure 12 (Figure 12 is a flowchart of the theme application of this application), the theme application contains a description of a theme, such as an XML configuration file, which defines the theme-related visual effects information, such as the duration of the effect, supported interaction types, and display resolution. The theme application also includes a theme customization interface (corresponding to the interactive interface mentioned above), where users can set personalized content, such as selecting an image from their phone's gallery as the background image for the masking effect, as shown in Figure 6a. Users can also adjust the procedurally generated parameters in the customization interface, such as selecting the density of the rendered raindrops, gravity sensing, and pressure feedback, as shown in Figure 6b. The theme application also needs to call the system's interactive input to pass to the procedural rendering part. For example, if the raindrop theme supports user clicks on raindrops, the click event received by the system needs to be passed down.
[0252] In the procedural rendering section, the procedural rendering engine receives the configuration items from the theme application and starts the rendering process. When an interaction command is received, it receives information about the interaction operation, such as the coordinates of the click. The interaction module locates the primitives affected by the click and their rendering parameters based on the click coordinates. It calculates the changes in rendering parameters under the influence of the interaction (such as deformation, displacement, deletion, etc.) through the primitive's physical properties and calculates the rendering effect at the corresponding location. The rendered result is passed to the final rendering service through the wallpaper management module and overlaid with the control layer in the rendering service, as shown in Figure 9.
[0253] Figure 13 is a flowchart of the animation effect generation process for the raindrop theme in this application. As shown in Figure 13, the mask layer includes three layers: a static layer, a dynamic layer, and a flowing layer. Evaporation simulation is achieved on the static layer using procedural generation technology, while motion simulation is performed on the dynamic and flowing layers using procedural generation technology, respectively realizing water droplet animation and flowing animation. The procedural parameters required for the flowing animation can come from user interaction (e.g., click interaction) and / or IMU gravity data. The three layers that realize the animation effect are merged to obtain the mask layer, which is then superimposed on the background layer to obtain the final effect. The background layer can be an image selected by the user in the custom interface or a preset image.
[0254] In one possible implementation, procedural generation technology can support continuous playback of motion graphics. For example, Figures 14a and 14b are schematic diagrams illustrating continuous playback of motion graphics according to this application. As shown in Figures 14a and 14b, they respectively present screenshots of the first and last frames of an animation transitioning a mobile phone theme interface. Procedural generation can generate coherent animation frames through continuous input time changes. For example, if the transition animation duration of the theme interface is 1 second, the control layer will complete the animation of brightening and zooming within this 1 second. Simultaneously, the procedurally generated mask layer can complete the visual effect animation of raindrops flowing and fog blurring.
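The time-driven playback described above can be sketched by evaluating functions of a continuous time input once per frame. The easing curve, the frame rate, the starting scale of 0.9, and the flow speed are illustrative assumptions, as are the function names.

```python
from typing import Dict, List


def ease_in_out(u: float) -> float:
    """Smooth 0-to-1 progress curve, clamped outside the interval [0, 1]."""
    u = min(1.0, max(0.0, u))
    return u * u * (3.0 - 2.0 * u)


def frame_times(duration: float, fps: int) -> List[float]:
    """Time stamps of all frames from the first to the last, inclusive."""
    n = round(duration * fps)
    return [i / fps for i in range(n + 1)]


def control_layer_state(t: float, duration: float = 1.0) -> Dict[str, float]:
    """Brightening and zooming of the control layer during the transition."""
    k = ease_in_out(t / duration)
    return {"brightness": k, "scale": 0.9 + 0.1 * k}


def mask_layer_state(t: float, flow_speed: float = 80.0) -> Dict[str, float]:
    """Raindrop flow offset and fog blur of the mask layer at the same time t."""
    return {"flow_offset": flow_speed * t, "fog_blur": ease_in_out(t)}
```

Because both layers are functions of the same time input, evaluating them at each entry of `frame_times(1.0, 60)` yields a coherent 1-second transition without storing any intermediate frames.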
[0255] The above embodiments provide convenient dynamic visual effects capabilities for system themes through procedural generation technology. Only images need to be input as background layers, which greatly reduces the reliance on 3D assets and significantly reduces the workload of maintenance. User customization functions can also be opened through the interactive interface. Interactive themes that support real-time rendering enhance the fun and playability of theme applications.
[0256] Figure 15 is a flowchart of the procedural volumetric visual effect control generation and fusion process based on image depth according to this application. As shown in Figure 15, this embodiment describes the process of applying procedural volumetric visual effect rendering capabilities to a mobile operating system theme. Compared with planar masking, this embodiment uses image depth information as a parameter of the procedural visual effect mask to participate in the generation and fusion of volumetric visual effects, further generalizing the scenarios for visual effect generation and improving the dimension of layer fusion. The aforementioned depth information can be obtained based on AI technology, which can be referred to in the relevant description above and will not be repeated here.
[0257] In the procedural rendering stage, volumetric visual effects differ from procedural algorithms based on structured primitives (such as circles, rectangles, and triangles) like raindrops. Their core principle is to utilize multi-frequency noise and depth to simulate the effect of 3D volumetric media in image space. In this process, depth information serves as the carrier of image spatial information, expanding the 2D image into a 2.5D space, allowing the generation of volumetric visual effect masks and their fusion with the image to occur in 2.5D space.
[0258] Noise- and depth-based volumetric visual effects do not use structured primitives; therefore, this embodiment employs field-based deformation and motion for its interaction methods and logic. During interaction, users can change the overall direction and speed of the volumetric visual effect through clicks or drags, and can also influence the rendering results of the volumetric visual effect within a certain distance based on the trajectory of the clicks or drags. For example, Figures 16a and 16b are schematic diagrams of the fog effect interaction of this application. Figure 16a shows the fog effect layer interaction result, and Figure 16b shows the fog effect layer interaction and layer fusion result. It can be seen that the depth information based on the image further enriches the interaction method of the procedural visual effect mask.
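Field-based interaction along a click or drag trajectory can be sketched as a scalar field that falls off with the distance to the trajectory and scales the fog density. The linear falloff, the default radius, and the function names are assumptions; the fog density is taken as an input value from 0 to 1 computed elsewhere.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length2 = dx * dx + dy * dy
    if length2 == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    u = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length2
    u = max(0.0, min(1.0, u))
    return math.hypot(p[0] - (a[0] + u * dx), p[1] - (a[1] + u * dy))


def clearing_field(p: Point, trajectory: List[Point],
                   radius: float = 60.0) -> float:
    """Influence of the trajectory at p: 1 on the path, 0 beyond the radius."""
    if not trajectory:
        return 0.0
    if len(trajectory) == 1:
        d = math.hypot(p[0] - trajectory[0][0], p[1] - trajectory[0][1])
    else:
        d = min(distance_to_segment(p, a, b)
                for a, b in zip(trajectory, trajectory[1:]))
    return max(0.0, 1.0 - d / radius)


def interacted_density(density: float, p: Point, trajectory: List[Point],
                       radius: float = 60.0) -> float:
    """Fog density after the user wipes along the trajectory."""
    return density * (1.0 - clearing_field(p, trajectory, radius))
```

A single-point trajectory models a click, and a longer list of points models a drag; in both cases only pixels within the chosen radius are affected.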
[0259] The above embodiments, based on image depth, simulate volumetric visual effects (fog, haze, etc.), enabling expensive volumetric material calculations to run on the device with extremely low power consumption and supporting interaction. In addition, images can not only serve as background layers for masks, but the application of depth information recovered by deep neural networks in the mask allows for a higher-dimensional fusion of procedural effects and image spatial information, enhancing realism.
[0260] Figure 17 is a structural schematic diagram of the dynamic visual effects generation device 1700 of this application. As shown in Figure 17, the dynamic visual effects generation device 1700 of this embodiment can be applied to the aforementioned terminal device. The dynamic visual effects generation device 1700 may include: a receiving module 1701, a processing module 1702, and a fusion module 1703, wherein...
[0261] The receiving module 1701 is used to receive an interaction instruction, which is generated by a user's interaction with a visual element within the screen space; the processing module 1702 is used to obtain a procedural rule corresponding to the interaction instruction, which is used to generate a dynamic visual effect after the interaction is applied to the visual element; and to generate a first layer with a dynamic visual effect according to the procedural rule, wherein the dynamic visual effect includes the visual features of the visual element, and the visual features include geometric features, material features, and motion features; the fusion module 1703 is used to fuse the first layer with the background layer to obtain an image containing the target visual effect.
[0262] In one possible implementation, the processing module 1702 is specifically used to obtain a first procedural parameter; define a procedural primitive corresponding to the visual element on a second layer according to the first procedural parameter, the second layer corresponding to the first layer; and perform procedural animation processing on the procedural primitive according to the procedural rules to obtain the first layer.
[0263] In one possible implementation, the processing module 1702 is specifically used to obtain a first visual feature corresponding to the procedural primitive at a first time, the first visual feature including a first geometric feature, a first material feature, and a first motion feature; obtain a second visual feature corresponding to the procedural primitive at a second time based on the first visual feature and in combination with the procedural rules, the second visual feature including a second geometric feature, a second material feature, and a second motion feature; and obtain the first layer based on the second visual feature.
[0264] In one possible implementation, the first procedural parameter is set by default; or, the first procedural parameter is set by the user on the interactive interface; or, the first procedural parameter is obtained through a system service.
[0265] In one possible implementation, the interactive operation includes at least one of the following: flipping the terminal device; clicking, pressing, or swiping on the screen; blowing into the microphone; eye gaze; or gestures.
[0266] In one possible implementation, the motion feature includes at least one of velocity, acceleration, or direction.
[0267] In one possible implementation, the background layer is preset; or, the background layer is entered by the user on the interactive interface.
[0268] In one possible implementation, the processing module 1702 is further configured to obtain the depth information of the background layer; obtain a second procedural parameter, the second procedural parameter including the depth information; and simulate a three-dimensional masking effect of the corresponding visual element on the second layer according to the second procedural parameter to obtain the first layer.
[0269] The apparatus in this embodiment can be used to execute the technical solution of the method embodiment shown in Figure 5. Its implementation principle and technical effect are similar, and will not be described again here.
[0270] It is understood that, in order to achieve the above-mentioned functions, the terminal device includes hardware and / or software modules that perform the respective functions. Based on the algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application in conjunction with the embodiments, but such implementation should not be considered beyond the scope of this application.
[0271] In one example, FIG18 shows a schematic block diagram of an apparatus 1800 according to an embodiment of the present application. The apparatus 1800 may include a processor 1801 and a transceiver / transceiver pin 1802, and optionally, a memory 1803.
[0272] The various components of the apparatus 1800 are coupled together via a bus 1804, which includes a data bus, a power bus, a control bus, and a status signal bus. However, for clarity, all buses are referred to as bus 1804 in the figure.
[0273] Optionally, the memory 1803 can be used to store the instructions in the foregoing method embodiments. The processor 1801 can be used to execute the instructions in the memory 1803, control the receive pin to receive signals, and control the transmit pin to transmit signals.
[0274] The apparatus 1800 may be the terminal device, or a chip of the terminal device, in the above method embodiments.
[0275] For all relevant content of each step involved in the above method embodiments, refer to the functional description of the corresponding functional module; it will not be repeated here.
[0276] This embodiment also provides a computer storage medium storing computer instructions. When the computer instructions are executed on a terminal device, the terminal device performs the aforementioned method steps to implement the dynamic visual effect generation method in the above embodiment.
[0277] This embodiment also provides a computer program product that, when run on a computer, causes the computer to perform the aforementioned steps to realize the dynamic visual effect generation method in the above embodiment.
[0278] In addition, embodiments of this application also provide an apparatus, which may specifically be a chip, component, or module. The apparatus may include a processor and a memory that are connected to each other. The memory is used to store computer-executable instructions. When the apparatus is running, the processor can execute the computer-executable instructions stored in the memory to cause the chip to execute the dynamic visual effect generation method in the above method embodiments.
[0279] In this embodiment, the terminal device, computer storage medium, computer program product, and chip are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods provided above; they will not be repeated here.
[0280] Through the above description of the embodiments, those skilled in the art will understand that, for the sake of convenience and brevity, only the division of the above functional modules is used as an example. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
[0281] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not executed. Furthermore, the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
[0282] The units described as separate components may or may not be physically separate. A component shown as a unit can be one or more physical units; that is, it can be located in one place or distributed in multiple different locations. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0283] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.
[0284] Any content in the various embodiments of this application, as well as any content in the same embodiment, can be freely combined. Any combination of the above content is within the scope of this application.
[0285] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the part that contributes over the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. This software product is stored in a storage medium and includes several instructions to cause a device (which may be a microcontroller, chip, etc.) or processor to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0286] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of this application without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of this application.
[0287] The steps of the methods or algorithms described in conjunction with the embodiments of this application can be implemented in hardware or by a processor executing software instructions. The software instructions can consist of corresponding software modules, which can be stored in random access memory (RAM), flash memory, read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), registers, hard disks, portable hard disks, CD-ROMs, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to a processor, enabling the processor to read information from and write information to the storage medium. Of course, the storage medium can also be a component of the processor. The processor and the storage medium can reside in an ASIC.
[0288] Those skilled in the art will recognize that the functions described in the embodiments of this application in one or more of the above examples can be implemented using hardware, software, firmware, or any combination thereof. When implemented using software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, wherein communication media include any medium that facilitates the transfer of a computer program from one place to another. Storage media can be any available medium that can be accessed by a general-purpose or special-purpose computer.
Claims
1. A method for generating dynamic visual effects, characterized in that the method comprises: receiving an interaction instruction, the interaction instruction being generated by a user's interactive operation on a visual element within the screen space; obtaining procedural rules corresponding to the interaction instruction, the procedural rules being used to generate the dynamic visual effect after the interactive operation is applied to the visual element; generating a first layer with the dynamic visual effect according to the procedural rules, the dynamic visual effect including visual features of the visual element, the visual features including geometric features, material features, and motion features; and blending the first layer with a background layer to obtain an image that includes the target visual effect.
2. The method according to claim 1, characterized in that generating the first layer with the dynamic visual effect according to the procedural rules comprises: obtaining a first procedural parameter; defining, on a second layer and according to the first procedural parameter, a procedural primitive corresponding to the visual element, the second layer corresponding to the first layer; and performing procedural animation processing on the procedural primitive according to the procedural rules to obtain the first layer.
3. The method according to claim 2, characterized in that performing procedural animation processing on the procedural primitive according to the procedural rules to obtain the first layer comprises: obtaining a first visual feature corresponding to the procedural primitive at a first time, the first visual feature including a first geometric feature, a first material feature, and a first motion feature; obtaining, based on the first visual feature and in combination with the procedural rules, a second visual feature corresponding to the procedural primitive at a second time, the second visual feature including a second geometric feature, a second material feature, and a second motion feature; and obtaining the first layer based on the second visual feature.
4. The method according to claim 2 or 3, characterized in that the first procedural parameter is set by default; or, the first procedural parameter is set by the user on the interactive interface; or, the first procedural parameter is obtained through a system service.
5. The method according to any one of claims 1-4, characterized in that the interactive operation includes at least one of the following: flipping the terminal device; or, clicking, pressing, or swiping on the screen; or, blowing into the microphone; or, eye gaze; or, a gesture.
6. The method according to any one of claims 1-5, characterized in that the motion feature includes at least one of velocity, acceleration, or direction.
7. The method according to any one of claims 1-6, characterized in that the background layer is preset; or, the background layer is entered by the user on the interactive interface.
8. The method according to any one of claims 1-7, characterized in that, before generating the first layer with the dynamic visual effect according to the procedural rules, the method further comprises: obtaining depth information of the background layer; and generating the first layer with the dynamic visual effect according to the procedural rules comprises: obtaining a second procedural parameter, the second procedural parameter including the depth information; and simulating, according to the second procedural parameter, a three-dimensional masking effect of the corresponding visual element on the second layer to obtain the first layer.
9. An apparatus for generating dynamic visual effects, characterized in that the apparatus comprises: a receiving module, used to receive an interaction instruction, the interaction instruction being generated by a user's interactive operation on a visual element within the screen space; a processing module, used to obtain procedural rules corresponding to the interaction instruction, the procedural rules being used to generate the dynamic visual effect after the interactive operation is applied to the visual element, and to generate a first layer with the dynamic visual effect according to the procedural rules, the dynamic visual effect including visual features of the visual element, the visual features including geometric features, material features, and motion features; and a blending module, used to blend the first layer with a background layer to obtain an image that includes the target visual effect.
10. The apparatus according to claim 9, characterized in that the processing module is specifically used to obtain a first procedural parameter; define a procedural primitive corresponding to the visual element on a second layer according to the first procedural parameter, the second layer corresponding to the first layer; and perform procedural animation processing on the procedural primitive according to the procedural rules to obtain the first layer.
11. The apparatus according to claim 10, characterized in that the processing module is specifically used to obtain a first visual feature corresponding to the procedural primitive at a first time, the first visual feature including a first geometric feature, a first material feature, and a first motion feature; obtain, based on the first visual feature and in combination with the procedural rules, a second visual feature corresponding to the procedural primitive at a second time, the second visual feature including a second geometric feature, a second material feature, and a second motion feature; and obtain the first layer based on the second visual feature.
12. The apparatus according to claim 10 or 11, characterized in that the first procedural parameter is set by default; or, the first procedural parameter is set by the user on the interactive interface; or, the first procedural parameter is obtained through a system service.
13. The apparatus according to any one of claims 9-12, characterized in that the interactive operation includes at least one of the following: flipping the terminal device; or, clicking, pressing, or swiping on the screen; or, blowing into the microphone; or, eye gaze; or, a gesture.
14. The apparatus according to any one of claims 9-13, characterized in that the motion feature includes at least one of velocity, acceleration, or direction.
15. The apparatus according to any one of claims 9-14, characterized in that the background layer is preset; or, the background layer is entered by the user on the interactive interface.
16. The apparatus according to any one of claims 9-15, characterized in that the processing module is further configured to obtain depth information of the background layer; obtain a second procedural parameter, the second procedural parameter including the depth information; and simulate a three-dimensional masking effect of the corresponding visual element on the second layer according to the second procedural parameter to obtain the first layer.
17. A terminal device, characterized in that the terminal device comprises: one or more processors; a display, used to show the interactive interface; and a memory, used to store one or more programs; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-8.
18. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a computer program which, when executed on a computer, causes the computer to perform the method according to any one of claims 1-8.
19. A computer program product, characterized in that the computer program product includes computer program code that, when run on a computer, causes the computer to perform the method according to any one of claims 1-8.