Method for processing skeletal animation data
By optimizing the storage structure in skeletal animation textures and utilizing bilinear filtering in graphics processors, the problem of smooth transitions in low frame rate animations was solved, computational and storage overhead was reduced, and the performance of large-scale character rendering was improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHENGDU LEGOU TECHNOLOGY CO LTD
- Filing Date
- 2026-03-31
- Publication Date
- 2026-06-19
Figure CN122244252A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of skeletal animation technology, and more specifically, to a method for processing skeletal animation data.
Background Technology
[0002] Skeletal animation is an animation technique that uses virtual skeletons to move and deform character models. It is widely used in games, virtual reality, film and television production, and other fields. In scenes that require rendering a large number of characters, the traditional approach involves the central processing unit (CPU) calculating the skeletal animation and performing skinning for each character individually, resulting in severe performance bottlenecks.
[0003] To alleviate the performance bottleneck of the central processing unit (CPU) in scenarios involving the rendering of numerous characters, the industry has proposed a skeletal animation texture solution for graphics processing units (GPUs). This solution involves pre-baking animation data into texture images, which are then read by the GPU for skinning calculations. In the offline phase, the transformation matrices of all bones in each frame of the animation clip are written to the animation texture frame by frame in the order of "traversing all bones first, then traversing the next frame." Specifically, the data of all bones in frame zero is stored first, followed by the data of all bones in frame one. Because the GPU needs to precisely read the data of a specific frame and a specific bone, the filtering mode of the animation texture can only be set to point sampling.
[0004] In existing solutions, point sampling at low baking frame rates leaves no smooth transition between frames, resulting in stiff animation. However, improving smoothness by raising the baking frame rate multiplies the storage space required for animation textures, causing a sharp rise in memory usage and bandwidth pressure. Therefore, achieving smooth transitions in low frame rate animations without increasing storage overhead has become a pressing technical problem in this field.
Summary of the Invention
[0005] The purpose of this application is to provide a method for processing skeletal animation data, which can achieve smooth transitions in low frame rate animations without increasing storage overhead.
[0006] This application is implemented as follows. This application provides a method for processing skeletal animation data, comprising the following steps: obtaining the identifier of an animation segment to be played and animation playback parameters of a character model, wherein the character model includes at least one bone; based on the identifier of the animation segment to be played, determining the corresponding animation texture from texture resources, wherein the animation texture uses bilinear filtering and stores the transformation data of each bone of the corresponding animation segment in each frame, wherein the transformation data of each bone is split into multiple rows, each row corresponding to one texture pixel, and the data of the same row of the same bone is stored consecutively in adjacent pixel positions according to frame order; based on the identifier of the animation segment to be played and the animation playback parameters, performing bilinear filtering and texture coordinate offset sampling on the animation texture using a graphics processor to obtain skeletal transformation data; and using the skeletal transformation data, performing skinning transformation on each bone node of the character model to generate the corresponding character animation.
[0007] Compared with the prior art, this application has at least the following advantages or beneficial effects: This application proposes a method for processing skeletal animation data. By changing the storage structure of animation textures, the transformation data of each bone is split into multiple rows, each row corresponding to a texture pixel. Data from the same row of the same bone is stored sequentially in adjacent pixel positions. This storage structure makes adjacent frame data physical neighbors, creating conditions for using bilinear filtering. This allows the graphics processor's texture filtering hardware to automatically perform linear blending when reading data from two adjacent frames, completing data reading and inter-frame interpolation simultaneously with a single sampling. Thus, without increasing texture storage space, smooth inter-frame transitions of low-frame-rate baked data can be achieved using graphics processor hardware. Furthermore, it reduces the number of vertex shader instructions and texture bandwidth consumption, thereby freeing up the computing resources of the shader cores and enabling the graphics processor to handle large-scale character rendering scenes more efficiently. In other words, the solution proposed in this application can achieve smooth transitions in low-frame-rate animations without increasing storage overhead.
Attached Figure Description
[0008] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.
[0009] Figure 1 is a flowchart of an embodiment of a method for processing skeletal animation data according to this application; Figure 2 is a flowchart of the baking process for animation textures in texture resources in one embodiment of this application; Figure 3 is a flowchart of the baking process for animation textures in texture resources in another embodiment of this application; Figure 4 is a flowchart of the specific process of obtaining skeletal transformation data in one embodiment of this application.
Detailed Implementation
[0010] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. It should be understood that this application is not limited to the exemplary embodiments described herein.
[0011] In this document, relational terms such as first and second are used only to distinguish one entity or operation from another entity or operation, without necessarily requiring or implying any such actual relationship or order between these entities or operations.
[0012] In existing GPU-based skeletal animation texture schemes, during offline editing, the character's skeletal animation is sampled frame by frame. The transformation matrix of each bone in each frame is then written into the pixels of a 2D texture, following the order of traversing all bones in the current frame and then the next frame. At runtime, in the GPU's vertex shader, the texture coordinates of the corresponding pixel are calculated based on the currently playing animation frame number and bone number. The bone transformation matrix is then read from the texture, and skinning transformations are performed on the vertices. Because precise reading of data from specific frames and specific bones is required, the texture's filtering mode can only be set to point sampling, meaning it only returns the precise value of a single pixel without mixing with neighboring pixels.
[0013] In developing this application, the inventors discovered that in existing solutions, the lack of smooth transitions between frames due to the sequential storage of animation textures frame by frame results in noticeable frame skipping and stuttering when the baked frame rate is low (e.g., 15fps). To mitigate this, the baked frame rate must be increased from 15fps to 30fps or even 60fps, but this multiplies the size of the animation textures, leading to a sharp increase in memory usage, which is unsustainable on memory-constrained devices (e.g., mobile devices). It should be noted that when a game scene requires rendering a large number of characters simultaneously (e.g., large-scale legions or siege battles in SLG games, where hundreds or even thousands of characters can be on screen at the same time), existing solutions require storing a large amount of animation textures, and the GPU-side skinning calculations consume significant computational resources, resulting in a sharp drop in frame rate.
[0014] In response to the problem that the existing solution uses sequential frame-by-frame storage of animation textures, resulting in a lack of smooth transitions between frames, and that increasing the frame rate would cause a doubling of storage and computational overhead, the inventors further analyzed and found that the root cause lies in the texture storage method, which makes the data of adjacent frames of the same bone not physically adjacent in the texture. This causes the bilinear filtering function of the graphics processor to be unable to be used for inter-frame data mixing, and only point sampling can be used.
[0015] Based on this analysis, this application optimizes the storage structure of animation textures during the offline baking stage. Specifically, the transformation data of each bone is split into multiple rows, each row corresponding to a texture pixel. Data from the same row of the same bone is stored consecutively in adjacent pixel positions according to frame order, thus ensuring that the data of adjacent frames of the same bone are physically adjacent. Simultaneously, the filtering mode of the animation texture is set to bilinear filtering. Therefore, during GPU rendering, since adjacent frame data are adjacent in the texture, the graphics processor's texture filtering hardware can automatically complete the linear blending of adjacent frame data simply by offsetting the texture coordinates to place the sampling point between pixels of two frames. This process requires only one texture sampling, and the shader does not need to perform additional interpolation calculations, reducing both the number of samplings and the amount of computational instructions, resulting in significant performance advantages in large-scale character rendering scenarios. In this way, a smooth transition of low frame rate animations can be achieved without significantly increasing storage overhead.
[0016] After introducing the basic principles of this application, various non-limiting embodiments of this application will be described in detail below with reference to the accompanying drawings. Unless otherwise specified, the various embodiments and features described below can be combined with each other.
[0017] Referring to Figure 1, the method for processing skeletal animation data includes the following steps: Step S101: Obtain the identifier of the animation segment to be played and the animation playback parameters of the character model, wherein the character model includes at least one bone.
[0018] It should be noted that, when rendering a character, the first step is to determine the character model to be rendered and the animation segment to be played for that model. Specifically, the identifier of the animation segment to be played is obtained for the character model; this identifier uniquely indicates the animation segment, such as a "walking" or "attack" animation. At the same time, the animation playback parameters are obtained, which include at least the playback start time used to determine when the animation begins playing. The character model itself contains at least one bone, and its bones constitute the skeletal system that drives the model's deformation.
[0019] Step S102: Based on the identifier of the animation segment to be played, determine the corresponding animation texture from the texture resources. The filtering mode of the animation texture is bilinear filtering, and it stores the transformation data of each bone of the corresponding animation segment in each frame. The transformation data of each bone is split into multiple rows, each row corresponding to a texture pixel. The data of the same row of the same bone is stored consecutively in adjacent pixel positions in frame order.
[0020] Based on the acquisition of the animation identifier to be played in step S101, step S102 determines the animation texture corresponding to the identifier from the pre-stored texture resources. This animation texture is pre-generated during the offline baking stage, and its filtering mode is set to bilinear filtering. This setting allows the graphics processor to automatically perform linear blending of adjacent pixels during sampling. It should be noted that in existing solutions, because adjacent frame data are not adjacent in the animation texture (they are far apart), even with bilinear filtering enabled, the data mixed is from other bones, resulting in errors. Therefore, existing solutions can only use point sampling.
[0021] To enable the graphics processor to automatically perform linear blending of adjacent pixels during sampling, the storage structure of animation textures has the following characteristics: For each bone, its transformation data is split into multiple rows, with each row corresponding to a pixel in the texture. More importantly, for the same row of data within the same bone, the pixel values of different frames are stored sequentially in adjacent pixel positions. In other words, a given row of data of a bone in frame zero is adjacent to the same row of data in frame one, frame one is adjacent to frame two, and so on. This storage structure makes the data of adjacent frames of the same bone physical neighbors, laying the foundation for subsequent inter-frame interpolation using graphics processor hardware.
[0022] In other words, the differences between traditional animation textures and the animation textures of this application are as follows. The traditional method writes the texture pixels by first traversing all bones and then moving to the next frame. The texture pixel arrangement is as follows:
[Bone0_Frame0_Row0][Bone0_Frame0_Row1][Bone0_Frame0_Row2]
[Bone1_Frame0_Row0][Bone1_Frame0_Row1][Bone1_Frame0_Row2]
... (Frame 0 data for all bones)
[Bone0_Frame1_Row0][Bone0_Frame1_Row1][Bone0_Frame1_Row2]
... (Frame 1 data for all bones)
Here, Row denotes one row of a bone's transformation data. In this approach, the data of adjacent frames for the same bone and the same row are far apart: they are separated in the texture by the number of bones multiplied by the number of pixels per bone, making it impossible to utilize hardware bilinear interpolation.
[0023] This application uses a continuous inter-frame arrangement, writing the data in the order of first traversing the bones, then traversing each row of each bone, and finally traversing each sampled frame. For a given animation segment, let the number of frames be F and the starting index be S; the texture pixel arrangement is then as follows:
Bone 0 - Row 0: [S+0][S+1][S+2]...[S+(F-1)] ← row 0 of all F frames, arranged consecutively
Bone 0 - Row 1: [S+F][S+F+1]...[S+2F-1] ← row 1 of all F frames, arranged consecutively
Bone 0 - Row 2: [S+2F][S+2F+1]...[S+3F-1] ← row 2 of all F frames, arranged consecutively
Bone 1 - Row 0: [S+3F][S+3F+1]...[S+4F-1]
...
The inter-frame continuous arrangement adopted in this application makes the data of adjacent frames in the same row of the same bone adjacent to each other in the animation texture, which creates the conditions for directly using the GPU's texture filtering hardware to perform hardware bilinear interpolation and achieve smooth inter-frame transitions.
[0024] For ease of understanding, assume there are 2 bones, the transformation data of each bone is split into 3 rows, and there are 3 frames of data in total. The difference between the two layouts is as follows.
The traditional layout is:
Bone 0, frame 0, row 0; Bone 0, frame 0, row 1; Bone 0, frame 0, row 2
Bone 1, frame 0, row 0; Bone 1, frame 0, row 1; Bone 1, frame 0, row 2
... (frame 0 data for all bones)
Bone 0, frame 1, row 0; Bone 0, frame 1, row 1; Bone 0, frame 1, row 2
... (frame 1 data for all bones)
The inter-frame continuous arrangement of this application, for a given animation segment with 2 bones, 3 frames, and a starting index of 0, is:
Bone 0, row 0: index 0, index 1, index 2
Bone 0, row 1: index 3, index 4, index 5
Bone 0, row 2: index 6, index 7, index 8
Bone 1, row 0: index 9, index 10, index 11
Bone 1, row 1: index 12, index 13, index 14
Bone 1, row 2: index 15, index 16, index 17
In this arrangement, indices 0, 3, 6, 9, 12, and 15 hold the data of frame 0; indices 1, 4, 7, 10, 13, and 16 hold the data of frame 1; and indices 2, 5, 8, 11, 14, and 17 hold the data of frame 2. It can be seen that the same row of data for the same bone is arranged consecutively and adjacently across frames. For example, the data for row 0 of bone 0 in frames 0, 1, and 2 is stored at indices 0, 1, and 2 respectively, which are adjacent to each other. When the runtime sampling point lies between indices 0 and 1, the graphics processor's bilinear filtering hardware can proportionally mix the data from frame 0 and frame 1 to achieve a smooth transition between frames.
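The index arithmetic implied by the two layouts can be written out as a minimal sketch. The function names and the default of three rows per bone are illustrative assumptions, not part of the application; the formulas themselves follow from the arrangements described above.

```python
def texel_index(bone, row, frame, frame_count, rows_per_bone=3, start=0):
    """Texel index in the inter-frame continuous layout (bone, then row, then frame)."""
    return start + (bone * rows_per_bone + row) * frame_count + frame


def traditional_index(bone, row, frame, bone_count, rows_per_bone=3):
    """Texel index in the traditional layout (frame, then bone, then row)."""
    return (frame * bone_count + bone) * rows_per_bone + row


# Worked example from the text: 2 bones, 3 rows per bone, 3 frames, start index 0.
layout = {
    (b, r): [texel_index(b, r, f, frame_count=3) for f in range(3)]
    for b in range(2)
    for r in range(3)
}
frame0 = sorted(
    texel_index(b, r, 0, frame_count=3) for b in range(2) for r in range(3)
)
# Distance between frame 0 and frame 1 of the same bone and row, traditional layout.
traditional_gap = traditional_index(0, 0, 1, bone_count=2) - traditional_index(
    0, 0, 0, bone_count=2
)
```

In the continuous layout the gap between consecutive frames of one bone row is always one texel, whereas in the traditional layout it equals the number of bones multiplied by the rows per bone.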
[0025] Continuing to refer to Figure 1, Step S103: Based on the identifier of the animation segment to be played and the animation playback parameters, the graphics processor is used to perform bilinear filtering and texture coordinate offset sampling on the animation texture to obtain skeletal transformation data.
[0026] It should be noted that existing solutions, because they store textures sequentially frame by frame, leave adjacent frames of the same bone far apart within the animation texture. This prevents hardware bilinear interpolation from being used for inter-frame blending and forces reliance on point sampling to read single-frame data precisely. To achieve smooth inter-frame transitions, existing solutions must perform two texture sampling operations in the shader, reading bone data from frame f and frame f+1 respectively, followed by a linear interpolation calculation: the result is one minus the interpolation factor, multiplied by the data from frame f, plus the interpolation factor multiplied by the data from frame f+1. Furthermore, since the transformation matrix of each bone is typically split into multiple rows, and each row requires its own sampling and interpolation, the actual number of sampling and interpolation operations grows in proportion to the number of matrix rows. This approach, requiring two texture sampling operations and one linear interpolation operation per row, increases the vertex shader's instruction count and texture bandwidth consumption.
[0027] In step S103 above, since the determined animation texture uses the improved storage structure from step S102, the operation is completely different when using the graphics processor to perform sampling to obtain skeletal transformation data: during sampling, only a horizontal offset needs to be applied to the sampling start position according to the inter-frame interpolation factor, so that the sampling point is located between pixels of two adjacent frames, and the texture filtering hardware of the graphics processor automatically completes the linear blending of data from two adjacent frames. A single texture sampling simultaneously completes data reading and inter-frame interpolation, and the shader does not need to perform additional interpolation calculations. In large-scale character rendering scenes, this optimization significantly reduces the number of texture samplings and the amount of computational instructions, effectively reducing the load on the graphics processor. That is, compared to existing solutions that require two samplings and manual interpolation calculations, this application reduces the number of samplings and transfers the interpolation calculation to the texture unit hardware, freeing up the computing resources of the shader core, resulting in a significant performance advantage in large-scale character rendering.
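The equivalence between one offset sample with linear filtering and the two-sample manual interpolation can be checked with a small CPU-side emulation. This is only a sketch under stated assumptions: `sample_linear` imitates the horizontal part of hardware filtering for a single texture row, texel i is assumed to be centered at (i + 0.5) / width as in common graphics APIs, addressing is clamped to the edge, and all function names are hypothetical.

```python
import math


def sample_linear(row, u):
    """Emulate linear filtering along one texture row at normalized coordinate u."""
    width = len(row)
    x = u * width - 0.5                      # position measured from texel centers
    x = min(max(x, 0.0), width - 1.0)        # clamp-to-edge addressing
    i0 = int(math.floor(x))
    i1 = min(i0 + 1, width - 1)
    w = x - i0                               # blend weight toward the right texel
    return tuple((1.0 - w) * a + w * b for a, b in zip(row[i0], row[i1]))


def frame_coordinate(first_texel, frame, t, width):
    """Coordinate that places the sample between `frame` and `frame + 1`."""
    return (first_texel + frame + 0.5 + t) / width


def manual_lerp(a, b, t):
    """Two-sample approach: blend frame f and frame f+1 in the shader."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


# One row of one bone across three frames (values are arbitrary test data).
row = [(0.0, 0.0, 0.0, 1.0), (1.0, 2.0, 3.0, 1.0), (2.0, 4.0, 6.0, 1.0)]
t = 0.25
single_sample = sample_linear(row, frame_coordinate(0, 0, t, len(row)))
two_samples = manual_lerp(row[0], row[1], t)
```

With an interpolation factor of 0.25, both paths return the same blended value, which is the property the offset sampling relies on. In a two-dimensional texture the vertical coordinate would additionally need to sit on a row center so that no vertical blending occurs; that detail is outside this sketch.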
[0028] It's important to note that graphics processing units (GPUs) typically consist of two key hardware components: texture units (texture filtering hardware) and shader cores. Bilinear filtering is a fundamental capability of texture units, designed to address pixelation issues when images are magnified. When a texture image is magnified, a single screen pixel might correspond to a non-integer coordinate within the texture. Directly taking the nearest pixel value would result in noticeable jagged edges or a pixelated effect. Bilinear filtering, by reading the surrounding four pixels and weighted averaging, smooths the edges of the magnified image, creating a more natural visual effect. This application, through continuous inter-frame arrangement, ensures that adjacent frames of the same bone are physically adjacent within the texture, migrating bilinear filtering from image scaling scenarios to animation inter-frame interpolation scenarios, thus achieving cross-scenario reuse of hardware capabilities.
[0029] Continuing to refer to Figure 1, Step S104: Use the skeletal transformation data to perform skinning transformation on each bone node of the character model to generate the corresponding character animation.
[0030] After obtaining the skeletal transformation data in step S103, the vertices of the character model can be skinned using this data. Specifically, for each vertex of the character model, the skeletal transformation data of at least one associated bone and the corresponding bone weights can be obtained. The transformation data of each bone is applied to the original position of the vertex to obtain the transformed position under the influence of each bone. Then, the transformed positions are weighted and summed according to the bone weights of each bone to obtain the skinned vertex position. After performing the above operations on all vertices, a smooth animation of the corresponding animation clip at the current moment can be generated.
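The weighted skinning described above can be illustrated with a short sketch. It assumes each bone's transformation data is a 3x4 affine matrix stored as three rows of four values; the function names and the sample data are hypothetical.

```python
def transform_point(matrix, point):
    """Apply a 3x4 affine matrix (three rows of four values) to a 3D point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)


def skin_vertex(position, bone_indices, bone_weights, bone_matrices):
    """Weighted sum of the vertex position as transformed by each associated bone."""
    out = [0.0, 0.0, 0.0]
    for index, weight in zip(bone_indices, bone_weights):
        moved = transform_point(bone_matrices[index], position)
        for axis in range(3):
            out[axis] += weight * moved[axis]
    return tuple(out)


identity = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0))
shift_x = ((1, 0, 0, 2), (0, 1, 0, 0), (0, 0, 1, 0))   # translate by 2 along x
skinned = skin_vertex((1.0, 1.0, 1.0), [0, 1], [0.5, 0.5], [identity, shift_x])
```

A vertex influenced equally by a stationary bone and a bone shifted by two units ends up shifted by one unit, as expected from the weighted sum.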
[0031] In summary, this application simplifies the operation that requires two texture samplings and one manual interpolation in the prior art to one sampling by changing the storage structure of the animation texture and cooperating with the bilinear filtering capability of the graphics processor. This reduces the bandwidth and computational burden of the GPU, thereby enabling smooth transitions of low frame rate animations without increasing storage overhead.
[0032] Based on the aforementioned solution, and referring to Figure 2, in some implementations of this application, the animation texture in the texture resource is pre-generated through the following steps: Step S201: Obtain the mesh data of the target character model, the mesh data including the bone index and bone weight information associated with each vertex; Step S202: Traverse the bone weight information of all vertices, count the number of vertices actually affected by each bone, determine the bones with zero affected vertices as redundant bones and remove them, and obtain a list of effective bones; Step S203: Sample the animation clips of the target character model at a first preset frame rate according to the list of effective bones, sampling only the transformation data of each bone in the list of effective bones, to generate the corresponding animation texture.
[0033] Understandably, this implementation reduces texture storage space while ensuring animation quality by eliminating redundant bones that do not affect model deformation, so that the subsequently generated animation textures only store the transformation data of the effective bones.
[0034] Specifically, the process begins by acquiring the mesh data of the target character model. This mesh data includes the bone index and bone weight information associated with each vertex. Then, the bone weight information of all vertices is traversed, and the number of vertices actually affected by each bone is counted. Bones with zero affected vertices are identified as redundant and removed, resulting in a list of effective bones. Finally, based on the list of effective bones, animation clips of the target character model are sampled at a first preset frame rate, sampling only the transformation data of each bone in the list of effective bones to generate the corresponding animation textures.
[0035] In practical applications, a character model typically contains dozens of bones, of which about 30% to 40% do not participate in any vertex skinning calculations. By eliminating these redundant bones using this implementation method, the size of the animation texture can be reduced proportionally, effectively reducing memory usage.
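Steps S201 to S202 can be sketched as follows. The per-vertex data layout, the rule that a weight greater than zero counts as "affected", and the returned index remapping are assumptions made for illustration; the application only specifies that bones with zero affected vertices are removed.

```python
def effective_bones(vertex_bone_indices, vertex_bone_weights, bone_count):
    """Return the bones that affect at least one vertex, plus an old-to-new index map."""
    affected = [0] * bone_count
    for indices, weights in zip(vertex_bone_indices, vertex_bone_weights):
        # Count each bone at most once per vertex, ignoring zero weights.
        for bone in {b for b, w in zip(indices, weights) if w > 0.0}:
            affected[bone] += 1
    kept = [b for b in range(bone_count) if affected[b] > 0]
    remap = {old: new for new, old in enumerate(kept)}   # assumed helper for re-indexing
    return kept, remap


# Four bones; bone 1 is never referenced and bone 3 only appears with weight zero.
indices = [(0, 2), (2, 3)]
weights = [(0.7, 0.3), (1.0, 0.0)]
kept, remap = effective_bones(indices, weights, bone_count=4)
```

Here only bones 0 and 2 survive, so the baked texture would hold two bones' worth of data instead of four.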
[0036] Based on the aforementioned solution, and referring to Figure 3, in some implementations of this application, the animation texture in the texture resource is pre-generated through the following steps: Step S301: Sample the target animation segment at a second preset frame rate to obtain the transformation matrix of each bone under multiple sampling frames; Step S302: Split the transformation matrix into multiple rows, with each row of data serving as the pixel value of a texture pixel; Step S303: Write each pixel value into a texture pixel array such that the same row of data of the same bone is arranged consecutively in frame order; Step S304: Generate an animation texture corresponding to the target animation segment based on the texture pixel array, and set the filtering mode to bilinear filtering.
[0037] Understandably, this implementation method samples the animation data frame by frame and writes it into the texture in a sequential manner according to the same bone and the same row of data, so that adjacent frame data are physically adjacent in the texture, laying the data foundation for subsequent automatic interpolation of a single sample using the bilinear filtering of the graphics processor.
[0038] Specifically, the target animation clip is first sampled at a second preset frame rate to obtain the transformation matrix of each bone across multiple sampled frames. This transformation matrix represents the complete transformation of the bones from the initial pose of the character model to the current animation pose. The transformation matrix is then split into multiple rows, with each row representing a pixel value for a texture pixel. During this process, pixel values are written into the texture pixel array in a frame-by-frame sequential manner, corresponding to the same row of data for the same bone. For example, for the zeroth row of data for a bone, its pixel values in frames zero, one, two, and up to the last frame are consecutively adjacent in the texture pixel array. Finally, an animation texture corresponding to the target animation clip is generated based on the texture pixel array, and the filtering mode is set to bilinear filtering.
[0039] Based on the aforementioned scheme, in some implementations of this application, the step of splitting the transformation matrix into multiple rows includes: obtaining the analysis results of whether the target animation segment contains bone scaling transformation; if the target animation segment contains bone scaling transformation, splitting the transformation matrix of each bone into three rows; if the target animation segment does not contain bone scaling transformation, splitting the transformation matrix of each bone into two rows: rotation quaternion and translation vector.
[0040] Understandably, this implementation detects whether the animation clip contains skeletal scaling transformations and adaptively selects different storage formats. When there is no scaling, it adopts a more compact representation, thereby saving texture storage space.
[0041] Specifically, the analysis results for whether the target animation clip contains skeletal scaling transformations are first obtained. If the target animation clip contains skeletal scaling transformations, the transformation matrix of each bone is split into three rows, each corresponding to a pixel in the texture, and the three rows of matrix data are completely saved. If the target animation clip does not contain skeletal scaling transformations, the transformation data of each bone is split into two rows: rotation quaternions and translation vectors, each corresponding to a texture pixel. The rotation quaternion uses four values to represent the bone's rotation information, and the translation vector uses three values to represent the bone's position information.
[0042] It should be noted that many animation clips, such as walking and running, only involve rotation and translation, without scaling transformations. Traditional solutions uniformly use a matrix format for storage, with each bone occupying three pixels, resulting in data redundancy. This implementation uses a compact format of quaternions plus translation vectors for animations without scaling, with each bone occupying only two pixels, saving approximately 33% of texture storage space.
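The format selection and the resulting saving can be illustrated with a small sketch. The application does not say how the scaling analysis is performed; the check below, which tests whether each axis of the 3x3 part of a 3x4 matrix has unit length, is one assumed way to do it, and all names and sample sizes are hypothetical.

```python
import math


def has_scaling(matrix, tolerance=1e-4):
    """True if any column of the 3x3 part of a 3x4 matrix is not unit length."""
    for col in range(3):
        length = math.sqrt(sum(matrix[row][col] ** 2 for row in range(3)))
        if abs(length - 1.0) > tolerance:
            return True
    return False


def rows_per_bone(clip_matrices):
    """3 rows (matrix format) if any sampled matrix scales, else 2 (quaternion + translation)."""
    return 3 if any(has_scaling(m) for frame in clip_matrices for m in frame) else 2


def texel_count(bone_count, frame_count, rows):
    """Total texels needed for one clip."""
    return bone_count * frame_count * rows


rotate_z = ((0.0, -1.0, 0.0, 5.0), (1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0))
stretch_x = ((2.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0))
walk_rows = rows_per_bone([[rotate_z]])              # rotation and translation only
grow_rows = rows_per_bone([[rotate_z, stretch_x]])   # contains a scaled bone
saving = 1.0 - texel_count(40, 30, 2) / texel_count(40, 30, 3)
```

Going from three texels per bone to two removes one third of the texels, which matches the figure of approximately 33% given above.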
[0043] Based on the aforementioned scheme, in some implementations of this application, the step of sampling the target animation segment at a preset frame rate to obtain the transformation matrix of each bone in multiple sampling frames includes: acquiring the original animation data, effective bone list, and number of sampling frames of the target animation segment; sequentially traversing each sampling frame, and for each sampling frame, reading the transformation matrix of each bone in the effective bone list at the current sampling time to obtain the transformation matrix of each bone in each sampling frame. Wherein, if the target animation segment is a loop animation, the sampling time of the last sampling frame is set as the time corresponding to the first sampling frame.
[0044] Understandably, this implementation discretizes the continuous animation curve into sampling points with a fixed frame rate and performs start-end stitching for looping animations to ensure the continuity of inter-frame interpolation during loop playback. Specifically, if the target animation segment is a loop, the sampling time of the last sampling frame is set to the time corresponding to the first sampling frame, making the transformation matrix of the last sampling frame the same as that of the first sampling frame. This ensures that when the animation loops, the data of adjacent frames is identical during the transition from the last frame to the first frame, and the bilinear filtering mixture remains correct, guaranteeing smooth loop playback.
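The start-end stitching for loop animations can be sketched as a choice of sampling times. The function name and the assumption that frame i is sampled at time i divided by the frame rate are illustrative; the application only states that, for a loop, the last sampling frame uses the time of the first sampling frame.

```python
def sample_times(frame_count, frame_rate, loop):
    """Sampling time in seconds for each baked frame."""
    times = [frame / frame_rate for frame in range(frame_count)]
    if loop and frame_count > 1:
        # Last frame repeats the first frame's time, so its pose equals the first pose.
        times[-1] = times[0]
    return times


loop_times = sample_times(4, 15, loop=True)
once_times = sample_times(4, 15, loop=False)
```

Because the last baked texel of each row then holds the same data as the first, blending toward the end of the row arrives exactly at the pose with which the next loop iteration begins.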
[0045] Based on the aforementioned scheme, in some implementations of this application, the step of writing each pixel value into the texture pixel array such that the same row of data of the same bone is arranged consecutively in frame order includes: writing each pixel value into the texture pixel array in the order of first traversing the bones, then traversing each row of each bone, and finally traversing each sampling frame, so that the pixel values of the same row of data of the same bone under different sampling frames are arranged consecutively and adjacently in the texture pixel array, and the data of different bones and the data of different rows are kept separate from each other when writing.
[0046] It is understandable that this implementation specifically describes the writing order of pixel values. This writing order allows the pixel values of the same row of data for the same bone to be arranged consecutively and adjacently in the texture pixel array under different sampling frames, creating data conditions for subsequent inter-frame interpolation using the texture sampling hardware of the graphics processor.
[0047] Specifically, the writing process proceeds in the following order: first, it iterates through the bones; then, through each row of each bone; and finally, through each sampling frame. Taking the first bone as an example, the pixel values of the zeroth row of that bone across all sampling frames are written first, i.e., frame 0, frame 1, frame 2, and so on, in sequence. Next, the pixel values of the first row of that bone across all sampling frames are written, also in sequence. In matrix mode, the second row is then written in the same way before the next bone is processed. During this process, data from different bones and different rows are written separately and do not mix. In simpler terms, for any row of any bone, the data of frame 0 and frame 1 are adjacent in the pixel array, as are the data of frame 1 and frame 2, and so on.
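The writing order described above can be sketched as follows. The names `write_pixels` and `pixel_index`, and the nested-list input layout `samples[frame][bone][row]`, are hypothetical and chosen only for illustration.

```python
def pixel_index(bone, row, frame, rows_per_bone, frame_count):
    """Index of one pixel in the one-dimensional texture pixel array."""
    return (bone * rows_per_bone + row) * frame_count + frame


def write_pixels(samples, rows_per_bone):
    """Flatten samples[frame][bone][row] in bone -> row -> frame order."""
    frame_count = len(samples)
    bone_count = len(samples[0])
    pixels = []
    for bone in range(bone_count):              # outermost: bones
        for row in range(rows_per_bone):        # then: rows of that bone
            for frame in range(frame_count):    # innermost: sampling frames
                pixels.append(samples[frame][bone][row])
    return pixels


# Label every pixel with (frame, bone, row) to make the layout visible.
samples = [[[(f, b, r) for r in range(2)] for b in range(3)] for f in range(4)]
pixels = write_pixels(samples, rows_per_bone=2)
print(pixels[:4])  # row 0 of bone 0 for frames 0..3, stored contiguously
```

Because the frame loop is innermost, moving one pixel to the right within a bone row always means moving to the next sampling frame.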
[0048] Based on the aforementioned scheme, in some implementations of this application, the texture pixel array is a one-dimensional array; the step of generating an animation texture corresponding to the target animation segment based on the texture pixel array includes: determining the texture width and texture height based on the total number of pixels in the texture pixel array; creating a two-dimensional texture with the texture width and texture height; and filling the texture pixel array into the two-dimensional texture in row order to generate an animation texture corresponding to the target animation segment.
[0049] Understandably, this implementation maps a one-dimensional texture pixel array to a two-dimensional texture, enabling the graphics processor to efficiently access the stored bone transformation data in two-dimensional texture coordinates.
[0050] Specifically, the texture pixel array is a one-dimensional array in which all pixel values are stored in the order of traversing the bones, then traversing each row of each bone, and finally traversing each sampling frame. When generating the two-dimensional texture, the texture width and texture height are first determined based on the total number of pixels in the texture pixel array. The texture width is usually a power of two to meet the hardware access requirements of the graphics processor, and the texture height is determined based on the ratio of the total number of pixels to the width. Then, a two-dimensional texture with this texture width and texture height is created, in a format with a 16-bit floating-point number per channel (RGBAHalf), and the one-dimensional texture pixel array is filled into the two-dimensional texture row by row, that is, after the first row is filled, the second row is filled, and so on. The texture filtering mode is set to bilinear filtering, and the boundary mode is set to clamp.
[0051] It should be noted that 2D textures are the standard format for graphics processor sampling. This implementation converts 1D data arranged according to continuous inter-frame rules into 2D textures, allowing the graphics processor to quickly locate and sample the required bone data at runtime using texture coordinates, providing hardware support for subsequent inter-frame interpolation using bilinear filtering.
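The mapping from the one-dimensional array to a two-dimensional texture can be sketched as below. The rule for choosing the width (the smallest power of two whose square covers all pixels, capped at a hypothetical `max_width`) and the zero padding of the unused tail are assumptions; the application only states that the width is usually a power of two and that the height follows from the ratio of the pixel count to the width.

```python
import math


def texture_size(total_pixels, max_width=2048):
    """Pick a power-of-two width and the height needed to hold all pixels."""
    width = 1
    while width * width < total_pixels and width < max_width:
        width *= 2
    height = max(1, math.ceil(total_pixels / width))
    return width, height


def fill_texture(pixels, width, height, pad=(0.0, 0.0, 0.0, 0.0)):
    """Fill the texture row by row; unused trailing texels receive `pad`."""
    padded = list(pixels) + [pad] * (width * height - len(pixels))
    return [padded[y * width:(y + 1) * width] for y in range(height)]


# Ten placeholder pixels (integers stand in for RGBA values).
width, height = texture_size(10)
texture = fill_texture(list(range(10)), width, height)
print(width, height)
print(texture[2])
```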
[0052] Based on the aforementioned scheme, in some implementations of this application, after generating the animation texture corresponding to the target animation segment, the method further includes: extending a column of pixels to the right of the animation texture corresponding to the target animation segment to obtain an extended column, and, for each row, filling the first pixel of the next row into the extended column of that row to generate the animation texture with extended boundaries.
[0053] It should be noted that when a one-dimensional texture pixel array is filled into a two-dimensional texture, previously contiguous data may wrap onto a new row. That is, when a row of pixels is full, the next pixel is placed at the beginning of the next row. When the graphics processor uses bilinear filtering to sample the last pixel of a row, the texture filtering hardware attempts to blend it with the pixel to its right. However, this adjacent position may hold no valid data, or the clamp boundary mode may cause it to repeat the value of the last pixel in the current row. In reality, the data for the next frame is located at the first pixel of the next row, not to the right of the last pixel in the current row.
[0054] Therefore, to address the issue of bilinear filtering reading erroneous data when the sampling point is at the end of a texture row, this implementation extends the right boundary of the animated texture. Thus, when the sampling point is at the end of a row, the extended right column provides the correct data for the first pixel of the next row, enabling bilinear filtering to correctly blend pixels from adjacent rows. This ensures accurate inter-frame interpolation at any position on the entire texture, thereby eliminating visual jump artifacts caused by texture line breaks.
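The boundary extension can be sketched as follows, using a texture represented as a list of rows. The function name `extend_right_column` is hypothetical, and the handling of the last row (which has no next row) is an assumption, since the application does not describe that case.

```python
def extend_right_column(texture):
    """Append one column; each row receives the first pixel of the next row."""
    extended = []
    for y, row in enumerate(texture):
        if y + 1 < len(texture):
            extra = texture[y + 1][0]   # first pixel of the next row
        else:
            # Assumption: the last row repeats its own last pixel,
            # which matches what clamp addressing would return anyway.
            extra = row[-1]
        extended.append(list(row) + [extra])
    return extended


texture = [[0, 1, 2], [3, 4, 5]]
print(extend_right_column(texture))
```

Note that after the extension the texture is one texel wider, so any conversion from pixel indices to UV coordinates would presumably need to use the extended width.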
[0055] Based on the aforementioned solution, please refer to Figure 4. In some implementations of this application, the step of using a graphics processor to perform bilinear filtering and texture coordinate offset sampling on the animation texture according to the identifier of the animation segment to be played and the animation playback parameters to obtain bone transformation data includes: Step S401: Obtain the starting position and frame rate information from the animation configuration information of the corresponding animation segment according to the identifier of the animation segment to be played; Step S402: Obtain the global animation time, calculate the current floating-point frame number according to the global animation time and the playback start time in the animation playback parameters, combined with the frame rate, and decompose the current floating-point frame number into an integer frame number and an inter-frame interpolation factor; Step S403: Determine the sampling starting position according to the integer frame number, the starting position, and the bone number of the target bone; Step S404: Apply a horizontal offset to the sampling starting position according to the inter-frame interpolation factor to obtain the offset sampling position; Step S405: Perform a single texture sampling at the offset sampling position, and use the texture filtering hardware of the graphics processor to linearly mix the data of two adjacent frames according to the ratio of the inter-frame interpolation factor to obtain weighted mixed bone transformation data.
[0056] Understandably, this implementation converts the inter-frame interpolation factor into a horizontal offset of the texture coordinates, ensuring that the sampling point falls between pixels in two adjacent frames. This triggers the graphics processor hardware to automatically complete inter-frame data blending, achieving simultaneous data reading and interpolation in a single sampling. In other words, this implementation completes the reading of data from two adjacent frames and inter-frame interpolation simultaneously with a single texture sampling, replacing the traditional approach that requires two texture samplings and one manual linear interpolation operation. This reduces the number of vertex shader instructions and texture bandwidth consumption.
[0057] Specifically, firstly, based on the identifier of the animation segment to be played, the starting position and frame rate information are obtained from the animation configuration information of the corresponding animation segment. The starting position indicates the storage starting point of the animation segment in the animation texture, and the frame rate is used to convert the playback time into a frame number. Then, the global animation time is obtained, and based on the global animation time and the playback start time in the animation playback parameters, the current floating-point frame number is calculated in combination with the frame rate. The calculation formula is: Current floating-point frame number = (Global animation time - Playback start time) × Frame rate. Next, the current floating-point frame number is decomposed into an integer frame number and an inter-frame interpolation factor. The integer frame number represents the current frame number, and the inter-frame interpolation factor represents the progress between the current frame and the next frame.
[0058] Based on this, the sampling start position is determined according to the integer frame number, the start position, and the bone number of the target bone (i.e., the pixel index corresponding to the target bone in that frame is calculated and converted into texture UV coordinates, which are two-dimensional coordinates in the range of 0-1 on the texture). Here, the sampling start position points to the pixel where the data of the bone is located in the frame corresponding to the integer frame number. Then, a horizontal offset is applied to the sampling start position according to the inter-frame interpolation factor (i.e., the inter-frame interpolation factor is multiplied by the width of one pixel in the texture to obtain a small UV horizontal offset), resulting in the offset sampling position (the UV horizontal offset is added to the previously calculated texture UV coordinates). Since the data of the same row of the same bone is stored continuously in frame order, and the data of adjacent frames are physically adjacent in the texture, the offset sampling position lies exactly between the pixels of two adjacent frames. Finally, a single texture sampling is performed at this offset sampling position. The texture filtering hardware of the graphics processor automatically performs linear mixing of the data of two adjacent frames according to the ratio of the inter-frame interpolation factor, returning the weighted mixed bone transformation data.
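The sampling procedure of steps S402 to S405 can be illustrated with a simplified software stand-in for the texture filtering hardware. This sketch assumes a single texture row, texel centers at (i + 0.5) / width, clamp addressing, and a frame number that stays inside the clip; the names `split_frame`, `sample_linear`, `sample_bone_row`, and `start_x` are hypothetical.

```python
import math


def split_frame(global_time, start_time, frame_rate):
    """Floating-point frame number -> (integer frame, interpolation factor)."""
    frame_f = (global_time - start_time) * frame_rate
    frame_i = math.floor(frame_f)
    return frame_i, frame_f - frame_i


def sample_linear(row_pixels, u, width):
    """Emulate linear filtering along one row with clamp addressing."""
    x = u * width - 0.5                      # continuous texel coordinate
    x0 = math.floor(x)
    t = x - x0                               # weight of the right-hand texel
    left = row_pixels[min(max(x0, 0), width - 1)]
    right = row_pixels[min(max(x0 + 1, 0), width - 1)]
    return tuple((1 - t) * a + t * b for a, b in zip(left, right))


def sample_bone_row(row_pixels, width, start_x, frame_i, factor):
    """One sample that blends frame_i and frame_i + 1 of a bone row."""
    u = (start_x + frame_i + 0.5) / width    # center of the integer frame
    u += factor / width                      # offset: factor * one texel width
    return sample_linear(row_pixels, u, width)


# One bone row with a single channel per texel, frames 0..3.
row = [(0.0,), (10.0,), (20.0,), (30.0,)]
frame_i, factor = split_frame(global_time=1.375, start_time=1.0, frame_rate=4)
print(frame_i, factor)
print(sample_bone_row(row, 4, 0, frame_i, factor))
```

On a graphics processor the body of `sample_linear` would be replaced by one hardware texture fetch; the emulation only makes the blend weights visible.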
[0059] Based on the aforementioned scheme, in some implementations of this application, the processing method further includes: when switching from the first animation to the second animation, obtaining the first bone transformation data corresponding to the first animation and the second bone transformation data corresponding to the second animation; and performing linear interpolation on the vertex positions after skinning based on the first bone transformation data and the vertex positions after skinning based on the second bone transformation data according to a preset blending progress value to obtain the vertex positions after transition blending.
[0060] Understandably, this implementation uses linear interpolation of the skinning results of two animations during animation transitions to ensure smooth character movement and avoid visual abrupt changes caused by direct switching. In other words, this implementation provides a smooth, gradual transition when the character switches from one action to another, avoiding the visually unnatural abrupt changes in traditional methods and improving the quality of animation performance.
[0061] Specifically, when switching from the first animation to the second animation, the first bone transformation data corresponding to the first animation and the second bone transformation data corresponding to the second animation are first obtained. These two sets of bone transformation data are obtained through the aforementioned steps S102-S103 and represent the poses of each bone in the first and second animations at the current moment. Then, a preset blending progress value is obtained, which gradually changes from zero to one over time to control the weight ratio of the two animations during the transition. Finally, linear interpolation is performed on the vertex positions after skinning based on the first bone transformation data and the vertex positions after skinning based on the second bone transformation data according to the preset blending progress value to obtain the vertex positions after the transition blending. When the blending progress value is zero, the vertex positions are completely determined by the first animation; when the blending progress value is one, the vertex positions are completely determined by the second animation; and at intermediate values, the two are blended proportionally.
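The transition blending can be sketched as a per-vertex linear interpolation. This sketch assumes the progress value runs from zero (first animation only) to one (second animation only), in line with a switch from the first animation to the second; the function name `blend_vertices` is hypothetical, and the skinned vertex positions are taken as given inputs.

```python
def blend_vertices(first_positions, second_positions, progress):
    """Linearly interpolate two lists of skinned vertex positions."""
    return [
        tuple((1 - progress) * a + progress * b for a, b in zip(p, q))
        for p, q in zip(first_positions, second_positions)
    ]


first = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]     # skinned with the first animation
second = [(4.0, 0.0, 0.0), (1.0, 6.0, 3.0)]    # skinned with the second animation
print(blend_vertices(first, second, 0.25))
```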
[0062] It will be apparent to those skilled in the art that this application is not limited to the details of the exemplary embodiments described above, and that this application can be implemented in other specific forms without departing from the spirit or essential characteristics of this application. Therefore, the embodiments should be considered illustrative and non-limiting in all respects, and the scope of this application is defined by the appended claims rather than the foregoing description. Thus, all variations falling within the meaning and scope of equivalents of the claims are intended to be included within this application. No reference numerals in the claims should be construed as limiting the scope of the claims.
Claims
1. A method for processing skeletal animation data, characterized in that the method includes: obtaining the identifier of the animation segment to be played and the animation playback parameters of the character model, wherein the character model includes at least one bone; determining, according to the identifier of the animation segment to be played, the corresponding animation texture from the texture resource, wherein the filtering mode of the animation texture is bilinear filtering, the animation texture stores the transformation data of each bone of the corresponding animation segment in each frame, the transformation data of each bone is split into multiple rows, each row corresponds to a texture pixel, and the same row of data of the same bone is stored continuously in adjacent pixel positions in frame order; performing, based on the identifier of the animation segment to be played and the animation playback parameters, bilinear filtering and texture coordinate offset sampling on the animation texture using a graphics processor to obtain bone transformation data; and using the bone transformation data to perform skinning transformation on each bone node of the character model to generate the corresponding character animation.
2. The processing method according to claim 1, characterized in that the animation textures in the texture resource are pre-generated through the following steps: obtaining the mesh data of the target character model, wherein the mesh data includes the bone index and bone weight information associated with each vertex; iterating through the bone weight information of all vertices, counting the number of vertices actually affected by each bone, and identifying bones with zero affected vertices as redundant bones and removing them to obtain an effective bone list; and, based on the effective bone list, sampling the animation clips of the target character model at a first preset frame rate, wherein only the transformation data of each bone in the effective bone list is sampled, to generate the corresponding animation texture.
3. The processing method according to claim 1, characterized in that, The animated textures in the texture resource are pre-generated through the following steps: The target animation clip is sampled at a second preset frame rate to obtain the transformation matrix of each bone in multiple sampled frames; The transformation matrix is split into multiple rows, with each row containing the pixel value of a texture pixel; The pixel values are written into the texture pixel array in a frame-by-frame manner, with the data of the same row of the same bone arranged consecutively. An animation texture corresponding to the target animation clip is generated based on the texture pixel array, and the filtering mode is set to bilinear filtering.
4. The processing method according to claim 3, characterized in that, The step of splitting the transformation matrix into multiple rows includes: Obtain the analysis results of whether the target animation clip contains skeletal scaling transformation; If the target animation clip contains skeletal scaling transformations, the transformation matrix of each bone is split into three rows; If the target animation clip does not contain bone scaling transformation, the transformation matrix of each bone is split into two rows: rotation quaternion and translation vector.
5. The processing method according to claim 3, characterized in that, The step of sampling the target animation clip at a preset frame rate to obtain the transformation matrix of each bone in multiple sampled frames includes: Obtain the original animation data, valid bone list, and sampled frame number of the target animation clip; Each sampling frame is traversed sequentially. For each sampling frame, the transformation matrix of each bone in the list of valid bones at the current sampling time is read to obtain the transformation matrix of each bone in each sampling frame. If the target animation segment is a loop animation, then the sampling time of the last sampling frame is set to the time corresponding to the first sampling frame.
6. The processing method according to claim 3, characterized in that the step of writing each pixel value into the texture pixel array according to the method of arranging the data of the same row of the same bone in frame order includes: writing the pixel values into the texture pixel array in sequence, in the order of first traversing the bones, then traversing each row of each bone, and finally traversing each sampling frame, so that the pixel values of the same row of data of the same bone under different sampling frames are arranged contiguously in the texture pixel array, and the data of different bones and different rows are kept separate from each other when being written.
7. The processing method according to claim 3, characterized in that, The texture pixel array is a one-dimensional array; the step of generating an animation texture corresponding to the target animation clip based on the texture pixel array includes: The texture width and texture height are determined based on the total number of pixels in the texture pixel array; Create a two-dimensional texture with the stated texture width and texture height, and fill the two-dimensional texture with the texture pixel array in row order to generate an animation texture corresponding to the target animation segment.
8. The processing method according to claim 7, characterized in that, After generating the animation texture corresponding to the target animation clip, the method further includes: extending a column of pixels to the right of the animation texture corresponding to the target animation clip to obtain an extended column, and filling the first pixel of the next row of each row into the extended column to generate the animation texture with extended boundaries.
9. The processing method according to claim 1, characterized in that the step of obtaining bone transformation data by performing bilinear filtering and texture coordinate offset sampling on the animation texture using a graphics processor based on the identifier of the animation segment to be played and the animation playback parameters includes: obtaining, based on the identifier of the animation segment to be played, the starting position and frame rate information from the animation configuration information of the corresponding animation segment; obtaining the global animation time, calculating the current floating-point frame number based on the global animation time and the playback start time in the animation playback parameters, combined with the frame rate, and decomposing the current floating-point frame number into an integer frame number and an inter-frame interpolation factor; determining the sampling start position based on the integer frame number, the starting position, and the bone number of the target bone; applying a horizontal offset to the sampling start position based on the inter-frame interpolation factor to obtain the offset sampling position; and performing a single texture sampling at the offset sampling position, wherein the data of two adjacent frames are linearly mixed according to the ratio of the inter-frame interpolation factor using the texture filtering hardware of the graphics processor to obtain the weighted mixed bone transformation data.
10. The processing method according to claim 1, characterized in that, The processing method further includes: When switching from the first animation to the second animation, obtain the first bone transformation data corresponding to the first animation and the second bone transformation data corresponding to the second animation; Based on a preset blending progress value, linear interpolation is performed on the vertex positions after skinning based on the first bone transformation data and the vertex positions after skinning based on the second bone transformation data to obtain the vertex positions after transition blending.