Texture encoding method and apparatus, texture decoding method and apparatus, and device, storage medium and program product

By using a neural network model for texture encoding and decoding and reusing feature data and network parameters, the problem of low texture compression rate is solved, achieving more efficient texture data compression and reducing network transmission and storage requirements.

WO2026123805A1 · PCT designated stage · Publication Date: 2026-06-18 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: HUAWEI TECH CO LTD
Filing Date: 2025-09-02
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Existing texture compression technologies have limitations in improving compression rates, resulting in texture data consuming a large amount of bandwidth and space during network transmission and storage. In particular, the encoding efficiency is low and there is redundancy in multi-texture scenarios.

Method used

A neural network model is used for texture encoding and decoding. By reusing feature data and network parameters, and combining the data correlation within and between multiple textures, efficient texture data compression is achieved, reducing the bitstream size.

Benefits of technology

While ensuring reconstruction reliability, the size of the texture data bitstream is significantly reduced, network transmission bandwidth and storage space usage are decreased, and coding efficiency is improved.

Generated by Eureka AI based on patent content.


Abstract

The present application belongs to the technical field of encoding and decoding. Disclosed are a texture encoding method and apparatus, a texture decoding method and apparatus, and a device, a storage medium and a program product. The encoding method comprises: encoding one or more textures, so as to obtain feature data of the one or more textures, wherein the feature data includes first feature data of a first texture, and the first feature data is part of feature data of the first texture; and encoding the first feature data into a bitstream. The decoding method comprises: parsing a bitstream to obtain feature data of one or more textures, wherein the feature data includes first feature data of a first texture; and acquiring second feature data, and on the basis of the second feature data, decoding the first feature data, so as to obtain the first texture. By means of the method, part of feature data of a first texture is encoded into a bitstream, thereby improving a texture compression rate, and reducing the occupation of a network transmission bandwidth and a storage space by the bitstream.

Description

Texture encoding and decoding methods, apparatus, devices, storage media, and program products

[0001] This application claims priority to Chinese Patent Application No. 202411855608.2, filed on December 13, 2024, entitled “Texture Encoding / Decoding Method, Apparatus, Device, Storage Medium and Program Product”, the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of encoding and decoding technology, and in particular to a method, apparatus, device, storage medium and program product for encoding and decoding textures.

Background Technology

[0003] Currently, the applications of textures are becoming increasingly diverse. For example, in three-dimensional (3D) graphics and games, texture mapping can be used to render images and obtain high-quality rendering results. As the number of textures increases, the bandwidth required for network transmission and the storage space required also increase. Based on this, texture compression (also known as texture encoding / decoding) technology has emerged. The higher the texture compression ratio, the smaller the compressed bitstream, and the less bandwidth and storage space is required for network transmission. Therefore, improving texture compression ratio is a key research focus in the industry.

Summary of the Invention

[0004] This application provides a texture encoding / decoding method, apparatus, device, storage medium, and program product, which can improve texture compression rate and reduce the bitstream's impact on network transmission bandwidth and storage space. The technical solution is as follows:

[0005] In a first aspect, a method for decoding a texture is provided, the method comprising: parsing a bitstream to obtain feature data of one or more textures, the one or more textures including a first texture, the feature data of the one or more textures including first feature data of the first texture; obtaining second feature data, and decoding the first feature data based on the second feature data to obtain the first texture.

[0006] In other words, even if the bitstream includes only part of the feature data of the first texture, the decoding end can reconstruct the first texture by obtaining the second feature data. Thus, while ensuring reconstruction reliability, the bitstream size is reduced, alleviating its impact on network transmission bandwidth and storage space.

[0007] In one possible implementation, the step of decoding the first feature data based on the second feature data to obtain the first texture includes: reconstructing the first feature data based on the second feature data to obtain reconstructed feature data of the first texture; and decoding the reconstructed feature data to obtain the first texture.

[0008] In one possible implementation, the feature data of the one or more textures mentioned above includes one or more of the following: feature map, feature block, weight feature data corresponding to the feature map or feature block, endpoint feature data corresponding to the feature map or feature block, and nonlinear transformation parameters.

[0009] For example, the second feature data is a portion of the feature data of the one or more textures, or the second feature data is a portion of the first feature data, or the second feature data is the feature data of a second texture in the one or more textures.

[0010] In one possible implementation, the second feature data is data obtained from the parsed feature data; or, the second feature data is data obtained after performing any one of a plurality of decoding operations on the parsed feature data; or, the second feature data is feature data in a feature dataset. That is, the reused data (i.e., the second feature data) can be parsed feature data, intermediate data in the decoding process, decoding results, or data in a feature dataset, such as locally generated feature data.

[0011] In one possible implementation, the method further includes: acquiring reuse indication information; acquiring the second feature data includes: acquiring the second feature data based on the reuse indication information. For example, the bitstream may also include the reuse indication information to guide the decoding end in acquiring the second feature data.

[0012] In one possible implementation, the reuse indication information includes a reuse index, which includes an index of the second feature data.

[0013] In one possible implementation, the reuse indication information includes a reuse switch identifier and / or a confirmed reuse identifier, wherein the second feature data is obtained when the reuse switch identifier is on or the reuse indication information includes the confirmed reuse identifier. The reuse switch identifier indicates whether the first feature data can be decoded based on the second feature data, and the confirmed reuse identifier indicates whether decoding of the first feature data based on the second feature data has been confirmed.

[0014] In one possible implementation, the reuse indication information includes a reuse mode, which indicates the mode for reconstructing the first feature data. For example, from the perspective of reused data type, the reuse mode may include one or more of the following: feature map reuse, feature block reuse, endpoint feature data reuse corresponding to a feature map or feature block, weight feature data reuse corresponding to a feature map or feature block, and nonlinear transformation parameter reuse. From the perspective of the size of reused data, the reuse mode may indicate one or more of the following: partial reuse, full reuse, and post-transformation reuse. These are merely examples; the reuse mode may also be a combination of the above reuse modes.
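As a rough illustration of how a decoding end might act on the switch identifier, reuse index, and reuse mode described above, the following Python sketch models them as a small record. The application does not fix a bitstream syntax, so the names `ReuseMode`, `ReuseIndication`, and `acquire_second_feature_data`, and the choice to index into a list of parsed feature data, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class ReuseMode(Enum):
    # Hypothetical labels for the size-based modes listed in [0014].
    PARTIAL = "partial"
    FULL = "full"
    POST_TRANSFORM = "post_transform"


@dataclass
class ReuseIndication:
    # Hypothetical container; field names are assumptions, not bitstream syntax.
    switch_on: bool                     # switch identifier: is reuse enabled?
    reuse_index: Optional[int] = None   # index of the second feature data
    mode: ReuseMode = ReuseMode.FULL    # how the reused data is applied


def acquire_second_feature_data(indication: ReuseIndication,
                                parsed_features: Sequence):
    """Return the feature data to reuse, or None when reuse is switched off."""
    if not indication.switch_on or indication.reuse_index is None:
        return None
    return parsed_features[indication.reuse_index]


# Two parsed feature blocks; the indication points at the second one.
features = [[1, 2], [3, 4]]
reused = acquire_second_feature_data(ReuseIndication(True, 1), features)
```

In this sketch the reused data is looked up among the parsed feature data only; per [0010], it could equally come from intermediate decoding results or from a local feature dataset.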

[0015] In one possible implementation, the second feature data includes some or all of the feature data corresponding to the feature maps of the one or more textures mentioned above. That is, some or all of the feature data corresponding to the feature maps can be reused. The feature map can be one of one or more feature maps of the first texture, or it can be one of one or more feature maps of textures other than the first texture among the one or more textures mentioned above.

[0016] In one possible implementation, the second feature data includes some or all of the nonlinear transformation parameters corresponding to one or more of the aforementioned textures. That is, some or all of the nonlinear transformation parameter data can be reused.

[0017] In one possible implementation, the second feature data is either the feature data of the second texture or the feature data of the first texture. That is, data can be reused between different textures or within the same texture.

[0018] In one possible implementation, reconstructing the first feature data based on the second feature data includes: transforming the second feature data to obtain transformed feature data; and reconstructing the first feature data based on the transformed feature data. That is, the data is transformed and then reused.

[0019] In one possible implementation, the transformation includes one or more of rotation, translation, and mirroring.
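The three transformations named above can be illustrated on a small 2D feature block. This is a minimal sketch under stated assumptions: the block is a list of rows, rotation is by 90 degrees clockwise, translation is a cyclic shift along each row, and the transformed block is reused in full. None of these choices, nor the function names, come from the application.

```python
def rotate90(block):
    """Rotate a 2D block 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]


def mirror_horizontal(block):
    """Mirror a 2D block left to right."""
    return [row[::-1] for row in block]


def translate(block, dx):
    """Shift each row right by dx positions, wrapping around (an assumption)."""
    width = len(block[0])
    return [[row[(x - dx) % width] for x in range(width)] for row in block]


def reconstruct_from_transformed(second_feature_data, transform):
    """Full reuse after transformation: the transformed second feature data
    stands in for the feature data that was left out of the bitstream."""
    return transform(second_feature_data)


second = [[1, 2],
          [3, 4]]
rebuilt = reconstruct_from_transformed(second, rotate90)
```

A decoding end would choose which transformation to apply from the reuse indication information; here the choice is simply passed in as a function.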

[0020] In a second aspect, a texture encoding method is provided, the method comprising: acquiring one or more textures, the one or more textures including a first texture; encoding the one or more textures to obtain feature data of the one or more textures, the feature data of the one or more textures including first feature data of the first texture, the first feature data being partial feature data of the first texture; and encoding the first feature data into a bitstream.

[0021] In other words, the encoding end encodes some feature data of the first texture into the bitstream, which reduces the bitstream size, improves the texture compression rate, and reduces the bandwidth and storage space occupied by the bitstream during network transmission.

[0022] In one possible implementation, the method further includes: encoding reuse indication information into the bitstream, wherein the reuse indication information is used to acquire second feature data, and the second feature data is used to reconstruct the first feature data. That is, the reuse indication information guides the decoding end to decode using data reuse.

[0023] In one possible implementation, the reuse indication information includes a reuse index, which includes an index of the second feature data.

[0024] In one possible implementation, the second feature data is data from the feature data of the one or more textures; or, the second feature data is feature data from a feature dataset. That is, data that can be reused can be selected from the feature data of the currently encoded one or more textures, or data that can be reused can be selected from a feature dataset.

[0025] In one possible implementation, the reuse indication information includes a reuse switch identifier and / or a confirmed reuse identifier. The reuse switch identifier indicates whether reuse is enabled to reconstruct the first feature data, and the confirmed reuse identifier indicates that the reconstruction of the first feature data through reuse has been confirmed.

[0026] In one possible implementation, the feature data of the one or more textures includes one or more of the following: weight feature data, endpoint feature data, and nonlinear transformation parameters.

[0027] In a third aspect, a texture decoding apparatus is provided, which has the function of implementing the texture decoding method behavior described in the first aspect above. The decoding apparatus includes one or more modules for implementing the texture decoding method provided in the first aspect above.

[0028] In a fourth aspect, a texture encoding apparatus is provided, which has the function of implementing the texture encoding method behavior described in the second aspect above. The encoding apparatus includes one or more modules for implementing the texture encoding method provided in the second aspect above.

[0029] In a fifth aspect, a decoding device is provided, comprising a processor and a memory, the memory being used to store a computer program for executing the texture decoding method provided in the first aspect. The processor is configured to execute the computer program stored in the memory to implement the texture decoding method described in the first aspect.

[0030] In one possible implementation, the decoding device may further include a communication bus for establishing a connection between the processor and the memory.

[0031] In a sixth aspect, an encoding device is provided, comprising a processor and a memory, the memory being used to store a computer program for executing the texture encoding method provided in the second aspect above. The processor is configured to execute the computer program stored in the memory to implement the texture encoding method described in the second aspect above.

[0032] In one possible implementation, the encoding device may further include a communication bus for establishing a connection between the processor and the memory.

[0033] In a seventh aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program that, when the computer program is run on a computer or processor, causes the computer or processor to perform the steps of the texture decoding method described in the first aspect, and / or to perform the steps of the texture encoding method described in the second aspect.

[0034] In an eighth aspect, a computer program product is provided, the computer program product comprising computer instructions that, when executed by a computer or processor, cause the computer or processor to perform the steps of the texture decoding method described in the first aspect, and / or the steps of the texture encoding method described in the second aspect. Alternatively, a computer program is provided that, when run on a computer or processor, causes the computer or processor to perform the steps of the texture decoding method described in the first aspect, and / or the steps of the texture encoding method described in the second aspect.

[0035] In a ninth aspect, an encoding and decoding system is provided, the encoding and decoding system comprising an encoding device and a decoding device, the encoding device being used to implement the steps of the texture encoding method described in the second aspect above, and the decoding device being used to implement the steps of the texture decoding method described in the first aspect above.

[0036] In a tenth aspect, an encoded bitstream is provided, the bitstream being generated according to the texture encoding method described in the second aspect above.

[0037] In an eleventh aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing a bitstream generated according to the texture encoding method described in the second aspect above.

[0038] In a twelfth aspect, an apparatus for storing a bitstream is provided, the apparatus comprising: a receiver and at least one storage medium, the receiver being configured to receive a bitstream generated according to the texture encoding method described in the second aspect above, and the at least one storage medium being configured to store the bitstream.

[0039] In a thirteenth aspect, an apparatus for transmitting a bitstream is provided, the apparatus comprising: a transmitter and a receiver, the receiver being configured to receive a bitstream generated according to the texture encoding method described in the second aspect above, and the transmitter being configured to transmit the bitstream to an end-side device via a transmission medium.

[0040] In a fourteenth aspect, an apparatus for transmitting a bitstream is provided, the apparatus comprising: a transmitter and at least one storage medium, the at least one storage medium being configured to store a bitstream generated according to the texture encoding method described in the second aspect above, the transmitter being configured to retrieve the bitstream from the storage medium and transmit the bitstream to an end-side device via the transmission medium.

[0041] In a fifteenth aspect, a system for distributing bitstreams is provided, the system comprising: at least one storage medium for storing bitstreams generated according to the texture encoding method described in the second aspect above; and a streaming media device for acquiring a target bitstream from the at least one storage medium and sending the target bitstream to an end-side device, wherein the streaming media device includes a content server or a content distribution server.

[0042] The technical effects achieved by the third to fifteenth aspects mentioned above are similar to those achieved by the corresponding technical means in the first and second aspects, and will not be repeated here.

Attached Figure Description

[0043] Figure 1 is a schematic diagram of a texture encoding process provided in an embodiment of this application;

[0044] Figure 2 is a schematic diagram of determining the endpoints of a texture block according to an embodiment of this application;

[0045] Figure 3 is a flowchart of a neural texture coding method provided in an embodiment of this application;

[0046] Figure 4 is a flowchart of a neural texture decoding method provided in an embodiment of this application;

[0047] Figure 5 is an architecture diagram of an image rendering system provided in an embodiment of this application;

[0048] Figure 6 is a system architecture diagram of a texture encoding and decoding scheme provided in an embodiment of this application;

[0049] Figure 7 is a flowchart illustrating a texture encoding / decoding scheme provided in an embodiment of this application;

[0050] Figure 8 is a schematic diagram of the implementation environment of a texture encoding and decoding method provided in an embodiment of this application;

[0051] Figure 9 is a schematic diagram of the structure of a client provided in an embodiment of this application;

[0052] Figure 10 is a flowchart of a texture encoding method provided in an embodiment of this application;

[0053] Figure 11 is a flowchart of another texture encoding method provided in an embodiment of this application;

[0054] Figure 12 is a flowchart of another texture encoding method provided in an embodiment of this application;

[0055] Figure 13 is a flowchart of another texture encoding method provided in an embodiment of this application;

[0056] Figure 14 is a flowchart of a texture decoding method provided in an embodiment of this application;

[0057] Figure 15 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0058] Figure 16 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0059] Figure 17 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0060] Figure 18 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0061] Figure 19 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0062] Figure 20 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0063] Figure 21 is a schematic diagram of texture decoding provided in an embodiment of this application;

[0064] Figure 22 is a schematic diagram of data reuse after transformation in a texture decoding method provided in an embodiment of this application;

[0065] Figure 23 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0066] Figure 24 is a flowchart of another texture decoding method provided in an embodiment of this application;

[0067] Figure 25 is a schematic diagram of the structure of a texture decoding device provided in an embodiment of this application;

[0068] Figure 26 is a schematic diagram of the structure of a texture encoding device provided in an embodiment of this application.

Detailed Implementation

[0069] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the implementation methods of this application will be further described in detail below with reference to the accompanying drawings.

[0070] To facilitate understanding, before providing a detailed explanation of the texture encoding and decoding method provided in the embodiments of this application, the terminology, application scenarios, and implementation environment involved in the embodiments of this application will be introduced first.

[0071] First, some terms used in the embodiments of this application will be introduced.

[0072] Bitstream / File: The embodiments of this application mainly involve encoding and decoding methods for textures (such as texture images, texture maps, etc.). The bitstream generated by encoding can take many forms; for example, it can be cached data held in memory, or it can be stored as a file. In the following text, only "bitstream" is used as the object of description. It can be understood as either "cache" or "file", and its form does not affect the scope of the embodiments of this application.

[0073] Compression: This involves utilizing information redundancy within content and removing redundant information using certain methods. In texture compression, the texture is compressed into a bitstream, the size of which is smaller than the original texture data. This bitstream can be called the original bitstream (or file; for convenience, it will be referred to as "bitstream" hereafter). In some embodiments, the original bitstream can be further compressed to obtain a new bitstream, i.e., secondary compression. Generally, the new bitstream is smaller than the original bitstream. This process of reducing the size of texture data is called "compression."

[0074] Decompression: The process of reconstructing the original texture by performing the inverse operation on the compressed bitstream is called "decompression." In addition, for a bitstream obtained through secondary compression, performing the inverse operation to obtain a bitstream of the same size as the original bitstream is also called "decompression." Generally, "compression" is divided into "lossless compression" and "lossy compression." With lossless compression, the data reconstructed after decompression is completely identical to the original data; with lossy compression, the data reconstructed after decompression is not completely identical to the original data.
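The difference between lossless and lossy compression can be seen in a few lines of Python. This is only an illustration: `zlib` is a general-purpose lossless compressor, not a texture codec, and dropping the four low bits of each value is an arbitrary stand-in for a lossy scheme. The sample data is made up.

```python
import zlib

# Made-up 8-bit texel values with a repeating pattern.
texels = bytes([10, 10, 11, 12, 200, 201, 203, 255] * 32)

# Lossless: decompression returns exactly the original bytes.
packed = zlib.compress(texels)
restored = zlib.decompress(packed)

# Lossy: keep only the 4 high bits of each value, then expand again.
quantized = [t >> 4 for t in texels]          # 4-bit codes
approx = bytes(q << 4 for q in quantized)     # reconstructed values
max_error = max(abs(a - b) for a, b in zip(texels, approx))

print(restored == texels, approx == texels, max_error)
```

The lossless path reproduces every byte, while the lossy path reconstructs values that differ from the originals by up to 15 in this sample.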

[0075] Encoding: A method of expressing one set using another set by adopting certain rules. Some encoding methods can be considered as "compression".

[0076] Decoding: The inverse operation of encoding; "decompression" can be considered a form of decoding.

[0077] Texture compression is a technique used to compress textures. It can be used to store texture data in 3D computer graphics rendering systems, thereby reducing the storage space occupied by texture data. Texture compression independently encodes multiple blocks of texture (also called texture chunks) to obtain feature data in a texture encoding format. This encoding method features random access and high decoding parallelism and is supported by the vast majority of graphics cards.

[0078] A pixel is the basic unit of image display; it is a point or square that appears inseparable at any scale in an image. Each pixel can have its own color value.

[0079] A texel is short for texture element, and it is the basic unit in the texture space of computer graphics. Just as an image is composed of pixels, a texture is represented by an arrangement of texels.

[0080] Overfitting: The process of fitting a model too closely or precisely to a given dataset.

[0081] Overfitting model: A model that, relative to a limited set of data, has too many parameters or an overly complex structure. Such a model can fit the training dataset well, but loses its ability to generalize to data outside the training dataset.

[0082] Feature maps are higher-level representations extracted from raw data (such as texture). Texture feature maps characterize the texel features of the texture. Texture feature maps can be extracted using neural network models or other methods available in existing technologies.

[0083] Secondly, the application scenarios of the embodiments of this application will be introduced.

[0084] In computer graphics, texture generally refers to an image or data used to describe the details of an object's surface. This image or data can contain various information such as color, brightness, transparency, and normal vectors, which are mapped onto the surface of a 3D model (i.e., texture mapping) to simulate the realistic material and details of the object's surface.

[0085] Textures can be classified according to various factors such as their source, purpose, and storage method. They can generally include the following categories: color textures, normal textures, height textures (or displacement textures), transparency textures, gloss / reflection textures, etc.

[0086] Texture mapping is the process of applying textures to the surface of a 3D model. This typically involves associating the texture's coordinates (U, V) with the vertex coordinates (X, Y, Z) of the 3D model's surface. The U coordinate is the texture's horizontal coordinate, usually in the range [0, 1], and indicates the horizontal position. The V coordinate is the texture's vertical coordinate, also in the range [0, 1], and indicates the vertical position. Therefore, by specifying the U and V coordinates, an image can be precisely fitted onto the surface of a 3D object, achieving realistic material effects. Texture mapping ensures that the texture is correctly overlaid on the model's surface and maintains the correct position and proportion as the model deforms.

[0087] Therefore, textures play a crucial role in computer graphics, significantly enhancing the realism and detail of rendered scenes. Through different types of textures and texture mapping techniques, more realistic and vivid 3D models and scenes can be created.
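The (U, V) addressing described in [0086] can be sketched as a nearest-texel lookup. The function names, the clamping of out-of-range coordinates, and the row-major list-of-rows texture layout are assumptions made for this example; real renderers also offer wrapping and filtered sampling.

```python
def uv_to_texel(u, v, width, height):
    """Map (u, v) in [0, 1] to integer texel coordinates (x, y).

    Out-of-range inputs are clamped, which is one of several possible policies.
    """
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return x, y


def sample(texture, u, v):
    """Return the texel nearest to (u, v); texture is a list of rows."""
    height, width = len(texture), len(texture[0])
    x, y = uv_to_texel(u, v, width, height)
    return texture[y][x]


tiny_texture = [[1, 2],
                [3, 4]]
value = sample(tiny_texture, 0.9, 0.1)  # right column, top row
```

Because any (u, v) can be requested during rendering, the lookup must be able to reach any texel directly, which is the random-access requirement that motivates block-based texture formats.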

[0088] In the field of image encoding and decoding, texture is a crucial processing object for video compression and image encoding. Effectively processing and encoding textures can improve the efficiency and quality of video compression and image encoding, thereby meeting the needs of various application scenarios. Specifically, texture encoding / decoding techniques or neural texture encoding / decoding techniques can be employed when encoding and decoding textures.

[0089] Next, we will give a brief introduction to these two encoding and decoding technologies.

[0090] 1. Texture encoding and decoding technology

[0091] Image encoding and decoding technologies (such as Joint Photographic Experts Group (JPEG), a lossy image compression format, and Portable Network Graphics (PNG), a lossless image compression format) primarily focus on the compression and decompression of the overall image, rather than emphasizing fast random access to individual pixels. In other words, image encoding and decoding algorithms typically prioritize overall image quality, compression ratio, and decoding speed over the efficiency of accessing individual pixels. Textures, on the other hand, are usually mapped onto the surface of 3D models to simulate the details and materials of object surfaces. Therefore, during rendering, fast random access to any pixel in the texture (also known as a texel in computer graphics) is often required. This means that encoding and decoding algorithms need to support the efficient retrieval of arbitrary texel values from compressed data.

[0092] In short, compared to traditional image encoding and decoding technologies, texture encoding and decoding technologies need to support fast random access to any pixel (i.e., texel) in the texture.

[0093] To meet the requirements of fast random access, texture encoding techniques typically employ a block-based fixed-length encoding method. Referring to Figure 1, the original texture is divided into multiple fixed-size (e.g., M*N texels) texture blocks, and each block is then encoded into compressed data of a fixed number of bits to encode the original texture. Based on this, during decoding, the offset can be calculated based on the texture block number to be accessed and the compressed size of each block. This allows for quick location and decoding of the required texture block through simple offset addressing, thus achieving fast random access to any texel value of the original texture data.
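The offset calculation described above can be sketched as follows. The function name, the row-major ordering of blocks, and the example sizes (4*4-texel blocks of 16 bytes each) are assumptions for illustration; the text only requires that every block has the same compressed size.

```python
def locate_texel(x, y, tex_width, block_w, block_h, bytes_per_block):
    """Return (byte offset of the block, texel position inside the block).

    Assumes blocks are stored row by row, each in bytes_per_block bytes.
    """
    blocks_per_row = (tex_width + block_w - 1) // block_w  # round up
    block_index = (y // block_h) * blocks_per_row + (x // block_w)
    offset = block_index * bytes_per_block
    return offset, (x % block_w, y % block_h)


# Texel (5, 9) of a 256-texel-wide texture with 4*4 blocks of 16 bytes each.
offset, local = locate_texel(5, 9, 256, 4, 4, 16)
print(offset, local)
```

Only the one block at `offset` has to be read and decoded to obtain the texel, regardless of the texture's size.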

[0094] For each texture block, existing texture encoding methods (such as adaptive scalable texture compression (ASTC), block compression (BC), etc.) perform intra-block encoding by storing endpoints, weights, and other encoding patterns within the texture block. Endpoints typically represent the maximum and minimum values of texels within the texture block (e.g., extreme values of color or brightness), while weights describe the changes in texel values at different locations within the texture block (e.g., changes in color or brightness). The weights, combined with the endpoints, can recover each texel value of the original texture.

[0095] It should be noted that for each texture block, the color or brightness change of the texels within the block is usually close to a linear change. Therefore, color endpoints and interpolation weights can be recorded during encoding.

[0096] Taking texel color variation as an example, as shown in Figure 2, for each texture block, two or more key colors can be selected as endpoints based on its color variation. Then, using the endpoint colors as a reference, the weight value of each texel in the texture block is determined according to the relationship between the endpoint colors and each texel. The weight can be a scalar value, representing how close the actual texel value is to a certain endpoint; or it can be a vector value, representing the distribution of pixels across multiple color dimensions (e.g., RGB).

[0097] As an example, for a texture block containing a brick wall texture, by analyzing the color distribution within the texture block, four endpoint colors can be selected: khaki, light brown, gray, and dark brown. For each texel within the texture block, its weight value relative to these four endpoints (khaki, light brown, gray, and dark brown) is calculated. For example, a texel closer to light brown has a higher light brown weight, and a texel closer to dark brown has a higher dark brown weight.
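A simplified, single-channel version of the endpoint and weight selection described above is sketched below. It is not the ASTC or BC algorithm: it assumes exactly two endpoints (the block minimum and maximum) and a fixed number of weight bits, and the function name `encode_block` is hypothetical.

```python
def encode_block(texels, weight_bits=2):
    """Encode one block as two endpoints plus one quantized weight per texel.

    Each weight is an integer in [0, 2**weight_bits - 1] that says how far
    the texel lies from the low endpoint toward the high endpoint.
    """
    lo, hi = min(texels), max(texels)
    levels = (1 << weight_bits) - 1
    if hi == lo:
        # Flat block: every texel equals the endpoint, so all weights are 0.
        return (lo, hi), [0] * len(texels)
    weights = [round((t - lo) / (hi - lo) * levels) for t in texels]
    return (lo, hi), weights


endpoints, weights = encode_block([10, 20, 30, 40])
print(endpoints, weights)
```

With 2-bit weights, a block of sixteen 8-bit texels needs two endpoint bytes plus 32 weight bits instead of 16 bytes, at the cost of rounding each texel to one of four levels.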

[0098] Furthermore, the decoding process for the texture block can be as follows: First, decode the endpoint and weight information. Then, based on the weight and endpoint information, calculate the value of each texel of the texture block through interpolation. Finally, reconstruct the complete texture block.

[0099] As an example, for the texels of a texture block, the decoded texel value can be determined by the following formula (1):

t = weight × endpoint1 + (c − weight) × endpoint2    (1)

[0100] Where t is the decoded texel value, weight is the interpolation weight corresponding to the texel value, endpoint1 and endpoint2 are the two color endpoints of the texture block, and c is a constant or another parameter value related to the weight, such as c = 1.
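
As a non-limiting illustration, formula (1) can be sketched in Python as follows. The function names, the per-channel tuple representation of the endpoints, and the numeric values are assumptions made for this sketch only; they are not taken from ASTC, BC, or any other existing format.

```python
def decode_texel(weight, endpoint1, endpoint2, c=1.0):
    """Apply formula (1) per color channel: t = weight*endpoint1 + (c - weight)*endpoint2."""
    return tuple(weight * e1 + (c - weight) * e2
                 for e1, e2 in zip(endpoint1, endpoint2))


def decode_block(weights, endpoint1, endpoint2, c=1.0):
    """Decode every texel of a block from its stored weight and the two block endpoints."""
    return [decode_texel(w, endpoint1, endpoint2, c) for w in weights]


# Hypothetical RGB endpoints and per-texel weights for a four-texel block.
endpoint1 = (200.0, 180.0, 140.0)
endpoint2 = (80.0, 50.0, 30.0)
weights = [0.0, 0.25, 0.5, 1.0]

block = decode_block(weights, endpoint1, endpoint2)
```

With c = 1, a weight of 1 reproduces endpoint1, a weight of 0 reproduces endpoint2, and intermediate weights give a linear blend of the two.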

[0101] Based on the above explanation, the texture encoding techniques described above have the following limitations when encoding textures:

[0102] (1) Since the interpolation encoding within a block is relatively simple, the compression rate of texture encoding technology is limited. Under the same image quality, more storage space may be needed to save the compressed texture data.

[0103] (2) The above texture encoding techniques can usually only encode each texture independently. When multiple textures need to be encoded at the same time, the encoding efficiency is relatively low. Moreover, there is redundancy and unnecessary overhead when storing or transmitting these texture data.

[0104] 2. Neural texture encoding and decoding technology

[0105] Neural texture encoding / decoding is a technique that uses neural network models to compress and decompress texture data, suitable for processing multiple textures with similar properties. Based on the idea of neural network overfitting, this technique aims to overcome the shortcomings of traditional texture encoding / decoding techniques and provide a more efficient and accurate encoding / decoding solution.

[0106] The data to be encoded can be one or more textures, and the encoding result is one or more feature maps and one or more sets of neural network parameters. A texture can have one or more feature maps and one set of neural network parameters. Since the total data volume of the feature maps and neural network parameters is less than the data volume of the data to be encoded, compression of the data to be encoded is achieved through neural texture encoding.

[0107] Referring to Figure 3, the neural texture encoding process may include the following steps (11)-(13).

[0108] (11) Initialize feature data and network parameters.

[0109] Based on the encoding task settings and the structure of the neural network model, feature data and network parameters are initialized. The selection of feature data depends on the specific texture type and encoding requirements. During initialization, these feature data are typically set to random values or initial values based on some prior knowledge. Network parameters define the architecture of the neural network model and the parameters of the network layers. During initialization, the network layer parameters of the neural network model are typically set to small random values to ensure the diversity and stability of the network during the learning process. The architecture of the neural network model (e.g., the number of network layers, the number of neurons per layer, etc.) is designed according to the specific encoding task.

[0110] (12) Neural network training.

[0111] During the training phase of a neural network model, feature data and network parameters are used as inputs. The neural network model then performs inference to obtain the inference result, which is a representation or prediction of the data to be encoded.

[0112] During inference, the neural network model performs nonlinear transformations on the input feature data through its internal connections and network parameters, progressively extracting higher-level feature representations. These representations are then transformed into the inference result in the final layer of the neural network model. The difference between the inference result and the data to be encoded is then measured using a loss function. The loss function is a mathematical expression used to calculate the error between the inference result and the actual data to be encoded; common loss functions include mean absolute error (MAE), mean squared error (MSE), and cross-entropy loss. Next, gradient backpropagation and parameter updates are performed based on the calculated loss. That is, the gradient of each network parameter in the neural network model (and of the feature data being optimized) is calculated using the gradient backpropagation algorithm (also known as the backpropagation algorithm). These gradients indicate how the parameters should be adjusted to reduce the loss. Then, an optimization algorithm (such as stochastic gradient descent or the adaptive moment estimation (Adam) algorithm) is used to update the network parameters and feature data of the neural network model to minimize the loss. The training process repeats the above steps until an exit condition is met. This exit condition can be that the loss falls below a certain threshold, or that the steps have been repeated a preset number of times (each repetition being called an epoch).
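
As a non-limiting illustration, the training loop of steps (11) and (12) can be sketched in Python as follows. Everything specific here is an assumption made for brevity: the texture is a one-dimensional list of values, the feature data is a small grid of scalars sampled by linear interpolation, the "network" is reduced to a scale a and a bias b, the loss is MSE, and plain gradient descent stands in for an optimizer such as Adam.

```python
import random


def sample(features, x):
    """Linearly interpolate the 1D feature grid at normalized position x in [0, 1]."""
    pos = x * (len(features) - 1)
    i = min(int(pos), len(features) - 2)
    frac = pos - i
    return i, frac, (1 - frac) * features[i] + frac * features[i + 1]


def train(texels, num_features=4, lr=0.1, max_epochs=2000,
          loss_threshold=1e-4, seed=0):
    rng = random.Random(seed)
    # Step (11): small random initial values for feature data and parameters.
    features = [rng.uniform(-0.1, 0.1) for _ in range(num_features)]
    a, b = rng.uniform(-0.1, 0.1), 0.0
    n = len(texels)
    history = []
    for _ in range(max_epochs):          # exit condition 2: epoch budget
        grad_f = [0.0] * num_features
        grad_a = grad_b = 0.0
        loss = 0.0
        for k, target in enumerate(texels):
            i, frac, f = sample(features, k / (n - 1))
            err = (a * f + b) - target   # inference result minus data to be encoded
            loss += err * err / n        # mean squared error
            d = 2 * err / n              # dLoss/dPrediction
            grad_a += d * f
            grad_b += d
            grad_f[i] += d * a * (1 - frac)
            grad_f[i + 1] += d * a * frac
        history.append(loss)
        if loss < loss_threshold:        # exit condition 1: loss threshold
            break
        # Update both the network parameters and the feature data.
        a -= lr * grad_a
        b -= lr * grad_b
        features = [f - lr * g for f, g in zip(features, grad_f)]
    return features, (a, b), history


# Hypothetical eight-texel texture represented by four feature values.
features, params, history = train([0.1, 0.2, 0.4, 0.5, 0.55, 0.6, 0.8, 0.9])
```

The point of the sketch is that the feature data is optimized jointly with the network parameters, so that both together overfit the one texture being encoded.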

[0113] (13) Generate the bitstream.

[0114] After training, the training results are exported as the encoded bitstream. These training results include the trained feature data and network parameters. After training, the feature data already contains important information about the data to be encoded and is represented in a more compact and efficient way; the network parameters define the neural network architecture and network layer parameters used to reconstruct the data to be encoded. During the decoding phase, these parameters will be used to guide the inference process of the neural network model to recover the original texture.
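
As a non-limiting illustration, exporting the training results can be sketched in Python as follows. The container layout (a 4-byte tag, two 32-bit counts, then 32-bit floats) and the name pack_bitstream are hypothetical and chosen only for this sketch; the actual bitstream syntax is not specified in this passage, and a practical encoder would typically also quantize or entropy-code the values.

```python
import struct

MAGIC = b"NTEX"  # hypothetical 4-byte tag identifying this sketch format


def pack_bitstream(features, params):
    """Serialize trained feature data and network parameters into one byte string."""
    # Header: tag, number of feature values, number of parameter values.
    header = struct.pack("<4sII", MAGIC, len(features), len(params))
    # Payload: little-endian float32 values, feature data first.
    body = struct.pack(f"<{len(features)}f", *features)
    body += struct.pack(f"<{len(params)}f", *params)
    return header + body


bitstream = pack_bitstream([0.5, 0.25], [1.0])
```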

[0115] It should be noted that the bitstream generated by the aforementioned neural network model contains sufficient information to allow for the recovery of all or part of the information of the data to be encoded through appropriate decoding steps during the decoding phase. The decoding process typically involves using the same neural network architecture and trained network parameters to infer the feature data in the bitstream in order to reconstruct the original texture data.

[0116] Referring to Figure 4, the neural texture decoding process may include the following steps (21)-(23).

[0117] (21) Bitstream splitting.

[0118] First, two key data components are extracted from the received bitstream (i.e., the encoded data stream): feature data and network parameters. As explained earlier, feature data is a compact representation of the texture obtained after some form of encoding. It contains enough information to recover the original texture details during decoding. Network parameters contain the structure and layer parameters of the neural network model used for decoding the feature vectors. Since the neural network model is optimized by continuously learning the texture features during the training phase, its network parameters can reflect the statistical regularities and structural characteristics of the texture data.
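
As a non-limiting illustration, bitstream splitting can be sketched in Python as follows. The layout assumed here (a 4-byte tag "NTEX", a 32-bit count of feature values, a 32-bit count of parameter values, then little-endian float32 payloads) and the name split_bitstream are hypothetical; the passage above does not define a concrete bitstream syntax.

```python
import struct

HEADER_FORMAT = "<4sII"  # hypothetical header: tag, feature count, parameter count


def split_bitstream(data):
    """Split a byte string into (feature data, network parameters)."""
    magic, n_feat, n_param = struct.unpack_from(HEADER_FORMAT, data, 0)
    if magic != b"NTEX":
        raise ValueError("not a bitstream in the assumed sketch format")
    offset = struct.calcsize(HEADER_FORMAT)
    features = list(struct.unpack_from(f"<{n_feat}f", data, offset))
    offset += 4 * n_feat  # each float32 occupies 4 bytes
    params = list(struct.unpack_from(f"<{n_param}f", data, offset))
    return features, params


# Hand-built example stream: two feature values and one parameter value.
example = struct.pack("<4sII3f", b"NTEX", 2, 1, 0.5, 0.25, 1.0)
features, params = split_bitstream(example)
```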

[0119] (22) Generate feature vectors.

[0120] During the decoding process, in order to reconstruct the texture or each texel in the texture, it is necessary to extract the corresponding feature values from the feature data based on the coordinates of the sampling point (i.e., texel). These extracted feature values are combined into a feature vector, which is then used as the input to the neural network model.

[0121] It should be understood that when extracting feature values from feature data based on the coordinates of sampling points, a spatial mapping or indexing mechanism is involved to map the coordinates of the sampling points into the space of the feature data.
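
As a non-limiting illustration, one possible spatial mapping can be sketched in Python as follows. Bilinear interpolation over a low-resolution feature map is an assumption of this sketch, as are the name sample_feature_vector and the nested-list layout feature_map[row][column][channel]; other mappings or indexing mechanisms are equally possible.

```python
def sample_feature_vector(feature_map, u, v):
    """Return the feature vector at normalized coordinates (u, v) in [0, 1].

    feature_map[row][col] is a list of channel values; the map must be at
    least 2 x 2. The four neighboring grid entries are blended bilinearly.
    """
    height, width = len(feature_map), len(feature_map[0])
    x, y = u * (width - 1), v * (height - 1)
    x0, y0 = min(int(x), width - 2), min(int(y), height - 2)
    fx, fy = x - x0, y - y0
    vector = []
    for c in range(len(feature_map[0][0])):
        top = (1 - fx) * feature_map[y0][x0][c] + fx * feature_map[y0][x0 + 1][c]
        bottom = (1 - fx) * feature_map[y0 + 1][x0][c] + fx * feature_map[y0 + 1][x0 + 1][c]
        vector.append((1 - fy) * top + fy * bottom)
    return vector


# Hypothetical 2 x 2 feature map with two channels per entry.
feature_map = [
    [[0.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 0.0]],
]
center = sample_feature_vector(feature_map, 0.5, 0.5)
```

Because the feature map can be much smaller than the texture, many texel coordinates map between grid entries, which is why an interpolating lookup is used in this sketch.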

[0122] (23) Determine the decoded value of the texel through neural network model inference.

[0123] The decoding end obtains, according to the network parameters in the bitstream, a neural network model identical to the one trained by the encoding end, and inputs the feature vector into this model to perform inference (that is, forward propagation) and calculate the decoded value of each texel. The decoded value represents the color, brightness, or other attributes of the corresponding texel in the original texture.
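
As a non-limiting illustration, step (23) can be sketched in Python as a forward pass through a small fully connected network. The two-layer shape, the ReLU activation, the name mlp_forward, and all weight values are assumptions of this sketch; the actual architecture would be whatever the network parameters in the bitstream describe.

```python
def relu(x):
    return max(0.0, x)


def mlp_forward(feature_vector, layers):
    """Run a forward pass; layers is a list of (weights, biases) pairs.

    weights[j] is the weight row for output j. ReLU is applied after every
    layer except the last, whose outputs are the decoded texel channels.
    """
    activations = list(feature_vector)
    for index, (weights, biases) in enumerate(layers):
        outputs = [sum(w * a for w, a in zip(row, activations)) + bias
                   for row, bias in zip(weights, biases)]
        is_last = index == len(layers) - 1
        activations = outputs if is_last else [relu(o) for o in outputs]
    return activations


# Hypothetical parameters: 2 inputs -> 3 hidden units -> 3 output channels.
layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]], [0.0, 0.0, 0.0]),
    ([[0.5, 0.5, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]], [0.0, 0.0, 0.0]),
]
texel = mlp_forward([0.5, 0.25], layers)
```

Only the forward pass is needed at the decoding end; gradients are computed during training at the encoding end.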

[0124] In summary, neural texture encoding and decoding technology, by combining the learning capabilities of neural network models with the characteristics of texture data, can achieve efficient texture data compression and reconstruction.

[0125] As an example, suppose there is an image set containing multiple texture maps that need to be compressed and stored. Neural texture encoding / decoding techniques can be used to transform these multiple texture maps into a set of feature maps and a set of network parameters. The total data volume of these transformed data is smaller than the original data stream of the multiple texture maps. Furthermore, the encoded feature maps and neural network parameters are stored or transmitted. Due to the smaller data volume, storage space or transmission time can be saved. When needed, the encoded feature maps can be decoded based on the network parameters to reconstruct the original multiple texture maps.
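
As a non-limiting illustration, the size comparison in the example above can be worked through in Python as follows. All dimensions, bit depths, and parameter counts are hypothetical values chosen only to show the arithmetic; they are not measurements or results from this application.

```python
def raw_size_bytes(width, height, channels, num_textures, bytes_per_channel=1):
    """Uncompressed size of a set of equally sized textures."""
    return width * height * channels * num_textures * bytes_per_channel


def encoded_size_bytes(feature_width, feature_height, feature_channels,
                       num_params, bytes_per_feature=1, bytes_per_param=2):
    """Size of one shared feature map plus one set of network parameters."""
    feature_bytes = feature_width * feature_height * feature_channels * bytes_per_feature
    return feature_bytes + num_params * bytes_per_param


# Hypothetical image set: four 1024 x 1024 RGBA textures at 8 bits per channel.
raw = raw_size_bytes(1024, 1024, 4, 4)
# Hypothetical encoding: one 256 x 256 feature map with 8 channels (1 byte each)
# and 10,000 network parameters stored as 16-bit values.
encoded = encoded_size_bytes(256, 256, 8, 10_000)
ratio = raw / encoded
```

Under these assumed numbers the feature map dominates the encoded size, so sharing one feature map and one set of parameters across several correlated textures is what drives the ratio.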

[0126] The texture encoding and decoding method provided in this application can combine the overfitting ability of the neural network model and the data correlation within the texture or between multiple textures to encode one or more textures into a set of feature data and network parameters, and further encode the feature data and network parameters, thereby achieving compression of texture data, saving data loading bandwidth and data storage space.

[0127] It should be noted that, in addition to using a neural network model to perform nonlinear transformation on the feature vector to reconstruct the texture, other methods can also be used to perform nonlinear transformation on the feature vector, such as using a nonlinear function. The embodiments of this application do not limit the specific implementation of the nonlinear transformation.

[0128] The texture encoding / decoding scheme provided in this application can be applied to at least image rendering and image display scenarios. For ease of understanding, the following sections will describe these two application scenarios and the implementation logic of the technical solution in these scenarios.

[0129] 1. Image rendering scene

[0130] Image rendering primarily refers to the process of generating visual effects in 3D games and various graphics applications. During rendering, texture mapping technology is widely used to give objects rich surface details and realism. Texture maps act like the "skin" of an object; by mapping them onto the object's geometric surface, they can simulate the appearance of various materials (such as wood, metal, and fabric), as well as complex patterns and color variations. In other words, high-quality rendering results often rely on a large number of detailed texture maps containing rich detail information, making the final image more realistic.

[0131] Because texture maps typically contain a large amount of texel data, loading texture maps during the image rendering process requires significant data transfer bandwidth, especially in high-resolution and complex scenes, which can lead to prolonged loading times and negatively impact user experience. Furthermore, high-quality texture maps also mean larger file sizes, which undoubtedly increases the storage burden for games and applications that require installation packages. Particularly for mobile games, excessively large installation packages may reduce user download willingness and affect the product's market performance. Therefore, texture encoding methods (such as ASTC, BC, etc.) typically encode and store texture maps, i.e., storing compressed texture data.

[0132] Referring to Figure 5, during the image rendering process, the compressed texture data needs to be loaded from the storage device (such as a hard disk or a solid-state drive (SSD)) into memory, and then transferred to the graphics processing unit (GPU) for texture decoding to obtain texture data. Then, rendering is performed based on the geometric data and texture data to obtain the rendering result, i.e., the rendered image.

[0133] Based on this, the texture encoding and decoding scheme provided in this application embodiment can be used to perform texture encoding and decoding on texture maps. Referring to Figure 6, the texture encoding and decoding system of this application embodiment includes a texture generation platform and a client. The texture generation platform can be a cloud-based server or terminal, used to generate / acquire texture maps, encode textures, and encapsulate them into a bitstream for transmission. The client is a chip with storage and processing capabilities, or a computer device, used to perform decoding tasks, rendering tasks, and display the results. This client can be a mobile phone, personal computer (PC), virtual reality (VR) glasses, augmented reality (AR) glasses, or other media products.

[0134] The texture generation platform can use any of the texture encoding methods shown in Figures 10 to 13 below to encode the texture map; at the same time, the client can use any of the texture decoding methods shown in Figures 14 to 24 below to decode the received bitstream to obtain texture data.

[0135] In one possible implementation, referring to Figure 7, the texture encoding, texture decoding, and rendering processes described above can be implemented on the client side. That is, the texture encoding system only includes the client, which independently completes the texture encoding scheme without transmitting the bitstream between the client and the texture generation platform. The implementation logic of the texture encoding, texture decoding, and rendering processes is the same as that in Figures 5 and 6, and will not be repeated here.

[0136] 2. Image display scenario

[0137] Image display refers to the process of processing image data and then transmitting it to a display screen for display. This can be applied in scenarios such as application (APP) image display, wallpaper display, video surveillance, and gaming. In the APP image display scenario, users typically view and browse images in mobile applications (such as social media and image browsers). The APP loads and decodes the image data and transmits it to the display screen to meet the user's visual needs. In the wallpaper display scenario, if a user wants to set a personalized wallpaper on their computer or mobile device desktop, the wallpaper image is decoded and processed before being transmitted to the display screen for display, providing the user with an aesthetically pleasing visual experience.

[0138] For the image encoding and decoding process described above, traditional image encoding and decoding technologies such as JPEG, PNG, and WebP (an image file format that provides both lossy and lossless / reversible compression) can typically be used to store image data. However, when these images need to be displayed in an app, since the GPU usually does not directly support hardware decoding of these compressed formats, the decoding process must be performed externally to the GPU, i.e., on the central processing unit (CPU), before the decoded image data is transmitted to the GPU for rendering and display. Therefore, traditional image encoding and decoding technologies are often computationally complex and require a long processing time, resulting in a high end-to-end latency from user request to actual display on the screen. In some extreme cases, users may see a blank screen, which significantly degrades the user experience. Moreover, since the decoding process is performed on the CPU, which is generally less efficient than the GPU in processing image data, it increases the device's power consumption. Furthermore, the need to upload the decoded image data to the GPU also consumes a significant amount of data upload bandwidth, further increasing power consumption.

[0139] To reduce CPU load and bandwidth consumption during data uploads, texture compression formats designed specifically for GPUs, such as ASTC and the BC series, can be used to encode images. These formats allow image data to be uploaded directly to the GPU in compressed form, where it is decoded and rendered without CPU assistance. While texture compression formats are more efficient for decoding and rendering on the GPU, their compression ratios are typically lower than common image compression formats like JPEG, PNG, and WebP. This means that images stored using texture compression formats will occupy more storage space, increasing the app's file size. For users, a larger file size may reduce their willingness to download and use the app. Furthermore, if image quality is compromised due to compression, it can also negatively impact the user's visual experience.

[0140] Based on this, the texture encoding and decoding scheme provided in this application embodiment can be used to perform texture encoding and decoding on the image to be displayed. The texture encoding and decoding method provided in this application embodiment can be executed on the CPU, or on the GPU, or partly on the CPU and partly on the GPU; this application embodiment does not limit this. When performing texture encoding, texture decoding, and rendering on the image to be displayed, the system architecture can refer to Figures 5-7 above. The only difference is that the encoding object "texture map" in Figures 5-7 is replaced with "the image to be displayed"; the rest of the implementation logic is the same and will not be repeated here.

[0141] Finally, the implementation environment of the embodiments of this application will be described.

[0142] Referring to Figure 8, the implementation environment of the texture encoding / decoding scheme provided in this embodiment includes: a source device 10, a destination device 20, a link 30, and a storage device 40. The source device 10 can generate an encoded image, i.e., a bitstream. Therefore, the source device 10 can also be called an encoding device. The destination device 20 can decode the bitstream generated by the source device 10. Therefore, the destination device 20 can also be called a decoding device. The link 30 can receive the encoded image generated by the source device 10 and transmit the encoded image to the destination device 20. The storage device 40 can receive the encoded image generated by the source device 10 and store the encoded image. Under these conditions, the destination device 20 can directly obtain the encoded image from the storage device 40. Alternatively, the storage device 40 can correspond to a file server or another intermediate storage device that can store the encoded image generated by the source device 10. Under these conditions, the destination device 20 can stream or download the encoded image stored in the storage device 40.

[0143] Both the source device 10 and the destination device 20 may include one or more processors and memory coupled to the one or more processors. This memory may include random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, or any other media that can be used to store desired program code in the form of computer-accessible instructions or data structures. For example, the source device 10 may be a server cluster or distributed system composed of multiple physical servers, or it may be a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms, or a cloud computing service center. Destination device 20 may include mobile phones, smartphones, personal digital assistants (PDAs), wearable devices, pocket PCs (PPCs), tablets, smart car systems, smart TVs, smart speakers, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, and other terminals or similar devices.

[0144] Link 30 may include one or more media or devices capable of transmitting encoded images from source device 10 to destination device 20. In one possible implementation, link 30 may include one or more communication media enabling source device 10 to directly transmit encoded images to destination device 20 in real time. In this embodiment, source device 10 may modulate the encoded image based on a communication standard, such as a wireless communication protocol, and transmit the modulated image to destination device 20. The one or more communication media may include wireless and / or wired communication media, such as radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, wide area network, or global network (e.g., the Internet). The one or more communication media may include routers, switches, base stations, or other devices facilitating communication from source device 10 to destination device 20, etc., which are not specifically limited in this embodiment.

[0145] In one possible implementation, storage device 40 can store the received encoded image sent by source device 10, and destination device 20 can directly retrieve the encoded image from storage device 40. Under such conditions, storage device 40 can include any of a variety of distributed or locally accessed data storage media. For example, any of these distributed or locally accessed data storage media can be a hard disk drive, Blu-ray disc, digital versatile disc (DVD), compact disc read-only memory (CD-ROM), flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing bitstreams.

[0146] In one possible implementation, storage device 40 may correspond to a file server or another intermediate storage device that can store the bitstream generated by source device 10, and destination device 20 may stream or download the images stored on storage device 40. The file server can be any type of server capable of storing encoded images and sending them to destination device 20. In one possible implementation, the file server may include a web server, a file transfer protocol (FTP) server, a network attached storage (NAS) device, or a local disk drive, etc. Destination device 20 can acquire the encoded images via any standard data connection (including an Internet connection). Any standard data connection may include a wireless channel (e.g., Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both suitable for acquiring encoded images stored on a file server. The transmission of encoded images from storage device 40 may be streaming, downloading, or a combination of both.

[0147] It should be noted that the implementation environment shown in Figure 8 is only one possible implementation method, and the technology of this application embodiment can be applied not only to the source device 10 that can encode images and the destination device 20 that can decode encoded images shown in Figure 8, but also to other devices that can encode images and decode bitstreams. This application embodiment does not specifically limit this.

[0148] In the implementation environment shown in Figure 8, source device 10 includes a data source 120, an encoder 100, and an output interface 140. In some embodiments, output interface 140 may include a modulator / demodulator (modem) and / or a transmitter. Data source 120 may include an image capture device (e.g., a camera, etc.), an archive containing previously captured images, a feed interface for receiving images from an image content provider, and / or a computer graphics system for generating images, or a combination of these sources of images.

[0149] In this embodiment, the data source 120 can send images to the encoder 100, which can encode the received images to obtain an encoded image. The encoder can then send the encoded image to an output interface. In some embodiments, the source device 10 directly sends the encoded image to the destination device 20 via the output interface 140. In other embodiments, the encoded image can also be stored on the storage device 40 for later retrieval by the destination device 20 for decoding and / or display.

[0150] In the implementation environment shown in Figure 8, the destination device 20 includes an input interface 240, a decoder 200, and a display device 220. In some embodiments, the input interface 240 includes a receiver and / or a modem. The input interface 240 may receive encoded images via link 30 and / or from storage device 40, and then send them to the decoder 200, which may decode the received encoded images to obtain decoded images. The decoder may send the decoded images to the display device 220. The display device 220 may be integrated with the destination device 20 or may be external to the destination device 20. Generally, the display device 220 displays the decoded images. The display device 220 may be any type of display device, for example, a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or other types of display devices.

[0151] It should be understood that, although not shown in Figure 8, in some respects, encoder 100 and decoder 200 may be integrated with each other and may include appropriate multiplexer-demultiplexer (MUX-DEMUX) units or other hardware and software for encoding both audio and video in a common data stream or separate data streams. In some embodiments, the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol, or other protocols such as User Datagram Protocol (UDP), if applicable.

[0152] Encoder 100 and decoder 200 may each be any of the following circuits: one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques of the embodiments of this application are implemented in part in software, the apparatus may store instructions for software in a suitable non-volatile computer-readable storage medium, and one or more processors may be used to execute the instructions in hardware to implement the techniques of the embodiments of this application. Any of the foregoing (including hardware, software, combinations of hardware and software, etc.) may be considered as one or more processors. Each of encoder 100 and decoder 200 may be included in one or more encoders or decoders, and either encoder or decoder may be integrated as part of a combined encoder / decoder (encoder-decoder) in the respective apparatus.

[0153] In this application embodiment, encoder 100 may be generally referred to as an apparatus that “signals” or “sends” certain information to, for example, decoder 200. The terms “signals” or “sends” may generally refer to the transmission of syntax elements and / or other data for decoding a compressed image. This transmission may occur in real time or nearly in real time. Alternatively, this communication may occur after a period of time, for example, during encoding when syntax elements are stored in a computer-readable storage medium in an encoded bitstream, and the decoding apparatus may then retrieve the syntax elements at any time after they have been stored in this medium.

[0154] The texture encoding / decoding method provided in this application embodiment can be applied to various scenarios and system architectures. Taking the system architecture shown in Figure 5 as an example, the images to be encoded and decoded can be textures in image files or textures in video files. It should be noted that, in conjunction with the implementation environment shown in Figure 8, any texture encoding method described below can be executed by the encoder 100 in the source device 10. This encoder 100 is implemented by software, hardware, or a combination of both, becoming part or all of the texture generation platform in this application embodiment. Similarly, any texture decoding method described below can be executed by the decoder 200 in the destination device 20. This decoder is implemented by software, hardware, or a combination of both, becoming part or all of the client in this application embodiment.

[0155] For example, the texture generation platform can be a server cluster or distributed system composed of multiple physical servers, or it can be a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms, or a cloud computing service center. This cloud platform can implement the texture encoding method provided in any of the embodiments shown in Figures 10 to 13 below.

[0156] Please refer to Figure 9, which is a schematic diagram of a client according to an embodiment of this application. The client includes at least one processor 901, a communication bus 902, a memory 903, and at least one communication interface 904. The client has a certain image rendering capability and can render scene data to obtain intermediate rendering results.

[0157] The processor 901 can be a general-purpose central processing unit (CPU), GPU, network processor (NP), microprocessor, or one or more integrated circuits for implementing the solutions of this application, such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs), or combinations thereof. The aforementioned PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.

[0158] The communication bus 902 is used to transmit information between the aforementioned components. The communication bus 902 can be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is used to represent it in Figure 9, but this does not mean that there is only one bus or one type of bus.

[0159] The memory 903 may be a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), an optical disc (including a compact disc read-only memory (CD-ROM), a compact disc, a laser disc, a digital versatile disc, a Blu-ray disc, etc.), a magnetic disk storage medium, or other magnetic storage device, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures that can be accessed by a computer, but not limited thereto. The memory 903 may exist independently and be connected to the processor 901 via a communication bus 902. Alternatively, the memory 903 may be integrated with the processor 901.

[0160] Communication interface 904 uses any transceiver-like device for communicating with other devices or communication networks. Communication interface 904 includes a wired communication interface and may also include a wireless communication interface. The wired communication interface may be, for example, an Ethernet interface. The Ethernet interface may be an optical interface, an electrical interface, or a combination thereof. The wireless communication interface may be a wireless local area network (WLAN) interface, a cellular network communication interface, or a combination thereof.

[0161] In a specific implementation, as one example, processor 901 may include one or more CPUs, such as CPU0 and CPU1 as shown in Figure 9.

[0162] In a specific implementation, as one example, the client may include multiple processors, such as processor 901 and processor 905 as shown in Figure 9. Each of these processors may be a single-core processor or a multi-core processor. Here, a processor may refer to one or more devices, circuits, and / or processing cores used to process data (such as computer program instructions).

[0163] In a specific implementation, as one example, the client may further include an output device 906 and an input device 907. The output device 906 communicates with the processor 901 and can display information in various ways. For example, the output device 906 may be a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector, etc. The input device 907 communicates with the processor 901 and can receive user input in various ways. For example, the input device 907 may be a mouse, a keyboard, a touchscreen device, or a sensing device, etc.

[0164] In some embodiments, memory 903 is used to store program code 910 for executing the solutions of this application, and processor 901 can execute the program code 910 stored in memory 903. The program code 910 may include one or more software modules, and the client can use processor 901 and program code 910 in memory 903 to implement the texture decoding method provided in any of the embodiments of Figures 14 to 24 below, and / or implement the texture encoding method provided in any of the embodiments of Figures 10 to 13.

[0165] The application scenarios and system architectures described in this application are for the purpose of more clearly illustrating the technical solutions of this application, and do not constitute a limitation on the technical solutions provided in this application. As those skilled in the art will know, with the emergence of new business scenarios, as well as the development of hardware devices and the upgrading of system architecture, the technical solutions provided in this application are also applicable to similar technical problems.

[0166] Next, the texture encoding and decoding methods provided in this application will be explained in detail. The main idea of the encoding and decoding methods provided in this application is data reuse. That is, during encoding, only part of the feature data of a texture is encoded into the bitstream (in other words, the remaining feature data is discarded), and during decoding, the feature data that was not encoded into the bitstream is reconstructed through data reuse, thereby reconstructing the original texture. In this way, data reuse improves the compression ratio, reduces the bitstream size, and reduces the network bandwidth and storage space occupied by texture data.

[0167] Figure 10 is a flowchart of a texture encoding method provided in an embodiment of this application. The method is applied to the encoding end. Please refer to Figure 10. The method includes the following steps.

[0168] Step 1001: Obtain one or more textures, including the first texture.

[0169] In this embodiment, the encoding end can encode one or more textures. First, one or more textures are obtained. These one or more textures may include texture images or texture maps. This embodiment does not limit the specific textures.

[0170] Step 1002: Encode the one or more textures to obtain feature data of the one or more textures, including first feature data of the first texture, wherein the first feature data is part of the feature data of the first texture.

[0171] The following describes the implementation process of encoding the first texture. For the textures other than the first texture in the one or more textures, they can be encoded using the same or similar methods.

[0172] In the embodiments of this application, the feature data of the first texture can be determined based on neural texture encoding and decoding technology, or based on traditional texture encoding and decoding technology, or by other possible methods. The embodiments of this application do not limit this.

[0173] Taking the determination of the feature data of the first texture based on neural network encoding and decoding technology as an example, similar to the encoding process shown in Figure 3, the encoding end determines the feature data of the first texture through iterative updates. This feature data includes the fourth feature data and the network parameters of the neural network model, which are the nonlinear transformation parameters. The feature data of the first texture can also be determined based on other possible methods similar to neural texture encoding and decoding technology, such as replacing the neural network model with other nonlinear functions and determining the fourth feature data and the parameters of the nonlinear function through iterative updates. The parameters of the nonlinear function are the nonlinear transformation parameters.

[0174] In this embodiment, the first texture is encoded through iterative updates to obtain fourth feature data and nonlinear transformation parameters. That is, iteratively updating the fourth feature data makes it more accurately represent the first texture, and updating the nonlinear transformation parameters improves the texture reconstruction effect. This will be described in detail below.

[0175] Before the first iteration, the fourth feature data and the nonlinear transformation parameters are initialized, i.e., the initial values of the fourth feature data and the nonlinear transformation parameters are determined. For the initialization method, refer to the relevant description of Figure 3. As an example, the fourth feature data can be initialized as random noise, or its initial value can be obtained through other encoding methods, such as processing the first texture using a neural network model. Similarly, the nonlinear transformation parameters can be initialized as random noise, or their initial values can be obtained through other methods, such as manual setting. This application does not limit the initialization method of the fourth feature data and the nonlinear transformation parameters.

[0176] In the first iteration, the encoder reconstructs the first texture based on the initialized fourth feature data and nonlinear transformation parameters. The encoding loss for this iteration is determined based on the reconstructed first texture and the original first texture. In each subsequent iteration, the encoder updates the fourth feature data and nonlinear transformation parameters of the first texture based on the encoding loss from the previous iteration. Then, it continues to reconstruct the first texture based on the current fourth feature data and nonlinear transformation parameters. The encoding loss for this iteration is determined again based on the reconstructed first texture and the original first texture. The iteration ends when the exit condition is met. The fourth feature data and nonlinear transformation parameters from the final iteration become the feature data of the first texture.

[0177] In one implementation, the iteration exit condition includes an iteration count threshold (referred to as the count threshold). If the number of updates to the fourth feature data (i.e., the number of iterations) reaches the count threshold after the current update of the fourth feature data and the nonlinear transformation parameters, the iteration ends.

[0178] In another implementation, the iteration exit condition includes a loss threshold; if the encoding loss obtained during the current iteration does not exceed the loss threshold, the iteration ends.

[0179] In another implementation, the iteration exit condition includes a loss threshold and an iteration count threshold. If the number of updates to the fourth feature data (i.e., the number of iterations) after the current update of the fourth feature data and nonlinear transformation parameters has not reached the iteration count threshold, then the first texture is reconstructed based on the current fourth feature data and nonlinear transformation parameters. Based on the reconstructed first texture and the original first texture, the encoding loss of this iteration is determined. If the encoding loss obtained in this iteration does not exceed the loss threshold, the iteration ends. That is, the iteration can be terminated early before reaching the iteration count threshold.

[0180] In addition to the above-mentioned iteration exit conditions, there may be other iteration exit conditions, which are not limited in the embodiments of this application.
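The iterative update and its exit conditions can be illustrated with a minimal sketch. It is not the implementation of this application: the nonlinear transformation is assumed to be `tanh(a * z + b)` with two scalar parameters in place of a neural network model, the texture is a short list of normalized texel values, and the names `encode_texture`, `count_threshold`, `loss_threshold` and `lr` are hypothetical.

```python
import math
import random


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def encode_texture(texels, count_threshold=500, loss_threshold=1e-4,
                   lr=0.2, seed=0):
    """Fit per-texel features z and parameters (a, b) so that
    tanh(a * z + b) approximates the original texels (assumed model)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 0.1) for _ in texels]  # feature data: random-noise init
    a, b = 1.0, 0.0                            # parameters: manually set init
    n = len(texels)
    updates = 0
    losses = []
    while True:
        recon = [math.tanh(a * zi + b) for zi in z]  # reconstruct the texture
        loss = mse(recon, texels)                    # encoding loss of this iteration
        losses.append(loss)
        # Exit when the loss does not exceed the loss threshold,
        # or when the number of updates reaches the count threshold.
        if loss <= loss_threshold or updates >= count_threshold:
            break
        # Gradient of the loss with respect to the tanh input of each texel.
        g = [2.0 * (y - t) / n * (1.0 - y * y) for y, t in zip(recon, texels)]
        grad_a = sum(gi * zi for gi, zi in zip(g, z))
        grad_b = sum(g)
        # Update the feature data and the transformation parameters.
        z = [zi - lr * gi * a for zi, gi in zip(z, g)]
        a -= lr * grad_a
        b -= lr * grad_b
        updates += 1
    return z, (a, b), losses, updates


features, params, losses, updates = encode_texture([0.1, 0.5, 0.9, 0.3])
```

In this sketch either threshold alone ends the loop, so the iteration can stop early when the loss threshold is met before the count threshold is reached.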

[0181] In some embodiments, a secondary compression step can be added to the above iteration process to further improve the compression ratio. This will be explained in conjunction with Figure 11.

[0182] Figure 11 is a flowchart of another texture encoding method provided in an embodiment of this application. Referring to Figure 11, the encoding end reconstructs the first texture based on the fourth feature data and nonlinear transformation parameters in the current iteration process, including: encoding (i.e., compressing) the fourth feature data and nonlinear transformation parameters in the current iteration process to obtain the fifth feature data and nonlinear transformation compression parameters in the current iteration process; decoding the fifth feature data and nonlinear transformation compression parameters in the current iteration process to obtain feature texture data (also called texture feature data, or texture feature map, etc.) and reconstructed nonlinear transformation parameters; and reconstructing the first texture based on the feature texture data and reconstructed nonlinear transformation parameters. It should be understood that if the fourth feature data is regarded as feature data obtained through one encoding, then the fifth feature data can be regarded as feature data obtained through two encodings. The fifth feature data and nonlinear transformation compression parameters obtained through the last compression are the feature data of the first texture.

[0183] There are various ways to encode the fourth feature data and the nonlinear transformation parameters, such as one or more of entropy coding, vector quantization, and wavelet transform. This application embodiment does not limit this method. The method of encoding the fourth feature data can be the same as or different from the method of encoding the nonlinear transformation parameters. This application embodiment does not limit this method. The method of decoding the fifth feature data and the nonlinear transformation compression parameters corresponds to (i.e., matches) the method used to encode the fourth feature data and the nonlinear transformation parameters. If the encoding method is entropy coding, then the corresponding decoding method is entropy decoding. If the encoding method is vector quantization, then the corresponding decoding method is inverse vector quantization.
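As an illustration of a matched encoding and decoding pair, the following sketch shows vector quantization and inverse vector quantization over a small fixed codebook. The codebook values and the function names `vq_encode` and `vq_decode` are assumptions made for the example; this application does not specify how the codebook is obtained.

```python
def vq_encode(vectors, codebook):
    """Vector quantization: replace each vector by the index of its nearest codeword."""
    def dist2(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return [min(range(len(codebook)), key=lambda i: dist2(v, codebook[i]))
            for v in vectors]


def vq_decode(indices, codebook):
    """Inverse vector quantization: look each index up in the same codebook."""
    return [codebook[i] for i in indices]


codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]   # hypothetical codebook
features = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.8)]  # hypothetical feature vectors
indices = vq_encode(features, codebook)
reconstructed = vq_decode(indices, codebook)
```

The decoder must use the same codebook as the encoder, which is the sense in which the decoding method matches the encoding method.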

[0184] The process of reconstructing the first texture based on feature texture data and reconstructed nonlinear transformation parameters includes: decoding the feature texture data to obtain reconstructed feature data; constructing feature vectors from the reconstructed feature data to obtain multiple feature vectors; and performing a nonlinear transformation on the multiple feature vectors according to the reconstructed nonlinear transformation parameters to reconstruct the first texture. This nonlinear transformation can be implemented based on a neural network model or a nonlinear transformation function, etc.

[0185] The texture decoding method can be determined based on the specific texture encoding format of the feature texture data. If the feature texture data is in ASTC format, the texture decoding method is ASTC decoding; if the feature texture data is in BC format, the texture decoding method is BC decoding. The feature vector can be constructed in the manner described in step (22) of the decoding process in the neural texture encoding and decoding technology, or in other possible ways. This application embodiment does not limit this method.

[0186] In some embodiments, a secondary compression step can be added after the iteration. In this case, during the iteration, the first texture is reconstructed based on the fourth feature data and nonlinear transformation parameters of the current iteration as follows: decoding the fourth feature data of the current iteration to obtain reconstructed feature data of the first texture; constructing multiple feature vectors from the reconstructed feature data; and performing a nonlinear transformation on these feature vectors according to the nonlinear transformation parameters to reconstruct the first texture. After the iteration ends, the current fourth feature data is encoded to obtain fifth feature data, and the current nonlinear transformation parameters are encoded to obtain nonlinear transformation compression parameters. The fifth feature data and the nonlinear transformation compression parameters constitute the feature data of the first texture.

[0187] The aforementioned encoding loss can be obtained by calculating the mean absolute error (MAE), mean square error (MSE), or cross-entropy loss between the reconstructed first texture and the original first texture, or by other means. This application does not limit this method.
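For reference, the first two losses named above can be written as follows, treating the reconstructed texture and the original texture as flat sequences of texel values. The function names are chosen for the example only.

```python
def mean_absolute_error(reconstructed, original):
    """MAE: average absolute difference per texel value."""
    return sum(abs(r - o) for r, o in zip(reconstructed, original)) / len(original)


def mean_squared_error(reconstructed, original):
    """MSE: average squared difference per texel value."""
    return sum((r - o) ** 2 for r, o in zip(reconstructed, original)) / len(original)


mae = mean_absolute_error([1.0, 2.0], [1.0, 4.0])
mse_value = mean_squared_error([1.0, 2.0], [1.0, 4.0])
```

MSE penalizes large per-texel errors more heavily than MAE, which is one reason a given embodiment might choose one over the other.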

[0188] It should be understood that the fourth feature data can be data in a texture-encoded format. The fourth feature data may include endpoint feature data and weight feature data corresponding to each feature map in the multiple feature maps of the first texture. The endpoint feature data represents the texel value range of the first texture, and the weight feature data is used to determine the texel value of the first texture by combining this texel value range. Of course, the fourth feature data can also be data in other formats, such as feature map format data; this embodiment of the application does not limit this.
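The relationship between endpoint feature data and weight feature data can be sketched as a linear interpolation, which is how block-based texture formats such as BC and ASTC conceptually combine endpoints and weights. The sketch assumes one pair of scalar endpoints per block and weights in [0, 1]; real formats quantize both and support multi-channel endpoints.

```python
def decode_block(endpoints, weights):
    """Interpolate each value inside the range given by the two endpoints."""
    low, high = endpoints
    return [low + w * (high - low) for w in weights]


values = decode_block((0.25, 0.75), [0.0, 0.5, 1.0])
```

A weight of 0 selects the low endpoint, a weight of 1 selects the high endpoint, and intermediate weights select values in between.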

[0189] Taking the use of traditional texture encoding and decoding techniques to determine the feature data of the first texture as an example, Figure 12 is a flowchart of another texture encoding method provided in an embodiment of this application. Referring to Figure 12, the encoding end performs texture encoding on the first texture to obtain feature data in a texture encoding format (referred to as texture format data). This texture encoding format feature data can be used as the feature data of the first texture. Alternatively, the texture format data can be encoded a second time (i.e., texture super-compression) to obtain texture format compressed data, which is the feature data of the first texture.

[0190] The texture encoding method for the first texture can be ASTC, BC, or other texture encoding methods, and this application embodiment does not limit this. The texture super-compression method can be entropy coding, vector quantization, or other compression methods, and this application embodiment does not limit this.

[0191] Step 1003: Encode the first feature data into the bitstream.

[0192] The first feature data includes a portion of the feature data from the first texture. That is, the encoder encodes a portion of the first texture's feature data into the bitstream, while omitting the remaining feature data, thus reducing the bitstream size. The following section will explain which portion of the data can be omitted from the bitstream.

[0193] In this embodiment of the application, the feature data encoded into the bitstream is referred to as the first feature data, and the feature data not encoded into the bitstream is referred to as the third feature data. That is, the feature data of the first texture includes the first feature data and the third feature data.

[0194] Based on the data content, in some embodiments, the feature data of the first texture includes endpoint feature data and weight feature data corresponding to each of the N feature maps of the first texture, or it may also include nonlinear transformation parameters (or nonlinear transformation compression parameters, which are essentially also a type of nonlinear transformation parameters; for ease of description, the uncompressed nonlinear transformation parameters and the compressed nonlinear transformation compression parameters will be uniformly referred to as nonlinear transformation parameters in the following text). Wherein, N is a positive integer.

[0195] If we consider the feature data of the first texture at the feature map granularity, it can be divided into data corresponding to N feature maps. Each feature map further includes multiple feature blocks. If we consider the feature data of the first texture at the feature block granularity, it can be divided into feature data corresponding to many feature blocks. Furthermore, if we consider the data type, the feature data of the first texture can be divided into weight feature data, endpoint feature data, and possibly nonlinear transformation parameters.

[0196] Based on this, the data reuse processing objects in the embodiments of this application can have multiple granularities, the range of data that can be reused can be multiple, the types of data in the processing objects can also be multiple, and the forms of reused data can also be multiple, which will be introduced separately below.

[0197] 1. The granularity of the data reuse processing objects.

[0198] In the embodiments of this application, a texture has one or more feature maps, and each feature map includes multiple feature blocks. The granularity of data reuse can be the feature block granularity, the feature map granularity, or the granularity of feature map + feature block.

[0199] If the granularity of data reuse is at the feature map level, then the feature data corresponding to any feature map can be checked (e.g., redundancy detection or other detection methods; for ease of description, this will be referred to as redundancy detection below), or the feature data corresponding to the feature map can be excluded from the bitstream according to the data reuse rules. If the granularity of data reuse is at the feature block level, then the feature data corresponding to any feature block can be checked for redundancy, or the feature data corresponding to the feature block can be excluded from the bitstream according to the data reuse rules. If the granularity of data reuse is at the feature map + feature block level, then the feature data corresponding to any feature map can first be checked for redundancy, or the feature data corresponding to the feature map can be directly discarded (i.e., excluded from the bitstream) according to the data reuse rules. If the feature data corresponding to the feature map cannot be directly discarded, then redundancy detection is performed on each feature block in the feature map, or the feature data corresponding to the feature block can be discarded according to the data reuse rules.

[0200] 2. The range of data that can be reused.

[0201] In summary, data reuse can be performed between feature maps (or feature blocks) of the same texture, or between feature maps (or feature blocks) of different textures.

[0202] For the aforementioned redundancy detection method, feature data can be discarded (i.e., not included in the bitstream) if the encoded feature data (and / or a feature dataset) contains data whose similarity to that feature data exceeds a similarity threshold. The encoded feature data can be the encoded feature data of the texture to which the feature data belongs, or the encoded feature data of other textures in the image set to which that texture belongs. The feature dataset can include feature data generated by artificial intelligence (AI) (e.g., generated in the cloud or on the client side), or it can include collected feature data, or feature data obtained through other means. Taking the discardable feature data as all feature data corresponding to the first feature map of the first texture as an example, the feature data corresponding to the first feature map has a similarity exceeding the similarity threshold with the feature data corresponding to another encoded feature map of the first texture, or with the feature data corresponding to an encoded feature map of another texture in the image set to which the first texture belongs, or with the feature data corresponding to a feature map in the feature dataset. Taking the discardable feature data as all feature data corresponding to the first feature block of the first texture as an example, the feature data corresponding to the first feature block has a similarity exceeding the similarity threshold with the feature data corresponding to another encoded feature block of the first texture, or with the feature data corresponding to an encoded feature block of another texture in the image set to which the first texture belongs, or with the feature data corresponding to a feature block in the feature dataset.

[0203] For the aforementioned data reuse rule approach, data reuse rules can be specified between feature maps (or feature blocks) of the same texture, or between feature maps (or feature blocks) of different textures. For example, for data reuse within the same texture, a rule can specify that the i-th feature map of the first texture reuses all feature data corresponding to the j-th feature map of the first texture, or that the k-th feature block of each feature map reuses the feature data corresponding to the f-th feature block of that feature map. Here, i and j have different values, and k and f have different values. Similarly, for data reuse between different textures, a rule can specify that the p-th feature map of the first texture reuses all feature data corresponding to the q-th feature map of the second texture, or that the b-th feature block of the a-th feature map of the first texture reuses the feature data corresponding to the d-th feature block of the c-th feature map of the second texture. Here, p and q can be the same or different; a and c can be the same or different; b and d can be the same or different.

[0204] 3. Data types of the data objects being processed for data reuse.

[0205] In this embodiment, the feature data of a texture includes two types: weight feature data and endpoint feature data. The data reuse processing object can be weight feature data, endpoint feature data, or both.

[0206] Taking redundancy detection as an example, redundancy detection can be performed on a portion of the feature data corresponding to the feature map, or on all feature data. Here, the portion of feature data can be part or all of the weight feature data corresponding to the feature map, or part or all of the endpoint feature data. In one possible implementation, if redundancy detection is performed on the weight feature data corresponding to the feature map, and it is determined that there is no data in the bitstream with a similarity exceeding a similarity threshold to the weight feature data corresponding to the feature map, then the weight feature data corresponding to the feature map cannot be discarded (i.e., it needs to be included in the bitstream). In another possible implementation, if redundancy detection is performed on the weight feature data corresponding to the feature map, and it is determined that there is no data in the bitstream with a similarity exceeding a similarity threshold to the weight feature data corresponding to the feature map, then redundancy detection is further performed on the weight feature data corresponding to each feature block in the feature map to determine whether there is data in the bitstream with a similarity exceeding a similarity threshold to the weight feature data corresponding to each feature block. Based on the detection results, it is determined whether the weight feature data corresponding to each feature block can be discarded.

[0207] Taking data reuse rules as an example, data reuse rules can specify which weight feature data and / or endpoint feature data of the first texture can be excluded from the bitstream, and which weight feature data and / or endpoint feature data will be reused during the decoding process for the data not included in the bitstream. As an example, a data reuse rule can specify discarding the weight feature data corresponding to the 2nd to Nth feature maps of each texture, encoding the weight feature data of the 1st feature map and all endpoint feature data corresponding to the N feature maps, and reusing the weight feature data corresponding to the reconstructed 1st feature map during the decoding process to obtain the weight feature data corresponding to the reconstructed 2nd to Nth feature maps. Here, N is a positive integer greater than 1.
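The example rule above can be sketched as a pair of functions. The dictionary layout (`"weights"`, `"endpoints"`) and the function names are hypothetical; the sketch only shows which data enters the bitstream payload and how the omitted weight feature data is refilled at the decoding end.

```python
def encode_with_rule(feature_maps):
    """Keep the weight data of the 1st feature map and the endpoint data of all N maps."""
    return {
        "weights_map1": feature_maps[0]["weights"],
        "endpoints": [fm["endpoints"] for fm in feature_maps],
    }


def decode_with_rule(payload):
    """Rebuild all N maps, reusing the reconstructed weights of the 1st map."""
    shared = payload["weights_map1"]
    return [{"weights": list(shared), "endpoints": ep}
            for ep in payload["endpoints"]]


maps = [
    {"weights": [0.0, 0.5, 1.0], "endpoints": (0.1, 0.9)},
    {"weights": [0.1, 0.4, 0.9], "endpoints": (0.2, 0.6)},
]
payload = encode_with_rule(maps)
decoded = decode_with_rule(payload)
```

Under such a rule the 2nd to Nth feature maps keep their own endpoint feature data but receive the weights of the 1st feature map, so the reuse is lossy unless the discarded weights were identical to the reused ones.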

[0208] In some embodiments, the texture feature data further includes nonlinear transformation parameters (described in detail below). Therefore, the data reuse processing object can be some or all of the nonlinear transformation parameters. Taking the nonlinear transformation parameters as parameters of a neural network as an example, the data reuse processing object can be all the parameters of the neural network, or it can be the parameters of some network layers of the neural network.

[0209] 4. The format of reused data.

[0210] In the embodiments of this application, the decoding process of each texture includes multiple decoding operations executed sequentially. The data obtained by any one of the multiple decoding operations can be reused. For ease of description, the data obtained by the last decoding operation is called the final decoding result, and the data obtained by the other decoding operations before the last decoding operation is called intermediate data (or intermediate results). Therefore, the reused data can be the final decoding result or intermediate data.

[0211] As an example, these multiple decoding operations include parsing, entropy decoding, and inverse vector quantization. Therefore, the data obtained from parsing can be reused directly, the entropy decoding result can be reused, and the data obtained from inverse vector quantization can be reused.
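The following sketch illustrates keeping the output of each decoding operation so that any of them can be reused. It makes several assumptions that are not part of this application: the bitstream is a one-byte length header followed by a payload, `zlib` decompression stands in for entropy decoding, and `CODEBOOK` is a hypothetical codebook for inverse vector quantization.

```python
import zlib

CODEBOOK = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]  # hypothetical codebook


def decode_stages(bitstream):
    """Run the decoding operations in order and keep every stage's output."""
    parsed = bitstream[1:1 + bitstream[0]]    # parsing: strip the length header
    indices = list(zlib.decompress(parsed))   # stand-in for entropy decoding
    vectors = [CODEBOOK[i] for i in indices]  # inverse vector quantization
    return {"parsed": parsed, "entropy_decoded": indices, "dequantized": vectors}


payload = zlib.compress(bytes([0, 2, 1]))
bitstream = bytes([len(payload)]) + payload
stages = decode_stages(bitstream)

# Another texture may reuse intermediate data (here the entropy decoding result)
# and run only the remaining operations, or reuse the final result directly.
reused_from_intermediate = [CODEBOOK[i] for i in stages["entropy_decoded"]]
reused_final = stages["dequantized"]
```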

[0212] The data reuse concept in the encoding and decoding methods mentioned in the embodiments of this application has been introduced from multiple perspectives. It should be understood that the various implementation methods from the above perspectives can be used in combination or individually as needed, and the embodiments of this application do not limit this.

[0213] Based on the above, in the embodiments of this application, the third feature data of the first texture (i.e., data not encoded into the bitstream) may include one or more of the following: feature data corresponding to a feature map, feature data corresponding to a feature block, weight feature data, endpoint feature data, and nonlinear transformation parameters.

[0214] The method for determining the third feature data will be described in detail below.

[0215] The first implementation method determines the third feature data through redundancy detection.

[0216] 1. Implementation methods for feature map granularity reuse.

[0217] In this embodiment, the feature data of the first texture includes N feature maps of the first texture, and the similarity threshold includes a first similarity threshold. If there is a feature map in the bitstream (including encoded feature data) or feature dataset whose similarity to the feature data corresponding to the first feature map in the N feature maps exceeds the first similarity threshold, then the feature data corresponding to the first feature map is determined to be data in the third feature data, wherein the first feature map is any one of the N feature maps.

[0218] 2. Implementation methods for reuse at the feature map + feature block granularity.

[0219] In this embodiment of the application, each feature map includes multiple feature blocks, and the similarity threshold also includes a second similarity threshold. Based on the feature map granularity, if there is no feature map in the bitstream or feature dataset whose similarity to the feature data corresponding to the first feature map exceeds the first similarity threshold, and there is a feature block in the bitstream or feature dataset whose similarity to the feature data corresponding to the first feature block in the first feature map exceeds the second similarity threshold, then the feature data corresponding to the first feature block is determined to be data in the third feature data, wherein the first feature block is any feature block in the first feature map.

[0220] 3. Implementation methods for feature block granularity reuse.

[0221] If there is a feature block in the bitstream or feature dataset whose similarity to the feature data corresponding to the first feature block in the above N feature maps exceeds the second similarity threshold, then the feature data corresponding to the first feature block is determined to be the data in the third feature data. Here, the first feature block is any feature block in the first feature map, and the first feature map is any feature map in the N feature maps.
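The feature map + feature block granularity described above can be sketched as a two-level check. The similarity measure used here, `1 / (1 + mean squared difference)`, is an assumption, since this application does not fix a particular measure; a feature map is modeled as a list of equally sized feature blocks, and all names are hypothetical.

```python
def similarity(a, b):
    """Hypothetical similarity in (0, 1]: 1 / (1 + mean squared difference)."""
    d = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + d)


def flatten(feature_map):
    """Concatenate the blocks of a feature map into one flat list."""
    return [v for block in feature_map for v in block]


def find_discardable(feature_map, encoded_maps, first_threshold, second_threshold):
    """Return ("map", None) if the whole map is redundant,
    otherwise ("blocks", indices of redundant feature blocks)."""
    flat = flatten(feature_map)
    if any(similarity(flat, flatten(m)) > first_threshold for m in encoded_maps):
        return ("map", None)
    encoded_blocks = [b for m in encoded_maps for b in m]
    redundant = [i for i, block in enumerate(feature_map)
                 if any(similarity(block, b) > second_threshold
                        for b in encoded_blocks)]
    return ("blocks", redundant)


encoded = [[[0.0, 0.0], [1.0, 1.0]]]  # one already-encoded feature map
whole = find_discardable([[0.0, 0.0], [1.0, 1.0]], encoded, 0.99, 0.99)
partial = find_discardable([[1.0, 1.0], [5.0, 5.0]], encoded, 0.99, 0.99)
```

Feature block granularity alone corresponds to skipping the first check, and feature map granularity alone corresponds to skipping the second.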

[0222] The similarity for the feature data corresponding to the first feature map can be calculated directly on that feature data. Alternatively, the similarity can be calculated on the feature data corresponding to a second feature map, where the second feature map is a transformed version of the first feature map; that is, the feature data corresponding to the first feature map is transformed before the similarity calculation. Alternatively, the feature data corresponding to the feature maps in the encoded feature data and / or the feature data corresponding to the feature maps in the feature dataset can be transformed before calculating the similarity with the feature data corresponding to the first feature map. Or, the feature data corresponding to the feature maps in the encoded feature data and / or the feature data corresponding to the feature maps in the feature dataset, as well as the feature data corresponding to the first feature map, can all be transformed before the similarity calculation. In short, various transformations are used to find as much reusable feature data corresponding to feature maps as possible. There are many specific implementation methods, which will not be listed here.

[0223] Similarly, the similarity for the feature data corresponding to the first feature block can be calculated directly on that feature data. Alternatively, the similarity can be calculated on the feature data corresponding to a second feature block, where the second feature block is a transformed version of the first feature block; that is, the feature data corresponding to the first feature block is transformed before the similarity calculation. Alternatively, the feature data corresponding to the feature blocks in the encoded feature data and / or the feature data corresponding to the feature blocks in the feature dataset can be transformed before calculating the similarity with the feature data corresponding to the first feature block. Or, the feature data corresponding to the feature blocks in the encoded feature data and / or the feature data corresponding to the feature blocks in the feature dataset, as well as the feature data corresponding to the first feature block, can all be transformed before the similarity calculation. In short, various transformations are used to find as much reusable feature data corresponding to feature blocks as possible. There are many specific implementation methods, which will not be listed here.

[0224] The above transformations include, but are not limited to, one or more of rotation, translation, and mirroring. It should be understood that different transformations or the same transformation can be performed on different data, and the embodiments of this application do not limit this.
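As a minimal sketch of the transformation-based comparison described above, the following Python example compares a target block against rotated and mirrored versions of a reference block. The function names, the use of Euclidean distance as an inverse measure of similarity (a smaller distance standing in for a higher similarity), and the omission of translation are all assumptions made for illustration and are not taken from this application.

```python
import math

def rotate90(block):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]

def mirror(block):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in block]

def distance(a, b):
    """Euclidean distance between two equally sized 2D lists."""
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(a, b)
                         for x, y in zip(ra, rb)))

def candidate_transforms(block):
    """Yield (name, block) for four rotations and the mirror of each."""
    current = block
    for quarter_turns in range(4):
        yield (f"rot{90 * quarter_turns}", current)
        yield (f"rot{90 * quarter_turns}+mirror", mirror(current))
        current = rotate90(current)

def best_match(target, reference, max_distance):
    """Return (transform name, distance) of the closest transformed
    reference within max_distance, or None if there is none."""
    best = None
    for name, transformed in candidate_transforms(reference):
        # Rotation can change the shape of non-square blocks.
        if (len(transformed) != len(target)
                or len(transformed[0]) != len(target[0])):
            continue
        d = distance(target, transformed)
        if d <= max_distance and (best is None or d < best[1]):
            best = (name, d)
    return best

reference = [[1, 2], [3, 4]]
target = rotate90(reference)
match = best_match(target, reference, max_distance=0.5)
```

Here `match` names the transformation that a decoder would need to apply to the reference before reusing it, which corresponds to the optional transformation method carried in the indication information.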

[0225] 4. Implementation methods for reusing weight feature data.

[0226] In this embodiment of the application, the feature data of the first texture includes the weight feature data and endpoint feature data corresponding to each of the N feature maps of the first texture.

[0227] In some embodiments, the similarity threshold includes a third similarity threshold. If there is data in the bitstream or feature dataset whose similarity to the first weight feature data corresponding to the first feature map in the N feature maps exceeds the third similarity threshold, then the first weight feature data is determined to be data in the third feature data, wherein the first feature map is any one of the N feature maps, and the first weight feature data is part or all of the weight feature data corresponding to the first feature map.

[0228] In one possible implementation, if there is no data in the bitstream or feature dataset whose similarity to the weight feature data corresponding to the first feature map in the N feature maps exceeds the third similarity threshold, then redundancy detection is performed on the weight feature data corresponding to one or more feature blocks in the first feature map to determine whether there is data in the bitstream or feature dataset whose similarity to the weight feature data corresponding to one or more feature blocks exceeds the third similarity threshold. If so, the weight feature data corresponding to the one or more feature blocks is determined to be data in the third feature data.

[0229] The number of feature blocks here can be set to a fixed number, such as 1, 2, 3, or another number less than S, where S represents the total number of feature blocks in a feature map. Taking a fixed number of 1 as an example, redundancy detection is performed on the weight feature data corresponding to each feature block. Taking a fixed number of 2 as an example, the weight feature data corresponding to any two adjacent feature blocks can be treated as a whole, and redundancy detection is performed on this whole. For example, redundancy detection is performed on the weight feature data corresponding to the 1st and 2nd feature blocks, on the 2nd and 3rd feature blocks, and so on, until redundancy detection is performed on the weight feature data corresponding to the (S-1)th and Sth feature blocks.

[0230] Alternatively, the number of feature blocks can be set as a variable whose range is at most [1, S-1]. The encoder can treat the weight feature data corresponding to any run of adjacent feature blocks whose length equals the current value of this variable as a whole and perform redundancy detection on this whole. The encoder can iterate through the values of this variable from largest to smallest. At a given value, if there is data in the bitstream or feature dataset whose similarity to the weight feature data corresponding to such a run of adjacent feature blocks exceeds the third similarity threshold, then that weight feature data can be identified as data in the third feature data. In the subsequent redundancy detection process, the encoder excludes the data already identified as third feature data and continues to perform redundancy detection on the weight feature data corresponding to the remaining feature blocks. It can be seen that using a variable allows more groupings of data to be compared, thereby finding redundant data to a greater extent.

[0231] The above implementation first performs redundancy detection on the weight feature data corresponding to the whole feature map. If no sufficiently similar data is detected, the unit of detection is reduced, and redundancy detection is performed on the smaller units of weight feature data. Of course, in some other implementations, redundancy detection can also be performed directly on the weight feature data according to a fixed number of feature blocks.
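The coarse-to-fine detection described above, which starts with the whole feature map and then tries shorter runs of adjacent feature blocks, can be sketched in Python as follows. This is only an illustration under stated assumptions: the toy `similarity` function, the representation of each feature block as a flat list of numbers, and the name `find_reusable_runs` are hypothetical and are not specified in this application.

```python
def similarity(a, b):
    """Toy similarity in (0, 1]: 1 / (1 + sum of absolute differences)."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))

def find_reusable_runs(blocks, references, threshold):
    """Return sorted (start, length, reference index) tuples for runs of
    adjacent blocks whose data is judged redundant.

    blocks: per-block weight data of one feature map (lists of numbers).
    references: flat lists standing in for data already in the bitstream
    or in the feature dataset.
    """
    s = len(blocks)
    claimed = [False] * s
    runs = []
    # Length s tests the whole map; then the run length shrinks to 1.
    for length in range(s, 0, -1):
        for start in range(0, s - length + 1):
            if any(claimed[start:start + length]):
                continue  # exclude data already identified as reusable
            flat = [v for block in blocks[start:start + length]
                    for v in block]
            for ref_index, ref in enumerate(references):
                if (len(ref) == len(flat)
                        and similarity(flat, ref) > threshold):
                    runs.append((start, length, ref_index))
                    for i in range(start, start + length):
                        claimed[i] = True
                    break
    return sorted(runs)

blocks = [[1, 1], [2, 2], [3, 3], [9, 9]]
references = [[2, 2, 3, 3], [9, 9]]
runs = find_reusable_runs(blocks, references, threshold=0.9)
```

In this example the 2nd and 3rd blocks (zero-based start 1, length 2) match the first reference as a pair, and the 4th block matches the second reference on its own, so only the 1st block would still need to be encoded.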

[0232] In other embodiments, redundancy detection is performed directly on the weight feature data corresponding to a feature block. That is, if there is data in the bitstream or feature dataset whose similarity to the second weight feature data corresponding to the first feature block exceeds a fifth similarity threshold, then the second weight feature data is determined to be data in the third feature data. Here, the first feature block is any feature block in the first feature map, the first feature map is any feature map in the N feature maps, and the second weight feature data is part or all of the weight feature data corresponding to the first feature block.

[0233] In one implementation, redundancy detection is directly performed on the entire weight feature data corresponding to the first feature block. In another implementation, redundancy detection is first performed on the entire weight feature data corresponding to the first feature block. If there is no data in the bitstream or feature dataset whose similarity to the entire weight feature data corresponding to the first feature block exceeds the fifth similarity threshold, then redundancy detection is performed on the weight feature data corresponding to one or more feature points in the first feature block to determine whether there is data in the bitstream or feature dataset whose similarity to the weight feature data of the one or more feature points exceeds the fifth similarity threshold. If so, the weight feature data of the one or more feature points is determined to be data in the third feature data.

[0234] The number of one or more feature points here can be set to a fixed number, such as 1, 2, 3, or a number less than P, where P represents the total number of feature points in a feature block. Taking a fixed number of 1 as an example, redundancy detection is performed on the weight feature data of each feature point. Taking a fixed number of 2 as an example, the weight feature data of any two adjacent feature points can be treated as a whole, and redundancy detection is performed on this whole. For example, redundancy detection is performed on the weight feature data of the 1st to 2nd feature points, on the 2nd to 3rd feature points, and so on, until redundancy detection is performed on the weight feature data of the (P-1)th to Pth feature points.

[0235] Alternatively, the number of feature points can be set as a variable whose range is at most [1, P-1]. The encoder can treat the weight feature data of any run of adjacent feature points whose length equals the current value of this variable as a whole and perform redundancy detection on this whole. The encoder can iterate through the values of this variable from largest to smallest. At a given value, if there is data in the bitstream or feature dataset whose similarity to the weight feature data of such a run of adjacent feature points exceeds the fifth similarity threshold, then that weight feature data can be identified as data in the third feature data. In the subsequent redundancy detection process, the encoder excludes the data already identified as third feature data and continues to perform redundancy detection on the weight feature data of the remaining feature points. It can be seen that using a variable allows more groupings of data to be compared, thereby finding redundant data to a greater extent.

[0236] The above implementation first performs redundancy detection on the weight feature data corresponding to the whole feature block. If no sufficiently similar data is detected, the unit of detection is reduced, and redundancy detection is performed on the smaller units of weight feature data. Of course, in some other implementations, redundancy detection can also be performed directly on the weight feature data using a fixed number of feature points.

[0237] Based on the reuse of weight feature data, in one possible implementation, the third feature data does not include the endpoint feature data corresponding to each of the N feature maps, while the first feature data encoded into the bitstream includes the endpoint feature data corresponding to each of the N feature maps. That is, redundancy detection is performed on the weight feature data to find the parts that can be omitted from the bitstream, while all endpoint feature data is encoded into the bitstream.

[0238] In one possible implementation, the similarity of the weight feature data is calculated directly by performing a similarity calculation on the corresponding data. In another possible implementation, the similarity of the weight feature data can also be calculated after transforming the corresponding data. These transformations include, but are not limited to, one or more of rotation, translation, and mirroring. It should be understood that different transformations or the same transformation can be applied to different data; this application does not limit this approach.

[0239] 5. Implementation of endpoint feature data reuse.

[0240] In one possible implementation, the similarity threshold also includes a fourth similarity threshold. If there is data in the bitstream or feature dataset whose similarity to the first endpoint feature data corresponding to the first feature map exceeds the fourth similarity threshold, then the first endpoint feature data is determined to be data in the third feature data. The first endpoint feature data is part or all of the endpoint feature data corresponding to the first feature map. That is, redundancy detection can also be performed on the endpoint feature data.

[0241] In one implementation, redundancy detection is directly performed on the entire endpoint feature data corresponding to the first feature map. In another implementation, redundancy detection is first performed on the entire endpoint feature data corresponding to the first feature map. If no data in the bitstream or feature dataset has a similarity exceeding a fourth similarity threshold to the entire endpoint feature data corresponding to the first feature map, then redundancy detection is performed on the endpoint feature data corresponding to one or more feature blocks in the first feature map to determine whether there is data in the bitstream or feature dataset that has a similarity exceeding a fourth similarity threshold to the endpoint feature data corresponding to one or more feature blocks. If such data exists, then the endpoint feature data corresponding to those one or more feature blocks is determined to be data in the third feature data.

[0242] The number of feature blocks here can be set to a fixed number, such as 1, 2, 3, or another number less than S, where S represents the total number of feature blocks in a feature map. Taking a fixed number of 1 as an example, redundancy detection is performed on the endpoint feature data corresponding to each feature block. Taking a fixed number of 2 as an example, the endpoint feature data corresponding to any two adjacent feature blocks can be treated as a whole, and redundancy detection is performed on this whole. For example, redundancy detection is performed on the endpoint feature data corresponding to the 1st and 2nd feature blocks, on the 2nd and 3rd feature blocks, and so on, until redundancy detection is performed on the endpoint feature data corresponding to the (S-1)th and Sth feature blocks.

[0243] Alternatively, the number of feature blocks can be set as a variable whose range is at most [1, S-1]. The encoder can treat the endpoint feature data corresponding to any run of adjacent feature blocks whose length equals the current value of this variable as a whole and perform redundancy detection on this whole. The encoder can iterate through the values of this variable from largest to smallest. At a given value, if there is data in the bitstream or feature dataset whose similarity to the endpoint feature data corresponding to such a run of adjacent feature blocks exceeds the fourth similarity threshold, then that endpoint feature data can be identified as data in the third feature data. In the subsequent redundancy detection process, the encoder excludes the data already identified as third feature data and continues to perform redundancy detection on the endpoint feature data corresponding to the remaining feature blocks. It can be seen that using a variable allows more groupings of data to be compared, thereby finding redundant data to a greater extent.

[0244] The above implementation first performs redundancy detection on the endpoint feature data corresponding to the feature map. If no highly similar data is detected, the size is reduced, and redundancy detection is performed on the reduced-size endpoint feature data. Of course, in some other implementations, redundancy detection can also be performed directly on the endpoint feature data according to a fixed number of feature blocks.

[0245] Building upon endpoint feature data reuse, in one possible implementation, the third feature data does not include the weight feature data of each corresponding feature map in the N feature maps, while the first feature data encoded into the bitstream includes the weight feature data corresponding to each feature map in the N feature maps. That is, redundancy detection is performed on the endpoint feature data to find the parts that can be omitted from the bitstream, while all weight feature data is encoded into the bitstream. Of course, in another possible implementation, the aforementioned weight feature data reuse and endpoint feature data reuse can be combined; that is, redundancy detection is performed on both the weight feature data and the endpoint feature data.

[0246] In one possible implementation, the similarity of endpoint feature data is calculated directly by performing similarity calculations on the corresponding data. In another possible implementation, the similarity of endpoint feature data can also be calculated after transforming the corresponding data. These transformations include, but are not limited to, one or more of rotation, translation, and mirroring. It should be understood that different transformations or the same transformation can be applied to different data; this application does not limit this approach.

[0247] 6. Implementation of nonlinear transformation parameter reuse.

[0248] In some embodiments, the feature data of the first texture further includes nonlinear transformation parameters used to reconstruct the first texture, and the aforementioned similarity threshold also includes a seventh similarity threshold. If there is data in the bitstream or feature dataset whose similarity to part or all of the nonlinear transformation parameters of the first texture exceeds the seventh similarity threshold, then part or all of the nonlinear transformation parameters are determined to be data in the third feature data.

[0249] In one possible implementation, redundancy detection is performed directly on the entire nonlinear transformation parameter. In another possible implementation, redundancy detection is first performed on the entire nonlinear transformation parameter. If no data in the bitstream or feature dataset has a similarity exceeding the seventh similarity threshold with the nonlinear transformation parameter, then redundancy detection is performed on a subset of the nonlinear transformation parameter. Taking the nonlinear transformation parameter as an example, which includes parameters from multiple network layers in a neural network model, this subset of data can be parameters from a subset of network layers. These subsets can be specific network layers or one or more network layers currently being traversed. In yet another possible implementation, redundancy detection is performed directly on the parameters of a specific network layer within the nonlinear transformation parameter. Other implementations are also possible, but they will not be listed here.
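The two-stage check described above, which first tests the nonlinear transformation parameters as a whole and then tests them layer by layer, might be sketched as follows. The dictionary layout (layer name mapped to a flat list of parameters), the maximum absolute difference used as the comparison, and the function names are assumptions for illustration only; a small difference stands in for a similarity exceeding the seventh similarity threshold.

```python
def max_abs_difference(a, b):
    """Largest element-wise absolute difference (0.0 for empty input)."""
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)

def plan_parameter_reuse(layers, stored_layers, tolerance):
    """Decide whether stored parameters can be reused.

    layers / stored_layers: dicts mapping a layer name to a flat list of
    parameters. Returns ("all", None) when every layer matches, otherwise
    ("per_layer", set of layer names whose parameters can be reused).
    """
    def matches(name):
        stored = stored_layers.get(name)
        return (stored is not None
                and len(stored) == len(layers[name])
                and max_abs_difference(layers[name], stored) <= tolerance)

    reusable = {name for name in layers if matches(name)}
    if reusable == set(layers):
        return "all", None
    return "per_layer", reusable

current = {"layer1": [0.1, 0.2], "layer2": [1.0]}
stored = {"layer1": [0.1, 0.2], "layer2": [2.0]}
plan = plan_parameter_reuse(current, stored, tolerance=1e-3)
```

In this example only `layer1` matches the stored parameters, so the parameters of `layer2` would still be encoded into the bitstream.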

[0250] It should be understood that the reuse of nonlinear transformation parameters can be combined with the reuse of other data or applied alone, and the embodiments of this application do not limit this.

[0251] The first to seventh similarity thresholds mentioned above can be the same or different. If two similarity thresholds are the same, a single parameter can be set as that threshold. Alternatively, two parameters with equal values can be set as two thresholds.

[0252] The similarity mentioned above can be obtained by directly calculating the Euclidean distance, Hamming distance, etc. between two data points, or by clustering all feature maps (or feature blocks, or weighted feature data, or endpoint feature data) to obtain multiple sets of data. The similarity of each set of data obtained by clustering is usually relatively high. Based on this, one data point in each set is retained, the remaining data is discarded, the retained data is encoded into the bitstream, and the discarded data is the third feature data.
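As a hedged illustration of the distance-based and clustering-based approaches described above, the sketch below computes Euclidean and Hamming distances and then applies a greedy leader clustering, keeping one representative per group. The choice of leader clustering is an assumption made for brevity, since this application does not name a specific clustering algorithm, and all function names and the `radius` parameter are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def split_by_leader_clustering(items, radius, dist=euclidean):
    """Greedy leader clustering.

    Returns (kept, reused). kept lists the indices whose data would be
    encoded into the bitstream; reused maps each omitted index to the
    kept index whose data it would reuse.
    """
    kept = []
    reused = {}
    for i, item in enumerate(items):
        for leader in kept:
            if dist(item, items[leader]) <= radius:
                reused[i] = leader  # close to an existing representative
                break
        else:
            kept.append(i)  # starts a new group as its representative
    return kept, reused

items = [[0, 0], [0.1, 0], [5, 5], [5, 5.1]]
kept, reused = split_by_leader_clustering(items, radius=0.5)
```

The items listed in `reused` play the role of the third feature data, and the mapping itself is the kind of information a reuse index would need to convey.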

[0253] Besides calculating similarity results such as Euclidean distance and Hamming distance, and clustering methods, other methods can also be used to determine the third feature data. For example, other deep learning methods besides clustering can be used to determine the third feature data. This application does not limit this approach.

[0254] The second approach is to determine the third feature data according to data reuse rules.

[0255] In other words, redundancy detection is not performed; instead, the data that does not need to be included in the bitstream is determined directly according to the data reuse rules.

[0256] There are many ways to implement data reuse rules. Different scenarios can have different data reuse rules, and different texture sets can also have different data reuse rules. Data reuse rules can be flexibly set according to the actual situation. Here are some examples of data reuse rules.

[0257] Example 1: The data reuse rule indicates that the weight feature data corresponding to the second to Nth feature maps is the third feature data, and the second to Nth feature maps can reuse the weight feature data corresponding to the first feature map.

[0258] Example 2: The data reuse rule indicates that the endpoint feature data corresponding to the second to Nth feature maps are the third feature data, and the second to Nth feature maps can reuse the endpoint feature data corresponding to the first feature map.

[0259] Example 3: The data reuse rule indicates that the feature data corresponding to the 2nd, 4th, 6th... feature blocks in each feature map is the third feature data, the 2nd feature block can reuse the feature data corresponding to the 1st feature block, the 4th feature block can reuse the feature data corresponding to the 3rd feature block, and so on.

[0260] Example 4: The data reuse rule indicates that the feature data corresponding to the second row of feature blocks in the feature map is the third feature data, and the second row of feature blocks can reuse the feature data corresponding to the first row of feature blocks.
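The rule-based examples above can be expressed as simple index mappings that need no similarity calculation. The following sketch covers Example 1 and Example 3 only; the function names and the 1-based indexing are assumptions chosen to mirror the wording of the examples.

```python
def rule_reuse_first_map_weights(num_maps):
    """Example 1: feature maps 2..N reuse the weight feature data of
    feature map 1 (1-based indices)."""
    return {m: 1 for m in range(2, num_maps + 1)}

def rule_even_blocks_reuse_previous(num_blocks):
    """Example 3: feature blocks 2, 4, 6, ... reuse the feature data of
    feature blocks 1, 3, 5, ... (1-based indices)."""
    return {b: b - 1 for b in range(2, num_blocks + 1, 2)}

def split_by_rule(num_items, rule):
    """Return (indices to encode, mapping of omitted index to the index
    whose data it reuses)."""
    encoded = [i for i in range(1, num_items + 1) if i not in rule]
    return encoded, rule

map_rule = rule_reuse_first_map_weights(4)
block_rule = rule_even_blocks_reuse_previous(5)
encoded_blocks, _ = split_by_rule(5, block_rule)
```

Because the encoder and the decoder can derive the same mapping from the rule, the omitted data can be located without searching the bitstream or the feature dataset.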

[0261] After determining the third feature data through the redundancy detection of the first approach described above, the encoder also needs to encode multiplexing indication information (that is, reuse indication information) into the bitstream. The multiplexing indication information is used to obtain the second feature data, and the second feature data is used to decode the first feature data so as to reconstruct the feature data of the first texture. That is, the multiplexing indication information instructs the decoder to perform data multiplexing accurately during the decoding process in order to decode the texture.

[0262] In one implementation, the multiplexing indication information indicates the second feature data, which is used to reconstruct the third feature data. The second feature data is the data multiplexed at the decoding end.

[0263] As an example, the reuse indication information includes a reuse index, which is an index of the second feature data, used to indicate the second feature data. Taking the second feature data as including all feature data corresponding to a feature block in a feature map, the index includes the image identifier (optional) to which the feature map belongs, the identifier of the feature map to which the feature block belongs (e.g., which feature map in the image), and the identifier of the feature block (e.g., which feature block in the feature map). Taking the second feature data as including weighted feature data corresponding to a feature map, the index includes the image identifier (optional) to which the feature map belongs and the index of the weighted feature data corresponding to the feature map.

[0264] In one implementation, the multiplexing instruction information also includes a transformation method, which indicates how the second feature data should be transformed. That is, it instructs the decoding end to transform the second feature data according to this transformation method before multiplexing. It should be understood that the transformation method is optional.

[0265] In one implementation, the multiplexing indication information includes a multiplexing switch identifier, which indicates whether multiplexing is enabled to reconstruct the first feature data. When the multiplexing switch identifier is enabled, the multiplexing indication information may further include a multiplexing index. When the multiplexing switch identifier is disabled, the multiplexing indication information does not include a multiplexing index. In another implementation, the multiplexing indication information includes a confirmed multiplexing identifier, which indicates that it has been confirmed that the first feature data needs to be reconstructed through multiplexing.

[0266] In one implementation, the reuse indication information includes a reuse mode, which indicates whether to reuse a portion of the data of the reuse object or to reuse all the data of the reuse object. The reuse index includes the index of the reuse object. Here, the reuse object can be a feature map, a feature block, or a nonlinear transformation parameter. Taking a feature map as an example, if the reuse mode indicates that all feature data corresponding to the feature map should be reused, the reuse index includes the index of the feature map, and all feature data corresponding to the feature map constitutes the second feature data; the index of the feature map is also the index of the second feature data. If the reuse mode indicates that only a portion of the feature data corresponding to the feature map should be reused, the reuse index includes the index of that portion of the feature data.

[0267] In one possible implementation, the above transformation method is part of the reuse pattern. In another possible implementation, the above transformation method is not part of the reuse pattern.

[0268] The multiplexing switch identifier, confirmed multiplexing identifier, multiplexing mode, multiplexing index, and transformation method mentioned above can be in the packet header of the bitstream or in other locations, and this application embodiment does not limit this. For example, the multiplexing switch identifier is located in the packet header, and the multiplexing mode and multiplexing index are in the corresponding fields of the corresponding feature map (or feature block).

[0269] As an example, taking the case where the transformation method is not part of the multiplexing mode, the multiplexing indication information includes a multiplexing switch identifier. When the multiplexing switch identifier is on, it also includes the index (or identifier) of the processing object (i.e., the object that needs to perform data multiplexing), the index (or identifier) of the multiplexing object, the multiplexing mode, and the transformation method, where the multiplexing mode and the transformation method are both optional. If all data of the multiplexing object is reused directly by default, then the multiplexing mode and the transformation method are not included; otherwise, the multiplexing mode and/or the transformation method can be included. If the multiplexing mode indicates that part of the data of the multiplexing object is reused, then the index of that part of the data is also included. If the reused data needs to be transformed before reuse, then the transformation method is also included.

[0270] Based on the above introduction, examples of multiplexing indication information in two bitstream structures are given here.

[0271] Example 1:

[0272] Multiplexing processing object identifier: that is, the identifier of the current decoding object; for example, 1 represents a feature map, 2 represents a feature block, 3 represents a nonlinear transformation parameter, etc.

[0273] Multiplexing switch identifier: Each multiplexing processing object (such as a feature map) has a bit identifier. The identifier is 1 to indicate multiplexing, and the identifier is 0 to indicate no multiplexing.

[0274] If (reuse):

[0275] Reuse object index: Get the index of the reuse object, such as using 003 to indicate the reuse of the 3rd feature map;

[0276] Reuse modes: full reuse, partial reuse, reuse + transformation;

[0277] If (partial reuse)

[0278] Partial index;

[0279] If (transformation)

[0280] Transformation method.
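The field order of Example 1 can be illustrated with a small writer and parser. This is a sketch under loudly stated assumptions: every field occupies one byte, the mode codes `FULL` and `PARTIAL` are hypothetical, and the "reuse + transformation" mode is modelled as a separate one-byte flag so that partial reuse and transformation can be signalled independently. None of these widths or codes are specified in this application.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

FULL, PARTIAL = 0, 1  # hypothetical reuse-mode codes

@dataclass
class ReuseInfo:
    object_type: int               # 1 map, 2 block, 3 nonlinear params
    reuse_on: bool                 # multiplexing switch identifier
    reuse_index: Optional[int] = None
    mode: Optional[int] = None
    partial_index: Optional[int] = None
    transform: Optional[int] = None  # None means no transformation

def write_reuse_info(info: ReuseInfo) -> bytes:
    """Serialize the fields in the order listed in Example 1."""
    out = bytearray([info.object_type, 1 if info.reuse_on else 0])
    if info.reuse_on:
        has_transform = info.transform is not None
        out += bytes([info.reuse_index, info.mode,
                      1 if has_transform else 0])
        if info.mode == PARTIAL:
            out.append(info.partial_index)
        if has_transform:
            out.append(info.transform)
    return bytes(out)

def read_reuse_info(data: bytes, pos: int = 0) -> Tuple[ReuseInfo, int]:
    """Parse one record; return it with the position of the next byte."""
    info = ReuseInfo(object_type=data[pos], reuse_on=bool(data[pos + 1]))
    pos += 2
    if info.reuse_on:
        info.reuse_index = data[pos]
        info.mode = data[pos + 1]
        has_transform = data[pos + 2]
        pos += 3
        if info.mode == PARTIAL:
            info.partial_index = data[pos]
            pos += 1
        if has_transform:
            info.transform = data[pos]
            pos += 1
    return info, pos

record = ReuseInfo(object_type=1, reuse_on=True, reuse_index=3,
                   mode=PARTIAL, partial_index=5, transform=2)
encoded = write_reuse_info(record)
decoded, next_pos = read_reuse_info(encoded)
```

The conditional fields are written only when the switch, the mode, or the transformation flag calls for them, which is the reason a disabled switch costs only two bytes in this layout.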

[0281] Example 2:

[0282] Reuse processing object identifiers: for example, use 1 to represent feature map, 2 to represent feature block, 3 to represent nonlinear transformation parameters, etc.

[0283] Multiplexing switch identifier: There is a multiplexing table. The values in the table indicate the indexes of the multiplexing processing objects. For example, (001,007,017) means that the 1st, 7th, and 17th images need to be multiplexed.

[0284] If (reuse):

[0285] Reuse object index: Get the index of the reuse object, such as using 003 to indicate the reuse of the 3rd feature map;

[0286] Reuse modes: full reuse, partial reuse, reuse + transformation;

[0287] If (partial reuse)

[0288] Partial index;

[0289] If (transformation)

[0290] Transformation method.

[0291] The aforementioned multiplexing switch identifier can be set for one or more textures, one for each texture, one for each feature map, or one for each feature block. The specific implementation method can be flexibly determined according to the situation. Furthermore, the aforementioned multiplexing switch identifier is also optional; it is not necessary to encode the multiplexing switch identifier in the bitstream. Information such as the multiplexing mode can also indicate whether data multiplexing is required. The above is only an example of multiplexing indication information; other implementation methods are also possible. For example, the bitstream can include an enable switch for the multiplexing mode in the aforementioned multiplexing mode set, and the decoding end determines whether the corresponding multiplexing mode is available based on the enable switch.

[0292] Figure 13 is a flowchart of another texture encoding method provided in an embodiment of this application. Referring to Figure 13, during the encoding of the data related to the i-th feature map, the encoding end encodes the weight feature data corresponding to the i-th feature map to obtain the weight encoded data (also known as weight compressed data) corresponding to the feature map. When encoding the data related to the next feature map, the encoding end confirms the multiplexing switch flag (referred to as multiplexing switch) for the feature map. If the multiplexing switch flag is yes (i.e., on), it means that the weight feature data corresponding to the feature map does not need to be included in the bitstream. Then, the encoding end includes multiplexing indication information about the feature map (such as the index of the weight feature data corresponding to the i-th feature map) in the bitstream. If the multiplexing switch flag is no (i.e., off), the encoding end independently encodes the weight feature data corresponding to the feature map to obtain the weight encoded data corresponding to the feature map.
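The per-feature-map decision described for Figure 13 can be sketched as a loop that either writes an index or an independently encoded payload. In this illustration `zlib` merely stands in for the actual weight encoder, and the entry layout, the `reuse_source` argument, and the function names are assumptions rather than structures defined in this application.

```python
import zlib

def encode_weight_data(weights):
    """Stand-in for the real weight encoder: zlib over the raw bytes.
    Values are assumed to be integers in the range 0..255."""
    return zlib.compress(bytes(weights))

def encode_feature_maps(weight_data_per_map, reuse_source):
    """Build one bitstream entry per feature map.

    reuse_source[i] is None when map i is encoded independently, or the
    index of an earlier map whose weight feature data map i reuses.
    """
    entries = []
    for i, weights in enumerate(weight_data_per_map):
        source = reuse_source[i]
        if source is not None:
            if source >= i:
                raise ValueError("a map may only reuse an earlier map")
            # Switch on: only the indication information is written.
            entries.append({"reuse_switch": 1, "reuse_index": source})
        else:
            # Switch off: the weight feature data is encoded on its own.
            entries.append({"reuse_switch": 0,
                            "payload": encode_weight_data(weights)})
    return entries

maps = [[1, 2, 3], [1, 2, 3], [7, 8, 9]]
entries = encode_feature_maps(maps, reuse_source=[None, 0, None])
```

The second entry carries no payload at all, which is where the reduction in bitstream size comes from in this sketch.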

[0293] After determining the third feature data according to the data multiplexing rules in the second implementation method described above, the encoding end does not need to encode multiplexing indication information into the bitstream, or it can encode multiplexing indication information to ensure that the decoding end can decode accurately.

[0294] In summary, in this embodiment of the application, the encoding end encodes some feature data of the first texture into the bitstream, which reduces the bitstream size, improves the texture compression rate, and reduces the bandwidth occupation and storage space occupation of the bitstream during network transmission.

[0295] Next, the texture decoding method provided in the embodiments of this application will be introduced.

[0296] Figure 14 is a flowchart of a texture decoding method provided in an embodiment of this application. This method is applied to the decoding end, and the texture decoding method matches the texture encoding method shown in Figure 10. Referring to Figure 14, the method includes the following steps.

[0297] Step 1401: Parse the bitstream to obtain feature data of one or more textures, the one or more textures including a first texture, and the feature data of the one or more textures including the first feature data of the first texture.

[0298] It should be understood that, corresponding to the encoding method shown in Figure 10, the bitstream in step 1401 can also be a bitstream of one or more textures. The decoding end parses the bitstream to obtain the feature data of the one or more textures. The obtained feature data includes the first feature data of the first texture, that is, part of the feature data of the first texture. The decoding end can reconstruct the first texture through steps 1402 and 1403.

[0299] Step 1402: Obtain the second feature data.

[0300] That is, the decoding end obtains the second feature data used to complete the feature data of the first texture.

[0301] The second feature data is data obtained from the parsed feature data; or, the second feature data is data obtained after performing any one of multiple decoding operations on the parsed feature data; or, the second feature data is feature data in the feature dataset (such as feature data generated in the cloud and stored locally). The parsed feature data here can be the parsed feature data of the bitstream in step 1401, or it can be the parsed feature data of other bitstreams; this embodiment of the application does not limit this. The various implementation methods for obtaining the second feature data will be described below.

[0302] In the first implementation, the decoding end obtains the second feature data from the parsed feature data.

[0303] As an example, Figure 15 is a flowchart of another texture decoding method provided in an embodiment of this application. The decoding method shown in Figure 15 matches the encoding method shown in Figure 12. Referring to Figure 15, after the decoding end obtains the bitstream, it parses the texture secondary encoding data from the bitstream, performs texture secondary decoding on the texture secondary encoding data (i.e., decoding corresponding to texture supercompression), obtains texture format data, and performs texture decoding on the texture format data to obtain the texture. The bitstream can be the bitstream in step 1401 or other bitstreams, and the decoding end can obtain the second feature data from the parsed texture secondary encoding data.

[0304] As another example, Figure 16 is a flowchart of another texture decoding method provided in an embodiment of this application. The decoding method shown in Figure 16 can be matched with the encoding method shown in Figure 11. Referring to Figure 16, take as an example the case in which feature map 2 reuses the feature data corresponding to feature map 1. After the decoding end obtains the bitstream, it parses the bitstream to obtain the endpoint feature data to be decoded (i.e., endpoint compressed data) and the weight feature data to be decoded (i.e., weight compressed data) corresponding to feature map 1. The obtained data is then subjected to entropy decoding and inverse vector quantization in sequence to obtain the endpoint and weight decoding results (i.e., endpoint feature data and weight feature data, which are texture format data) corresponding to feature map 1. In the process of reconstructing feature map 2, the endpoint compressed data and weight compressed data corresponding to feature map 1 parsed from the bitstream can be directly reused (this implementation method is not shown in the figure) to obtain the endpoint compressed data and weight compressed data corresponding to feature map 2. In this case, the second feature data includes the endpoint compressed data and weight compressed data corresponding to feature map 1.

[0305] It should be understood that the entropy decoding and inverse vector quantization shown in Figure 16 correspond to the feature data encoding operation in Figure 11, that is, the feature data encoding operation in Figure 11 can include vector quantization and entropy encoding.

[0306] As can be seen from Figures 15 and 16, directly reusing the parsed feature data can improve decoding efficiency. For example, after parsing the endpoint compressed data and weight compressed data corresponding to feature map 1, feature map 2 can be decoded in parallel. Furthermore, since the second feature data can be obtained from the cache, and the amount of parsed feature data (such as endpoint compressed data and weight compressed data) is much smaller than the amount of data after decoding, reusing the parsed feature data can also reduce the memory (including cache) usage during the decoding process.

[0307] In the second implementation, the decoding end obtains the second feature data from the data produced by performing any one of multiple decoding operations on the parsed feature data.

[0308] Referring again to Figure 15, in the second implementation, after parsing the data of the texture secondary encoding, the decoding end performs texture secondary decoding and texture decoding to obtain the texture. Then, the second feature data can be obtained from the data obtained by performing texture secondary decoding. The obtained data is, for example, the texture format data corresponding to a certain feature map, or the texture format data corresponding to a certain feature block. Alternatively, it can be obtained from the data obtained after performing texture decoding, for example, the image data of a certain texture block.

[0309] Referring again to Figure 16, in the second implementation, during the reconstruction of feature map 2, the decoding end can directly reuse the entropy decoding result corresponding to feature map 1 (this method is shown in the figure), or it can reuse the texture format data corresponding to feature map 1 obtained by inverse vector quantization to reconstruct the texture format data of feature map 3.

[0310] It should be understood that if parsed feature data is reused, or if data obtained from one of the multiple decoding operations other than the last one is reused, then at least one decoding operation still needs to be performed on the reused data. For example, after reusing the entropy decoding result corresponding to feature map 1 as the entropy decoding result corresponding to feature map 2, inverse vector quantization needs to be performed on the entropy decoding result corresponding to feature map 2 to obtain the texture format data corresponding to feature map 2. However, if decoded feature data is reused (i.e., the data obtained from the last decoding operation), then no decoding operation needs to be performed on the current decoding object. In other words, reusing decoded feature data reduces the number of decoding operations and helps to reduce decoding time.
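As a non-limiting illustration of the preceding paragraph, the following Python sketch counts how many decoding operations remain depending on the stage at which data is reused. The codebook values, the toy entropy decoder, and the names `PIPELINE`, `STAGES` and `finish_decoding` are assumptions made for this sketch and are not taken from the embodiments.

```python
# Hypothetical codebook for inverse vector quantization (illustrative values only).
CODEBOOK = {0: (0.0, 0.0), 1: (0.5, 1.0), 2: (1.0, 0.25)}

def entropy_decode(payload):
    """Toy stand-in for entropy decoding: treats each byte as one codebook index."""
    return list(payload)

def inverse_vector_quantize(indices):
    """Map each codebook index back to its vector."""
    return [CODEBOOK[i] for i in indices]

# Decoding operations in the order of Figure 16, keyed by the stage they produce.
PIPELINE = [
    ("entropy_decoded", entropy_decode),
    ("dequantized", inverse_vector_quantize),
]
STAGES = ["parsed"] + [name for name, _ in PIPELINE]

def finish_decoding(reused_data, stage):
    """Run only the operations that follow `stage`; return (result, operations run)."""
    remaining = PIPELINE[STAGES.index(stage):]
    data = reused_data
    for _, operation in remaining:
        data = operation(data)
    return data, len(remaining)

parsed_map_1 = bytes([2, 0, 1])
full, ops_from_parsed = finish_decoding(parsed_map_1, "parsed")
_, ops_from_entropy = finish_decoding([2, 0, 1], "entropy_decoded")
_, ops_from_decoded = finish_decoding(full, "dequantized")
print(ops_from_parsed, ops_from_entropy, ops_from_decoded)  # prints: 2 1 0
```

Reusing data from an earlier stage keeps the reused data small but leaves more operations to run, whereas reusing the final decoding result leaves none.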

[0311] The third implementation involves the decoder obtaining the second feature data from the feature dataset.

[0312] The feature dataset can be stored locally at the decoding end or elsewhere, such as in the cloud. During the encoding process, if the encoding end determines that the feature dataset contains data similar to feature data that is not included in the bitstream, the decoding end can obtain the second feature data from the feature dataset and use it to reconstruct the feature data that is not included in the bitstream.
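As a non-limiting illustration of the third implementation, the following Python sketch shows an encoding end searching a shared feature dataset for a sufficiently similar entry and a decoding end fetching that entry by index. The embodiments do not define the similarity measure; the maximum absolute difference, the `tolerance` parameter, and the function names are assumptions made for this sketch.

```python
def find_reusable_entry(dataset, candidate, tolerance):
    """Encoding end: index of the closest entry within `tolerance`, or None."""
    best_index, best_distance = None, tolerance
    for index, entry in enumerate(dataset):
        # Assumed similarity measure: largest element-wise absolute difference.
        distance = max(abs(a - b) for a, b in zip(entry, candidate))
        if distance <= best_distance:
            best_index, best_distance = index, distance
    return best_index

def fetch_second_feature_data(dataset, index):
    """Decoding end: look up the second feature data by the signalled index."""
    return dataset[index]

feature_dataset = [[0.0, 0.0], [1.0, 2.0], [5.0, 5.0]]
index = find_reusable_entry(feature_dataset, [1.05, 1.95], tolerance=0.1)
if index is not None:
    # Only the index needs to be signalled; the feature data itself is omitted.
    second_feature_data = fetch_second_feature_data(feature_dataset, index)
```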

[0313] The above three implementations describe where the second feature data is obtained from. The following describes the basis on which the second feature data is obtained.

[0314] In one implementation, the decoding end obtains multiplexing indication information before obtaining the second feature data, and then obtains the second feature data based on the multiplexing indication information.

[0315] The multiplexing indication information indicates the second feature data.

[0316] For example, the multiplexing indication information includes a multiplexing index, which in turn includes an index of the second feature data. The decoding end can retrieve the second feature data based on the multiplexing index.

[0317] In one possible implementation, the multiplexing indication information also includes a multiplexing switch identifier, and the second feature data is obtained when the multiplexing switch is indicated to be on. For example, the decoder parses the multiplexing switch identifier from the packet header, and if the multiplexing switch identifier is on, the decoder then obtains the multiplexing index to obtain the second feature data.

[0318] In one possible implementation, the multiplexing indication information also includes a multiplexing mode, which indicates whether partial or complete data of the multiplexing object is reused. The multiplexing index includes the index of the multiplexing object. The decoding end can obtain the second feature data from the relevant data of the multiplexing object based on the multiplexing mode. For example, the decoding end obtains the multiplexing mode to determine whether to obtain partial or complete data of the multiplexing object, thereby obtaining the second feature data. If partial data is reused, the multiplexing index also includes the index of this partial data, and the decoding end then accurately obtains the second feature data based on the index of this partial data.

[0319] For various cases where feature data is not encoded into the bitstream at the encoding end, the second feature data may include one or more of the following: feature map, feature block, weight feature data, endpoint feature data, and nonlinear transformation parameters.

[0320] The second feature data can be either the feature data of the second texture or the feature data of the first texture. The second texture can be one of the textures in step 1401, or it can be another texture, such as a texture in another bitstream. This application embodiment does not limit this.

[0321] In another implementation, the decoding end obtains the second feature data according to the data multiplexing rules. Details regarding the data multiplexing rules can be found in the relevant description of the texture encoding method shown in Figure 10, and will not be repeated here.

[0322] Step 1403: Decode the first feature data based on the second feature data to obtain the first texture.

[0323] After obtaining the second feature data, the decoding end can reconstruct the first texture based on the second feature data and the first feature data.

[0324] The decoding end decodes the first feature data based on the second feature data to obtain the first texture, including: reconstructing the first feature data based on the second feature data to obtain reconstructed feature data of the first texture; and decoding the reconstructed feature data to obtain the first texture.

[0325] It should be understood that, for the first texture, independently decoding the first feature data cannot reconstruct all the feature data of the first texture. Therefore, the decoding end combines the second feature data with the first feature data to decode the first feature data, thereby obtaining all the feature data of the first texture (referred to as reconstructed feature data).

[0326] There are many specific ways to reconstruct the first feature data based on the second feature data, which are related to what the second feature data specifically includes. The embodiments of this application will not list them one by one.

[0327] As an example, taking a first texture with three feature maps and serially decoding these three feature maps, if the first feature data includes the weight feature data corresponding to feature map 1 and the endpoint feature data corresponding to these three feature maps, then the decoder first decodes the weight feature data and endpoint feature data corresponding to feature map 1 from the bitstream. During the reconstruction of feature map 2, the decoder decodes the endpoint feature data corresponding to feature map 2 from the bitstream and obtains the weight feature data corresponding to feature map 1 to obtain the second feature data. The obtained weight feature data is then combined with the endpoint feature data corresponding to feature map 2 to obtain the complete feature data corresponding to feature map 2. During the reconstruction of feature map 3, the decoder decodes the endpoint feature data of feature map 3 from the bitstream and obtains the weight feature data corresponding to feature map 1 to obtain the second feature data. The obtained weight feature data is then combined with the endpoint feature data of feature map 3 to obtain the complete data of feature map 3. After obtaining the complete feature data corresponding to these three feature maps, the decoder obtains the first texture based on this complete feature data.
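As a non-limiting illustration of the example above, the following Python sketch reconstructs three feature maps when the bitstream carries weight feature data only for feature map 1 and endpoint feature data for all three. The dictionary layout of the decoded bitstream and the name `reconstruct_feature_maps` are assumptions made for this sketch.

```python
def reconstruct_feature_maps(decoded_bitstream):
    """Combine per-map endpoint data with weight data reused from feature map 1."""
    weights_map_1 = decoded_bitstream["weights"][1]
    complete = {}
    for map_id in (1, 2, 3):  # serial decoding, feature map 1 first
        endpoints = decoded_bitstream["endpoints"][map_id]
        # Feature maps without their own weight data reuse that of feature map 1.
        weights = decoded_bitstream["weights"].get(map_id, weights_map_1)
        complete[map_id] = {"endpoints": endpoints, "weights": weights}
    return complete

decoded_bitstream = {
    "weights": {1: [0.1, 0.9, 0.5]},  # first feature data: weights of map 1 only
    "endpoints": {1: (0, 255), 2: (10, 200), 3: (20, 100)},
}
feature_maps = reconstruct_feature_maps(decoded_bitstream)
```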

[0328] In one possible implementation, the decoder transforms the second feature data and / or the first feature data to obtain the reconstructed feature data of the first texture based on the transformed data. This transformation includes, but is not limited to, one or more of rotation, translation, and mirroring. For example, the decoder transforms the relevant data based on the transformation method in the multiplexing indication information.
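As a non-limiting illustration, the following Python sketch applies the three named transformations to a two-dimensional grid of feature values. The direction of rotation (90 degrees clockwise), the mirror axis (left to right), and the wrap-around behaviour of translation are assumptions made for this sketch; the embodiments do not fix them.

```python
def rotate90(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def mirror(grid):
    """Mirror the grid left to right."""
    return [row[::-1] for row in grid]

def translate(grid, shift):
    """Shift columns to the right by `shift`, wrapping around (assumed behaviour)."""
    width = len(grid[0])
    k = shift % width
    return [row[width - k:] + row[:width - k] for row in grid]

# Hypothetical lookup keyed by a transformation method signalled to the decoder.
TRANSFORMS = {
    "rotate": rotate90,
    "mirror": mirror,
    "translate": lambda grid: translate(grid, 1),
}

second_feature_data = [[1, 2], [3, 4]]
transformed = TRANSFORMS["rotate"](second_feature_data)
```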

[0329] After obtaining the reconstructed feature data of the first texture, the decoding end decodes the reconstructed feature data to obtain the first texture.

[0330] The decoding end can perform neural texture decoding on the reconstructed feature data to obtain the first texture. That is, based on the reconstructed feature data, the decoding end obtains multiple feature vectors (i.e., feature vector construction), and performs a nonlinear transformation on these feature vectors according to nonlinear transformation parameters to obtain the first texture. Alternatively, the decoding end can perform texture format decoding on the reconstructed feature data, i.e., traditional texture decoding, to obtain the first texture. The specific decoding method used for the reconstructed feature data is related to the specific texture encoding method used by the encoding end; the texture encoding method and decoding method should match.
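As a non-limiting illustration of neural texture decoding, the following Python sketch constructs a feature vector for one texel and applies a nonlinear transformation to it. It assumes that each feature map contributes one value obtained by interpolating between two endpoints with a per-texel weight, and that the nonlinear transformation is a small fully connected network with one ReLU hidden layer; neither assumption is stated in the embodiments, and all parameter values are illustrative.

```python
def lerp(endpoint_0, endpoint_1, weight):
    """Interpolate between two endpoints with a weight in [0, 1]."""
    return endpoint_0 + (endpoint_1 - endpoint_0) * weight

def build_feature_vector(feature_maps, texel):
    """Feature vector construction: one interpolated value per feature map."""
    return [
        lerp(fm["endpoints"][0], fm["endpoints"][1], fm["weights"][texel])
        for fm in feature_maps
    ]

def nonlinear_transform(vector, params):
    """One hidden layer with ReLU followed by a linear output layer."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, vector)) + bias)
        for row, bias in zip(params["w1"], params["b1"])
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + bias
        for row, bias in zip(params["w2"], params["b2"])
    ]

feature_maps = [
    {"endpoints": (0.0, 1.0), "weights": [0.0, 0.5]},
    {"endpoints": (2.0, 4.0), "weights": [1.0, 0.5]},
]
params = {
    "w1": [[1.0, 0.0], [0.0, -1.0]], "b1": [0.0, 0.0],
    "w2": [[2.0, 1.0]], "b2": [0.25],
}
vector = build_feature_vector(feature_maps, texel=1)
texel_value = nonlinear_transform(vector, params)
```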

[0331] The texture decoding method provided in the embodiments of this application will now be explained by way of example with reference to Figures 17 to 24.

[0332] Figure 17 is a flowchart of another texture decoding method provided in an embodiment of this application. Referring to Figure 17, data reuse is inserted during the decoding process. The data reuse processing objects can be different feature maps (or feature blocks) of the same texture, different feature maps (or feature blocks) of different textures, weight feature data and / or endpoint feature data corresponding to feature maps, weight feature data and / or endpoint feature data corresponding to feature blocks, or nonlinear transformation parameters. These processing objects can exist independently, and some of them can coexist. During the decoding process, the decoded results can be reused based on the reuse mode to obtain the decoding result of the current data to be decoded, or the current data to be decoded can be decoded independently to obtain the decoding result.

[0333] Figure 18 is a flowchart of another texture decoding method provided in an embodiment of this application. Referring to Figure 18, during the decoding process, the decoding end obtains a multiplexing switch identifier. When the multiplexing switch identifier is off (i.e., no), the data to be decoded is decoded independently. When the multiplexing switch identifier is on (i.e., yes), a first multiplexing index is obtained, which is the index of the multiplexing object corresponding to the data to be decoded. A multiplexing mode is obtained and the multiplexing mode is confirmed. If the multiplexing mode indicates that all data of the multiplexing object is reused, then all data of the multiplexing object is obtained based on the first multiplexing index and reused directly, or, a transformation method is further obtained, and all data of the multiplexing object is transformed according to the transformation method before reuse. If the multiplexing mode indicates that part of the data of the multiplexing object is reused, then a second multiplexing index is obtained, which is the index of this part of the data. This part of the data is obtained based on the second multiplexing index and reused directly, or, a transformation method is further obtained, and this part of the data is transformed according to the transformation method before reuse.
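As a non-limiting illustration of the flow described for Figure 18, the following Python sketch resolves the decoding result of the current data from a multiplexing switch identifier, a first multiplexing index, a multiplexing mode, an optional second multiplexing index, and an optional transformation method. The `ReuseInfo` structure, its field names, and the representation of a multiplexing object as a list of parts are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReuseInfo:
    switch_on: bool                     # multiplexing switch identifier
    object_index: Optional[int] = None  # first multiplexing index
    reuse_all: bool = True              # multiplexing mode: all data or part
    part_index: Optional[int] = None    # second multiplexing index
    transform: Optional[str] = None     # transformation method, if any

# Hypothetical transformation table; "reverse" stands in for a real transformation.
TRANSFORMS = {"reverse": lambda data: data[::-1]}

def decode_with_reuse(info, decoded_objects, decode_independently):
    """Return the decoding result for the current data to be decoded."""
    if not info.switch_on:
        return decode_independently()
    source = decoded_objects[info.object_index]
    data = source if info.reuse_all else source[info.part_index]
    if info.transform is not None:
        data = TRANSFORMS[info.transform](data)
    return data

decoded_objects = [[[1, 2], [3, 4]]]  # one multiplexing object with two parts
part = decode_with_reuse(
    ReuseInfo(switch_on=True, object_index=0, reuse_all=False,
              part_index=1, transform="reverse"),
    decoded_objects,
    decode_independently=lambda: [[9]],
)
```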

[0334] Figure 19 is a flowchart of another texture decoding method provided in an embodiment of this application. The decoding method shown in Figure 19 matches the encoding method shown in Figure 13. Referring to Figure 19, during the decoding of the relevant data of the i-th feature map, the decoding end decodes the weight feature data corresponding to the i-th feature map to obtain the weight feature data (also called the weight decoding result) corresponding to the feature map. When decoding the relevant data of the next feature map, the decoding end checks the multiplexing switch for the feature map. If the multiplexing switch is marked as yes, it means that the weight feature data corresponding to the feature map needs to be reconstructed through data multiplexing. Then, the decoding end reuses the weight decoding result corresponding to the i-th feature map as the weight decoding result corresponding to the current feature map based on the multiplexing index in the bitstream (such as the index of the weight feature data corresponding to the i-th feature map). If the multiplexing switch is marked as no, the decoding end independently decodes the weight feature data corresponding to the feature map to obtain the weight decoding result corresponding to the feature map.

[0335] Figure 20 is a flowchart of another texture decoding method provided in this application embodiment, and Figure 21 is an example of the decoding method shown in Figure 20. Referring to Figures 20 and 21, the decoding end independently decodes the weight feature data corresponding to feature map 1 based on the multiplexing switch identifier to obtain the weight decoding result corresponding to feature map 1 (e.g., including the weight feature data corresponding to the five rows and four columns of feature blocks in feature map 1 shown in Figure 21). In the process of decoding the weight feature data corresponding to feature map 2, the decoding end obtains part of the weight encoded data corresponding to feature map 2 from the bitstream (e.g., the weight feature data corresponding to the first column of feature blocks in feature map 2 shown in Figure 21), confirms that data multiplexing needs to be performed on feature map 2 based on the multiplexing switch identifier, and then reuses part of the weight decoding result corresponding to feature map 1 based on the multiplexing mode (e.g., the weight feature data corresponding to the last three columns of feature blocks in feature map 1 shown in Figure 21), and combines the part of the weight encoded data corresponding to feature map 2 with the part of the weight decoding result corresponding to feature map 1 to obtain the weight decoding result corresponding to feature map 2 (e.g., the weight feature data corresponding to the five rows and four columns of feature blocks in feature map 2 shown in Figure 21).
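As a non-limiting illustration of the partial reuse described for Figures 20 and 21, the following Python sketch builds the weight decoding result of feature map 2 from one column decoded from the bitstream and three columns reused from feature map 1. The placement of the reused columns after the newly decoded column, the numeric values, and the name `combine_columns` are assumptions made for this sketch.

```python
def combine_columns(decoded_first_column, reference_map, reused_columns):
    """Prepend the newly decoded column to the columns reused from the reference map."""
    return [
        [new_value] + [reference_row[c] for c in reused_columns]
        for new_value, reference_row in zip(decoded_first_column, reference_map)
    ]

# Weight decoding result of feature map 1: five rows and four columns of blocks.
weights_map_1 = [[row * 10 + col for col in range(4)] for row in range(5)]
# First column of feature map 2, decoded from the bitstream.
first_column_map_2 = [100, 101, 102, 103, 104]

weights_map_2 = combine_columns(first_column_map_2, weights_map_1, reused_columns=(1, 2, 3))
```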

[0336] Figure 22 is a schematic diagram of data transformation and reuse in a texture decoding method provided in this application embodiment. Referring to Figure 22, after the decoding end obtains the weight decoding result corresponding to feature map 1, in the process of reconstructing the weight feature data corresponding to feature map 2, the weight decoding result corresponding to feature map 1 is reused. That is, the weight decoding result corresponding to feature map 1 is obtained (for example, the weight feature data corresponding to the four rows and four columns of feature blocks shown in Figure 22) and transformed (for example, rotated as shown in Figure 22) to obtain the weight decoding result corresponding to feature map 2.

[0337] Figure 23 is a flowchart of another texture decoding method provided in an embodiment of this application. Referring to Figure 23, during the decoding process, the weight feature data corresponding to feature map 1 and feature map 2 of texture image 1 are decoded independently, while feature map 1 and feature map 2 of texture image 2 directly reuse the weight decoding result corresponding to feature map 1 of texture image 1. That is, this solution supports the reuse of weight feature data among multiple texture images.

[0338] Figure 24 is a flowchart of another texture decoding method provided in an embodiment of this application. Referring to Figure 24, during the decoding process, the decoding end parses the bitstream of texture image 1 to obtain the weight encoded data and endpoint encoded data corresponding to feature map 1 of texture image 1, the weight encoded data and endpoint encoded data corresponding to feature map 2, and the encoded data of the nonlinear transformation parameters. The decoding end decodes the weight encoded data and endpoint encoded data corresponding to these two feature maps to obtain the texture format data corresponding to feature map 1 and the texture format data corresponding to feature map 2 of texture image 1. The decoding end decodes the encoded data of the nonlinear transformation parameters to obtain the nonlinear transformation parameters corresponding to texture image 1. The decoding end obtains the weight encoded data and endpoint encoded data corresponding to feature map 1 of texture image 2 from the bitstream and decodes them to obtain the texture format data corresponding to feature map 1 of texture image 2. The decoding end reuses the texture format data corresponding to feature map 2 of texture image 1 to obtain the texture format data corresponding to feature map 2 of texture image 2, and reuses the nonlinear transformation parameters of texture image 1 to obtain the nonlinear transformation parameters corresponding to texture image 2.
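As a non-limiting illustration of reuse across texture images as described for Figure 24, the following Python sketch resolves each item of texture image 2 either from its own decoded data or from data already decoded for texture image 1. The reuse table keyed by (texture, item) pairs, the placeholder values, and the name `resolve` are assumptions made for this sketch.

```python
def resolve(decoded, reuse_table, texture_id, item):
    """Return the item's data, following the reuse table when an entry exists."""
    source = reuse_table.get((texture_id, item), (texture_id, item))
    return decoded[source]

# Data actually decoded from the bitstream (placeholder values).
decoded = {
    (1, "feature_map_1"): "tex1-fm1",
    (1, "feature_map_2"): "tex1-fm2",
    (1, "nonlinear_params"): "tex1-params",
    (2, "feature_map_1"): "tex2-fm1",
}
# Items of texture image 2 that are not in the bitstream and their reuse sources.
reuse_table = {
    (2, "feature_map_2"): (1, "feature_map_2"),
    (2, "nonlinear_params"): (1, "nonlinear_params"),
}

texture_2 = {
    item: resolve(decoded, reuse_table, 2, item)
    for item in ("feature_map_1", "feature_map_2", "nonlinear_params")
}
```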

[0339] In summary, in this embodiment, even if the bitstream includes only part of the feature data of the first texture, the decoding end can reconstruct the first texture by obtaining the second feature data. Thus, while ensuring reconstruction reliability, the size of the bitstream is reduced, which reduces the network transmission bandwidth and storage space occupied by the bitstream.

[0340] Figure 25 is a schematic diagram of a texture decoding device provided in an embodiment of this application. This decoding device can be implemented as part or all of a decoding apparatus (such as the decoder described above) by software, hardware, or a combination of both. Referring to Figure 25, the decoding device includes: a parsing module 2501 and a decoding module 2502.

[0341] The parsing module 2501 is used to parse the bitstream to obtain feature data of one or more textures, the one or more textures including a first texture, and the feature data of the one or more textures including first feature data of the first texture;

[0342] The decoding module 2502 is used to acquire the second feature data and decode the first feature data based on the second feature data to obtain the first texture.

[0343] In one possible implementation, the decoding module 2502 includes:

[0344] The feature reconstruction unit is used to reconstruct the first feature data based on the second feature data to obtain the reconstructed feature data of the first texture;

[0345] The decoding unit is used to decode the reconstructed feature data to obtain the first texture.

[0346] In one possible implementation, the second feature data is data obtained from the parsed feature data; or,

[0347] The second feature data is the data obtained after performing any one of multiple decoding operations on the parsed feature data; or,

[0348] The second feature data is the feature data in the feature dataset.

[0349] In one possible implementation, the decoding device further includes:

[0350] The acquisition module is used to acquire multiplexing indication information;

[0351] The decoding module 2502 includes:

[0352] The first acquisition unit is used to acquire the second feature data based on the multiplexing indication information.

[0353] In one possible implementation, the multiplexing indication information includes a multiplexing index, which includes an index of the second feature data.

[0354] In one possible implementation, the multiplexing indication information includes a multiplexing switch identifier and / or a confirmed multiplexing identifier, and the second feature data is obtained when the multiplexing switch identifier is on or the multiplexing indication information includes a confirmed multiplexing identifier.

[0355] In one possible implementation, the multiplexing indication information includes a multiplexing mode, which indicates the mode for reconstructing the first feature data.

[0356] In one possible implementation, the second feature data includes some or all of the feature data corresponding to the feature maps of one or more of the aforementioned textures.

[0357] In one possible implementation, the second feature data includes some or all of the nonlinear transformation parameters corresponding to one or more of the textures mentioned above.

[0358] In one possible implementation, the second feature data is the feature data of the second texture among the one or more textures mentioned above, or it is the feature data of the first texture.

[0359] In one possible implementation, the feature reconstruction unit is specifically used for:

[0360] The second feature data is transformed to obtain the transformed feature data;

[0361] The first feature data is reconstructed based on the transformed feature data.

[0362] In one possible implementation, the transformation includes one or more of rotation, translation, and mirroring.

[0363] In this embodiment, even if the bitstream includes only part of the feature data of the first texture, the decoding end can reconstruct the first texture by acquiring the second feature data. Thus, while ensuring reconstruction reliability, the size of the bitstream is reduced, which reduces the network transmission bandwidth and storage space occupied by the bitstream.

[0364] It should be noted that the texture decoding device provided in the above embodiments is only illustrated by the division of the above functional modules when decoding the texture bitstream. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the texture decoding device and the texture decoding method embodiments provided in the above embodiments belong to the same concept, and their specific implementation process can be found in the method embodiments, which will not be repeated here.

[0365] Figure 26 is a schematic diagram of a texture encoding device provided in an embodiment of this application. The encoding device can be implemented by software, hardware, or a combination of both as part or all of an encoding device (such as the encoder described above). Referring to Figure 26, the encoding device includes: an acquisition module 2601, a first encoding module 2602, and a second encoding module 2603.

[0366] Acquisition module 2601 is used to acquire one or more textures, including a first texture;

[0367] The first encoding module 2602 is used to encode the one or more textures to obtain feature data of the one or more textures, the feature data of the one or more textures including first feature data of the first texture, the first feature data being partial feature data of the first texture;

[0368] The second encoding module 2603 is used to encode the first feature data into the bitstream.

[0369] In one possible implementation, the encoding device further includes:

[0370] The third encoding module is used to encode the multiplexing indication information into the bitstream. The multiplexing indication information is used to obtain the second feature data, and the second feature data is used to reconstruct the first feature data.

[0371] In one possible implementation, the multiplexing indication information includes a multiplexing index, which includes an index of the second feature data.

[0372] In one possible implementation, the second feature data is data from the feature data of one or more textures; or,

[0373] The second feature data is the feature data in the feature dataset.

[0374] In one possible implementation, the multiplexing indication information includes a multiplexing switch identifier and / or a confirmed multiplexing identifier. The multiplexing switch indicates whether multiplexing is enabled to reconstruct the first feature data, and the confirmed multiplexing identifier indicates that the reconstruction of the first feature data has been confirmed through multiplexing.

[0375] In one possible implementation, the feature data of the one or more textures mentioned above includes one or more of the following: weight feature data, endpoint feature data, and nonlinear transformation parameters.

[0376] In this embodiment, the encoding end encodes only part of the feature data of the first texture into the bitstream, which reduces the bitstream size, improves the texture compression rate, and reduces the network transmission bandwidth and storage space occupied by the bitstream.

[0377] It should be noted that the texture encoding device provided in the above embodiments is only illustrated by the division of the above functional modules. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the texture encoding device and texture encoding method embodiments provided in the above embodiments belong to the same concept, and their specific implementation process can be found in the method embodiments, which will not be repeated here.

[0378] This application also provides a computer-readable storage medium storing a computer program that, when run on a computer or processor, causes the computer or processor to perform the steps of the texture encoding method or the steps of the texture decoding method shown in the above embodiments.

[0379] This application also provides a computer program product comprising computer instructions that, when executed by a computer or processor, cause the computer or processor to perform the steps of the texture encoding method shown in the above embodiments, or to perform the steps of the texture decoding method shown in the above embodiments.

[0380] This application also provides a computer program that, when run on a computer or processor, causes the computer or processor to perform the steps of the texture encoding method shown in the above embodiments, or to perform the steps of the texture decoding method shown in the above embodiments.

[0381] This application also provides an encoding / decoding system, which includes an encoding device and a decoding device. The encoding device is used to implement the steps of the texture encoding method shown in the above embodiments, and the decoding device is used to implement the steps of the texture decoding method shown in the above embodiments.

[0382] This application also provides an encoded bitstream, which is generated according to the texture encoding method shown in the above embodiments.

[0383] This application also provides a computer-readable storage medium storing a bitstream generated according to the texture encoding method shown in the above embodiments.

[0384] This application also provides an apparatus for storing a bitstream, the apparatus including a receiver and at least one storage medium, the receiver being used to receive a bitstream generated according to the texture encoding method shown in the above embodiments, and the at least one storage medium being used to store the bitstream.

[0385] This application also provides an apparatus for transmitting a bitstream, which includes a transmitter and a receiver. The receiver is used to receive a bitstream generated according to the texture encoding method shown in the above embodiments, and the transmitter is used to transmit the bitstream to an end-side device through a transmission medium.

[0386] This application also provides an apparatus for transmitting a bitstream, the apparatus including a transmitter and at least one storage medium, the at least one storage medium being used to store a bitstream generated according to the texture encoding method shown in the above embodiments, the transmitter being used to obtain the bitstream from the storage medium and transmit the bitstream to an end-side device through the transmission medium.

[0387] This application also provides a system for distributing bitstreams. The system includes at least one storage medium and a streaming media device. The at least one storage medium is used to store bitstreams generated according to the texture encoding method shown in the above embodiments. The streaming media device is used to obtain a target bitstream from the at least one storage medium and send the target bitstream to an end-side device. The streaming media device includes a content server or a content distribution server.

[0388] The above embodiments can be implemented, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, they can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., digital versatile disc (DVD)), or a semiconductor medium (e.g., solid state disk (SSD)). It is worth noting that the computer-readable storage medium mentioned in the embodiments of this application can be a non-volatile storage medium; in other words, it can be a non-transitory storage medium.

[0389] It should be understood that "at least one" as mentioned herein refers to one or more, and "multiple" refers to two or more. In the description of the embodiments of this application, unless otherwise stated, " / " means "or"; for example, A / B can mean A or B. "And / or" in this document merely describes an association between related objects and indicates that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. In addition, in order to clearly describe the technical solutions of the embodiments of this application, the terms "first," "second," etc., are used in the embodiments of this application to distinguish identical or similar items with substantially the same function and effect. Those skilled in the art will understand that the terms "first," "second," etc., do not limit the quantity or execution order, and that items labeled "first" and "second" are not necessarily different.

[0390] It should be noted that the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in the embodiments of this application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the textures involved in the embodiments of this application were all obtained under full authorization.

[0391] The above descriptions are embodiments provided in this application and are not intended to limit this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the protection scope of this application.

Claims

1. A method for decoding textures, characterized in that, The method includes: The bitstream is parsed to obtain feature data of one or more textures, wherein the one or more textures include a first texture, and the feature data of the one or more textures includes the first feature data of the first texture; The second feature data is obtained, and the first feature data is decoded based on the second feature data to obtain the first texture.

2. The method as described in claim 1, characterized in that, The step of decoding the first feature data based on the second feature data to obtain the first texture includes: Based on the second feature data, the first feature data is reconstructed to obtain the reconstructed feature data of the first texture; The reconstructed feature data is decoded to obtain the first texture.

3. The method as described in claim 1 or 2, characterized in that, The second feature data is data obtained from the parsed feature data; or, The second feature data is the data obtained after performing any one of a plurality of decoding operations on the parsed feature data; or, The second feature data is the feature data in the feature dataset.

4. The method according to any one of claims 1-3, characterized in that, The method further includes: Reuse indication information is obtained; The acquisition of the second feature data includes: The second feature data is obtained based on the reuse indication information.

5. The method as described in claim 4, characterized in that, The reuse indication information includes a reuse index, which includes an index of the second feature data.

6. The method as described in claim 4 or 5, characterized in that, The reuse indication information includes a reuse switch identifier and / or a confirmed reuse identifier, and the second feature data is obtained when the reuse switch identifier is on or the reuse indication information includes the confirmed reuse identifier.

7. The method according to any one of claims 4-6, characterized in that, The reuse indication information includes a reuse mode, the reuse mode being used to indicate a mode for reconstructing the first feature data.

8. The method according to any one of claims 1-7, characterized in that, The second feature data includes some or all of the feature data corresponding to the feature maps of the one or more textures.

9. The method according to any one of claims 1-8, characterized in that, The second feature data includes some or all of the data in the nonlinear transformation parameters corresponding to the one or more textures.

10. The method according to any one of claims 1-9, characterized in that, The second feature data is the feature data of the second texture in the one or more textures, or it is the feature data of the first texture.

11. The method according to any one of claims 1-10, characterized in that, The reconstruction of the first feature data based on the second feature data includes: The second feature data is transformed to obtain the transformed feature data; Based on the transformed feature data, the first feature data is reconstructed.

12. The method as described in claim 11, characterized in that, The transformation includes one or more of rotation, translation, and mirroring.

13. A texture encoding method, characterized in that, The method includes: One or more textures are acquired, the one or more textures including a first texture; The one or more textures are encoded to obtain feature data of the one or more textures, wherein the feature data of the one or more textures includes first feature data of the first texture, and the first feature data is partial feature data of the first texture; The first feature data is encoded into a bitstream.

14. The method as described in claim 13, characterized in that, The method further includes: Reuse indication information is encoded into the bitstream, wherein the reuse indication information is used to obtain second feature data, and the second feature data is used to reconstruct the first feature data.

15. The method as described in claim 14, characterized in that, The reuse indication information includes a reuse index, which includes an index of the second feature data.

16. The method as described in claim 14 or 15, characterized in that, The second feature data is data from the feature data of the one or more textures; or, The second feature data is the feature data in the feature dataset.

17. The method according to any one of claims 14-16, characterized in that, The reuse indication information includes a reuse switch identifier and / or a confirmed reuse identifier, wherein the reuse switch identifier indicates whether reuse is enabled for reconstructing the first feature data, and the confirmed reuse identifier indicates that the first feature data is confirmed to be reconstructed through reuse.

18. The method according to any one of claims 13-17, characterized in that, The feature data of the one or more textures includes one or more of the following: weight feature data, endpoint feature data, and nonlinear transformation parameters.

19. A texture decoding device, characterized in that, The device includes: A parsing module is used to parse the bitstream to obtain feature data of one or more textures, wherein the one or more textures include a first texture, and the feature data of the one or more textures includes the first feature data of the first texture; A decoding module is used to acquire second feature data and decode the first feature data based on the second feature data to obtain the first texture.

20. A texture encoding device, characterized in that, The device includes: An acquisition module is used to acquire one or more textures, the one or more textures including a first texture; A first encoding module is used to encode the one or more textures to obtain feature data of the one or more textures, wherein the feature data of the one or more textures includes first feature data of the first texture, and the first feature data is partial feature data of the first texture; A second encoding module is used to encode the first feature data into a bitstream.

21. A decoding device, characterized in that, The decoding device includes a memory and a processor; The memory is used to store computer programs; The processor is configured to execute the computer program to implement the steps of the method according to any one of claims 1-12.

22. An encoding device, characterized in that, The encoding device includes a memory and a processor; The memory is used to store computer programs; The processor is configured to execute the computer program to implement the steps of the method according to any one of claims 13-18.

23. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program that, when executed on a computer or processor, causes the computer or processor to perform the steps of the method according to any one of claims 1-12, or to perform the steps of the method according to any one of claims 13-18.

24. A computer program product, characterized in that, The computer program product includes computer instructions that, when executed by a computer or processor, cause the steps of the method as described in any one of claims 1-12 to be performed, or the steps of the method as described in any one of claims 13-18 to be performed.

25. An encoded bitstream, characterized in that, The bitstream is generated according to the method described in any one of claims 13-18.

26. An encoded bitstream, characterized in that, The bitstream includes first feature data, which is partial feature data of a first texture. The first feature data is obtained by encoding the first texture. The bitstream is a bitstream of one or more textures, including the first texture.

27. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a bitstream generated according to the method described in any one of claims 13-18.

28. An apparatus for storing a bitstream, characterized in that, The apparatus includes: a receiver and at least one storage medium; The receiver is used to receive a bitstream generated according to the method described in any one of claims 13-18; The at least one storage medium is used to store the bitstream.

29. An apparatus for transmitting a bitstream, characterized in that, The apparatus includes: a transmitter and a receiver; The receiver is used to receive a bitstream generated according to the method described in any one of claims 13-18; The transmitter is used to send the bitstream to an end-side device via a transmission medium.

30. An apparatus for transmitting a bitstream, characterized in that, The apparatus includes: a transmitter and at least one storage medium; The at least one storage medium is used to store a bitstream generated according to the method described in any one of claims 13-18; The transmitter is used to obtain the bitstream from the at least one storage medium and send the bitstream to an end-side device via a transmission medium.

31. A system for distributing bitstreams, characterized in that, The system includes: At least one storage medium is used to store at least one bitstream generated according to the method described in any one of claims 13-18; A streaming media device is used to acquire a target bitstream from the at least one storage medium and send the target bitstream to an end-side device, wherein the streaming media device includes a content server or a content distribution server.
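The following non-limiting sketch illustrates the control flow recited in the decoding and encoding claims: only part of a texture's feature data is written to the bitstream, and the remainder is reconstructed on the decoder side from second feature data selected by a reuse switch identifier, a reuse index, and a reuse mode. It is a toy under stated assumptions and not the claimed implementation: the feature data is supplied directly as small integer grids rather than produced by a neural network, the bitstream is JSON rather than an entropy-coded format, and the names `encode`, `decode`, `MODES`, `reuse_flag`, `reuse_index`, and `reuse_mode` are hypothetical. Only mirroring and rotation are sketched; translation is omitted.

```python
import json


def mirror(grid):
    # Horizontal mirroring of a 2D feature grid.
    return [row[::-1] for row in grid]


def rotate90(grid):
    # Clockwise quarter-turn rotation of a 2D feature grid.
    return [list(row) for row in zip(*grid[::-1])]


# Hypothetical reuse modes: how second feature data is transformed before reuse.
MODES = {
    "identity": lambda g: [list(r) for r in g],
    "mirror": mirror,
    "rotate90": rotate90,
}


def encode(textures):
    """Encode a list of textures, each given as a dict of feature grids.

    Each texture has 'endpoints' and 'weights' grids (assumed precomputed).
    The endpoints are always written; the weights are omitted when an earlier
    texture's weights reproduce them under one of the reuse modes.
    """
    records = []
    for i, tex in enumerate(textures):
        record = {"endpoints": tex["endpoints"], "reuse_flag": False}
        for j in range(i):
            for mode, fn in MODES.items():
                if fn(textures[j]["weights"]) == tex["weights"]:
                    record.update(reuse_flag=True, reuse_index=j, reuse_mode=mode)
                    break
            if record["reuse_flag"]:
                break
        if not record["reuse_flag"]:
            record["weights"] = tex["weights"]
        records.append(record)
    return json.dumps(records).encode("utf-8")


def decode(bitstream):
    """Parse the bitstream and reconstruct every texture's feature data."""
    records = json.loads(bitstream.decode("utf-8"))
    decoded = []
    for record in records:
        if record["reuse_flag"]:
            # Second feature data: weights of an already decoded texture.
            second = decoded[record["reuse_index"]]["weights"]
            weights = MODES[record["reuse_mode"]](second)
        else:
            weights = record["weights"]
        decoded.append({"endpoints": record["endpoints"], "weights": weights})
    return decoded
```

For grids this small, the JSON field names can outweigh the omitted data, so the sketch shows only the signalling and reconstruction order, not any compression gain. In the sketch, the reuse index always points to a texture that appears earlier in the bitstream, so the decoder can resolve it in a single pass.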