Image decoding and encoding method, apparatus, and device based on a neural network
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- HANGZHOU HIKVISION DIGITAL TECHNOLOGY CO LTD
- Filing Date
- 2026-04-28
- Publication Date
- 2026-07-02
Figure 2026110828000001_ABST
Abstract
Claims
1. A neural network-based image decoding method, comprising: decoding control parameters and image information corresponding to a current block from a bitstream; obtaining, from the control parameters, neural network information corresponding to a decoding processing unit, and generating a decoding neural network corresponding to the decoding processing unit based on the neural network information; and determining input features corresponding to the decoding processing unit based on the image information, and processing the input features with the decoding neural network to obtain output features corresponding to the decoding processing unit.
2. The method according to claim 1, wherein, when the neural network information includes base layer information and enhancement layer information, generating the decoding neural network corresponding to the decoding processing unit based on the neural network information comprises: determining a base layer corresponding to the decoding processing unit based on the base layer information; determining an enhancement layer corresponding to the decoding processing unit based on the enhancement layer information; and generating the decoding neural network corresponding to the decoding processing unit based on the base layer and the enhancement layer.
3. The method according to claim 2, wherein determining the base layer corresponding to the decoding processing unit based on the base layer information comprises: when the base layer information includes a base layer default network use flag bit, and the base layer default network use flag bit indicates that the base layer uses a default network, obtaining a base layer having the default network structure.
4. The method according to claim 2, wherein determining the base layer corresponding to the decoding processing unit based on the base layer information comprises: when the base layer information includes a base layer pre-designed network use flag bit and a base layer pre-designed network index number, and the base layer pre-designed network use flag bit indicates that the base layer uses a pre-designed network, selecting, from a pre-designed neural network pool, a base layer having the pre-designed network structure corresponding to the base layer pre-designed network index number, wherein the pre-designed neural network pool includes network layers of at least one pre-designed network structure.
5. The method according to claim 2, wherein determining the enhancement layer corresponding to the decoding processing unit based on the enhancement layer information comprises: when the enhancement layer information includes an enhancement layer default network use flag bit, and the enhancement layer default network use flag bit indicates that the enhancement layer uses a default network, obtaining an enhancement layer having the default network structure.
6. The method according to claim 2, wherein determining the enhancement layer corresponding to the decoding processing unit based on the enhancement layer information comprises: when the enhancement layer information includes an enhancement layer pre-designed network use flag bit and an enhancement layer pre-designed network index number, and the enhancement layer pre-designed network use flag bit indicates that the enhancement layer uses a pre-designed network, selecting, from a pre-designed neural network pool, an enhancement layer having the pre-designed network structure corresponding to the enhancement layer pre-designed network index number, wherein the pre-designed neural network pool includes network layers of at least one pre-designed network structure.
7. The method according to claim 2, wherein determining the enhancement layer corresponding to the decoding processing unit based on the enhancement layer information comprises: when the enhancement layer information includes network parameters for generating the enhancement layer, generating the enhancement layer corresponding to the decoding processing unit based on the network parameters, wherein the network parameters include at least one of the following: the number of neural network layers, a deconvolutional layer flag bit, the number of deconvolutional layers, the quantization step size of each deconvolutional layer, the number of channels of each deconvolutional layer, the size of the convolutional kernel, the number of filters, a filter size index, a zero filter coefficient flag bit, filter coefficients, an activation layer flag bit, and an activation layer type.
8. The method according to any one of claims 1 to 7, wherein the image information includes coefficient hyperparameter feature information and image feature information, and determining the input features corresponding to the decoding processing unit based on the image information and processing the input features with the decoding neural network to obtain the output features corresponding to the decoding processing unit comprises: when performing a decoding process that generates coefficient hyperparameter features, determining a coefficient hyperparameter feature coefficient reconstruction value based on the coefficient hyperparameter feature information, and performing an inverse transformation on the coefficient hyperparameter feature coefficient reconstruction value with the decoding neural network to obtain a coefficient hyperparameter feature value, wherein the coefficient hyperparameter feature value is used to decode the image feature information from the bitstream; and when performing a decoding process for the inverse image feature transformation, determining an image feature reconstruction value based on the image feature information, and performing an inverse transformation on the image feature reconstruction value with the decoding neural network to obtain a low-level image feature value, wherein the low-level image feature value is used to obtain a reconstructed image block corresponding to the current block.
9. The method according to claim 8, wherein determining the coefficient hyperparameter feature coefficient reconstruction value based on the coefficient hyperparameter feature information comprises: when the control parameters include first enable information, and the first enable information indicates that a first inverse quantization process is enabled, performing inverse quantization on the coefficient hyperparameter feature information to obtain the coefficient hyperparameter feature coefficient reconstruction value; and wherein determining the image feature reconstruction value based on the image feature information comprises: when the control parameters include second enable information, and the second enable information indicates that a second inverse quantization process is enabled, performing inverse quantization on the image feature information to obtain the image feature reconstruction value.
10. The method according to claim 8, further comprising: when the control parameters include third enable information, and the third enable information indicates that a quality enhancement process is enabled, then, when performing a quality enhancement decoding process, obtaining the low-level image feature value and performing an enhancement process on the low-level image feature value with the decoding neural network to obtain the reconstructed image block corresponding to the current block.
11. A neural network-based image encoding method, comprising: determining input features corresponding to an encoding processing unit based on a current block, processing the input features with an encoding neural network corresponding to the encoding processing unit to obtain output features corresponding to the encoding processing unit, and determining image information corresponding to the current block based on the output features; obtaining control parameters corresponding to the current block, wherein the control parameters include neural network information corresponding to a decoding processing unit, and the neural network information is used to determine a decoding neural network corresponding to the decoding processing unit; and encoding the image information and the control parameters corresponding to the current block into a bitstream.
12. The method according to claim 11, wherein the neural network information includes base layer information and enhancement layer information, and the decoding neural network includes a base layer determined based on the base layer information and an enhancement layer determined based on the enhancement layer information.
13. The method according to claim 12, wherein, when the base layer information includes a base layer default network use flag bit, and the base layer default network use flag bit indicates that the base layer uses a default network, the decoding neural network uses a base layer having the default network structure.
14. The method according to claim 12, wherein, when the base layer information includes a base layer pre-designed network use flag bit and a base layer pre-designed network index number, and the base layer pre-designed network use flag bit indicates that the base layer uses a pre-designed network, the decoding neural network uses a base layer that is selected from a pre-designed neural network pool and has the pre-designed network structure corresponding to the base layer pre-designed network index number, wherein the pre-designed neural network pool includes network layers of at least one pre-designed network structure.
15. The method according to claim 12, wherein, when the enhancement layer information includes an enhancement layer default network use flag bit, and the enhancement layer default network use flag bit indicates that the enhancement layer uses a default network, the decoding neural network uses an enhancement layer having the default network structure.
16. The method according to claim 12, wherein, when the enhancement layer information includes an enhancement layer pre-designed network use flag bit and an enhancement layer pre-designed network index number, and the enhancement layer pre-designed network use flag bit indicates that the enhancement layer uses a pre-designed network, the decoding neural network uses an enhancement layer that is selected from a pre-designed neural network pool and has the pre-designed network structure corresponding to the enhancement layer pre-designed network index number, wherein the pre-designed neural network pool includes network layers of at least one pre-designed network structure.
17. The method according to claim 12, wherein, when the enhancement layer information includes network parameters for generating the enhancement layer, the decoding neural network uses an enhancement layer generated based on the network parameters, wherein the network parameters include at least one of the following: the number of neural network layers, a deconvolutional layer flag bit, the number of deconvolutional layers, the quantization step size of each deconvolutional layer, the number of channels of each deconvolutional layer, the size of the convolutional kernel, the number of filters, a filter size index, a zero filter coefficient flag bit, filter coefficients, an activation layer flag bit, and an activation layer type.
18. The method according to any one of claims 11 to 17, further comprising, before determining the input features corresponding to the encoding processing unit based on the current block: dividing a current image into N non-overlapping image blocks, where N is a positive integer; performing boundary padding on each image block to obtain a boundary-padded image block, wherein the padding value does not depend on the reconstructed pixel values of adjacent image blocks; and generating N current blocks based on the boundary-padded image blocks.
19. The method according to any one of claims 11 to 17, further comprising, before determining the input features corresponding to the encoding processing unit based on the current block: dividing a current image into multiple basic blocks, each basic block containing at least one image block; performing boundary padding on each image block to obtain a boundary-padded image block, wherein, when boundary padding is performed on an image block, the padding value of that image block is permitted to depend on the reconstructed pixel values of image blocks in different basic blocks, and does not depend on the reconstructed pixel values of other image blocks in the same basic block; and generating multiple current blocks based on the boundary-padded image blocks.
20. A neural network-based image decoding device, comprising: a memory configured to store video data; and a decoder configured to perform: decoding control parameters and image information corresponding to a current block from a bitstream; obtaining, from the control parameters, neural network information corresponding to a decoding processing unit, and generating a decoding neural network corresponding to the decoding processing unit based on the neural network information; and determining input features corresponding to the decoding processing unit based on the image information, and processing the input features with the decoding neural network to obtain output features corresponding to the decoding processing unit.
21. A neural network-based image encoding device, comprising: a memory configured to store video data; and an encoder configured to perform: determining input features corresponding to an encoding processing unit based on a current block, processing the input features with an encoding neural network corresponding to the encoding processing unit to obtain output features corresponding to the encoding processing unit, and determining image information corresponding to the current block based on the output features; obtaining control parameters corresponding to the current block, wherein the control parameters include neural network information corresponding to a decoding processing unit, and the neural network information is used to determine a decoding neural network corresponding to the decoding processing unit; and encoding the image information and the control parameters corresponding to the current block into a bitstream.
22. A decoding device, comprising a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the processor is configured to execute the machine-executable instructions to carry out the method according to any one of claims 1 to 10.
23. An encoding device, comprising a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the processor is configured to execute the machine-executable instructions to carry out the method according to any one of claims 11 to 19.
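
Illustrative sketch for claims 2 to 7 (not part of the claims). The Python below shows one possible way a decoder could resolve a base layer and an enhancement layer from signalled flags and join them into a single decoding network. Every name in it (LayerInfo, PREDESIGNED_POOL, resolve_layer, build_from_params) is hypothetical, and the string layer descriptors with their kernel and channel values are placeholders, because the claims define neither a bitstream syntax nor concrete network structures. Placing the base layer before the enhancement layer is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

# All names and values here are hypothetical; the claims define no API or syntax.


@dataclass
class LayerInfo:
    use_default: bool = False                # "default network use flag bit"
    use_predesigned: bool = False            # "pre-designed network use flag bit"
    predesigned_index: Optional[int] = None  # "pre-designed network index number"
    network_params: Optional[dict] = None    # explicit parameters (claim 7)


# Placeholder structure descriptions standing in for real network layers.
DEFAULT_BASE = ["deconv(k=5, ch=192)", "relu"]
DEFAULT_ENHANCEMENT = ["deconv(k=3, ch=64)"]
PREDESIGNED_POOL = [
    ["deconv(k=3, ch=128)", "relu"],
    ["deconv(k=5, ch=128)", "relu", "deconv(k=5, ch=64)"],
]


def build_from_params(params: dict) -> List[str]:
    """Build a layer stack from a small subset of the claim 7 parameters."""
    layers = []
    for channels in params["channels_per_layer"]:
        layers.append(f"deconv(k={params['kernel_size']}, ch={channels})")
        if params.get("activation_flag"):
            layers.append(params["activation_type"])
    return layers


def resolve_layer(info: LayerInfo, default: List[str],
                  allow_params: bool = False) -> List[str]:
    """Pick the default structure, a pool entry, or a parameter-built stack."""
    if info.use_default:
        return list(default)
    if info.use_predesigned:
        return list(PREDESIGNED_POOL[info.predesigned_index])
    if allow_params and info.network_params is not None:
        return build_from_params(info.network_params)
    raise ValueError("layer information does not select a network")


def generate_decoding_network(base: LayerInfo, enh: LayerInfo) -> List[str]:
    # The claims recite explicit network parameters only for the enhancement
    # layer, so only that call allows them. Base-then-enhancement ordering is
    # an assumption of this sketch.
    return (resolve_layer(base, DEFAULT_BASE)
            + resolve_layer(enh, DEFAULT_ENHANCEMENT, allow_params=True))
```

For example, a base layer taken from pool entry 0 combined with an enhancement layer built from explicit parameters produces a six-element layer list: two entries from the pool followed by two deconvolution and activation pairs.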
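
Illustrative sketch for claims 8 and 9 (not part of the claims). The Python below shows how the first and second enable information might gate inverse quantization ahead of the two inverse transformations. It assumes uniform scalar quantization with a signalled step size, and it assumes that values pass through unchanged when inverse quantization is disabled; the claims specify neither. The function names, the control dictionary keys, and the callables standing in for the decoding neural networks are all hypothetical. In the claims the coefficient hyperparameter feature values guide decoding of the image feature information from the bitstream; here the image feature levels are passed in already parsed, as a simplification.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = List[float]
Network = Callable[[Features], Features]  # stand-in for a decoding neural network


def reconstruct(levels: Sequence[int], enabled: bool, step: float) -> Features:
    """Return reconstruction values for decoded integer levels.

    Assumption: uniform scalar inverse quantization (level * step) when
    enabled, plain pass-through otherwise.
    """
    if enabled:
        return [level * step for level in levels]
    return [float(level) for level in levels]


def decode_block(hyper_levels: Sequence[int],
                 image_levels: Sequence[int],
                 control: Dict[str, object],
                 hyper_net: Network,
                 image_net: Network) -> Tuple[Features, Features]:
    # Coefficient hyperparameter branch: reconstruct, then inverse transform.
    hyper_rec = reconstruct(hyper_levels,
                            bool(control["first_enable"]),
                            float(control["hyper_step"]))
    hyper_values = hyper_net(hyper_rec)

    # Image feature branch: reconstruct, then inverse transform to obtain
    # the low-level image feature values.
    image_rec = reconstruct(image_levels,
                            bool(control["second_enable"]),
                            float(control["image_step"]))
    low_level = image_net(image_rec)
    return hyper_values, low_level
```

Passing identity functions for both networks makes the effect of the enable flags visible directly in the returned values.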
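
Illustrative sketch for claim 18 (not part of the claims). The Python below splits an image into non-overlapping blocks and pads each block using only its own samples, so that the padding values do not depend on neighbouring blocks. Edge replication is used here as one padding rule that satisfies that condition; the claim does not name a padding method, and the function names, block size, and padding width are assumptions of this sketch.

```python
from typing import List

Image = List[List[int]]  # rows of samples


def split_blocks(img: Image, block_h: int, block_w: int) -> List[Image]:
    """Divide the image into non-overlapping blocks in raster order."""
    height, width = len(img), len(img[0])
    blocks = []
    for y in range(0, height, block_h):
        for x in range(0, width, block_w):
            blocks.append([row[x:x + block_w] for row in img[y:y + block_h]])
    return blocks


def pad_replicate(block: Image, pad: int) -> Image:
    """Pad a block by repeating its own border samples.

    Only samples inside the block are read, so the result is independent
    of any adjacent block.
    """
    height, width = len(block), len(block[0])
    padded = []
    for y in range(-pad, height + pad):
        src_row = block[min(max(y, 0), height - 1)]
        padded.append([src_row[min(max(x, 0), width - 1)]
                       for x in range(-pad, width + pad)])
    return padded


def make_current_blocks(img: Image, block_h: int, block_w: int,
                        pad: int) -> List[Image]:
    """Generate one padded current block per image block."""
    return [pad_replicate(b, pad) for b in split_blocks(img, block_h, block_w)]
```

With a 2x4 image split into two 2x2 blocks and a padding width of 1, the right-hand padding column of the first block repeats that block's own samples rather than the first column of the second block.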