Encoding device, decoding device, encoding method, and decoding method
The technique addresses the limitations of VVC's quantization matrix by incorporating prediction and quantization methods to improve image quality in template matching prediction, achieving enhanced subjective image quality.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- CANON KK
- Filing Date
- 2025-12-03
- Publication Date
- 2026-07-02
AI Technical Summary
The quantization matrix in Versatile Video Coding (VVC) is not designed to handle template matching prediction effectively, leading to a lack of quantization control and poor subjective image quality.
A technique that includes prediction means to find similar pixel groups, conversion means for frequency conversion, and quantization means using a quantization matrix to improve image quality in template matching prediction.
Enhances subjective image quality by allowing quantization control for each frequency component, particularly in template matching prediction.
Abstract
Description
Encoding device, decoding device, encoding method, and decoding method
[0001] The present disclosure relates to encoding / decoding technology.
[0002] The Versatile Video Coding (VVC) method is known as an encoding method for the compressed recording of moving images. To improve encoding efficiency, VVC divides a basic block of up to 128x128 pixels into sub-blocks that may be rectangular as well as the conventional square.
[0003] In addition, VVC uses a process called quantization matrix processing, which weights the coefficients obtained by an orthogonal transform (orthogonal transform coefficients) according to their frequency components. By reducing more of the data in the high-frequency components, where degradation is less noticeable to human vision, compression efficiency can be improved while image quality is maintained. Patent Document 1 discloses a technique for encoding such a quantization matrix.
[0004] In recent years, the Joint Video Experts Team (JVET), which standardized VVC, has been conducting technical studies to achieve a compression efficiency higher than that of VVC. To improve encoding efficiency, in addition to the conventional intra prediction and inter prediction, a new prediction method (template matching prediction) is being studied that searches the encoding target picture for a block-shaped group of predicted pixels, using the pixels surrounding the encoding target block as a template.
[0005] Patent Document 1: Japanese Patent Application Laid-Open No. 2013-38758
[0006] The quantization matrix in VVC is premised on prediction methods such as conventional intra prediction and inter prediction, and cannot cope with the template matching prediction, which is a new prediction method. Therefore, there is a problem that quantization control according to the frequency components cannot be performed for the error of the template matching prediction, and the subjective image quality cannot be improved. The present disclosure provides a technique for improving the subjective image quality in template matching prediction.
[0007] One aspect of this disclosure is characterized by comprising: prediction means that searches for a group of pixels in a frame that is determined to be similar to a group of pixels adjacent to a block to be encoded in the frame, from a group of encoded pixels in the frame, generates a predicted image composed of the encoded pixels of the frame based on the searched group of pixels, and calculates a prediction error using the predicted image; conversion means that frequency-converts the prediction error to generate conversion coefficients; quantization means that quantizes the conversion coefficients using a quantization matrix to generate quantization coefficients; and encoding means that encodes the quantization coefficients.
[0008] According to this disclosure, a technique for improving subjective image quality in template matching prediction can be provided.
[0009] Other features and advantages of the technical ideas derived from this disclosure will become apparent from the following description with reference to the attached drawings. In the attached drawings, the same or similar components are given the same reference numeral.
[0010] The attached drawings are included in the specification and constitute part thereof, illustrating embodiments in this disclosure and used to explain the technical ideas derived from this disclosure together with their descriptions.
- Block diagram showing an example of the functional configuration of an image encoding device.
- Block diagram showing an example of the functional configuration of an image decoding device.
- Flowchart of the processing performed by the image encoding device.
- Flowchart of the processing performed by the image decoding device.
- Block diagram showing an example of the hardware configuration of a computer device.
- Diagram showing an example of the configuration of a bitstream.
- Diagram showing an example of the configuration of a bitstream.
- Diagram showing an example of a subblock partitioning pattern.
- Diagram showing an example of a subblock partitioning pattern.
- Diagram showing an example of a subblock partitioning pattern.
- Diagram showing an example of a subblock partitioning pattern.
- Diagram showing an example of a subblock partitioning pattern.
- Diagram showing an example of a subblock partitioning pattern.
- Diagram showing an example of a quantization matrix.
- Diagram showing an example of a quantization matrix.
- Diagram showing an example of a scan order.
- Diagram showing an example of a one-dimensional difference matrix.
- Diagram showing an example of a one-dimensional difference matrix.
- Diagram showing an example of a one-dimensional difference matrix.
- Diagram showing an example of a coding table.
- Diagram showing an example of a coding table.
- Diagram illustrating an example of template matching prediction.
[0011] The embodiments will be described in detail below with reference to the attached drawings. Note that the following embodiments do not limit the scope of the claims. While the embodiments describe multiple features, not all of these features are necessary, and the features may be combined in any way. Furthermore, in the attached drawings, identical or similar configurations are given the same reference numerals, and redundant descriptions are omitted.
[0012] [First Embodiment] First, an example of the functional configuration of an image encoding device, which is an example of an encoding device according to this embodiment, will be explained using the block diagram in Figure 1. The block division unit 102 acquires the input frame (input image). The method of acquiring frames by the block division unit 102 is not limited to a specific acquisition method. For example, the block division unit 102 may acquire each frame in a moving image (for example, 30 frames / second) captured by an imaging device connected to the image encoding device. Alternatively, for example, the block division unit 102 may acquire each frame in a moving image stored in an external device such as a server device or an external memory device connected to the image encoding device. Alternatively, for example, the block division unit 102 may acquire each frame in a moving image stored in the memory of the image encoding device. The block division unit 102 then divides the acquired frame into a plurality of basic blocks.
[0013] The holding unit 103 holds multiple quantization matrices used for quantization processing. The method for obtaining the quantization matrices is not limited to a specific method; for example, the quantization matrices may be input to the image encoding device in response to a user operation, or the image encoding device may calculate the quantization matrices from the characteristics of the frames. Alternatively, the quantization matrices held by the holding unit 103 may be pre-specified initial values.
[0014] In this embodiment, as an example, a two-dimensional quantization matrix corresponding to an 8x8 pixel orthogonal transformation (frequency transformation) is generated as shown in Figures 8A to 8C and held in the holding unit 103.
[0015] The prediction unit 104 divides the basic block into multiple subblocks (subblock division), and performs prediction processing (intra prediction, which is intra-frame prediction; inter prediction, which is inter-frame prediction; template matching prediction; etc.) on a subblock basis to generate a predicted image for the subblock. The prediction unit 104 then calculates the prediction error from the subblock and the predicted image generated for the subblock. The prediction unit 104 also outputs information necessary for prediction (for example, the subblock division, prediction mode, and motion vector) as prediction information.
[0016] The transformation / quantization unit 105 generates transformation coefficients by orthogonal transformation (frequency transformation) of the prediction error generated by the prediction unit 104 on a subblock basis. Then, the transformation / quantization unit 105 quantizes the transformation coefficients for each subblock using a quantization matrix held by the holding unit 103 to generate quantization coefficients (quantized transformation coefficients).
[0017] The inverse quantization / inverse transformation unit 106 performs the reverse operation of the transformation / quantization unit 105 to regenerate the prediction error. In other words, the inverse quantization / inverse transformation unit 106 regenerates the transformation coefficients by inverse quantization of the quantization coefficients generated by the transformation / quantization unit 105 using the quantization matrix used to generate the quantization coefficients, and then regenerates the prediction error by inverse orthogonal transformation of the transformation coefficients.
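The forward and inverse orthogonal transformations described above can be illustrated with a minimal sketch. The text does not name a specific transform, so an orthonormal DCT-II on square blocks is assumed here; the function names (`dct_basis`, `forward_transform`, `inverse_transform`) are hypothetical and chosen only for illustration.

```python
import math


def dct_basis(n):
    """Orthonormal DCT-II basis matrix C (n x n); row k holds the k-th frequency."""
    return [
        [
            math.sqrt((1 if k == 0 else 2) / n)
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_transform(residual):
    """Prediction error -> transformation coefficients (C * X * C^T)."""
    c = dct_basis(len(residual))
    return matmul(matmul(c, residual), transpose(c))


def inverse_transform(coefficients):
    """Transformation coefficients -> prediction error (C^T * Y * C)."""
    c = dct_basis(len(coefficients))
    return matmul(matmul(transpose(c), coefficients), c)


# A flat 4x4 prediction error concentrates all of its energy in the DC coefficient.
flat_residual = [[10] * 4 for _ in range(4)]
coefficients = forward_transform(flat_residual)
restored = inverse_transform(coefficients)
```

Without quantization in between, the inverse transform restores the prediction error up to floating-point rounding; the loss in the actual codec comes from the quantization step.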
[0018] The image playback unit 107 generates a predicted image by appropriately referring to the frame memory 108 based on the prediction information output from the prediction unit 104. Then, the image playback unit 107 generates a reconstructed image from the generated predicted image and the prediction error reconstructed by the inverse quantization / inverse transformation unit 106, and stores the generated reconstructed image in the frame memory 108.
[0019] The in-loop filter unit 109 performs in-loop filtering, such as deblocking filtering and sample adaptive offset, on the reconstructed image stored in the frame memory 108, and then stores the filtered reconstructed image back into the frame memory 108.
[0020] The encoding unit 110 entropy encodes the quantization coefficients generated by the transformation / quantization unit 105 and the prediction information output from the prediction unit 104 to generate coded image data. The encoding unit 113 encodes the quantization matrices held in the holding unit 103 to generate coded data for the quantization matrices.
[0021] The integrated encoding unit 111 generates header code data using the code data of the quantization matrix generated by the encoding unit 113. Furthermore, the integrated encoding unit 111 forms a bitstream by combining the generated header code data with the image code data generated by the encoding unit 110, and outputs the formed bitstream. The output destination of the bitstream by the integrated encoding unit 111 is not limited to a specific output destination. For example, the integrated encoding unit 111 may transmit the bitstream to an external device such as a server device via a network such as a LAN or the Internet, or it may output (store) the bitstream in the memory of the image encoding device.
[0022] The control unit 150 controls the operation of the entire image encoding device. For example, the control unit 150 controls the operation of each of the above-mentioned functional units in the image encoding device. As a result, each of the above-mentioned functional units in the image encoding device operates under the control of the control unit 150.
[0023] Next, the encoding operation of frames by the image encoding device will be explained in more detail. Below, we will describe the case in which the holding unit 103 generates and holds the quantization matrix.
[0024] The quantization matrix is generated according to the size of the subblock to be encoded and the type of prediction method. In this embodiment, the holding unit 103 generates a quantization matrix of 8x8 pixels, corresponding to the 8x8 pixel basic block shown in Figure 7A. However, the size of the generated quantization matrix is not limited to 8x8 pixels; quantization matrices corresponding to the shape of the subblock, such as 4x8 pixels, 8x4 pixels, or 4x4 pixels, may also be generated.
[0025] The method for determining the values of each element in a quantization matrix is not limited to a specific method. For example, a predetermined value may be used for each element in the quantization matrix, a different value may be used for each element, or a value may be used that is appropriate to the characteristics of the frame.
[0026] The holding unit 103 holds the quantization matrix generated in this manner. In this embodiment, we will describe the case in which the holding unit 103 generates and holds the quantization matrices shown in Figures 8A, 8B, and 8C, respectively.
[0027] The quantization matrix shown in Figure 8A is an example of a quantization matrix used for quantizing the prediction error obtained in intra prediction. The quantization matrix shown in Figure 8B is an example of a quantization matrix used for quantizing the prediction error obtained in inter prediction. The quantization matrix shown in Figure 8C is an example of a quantization matrix used for quantizing the prediction error obtained in template matching prediction.
[0028] As shown in Figures 8A, 8B, and 8C, the quantization matrix has 8x8 elements (quantization step values). In this embodiment, the quantization matrix held by the holding unit 103 is assumed to be a two-dimensional array of quantization step values, as shown in Figures 8A, 8B, and 8C, but the form of holding the quantization step values in the quantization matrix is not limited to a specific form.
[0029] Furthermore, the holding unit 103 can also hold multiple quantization matrices for the same prediction method, depending on the size of the subblock or whether the encoding target is a luminance block or a chrominance block. Generally, in order to realize quantization processing according to human visual characteristics, the quantization matrix has a small quantization step value for the DC component corresponding to the upper left corner of the quantization matrix, as shown in Figures 8A, 8B, and 8C, and a large quantization step value for the AC component corresponding to the lower right corner.
[0030] The encoding unit 113 reads the respective quantization matrices shown in Figures 8A, 8B, and 8C from the holding unit 103, scans each element in the read quantization matrices to calculate the difference between elements, and generates a one-dimensional array by arranging the calculated differences.
[0031] In this embodiment, each element in the quantization matrix is scanned according to the scanning order shown in Figure 9, and for each scanned element, the difference between the value of that element (quantization step value) and the value of the element immediately preceding it in the scanning order (quantization step value) is calculated.
[0032] For example, when each element in the quantization matrix in Figure 8C is scanned according to the scanning order shown in Figure 9, the first element located in the upper left corner (quantization step value "8") is scanned, followed by the element located directly below it (quantization step value "11"). The difference calculated is "3," which is the difference between the value of the former element "8" and the value of the latter element "11." Note that for the first element of the quantization matrix in the scanning order, the difference is calculated from a predetermined initial value (for example, "8"), but it is not limited to this; the difference can also be calculated from an arbitrary value or from the value of the first element itself.
[0033] In this way, the encoding unit 113 scans each element in the quantization matrix of Figure 8A according to the scanning order of Figure 9, calculates the differences between the elements, and arranges them to generate a one-dimensional array as the one-dimensional difference matrix shown in Figure 10A.
[0034] Similarly, the encoding unit 113 scans each element in the quantization matrix of Figure 8B according to the scanning order of Figure 9, calculates the differences between the elements, and arranges them to generate a one-dimensional array as the one-dimensional difference matrix shown in Figure 10B.
[0035] Similarly, the encoding unit 113 scans each element in the quantization matrix in Figure 8C according to the scanning order in Figure 9, calculates the differences between the elements, and arranges them to generate a one-dimensional array as the one-dimensional difference matrix shown in Figure 10C.
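The scan-and-difference step can be sketched as follows. Figure 9 is not reproduced here, so an up-right diagonal scan is assumed because it is consistent with the description (upper-left element first, then the element directly below it). The 4x4 matrix values are hypothetical, apart from the leading "8" and "11" taken from the example above, and the function names are illustrative only.

```python
def up_right_diagonal_scan(size):
    """Assumed scan order: each anti-diagonal is walked from bottom-left to top-right."""
    order = []
    for d in range(2 * size - 1):
        row = min(d, size - 1)
        col = d - row
        while row >= 0 and col < size:
            order.append((row, col))
            row -= 1
            col += 1
    return order


def to_difference_array(matrix, scan_order, initial_value=8):
    """Difference of each element from the previous element in scan order."""
    previous = initial_value  # predetermined initial value used for the first element
    differences = []
    for row, col in scan_order:
        value = matrix[row][col]
        differences.append(value - previous)
        previous = value
    return differences


def from_difference_array(differences, scan_order, size, initial_value=8):
    """Inverse operation: rebuild the two-dimensional matrix from the differences."""
    matrix = [[0] * size for _ in range(size)]
    previous = initial_value
    for (row, col), diff in zip(scan_order, differences):
        previous += diff
        matrix[row][col] = previous
    return matrix


# Hypothetical 4x4 quantization matrix (not the values of Figures 8A to 8C).
example_matrix = [
    [8, 11, 14, 17],
    [11, 14, 17, 20],
    [14, 17, 20, 23],
    [17, 20, 23, 26],
]
scan = up_right_diagonal_scan(4)
differences = to_difference_array(example_matrix, scan)
rebuilt = from_difference_array(differences, scan, 4)
```

Because neighbouring quantization step values tend to be close, the difference array consists mostly of small numbers, which is what makes the subsequent variable-length coding compact.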
[0036] The encoding unit 113 then encodes the one-dimensional difference matrices in Figures 10A, 10B, and 10C, which were generated for the respective quantization matrices in Figures 8A, 8B, and 8C, to generate coded data for the quantization matrices. In this embodiment, the encoding unit 113 encodes the difference matrices using the encoding table shown in Figure 11A, but the encoding table used to encode the difference matrices is not limited to the encoding table shown in Figure 11A. For example, the encoding unit 113 may encode the difference matrices using the encoding table shown in Figure 11B.
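As an illustration of mapping each difference value to a binary code, the sketch below uses a signed Exp-Golomb code. This is only a stand-in: the tables of Figures 11A and 11B are not reproduced here and may assign different codewords. The function names are hypothetical.

```python
def signed_exp_golomb(value):
    """Signed Exp-Golomb codeword for an integer, returned as a string of bits."""
    # Map signed values to non-negative code numbers: 0, 1, -1, 2, -2 -> 0, 1, 2, 3, 4
    code_number = 2 * value - 1 if value > 0 else -2 * value
    binary = bin(code_number + 1)[2:]
    # Prefix of leading zeros, one fewer than the number of binary digits
    return "0" * (len(binary) - 1) + binary


def encode_differences(differences):
    """Concatenate the codewords of all entries of a one-dimensional difference matrix."""
    return "".join(signed_exp_golomb(d) for d in differences)


code_data = encode_differences([0, 3, 0, -1])
```

With such a code, a difference of 0 costs a single bit, so a smoothly varying quantization matrix is represented with few bits.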
[0037] The integrated encoding unit 111 encodes the header information necessary for encoding the frame, and integrates the encoded header information with the encoded data of the quantization matrix generated by the encoding unit 113.
[0038] The block division unit 102 divides the input frame (the input image for one frame) into multiple basic blocks. In this embodiment, as described above, the size of the basic block is 8x8 pixels.
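The division into basic blocks can be sketched as below. The frame is represented as a list of pixel rows, and the function name `split_into_basic_blocks` is hypothetical. The sketch assumes frame dimensions that are multiples of the block size; handling of partial blocks at the frame edges is not covered by the text.

```python
def split_into_basic_blocks(frame, block_size=8):
    """Return (top, left, block) tuples in raster order for one frame."""
    height, width = len(frame), len(frame[0])
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size] for row in frame[top:top + block_size]]
            blocks.append((top, left, block))
    return blocks


# A synthetic 16x16 frame yields four 8x8 basic blocks.
frame = [[(y * 16 + x) % 256 for x in range(16)] for y in range(16)]
basic_blocks = split_into_basic_blocks(frame, 8)
```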
[0039] The prediction unit 104 determines a subblock division method, which is a method of dividing a basic block into multiple subblocks (subblock division), and determines a prediction process (prediction mode) for generating predicted images of the subblocks.
[0040] Figures 7A to 7F show examples of subblock division patterns. The rectangles indicated by the thick outer borders in Figures 7A to 7F represent basic blocks, and in this embodiment, the basic blocks have a size of 8x8 pixels. The rectangles within the thick borders represent subblocks.
[0041] Figure 7A shows an example of a basic block = subblock. Figure 7B shows an example of a conventional square subblock division, where a basic block with a size of 8x8 pixels is divided into four "subblocks with a size of 4x4 pixels".
[0042] Figures 7C to 7F show an example of rectangular sub-block division. In Figure 7C, a basic block having a size of 8x8 pixels is divided into two "vertically long sub-blocks having a size of 4x8 pixels". In Figure 7D, a basic block having a size of 8x8 pixels is divided into two "horizontally long sub-blocks having a size of 8x4 pixels". In Figure 7E, a basic block having a size of 8x8 pixels is divided into three sub-blocks: a "vertically long sub-block having a size of 2x8 pixels", a "vertically long sub-block having a size of 4x8 pixels", and a "vertically long sub-block having a size of 2x8 pixels". In Figure 7F, a basic block having a size of 8x8 pixels is divided into three sub-blocks: a "horizontally long sub-block having a size of 8x2 pixels", a "horizontally long sub-block having a size of 8x4 pixels", and a "horizontally long sub-block having a size of 8x2 pixels". Thus, not only squares but also rectangular sub-blocks are used for the encoding process.
[0043] This embodiment describes the case shown in Figure 7A, in which the sub-block division method leaves a basic block of 8x8 pixels undivided and uses it as a single sub-block. However, a quadtree division such as that in Figure 7B, a ternary tree division such as those in Figures 7E and 7F, or a binary tree division such as those in Figures 7C and 7D may be used. That is, the block to be encoded (encoding target block) may be a basic block or a sub-block obtained by dividing the basic block.
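The six division patterns of Figures 7A to 7F can be expressed compactly as lists of rectangles. In the sketch below, the pattern names and the `(x, y, width, height)` tuple layout are hypothetical conventions chosen for illustration.

```python
def split_block(size, pattern):
    """Return sub-block rectangles (x, y, width, height) for one basic block."""
    half, quarter = size // 2, size // 4
    if pattern == "none":                # Figure 7A: basic block = sub-block
        return [(0, 0, size, size)]
    if pattern == "quad":                # Figure 7B: four square sub-blocks
        return [(x, y, half, half) for y in (0, half) for x in (0, half)]
    if pattern == "binary_vertical":     # Figure 7C: two vertically long sub-blocks
        return [(0, 0, half, size), (half, 0, half, size)]
    if pattern == "binary_horizontal":   # Figure 7D: two horizontally long sub-blocks
        return [(0, 0, size, half), (0, half, size, half)]
    if pattern == "ternary_vertical":    # Figure 7E: widths in the ratio 1:2:1
        return [
            (0, 0, quarter, size),
            (quarter, 0, half, size),
            (quarter + half, 0, quarter, size),
        ]
    if pattern == "ternary_horizontal":  # Figure 7F: heights in the ratio 1:2:1
        return [
            (0, 0, size, quarter),
            (0, quarter, size, half),
            (0, quarter + half, size, quarter),
        ]
    raise ValueError("unknown pattern: " + pattern)


PATTERNS = ["none", "quad", "binary_vertical", "binary_horizontal",
            "ternary_vertical", "ternary_horizontal"]
ternary = split_block(8, "ternary_vertical")
```

For an 8x8 basic block, `ternary` holds sub-blocks of 2x8, 4x8, and 2x8 pixels, matching the description of Figure 7E.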
[0044] When sub-block division methods other than that in Figure 7A are also used, it is necessary to generate a quantization matrix corresponding to each sub-block size to be used. Each generated quantization matrix is then encoded by the encoding unit 113.
[0045] The method for determining the sub-block division method is not limited to a specific determination method. For example, it may be determined according to a user operation, or may be determined according to a predetermined criterion. Further, the sub-block division method may be predetermined.
[0046] Further, the prediction unit 104 determines the prediction mode of the sub-block. In the present embodiment, one of intra prediction, inter prediction, and template matching prediction is determined as the prediction mode for each sub-block.
[0047] In intra prediction, a predicted image of the encoding target block is generated using encoded pixels spatially adjacent to the encoding target block, and an intra prediction mode indicating an intra prediction method such as horizontal prediction, vertical prediction, and DC prediction is also generated.
[0048] In inter prediction, a frame different from (temporally different from) the frame to which the encoding target block belongs is used as a reference frame, and a predicted image of the encoding target block is generated using encoded pixels in the reference frame, and motion information indicating the reference frame, motion vectors, etc. is also generated.
[0049] In template matching prediction, a pixel group adjacent to the encoding target block is used as a template, and a pixel group determined to be similar to the template is searched for from "the encoded pixel group in the frame to which the encoding target block belongs", and a predicted image of the encoding target block is generated based on the searched pixel group.
[0050] An example of template matching prediction will now be described using the specific example shown in Figure 12, namely a method for generating a predicted image (P) 1203 corresponding to the encoding target block (C) 1201 in the frame 1200.
[0051] The prediction unit 104 uses a pixel group (N) 1202 composed of a pixel group adjacent to the upper side of the encoding target block 1201 and a pixel group adjacent to the left side of the encoding target block 1201 as a template.
[0052] The prediction unit 104 then searches for a group of pixels similar to the template from the encoded pixel group in frame 1200. The method for searching for the "group of pixels similar to the template" is not limited to a specific search method. For example, template matching may be performed between the region of the encoded pixel group in frame 1200 and the template, and the group of pixels in the region of the encoded pixel group in frame 1200 with the highest similarity to the template may be defined as the "group of pixels most similar to the template". Figure 12 shows a case in which the pixel group (T') 1204 was found as the group of pixels most similar to the template from the encoded pixel group in frame 1200.
[0053] The prediction unit 104 then generates a predicted image 1203 composed of encoded pixels of frame 1200 based on the pixel group 1204. The example in Figure 12 shows a case where the image in the rectangular area adjacent to the lower right of the pixel group 1204 is generated as the predicted image 1203.
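The search described in connection with Figure 12 can be sketched as follows. The text leaves the similarity measure and the search range open, so this sketch makes several assumptions: similarity is measured by the sum of absolute differences (SAD), the template is one pixel thick, and only rows lying entirely above the encoding target block are treated as already encoded. All function names are hypothetical.

```python
def template_pixels(frame, top, left, size):
    """L-shaped template: the row above and the column to the left of a block."""
    above = [frame[top - 1][left + i] for i in range(size)]
    left_column = [frame[top + i][left - 1] for i in range(size)]
    return above + left_column


def template_matching_predict(frame, top, left, size):
    """Find the candidate whose template best matches; return (predicted, position)."""
    target = template_pixels(frame, top, left, size)
    best_cost, best_position = None, None
    # Simplifying assumption: candidates must lie entirely above the target block.
    for cand_top in range(1, top - size + 1):
        for cand_left in range(1, len(frame[0]) - size + 1):
            candidate = template_pixels(frame, cand_top, cand_left, size)
            cost = sum(abs(a - b) for a, b in zip(target, candidate))
            if best_cost is None or cost < best_cost:
                best_cost, best_position = cost, (cand_top, cand_left)
    if best_position is None:
        return None, None  # no encoded candidate is available
    cand_top, cand_left = best_position
    predicted = [frame[cand_top + r][cand_left:cand_left + size] for r in range(size)]
    return predicted, best_position


# Synthetic 12x12 frame whose content repeats every four rows.
frame = [[x * 7 + (y % 4) * 13 for x in range(12)] for y in range(12)]
predicted, position = template_matching_predict(frame, top=8, left=4, size=4)
```

Because the search uses only pixels that the decoder has already reconstructed, a decoder can repeat the same search on its side, so no displacement vector needs to be signalled in this sketch.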
[0054] Template matching prediction is considered a technique that improves encoding efficiency, particularly for artificial images such as computer screens, where the same characters or textures appear repeatedly within the frames being encoded and decoded.
[0055] The prediction unit 104 then divides the basic block into multiple subblocks according to the subblock division method determined as described above. In this embodiment, the basic block is not divided, so the subblocks in the following description are the same as the basic block.
[0056] The prediction unit 104 then calculates the difference between each subblock and the predicted image generated by the prediction process of the prediction mode determined for that subblock as the prediction error.
[0057] Furthermore, the prediction unit 104 outputs information necessary for prediction (such as the subblock division method, the prediction mode (information indicating which of intra prediction, inter prediction, or template matching prediction was used), and the motion vector) as prediction information.
[0058] The transformation / quantization unit 105 performs orthogonal transformation (frequency transformation) and quantization on the prediction error for each subblock generated by the prediction unit 104 to generate the quantization coefficients (quantized transformation coefficients) for the subblock.
[0059] More specifically, the transformation / quantization unit 105 generates transformation coefficients by performing an orthogonal transformation corresponding to the size of the prediction error of each subblock, selects a quantization matrix from the quantization matrices held by the holding unit 103 that corresponds to the prediction mode of the subblock, and quantizes the transformation coefficients using the selected quantization matrix to generate quantization coefficients (transformation coefficients after quantization).
[0060] In this embodiment, the transformation / quantization unit 105 selects the quantization matrix shown in Figure 8A for subblocks whose prediction mode is intra prediction, the quantization matrix shown in Figure 8B for subblocks whose prediction mode is inter prediction, and the quantization matrix shown in Figure 8C for subblocks whose prediction mode is template matching prediction. However, the quantization matrices used are not limited to these.
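The mode-dependent selection and the quantization itself can be sketched as below. The 2x2 matrices are hypothetical placeholders for the 8x8 matrices of Figures 8A to 8C, and the quantization is simplified to a rounded division by the quantization step value; a practical codec would additionally scale by a quantization parameter, which the text does not detail. The names are illustrative only.

```python
# Hypothetical quantization step values, one matrix per prediction mode.
QUANTIZATION_MATRICES = {
    "intra": [[8, 16], [16, 32]],
    "inter": [[8, 12], [12, 24]],
    "template_matching": [[8, 11], [11, 16]],
}


def select_quantization_matrix(prediction_mode):
    """Pick the matrix that corresponds to the subblock's prediction mode."""
    return QUANTIZATION_MATRICES[prediction_mode]


def quantize_value(coefficient, step):
    """Divide by the step, rounding the magnitude to the nearest integer."""
    magnitude = (abs(coefficient) + step // 2) // step
    return magnitude if coefficient >= 0 else -magnitude


def quantize(coefficients, matrix):
    """Quantize each transformation coefficient with its own step value."""
    return [
        [quantize_value(c, q) for c, q in zip(coeff_row, step_row)]
        for coeff_row, step_row in zip(coefficients, matrix)
    ]


coefficients = [[100, -40], [33, 5]]
levels_template = quantize(coefficients, select_quantization_matrix("template_matching"))
levels_intra = quantize(coefficients, select_quantization_matrix("intra"))
```

The same coefficients produce different quantization coefficients depending on the prediction mode, which is the frequency-dependent control that a dedicated matrix for template matching prediction is meant to provide.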
[0061] The inverse quantization / inverse transformation unit 106 obtains from the holding unit 103 the quantization matrix used in the quantization process performed by the transformation / quantization unit 105 to generate the quantization coefficients of the subblock (i.e., the quantization matrix corresponding to the prediction process (prediction mode) of the subblock). Then, the inverse quantization / inverse transformation unit 106 reconstructs the transformation coefficients by inverse quantization of the quantization coefficients generated by the transformation / quantization unit 105 using the quantization matrix obtained for those quantization coefficients, and then reconstructs the prediction error by inverse orthogonal transformation of the transformation coefficients.
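Inverse quantization with the same matrix can be sketched as a multiplication by the quantization step values. The values below are hypothetical and the model is simplified (no quantization parameter scaling); `dequantize` is an illustrative name.

```python
def dequantize(levels, matrix):
    """Multiply each quantization coefficient by its quantization step value."""
    return [
        [level * step for level, step in zip(level_row, step_row)]
        for level_row, step_row in zip(levels, matrix)
    ]


# Hypothetical values: original coefficients, step values, and quantized levels.
original = [[100, -40], [33, 5]]
matrix = [[8, 11], [11, 16]]
levels = [[13, -4], [3, 0]]

restored = dequantize(levels, matrix)
errors = [
    [abs(o - r) for o, r in zip(orig_row, rest_row)]
    for orig_row, rest_row in zip(original, restored)
]
```

In this example the error of each restored coefficient is at most half of its step value, so elements with larger step values tolerate larger errors.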
[0062] The image playback unit 107 generates a predicted image by appropriately referring to the frame memory 108 based on the prediction information output from the prediction unit 104. Then, the image playback unit 107 adds the generated predicted image and the prediction error of the subblock reconstructed by the inverse quantization / inverse transformation unit 106 to generate a reconstructed image of the subblock, and stores the generated reconstructed image in the frame memory 108.
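The relationship between the prediction error and the reconstructed image can be sketched as a subtraction on the prediction side and an addition on the reconstruction side. Clipping to the valid sample range is an assumption (8-bit samples here); the text only states that the two images are added. The function names are hypothetical.

```python
def prediction_error(block, predicted):
    """Pixel-wise difference between the subblock and its predicted image."""
    return [[b - p for b, p in zip(b_row, p_row)] for b_row, p_row in zip(block, predicted)]


def reconstruct(predicted, residual, bit_depth=8):
    """Add the prediction error back and clip to the sample range (assumed)."""
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(predicted, residual)
    ]


block = [[120, 130], [140, 255]]
predicted = [[118, 133], [140, 250]]

residual = prediction_error(block, predicted)
lossless = reconstruct(predicted, residual)
# After quantization the residual is only approximate, so the result may differ.
lossy = reconstruct(predicted, [[0, -4], [0, 8]])
```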
[0063] The in-loop filter unit 109 performs in-loop filtering, such as deblocking filtering and sample adaptive offset, on the reconstructed image stored in the frame memory 108, and then stores the filtered reconstructed image back into the frame memory 108.
[0064] The encoding unit 110 entropy encodes, for each subblock, the quantization coefficients generated by the transformation / quantization unit 105 for that subblock and the prediction information output from the prediction unit 104 to generate coded data. For entropy encoding, for example, Golomb coding, arithmetic coding, Huffman coding, etc., can be used.
[0065] The integrated encoding unit 111 generates header code data using the code data of the quantization matrix generated by the encoding unit 113. Furthermore, the integrated encoding unit 111 multiplexes the generated header code data with the code data of the image generated by the encoding unit 110 to form a bitstream, and outputs the formed bitstream. An example of the data structure of the bitstream generated and output by the integrated encoding unit 111 is shown in Figure 6A.
[0066] The sequence header contains the coded data for the quantization matrix, which in turn contains the coded data for each element of the quantization matrix. However, the location where the quantization matrix is encoded is not limited to the sequence header; it may also be encoded in the picture header or another header. Furthermore, when the quantization matrix is changed within a single sequence, it can be updated by encoding a new quantization matrix. In this case, the entire set of quantization matrices may be rewritten, or only part of it may be changed by specifying the prediction mode corresponding to the quantization matrix to be rewritten.
[0067] Next, the processing performed by the image encoding device for encoding one frame in a moving image will be explained according to the flowchart in Figure 3. In step S301, the holding unit 103 generates and holds a quantization matrix, which is a two-dimensional array of quantization step values, prior to the encoding process. In this embodiment, as described above, the holding unit 103 generates and holds the quantization matrices shown in Figures 8A to 8C (corresponding to subblocks having a size of 8x8 pixels, and corresponding to the prediction methods of intra prediction, inter prediction, and template matching prediction).
[0068] In step S302, the encoding unit 113 reads the quantization matrix held in the holding unit 103 in step S301, scans each element in the read quantization matrix to calculate the difference between elements, and generates a one-dimensional array of the calculated differences as a one-dimensional difference matrix. In this embodiment, as described above, the encoding unit 113 scans each element in the quantization matrix of Figure 8A according to the scan order shown in Figure 9 to calculate the difference between elements, and generates the difference matrix of Figure 10A by arranging the calculated differences. Furthermore, the encoding unit 113 scans each element in the quantization matrix of Figure 8B according to the scan order shown in Figure 9 to calculate the difference between elements, and generates the difference matrix of Figure 10B by arranging the calculated differences. Furthermore, the encoding unit 113 scans each element in the quantization matrix of Figure 8C according to the scan order shown in Figure 9 to calculate the difference between elements, and generates the difference matrix of Figure 10C by arranging the calculated differences.
[0069] The encoding unit 113 then refers to the encoding table shown in Figure 11A or Figure 11B to identify the binary code corresponding to the value of each element (value to be encoded) in the one-dimensional difference matrix of the quantization matrix, and generates the set of identified binary codes as the coded data of the quantization matrix.
[0070] In step S303, the integrated encoding unit 111 encodes the header information necessary for encoding the frame, and integrates the encoded header information with the encoded data of the quantization matrix generated in step S302.
[0071] In step S304, the block division unit 102 divides the input frame (one frame of input image) into multiple basic blocks. In step S305, the prediction unit 104 selects one of the multiple basic blocks obtained in step S304 that was not selected as the selected basic block. The prediction unit 104 then divides the selected basic block into multiple subblocks, performs prediction processing on each subblock to generate a predicted image for that subblock, and calculates the difference between the subblock and its predicted image as the prediction error. The prediction unit 104 also outputs the information necessary for prediction as prediction information.
[0072] More specifically, the prediction unit 104 designates each subblock in the selected basic block as a subblock of interest, and generates a predicted image of the subblock of interest by performing the following processing on the subblock of interest.
[0073] The prediction unit 104 performs intra-prediction on the subblock of interest by referring to the region of the encoded pixel group of the frame to which the subblock of interest belongs, and generates a predicted image (intra-predicted image) of the subblock of interest.
[0074] Furthermore, the prediction unit 104 performs inter prediction on the subblock of interest by referring to an encoded frame different from the frame to which the subblock of interest belongs (for example, the frame encoded immediately before), and generates a predicted image (inter-predicted image) of the subblock of interest.
[0075] Furthermore, the prediction unit 104 performs template matching prediction on the subblock of interest, as illustrated in Figure 12, to generate a predicted image (template matching prediction image) of the subblock of interest.
[0076] The prediction unit 104 then generates a difference image between the subblock of interest and the intra-predicted image generated for the subblock of interest, and calculates the sum of the squared values (or absolute values, for example) of the pixel values of each pixel in the difference image as the first evaluation value.
[0077] Furthermore, the prediction unit 104 generates a difference image between the subblock of interest and the inter-predicted image generated for the subblock of interest, and calculates the sum of the squared values (or absolute values, for example) of the pixel values of each pixel in the difference image as the second evaluation value.
[0078] Furthermore, the prediction unit 104 generates a difference image between the subblock of interest and the template matching prediction image generated for the subblock of interest, and calculates the sum of the squared values (or absolute values, for example) of the pixel values of each pixel in the difference image as the third evaluation value.
[0079] The prediction unit 104 then identifies the smallest evaluation value among the first evaluation value, second evaluation value, and third evaluation value, and determines the prediction mode of the predicted image for which the smallest evaluation value was calculated as the prediction mode of the subblock of interest.
[0080] For example, if the first evaluation value is the smallest of the first evaluation value, second evaluation value, and third evaluation value, the prediction unit 104 determines that the prediction mode of the subblock of interest is the intra-prediction mode, because the predicted image for which the first evaluation value was calculated is an intra-predicted image.
[0081] For example, if the second evaluation value is the smallest of the first evaluation value, second evaluation value, and third evaluation value, the prediction unit 104 determines that the prediction mode of the subblock of interest is the inter-prediction mode, because the predicted image for which the second evaluation value was calculated is an inter-predicted image.
[0082] For example, if the third evaluation value is the smallest of the first evaluation value, second evaluation value, and third evaluation value, the prediction unit 104 determines that the prediction mode of the subblock of interest is the template matching prediction mode, because the predicted image for which the third evaluation value was calculated is a template matching prediction image.
[0083] The prediction unit 104 then outputs prediction information including the prediction mode of the subblock of interest. If the prediction unit 104 determines that the prediction mode of the subblock of interest is the intra-prediction mode, it sets the difference between the subblock of interest and the intra-predicted image as the prediction error of the subblock of interest.
[0084] Furthermore, if the prediction unit 104 determines that the prediction mode of the subblock of interest is the inter-prediction mode, it sets the difference between the subblock of interest and the inter-predicted image as the prediction error of the subblock of interest.
[0085] Furthermore, if the prediction unit 104 determines that the prediction mode of the subblock of interest is the template matching prediction mode, it sets the difference between the subblock of interest and the template matching prediction image as the prediction error of the subblock of interest.
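The mode decision described in paragraphs [0076] to [0085] can be sketched as follows. This is a minimal illustration, assuming that the three predicted images have already been generated and are passed in as nested lists; the function names and the mode labels are hypothetical, and ties are resolved in favour of the first candidate listed, which the description above does not specify.

```python
def sum_squared_error(block, predicted):
    """Evaluation value: sum of squared pixel differences."""
    return sum((b - p) ** 2
               for block_row, pred_row in zip(block, predicted)
               for b, p in zip(block_row, pred_row))


def choose_prediction_mode(block, candidates):
    """candidates maps a mode label to its predicted image.
    Returns the mode with the smallest evaluation value and its prediction error."""
    best_mode = min(candidates,
                    key=lambda mode: sum_squared_error(block, candidates[mode]))
    predicted = candidates[best_mode]
    error = [[b - p for b, p in zip(block_row, pred_row)]
             for block_row, pred_row in zip(block, predicted)]
    return best_mode, error


# Placeholder 2x2 data for illustration only.
block = [[10, 12], [14, 16]]
candidates = {
    "intra": [[9, 12], [14, 18]],
    "inter": [[10, 12], [14, 15]],
    "template_matching": [[0, 0], [0, 0]],
}
mode, prediction_error = choose_prediction_mode(block, candidates)
print(mode, prediction_error)
```

Replacing the squared difference with an absolute difference gives the alternative evaluation value mentioned in the text.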
[0086] In step S306, the conversion / quantization unit 105 generates transformation coefficients by performing, in subblock units, an orthogonal transformation (frequency transformation) on the prediction error generated by the prediction unit 104 in step S305.
[0087] Then, the conversion / quantization unit 105 identifies the prediction mode of the subblock from the prediction information and selects the matrix corresponding to the prediction mode of the subblock from the quantization matrices held by the holding unit 103.
[0088] In this embodiment, if the prediction mode of a subblock is intra prediction, the quantization matrix shown in Figure 8A is selected for that subblock. If the prediction mode of a subblock is inter prediction, the quantization matrix shown in Figure 8B is selected for that subblock. If the prediction mode of a subblock is template matching prediction, the quantization matrix shown in Figure 8C is selected for that subblock.
[0089] The conversion / quantization unit 105 then quantizes the transformation coefficients of the subblock using the quantization matrix selected for the subblock to generate quantization coefficients (transformation coefficients after quantization).
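The per-mode matrix selection and quantization of step S306 can be sketched as follows. This is a simplified illustration, not the quantizer defined in this disclosure: it assumes integer coefficients, treats each matrix element directly as a quantization step, rounds half away from zero, and ignores any additional scaling by a quantization parameter. The function names, the mode labels, and the 2x2 matrices are hypothetical placeholders, not the values of Figures 8A to 8C.

```python
def quantize_value(coefficient, step):
    """Divide by the step and round half away from zero (integer arithmetic)."""
    magnitude = (2 * abs(coefficient) + step) // (2 * step)
    return magnitude if coefficient >= 0 else -magnitude


def quantize(coefficients, prediction_mode, matrices):
    """Quantize a subblock with the matrix held for its prediction mode."""
    matrix = matrices[prediction_mode]
    return [[quantize_value(c, step) for c, step in zip(coef_row, step_row)]
            for coef_row, step_row in zip(coefficients, matrix)]


# Placeholder matrices keyed by prediction mode (assumed labels).
matrices = {
    "intra": [[8, 16], [16, 32]],
    "inter": [[8, 12], [12, 24]],
    "template_matching": [[8, 8], [8, 16]],
}
coefficients = [[100, -30], [5, 40]]
print(quantize(coefficients, "intra", matrices))
print(quantize(coefficients, "template_matching", matrices))
```

The same coefficients yield different quantization coefficients depending on the matrix, which is the per-frequency control that a dedicated template matching matrix provides.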
[0090] In step S307, the inverse quantization / inverse transformation unit 106 reconstructs the transformation coefficients by inverse quantization of the quantization coefficients generated in step S306 using the quantization matrix used to generate the quantization coefficients, and then reconstructs the prediction error by inverse orthogonal transformation of the transformation coefficients.
[0091] In step S308, the image playback unit 107 generates a predicted image by appropriately referring to the frame memory 108 based on the prediction information output from the prediction unit 104 in step S305. Then, the image playback unit 107 generates (reproduces) a reproduced image of the subblock from the generated predicted image and the prediction error reconstructed by the inverse quantization / inverse transformation unit 106 in step S307, and stores the generated reproduced image in the frame memory 108.
[0092] In step S309, the encoding unit 110 entropy encodes the quantization coefficients generated by the conversion / quantization unit 105 in step S306 and the prediction information output from the prediction unit 104 in step S305 to generate coded image data.
[0093] Then, the integrated encoding unit 111 generates header code data using the code data of the quantization matrix generated by the encoding unit 113 in step S302. Furthermore, the integrated encoding unit 111 forms a bitstream by combining the generated header code data with the code data of the image generated by the encoding unit 110, and outputs the formed bitstream.
[0094] In step S310, the control unit 150 determines whether all basic blocks in the frame have been selected as selected basic blocks. If, as a result of this determination, all basic blocks in the frame have been selected as selected basic blocks, the process proceeds to step S311. On the other hand, if there are still basic blocks in the frame that have not yet been selected as selected basic blocks, the process returns to step S305.
[0095] In step S311, the in-loop filter unit 109 performs in-loop filtering on the reproduced image generated in step S308 and stored in the frame memory 108, and then stores the reproduced image that has undergone the in-loop filtering back into the frame memory 108.
[0096] Furthermore, if the image encoding device performs encoding processing on a group of subsequent frames following the above-mentioned single frame, it will perform the processing in steps S304 to S311 above for each subsequent frame in the group of subsequent frames.
[0097] Thus, according to this embodiment, performing quantization with a quantization matrix on subblocks that use template matching prediction, in particular in step S306, allows quantization to be controlled for each frequency component, thereby improving subjective image quality.
[0098] In this embodiment, quantization matrices are defined individually for the three types of prediction methods (intra prediction, inter prediction, and template matching prediction), and all three types of quantization matrices are encoded. However, some of these matrices may be shared.
[0099] For example, the prediction error of a subblock using template matching prediction may be quantized using the quantization matrix in Figure 8A, in the same way as the prediction error of a subblock using intra prediction. In that case, the quantization matrix in Figure 8C becomes unnecessary, and its encoding can be omitted. This saves the code amount of the quantization matrix in Figure 8C while mitigating image quality degradation caused by errors in prediction that uses pixels within the same frame, such as block distortion.
[0100] Furthermore, for example, the prediction error of a subblock using template matching prediction may be quantized using the quantization matrix in Figure 8B, in the same way as the prediction error of a subblock using inter prediction. In this case as well, the quantization matrix in Figure 8C becomes unnecessary, and its encoding can be omitted. This saves the code amount of the quantization matrix in Figure 8C while also reducing image quality degradation caused by errors in prediction that uses block-shaped pixel groups, such as jerky motion.
[0101] Furthermore, in this embodiment, the quantization matrix for a subblock using template matching prediction is uniquely determined, but it may instead be made selectable by introducing an identifier. For example, Figure 6B shows an example of a bitstream configuration in which a newly introduced quantization matrix coding method information code makes the coding of the quantization matrix for subblocks using template matching prediction selectable.
[0102] For example, if the quantization matrix coding method information code indicates 0, the quantization matrix shown in Figure 8A, which is the matrix for subblocks using intra prediction, is also used for subblocks using template matching prediction.
[0103] Furthermore, if the quantization matrix coding method information code indicates 1, the quantization matrix shown in Figure 8B, which is the matrix for subblocks using inter prediction, is also used for subblocks using template matching prediction.
[0104] Furthermore, if the quantization matrix coding method information code indicates 2, the individually coded quantization matrix shown in Figure 8C is used for subblocks using template matching prediction.
[0105] Therefore, for example, if subblocks using template matching prediction are set to use the quantization matrix shown in Figure 8A, the control unit 150 sets the "quantization matrix coding method information code" to 0.
[0106] For example, if subblocks using template matching prediction are set to use the quantization matrix shown in Figure 8B, the control unit 150 sets the "quantization matrix coding method information code" to 1.
[0107] For example, if subblocks using template matching prediction are set to use the quantization matrix shown in Figure 8C, the control unit 150 sets the "quantization matrix coding method information code" to 2.
[0108] The integrated encoding unit 111 then includes the "quantization matrix coding method information code", with the value set as described above, in the bitstream as shown in Figure 6B. This makes it possible to choose between reducing the code amount of the quantization matrix and applying dedicated quantization control to subblocks using template matching prediction.
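The meaning of the three code values can be summarised with a short sketch. The function name and argument names are hypothetical, and the error handling for other values is an assumption; the description above defines only the values 0, 1, and 2.

```python
def matrix_for_template_matching(method_code, intra_matrix, inter_matrix,
                                 own_matrix=None):
    """Resolve the quantization matrix used for template matching subblocks
    from the quantization matrix coding method information code."""
    if method_code == 0:
        return intra_matrix   # share the intra prediction matrix (Figure 8A)
    if method_code == 1:
        return inter_matrix   # share the inter prediction matrix (Figure 8B)
    if method_code == 2:
        if own_matrix is None:
            raise ValueError("code 2 requires an individually coded matrix")
        return own_matrix     # individually coded matrix (Figure 8C)
    raise ValueError(f"unsupported method code: {method_code}")


# Placeholder matrices for illustration only.
intra = [[8, 16], [16, 32]]
inter = [[8, 12], [12, 24]]
own = [[8, 8], [8, 16]]
print(matrix_for_template_matching(1, intra, inter))
```

With codes 0 and 1 the third matrix never needs to be encoded, which is where the code amount reduction comes from.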
[0109] Furthermore, although this embodiment assumes that only three types of prediction methods (intra prediction, inter prediction, and template matching prediction) are used, it is not limited to these; for example, intra-inter composite prediction (CIIP), which is used in VVC, may also be used. Intra-inter composite prediction is a prediction method that calculates the pixels of the entire subblock to be encoded as a weighted average of intra-predicted pixels and inter-predicted pixels. In this case, the quantization matrix used for subblocks using template matching prediction and the quantization matrix used for subblocks using intra-inter composite prediction can be made common. This makes it possible to reduce the code amount of the quantization matrix corresponding to the new prediction method. Note that there is no particular restriction on which of the two, the quantization matrix used for subblocks using template matching prediction or the one used for subblocks using intra-inter composite prediction, is adopted as the common matrix.
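The weighted average that defines intra-inter composite prediction can be illustrated as follows. This is only a sketch: the weights (an intra weight out of a total of 4), the rounding offset, and the function name are assumptions chosen for illustration and are not taken from this disclosure.

```python
def combine_intra_inter(intra_pred, inter_pred, intra_weight=2, total_weight=4):
    """Blend intra-predicted and inter-predicted pixels with integer weights."""
    inter_weight = total_weight - intra_weight
    half = total_weight // 2  # rounding offset
    return [[(intra_weight * a + inter_weight * b + half) // total_weight
             for a, b in zip(intra_row, inter_row)]
            for intra_row, inter_row in zip(intra_pred, inter_pred)]


# Placeholder pixel values.
print(combine_intra_inter([[100, 50]], [[60, 51]]))
```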
[0110] Furthermore, although this embodiment describes a case where the frame is an image, the frame is not limited to an image. For example, a two-dimensional array of feature data used in machine learning, such as object recognition, may be encoded as a frame and output as a bitstream. In this case, each element in this two-dimensional array can be treated as a pixel and this embodiment can be applied. This makes it possible to efficiently encode feature data used in machine learning.
[0111] [Second Embodiment] In this embodiment, an image decoding device, which is an example of a decoding device that acquires and decodes the bitstream for each frame generated by the image encoding device according to the first embodiment, will be described. In this embodiment, the differences from the first embodiment will be described, and unless otherwise specified below, it will be assumed to be the same as the first embodiment. First, an example of the functional configuration of the image decoding device according to this embodiment will be described using the block diagram in Figure 2.
[0112] The separation / decoding unit 202 acquires the bitstream generated by the image encoding device according to the first embodiment. The method by which the separation / decoding unit 202 acquires the bitstream is not limited to a specific acquisition method. For example, the separation / decoding unit 202 may acquire the bitstream transmitted from the image encoding device via a network such as a LAN or the Internet. Also, if the bitstream generated by the image encoding device is stored in an external device such as a server device, the separation / decoding unit 202 may acquire the bitstream from the external device.
[0113] The separation / decoding unit 202 then separates, from the acquired bitstream, information related to the decoding process and coded data related to coefficients, and further separates the coded data present in the header portion of the bitstream.
[0114] For example, the separation / decoding unit 202 extracts the coded data of the quantization matrices shown in Figures 8A to 8C from the sequence header of the bitstream shown in Figure 6A, and supplies the extracted coded data of the quantization matrices to the decoding unit 209.
[0115] Furthermore, the separation / decoding unit 202 extracts the coded image data, in subblock units of each basic block, from the picture data of the bitstream shown in Figure 6A, and supplies the extracted coded data to the decoding unit 203. In this way, the separation / decoding unit 202 performs the reverse of the operation of the integrated encoding unit 111 described above.
[0116] The decoding unit 209 decodes the coded data of the quantization matrices shown in Figures 8A to 8C supplied from the separation / decoding unit 202 and reconstructs the one-dimensional difference matrices shown in Figures 10A to 10C. In this embodiment, as in the first embodiment, the coded data of the quantization matrices is decoded using the coding table shown in Figures 11A and 11B, but the coding table is not limited to this, and other coding tables may be used as long as they are the same as those used in the first embodiment.
[0117] The decoding unit 209 then reconstructs each two-dimensional quantization matrix from the reconstructed one-dimensional difference matrix by reversing the process by which the encoding unit 113 generated the one-dimensional difference matrix from the quantization matrix. In this way, the decoding unit 209 reconstructs the quantization matrices shown in Figures 8A to 8C from the difference matrices shown in Figures 10A to 10C, respectively. The decoding unit 203 decodes the coded data supplied from the separation / decoding unit 202 and reconstructs the quantization coefficients and prediction information.
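The reconstruction of a two-dimensional quantization matrix by the decoding unit 209 can be sketched as the inverse of the scan-and-difference step. As before, this is an illustration under assumptions: the up-right diagonal scan stands in for the scan order of Figure 9, the initial predictor value of 8 is assumed, and the function names are hypothetical. The only hard requirement stated in the text is that the decoder mirrors whatever the encoder used.

```python
def diagonal_scan(n):
    """Return the (row, col) positions of an n x n matrix in an
    up-right diagonal order (an assumed stand-in for Figure 9)."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        for row in range(min(s, n - 1), max(0, s - n + 1) - 1, -1):
            order.append((row, s - row))
    return order


def from_difference_list(diffs, n, initial=8):
    """Accumulate the differences and place each value at its scan position."""
    matrix = [[0] * n for _ in range(n)]
    value = initial
    for (row, col), diff in zip(diagonal_scan(n), diffs):
        value += diff
        matrix[row][col] = value
    return matrix


# Placeholder difference list, not the values of Figures 10A to 10C.
print(from_difference_list([0, 1, 1, 2], 2))
```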
[0118] The inverse quantization / inverse transformation unit 204 selects, from the quantization matrices reconstructed by the decoding unit 209, the quantization matrix to be used for the inverse quantization of the quantization coefficients of the subblock reconstructed by the decoding unit 203. More specifically, the inverse quantization / inverse transformation unit 204 refers to the prediction information reconstructed by the decoding unit 203 to identify the prediction mode of the block to be decoded, and selects the quantization matrix corresponding to that prediction mode from the quantization matrices reconstructed by the decoding unit 209 as the quantization matrix to be used for the inverse quantization of the quantization coefficients of the block to be decoded.
[0119] For example, if the prediction mode of the block to be decoded is intra prediction, the inverse quantization / inverse transformation unit 204 selects the quantization matrix shown in Figure 8A from among the quantization matrices shown in Figures 8A to 8C as the quantization matrix to be used for inverse quantization of the quantization coefficients of the block to be decoded.
[0120] For example, if the prediction mode of the block to be decoded is inter prediction, the inverse quantization / inverse transformation unit 204 selects the quantization matrix shown in Figure 8B from among the quantization matrices shown in Figures 8A to 8C as the quantization matrix to be used for inverse quantization of the quantization coefficients of the block to be decoded.
[0121] For example, if the prediction mode of the block to be decoded is template matching prediction, the inverse quantization / inverse transformation unit 204 selects the quantization matrix shown in Figure 8C from among the quantization matrices shown in Figures 8A to 8C as the quantization matrix to be used for inverse quantization of the quantization coefficients of the block to be decoded. Note that the quantization matrices used for inverse quantization are not limited to the three types mentioned above; any matrices are acceptable as long as the same quantization matrix is used for the same prediction mode on both the encoding and decoding sides.
[0122] The inverse quantization / inverse transformation unit 204 then reconstructs the transformation coefficients by inverse quantizing the quantization coefficients of the subblock reconstructed by the decoding unit 203 using a quantization matrix selected for the subblock, and then reconstructs the prediction error by inverse orthogonal transformation of these transformation coefficients.
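The inverse quantization performed here can be sketched as follows. This is a simplified illustration in which each matrix element is treated directly as a quantization step and any scaling by a quantization parameter is ignored; the function name, the mode labels, and the 2x2 matrices are hypothetical placeholders. The inverse orthogonal transformation that follows is not shown.

```python
def dequantize(levels, prediction_mode, matrices):
    """Multiply each quantization coefficient by the step of the matrix
    selected for the subblock's prediction mode."""
    matrix = matrices[prediction_mode]
    return [[level * step for level, step in zip(level_row, step_row)]
            for level_row, step_row in zip(levels, matrix)]


# Placeholder matrices keyed by prediction mode (assumed labels).
matrices = {
    "intra": [[8, 16], [16, 32]],
    "inter": [[8, 12], [12, 24]],
    "template_matching": [[8, 8], [8, 16]],
}
print(dequantize([[13, -2], [0, 1]], "intra", matrices))
```

The reconstructed coefficients match the originals only up to the quantization step, so selecting a different matrix than the encoder used would distort every frequency component; this is why both sides must agree on the matrix for each prediction mode.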
[0123] The image playback unit 205 appropriately refers to the frame memory 206 based on the prediction information reconstructed by the decoding unit 203, performs prediction processing for each subblock, and generates a predicted image for that subblock. The image playback unit 205 identifies the prediction mode of the subblock by referring to the prediction information, and refers to the frame memory 206 to perform prediction processing according to the identified prediction mode to generate a predicted image for that subblock. As described in the first embodiment, there are three types of prediction modes: intra prediction, inter prediction, and template matching prediction. The image playback unit 205 generates a predicted image for that subblock by performing the prediction processing corresponding to the prediction mode of the subblock from among these three types of prediction processing. Each of these three types of prediction processing is performed in the same manner as in the prediction unit 104.
[0124] The image playback unit 205 then adds the predicted image of the subblock to the prediction error of the subblock reproduced by the inverse quantization / inverse transformation unit 204 to generate a reproduced image of the subblock, and stores the generated reproduced image in the frame memory 206. The stored reproduced image becomes a prediction reference candidate when decoding other subblocks.
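The addition of the predicted image and the prediction error can be sketched as follows. The clipping to the valid sample range and the default bit depth of 8 are assumptions added for illustration; the text above specifies only the addition. The function name is hypothetical.

```python
def reconstruct_subblock(predicted, prediction_error, bit_depth=8):
    """Add the prediction error to the predicted image, clipping each sample
    to the range representable at the given bit depth (assumed behaviour)."""
    max_value = (1 << bit_depth) - 1
    return [[min(max(p + e, 0), max_value) for p, e in zip(pred_row, err_row)]
            for pred_row, err_row in zip(predicted, prediction_error)]


# Placeholder 2x2 data.
print(reconstruct_subblock([[250, 3], [100, 100]], [[10, -5], [4, -4]]))
```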
[0125] The in-loop filter unit 207, like the in-loop filter unit 109, performs in-loop filtering, such as deblocking filtering and sample adaptive offset, on the reproduced image stored in the frame memory 206, and then stores the filtered reproduced image back into the frame memory 206.
[0126] The control unit 250 controls the operation of the entire image decoding device. For example, the control unit 250 controls the operation of each of the above-mentioned functional units in the image decoding device. As a result, each of the above-mentioned functional units in the image decoding device operates under the control of the control unit 250.
[0127] Next, the process performed by the image decoding device to decode the bitstream of one frame of a moving image will be explained according to the flowchart in Figure 4. Since the details of the process at each step are as described above, only a brief explanation is given below.
[0128] In step S401, the separation / decoding unit 202 performs the various separations described above on the bitstream generated by the image encoding device according to the first embodiment to extract the coded data of the quantization matrices and the coded data of the image. The separation / decoding unit 202 then supplies the coded data of the quantization matrices to the decoding unit 209 and the coded data of the image to the decoding unit 203.
[0129] In step S402, the decoding unit 209 decodes the coded data of the quantization matrices supplied from the separation / decoding unit 202 to reconstruct the one-dimensional difference matrices, and then reconstructs the two-dimensional quantization matrices from the reconstructed one-dimensional difference matrices. In step S403, the decoding unit 203 decodes the coded data of the image supplied from the separation / decoding unit 202 to reconstruct the quantization coefficients and prediction information.
[0130] In step S404, the inverse quantization / inverse transformation unit 204 selects, from the quantization matrices reconstructed by the decoding unit 209 in step S402, the quantization matrix to be used for inverse quantization of the quantization coefficients of the subblock reconstructed by the decoding unit 203 in step S403. The inverse quantization / inverse transformation unit 204 then reconstructs the transformation coefficients by inverse quantization of the quantization coefficients of the subblock reconstructed by the decoding unit 203 using the quantization matrix selected for the subblock, and then reconstructs the prediction error by inverse orthogonal transformation of the transformation coefficients.
[0131] In step S405, the image playback unit 205 appropriately refers to the frame memory 206 based on the prediction information reproduced by the decoding unit 203 in step S403, performs prediction processing for each subblock, and generates (plays back) the predicted image for that subblock.
[0132] The image playback unit 205 then adds the predicted image of the subblock to the prediction error of the subblock reproduced by the inverse quantization / inverse transformation unit 204 in step S404 to generate a reproduced image of the subblock, and stores the generated reproduced image in the frame memory 206.
[0133] In step S406, the control unit 250 determines whether the processing in steps S403 to S405 has been performed for all basic blocks in the frame. If the result of this determination is that the processing in steps S403 to S405 has been performed for all basic blocks in the frame, the process proceeds to step S407. On the other hand, if there are still basic blocks in the frame that have not yet undergone the processing in steps S403 to S405 (unprocessed basic blocks), the process returns to step S403, and the processing in steps S403 to S405 is performed for the unprocessed basic blocks.
[0134] In step S407, the in-loop filter unit 207 performs in-loop filtering on the reproduced image stored in the frame memory 206 in step S405, and stores the reproduced image that has undergone the in-loop filtering back into the frame memory 206.
[0135] Furthermore, the handling of the in-loop filtered reproduced image stored in the frame memory 206 is not limited to any specific method. For example, the control unit 250 may transmit the in-loop filtered reproduced image stored in the frame memory 206 to an external device via a network such as a LAN or the Internet, or it may display it on a display screen of the image decoding device.
[0136] Thus, according to this embodiment, it is possible to decode a bitstream generated by the image encoding device according to the first embodiment in which subjective image quality is improved by controlling quantization for each frequency component, even for subblocks using template matching prediction.
[0137] In this embodiment, quantization matrices are defined individually for the three types of prediction methods (intra prediction, inter prediction, and template matching prediction), and all three types of quantization matrices are decoded. However, some of these matrices may be shared.
[0138] For example, subblocks using template matching prediction may be inverse quantized using the quantization matrix in Figure 8A, in the same way as subblocks using intra prediction. In this case, the quantization matrix in Figure 8C becomes unnecessary, and its decoding can be omitted. This allows decoding of a bitstream in which the code amount of the quantization matrix in Figure 8C is saved while image quality degradation caused by errors in prediction that uses pixels within the same frame, such as block distortion, is mitigated.
[0139] Furthermore, for example, subblocks using template matching prediction may be inverse quantized using the quantization matrix in Figure 8B, in the same way as subblocks using inter prediction. In this case as well, the quantization matrix in Figure 8C becomes unnecessary, and its decoding can be omitted. This allows decoding of a bitstream in which the code amount of the quantization matrix in Figure 8C is saved while image quality degradation caused by errors in prediction that uses block-shaped pixel groups, such as jerky motion, is reduced.
[0140] Furthermore, in this embodiment, the quantization matrix for a subblock using template matching prediction is uniquely determined, but it may instead be made selectable by introducing an identifier, as explained in the first embodiment.
[0141] This makes it possible to decode bitstreams in which either the reduction of the quantization matrix code amount or dedicated quantization control for subblocks using template matching prediction has been selected.
[0142] Furthermore, although this embodiment uses only three types of prediction methods (intra prediction, inter prediction, and template matching prediction), it is not limited to these; for example, intra-inter composite prediction (CIIP), which is used in VVC, may also be used. In this case, the quantization matrix used for subblocks using template matching prediction and the quantization matrix used for subblocks using intra-inter composite prediction can be made common. This makes it possible to decode a bitstream whose code amount is reduced by that of the quantization matrix corresponding to the new prediction method. Note that there is no particular restriction on which of the two, the quantization matrix used for subblocks using template matching prediction or the one used for subblocks using intra-inter composite prediction, is adopted as the common matrix.
[0143] Furthermore, although this embodiment describes a case where the frame is an image, the frame is not limited to an image. For example, a bitstream encoded with a two-dimensional array of feature data used in machine learning, such as object recognition, may be used as the target for decoding. In this case, each element in this two-dimensional array can be treated as a pixel and the embodiment can be applied accordingly. This makes it possible to efficiently decode a bitstream encoded with feature data used in machine learning.
[0144] Note that the encoding device according to the first embodiment and the decoding device according to the second embodiment may be separate devices. Alternatively, a single device may be configured having the above-described functions of the encoding device according to the first embodiment and the above-described functions of the decoding device according to the second embodiment.
[0145] Furthermore, the encoding device may also have an imaging unit for capturing moving images, in which case the encoding device can be implemented as an imaging device capable of capturing moving images. Such an imaging device may further have the functions of a decoding device according to the second embodiment.
[0146] [Third Embodiment] Each of the functional units shown in Figures 1 and 2 may be implemented in hardware, or the functional units excluding frame memory 108 and frame memory 206 may be implemented in software (computer program). In the latter case, a computer device capable of executing such a computer program can be applied to an encoding device or a decoding device. Such a computer device can be a PC, smartphone, tablet terminal, or other computer device. An example of the hardware configuration of such a computer device will be explained using the block diagram in Figure 5.
[0147] The CPU 501 executes various processes using computer programs and data stored in the RAM 502. In doing so, the CPU 501 controls the operation of the entire computer system and also executes or controls the various processes described as being performed by the encoding and decoding devices.
[0148] The RAM 502 has an area for storing computer programs and data loaded from the ROM 503 and storage device 506, and an area for storing computer programs and data received from an external device via the I / F 507. Furthermore, the RAM 502 has a work area used by the CPU 501 when executing various processes. In this way, the RAM 502 can provide various areas as appropriate.
[0149] ROM 503 stores configuration data for the computer device, computer programs and data related to the startup of the computer device, computer programs and data related to the basic operation of the computer device, and so on.
[0150] The operation unit 504 is a user interface such as a keyboard, mouse, or touch panel, which allows the user to input various instructions and information to the computer device through operation.
[0151] The display unit 505 has an LCD screen or a touch panel screen and can display the processing results of the CPU 501 as images, text, etc. The display unit 505 may also be a projection device such as a projector that projects images and text.
[0152] The storage device 506 is a large-capacity information storage device such as a hard disk drive or an SSD. The storage device 506 stores the OS (operating system) as well as the computer programs and data that the CPU 501 uses to execute or control the various processes described as being performed by the encoding device and decoding device. The data stored in the storage device 506 may include frames to be encoded and bitstreams to be decoded. The frame memory 108 and frame memory 206 described above can be implemented using memory devices such as the RAM 502 and the storage device 506.
[0153] The I/F 507 may include a communication interface through which the computer device exchanges data with external devices via a network such as a LAN or the Internet. The I/F 507 may also include an interface for connecting external devices such as display devices, memory devices, and projectors to the computer device.
[0154] The CPU 501, RAM 502, ROM 503, operation unit 504, display unit 505, storage device 506, and I/F 507 are all connected to the system bus 508. Note that the hardware configuration shown in Figure 5 is merely one example of a hardware configuration for a computer device applicable to an encoding device or decoding device, and can be modified or changed as appropriate.
[0155] In the above configuration, when the power to the computer device is turned ON, the CPU 501 executes the boot program in the ROM 503, loads the OS stored in the storage device 506 into the RAM 502, and starts the OS. As a result, the computer device becomes capable of communication via the I/F 507 and can function as an encoding device and a decoding device. Specifically, under the control of the OS, the CPU 501 loads an application related to encoding (an application corresponding to the flowchart in Figure 3) from the storage device 506 into the RAM 502 and executes it, so that the CPU 501 functions as each of the functional units in Figure 1 (excluding the frame memory 108), and the computer device functions as an encoding device. Likewise, the CPU 501 loads an application related to decoding (an application corresponding to the flowchart in Figure 4) from the storage device 506 into the RAM 502 and executes it, so that the CPU 501 functions as each of the functional units in Figure 2 (excluding the frame memory 206), and the computer device functions as a decoding device.
[0156] The numerical values, processing timing, processing order, processing entity, data (information) structure / acquisition method / destination / source / storage location, etc., used in the above embodiments are given as examples for the purpose of providing a concrete explanation, and are not intended to limit the scope to such examples.
[0157] Furthermore, some or all of the embodiments described above may be used in appropriate combinations. Alternatively, some or all of the embodiments described above may be used selectively.
[0158] (Other Embodiments) The present invention can also be realized by supplying a program that implements one or more of the functions of the above embodiments to a system or device via a network or storage medium, and by having one or more processors in the computer of that system or device read and execute the program. It can also be realized by a circuit (e.g., ASIC) that implements one or more functions.
[0159] The technical ideas derived from this disclosure are not limited to the exemplary embodiments disclosed, but are intended to encompass various modifications of the exemplary embodiments, or substitutions with equivalent structures or functions. The scope of the following claims should be interpreted in the broadest way to encompass all such modifications and equivalent structures and functions.
[0160] This application claims priority based on Japanese Patent Application No. 2024-229379, filed on 25 December 2024, and all of its contents are incorporated herein by reference.
Claims
1. An encoding device comprising: prediction means for searching for a group of pixels from a group of encoded pixels in a frame that is determined to be similar to a group of pixels adjacent to a block to be encoded in the frame, generating a predicted image composed of the encoded pixels of the frame based on the searched group of pixels, and calculating a prediction error using the predicted image; conversion means for frequency-converting the prediction error to generate conversion coefficients; quantization means for quantizing the conversion coefficients using a quantization matrix to generate quantization coefficients; and encoding means for encoding the quantization coefficients.
2. The encoding device according to claim 1, characterized in that the prediction means generates a block consisting of an encoded group of pixels in the frame that is adjacent to the searched group of pixels in the frame, as the predicted image.
3. The encoding device according to claim 1 or 2, characterized in that the quantization means quantizes the conversion coefficients using a quantization matrix corresponding to the prediction process for generating the predicted image.
4. The encoding device according to claim 3, characterized in that the quantization means quantizes the conversion coefficients using a quantization matrix corresponding to intra prediction.
5. The encoding device according to claim 3, characterized in that the quantization means quantizes the conversion coefficients using a quantization matrix corresponding to inter prediction.
6. The encoding device according to claim 3, characterized in that the quantization means quantizes the conversion coefficients using a quantization matrix corresponding to combined intra-inter prediction.
7. The encoding device according to any one of claims 1 to 6, characterized in that the encoding means encodes the quantization matrix.
8. The encoding device according to claim 7, characterized in that the encoding means generates a bitstream including the code data of the quantization coefficients and the code data of the quantization matrix.
9. The encoding device according to any one of claims 1 to 8, characterized in that the frame is a two-dimensional array of feature data.
10. A decoding device for decoding a bitstream generated by an encoding device according to claim 8, comprising: decoding means for decoding the quantization coefficients and the quantization matrix from the bitstream; and generating means for generating a frame using the quantization coefficients and the quantization matrix decoded by the decoding means.
11. An encoding method performed by an encoding device, comprising: a prediction step in which the prediction means of the encoding device searches for a group of pixels from a group of encoded pixels in a frame that is determined to be similar to a group of pixels adjacent to a block to be encoded in the frame, generates a predicted image composed of the encoded pixels of the frame based on the searched group of pixels, and calculates a prediction error using the predicted image; a conversion step in which the conversion means of the encoding device frequency-converts the prediction error to generate conversion coefficients; a quantization step in which the quantization means of the encoding device quantizes the conversion coefficients using a quantization matrix to generate quantization coefficients; and an encoding step in which the encoding means of the encoding device encodes the quantization coefficients.
12. A decoding method performed by a decoding device that decodes a bitstream generated by the encoding method according to claim 11, comprising: a decoding step in which the decoding means of the decoding device decodes the quantization coefficients and the quantization matrix from the bitstream; and a generation step in which the generation means of the decoding device generates a frame using the quantization coefficients and the quantization matrix decoded in the decoding step.
13. A computer program for causing a computer to function as each of the means of the encoding device according to any one of claims 1 to 9.
14. A computer program for causing a computer to function as each of the means of the decoding device described in claim 10.
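The processing recited in claims 1 and 11 (template search, prediction error, frequency conversion, quantization with a quantization matrix) and the reconstruction implied by claims 10 and 12 can be illustrated with the following minimal Python sketch. It is not the implementation of the embodiments. All function names are hypothetical, and several choices are assumptions not stated in this section: the sum of absolute differences as the similarity measure, an L-shaped template one pixel wide, a search range limited to rows entirely above the block to be encoded, an orthonormal DCT-II as the frequency conversion, and a matrix value of 16 meaning "no additional weighting".

```python
import math

def sad(a, b):
    """Sum of absolute differences (assumed similarity measure)."""
    return sum(abs(p - q) for p, q in zip(a, b))

def template(frame, y, x, n):
    """L-shaped group of pixels adjacent to the n x n block at (y, x)."""
    top = [frame[y - 1][x + i] for i in range(-1, n)]
    left = [frame[y + j][x - 1] for j in range(n)]
    return top + left

def predict(frame, y, x, n):
    """Search coded rows above the block for the most similar template,
    then use the block adjacent to that template as the predicted image."""
    target = template(frame, y, x, n)
    best = None
    for cy in range(1, y - n + 1):          # candidates fully above the block
        for cx in range(1, len(frame[0]) - n + 1):
            cost = sad(target, template(frame, cy, cx, n))
            if best is None or cost < best[0]:
                best = (cost, cy, cx)
    if best is None:
        raise ValueError("no coded candidate region above the block")
    _, cy, cx = best
    return [row[cx:cx + n] for row in frame[cy:cy + n]], (cy, cx)

def _basis(n):
    """Orthonormal DCT-II basis vectors (assumed frequency conversion)."""
    return [[(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
             * math.cos((2 * i + 1) * k * math.pi / (2 * n))
             for i in range(n)] for k in range(n)]

def dct2(block):
    n, b = len(block), _basis(len(block))
    return [[sum(b[u][j] * b[v][i] * block[j][i]
                 for j in range(n) for i in range(n))
             for v in range(n)] for u in range(n)]

def idct2(coef):
    n, b = len(coef), _basis(len(coef))
    return [[sum(b[u][j] * b[v][i] * coef[u][v]
                 for u in range(n) for v in range(n))
             for i in range(n)] for j in range(n)]

def quantize(coef, qm, qstep):
    """Per-frequency step size: qstep scaled by qm[u][v] / 16 (assumption)."""
    return [[round(c * 16 / (qm[u][v] * qstep)) for v, c in enumerate(row)]
            for u, row in enumerate(coef)]

def dequantize(levels, qm, qstep):
    return [[l * qm[u][v] * qstep / 16 for v, l in enumerate(row)]
            for u, row in enumerate(levels)]

def encode_block(frame, y, x, n, qm, qstep):
    pred, pos = predict(frame, y, x, n)
    resid = [[frame[y + j][x + i] - pred[j][i] for i in range(n)]
             for j in range(n)]
    return quantize(dct2(resid), qm, qstep), pred, pos

def reconstruct(pred, levels, qm, qstep):
    resid = idct2(dequantize(levels, qm, qstep))
    n = len(pred)
    return [[pred[j][i] + resid[j][i] for i in range(n)] for j in range(n)]

# Hypothetical 12 x 12 frame whose rows repeat with period 6; the block to
# be encoded at (7, 4) differs from its look-alike by a constant offset of 8.
frame = [[(r % 6) * 10 + c * 3 for c in range(12)] for r in range(12)]
for j in range(4):
    for i in range(4):
        frame[7 + j][4 + i] += 8
# Hypothetical matrix: coarser quantization toward higher frequencies.
qm = [[16 + 4 * (u + v) for v in range(4)] for u in range(4)]
levels, pred, pos = encode_block(frame, 7, 4, 4, qm, qstep=4)
recon = reconstruct(pred, levels, qm, qstep=4)
```

In this sketch the quantization matrix is passed in as a parameter, so a matrix corresponding to intra prediction, inter prediction, or combined intra-inter prediction (claims 4 to 6) could be supplied by the caller without changing the rest of the pipeline.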