Method and device of improving local illumination compensating for video encoding and decoding, and apparatus
The PDLIC method improves local illumination compensation by adapting templates based on position-dependent reference blocks, enhancing coding efficiency and reducing distortion in video encoding and decoding.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SHENZHEN TCL NEW-TECH CO LTD
- Filing Date
- 2024-12-21
- Publication Date
- 2026-06-25
AI Technical Summary
Conventional local illumination compensation (LIC) models are inadequate for complex illumination scenarios, leading to inaccurate parameter derivation and reduced coding efficiency due to insufficient template similarity and data availability.
A Position-Dependent Local Illumination Compensating (PDLIC) method that adapts the template based on position information, determining reference blocks and areas to improve parameter accuracy and fit complex illumination patterns.
Enhances coding efficiency by better fitting complex illumination patterns with minimal additional bit consumption, resulting in improved prediction performance and reduced distortion.
Description
METHOD AND DEVICE OF IMPROVING LOCAL ILLUMINATION COMPENSATING FOR VIDEO ENCODING AND DECODING, AND APPARATUS
TECHNICAL FIELD
[0001] The present invention relates to the field of video encoding and decoding, and more particularly to a method and a device of improving local illumination compensating for video encoding and decoding.
BACKGROUND
[0002] Video coding (video encoding and decoding) is widely used in digital video applications such as broadcast digital television, video distribution over the internet and mobile networks, real-time conversational applications such as video conferencing and video chat, DVD and Blu-ray discs, and camcorders for security applications.
[0003] With the development of block-based video coding in the H.261 standard, video coding standards including MPEG-1 video, MPEG-2 video, ITU-T H.262 / MPEG-2, ITU-T H.263, ITU-T H.264 / MPEG-4 Part 10 Advanced Video Coding (AVC), ITU-T H.265 / High Efficiency Video Coding (HEVC), and extensions of such standards, such as scalability and / or 3D (three-dimensional) extensions, have evolved. As video creation and use become more widespread, video traffic has become the biggest burden on communication networks and data storage. One of the goals of most video coding standards is therefore to reduce the bit rate without sacrificing picture quality compared to previous standards.
[0004] In the traditional motion-based inter-prediction, a coding block is reconstructed by combining the residuals with a prediction block generated from the reference blocks. This is also called motion compensation. In general, coding efficiency is related to the similarity between the coding block and the prediction block, which implies the distortion level.
[0005] Currently, most contributions model the local illumination variation with a one-dimensional linear model. This model may be efficient enough for some typical lighting change scenarios. However, local illumination variation in reality can be more complicated, especially for large blocks. A more accurate model is desired to further improve the performance of local illumination compensation (LIC). That is, conventional LIC is inadequate for fitting complicated local illumination variation.
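As a point of reference, the one-dimensional linear model mentioned above can be sketched as follows. This is a minimal illustration rather than the method of the disclosure: it assumes the commonly used form pred' = a * pred + b, fits a and b by ordinary least squares over paired neighbouring samples, and uses hypothetical helper names (`fit_linear_lic`, `apply_linear_lic`) and made-up sample values.

```python
import numpy as np

def fit_linear_lic(ref_samples, cur_samples):
    """Least-squares fit of cur ~ a * ref + b (hypothetical helper)."""
    x = np.asarray(ref_samples, dtype=np.float64).ravel()
    y = np.asarray(cur_samples, dtype=np.float64).ravel()
    n = x.size
    denom = n * np.sum(x * x) - np.sum(x) ** 2
    if denom == 0:
        # Flat reference samples: fall back to a pure offset model.
        return 1.0, float(np.mean(y) - np.mean(x))
    a = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / denom
    b = (np.sum(y) - a * np.sum(x)) / n
    return float(a), float(b)

def apply_linear_lic(pred_block, a, b, bit_depth=8):
    """Apply the linear model to a prediction block and clip to the sample range."""
    out = a * np.asarray(pred_block, dtype=np.float64) + b
    return np.clip(np.rint(out), 0, (1 << bit_depth) - 1).astype(np.int64)

# Illustrative neighbouring samples: the current picture is brighter by 10% plus 5.
ref_nb = np.array([100, 110, 120, 130])
cur_nb = 1.1 * ref_nb + 5
a, b = fit_linear_lic(ref_nb, cur_nb)
compensated = apply_linear_lic(np.array([[100, 120], [110, 130]]), a, b)
```

A single pair (a, b) is applied to every sample of the block, which is why such a model cannot follow illumination that varies across a large block.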
[0006] Furthermore, the template is crucial for LIC because the parameters of the model are derived from the samples in the template. The template is a previously encoded area, which is supposed to be highly correlated with the coding block and will also be decoded before the current block in the decoding process. By using a template for parameter derivation, the parameters can be derived implicitly, and no bits are required to represent them in the bitstream. This raises another problem: the parameter accuracy may not be adequate if the content similarity between the current block and its template is low or if there is not enough data for parameter derivation. In the current design, the template for LIC is one adjacent line on the top of the block and one adjacent column on the left of the block, which leaves room for improvement.
SUMMARY
[0007] An object of the present disclosure is to propose a method and a device of improving local illumination compensating for video encoding and decoding, which can solve issues in the prior art and / or other issues.
[0008] In a first aspect of the present disclosure, a method for improving Local Illumination Compensating, applied to a video decoder, is provided. The method includes: enabling a Position-Dependent Local Illumination Compensating (PDLIC) for a current coding block; determining a template for the current coding block; determining a reference block related to the current coding block, and reference areas for the reference block, based on the current coding block and the template, wherein coordinates of reference samples of the reference block map to coordinates of samples of the current coding block; determining Local Illumination Compensation parameters based on samples of the template, reference samples of the reference areas, and position information of samples in the template; and performing the Position-Dependent Local Illumination Compensating for the current coding block based on the Local Illumination Compensation parameters.
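The parameter-determination step of the first aspect can be sketched as a least-squares fit that includes sample positions. The model form used here, cur ~ a * ref + b * x + c * y + d, is only an assumption chosen for illustration, since this summary does not state the model; the function names and sample values are likewise hypothetical.

```python
import numpy as np

def fit_position_dependent(ref, cur, xs, ys):
    """Fit cur ~ a*ref + b*x + c*y + d by least squares.

    The model form is an assumption for illustration only; the
    actual PDLIC model of the disclosure may differ.
    """
    ref, cur, xs, ys = (np.asarray(v, dtype=np.float64).ravel()
                        for v in (ref, cur, xs, ys))
    design = np.column_stack([ref, xs, ys, np.ones_like(ref)])
    params, *_ = np.linalg.lstsq(design, cur, rcond=None)
    return params  # a, b, c, d

def apply_position_dependent(pred_block, params, x0=0, y0=0, bit_depth=8):
    """Compensate a prediction block whose top-left sample sits at (x0, y0)."""
    a, b, c, d = params
    pred_block = np.asarray(pred_block, dtype=np.float64)
    h, w = pred_block.shape
    ys, xs = np.mgrid[y0:y0 + h, x0:x0 + w]
    out = a * pred_block + b * xs + c * ys + d
    return np.clip(np.rint(out), 0, (1 << bit_depth) - 1).astype(np.int64)

# Template positions: one row above (y = -1) and one column to the left (x = -1).
xs = np.array([0, 1, 2, 3, -1, -1, -1, -1])
ys = np.array([-1, -1, -1, -1, 0, 1, 2, 3])
ref_area = np.array([50, 60, 55, 70, 52, 65, 58, 61])
# Synthetic template samples with a brightness gradient across the block.
template = 1.0 * ref_area + 2.0 * xs + 1.0 * ys + 3.0
params = fit_position_dependent(ref_area, template, xs, ys)
compensated = apply_position_dependent(np.array([[50, 60], [55, 70]]), params)
```

Because the fit uses only template and reference-area samples, a decoder could repeat it without any parameters being signalled, consistent with the implicit derivation described in the background.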
[0009] In a second aspect of the present disclosure, a method for improving Local Illumination Compensating, applied to a video decoder, is provided. The method includes: determining whether a number of samples of an original template is less than a template adaption threshold; setting an expanded template as a template for a current coding block when the number of the samples of the original template is less than the template adaption threshold, wherein a number of samples of the expanded template equals or exceeds the template adaption threshold; determining a reference block related to the current coding block, and reference areas for the reference block, based on the current coding block and the template, wherein coordinates of reference samples of the reference block map to coordinates of samples of the current coding block; determining Local Illumination Compensation parameters based on samples of the template, reference samples of the reference areas, and position information of samples in the template; and performing the Position-Dependent Local Illumination Compensating for the current coding block based on the Local Illumination Compensation parameters.
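The threshold test of the second aspect can be illustrated with a small sketch. It assumes, purely for illustration, that the template is expanded by using more rows above and more columns to the left of the block, and that corner samples are not counted; the function names and the expansion rule are hypothetical.

```python
def template_sample_count(width, height, lines):
    """Samples in `lines` rows above plus `lines` columns left of a block."""
    return lines * (width + height)

def choose_template_lines(width, height, threshold):
    """Return how many template rows/columns to use (assumed expansion rule).

    One line is the original template; more lines are used only when
    the original template holds fewer samples than the threshold.
    """
    if template_sample_count(width, height, 1) >= threshold:
        return 1
    # Smallest line count whose sample count reaches the threshold
    # (ceiling division).
    return -(-threshold // (width + height))

small_block_lines = choose_template_lines(4, 4, threshold=16)
large_block_lines = choose_template_lines(16, 16, threshold=16)
```

With a threshold of 16 samples, a 4x4 block would use two lines (16 samples) while a 16x16 block would keep the original single line (32 samples).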
[0010] In a third aspect of the present disclosure, a video decoder is configured to perform the above methods.
[0011] In a fourth aspect of the present disclosure, a non-transitory machine-readable storage medium has stored thereon instructions that, when executed by a computer, cause the computer to perform the above method.
[0012] In a fifth aspect of the present disclosure, a chip includes a processor, configured to call and run a computer program stored in a memory, to cause a device in which the chip is installed to execute the above method.
[0013] In a sixth aspect of the present disclosure, a computer readable storage medium, in which a computer program is stored, causes a computer to execute the above method.
[0014] In a seventh aspect of the present disclosure, a computer program product includes a computer program, and the computer program causes a computer to execute the above method.
[0015] In an eighth aspect of the present disclosure, a computer program causes a computer to execute the above method.
[0016] In contrast to the prior art, the present disclosure proposes a Position-Dependent Local Illumination Compensating (PDLIC) method to cope with local illumination changes. By incorporating position parameters, the new models are expected to fit complicated illumination patterns better. The proposed template adaption scheme is intended to improve the fitting performance by improving parameter accuracy. By using PDLIC for the current coding block and determining a template for the current coding block, better prediction performance can be obtained in the picture / video coding process, and less distortion can be observed with little additional bit consumption. The present disclosure may thus improve the coding efficiency.
BRIEF DESCRIPTION OF DRAWINGS
[0017] In order to illustrate the embodiments of the present disclosure or the related art more clearly, the figures used in describing the embodiments are briefly introduced below. It is obvious that the drawings illustrate merely some embodiments of the present disclosure, and a person having ordinary skill in this field can obtain other figures according to these figures without creative effort.
[0018] FIG. 1 is a block diagram illustrating a video encoding and decoding system according to an embodiment of the present disclosure.
[0019] FIG. 2 illustrates a block diagram of a video encoder depicted in FIG. 1.
[0020] FIG. 3 illustrates a block diagram of a video decoder depicted in FIG. 1.
[0021] FIG. 4 illustrates a relationship of a current coding block, templates, a reference block, and reference areas.
[0022] FIG. 5 illustrates a flow chart of a method of improving Local Illumination Compensating applied to a video decoder according to a first embodiment of the present disclosure.
[0023] FIG. 6 illustrates a flow chart of a step S500 applied to a video decoder according to an embodiment of the present disclosure.
[0024] FIG. 7 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure.
[0025] FIG. 8 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure.
[0026] FIG. 9 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure.
[0027] FIG. 10 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure.
[0028] FIG. 11 illustrates a flow chart of a step S520 applied to a video decoder according to an embodiment of the present disclosure.
[0029] FIG. 12 illustrates an expanded template after performing template adaption.
[0030] FIG. 13 illustrates an example of expanded templates.
[0031] FIG. 14 is an example of a computing device according to an embodiment of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[0032] Embodiments of the present disclosure are described in detail below, covering the technical matters, structural features, achieved objects, and effects, with reference to the accompanying drawings. Specifically, the terminology used in the embodiments of the present disclosure is merely for the purpose of describing particular embodiments and is not intended to limit the disclosure.
[0033] Video coding refers to the processing of a sequence of images that form a video sequence. Video coding includes video encoding and video decoding. Video encoding is performed on the source side, involving compressing the original video image to reduce the amount of data needed to represent the video image (for more efficient storage and / or transmission) . Video decoding is performed on the destination side, involving inverse processing with respect to the encoder to reconstruct the video image.
[0034] For lossless video coding, the original video image may be reconstructed, i.e., the reconstructed video image has the same quality as the original video image. For lossy video coding, further compression (e.g., by quantization) is performed to reduce the amount of data representing video images that cannot be fully reconstructed at the decoder. That is, the quality of the reconstructed video images is lower or worse than the quality of the original video images.
[0035] Each picture of a video sequence is divided into a set of non-overlapping blocks. At the encoder, video is encoded by using intra-picture prediction and / or inter-picture prediction to generate a prediction block, subtracting the prediction block from a current block to obtain a residual block, transforming the residual block in the transform domain and quantizing the residual block to reduce the amount of data to be transmitted (compression) . At the decoder, inverse processing with respect to the encoder is applied to the encoded block or compressed block to reconstruct the current block for presentation. Furthermore, the encoder replicates the decoder processing loop so that both will generate the same predictions (e.g., intra-picture prediction and inter-picture prediction) and reconstructions to process the subsequent block.
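The block-level loop described above (predict, subtract, transform, quantize, then reconstruct exactly as the decoder would) can be sketched as follows. This is a simplified illustration under stated assumptions: square blocks, an orthonormal DCT-II as the transform, a single uniform quantization step, and 8-bit samples. None of these choices are taken from the disclosure, and the function names and sample values are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)
    return m

def encode_block(cur, pred, qstep):
    """Residual -> 2-D transform -> uniform quantization (integer levels)."""
    d = dct_matrix(cur.shape[0])
    residual = cur.astype(np.float64) - pred
    coeffs = d @ residual @ d.T
    return np.rint(coeffs / qstep).astype(np.int64)

def reconstruct_block(levels, pred, qstep):
    """Inverse processing, run identically by the decoder and inside the encoder."""
    d = dct_matrix(levels.shape[0])
    residual = d.T @ (levels * qstep) @ d
    return np.clip(np.rint(pred + residual), 0, 255).astype(np.int64)

cur = np.array([[52, 55, 61, 66],
                [70, 61, 64, 73],
                [63, 59, 55, 90],
                [67, 61, 68, 104]])
pred = np.full((4, 4), 64.0)          # stand-in for an intra or inter prediction
levels = encode_block(cur, pred, qstep=4.0)
recon = reconstruct_block(levels, pred, qstep=4.0)
```

Because the encoder calls the same `reconstruct_block` as the decoder, both sides would hold identical reconstructed samples for predicting subsequent blocks, even though `recon` differs slightly from `cur` after quantization.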
[0036] As used in the present disclosure, the term “video coder” refers to both video encoders and video decoders. Similarly, the terms “video coding” or “coding” may refer to video encoding or video decoding. The terms “picture”, “frame”, and “image” may be used as synonyms in the field of video coding. It should be assumed that techniques described with reference to coding may be performed by either a video encoder or a video decoder. In some portions of this application, certain techniques may be described with reference to video decoding or to a video decoder. It should not be assumed, however, that such techniques are not applicable to video encoding or may not be performed by a video encoder. Such techniques may, for example, be performed as part of determining how to encode video data or may be performed as part of a video decoding loop in a video encoder. As used in this disclosure, the term “current coding block” refers to a block currently being coded, as opposed to a block that is already coded or yet to be coded. Similarly, a current coding unit (CU), prediction unit (PU), or transform unit (TU) refers to a coding unit, prediction unit, or transform unit that is currently being coded.
[0037] FIG. 1 is a block diagram illustrating a video encoding and decoding system 1 according to an embodiment of the present disclosure. As shown in FIG. 1, the video encoding and decoding system 1 includes a source device 10 that provides encoded video data to be decoded by a destination device 200. In particular, the source device 10 provides the video data to the destination device 200 via a direct communication link (e.g., a direct wired connection or a wireless connection) or any type of network (e.g., a wired network or a wireless network or any combination thereof, or any type of private and public networks, or any combination thereof) .
[0038] The source device 10 and the destination device 200 may be any of a wide range of devices, such as desktop computers, notebook computers, mobile phones, tablet computers, display devices, televisions, cameras, digital media players, video gaming consoles, and video streaming devices. The source device 10 and the destination device 200 may be equipped for wireless communication. Thus, the source device 10 and the destination device 200 may be wireless communication devices. The source device 10 is a video encoding device for encoding video data. The destination device 200 is a video decoding device for decoding video data.
[0039] As illustrated in FIG. 1, the source device 10 includes a video source 102, a memory 104 configured to store video data, a video encoder 110, and an output interface 108. The destination device 200 includes an input interface 208, a memory 204 configured to store encoded video data, a video decoder 220, and a display device 202. According to the embodiment of the present disclosure, the video encoder 110 of the source device 10 and the video decoder 220 of the destination device 200 may be configured to: classify, by processing circuitry, luma samples of a neighboring luma block of a reference block and luma samples of a neighboring luma block of a current block into a plurality of groups; derive, by the processing circuitry, one or more local illumination compensation parameters for each group of the plurality of groups to generate a plurality of local illumination compensation parameters for the current block; derive, by the processing circuitry, a plurality of linear models between the neighboring luma block of the reference block and the neighboring luma block of the current block using the plurality of local illumination compensation parameters for the current block; and generate, by the processing circuitry, a prediction block using the plurality of linear models. Thus, the source device 10 is a video encoding device, while the destination device 200 is a video decoding device.
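The group-wise derivation described above (classify neighbouring luma samples, fit one linear model per group, then predict with the model of each sample's group) can be sketched as follows. The classification rule used here, equal-width intensity bins of the reference samples, is an assumption for illustration; the paragraph above does not specify how samples are grouped, and all names and values are hypothetical.

```python
import numpy as np

def fit_group_models(ref_nb, cur_nb, num_groups=2):
    """Fit one linear model (a, b) per intensity group of the neighbouring samples."""
    ref_nb = np.asarray(ref_nb, dtype=np.float64).ravel()
    cur_nb = np.asarray(cur_nb, dtype=np.float64).ravel()
    # Assumed classification rule: equal-width bins over the reference range.
    edges = np.linspace(ref_nb.min(), ref_nb.max(), num_groups + 1)[1:-1]
    groups = np.digitize(ref_nb, edges)
    models = []
    for g in range(num_groups):
        mask = groups == g
        if mask.sum() >= 2 and np.ptp(ref_nb[mask]) > 0:
            a, b = np.polyfit(ref_nb[mask], cur_nb[mask], 1)
        else:
            a, b = 1.0, 0.0  # identity fallback when a group has too little data
        models.append((float(a), float(b)))
    return edges, models

def predict_with_groups(pred_block, edges, models):
    """Apply to each sample the linear model of the group it falls into."""
    pred_block = np.asarray(pred_block, dtype=np.float64)
    groups = np.digitize(pred_block, edges)
    a = np.array([m[0] for m in models])[groups]
    b = np.array([m[1] for m in models])[groups]
    return a * pred_block + b

# Dark samples get brighter by 10; bright samples follow 0.5 * ref + 100.
ref_nb = np.array([20, 30, 40, 50, 200, 210, 220, 230])
cur_nb = np.array([30, 40, 50, 60, 200, 205, 210, 215])
edges, models = fit_group_models(ref_nb, cur_nb)
prediction = predict_with_groups(np.array([[25, 45], [205, 225]]), edges, models)
```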
[0040] In other examples, the source device 10 and the destination device 200 include other components or arrangements. For example, the source device 10 may receive video data from an external video source, such as an external camera. The destination device 200 may interface with an external display device, rather than including an integrated display device.
[0041] The system 1 is merely one example. Techniques for processing video data may be performed by any digital video encoding and / or decoding device. Although generally the techniques of this disclosure are performed by a video encoding device, the techniques may also be performed by a video encoder / decoder, typically referred to as a “CODEC.” The source device 10 and the destination device 200 are merely examples of such coding devices in which the source device 10 generates coded video data for transmission to the destination device 200. In some embodiments, the source device 10 and the destination device 200 may operate in a substantially symmetrical manner such that each of the source device 10 and the destination device 200 includes video encoding and decoding components. Hence, the system 1 may support one-way or two-way video transmission between the source device 10 and the destination device 200, e.g., for video streaming, video playback, video broadcasting, or video telephony.
[0042] The video source 102 of the source device 10 may include a video capture device, such as a video camera, a video archive containing previously captured video, and / or a video feed interface to receive video data from a video content provider. The video source 102 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. The source device 10 may comprise one or more data storage media (e.g., memory 104) configured to store the video data. The techniques described in this disclosure may be applicable to video coding in general, and may be applied to wireless and / or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by the video encoder 110. The output interface 108 may be used, for example, to package the encoded image data into a suitable format (e.g., packets) and / or to process the encoded image data for transmission over a communication link or network using any type of transmission encoding or processing.
[0043] The input interface 208 (e.g., a receiver) forms a corresponding part of the output interface 108 and may be used to receive the transmitted data and to process the transmitted data using any type of corresponding transmission decoding or processing and / or unpacking to obtain the encoded image data.
[0044] In some embodiments, encoded data may be output from output interface 108 to a storage device. Similarly, encoded data may be accessed from the storage device by input interface. The storage device may include any of a variety of distributed or locally accessed data storage media such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In a further example, the storage device may correspond to a file server or another intermediate storage device that may store the encoded video generated by source device 10. The destination device 200 may access stored video data from the storage device via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting that encoded video data to the destination device 200. Example file servers include a web server (e.g., for a website) , an FTP server, network attached storage (NAS) devices, or a local disk drive. The destination device 200 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection) , a wired connection (e.g., DSL, cable modem, etc. ) , or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.
[0045] The techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions, such as dynamic adaptive streaming over HTTP (DASH) , digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some embodiments, system 1 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and / or video telephony.
[0046] The input interface 208 may be configured to receive encoded video data. The memory 204 may be configured to store encoded video data, such as encoded video data (e.g., a bitstream) received by the input interface 208. The display device 202 displays the decoded video data to a user, and may comprise any of a variety of display devices such as a liquid crystal display (LCD), an organic light emitting diode (OLED) display, a plasma display, a projector, a micro LED display, a liquid crystal on silicon (LCoS) display, a digital light processor (DLP), or another type of display device.
[0047] The video encoder 110 and video decoder 220 each may be implemented as any of a variety of suitable encoder circuitry, such as one or more microprocessors, digital signal processors (DSPs) , application specific integrated circuits (ASICs) , field programmable gate arrays (FPGAs) , discrete logic, software, hardware, firmware or any combinations thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 110 and video decoder 220 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder / decoder (CODEC) in a respective device.
[0048] In some embodiments, the video encoder 110 and video decoder 220 may operate according to a video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO / IEC Moving Picture Experts Group (MPEG), with reference to High Efficiency Video Coding (HEVC) or Versatile Video Coding (VVC). Those of ordinary skill in the art will appreciate that embodiments of the present invention are not limited to HEVC or VVC.
[0049] The image can be considered as a two-dimensional array or matrix of samples with intensity values. The samples in the array may also be referred to as pixels. The number of samples in the horizontal and vertical directions of the array or image defines the size and / or resolution of the image. To represent color, three color components are typically employed, i.e., an image may be represented by or include three sample arrays. In RGB format or color space, the image includes corresponding arrays of red, green, and blue samples. However, in video coding, each pixel is typically represented in a luminance and chrominance format or color space, such as YCbCr, which includes a luminance component indicated by Y (sometimes replaced with L) and two chrominance components indicated by Cb and Cr. The luminance (or luma) component Y represents the luminance or gray-scale intensity (e.g., as in a gray-scale image), while the two chrominance (or chroma) components Cb and Cr represent the chrominance or color information components. Accordingly, an image in YCbCr format includes a luminance sample array of luminance sample values (Y) and two chrominance sample arrays of chrominance values (Cb and Cr). An RGB formatted image may be converted or transformed to YCbCr format and vice versa. This process is also known as color transformation or conversion. If the image is monochromatic, the image may include only luminance sample values. Accordingly, for example, the image may be an array of luma samples in a monochrome format, or one array of luma samples and two corresponding arrays of chroma samples in a 4:2:0, 4:2:2, or 4:4:4 color format.
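The color conversion and the 4:2:0 chroma layout mentioned above can be illustrated with a short sketch. The full-range BT.601 (JFIF-style) coefficients are one common choice and are an assumption here, as are the 2x2 averaging used for subsampling and the function names; the disclosure does not prescribe a particular conversion matrix or downsampling filter.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Full-range BT.601 (JFIF-style) conversion for 8-bit samples (assumed matrix)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 neighbourhood (assumes even width and height)."""
    h, w = plane.shape
    return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

gray = np.full((4, 4, 3), 128.0)          # mid-gray picture, 4x4 samples
y, cb, cr = rgb_to_ycbcr(gray)
cb420, cr420 = subsample_420(cb), subsample_420(cr)
```

In this 4:2:0 example the luma array keeps its 4x4 size while each chroma array shrinks to 2x2, and a gray input yields neutral chroma values of 128.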
[0050] Please refer to FIG. 2, which illustrates a block diagram of the video encoder 110 illustrated in FIG. 1. The video encoder 110 includes multiple modules, including a block partitioning unit 1101, a transform and quantization unit 1102, an inter-frame estimation unit 1103, an inter-frame prediction unit 1104, a motion compensation unit 1105, a motion estimation unit 1106, an inverse transform and inverse quantization unit 1107, a filter control analysis unit 1108, a filtering unit 1109, an encoding unit 1110, an encoded image buffer unit 1111, and a subtractor 1112.
[0051] Original video signals comprise video frames. Each video frame can be divided into blocks by the block partitioning unit 1101. These blocks may also be referred to as root blocks, macroblocks (H.264 / AVC), or coding tree blocks (CTBs) or coding tree units (CTUs) (H.265 / HEVC and VVC). The block partitioning unit 1101 may use the same block size for all images in the video sequence and define a corresponding grid of block sizes, or may change the block sizes between images or groups or subsets of images and divide each image into corresponding blocks. For each of the video frames, the subtractor 1112 generates residual pixel information of a residual frame by subtracting, from the video frame, the prediction blocks output by the inter-frame prediction unit 1104 or the motion compensation unit 1105. The residual pixel information obtained after inter-frame prediction (motion compensation) is transformed by the transform and quantization unit 1102. The transformation includes transforming the residual pixel information from the pixel domain to a transform domain, and the resulting transform parameters are quantized to further reduce the bit rate. The inter-frame estimation unit 1103 performs inter-frame estimation, and the inter-frame prediction unit 1104 performs inter-frame prediction on the video reconstruction blocks. Motion estimation performed by the motion estimation unit 1106 is a process of generating a motion vector that estimates the motion of the video reconstruction block, and motion compensation is then performed by the motion compensation unit 1105 based on the determined motion vector. After determining an inter-frame prediction mode, the inter-frame prediction unit 1104 provides selected inter-frame predicted data to the encoding unit 1110, and the motion estimation unit 1106 also sends calculated motion vector data to the encoding unit 1110. The inverse transform and inverse quantization unit 1107 reconstructs the video reconstruction blocks and reconstructs a residual block in the pixel domain. The filtering unit 1109 is controlled by the filter control analysis unit 1108 to remove the blocking artifacts in the reconstructed residual block, and the encoding unit 1110 adds the reconstructed residual block to the prediction block of the encoded image buffer unit 1111 to generate a reconstructed block. The encoding unit 1110 encodes various encoding parameters and quantized transform parameters into a bitstream, and outputs the bitstream of the video signals. The encoded image buffer unit 1111 stores reconstructed blocks as the reference blocks for inter-frame prediction. As the video image encoding progresses, new reconstructed blocks are continuously generated, and these blocks may be stored in the encoded image buffer unit 1111.
[0052] Please refer to FIG. 3 illustrating a block diagram of a video decoder 220 as illustrated in FIG. 1. The video decoder 220 may include multiple modules comprising a decoding unit 2201, an inverse transform and inverse quantization unit 2202, an inter-frame prediction unit 2203, a motion compensation unit 2204, a filtering unit 2205, a decoded image buffer unit 2206, and a post filtering unit 2207.
[0053] The input signals of video frames are encoded by the video encoder 110 to obtain an output bitstream. The video encoder 110 transmits the bitstream to the video decoder 220. The video decoder 220 receives the bitstream representing the video frames in an encoded format (i.e., in a compressed format). In the video decoder 220, the bitstream is processed by the decoding unit 2201 to obtain decoded transform parameters. The inverse transform and inverse quantization unit 2202 processes the transform parameters to generate a residual block in the pixel domain. The inter-frame prediction unit 2203 is operable to generate an inter-frame prediction block for a current video decoding block based on a determined inter-frame prediction mode and data from previously decoded blocks of the current video frame or picture. The motion compensation unit 2204 determines the inter-frame prediction information for the current video decoding block and generates an inter-frame prediction block by parsing the motion vector and other associated syntax elements. Finally, the decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 2202 and the corresponding prediction block generated by the inter-frame prediction unit 2203 or the motion compensation unit 2204. In order to improve video quality, the decoded video blocks are filtered through the filtering unit 2205 to remove blocking artifacts. The decoded video block is then stored in the decoded image buffer unit 2206 as the reference block for subsequent intra-prediction or motion compensation, and for video output, i.e., to reproduce and reconstruct the original video signals. The output video can optionally be further processed by the post filtering unit 2207 for a more suitable or enhanced viewing experience.
[0054] An image includes multiple blocks each of which is a two-dimensional array or matrix of samples having intensity values (sample values) , but is smaller than the dimension of the image. The number of samples in the horizontal and vertical directions of the block defines the size of the block. That is, a block is an array of samples of MxN (M columns by N rows) . The block may comprise, for example, one sample array (e.g., a luminance array in the case of a monochrome image, or a luminance or chrominance array in the case of a color image) or three sample arrays (e.g., one luminance array and two chrominance arrays in the case of a color image) or any other number and / or variety of arrays, depending on the color format applied.
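The M columns by N rows convention can be made concrete with a short sketch that cuts a picture into non-overlapping blocks in raster order. It is an illustration only: the fixed block size, the single sample array, and the function name `split_into_blocks` are assumptions, and real codecs partition blocks further in ways not shown here.

```python
import numpy as np

def split_into_blocks(picture, block_w, block_h):
    """Yield (x, y, block) for non-overlapping blocks in raster order.

    `picture` is indexed as picture[row, column], so a block of
    M columns by N rows is an array of shape (N, M).
    """
    height, width = picture.shape
    for y in range(0, height, block_h):
        for x in range(0, width, block_w):
            # Blocks at the right or bottom edge may be smaller than requested.
            yield x, y, picture[y:y + block_h, x:x + block_w]

picture = np.arange(8 * 12).reshape(8, 12)   # 12 columns by 8 rows
blocks = list(split_into_blocks(picture, block_w=4, block_h=4))
```

Here the 12x8 picture yields six 4x4 blocks, three per block row, each addressed by the position of its top-left sample.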
[0055] The video encoder 110 as shown in FIG. 2 may be used to encode the image on a block-by-block basis, e.g., with encoding and prediction performed on each block. The video decoder 220 shown in FIG. 3 may be used to decode a picture that has been divided into slices (also referred to as video slices), where a picture may be divided into or coded using one or more slices (typically non-overlapping), and each slice may include one or more blocks (e.g., CTUs), where each block may be rectangular.
[0056] Inter-frame prediction
[0057] The set of inter-frame prediction modes depends on the available reference pictures (i.e., previously at least partially decoded pictures stored in the decoded image buffer unit 2206) and other inter-frame prediction parameters, e.g., whether to use the entire reference picture or only a portion of the reference picture (e.g., a search window area around the area of the current block) to search for the best matching reference block, and / or whether to apply pixel interpolation, e.g., half-pixel and / or quarter-pixel interpolation.
[0058] The encoder 110 may be configured to select a reference block from a plurality of reference blocks in the same or different ones of a plurality of other pictures and to provide the reference picture (or reference picture index) and / or an offset (spatial offset) between the position (x, y coordinates) of the reference block and the position of the current block as inter-frame prediction parameters. This offset is also called Motion Vector (MV) .
[0059] The motion compensation unit 2204 receives inter-prediction parameters and performs inter-prediction based on the inter-prediction parameters to obtain the inter-prediction block. The motion compensation performed by the motion compensation unit 2204 may involve fetching or generating a prediction block based on the motion vector determined by the motion estimation, possibly performing interpolation to sub-pixel accuracy. Interpolation filtering may generate additional pixel samples from known pixel samples, thus potentially increasing the number of candidate prediction blocks that may be used to encode an image block. Upon receiving a motion vector of a PU of a current coding block, the motion compensation unit 2204 may locate the prediction block to which the motion vector points in one of the reference image lists.
[0060] In some scenarios, the coding block and the prediction block have similar content structures but noticeably different pixel values due to lighting variation or fading effects. To deal with this, a series of Local Illumination Compensation (LIC) tools have been proposed. LIC is one of the techniques for improving motion compensation. LIC is applied to the result of motion compensation, or generally to the result of inter-frame prediction (e.g., inter-frame prediction performed by the inter-frame prediction unit 1104 of the encoder 110 shown in FIG. 2 and / or the inter-frame prediction unit 2203 of the decoder 220 shown in FIG. 3). LIC is modelled as a linear adaptation of the form:
[0061] p′ (x, y) =α*p (x, y) +β, Equation (1)
[0062] where p (x, y) and p′ (x, y) are the prediction sample values at position (x, y) before and after applying LIC, respectively. In Equation (1), α is the scale parameter and β is the offset parameter. The two parameters are usually obtained from a coded area adjacent to the current coding block, which is called a template. The template is assumed to have the same motion information as the current coding block. Hence, each original sample in the template and its predicted sample can be obtained, and the two linear model parameters can be calculated by the least squares method. LIC is enabled or disabled adaptively for each inter-frame coded coding unit (CU).
[0063] FIG. 4 illustrates on the left side a current coding block 401 (e.g., current CU) and templates 411U and 411L. For example, as shown in FIG. 4, one or more subsets of reference samples may be utilized to derive the parameters α and β. The upper template 411U includes samples (e.g., neighboring samples) that are adjacent to the upper boundary of the current coding block 401, and the left template 411L includes samples (e.g., neighboring samples) that are adjacent to the left boundary of the current coding block 401. The upper template 411U and left template 411L may belong to previously reconstructed neighboring blocks of the current coding block 401. FIG. 4 illustrates a reference block 402 and reference areas 412U, 412L on the right-hand side. The upper reference area 412U includes reference samples (e.g., neighboring reference samples) that are adjacent to the upper boundary of the reference block 402, and the left reference area 412L includes reference samples (e.g., neighboring reference samples) that are adjacent to the left boundary of the reference block 402. The samples of the current coding block 401 have the same relative positions as the reference samples of the reference block 402. Likewise, the (relative) positions of the neighboring reference samples of the reference block and of the neighboring samples of the current block are matched.
[0064] The neighboring reference samples of the reference block 402 are decoded samples at locations adjacent to the left and upper boundaries of the reference block (e.g., from decoded pictures after loop filtering and deblocking) . The neighboring samples of the current coding block 401 are reconstructed samples at locations adjacent to the left and upper boundaries of the current coding block 401 (e.g., samples of reconstructed neighboring blocks prior to loop filtering and deblocking) .
[0065] Referring to FIG. 4 and Equation (1), the scale parameter α and the offset parameter β are calculated by using the least mean square (LMS) difference computation based on the following formulas:
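[0066] $$\alpha = \frac{N \sum_{n} L(n)\,C(n) - \sum_{n} L(n) \sum_{n} C(n)}{N \sum_{n} L(n)^{2} - \left(\sum_{n} L(n)\right)^{2}}$$
[0067] $$\beta = \frac{\sum_{n} C(n) - \alpha \sum_{n} L(n)}{N}$$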
[0068] where N is the number of samples, C (n) is the neighboring reconstructed samples of the current block (e.g., the upper and / or left neighboring samples of the current coding block 401 shown in FIG. 4) , and L (n) is the neighboring reconstructed samples of the reference block (e.g., the upper and / or left neighboring reference samples of the reference block 402 shown in FIG. 4) .
[0069] In an exemplary embodiment, C is a set of neighboring samples of the current block and L is a set of neighboring reference samples (e.g., upper and / or left neighboring samples as shown in FIG. 4) of the reference block (i.e., the block that serves as an input for motion compensation and is referenced by a motion vector) .
[0070] When deriving the scale parameter α, the mean value can be removed from both sets (set C and set L) . In this case, the offset parameter β is further calculated by taking into account the difference between the average values of L and C. The parameters α and β may be referred to as compensation parameters.
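By way of a non-limiting illustration, the mean-removed least-squares derivation described above may be sketched as follows. The function names `derive_lic_params` and `apply_lic`, the floating-point arithmetic, the fallback to α = 1 when the reference samples have zero variance, and the clipping of the compensated samples to the sample range are assumptions made for this example only and are not taken from the disclosure.

```python
def derive_lic_params(cur_template, ref_template):
    """Fit cur ~ alpha * ref + beta by least squares (mean-removed form).

    cur_template: neighboring reconstructed samples C(n) of the current block.
    ref_template: neighboring reference samples L(n) of the reference block.
    """
    if len(cur_template) != len(ref_template) or not cur_template:
        raise ValueError("templates must be non-empty and of equal length")
    n = len(cur_template)
    mean_c = sum(cur_template) / n
    mean_l = sum(ref_template) / n
    # Sums computed after removing the mean value from both sets.
    cov = sum((l - mean_l) * (c - mean_c)
              for l, c in zip(ref_template, cur_template))
    var = sum((l - mean_l) ** 2 for l in ref_template)
    # Assumption: use a pure offset model when the reference template is flat.
    alpha = cov / var if var != 0 else 1.0
    # The offset follows from the difference between the two average values.
    beta = mean_c - alpha * mean_l
    return alpha, beta


def apply_lic(pred_block, alpha, beta, bitdepth=8):
    """Apply p' = alpha * p + beta to each sample of a block (list of rows).

    Assumption: results are rounded and clipped to [0, 2**bitdepth - 1].
    """
    max_val = (1 << bitdepth) - 1
    return [[min(max(int(round(alpha * p + beta)), 0), max_val) for p in row]
            for row in pred_block]
```

A practical codec would typically use integer or fixed-point arithmetic for these sums; floating point is used here only to keep the sketch short.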
[0071] When a CU is coded with merge mode, the LIC flag is copied from neighboring blocks, in a way similar to motion information copy in merge mode. Otherwise, an LIC flag is signalled for the CU to indicate whether LIC applies or not. When LIC is enabled for a CU, mean-removed sum of absolute difference (MR-SAD) and mean-removed sum of absolute Hadamard-transformed difference (MR-SATD) are used, instead of SAD and SATD, for integer pixel motion search and fractional pixel motion search, respectively.
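As a hedged illustration of why mean-removed costs are used when LIC is enabled, the following sketch compares SAD with MR-SAD under the commonly used definition, in which the average difference between the two blocks is subtracted before the absolute differences are summed. The function names are hypothetical, and the exact cost formulation used by a particular encoder may differ.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def mr_sad(block_a, block_b):
    """Mean-removed SAD: a uniform brightness offset contributes no cost."""
    a = [s for row in block_a for s in row]
    b = [s for row in block_b for s in row]
    if len(a) != len(b) or not a:
        raise ValueError("blocks must be non-empty and of equal size")
    # Average difference between the blocks (difference of the two means).
    mean_diff = (sum(a) - sum(b)) / len(a)
    return sum(abs((x - y) - mean_diff) for x, y in zip(a, b))
```

For two blocks that differ only by a constant offset, SAD is non-zero while MR-SAD is zero, so the motion search is not biased against candidates that LIC could later compensate.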
[0072] Please refer to FIG. 5. FIG. 5 illustrates a flow chart of a method of improving Local Illumination Compensating applied to a video decoder according to a first embodiment of the present disclosure. The method includes the steps S500-S540.
[0073] At step S500, enable a Position-Dependent Local Illumination Compensating (PDLIC) for a current coding block.
[0074] At step S510, determine a template for the current coding block.
[0075] At step S520, determine a reference block related to the current coding block, and reference areas for the reference block, based on the current coding block and the template.
[0076] At step S530, determine Local Illumination Compensation parameters based on samples of the template, reference samples of the reference area, and position information of samples in the template.
[0077] At step S540, perform the Position-Dependent Local Illumination Compensating for the current coding block based on the Local Illumination Compensation parameters.
[0078] Please refer to FIGs. 3-5. At step S500, the video decoder 220 decides whether to apply PDLIC to the current coding block 401. Generally speaking, any information obtained explicitly or implicitly from the bitstream can be used for this decision. The basis can be syntax explicitly encoded in the bitstream, the use of certain incompatible coding tools, the size of the current coding block 401, or the number of available samples in the templates 411U, 411L.
[0079] At step S510, the templates 411U, 411L for the current coding block 401 are determined. If PDLIC is determined to be applied, the video decoder 220 may identify the templates 411U, 411L. For example, one sample row adjacent to the upper boundary of the current block 401 is adopted as the template 411U, and one sample column adjacent to the left boundary of the current block 401 is adopted as the template 411L. In other embodiments, two or more sample rows adjacent to the upper boundary of the current block 401 are adopted as the template 411U, and two or more sample columns adjacent to the left boundary of the current block 401 are adopted as the template 411L.
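The template determination may be sketched, purely as an example, as follows. The function name `get_templates`, the `frame[y][x]` indexing, and the policy of skipping samples that fall outside the frame are assumptions of this sketch; `num_lines` selects one or more rows and columns, as in the embodiments above.

```python
def get_templates(frame, x0, y0, width, height, num_lines=1):
    """Return (upper, left) template samples for the block at (x0, y0).

    frame: 2-D list of reconstructed samples, indexed as frame[y][x].
    Samples that fall outside the frame are treated as unavailable.
    """
    frame_h = len(frame)
    frame_w = len(frame[0]) if frame_h else 0
    upper, left = [], []
    # Upper template: num_lines rows directly above the block.
    for y in range(y0 - num_lines, y0):
        for x in range(x0, x0 + width):
            if 0 <= y < frame_h and 0 <= x < frame_w:
                upper.append(frame[y][x])
    # Left template: num_lines columns directly to the left of the block.
    for y in range(y0, y0 + height):
        for x in range(x0 - num_lines, x0):
            if 0 <= y < frame_h and 0 <= x < frame_w:
                left.append(frame[y][x])
    return upper, left
```

For a block on the left frame boundary the left template comes back empty, which is the kind of case a sample-count check would need to handle.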
[0080] At step S520, a reference block related to the current coding block, and reference areas for the reference block, are determined based on the current coding block and the template. For inter-frame prediction, the video decoder 220 locates the reference block for the current coding block in the reference frames, which requires the coordinates of the top-left pixel of the current coding block 401, the size of the current coding block 401, and motion information. Motion information includes motion vectors and reference frame indices. A template uses the same motion information as the coding block to find the corresponding region in the reference frame. To obtain the reference samples in the reference block 402 and in the reference areas 412U, 412L that correspond to the templates, interpolation may be required because the motion vectors included in the motion information may have sub-pixel precision.
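A simplified sketch of locating the reference block and its reference areas is given below. It assumes an integer-pel motion vector, so the sub-pixel interpolation mentioned above is deliberately omitted, and it clamps coordinates to the frame as a simple stand-in for boundary padding. The function name `get_reference_areas` and both simplifications are assumptions of the example.

```python
def get_reference_areas(ref_frame, x0, y0, width, height, mv, num_lines=1):
    """Return (ref_block, upper_area, left_area) for an integer motion vector.

    ref_frame: 2-D list of decoded reference samples, indexed ref_frame[y][x].
    (x0, y0): top-left sample of the current coding block.
    mv: (mv_x, mv_y) in whole samples; sub-pixel motion is not handled here.
    """
    frame_h = len(ref_frame)
    frame_w = len(ref_frame[0])
    rx, ry = x0 + mv[0], y0 + mv[1]  # top-left sample of the reference block

    def sample(x, y):
        # Assumption: clamp to the frame instead of performing real padding.
        x = min(max(x, 0), frame_w - 1)
        y = min(max(y, 0), frame_h - 1)
        return ref_frame[y][x]

    ref_block = [[sample(x, y) for x in range(rx, rx + width)]
                 for y in range(ry, ry + height)]
    # The reference areas reuse the block's motion, mirroring the templates.
    upper = [sample(x, y)
             for y in range(ry - num_lines, ry)
             for x in range(rx, rx + width)]
    left = [sample(x, y)
            for y in range(ry, ry + height)
            for x in range(rx - num_lines, rx)]
    return ref_block, upper, left
```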
[0081] Please refer to FIG. 6. FIG. 6 illustrates a flow chart of step S500 applied to a video decoder according to an embodiment of the present disclosure. In step S500, the Position-Dependent Local Illumination Compensating (PDLIC) for the current coding block is enabled in response to detection of a flag for PDLIC enablement. In this embodiment, PDLIC enablement is indicated by one flag decoded from the bitstream. Step S500 includes steps S5000-S5006.
[0082] At step S5000, derive the flag for PDLIC enablement for the coding block.
[0083] At step S5002, determine whether the flag equals an enablement value A.
[0084] At step S5004, disable PDLIC for the coding block when the flag does not equal the enablement value A.
[0085] At step S5006, enable PDLIC for the coding block when the flag equals the enablement value A.
[0086] In this embodiment, a flag indicating the PDLIC enablement is assigned to each coding block in the bitstream. Upon receiving coding blocks in the bitstream, the video decoder 220 derives the flag of each coding block to determine whether to enable the PDLIC. The flag can be represented by one or more bits, as long as it can indicate at least two states for the enablement of PDLIC. The enablement value A (e.g., “1”) indicates that PDLIC is used; PDLIC is disabled if the flag has any other value (e.g., “0”). In one variant of this embodiment, an enablement value B indicates that PDLIC is disabled, and PDLIC is enabled if the flag has any other value. In another variant, PDLIC is enabled in response to detection of the flag; if the flag does not exist for the coding block, PDLIC is disabled.
[0087] Please refer to FIG. 7. FIG. 7 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure. The step S500 includes steps S5010-S5018.
[0088] At step S5010, derive the enablement of the LIC for the current coding block.
[0089] At step S5012, determine whether a LIC for the current coding block is enabled.
[0090] At step S5014, derive the flag for PDLIC enablement for the coding block.
[0091] At step S5015, determine whether the flag equals an enablement value A.
[0092] At step S5016, disable PDLIC for the coding block when the flag does not equal the enablement value A.
[0093] At step S5018, enable PDLIC for the coding block when the flag equals the enablement value A.
[0094] In this embodiment, PDLIC enablement is determined by also considering the enablement of other tools, because some coding tools are incompatible with PDLIC. PDLIC enablement is indicated jointly by one flag and the enablement of the LIC. The video decoder 220 first derives the enablement of the LIC for the current coding block. Upon detecting that the LIC for the current coding block is not enabled, the video decoder 220 disables PDLIC for the coding block. Upon determining that the LIC for the current coding block is enabled, the video decoder 220 derives the flag for PDLIC enablement for the coding block. The flag can be represented by one or more bits, as long as it can indicate at least two states for the enablement of PDLIC. The enablement value A (e.g., “1”) indicates that PDLIC is used; PDLIC is disabled if the flag has any other value (e.g., “0”). In one variant of this embodiment, an enablement value B indicates that PDLIC is disabled, and PDLIC is enabled if the flag has any other value. In another variant, PDLIC is enabled in response to detection of the flag; if the flag does not exist for the coding block, PDLIC is disabled.
[0095] Please refer to FIG. 8. FIG. 8 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure. The step S500 includes steps S5020-S5028.
[0096] At step S5020, determine whether a size parameter of the current coding block meets a first condition.
[0097] At step S5022, derive the flag for PDLIC enablement for the coding block.
[0098] At step S5024, determine whether the flag equals an enablement value A.
[0099] At step S5026, enable PDLIC for the coding block when the flag equals the enablement value A.
[0100] At step S5028, disable PDLIC for the coding block.
[0101] In this embodiment, PDLIC enablement is determined by the size of the current coding block. The embodiment applies PDLIC to large coding blocks; PDLIC cannot be used if the coding block does not satisfy the size condition. The video decoder 220 determines whether a size parameter of the current coding block 401 meets a first condition.
[0102] With reference to FIG. 4, the size of the current coding block 401 can be expressed as M×N, where M and N are the width and height of the current coding block 401, respectively. In one example, the first condition is that the size of the current coding block 401 equals or exceeds a first threshold T0 (e.g., 256). When the size of the current coding block 401 is less than the first threshold T0, the video decoder 220 does not enable PDLIC for the current coding block 401. When the size of the current coding block 401 equals or exceeds the first threshold T0, the video decoder 220 enables PDLIC for the current coding block 401. In another example, the first condition is that the width of the current coding block equals or exceeds a second threshold T1 (e.g., 8) and the height of the current coding block equals or exceeds a third threshold T2 (e.g., 8).
[0103] When the width of the current coding block 401 is less than the second threshold T1 (e.g., 8) or the height of the current coding block 401 is less than the third threshold T2 (e.g., 8), the video decoder 220 does not enable PDLIC for the current coding block 401.
[0104] When the width of the current coding block 401 equals or exceeds the second threshold T1 (e.g., 8) and the height of the current coding block 401 equals or exceeds the third threshold T2 (e.g., 8), the video decoder 220 derives the flag of the coding block to determine whether to enable the PDLIC. The video decoder 220 disables PDLIC for the coding block upon detecting that the LIC for the current coding block is not enabled. If the LIC for the current coding block is enabled, the video decoder 220 derives the flag for PDLIC enablement for the coding block. The flag can be represented by one or more bits, as long as it can indicate at least two states for the enablement of PDLIC. The enablement value A (e.g., “1”) indicates that PDLIC is used; PDLIC is disabled if the flag has any other value (e.g., “0”). In one variant of this embodiment, an enablement value B indicates that PDLIC is disabled, and PDLIC is enabled if the flag has any other value. In another variant, PDLIC is enabled in response to detection of the flag; if the flag does not exist for the coding block, PDLIC is disabled.
[0105] Please refer to FIG. 9. FIG. 9 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure. The step S500 includes steps S5030-S5038.
[0106] At step S5030, derive the enablement of the LIC for the current coding block.
[0107] At step S5031, determine whether a LIC for the current coding block is enabled.
[0108] At step S5032, determine whether a size parameter of the current coding block meets a first condition.
[0109] At step S5033, derive the flag for PDLIC enablement for the coding block.
[0110] At step S5034, determine whether the flag equals an enablement value A.
[0111] At step S5036, enable PDLIC for the coding block when the flag equals to the enablement value A.
[0112] At step S5038, disable PDLIC for the coding block.
[0113] In this embodiment, PDLIC enablement is determined by a flag for PDLIC enablement, the enablement of the LIC, and the size of the current coding block. The video decoder 220 derives the enablement of the LIC for the current coding block. The video decoder 220 disables the PDLIC for the current coding block 401 when the LIC for the current coding block is not enabled. Upon determining that the LIC for the current coding block is enabled, the video decoder 220 determines whether a size parameter of the current coding block 401 meets a first condition. When the width of the current coding block 401 equals or exceeds the second threshold T1 (e.g., 8) and the height of the current coding block 401 equals or exceeds the third threshold T2 (e.g., 8), the video decoder 220 derives the flag for PDLIC enablement for the coding block. The flag can be represented by one or more bits, as long as it can indicate at least two states for the enablement of PDLIC. The enablement value A (e.g., “1”) indicates that PDLIC is used; PDLIC is disabled if the flag has any other value (e.g., “0”). In one variant of this embodiment, an enablement value B indicates that PDLIC is disabled, and PDLIC is enabled if the flag has any other value. In another variant, PDLIC is enabled in response to detection of the flag; if the flag does not exist for the coding block, PDLIC is disabled. It is noteworthy that if the LIC is not used or the requirement on block size is not satisfied, the flag is not present in the bitstream.
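The combined decision of this embodiment may be sketched as follows. The callable `read_flag` is a hypothetical stand-in for parsing the PDLIC flag from the bitstream, and the default thresholds of 8 simply reuse the example values of T1 and T2; neither reflects an actual bitstream syntax.

```python
def pdlic_enabled(lic_enabled, width, height, read_flag,
                  min_width=8, min_height=8, enable_value=1):
    """Decide PDLIC enablement from LIC enablement, block size and a flag.

    read_flag: hypothetical callable that parses the PDLIC flag; it is only
    invoked when the flag is expected to be present in the bitstream.
    """
    # PDLIC is disabled outright when LIC is not enabled for the block.
    if not lic_enabled:
        return False
    # First condition: width >= T1 and height >= T2.
    if width < min_width or height < min_height:
        return False
    # Only now is the flag parsed and compared with enablement value A.
    return read_flag() == enable_value
```

Ordering the checks this way means the flag is never read for blocks that cannot use PDLIC, which matches the flag being absent from the bitstream in those cases.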
[0114] Please refer to FIG. 10. FIG. 10 illustrates a flow chart of a step S500 applied to a video decoder according to another embodiment of the present disclosure. The step S500 includes steps S5040-S5048.
[0115] At step S5040, determine a template for the coding block.
[0116] At step S5041, determine whether the number of samples of the template is greater than or equal to a template adaption threshold C.
[0117] At step S5043, derive the flag for PDLIC enablement for the coding block.
[0118] At step S5044, determine whether the flag equals an enablement value A.
[0119] At step S5046, enable PDLIC for the coding block when the flag equals to the enablement value A.
[0120] At step S5048, disable PDLIC for the coding block.
[0121] With reference to FIG. 4 and FIG. 10, in this embodiment, enabling the PDLIC for the current coding block is determined based on whether the number of samples of the upper template 411U is greater than (or greater than or equal to) a template adaption threshold C, and whether the number of samples of the left template 411L is greater than (or greater than or equal to) a template adaption threshold D. In another embodiment, enabling the PDLIC for the current coding block is determined based on whether the total number of samples in the upper template 411U and the left template 411L is greater than (or greater than or equal to) a template adaption threshold E. If the number of samples in the templates 411U, 411L is insufficient, it may be hard to derive accurate parameters for PDLIC. The lack of samples may be caused by sample selection. Moreover, some coding blocks located on frame boundaries may not have enough samples because their templates may lie outside the frame. In this embodiment, the decision on PDLIC enablement depends on the number of samples in the templates 411U, 411L.
[0122] A subrange [L, R] is defined within the full sample range [0, 2^bitdepth-1], where bitdepth represents the maximum number of bits used to represent the value of a sample. If a sample value is close to the upper bound or lower bound of the full range, it may be the result of clipping. Such samples may therefore be a poor indicator of the underlying relationship between samples in the templates 411U, 411L and samples in the current coding block 401.
[0123] In this embodiment, a sample of the templates 411U, 411L whose value lies in the range between a low bound L and a high bound R is selected and counted, while a sample of the templates 411U, 411L whose value lies beyond the high bound R or the low bound L is excluded. Optionally, the high bound R and the low bound L are determined as follows:
[0124] L is equal to 2^bitdepth-1-R. In this case, the midpoint of the subrange is the same as that of [0, 2^bitdepth-1]. For example, [20, 235] and [25, 230] when bitdepth is 8.
[0125] L is not equal to 2^bitdepth-1-R. In this case, the midpoint of the subrange is offset from that of [0, 2^bitdepth-1]. For example, [0, 235] and [10, 235] when bitdepth is 8.
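The sample selection and the sample-count check may be illustrated, without limitation, by the sketch below. The function names are hypothetical, the bounds L and R are treated as inclusive, and the "greater than or equal to" comparison is chosen from the two alternatives mentioned above; all three choices are assumptions of the example.

```python
def symmetric_subrange(margin, bitdepth=8):
    """Return [L, R] with L == 2**bitdepth - 1 - R (shared midpoint)."""
    return margin, (1 << bitdepth) - 1 - margin


def select_samples(template, low, high):
    """Keep template samples whose value lies within [low, high]."""
    return [s for s in template if low <= s <= high]


def enough_template_samples(upper, left, low, high,
                            thr_upper=None, thr_left=None, thr_sum=None):
    """Check the sample counts against thresholds C, D (per template) or E (sum)."""
    n_upper = len(select_samples(upper, low, high))
    n_left = len(select_samples(left, low, high))
    if thr_sum is not None:
        # Variant using the total count and threshold E.
        return n_upper + n_left >= thr_sum
    if thr_upper is None or thr_left is None:
        raise ValueError("provide thr_sum, or both thr_upper and thr_left")
    # Variant using separate thresholds C (upper) and D (left).
    return n_upper >= thr_upper and n_left >= thr_left
```

With bitdepth 8, `symmetric_subrange(20)` gives the subrange [20, 235] used as an example above.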
[0126] As shown in FIG. 10, the video decoder 220 determines templates 411U and 411L for the current coding block 401. If the number of samples in the template 411U is less than (or less than or equal to) the template adaption threshold C, or the number of samples in the template 411L is less than (or less than or equal to) the template adaption threshold D, or the total number of samples in the upper template 411U and the left template 411L is less than (or less than or equal to) the template adaption threshold E, the PDLIC is disabled for the current coding block 401. This is regarded as the case in which the samples are insufficient. For example, the template adaption thresholds C, D and E can each be 16. The template adaption thresholds C, D and E can also differ from one another. When the number of samples in the template 411U is greater than (or greater than or equal to) the template adaption threshold C and the number of samples in the template 411L is greater than (or greater than or equal to) the template adaption threshold D, or when the total number of samples in the upper template 411U and the left template 411L is greater than (or greater than or equal to) the template adaption threshold E, the video decoder 220 derives the flag of the coding block to determine whether to enable the PDLIC. The video decoder 220 disables PDLIC for the coding block upon detecting that the LIC for the current coding block is not enabled. If the LIC for the current coding block is enabled, the video decoder 220 derives the flag for PDLIC enablement for the coding block. The flag can be represented by one or more bits, as long as it can indicate at least two states for the enablement of PDLIC. The enablement value A (e.g., “1”) indicates that PDLIC is used; PDLIC is disabled if the flag has any other value (e.g., “0”). In one variant of this embodiment, an enablement value B indicates that PDLIC is disabled, and PDLIC is enabled if the flag has any other value. In another variant, PDLIC is enabled in response to detection of the flag; if the flag does not exist for the coding block 401, PDLIC is disabled.
Definition of PDLIC
[0127] This embodiment clarifies the model form of PDLIC associated with the position information of samples. It is expected to enhance the model fitting capacity and make the illumination-compensated reference sample value closer to the original sample value.
LIC with position-dependent scale and offset
[0128] In this embodiment, the illumination-compensated reference sample, which is represented as p′, is the target value, and it is determined by the following parameters:
[0129] Reference sample: a sample in a reference block that has a one-to-one correspondence with a sample in the original coding block, denoted by p.
[0130] Position-dependent scale parameter: the scale parameter contains the position information of the reference sample, and is represented as γ (x, y) .
[0131] Position-dependent offset parameter: the offset parameter contains the position information of the reference sample, and is represented as δ (x, y) .
[0132] The position-dependent scale γ (x, y) is used to adjust the reference sample proportionally to its sample value. The position-dependent offset parameter δ (x, y) is expected to be an additive offset. The form of the relationship between the target and these parameters is determined as the following,
[0133] p′=γ (x, y) ·p+δ (x, y) .
[0134] The position-dependent scale γ (x, y) is determined by several coordinate-dependent terms, and / or a position-independent term, and / or weighting parameters for the coordinate-dependent terms.
[0135] A position-dependent term relies on the position information, such as the horizontal coordinate x, the vertical coordinate y, nonlinear combination of the coordinates, nonlinear transform on the coordinates.
[0136] A position-independent term does not contain any position information, such as a constant, or a parameter depending on each coding block.
[0137] Option 1: One instance of γ (x, y), which is a linear weighting of the coordinates, is as follows:
[0138] γ (x, y) = t0·x+t1·y+t2,
[0139] where t0 and t1 are two weighting parameters, and t2 is a position-independent parameter.
[0140] Option 2: An alternative instance of γ (x, y), which includes a nonlinear combination of the coordinates, is γ (x, y) = t0·x+t1·y+t2·xy+t3,
[0141] where t0, t1 and t2 are three weighting parameters, xy is a nonlinear combination of the coordinates, and t3 is a position-independent parameter.
[0142] Option 3: An alternative instance of γ (x, y), which applies a nonlinear transform to the coordinates, is γ (x, y) = t0·log₂x+t1·log₂y+t2,
[0143] where t0 and t1 are two weighting parameters, and t2 is a position-independent parameter.
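For illustration only, the three options for γ (x, y) may be evaluated as in the sketch below. The function names are hypothetical. Because log₂ is undefined for non-positive arguments, the Option 3 helper assumes that the coordinates are strictly positive (e.g., counted from 1); the text above does not state the coordinate origin.

```python
import math


def gamma_linear(x, y, t0, t1, t2):
    """Option 1: linear weighting of the coordinates."""
    return t0 * x + t1 * y + t2


def gamma_cross(x, y, t0, t1, t2, t3):
    """Option 2: adds the nonlinear combination x*y of the coordinates."""
    return t0 * x + t1 * y + t2 * x * y + t3


def gamma_log(x, y, t0, t1, t2):
    """Option 3: nonlinear (log2) transform of the coordinates.

    Assumption: coordinates are strictly positive, e.g. counted from 1.
    """
    if x <= 0 or y <= 0:
        raise ValueError("log2 requires strictly positive coordinates")
    return t0 * math.log2(x) + t1 * math.log2(y) + t2
```

The same three forms apply to the offset δ (x, y) with the parameters s0, s1, s2 (and s3) in place of t0, t1, t2 (and t3).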
[0144] The position-dependent offset δ (x, y) is determined by several coordinate-dependent terms, and / or a position-independent term, and / or weighting parameters for the coordinate-dependent terms.
[0145] A position-dependent term relies on the position information, such as the horizontal coordinate x, the vertical coordinate y, nonlinear combination of the coordinates, nonlinear transform on the coordinates.
[0146] A position-independent term does not contain any position information, such as a constant, or a parameter that varies for each coding block.
[0147] One instance of δ (x, y), which is a linear weighting of the coordinates, is as follows: δ (x, y) = s0·x+s1·y+s2,
[0148] where s0 and s1 are two weighting parameters, and s2 is a position-independent parameter.
[0149] An alternative instance of δ (x, y), which includes a nonlinear combination of the coordinates, is
[0150] δ (x, y) = s0·x+s1·y+s2·xy+s3, where s0, s1 and s2 are three weighting parameters, xy is a nonlinear combination of the coordinates, and s3 is a position-independent parameter.
[0151] An alternative instance of δ (x, y), which applies a nonlinear transform to the coordinates, is δ (x, y) = s0·log₂x+s1·log₂y+s2,
[0152] where s0 and s1 are two weighting parameters, and s2 is a position-independent parameter.
LIC with position-dependent scale
[0153] In this embodiment, the illumination-compensated reference sample, which is represented as p′, is the target value, and it is determined by the following parameters:
[0154] Reference sample: a sample in a reference block that has a one-to-one correspondence with a sample in the original coding block, denoted by p.
[0155] Position-dependent scale: the scale parameter that contains the position information of the reference sample, and is represented as γ (x, y) .
[0156] Position-independent offset: the offset parameter that does not depend on the position information of the reference sample, and is represented as δ.
[0157] The position-dependent scale γ (x, y) is used to adjust the reference sample proportionally to its sample value, and the position-independent offset parameter δ is expected to be an additive offset. The relationship between the target and these parameters is as follows:
[0158] p′=γ (x, y) ·p+δ.
[0159] The position-dependent scale γ (x, y) is determined by several coordinate-dependent terms, and / or a position-independent term, and / or weighting parameters for the coordinate-dependent terms.
[0160] A position-dependent term relies on the position information, such as the horizontal coordinate x, the vertical coordinate y, nonlinear combination of the coordinates, nonlinear transform on the coordinates.
[0161] A position-independent term does not contain any position information, such as a constant, or a parameter that varies for each coding block.
[0162] In this embodiment, γ (x, y) is defined as follows:
[0163] Option 1: One instance of γ (x, y), which is a linear weighting of the coordinates, is as follows:
[0164] γ (x, y) = t0·x+t1·y+t2,
[0165] where t0 and t1 are two weighting parameters, and t2 is a position-independent parameter.
[0166] Option 2: An alternative instance of γ (x, y), which includes a nonlinear combination of the coordinates, can be γ (x, y) = t0·x+t1·y+t2·xy+t3,
[0167] where t0, t1 and t2 are three weighting parameters, xy is a nonlinear combination of the coordinates, and t3 is a position-independent parameter.
[0168] Option 3: An alternative instance of γ (x, y), which applies a nonlinear transform to the coordinates, can be γ (x, y) = t0·log2(x)+t1·log2(y)+t2,
[0169] where t0 and t1 are two weighting parameters, and t2 is a position-independent parameter.
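The three options above can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the function names and the numeric parameter values are hypothetical, and the sketch assumes floating-point arithmetic and strictly positive coordinates for Option 3 (log2 is undefined at 0).

```python
import math

def scale_linear(x, y, t0, t1, t2):
    """Option 1: gamma(x, y) = t0*x + t1*y + t2."""
    return t0 * x + t1 * y + t2

def scale_cross_term(x, y, t0, t1, t2, t3):
    """Option 2: gamma(x, y) = t0*x + t1*y + t2*x*y + t3."""
    return t0 * x + t1 * y + t2 * x * y + t3

def scale_log(x, y, t0, t1, t2):
    """Option 3: gamma(x, y) = t0*log2(x) + t1*log2(y) + t2.

    Assumption: x and y are strictly positive.
    """
    return t0 * math.log2(x) + t1 * math.log2(y) + t2

def compensate(p, gamma, delta):
    """p' = gamma(x, y) * p + delta, with a position-independent delta."""
    return gamma * p + delta

# Hypothetical parameter values, chosen only for illustration.
gamma = scale_linear(x=2, y=4, t0=0.125, t1=0.25, t2=1.0)
print(gamma)                         # 2.25
print(compensate(100, gamma, 3))     # 228.0
```

In all three options the model stays linear in the parameters t0, t1, ..., so only the coordinate terms change between options.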
[0170] The position-independent offset δ, which does not rely on the position information of the reference sample, can be a constant, or a parameter that varies for each coding block.
LIC with position-dependent offset
[0171] In this embodiment, the illumination-compensated reference sample, denoted by p′, is the target value, and it is determined by the following parameters:
[0172] Reference sample: a sample in a reference block that has a one-to-one correspondence with a sample in the original coding block, denoted by p.
[0173] Position-independent scale: the scale parameter that does not depend on the position information of the reference sample, denoted by γ.
[0174] Position-dependent offset: the offset parameter that contains the position information of the reference sample, denoted by δ (x, y).
[0175] The position-independent scale γ adjusts the reference sample in proportion to its sample value, and the position-dependent offset δ (x, y) serves as an additive offset. The relationship between the target value and these parameters is as follows:
[0176] p′=γ·p+δ (x, y) .
[0177] The position-independent scale parameter γ, which does not rely on the position information of the reference sample, can be a constant, or a parameter that varies for each coding block.
[0178] The position-dependent offset parameter δ (x, y) is determined by several coordinate-dependent terms, and / or a position-independent term, and / or weighting parameters for the coordinate-dependent terms.
[0179] A position-dependent term relies on the position information, such as the horizontal coordinate x, the vertical coordinate y, a nonlinear combination of the coordinates, or a nonlinear transform of the coordinates.
[0180] A position-independent term does not contain any position information, such as a constant, or a parameter that varies for each coding block.
[0181] In this embodiment, δ (x, y) is defined as follows:
[0182] Option 1: One instance of δ (x, y), which is a linear weighting of the coordinates, is as follows:
[0183] δ (x, y) = t0·x+t1·y+t2,
[0184] where t0 and t1 are two weighting parameters, and t2 is a position-independent parameter.
[0185] Option 2: An alternative instance of δ (x, y), which includes a nonlinear combination of the coordinates, can be δ (x, y) = t0·x+t1·y+t2·xy+t3,
[0186] where t0, t1 and t2 are three weighting parameters, xy is a nonlinear combination of the coordinates, and t3 is a position-independent parameter.
[0187] Option 3: An alternative instance of δ (x, y), which applies a nonlinear transform to the coordinates, can be δ (x, y) = t0·log2(x)+t1·log2(y)+t2,
[0188] where t0 and t1 are two weighting parameters, and t2 is a position-independent parameter.
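As a rough illustration of the offset model p′ = γ·p + δ (x, y) with the linear form of Option 1, the sketch below applies the compensation to a whole reference block. Several details are assumptions made only for this example and are not stated in the text: zero-based coordinates with x as the column index and y as the row index, rounding to the nearest integer, and clipping to the range allowed by a hypothetical `bit_depth` argument.

```python
def offset_linear(x, y, t0, t1, t2):
    """Option 1 offset: t0*x + t1*y + t2."""
    return t0 * x + t1 * y + t2

def compensate_block(ref_block, gamma, t0, t1, t2, bit_depth=8):
    """Apply p' = gamma * p + offset(x, y) to every sample of a block.

    ref_block is a list of rows. Rounding and clipping to
    [0, 2**bit_depth - 1] are assumptions of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    out = []
    for y, row in enumerate(ref_block):
        out_row = []
        for x, p in enumerate(row):
            value = gamma * p + offset_linear(x, y, t0, t1, t2)
            out_row.append(min(max(int(round(value)), 0), max_val))
        out.append(out_row)
    return out

# Hypothetical 2x2 reference block and parameters.
block = [[100, 100],
         [100, 250]]
print(compensate_block(block, gamma=1.0, t0=2, t1=4, t2=1))
# [[101, 103], [105, 255]]
```

The bottom-right sample would reach 257 before clipping, which shows why a bounded sample range matters once a position term is added.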
[0189] Please refer to FIG. 11 and FIG. 12. FIG. 11 illustrates a flow chart of a step S510 applied to a video decoder according to an embodiment of the present disclosure. FIG. 12 illustrates expanded templates 411Ue, 411Le after template adaption is performed. The step S510 includes steps S5100 and S5102.
[0190] At step S5100, determine whether a number of samples of an original template is less than a template adaption threshold.
[0191] At step S5102, set an expanded template as a template for a current coding block, when the number of the samples of the original template is less than the template adaption threshold.
[0192] This embodiment clarifies the expanded templates 411Ue, 411Le used for template adaption. The idea in this embodiment is that the template should be adapted to the block size. Specifically, a small template is suitable for a small block and a large template is suitable for a large block. Accordingly, a tradeoff is made between sample quantity and sample correlation, which is expected to be beneficial for deriving more accurate model parameters. In one embodiment, the templates 411U, 411L usually contain two subregions adjacent to the current coding block 401 as illustrated in FIG. 4. The left boundary of the upper template 411U is aligned to the left boundary of the current coding block 401, while the upper boundary of the left template 411L is aligned to the upper boundary of the current coding block 401.
[0193] Whether the PDLIC is enabled for the current coding block is determined based on whether the number of samples of the upper template 411U is greater than (or greater than or equal to) a template adaption threshold C and the number of samples of the left template 411L is greater than (or greater than or equal to) a template adaption threshold D, or based on whether the total number of samples in the upper template 411U and the left template 411L is greater than (or greater than or equal to) a template adaption threshold E. If the templates 411U, 411L do not contain enough samples, it may be hard to derive accurate parameters for PDLIC. The lack of samples may be caused by sample selection. Moreover, some coding blocks located on frame boundaries may not have enough samples because their templates may extend outside the frame. In this embodiment, the decision of PDLIC enablement depends on the sample quantity in the templates 411U, 411L. Accordingly, an expanded template is set as the template for the current coding block when the number of samples of the upper template 411U is less than (or less than or equal to) the template adaption threshold C and the number of samples of the left template 411L is less than (or less than or equal to) the template adaption threshold D, or when the total number of samples in the upper template 411U and the left template 411L is less than (or less than or equal to) the template adaption threshold E.
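The expansion decision described above can be sketched as a small predicate. The function name, the `inclusive` flag (which switches between the "less than" and "less than or equal to" variants), and the threshold values are hypothetical; the text does not give concrete values for C, D, or E.

```python
def needs_expanded_template(num_upper, num_left, thr_c, thr_d, thr_e,
                            inclusive=False):
    """Return True when the original templates hold too few samples.

    The expanded template is chosen when both per-template counts fall
    below their thresholds (C for upper, D for left), or when the total
    count falls below threshold E.
    """
    if inclusive:
        both_small = num_upper <= thr_c and num_left <= thr_d
        sum_small = (num_upper + num_left) <= thr_e
    else:
        both_small = num_upper < thr_c and num_left < thr_d
        sum_small = (num_upper + num_left) < thr_e
    return both_small or sum_small

# Hypothetical thresholds C = 8, D = 8, E = 16.
print(needs_expanded_template(4, 4, 8, 8, 16))    # True: both templates are small
print(needs_expanded_template(16, 8, 8, 8, 16))   # False: enough samples
```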
[0194] In this embodiment, the width and height of each of the expanded templates 411Ue, 411Le should be adapted to the size of the current coding block 401. Assuming that the size of the current coding block 401 is M×N, the sizes of the upper expanded template 411Ue and the left expanded template 411Le depend on the width M and the height N of the current coding block 401. That is, the width α (M) of the upper expanded template 411Ue and the width γ (M) of the left expanded template 411Le depend on the width M of the current coding block 401, while the height β (N) of the upper expanded template 411Ue and the height δ (N) of the left expanded template 411Le depend on the height N of the current coding block 401. Accordingly, the size of the upper expanded template 411Ue can be expressed as α (M) ×β (N), and the size of the left expanded template 411Le can be expressed as γ (M) ×δ (N), where α, β, γ, δ are four independent functions. In the following, only the determination of α is described, and β, γ, δ can be determined by the same rule.
[0195] The function α can be linear or nonlinear with respect to M. For a linear transform, the function α is, for example, derived by the following equation:
[0196] Option 1: α (M) =a×M+b,
[0197] where a and b are two constants. For example, when a is 1.25 and b is 0, α (M) = 1.25 × M.
[0198] Please refer to FIG. 13 illustrating an example of expanded templates. In a case that α (M) =1.25 × M, β (N) =0.25×N, γ (M) =0.25 ×M, and δ (N) =1.25×N, the current coding block 401, the expanded templates 411Le and 411Ue are illustrated in FIG. 12. For the current coding block 401 with a width M of 16 and height N of 8, the width α (M) of the upper expanded template 411Ue is 20 and the height β (N) of the upper expanded template 411Ue is 2, while the width γ (M) of the left expanded template 411Le is 4, and the height δ (N) of the left expanded template 411Le is 10.
[0199] Furthermore, for nonlinear transform, the function α is, for example, derived by the following equation:
[0200] Option 2: α (M) = a× log2M + b, where a and b are two constants. For example, when a is 1 and b is -3, α (M) = log2M -3.
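The two sizing rules can be checked numerically with a short Python sketch. The function names are hypothetical; the coefficients are the example values given in the text (α (M) = 1.25 × M, β (N) = 0.25 × N, γ (M) = 0.25 × M, δ (N) = 1.25 × N, and a = 1, b = -3 for the logarithmic option). Rounding the result to an integer is an assumption of this sketch.

```python
import math

def linear_size(dim, a, b):
    """Option 1: a * dim + b."""
    return a * dim + b

def log_size(dim, a, b):
    """Option 2: a * log2(dim) + b."""
    return a * math.log2(dim) + b

def expanded_template_sizes(m, n):
    """Return ((upper_w, upper_h), (left_w, left_h)) for an M x N block,
    using the example coefficients from the text."""
    upper = (int(round(linear_size(m, 1.25, 0))),   # alpha(M)
             int(round(linear_size(n, 0.25, 0))))   # beta(N)
    left = (int(round(linear_size(m, 0.25, 0))),    # gamma(M)
            int(round(linear_size(n, 1.25, 0))))    # delta(N)
    return upper, left

print(expanded_template_sizes(16, 8))   # ((20, 2), (4, 10))
print(log_size(16, 1, -3))              # 1.0
```

For the 16×8 block this reproduces the 20×2 upper and 4×10 left expanded templates of the worked example.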
[0201] In another embodiment, the size of the left expanded template 411Le and the upper expanded template 411Ue is associated with the size of the coding block 401 and determined by a look-up table. An example is provided in Table 1. Table 1
[0202] Please refer to FIG. 13. The positions in the template from which the video decoder 220 obtains the template samples for parameter derivation are determined. Conventionally, all samples in the templates are used for parameter derivation, which may be suboptimal. The video decoder 220 can exclude samples with weak correlation by using a distance-based method. In this embodiment, sampling intervals of the expanded templates 411Ue, 411Le are determined based on a distance to a boundary of the current coding block 401 and a width M of the current coding block 401 or a height N of the current coding block 401. For a row / column that is n pixels away from the boundary of the current coding block 401, a sampling interval Δn can be derived as Δn=f (n).
[0203] The principle for this function is that the sampling interval of a row / column nearer to the boundary is not greater than that of a row / column more distant from the boundary. At the same time, Δn cannot be negative for any n greater than or equal to 0. In a special but common case, Δn can be 0 for any n, which means that all samples are included.
[0204] Δn may have a linear relationship with n. One example for this case is that Δn=n.
[0205] In another embodiment, a sampling interval for a row of an upper expanded template 411Ue is determined based on a distance from the row to an upper boundary of the current coding block and a width of the current coding block 401, and a sampling interval for a column of a left expanded template 411Le is determined based on a distance from the column to a left boundary of the current coding block and a height of the current coding block 401. For example, Δn=n%2,
[0206] where % is the modulo operator.
[0207] FIG. 13 illustrates the current coding block 401 with a size of 16×8, the upper expanded template 411Ue with a size of 20×2, and the left expanded template 411Le with a size of 4×10. If the sampling rule is used, the sampling interval will be increased by one every two rows / columns. The sampling interval for a first row R0 of the upper expanded template 411Ue is less than the sampling interval for a second row R1 of the upper expanded template 411Ue, where a distance from the first row R0 to the upper boundary of the current coding block 401 is shorter than a distance from the second row R1 to the upper boundary of the current coding block 401. That is, 20 pixels (blocks with diagonal lines shown in FIG. 13) on the first row R0 are sampled, while 10 pixels on the second row R1 are sampled. Correspondingly, the sampling interval for a column C0 of the left expanded template 411Le is less than the sampling interval for a column C1 of the left expanded template 411Le, where a distance from the column C0 to the left boundary of the current coding block 401 is shorter than a distance from the column C1 to the left boundary of the current coding block 401. That is, 10 pixels (blocks with diagonal lines shown in FIG. 13) on the column C0 are sampled, 5 pixels on the column C1 are sampled, 4 pixels on the column C2 are sampled, and 3 pixels on the column C3 are sampled.
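The sample counts quoted for FIG. 13 can be reproduced with the short sketch below. It rests on two assumptions that are inferred rather than stated: the sampling interval counts the samples skipped between two kept samples (so the step is Δn + 1, consistent with Δn = 0 keeping every sample), and sampling starts at the first position of each row or column. Under those assumptions the quoted counts match the linear rule Δn = n. The function names are hypothetical.

```python
def interval_linear(n):
    """Linear rule: the interval equals the distance n to the block boundary."""
    return n

def sampled_positions(length, interval):
    """Positions kept along one row or column of the template.

    Assumption: 'interval' samples are skipped between kept samples,
    so the step is interval + 1 and sampling starts at position 0.
    """
    return list(range(0, length, interval + 1))

def samples_per_line(length, num_lines, interval_fn=interval_linear):
    """Number of kept samples for each row/column, nearest line first."""
    return [len(sampled_positions(length, interval_fn(n)))
            for n in range(num_lines)]

# Upper expanded template: 2 rows of width 20 (rows R0, R1).
print(samples_per_line(20, 2))   # [20, 10]
# Left expanded template: 4 columns of height 10 (columns C0..C3).
print(samples_per_line(10, 4))   # [10, 5, 4, 3]
```

A different rule can be tried by passing another `interval_fn`, as long as it is non-negative and does not decrease with distance, which is the principle the text states for f (n).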
[0208] FIG. 14 is an example of a computing device 1500 according to an embodiment of the present disclosure. Any suitable computing device can be used for performing the operations described herein. For example, FIG. 14 illustrates an example of the computing device 1500 that can implement some embodiments of FIG. 1 to FIG. 13 using any suitably configured hardware and / or software. In some embodiments, the computing device 1500 can include a processor 1512 that is communicatively coupled to a memory 1514 and that executes computer-executable program code and / or accesses information stored in the memory 1514. The processor 1512 may include a microprocessor, an application-specific integrated circuit ( “ASIC” ) , a state machine, or other processing device. The processor 1512 can include any of a number of processing devices, including one. Such a processor can include or may be in communication with a computer-readable medium storing instructions that, when executed by the processor 1512, cause the processor to perform the operations described herein.
[0209] The memory 1514 can include any suitable non-transitory computer-readable medium. The computer-readable medium can include any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable instructions or other program code. Non-limiting examples of a computer-readable medium include a magnetic disk, a memory chip, a read-only memory (ROM) , a random access memory (RAM) , an application specific integrated circuit (ASIC) , a configured processor, optical storage, magnetic tape or other magnetic storage, or any other medium from which a computer processor can read instructions. The instructions may include processor-specific instructions generated by a compiler and / or an interpreter from code written in any suitable computer-programming language, including, for example, C, C++, C#, visual basic, java, python, perl, javascript, and actionscript.
[0210] The computing device 1500 can also include a bus 1516. The bus 1516 can communicatively couple one or more components of the computing device 1500. The computing device 1500 can also include a number of external or internal devices such as input or output devices. For example, the computing device 1500 is illustrated with an input / output ("I / O") interface 1518 that can receive input from one or more input devices 1520 or provide output to one or more output devices 1522. The one or more input devices 1520 and one or more output devices 1522 can be communicatively coupled to the I / O interface 1518. The communicative coupling can be implemented in any suitable manner (e.g., a connection via a printed circuit board, a connection via a cable, communication via wireless transmissions, etc.). Non-limiting examples of input devices 1520 include a touch screen (e.g., one or more cameras for imaging a touch area or pressure sensors for detecting pressure changes caused by a touch), a mouse, a keyboard, or any other device that can be used to generate input events in response to physical actions by a user of a computing device. Non-limiting examples of output devices 1522 include a liquid crystal display (LCD) screen, an external monitor, a speaker, or any other device that can be used to display or otherwise present outputs generated by a computing device.
[0211] The computing device 1500 can execute program code that configures the processor 1512 to perform one or more of the operations described above with respect to some embodiments of FIG. 1 to FIG. 13. The program code can include an encoder 1526 and / or a video decoder 1528. The program code may be resident in the memory 1514 or any suitable computer-readable medium and may be executed by the processor 1512 or any other suitable processor.
[0212] The computing device 1500 can also include at least one network interface device 1524. The network interface device 1524 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks 1528. Non-limiting examples of the network interface device 1524 include an Ethernet network adapter, a modem, and / or the like. The computing device 1500 can transmit messages as electronic or optical signals via the network interface device 1524.
[0213] A person having ordinary skill in the art understands that each of the units, algorithms, and steps described and disclosed in the embodiments of the present disclosure can be realized using electronic hardware or a combination of computer software and electronic hardware. Whether the functions run in hardware or software depends on the application conditions and design requirements of the technical solution. A person having ordinary skill in the art can use different ways to realize the function for each specific application, and such realizations should not go beyond the scope of the present disclosure. A person having ordinary skill in the art also understands that, because the working processes of the above-mentioned system, device, and unit are basically the same, reference can be made to the corresponding processes in the above-mentioned embodiments. For ease and brevity of description, these working processes are not detailed again.
[0214] It is understood that the disclosed system, device, and method in the embodiments of the present disclosure can be realized in other ways. The above-mentioned embodiments are exemplary only. The division of the units is merely based on logical functions, and other divisions may exist in an actual realization. A plurality of units or components may be combined or integrated into another system, and some features may be omitted or skipped. In addition, the displayed or discussed mutual coupling, direct coupling, or communicative coupling may operate through ports, devices, or units, whether indirectly or communicatively, in electrical, mechanical, or other forms.
[0215] The units described as separate components may or may not be physically separated. The units shown for display may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be used according to the purposes of the embodiments. Moreover, the functional units in each of the embodiments can be integrated in one processing unit, can exist physically independently, or two or more units can be integrated in one processing unit.
[0216] If the software function unit is realized and sold or used as a product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution proposed by the present disclosure can be essentially or partially realized in the form of a software product. Alternatively, the part of the technical solution that contributes over the conventional technology can be realized in the form of a software product. The software product is stored in a storage medium and includes a plurality of commands for a computing device (such as a personal computer, a server, or a network device) to run all or some of the steps disclosed by the embodiments of the present disclosure. The storage medium includes a USB disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a floppy disk, or other kinds of media capable of storing program codes.
[0217] In contrast to the prior art, the present disclosure proposes a Position-Dependent Local Illumination Compensating (PDLIC) method to cope with local illumination changes. By incorporating position parameters, the new models are expected to fit complicated illumination patterns better. The proposed template adaption scheme is intended to improve the fitting performance by improving parameter accuracy. By using PDLIC for the current coding block and determining a template for the current coding block, better prediction performance can be obtained in the picture / video coding process and less distortion can be observed with little additional bit consumption. The present disclosure may therefore improve the coding efficiency.
[0218] While the present disclosure has been described in connection with what is considered the most practical and preferred embodiments, it is understood that the present disclosure is not limited to the disclosed embodiments but is intended to cover various arrangements made without departing from the scope of the broadest interpretation of the appended claims.
Claims
1. A method for improving Local Illumination Compensating, applied to a video decoder, comprising: enabling a Position-Dependent Local Illumination Compensating (PDLIC) for a current coding block; determining a template for the current coding block; determining a reference block related to the current coding block, and reference areas for the reference block, based on the current coding block and the template, wherein coordination of reference samples of the reference block map coordination of samples of the current coding block; determining Local Illumination Compensation parameters based on samples of the template, reference samples of the reference area, and position information of samples in the template; and performing the Position-Dependent Local Illumination Compensating for the current coding block based on the Local Illumination Compensation parameters.
2. The method of claim 1, wherein the enabling a PDLIC for a current coding block comprises: enabling the Position-Dependent Local Illumination Compensating (PDLIC) for the current coding block in response to at least one of the following conditions is meet: a detection of a flag for PDLIC enablement; an enablement of a Local Illumination Compensating (LIC) for the current coding block; a size parameter of the current coding block meeting a first condition; and a number of samples of the template equaling to or greater than a template adaption threshold.
3. The method of claim 2, wherein the first condition is a size of the current coding block equaling or in excess of a first threshold.
4. The method of claim 2, wherein the first condition is a width of the current coding block equaling or in excess of a second threshold and a height of the current coding block equal or in excess of a third threshold.
5. The method of claim 1, wherein the Local Illumination Compensation parameters comprise a position-dependent scale factor derived by the position information of the samples of the template.
6. The method of claim 1, wherein the Local Illumination Compensation parameters comprise a position-dependent offset factor derived by the position information of the samples of the template.
7. The method of claim 1, wherein the determining a template for the current coding block comprises: determining whether a number of samples of an original template is less than a template adaption threshold, wherein a value of each of the samples is in a range between a low bound L and a high bound R; and setting an expanded template as the template, when the number of the samples of the original template is less than the template adaption threshold, wherein a number of samples of the expanded template equals or in excess of the template adaption threshold.
8. The method of claim 7, wherein the determining a template for the current coding block further comprises: determining a boundary of the expanded template based on a width of the current coding block and a height of the current coding block.
9. The method of claim 7, wherein the setting an expanded template as the template comprises: determining a sampling interval of the expanded template based on a distance to a boundary of the current coding block and a width of the current coding block or a height of the current coding block; counting a number of samples of the expanded template sampled based on the sampling interval; and setting the expanded template as the template when the number of samples of the expanded template equals or is in excess of the template adaption threshold.
10. The method of claim 9, wherein the determining a sampling interval of the expanded template comprises: determining a sampling interval for a row of a upper expanded template based on a distance from the row to a upper boundary of the current coding block and a width of the current coding block; and determining a sampling interval for a column of a left expanded template based on a distance from the column to a left boundary of the current coding block and a height of the current coding block.
11. The method of claim 10, wherein the sampling interval for a first row of the upper expanded template is greater than the sampling interval for a second row of the upper expanded template, where a distance from the first row to the upper boundary of the current coding block is shorter than a distance from the second row to the upper boundary of the current coding block; the sampling interval for a first column of the left expanded template is greater than the sampling interval for a second column of the upper expanded template, where a distance from the first column to the left boundary of the current coding block is shorter than a distance from the second column to the left boundary of the current coding block.
12. The method of claim 7, wherein the determining a template for the current coding block further comprises: determining a size of the expanded template by using a look-up table.
13. The method of claim 7, wherein the low bound and the high bound are determined based on a maximum number of bits that is used to represent the value of a sample.
14. A method for improving Local Illumination Compensating, applied to a video decoder, comprising: determining whether a number of samples of an original template is less than a template adaption threshold; setting an expanded template as a template for a current coding block, when the number of the samples of the original template is less than the template adaption threshold, wherein a number of samples of the expanded template equals or in excess of the template adaption threshold; determining a reference block related to the current coding block, and reference areas for the reference block, based on the current coding block and the template, wherein coordination of reference samples of the reference block map coordination of samples of the current coding block; determining Local Illumination Compensation parameters based on samples of the template, reference samples of the reference area, and position information of samples in the template; and performing the Position-Dependent Local Illumination Compensating for the current coding block based on the Local Illumination Compensation parameters.
15. The method of claim 14, wherein the setting an expanded template as a template for a current coding block comprises: determining a boundary of the expanded template based on a width of the current coding block and a height of the current coding block.
16. The method of claim 14, wherein the setting an expanded template as the template for a current coding block comprises: determining a sampling interval of the expanded template based on a distance to a boundary of the current coding block and a width of the current coding block or a height of the current coding block; counting a number of samples of the expanded template sampled based on the sampling interval; and setting the expanded template as the template when the number of samples of the expanded template equals or is in excess of the template adaption threshold.
17. The method of claim 16, wherein the determining a sampling interval of the expanded template comprises: determining a sampling interval for a row of a upper expanded template based on a distance from the row to a upper boundary of the current coding block and a width of the current coding block; and determining a sampling interval for a column of a left expanded template based on a distance from the column to a left boundary of the current coding block and a height of the current coding block.
18. The method of claim 17, wherein the sampling interval for a first row of the upper expanded template is greater than the sampling interval for a second row of the upper expanded template, where a distance from the first row to the upper boundary of the current coding block is shorter than a distance from the second row to the upper boundary of the current coding block; the sampling interval for a first column of the left expanded template is greater than the sampling interval for a second column of the upper expanded template, where a distance from the first column to the left boundary of the current coding block is shorter than a distance from the second column to the left boundary of the current coding block.
19. The method of claim 14, wherein the setting an expanded template as the template for a current coding block comprises: determining a size of the expanded template by using a look-up table.
20. A video encoder comprising: a memory storing instructions; a processor coupled to the memory; wherein the processor is configured to perform the method of any one of claims 1 to 19.
21. A non-transitory computer-readable storage medium storing instructions that, when executed by a computer, cause the computer to perform the method of any one of claims 1 to 19.
22. A chip, comprising: a processor, configured to call and run a computer program stored in a memory, to cause a device in which the chip is installed to execute the method of any one of claims 1-19.
23. A computer program product, comprising a computer program, wherein the computer program causes a computer to execute the method of any one of claims 1-19.