Deblock filter with reduced pixel revisions at stripe boundaries

By adjusting pixel revision ranges based on stripe boundary positions, deblock filtering is optimized to enhance quality and efficiency without increasing hardware costs, addressing the limitations of existing methods in video coding.

WO2026130404A1 · PCT designated stage · Publication Date: 2026-06-25 · MEDIATEK INC

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: MEDIATEK INC
Filing Date: 2025-12-17
Publication Date: 2026-06-25

AI Technical Summary

Technical Problem

Existing deblock filtering methods in video coding increase hardware costs and reduce filtering quality due to the need for larger buffer sizes to accommodate larger values of Mp and Nreq, limiting the efficiency and effectiveness of stripe-based hardware post-processing.

Method used

Adjust the range of pixel positions (Mp) to be modified by deblock filtering operations based on the position of processing edges relative to stripe boundaries, setting Mp to a larger value (Mpo) for edges above the stripe boundary and reducing it to a smaller value (Mps) for edges below the stripe boundary, ensuring efficient operation without increasing hardware costs.

Benefits of technology

Improves deblock filtering quality without increasing hardware costs by optimizing pixel revision ranges, allowing stripe-based hardware to operate efficiently and maintain filtering quality even when processing edges are close to stripe boundaries.



Abstract

A method for performing deblock filtering with reduced pixel revisions at stripe boundaries is provided. A video coder receives data to be encoded or decoded as a current picture of a video and reconstructs the current picture based on the received data. The video coder applies deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units. The video coder applies a sequence of additional filtering operations on result of the deblock filtering operations for a first stripe of the current picture, wherein a range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo and reduced to Mps when the processing edge separates two deblocking process units. The video coder provides the result of the sequence of additional filtering operations as filtered current picture to be stored or outputted.


Description

DEBLOCK FILTER WITH REDUCED PIXEL REVISIONS AT STRIPE BOUNDARIES

CROSS REFERENCE TO RELATED PATENT APPLICATION(S)

[0001] The present disclosure is part of a non-provisional application that claims the priority benefit of U.S. Provisional Patent Application No. 63/734,792, filed on 17 December 2024. The content of the above-listed application is herein incorporated by reference.

TECHNICAL FIELD

[0002] The present disclosure relates generally to video coding and processing. In particular, the present disclosure relates to methods of deblock filtering.

BACKGROUND

[0003] Unless otherwise indicated herein, approaches described in this section are not prior art to the claims listed below and are not admitted as prior art by inclusion in this section.

[0004] High-Efficiency Video Coding (HEVC) is an international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC) . HEVC is based on the hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit for compression, termed coding unit (CU) , is a 2Nx2N square block of pixels, and each CU can be recursively split into four smaller CUs until the predefined minimum size is reached. Each CU contains one or multiple prediction units (PUs) .

[0005] Versatile video coding (VVC) is the latest international video coding standard developed by the Joint Video Expert Team (JVET) of ITU-T SG16 WP3 and ISO / IEC JTC1 / SC29 / WG11. The input video signal is predicted from the reconstructed signal, which is derived from the coded picture regions. The prediction residual signal is processed by a block transform. The transform coefficients are quantized and entropy coded together with other side information in the bitstream. The reconstructed signal is generated from the prediction signal and the reconstructed residual signal after inverse transform on the de-quantized transform coefficients. The reconstructed signal is further processed by in-loop filtering for removing coding artifacts. The decoded pictures are stored in the frame buffer for predicting the future pictures in the input video signal.

[0006] In VVC, a coded picture is partitioned into non-overlapped square block regions represented by the associated coding tree units (CTUs) . The leaf nodes of a coding tree correspond to the coding units (CUs) . A coded picture can be represented by a collection of slices, each comprising an integer number of CTUs. The individual CTUs in a slice are processed in raster-scan order. A bi-predictive (B) slice may be decoded using intra prediction or inter prediction with at most two motion vectors (MVs) and reference indices to predict the sample values of each block. A predictive (P) slice is decoded using intra prediction or inter prediction with at most one motion vector and reference index to predict the sample values of each block. An intra (I) slice is decoded using intra prediction only.

[0007] A CTU can be partitioned into one or multiple non-overlapped coding units (CUs) using the quadtree (QT) with nested multi-type-tree (MTT) structure to adapt to various local motion and texture characteristics. A CU can be further split into smaller CUs using quad-tree partitioning, vertical binary tree partitioning, horizontal binary tree partitioning, vertical center-side triple-tree partitioning, horizontal center-side triple-tree partitioning, etc.

[0008] Each CU contains one or more prediction units (PUs). The prediction unit, together with the associated CU syntax, works as a basic unit for signaling the predictor information. The specified prediction process is employed to predict the values of the associated pixel samples inside the PU. Each CU may contain one or more transform units (TUs) for representing the prediction residual blocks. A transform unit (TU) comprises a transform block (TB) of luma samples and two corresponding transform blocks of chroma samples, and each TB corresponds to one residual block of samples from one color component. An integer transform is applied to a transform block. The level values of quantized coefficients together with other side information are entropy coded in the bitstream. The terms coding tree block (CTB), coding block (CB), prediction block (PB), and transform block (TB) are defined to specify the 2-D sample array of one color component associated with CTU, CU, PU, and TU, respectively. Thus, a CTU consists of one luma CTB, two chroma CTBs, and associated syntax elements. A similar relationship is valid for CU, PU, and TU.

[0009] For each inter-predicted CU, motion parameters (including motion vectors, reference picture indices, reference picture list usage flags, and additional information) are used for inter-predicted sample generation. Such motion parameters can be signalled either explicitly or implicitly. When a CU is coded in skip mode, it is associated with one PU and contains no significant residual coefficients. In this mode, neither motion vector delta nor reference picture index is explicitly coded; instead, these motion parameters are derived from a candidate list (typically the merge candidate list). Merge mode is specified such that motion parameters for the current CU are obtained from neighbouring CUs, including both spatial and temporal candidates. VVC introduces additional candidate generation mechanisms to enrich the merge candidate pool. The merge mode may be applied to any inter-predicted CU. Alternatively, motion parameters may be signalled explicitly for each CU. In this approach, the motion vector, the corresponding reference picture index for each reference picture list, reference list usage flags, and other necessary information are directly encoded in the bitstream.

SUMMARY

[0010] The following summary is illustrative only and is not intended to be limiting in any way. That is, the following summary is provided to introduce concepts, highlights, benefits and advantages of the novel and non-obvious techniques described herein. Select and not all implementations are further described below in the detailed description. Thus, the following summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.

[0011] Some embodiments of the disclosure provide a method for performing deblock filtering with reduced pixel revisions at stripe boundaries. A video coder receives data to be encoded or decoded as a current picture of a video and reconstructs the current picture based on the received data. The video coder applies deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units. The processing edges may be boundaries of transform units. The video coder applies a sequence of additional filtering operations on result of the deblock filtering operations for a first stripe of the current picture, wherein a range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo and reduced to Mps when the processing edge separates two deblocking process units. The video coder provides the result of the sequence of additional filtering operations as filtered current picture to be stored or outputted. Mps is 6 and Mpo is 10 for some embodiments. Mps is 6 and Mpo is 8 for some other embodiments.

[0012] In some embodiments, when a first processing edge separates two deblocking process units and is the closest processing edge below the lower bound of the first stripe, the range of pixel positions above the first processing edge to be modified by the deblock filtering operation is set to Mps. In some embodiments, when Nreq is a range of pixel positions below the lower bound of the first stripe that are to be used by the sequence of additional filtering operations, Mps is less than a difference between Nreq and a stripe offset between the processing edge and the lower bound of the first stripe.

[0013] In some embodiments, Mps is smaller than a range of pixel positions below the first processing edge to be modified by the deblock filtering operation. The range of pixel positions below the first processing edge to be modified by the deblock filtering operation is Mpo. In some embodiments, Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge being above the lower bound of the first stripe, and the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.

[0014] In some embodiments, Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge being below the first processing edge and further away from the lower bound of the first stripe, and the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] The accompanying drawings are included to provide a further understanding of the present disclosure, and are incorporated in and constitute a part of the present disclosure. The drawings illustrate implementations of the present disclosure and, together with the description, serve to explain the principles of the present disclosure. It is appreciable that the drawings are not necessarily to scale, as some components may be shown out of proportion to their size in an actual implementation in order to clearly illustrate the concept of the present disclosure.

[0016] FIG. 1 conceptually illustrates samples at both sides of an edge upon which deblock filtering may be applied.

[0017] FIG. 2 conceptually illustrates filtering operations after the deblock filtering with respect to a stripe.

[0018] FIG. 3 illustrates parameters related to stripe-based hardware.

[0019] FIGS. 4A-4B illustrate the samples used for deblocking around processing edges at different positions relative to stripe boundaries.

[0020] FIG. 5 illustrates an example video encoder that implements deblock filtering.

[0021] FIG. 6 conceptually illustrates a process that implements deblock filtering with changing maximum number of samples above a processing edge.

[0022] FIG. 7 illustrates an example video decoder that implements deblock filtering.

[0023] FIG. 8 conceptually illustrates a process that implements deblock filtering with changing maximum number of samples above a processing edge.

[0024] FIG. 9 conceptually illustrates an electronic system with which some embodiments of the present disclosure are implemented.

DETAILED DESCRIPTION

[0025] In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. Any variations, derivatives and/or extensions based on teachings described herein are within the protective scope of the present disclosure. In some instances, well-known methods, procedures, components, and/or circuitry pertaining to one or more example implementations disclosed herein may be described at a relatively high level without detail, in order to avoid unnecessarily obscuring aspects of teachings of the present disclosure.

I. Video Coding Post-Processing

[0026] Post-processing refers to operations that are applied to samples of a block of pixels (e.g., a transform unit or coding unit) after the samples of the block have been reconstructed during the video coding process but before the samples are output for display or storage. The post-processing flow of a video decoder (or video encoder) often includes a series of filtering operations, which may include a deblocking (DBK) filter and other filters.

[0027] Some of the filtering operations are in-loop, i.e., the result of the filtering operations may be stored in a reconstructed picture buffer or decoded picture buffer to be used for coding subsequent blocks or pictures in the video.

[0028] A. Deblocking Filtering

[0029] Deblocking (DBK) filtering is used in video processing to reduce or eliminate blocking artifacts, which are the visible square-like distortions that appear at the boundaries of blocks in compressed images and video. Deblocking Filter is used to smooth edges of transform units and reduce high-frequency noise. Generally, deblock filtering requires pixels on both sides of the edge to be ready before the filtering can be performed. In block pipe filtering, the samples of the bottom portion of the current block are stored in a memory buffer for next block filtering.

[0030] FIG. 1 conceptually illustrates samples at both sides of an edge upon which deblock filtering may be applied. In the example, blocks 101 and 102 may be transform units. An edge 110 demarcates block 101 from block 102, with block 101 at the “p-side” of the edge 110 and block 102 at the “q-side” of the edge 110. Samples p_0 through p_(Mp-1) at the p-side and samples q_0 through q_(Mq-1) at the q-side are used for deblock filtering to smooth the edge 110, which may be the boundary of a transform unit. Samples beyond p_Mp at the p-side (beyond range Mp, for example) and beyond q_Mq at the q-side (beyond range Mq, for example) may pass through the deblock filter without being modified.
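The p-side and q-side ranges can be pictured on a single line of samples that crosses the edge. The following is a minimal sketch only; the function name and the indexing convention (q_0 sits at index `edge`, p_0 at `edge - 1`) are assumptions made for illustration and are not taken from the disclosure.

```python
def deblock_sample_ranges(edge, mp, mq):
    """Index ranges the deblock filter may touch on a 1-D line of samples.

    Assumed convention: q_0 is at index `edge` and p_0 is at index `edge - 1`,
    so the p-side covers p_(Mp-1)..p_0 and the q-side covers q_0..q_(Mq-1).
    """
    p_side = range(edge - mp, edge)   # Mp samples on the p-side
    q_side = range(edge, edge + mq)   # Mq samples on the q-side
    return p_side, q_side


p_side, q_side = deblock_sample_ranges(edge=128, mp=6, mq=10)
print(list(p_side))  # six indices ending at 127 (p_0)
print(list(q_side))  # ten indices starting at 128 (q_0)
```

Samples outside both ranges would pass through unmodified under this convention.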

[0031] B. Stripe-based Post Processing

[0032] A stripe is a shifted-up process unit that corresponds to a region of the current picture upon which the video coder may perform filter operations sequentially after the deblocking filter. Specifically, when referencing pixels in the same stripe, a filter operation uses the result from a previous filter. However, when referencing pixels from a different stripe, a filter operation uses the result from the deblock filter, particularly when referencing samples below the lower bound of the stripe.

[0033] FIG. 2 conceptually illustrates filtering operations after the deblock filtering with respect to a stripe. As illustrated, video post-processing operations 200 include a first stage filter 201, followed by a second stage filter 202, and then a third stage filter 203. The first stage filter 201 is a deblocking (DBK or DBF) filter. The second and third stages 202 and 203 are filters following the deblock filter; for example, the second and third stages 202 and 203 may be a CDEF filter and/or an LR filter.

[0034] The figure shows a region 210 of the current picture at different filtering stages 201-203. The region 210 includes rows at vertical positions y to y+256. The region 210 has processing edges at rows y, y+64, y+128, y+192, and y+256 (i.e., the y-th, (y+64) -th, (y+128) -th, (y+192) -th, and (y+256) -th row, where y is a multiple of 64 and the row index starts from 0) .

[0035] Samples at both sides of the processing edges are used and revised by deblock filtering at the first stage filter 201. The processing edges at rows y, y+64, y+128, y+192, and y+256 are referred to as “separating” processing edges, since they are boundaries of 64x64 process units which separate the region 210 into 64x64 process units for deblock filtering operations (also referred to as deblocking process units). A processing edge that is an edge of only a transform unit but not of a deblocking process unit is not a “separating” processing edge.

[0036] The post-processing operations also divide the region 210 into stripes. Normally, a stripe includes 64 rows (also referred to as lines). For example, in FIG. 2, the area between row y+56 and row y+120 is a stripe. However, the topmost stripe of a frame consists of 56 rows, while the bottommost stripe of a frame consists of 8 rows. In region 210, the stripe boundaries are located at rows y+56, y+120, y+184, and y+248. As mentioned, when referencing pixels in the same stripe (for example, the area between line y+56 and line y+120), a filter operation uses the result from a previous filter; when referencing pixels in a different stripe, a filter operation uses the result from the deblock filter, particularly when referencing samples below the lower bound of the stripe.
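The stripe layout described above can be sketched as follows. This is an illustrative sketch, not the disclosed implementation: the function name is hypothetical, the 8-row upward shift is inferred from the FIG. 2 example (stripe boundary at y+120 versus separating edge at y+128), and the picture height is assumed to be a multiple of 64.

```python
def stripe_bounds(picture_height, unit=64, shift=8):
    """Return (top, bottom) row ranges of stripes, with `bottom` exclusive.

    Assumption: stripes are `unit`-row process units shifted up by `shift`
    rows, so the first stripe is shortened and the last stripe holds the
    leftover rows.
    """
    bounds = []
    top = 0
    bottom = unit - shift  # first stripe is shortened by the shift
    while top < picture_height:
        bounds.append((top, min(bottom, picture_height)))
        top = bottom
        bottom += unit
    return bounds


for top, bottom in stripe_bounds(256):
    print(top, bottom, bottom - top)
```

For a 256-row picture, this yields a 56-row top stripe, three 64-row stripes, and an 8-row bottom stripe, with boundaries at rows 56, 120, 184, and 248.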

[0037] For the second stage filter 202, when referencing samples above the stripe boundary at y+120, samples from the first filter stage 201 are used. When referencing samples below the stripe boundary at y+120, samples from the deblock filter stage are taken, which happens to be the first stage 201.

[0038] For the third stage filter 203, when referencing samples above the stripe boundary at y+120, samples from the second filter stage 202 are used. When referencing samples below the stripe boundary at y+120, samples from the deblock filter stage 201 are used.
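The source-selection rule for the second and third stages can be sketched as a small lookup. The function and parameter names below are hypothetical, stages are numbered 1 to 3 to mirror filters 201 to 203, and the sketch assumes that the row at the stripe boundary itself counts as "below" (that is, it belongs to the next stripe).

```python
def reference_stage(stage, ref_row, stripe_lower_bound, deblock_stage=1):
    """Return the stage whose output a filter at `stage` reads for `ref_row`.

    Assumption: rows with index < stripe_lower_bound are inside the current
    stripe; all other rows lie below its lower bound.
    """
    if stage <= deblock_stage:
        raise ValueError("only filters after the deblock stage use this rule")
    if ref_row < stripe_lower_bound:
        return stage - 1      # same stripe: result of the previous filter
    return deblock_stage      # below the stripe: result of the deblock filter


print(reference_stage(stage=3, ref_row=100, stripe_lower_bound=120))
print(reference_stage(stage=3, ref_row=122, stripe_lower_bound=120))
```

With the stripe boundary at row 120, the third stage reads second-stage output for row 100 and deblock-stage output for row 122.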

[0039] It has been determined that the stripe-based hardware described above can operate efficiently if the following condition is satisfied:

Mp ≤ stripe_offset − Nreq    (1)

[0040] The parameter stripe_offset is the shifted number from a processing edge (e.g., processing edge at y+128) to a position at a lower bound of a stripe (e.g., stripe boundary at y+120) (i.e., stripe_offset is the distance from the stripe boundary to the processing edge below) ; Mp is the range of pixels or pixel positions above a processing edge that is required for deblocking (and modified at the p side above the processing edge) , and Nreq is the range of pixels below a stripe boundary that is required by the following filters. Satisfying eq. (1) ensures that the filtering process for boundary position Mp only takes place when there are at least Nreq samples available above the stripe_offset line, in order to satisfy the sample requirements for the deblocking filter. FIG. 3 illustrates parameters related to stripe-based post processing. Specifically, the figure shows the definition of stripe_offset, Mp, and Nreq in relation to the stripe boundary at y+120 and the separating processing edge at y+128.

[0041] Thus, for example, if stripe_offset = 12, Mp = 6, and Nreq = 4, then the requirement of eq. (1) is satisfied (because 6 < 12 − 4), and stripe-based hardware post-processing can operate efficiently because the samples required by the following filters do not depend on samples to be revised by deblock filtering. As another example, if stripe_offset = 8, Mp = 6, and Nreq = 5, then the requirement of eq. (1) is not satisfied (because 6 > 8 − 5), and stripe-based hardware post-processing cannot be performed efficiently: the following filters use samples to be revised by deblock filtering, and these samples can only be completed after the deblocking filter has received the 64×64 block below.
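The two numeric examples can be checked directly against eq. (1). This is a minimal sketch; the function name is hypothetical and all quantities are assumed to be counted in rows of samples.

```python
def satisfies_stripe_condition(mp, stripe_offset, nreq):
    """Check eq. (1): Mp <= stripe_offset - Nreq."""
    return mp <= stripe_offset - nreq


# First example from the text: 6 <= 12 - 4, so the condition holds.
print(satisfies_stripe_condition(mp=6, stripe_offset=12, nreq=4))

# Second example from the text: 6 > 8 - 5, so the condition fails.
print(satisfies_stripe_condition(mp=6, stripe_offset=8, nreq=5))
```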

[0042] Improving the filtering quality of the DBK filter and the following filters generally requires the parameters Mp and Nreq to take on larger values. This is because larger values of Mp result in better filtering quality of the DBK filter, and larger values of Nreq result in better filtering quality of the following filters. However, according to Eq. (1), larger values of Mp and Nreq require a correspondingly larger value of stripe_offset, which increases the size requirement of a buffer storing samples of the stripe. (An example of such a buffer is referred to as a “line buffer”, which will be further described by reference to FIG. 5 and FIG. 7.) In other words, the hardware cost of the line buffer limits stripe_offset, which in turn limits the filter’s quality by constraining Mp and Nreq.

[0043] Some embodiments of the disclosure provide a method for improving deblock filtering quality without increasing hardware cost. In some embodiments, the method sets or reduces the range (Mp) of pixels or pixel positions above a processing edge based on the position of the processing edge relative to the lower bound of a stripe and to edges of deblocking process units. Specifically, Mp is set to a larger original range Mpo for a processing edge that is above the lower bound of a stripe, and set (or reduced) to a smaller reduced range Mps for a separating processing edge that is below the lower bound of a stripe, where Mps < Mpo. In other words, when the pixels being filtered or revised belong to two different deblocking process units (e.g., two different 64x64 process units), the video coder reduces Mp (e.g., to Mps); and when the pixels being filtered or revised belong to the same process unit (e.g., within the same 64x64 process unit), the video coder uses the larger original Mp (e.g., Mpo). For a processing edge that is above the lower bound of a stripe (and hence in the same 64x64 deblocking process unit), the video coder also uses the larger original Mp (e.g., Mpo). In some embodiments, Mpo is assigned according to the size of the transform unit. In some embodiments, Mpo is assigned according to other parameters. In some embodiments, Mpo is a non-zero constant. In some embodiments, deblock filtering is performed on the boundaries of the transform units (as processing edges), while along the boundaries of 64x64 process units, the number of samples for deblock filtering may be adjusted.

[0044] On the other hand, for a processing edge that is below a lower bound of a stripe (and hence in a different process unit), the range Mps above the processing edge is set (or reduced) according to:

Mps ≤ stripe_offset − Nreq    (3)

[0045] By reducing Mp to Mps, the stripe-based hardware can operate efficiently without affecting Nreq (i.e., the quality of the following filters), even when stripe_offset is small (i.e., when the separating processing edge is close to the lower bound of the stripe).
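One way to read eq. (3) is as an upper bound on Mps for a given stripe_offset and Nreq. The sketch below picks the largest value that satisfies the bound; the function name, the clamp to Mpo, the clamp to zero, and the Nreq values used in the calls are all assumptions for illustration, not values from the disclosure.

```python
def max_reduced_range(stripe_offset, nreq, mpo):
    """Largest Mps with Mps <= stripe_offset - Nreq (eq. (3)).

    Assumptions: Mps never exceeds the original range Mpo, and a negative
    bound is clamped to 0 (no samples above the edge are modified).
    """
    return max(0, min(mpo, stripe_offset - nreq))


# stripe_offset = 8 matches the FIG. 2 layout (boundary y+120, edge y+128).
print(max_reduced_range(stripe_offset=8, nreq=2, mpo=10))
print(max_reduced_range(stripe_offset=8, nreq=5, mpo=10))
```

Under these assumed inputs, a stripe_offset of 8 with Nreq = 2 leaves room for Mps = 6, while Nreq = 5 leaves room for only Mps = 3.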

[0046] FIGS. 4A-4B illustrate the samples used for deblocking around processing edges at different positions relative to stripe boundaries. The figures illustrate a region 400 being filtered by the deblock filter and following filters. The region 400 is processed as stripes, including stripe 401 and stripe 402. A stripe boundary 410 serves as the lower bound of the stripe 401 and as the upper bound of the stripe 402. The deblock filter performs filtering operations at processing edges 421, 422 and 423, which may be edges of transform units or coding units. The processing edge 421 is in the stripe 401 and therefore above the stripe boundary 410. The processing edges 422 and 423 are in the stripe 402 and therefore below the stripe boundary 410. The processing edge 422 is the boundary of a 64x64 block (e.g., a deblocking process unit), while processing edges 421 and 423 may be the boundaries of a smaller block (32x32, for example).

[0047] The processing edge 421 is above the stripe boundary 410 and far below the upper bound of the stripe 401. Therefore, the deblock filtering operations of the samples around the processing edge 421 can proceed without affecting the stripe-based operations of a different stripe. Its Mp therefore is unchanged at Mpo.

[0048] The processing edge 422 is below the stripe boundary 410 and therefore the deblocking operations around this processing edge may affect the operations of the following filters for the stripe 401 if the Mp remains unchanged at Mpo (i.e., violating Eq. (1)). The video coder therefore uses a reduced Mp (i.e., Mps) so that the deblocking operations around the processing edge 422 would not affect the operations of the following filters for the stripe 401 (by satisfying Eq. (3)).

[0049] The processing edge 423 is far below the stripe boundary 410. Therefore, the deblock filtering operations of the samples around the processing edge 423 can proceed without affecting the stripe-based operations of a different stripe (i.e., the stripe 401). Its Mp therefore remains unchanged at Mpo.

[0050] In the example of FIG. 4A, the reduction of Mp to Mps is applied to both sides of the processing edge 422. However, it is possible that only the samples above the processing edge 422 affect the following filters, while the samples below the processing edge have no effect (since the samples below the processing edge 422 are further away from the stripe boundary 410 and not needed as part of Nreq). Thus, in some embodiments, the reduction to Mps is applied only to the above side of the processing edge and not to the below side.

[0051] FIG. 4B illustrates that, for the processing edge 422, the maximum number of samples Mp is reduced to Mps for above the processing edge but not for below the processing edge. The maximum number of samples for below the processing edge 422 remains at Mpo. This allows the deblock filtering around the processing edge 422 to have higher quality result than if both sides are reduced to Mps.
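The per-edge choice between Mpo and Mps in FIGS. 4A and 4B can be sketched as below. This is an illustration under stated assumptions rather than the disclosed implementation: the function name is hypothetical, separating edges are assumed to lie on rows that are multiples of 64, every separating edge is assumed to be the closest one below some stripe's lower bound, the default values 10 and 6 are the ones mentioned for some embodiments, and the rows 96, 128, and 160 are hypothetical positions standing in for edges 421, 422, and 423.

```python
def select_ranges(edge_row, unit=64, mpo=10, mps=6, reduce_below=False):
    """Return (above, below) maximum modification ranges for a horizontal edge.

    Assumption: an edge is "separating" when its row is a multiple of `unit`.
    reduce_below=True mimics FIG. 4A (both sides reduced); the default
    mimics FIG. 4B (only the above side reduced).
    """
    separating = edge_row % unit == 0
    if not separating:
        return mpo, mpo                       # e.g., edges 421 and 423
    return mps, (mps if reduce_below else mpo)  # e.g., edge 422


for row in (96, 128, 160):
    print(row, select_ranges(row))
```

Only the separating edge at row 128 has its above-side range reduced; passing `reduce_below=True` would reduce its below-side range as well.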

[0052] For example, a processing edge (e.g., edge 422) that is near the lower bound of a stripe (e.g., stripe boundary 410) may use Mps = 6 for the above-side and Mpo = 10 (8 in some other embodiments) for the below-side of deblock filtering, while other processing edges (e.g., edges 421 and 423) that are further away from the lower bound of a stripe may use Mpo = 10 for both the above-side and below-side of deblock filtering.

II. Example Video Encoder

[0053] FIG. 5 illustrates an example video encoder 500 that implements deblock filtering. As illustrated, the video encoder 500 receives an input video signal from a video source 505 and encodes the signal into bitstream 595. The video encoder 500 has several components or modules for encoding the signal from the video source 505, at least including some components selected from a transform module 510, a quantization module 511, an inverse quantization module 514, an inverse transform module 515, an intra estimation module 524, an intra prediction module 525, a motion compensation module 530, a motion estimation module 535, an in-loop filter 545, a reconstructed picture buffer 550, an MV buffer 565, an MV prediction module 575, and an entropy encoder 590. The motion compensation module 530 and the motion estimation module 535 are part of an inter-prediction module 540. The intra-prediction module 525 and the intra-estimation module 524 are part of a current picture prediction module 520, which uses current picture reconstructed samples as reference samples for prediction of the current block.

[0054] In some embodiments, the modules 510 –590 are modules of software instructions being executed by one or more processing units (e.g., a processor) of a computing device or electronic apparatus. In some embodiments, the modules 510 –590 are modules of hardware circuits implemented by one or more integrated circuits (ICs) of an electronic apparatus. Though the modules 510 –590 are illustrated as being separate modules, some of the modules can be combined into a single module.

[0055] The video source 505 provides a raw video signal that presents pixel data of each video frame without compression. A subtractor 508 computes the difference between the raw video pixel data of the video source 505 and the predicted pixel data 513 from the motion compensation module 530 or intra-prediction module 525 as prediction residual 509. The transform module 510 converts the difference (or the residual pixel data or residual signal 509) into transform coefficients (e.g., by performing Discrete Cosine Transform, or DCT). The quantization module 511 quantizes the transform coefficients into quantized data (or quantized coefficients) 512, which is encoded into the bitstream 595 by the entropy encoder 590.

[0056] The inverse quantization module 514 de-quantizes the quantized data (or quantized coefficients) 512 to obtain transform coefficients 518, and the inverse transform module 515 performs inverse transform on the transform coefficients 518 to produce reconstructed residual 519. The reconstructed residual 519 is added with the predicted pixel data 513 to produce reconstructed pixel data 517. In some embodiments, the reconstructed pixel data 517 is temporarily stored in a line buffer 527 (or intra prediction buffer) for intra-picture prediction and spatial MV prediction. The reconstructed pixels are filtered by the in-loop filter 545 and stored in the reconstructed picture buffer 550. In some embodiments, the reconstructed picture buffer 550 is a storage external to the video encoder 500. In some embodiments, the reconstructed picture buffer 550 is a storage internal to the video encoder 500.

[0057] The intra estimation module 524 derives intra-prediction data (e.g., intra prediction modes) based on the reconstructed pixel data 517 (stored in the line buffer 527) . The intra-prediction data is provided to the entropy encoder 590 to be encoded into bitstream 595. The intra-prediction data is also used by the intra-prediction module 525 to produce the predicted pixel data 513.

[0058] The motion estimation module 535 performs inter-prediction by producing MVs to reference pixel data of previously decoded frames stored in the reconstructed picture buffer 550. These MVs are provided to the motion compensation module 530 to produce predicted pixel data.

[0059] Instead of encoding the complete actual MVs in the bitstream, the video encoder 500 uses MV prediction to generate predicted MVs, and the difference between the MVs used for motion compensation and the predicted MVs is encoded as residual motion data and stored in the bitstream 595.

[0060] The MV prediction module 575 generates the predicted MVs based on reference MVs that were generated for encoding previous video frames, i.e., the motion compensation MVs that were used to perform motion compensation. The MV prediction module 575 retrieves reference MVs from previous video frames from the MV buffer 565. The video encoder 500 stores the MVs generated for the current video frame in the MV buffer 565 as reference MVs for generating predicted MVs.

[0061] The MV prediction module 575 uses the reference MVs to create the predicted MVs. The predicted MVs can be computed by spatial MV prediction or temporal MV prediction. The difference between the predicted MVs and the motion compensation MVs (MC MVs) of the current frame (residual motion data) is encoded into the bitstream 595 by the entropy encoder 590.
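
As a small illustration of the residual motion data described above, the sketch below computes the per-component difference between an MC MV and its predicted MV. The function name `motion_residual`, the `(x, y)` tuple representation, and the sample values are hypothetical.

```python
def motion_residual(mc_mv, predicted_mv):
    """Residual motion data: MC MV minus predicted MV, per component."""
    return (mc_mv[0] - predicted_mv[0], mc_mv[1] - predicted_mv[1])


# Hypothetical motion vectors as (x, y) pairs.
mc_mv = (12, -3)
predicted_mv = (10, -4)
mv_residual = motion_residual(mc_mv, predicted_mv)
```

Only `mv_residual` would need to be entropy coded; the predicted MV is derived from reference MVs rather than transmitted.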

[0062] The entropy encoder 590 encodes various parameters and data into the bitstream 595 by using entropy-coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman encoding. The entropy encoder 590 encodes various header elements, flags, along with the quantized transform coefficients 512, and the residual motion data as syntax elements into the bitstream 595. The bitstream 595 is in turn stored in a storage device or transmitted to a decoder over a communications medium such as a network.

[0063] The in-loop filter 545 performs filtering or smoothing operations on the reconstructed pixel data 517 to reduce the artifacts of coding, particularly at boundaries of pixel blocks. In some embodiments, the filtering or smoothing operations performed by the in-loop filter 545 include deblock filter (DBF or DBK) , sample adaptive offset (SAO) , and / or adaptive loop filter (ALF) . In some embodiments, luma mapping chroma scaling (LMCS) is performed before the loop filters.

[0064] In some embodiments, the in-loop filter 545 performs post-processing operations that include the deblock filter followed by several additional filtering stages. The deblock filter revises pixel samples along processing edges (which may be block boundaries) to smooth artifacts. The filtering operations after the deblock filter are performed in a stripe-by-stripe manner and can be performed by stripe-based hardware. When referencing pixels in the same stripe, a filter operation uses the result from a previous filter. However, when referencing pixels from a different stripe, a filter operation uses the result from the deblock filter, particularly when referencing samples below the lower bound of the stripe. In some embodiments, the maximum number of samples above a processing edge to be revised by deblock filtering is set to Mpo, but reduced to Mps if the processing edge separates deblocking process units and is close to a lower bound of a stripe. The post-processing operations of the filter stages are further described in Section I above.

[0065] FIG. 6 conceptually illustrates a process 600 that implements deblock filtering with a changing maximum number of samples above a processing edge. In some embodiments, one or more processing units (e.g., a processor) of a computing device implementing the encoder 500 performs the process 600 by executing instructions stored in a computer readable medium. In some embodiments, an electronic apparatus implementing the encoder 500 performs the process 600.

[0066] The encoder receives (at block 610) data to be encoded as a current picture of a video. The encoder reconstructs (at block 620) the current picture based on the received data. The encoder applies (at block 630) deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units (e.g., 64x64 blocks) . A processing edge may be an edge of a transform unit, a prediction unit, or a coding block (e.g., macroblock, CU, CTU) . In some embodiments, the deblocking operations are pipelined in hardware with each pipeline stage processing one 64x64 process unit.
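
The partition into 64x64 deblocking process units can be sketched as below. This is an illustrative sketch: the helper names, the convention that `edge_y` is the row index of the first sample below a horizontal edge, and the picture size are assumptions rather than details from the text.

```python
PROCESS_UNIT_SIZE = 64  # the 64x64 example given in the text


def process_unit_origins(width, height, unit=PROCESS_UNIT_SIZE):
    """Top-left (x, y) of each process unit, in raster order."""
    return [(x, y) for y in range(0, height, unit) for x in range(0, width, unit)]


def separates_process_units(edge_y, unit=PROCESS_UNIT_SIZE):
    """True if a horizontal edge lies between two vertically adjacent process units.

    edge_y is assumed to be the row index of the first sample below the edge.
    """
    return edge_y > 0 and edge_y % unit == 0


# Hypothetical 128x128 picture: four process units.
origins = process_unit_origins(128, 128)
```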

[0067] The encoder applies (at block 640) a sequence of additional filtering operations on the result of the deblock filtering operations for a stripe of the current picture. Each additional filtering operation uses the result from a previous filtering operation in the sequence when referencing pixels in a first stripe of the current picture, and uses the result from the deblock filter operation when referencing pixels in a second stripe of the current picture. A range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo or reduced to Mps when the processing edge separates two deblocking process units. Mps is 6 and Mpo is 10 for some embodiments. Mpo is 8 in some embodiments.
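
The referencing rule for the additional filtering operations can be sketched as a simple source selection. The function name, the string labels, and the convention that a stripe covers rows `[stripe_top, stripe_bottom)` are assumptions made for illustration.

```python
def reference_source(ref_row, stripe_top, stripe_bottom):
    """Pick which intermediate result an additional filter reads for a referenced row.

    The current stripe is assumed to cover rows [stripe_top, stripe_bottom).
    """
    if stripe_top <= ref_row < stripe_bottom:
        # Same stripe: the output of the previous filter in the sequence is available.
        return "previous_filter"
    # Different stripe (e.g., rows below the lower bound): fall back to deblock output.
    return "deblock"


# Hypothetical stripe covering rows 0..47.
inside = reference_source(40, 0, 48)
below = reference_source(50, 0, 48)
```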

[0068] In some embodiments, when a first processing edge separates deblocking process units and is the closest processing edge (e.g., processing edge 422) below the lower bound of the first stripe (e.g., lower bound 410), Mps is selected to be the range of pixel positions above the first processing edge to be modified by the deblock filtering operation. In some embodiments, when Nreq is a range of pixel positions below the lower bound of the first stripe that are to be used by the sequence of additional filtering operations, Mps is less than a difference between Nreq and a stripe offset between the processing edge and the lower bound of the first stripe (Mps < stripe offset - Nreq).
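
The inequality Mps < stripe offset - Nreq can be checked numerically. In the sketch below, the stripe offset of 16 and Nreq of 8 are placeholder values chosen only for illustration; the values 6 and 10 are the Mps and Mpo examples given in the text.

```python
def mps_satisfies_constraint(mps, stripe_offset, nreq):
    """Check the stated condition Mps < stripe offset - Nreq."""
    return mps < stripe_offset - nreq


# Placeholder geometry (assumed, not taken from the text).
STRIPE_OFFSET = 16
NREQ = 8

reduced_ok = mps_satisfies_constraint(6, STRIPE_OFFSET, NREQ)
full_range_ok = mps_satisfies_constraint(10, STRIPE_OFFSET, NREQ)
```

Under these assumed values the reduced range of 6 satisfies the condition while the full range of 10 does not, which is the situation in which reducing the range to Mps matters.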

[0069] In some embodiments, Mps is smaller than a range of pixel positions below the first processing edge (e.g., processing edge 422 in FIG. 4B) to be modified by the deblock filtering operation. The range of pixel positions below the first processing edge to be modified by the deblock filtering operation is Mpo. In some embodiments, Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge (e.g., processing edge 421) being above the lower bound of the first stripe, and the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.

[0070] In some embodiments, Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge (e.g., processing edge 423) being below the first processing edge (e.g., processing edge 422) and further away from the lower bound of the first stripe (e.g., stripe bound 410) , and the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.
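
The relationships among the three example edges can be summarized in one selection routine. This is a sketch under stated assumptions: the row positions (32, 64, 128), the stripe lower bound at row 48, the strict "below" comparison, and the helper name are hypothetical; Mps = 6 and Mpo = 10 are the example values from the text.

```python
MPO = 10  # example value from the text
MPS = 6   # example value from the text


def rows_modified_above(edge_y, stripe_lower_bound, edge_rows, unit=64):
    """Return how many rows above a horizontal edge the deblock filter may modify.

    Mps applies only to the closest edge below the stripe's lower bound, and only
    if that edge also separates two process units; every other edge keeps Mpo.
    """
    below = [y for y in edge_rows if y > stripe_lower_bound]
    closest_below = min(below) if below else None
    if edge_y == closest_below and edge_y % unit == 0:
        return MPS
    return MPO


# Hypothetical layout: lower bound at row 48, edges at rows 32, 64, and 128,
# standing in for edges above the bound, closest below it, and further below it.
edges = [32, 64, 128]
ranges = {y: rows_modified_above(y, 48, edges) for y in edges}
```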

[0071] The encoder stores (at block 650) result of the sequence of additional filtering operations as filtered current picture for encoding subsequent pictures.

III. Example Video Decoder

[0072] In some embodiments, an encoder may signal (or generate) one or more syntax elements in a bitstream, such that a decoder may parse said one or more syntax elements from the bitstream.

[0073] FIG. 7 illustrates an example video decoder 700 that implements deblock filtering. As illustrated, the video decoder 700 is an image-decoding or video-decoding circuit that receives a bitstream 795 and decodes the content of the bitstream into pixel data of video frames for display. The video decoder 700 has several components or modules for decoding the bitstream 795, including some components selected from an inverse quantization module 714, an inverse transform module 715, an intra-prediction module 725, a motion compensation module 730, an in-loop filter 745, a decoded picture buffer 750, an MV buffer 765, an MV prediction module 775, and a parser 790. The motion compensation module 730 is part of an inter-prediction module 740. The intra-prediction module 725 is part of a current picture prediction module 720, which uses current picture reconstructed samples as reference samples for prediction of the current block.

[0074] In some embodiments, the modules 714–790 are modules of software instructions being executed by one or more processing units (e.g., a processor) of a computing device. In some embodiments, the modules 714–790 are modules of hardware circuits implemented by one or more ICs of an electronic apparatus. Though the modules 714–790 are illustrated as being separate modules, some of the modules can be combined into a single module.

[0075] The parser 790 (or entropy decoder) receives the bitstream 795 and performs initial parsing according to the syntax defined by a video-coding or image-coding standard. The parsed syntax element includes various header elements, flags, as well as quantized data (or quantized coefficients) 712. The parser 790 parses out the various syntax elements by using entropy-coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman encoding.

[0076] The inverse quantization module 714 de-quantizes the quantized data (or quantized coefficients) 712 to obtain transform coefficients 718, and the inverse transform module 715 performs inverse transform on the transform coefficients 718 to produce reconstructed residual signal 719. The reconstructed residual signal 719 is added with predicted pixel data 713 from the intra-prediction module 725 or the motion compensation module 730 to produce decoded pixel data 717. The decoded pixel data 717 is filtered by the in-loop filter 745 and stored in the decoded picture buffer 750. In some embodiments, the decoded picture buffer 750 is a storage external to the video decoder 700. In some embodiments, the decoded picture buffer 750 is a storage internal to the video decoder 700.

[0077] The intra-prediction module 725 receives intra-prediction data from the bitstream 795 and, according to that data, produces the predicted pixel data 713 from the decoded pixel data 717 stored in the decoded picture buffer 750. In some embodiments, the decoded pixel data 717 is also stored in a line buffer 727 (or intra prediction buffer) for intra-picture prediction and spatial MV prediction.

[0078] In some embodiments, the content of the decoded picture buffer 750 is used for display. A display device 705 either retrieves the content of the decoded picture buffer 750 for display directly, or retrieves the content of the decoded picture buffer to a display buffer. In some embodiments, the display device receives pixel values from the decoded picture buffer 750 through a pixel transport.

[0079] The motion compensation module 730 produces predicted pixel data 713 from the decoded pixel data 717 stored in the decoded picture buffer 750 according to motion compensation MVs (MC MVs) . These motion compensation MVs are decoded by adding the residual motion data received from the bitstream 795 with predicted MVs received from the MV prediction module 775.
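
The decoder-side recovery of an MC MV can be sketched as the addition described above. The function name `decode_mc_mv`, the `(x, y)` tuple representation, and the sample values are hypothetical.

```python
def decode_mc_mv(predicted_mv, residual_motion):
    """Recover the MC MV by adding the parsed residual to the predicted MV."""
    return (
        predicted_mv[0] + residual_motion[0],
        predicted_mv[1] + residual_motion[1],
    )


# Hypothetical predicted MV and residual motion data parsed from the bitstream.
decoded_mv = decode_mc_mv((10, -4), (2, 1))
```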

[0080] The MV prediction module 775 generates the predicted MVs based on reference MVs that were generated for decoding previous video frames, e.g., the motion compensation MVs that were used to perform motion compensation. The MV prediction module 775 retrieves the reference MVs of previous video frames from the MV buffer 765. The video decoder 700 stores the motion compensation MVs generated for decoding the current video frame in the MV buffer 765 as reference MVs for producing predicted MVs.

[0081] The in-loop filter 745 performs filtering or smoothing operations on the decoded pixel data 717 to reduce the artifacts of coding, particularly at boundaries of pixel blocks. In some embodiments, the filtering or smoothing operations performed by the in-loop filter 745 include deblock filter (DBF) , sample adaptive offset (SAO) , and / or adaptive loop filter (ALF) . In some embodiments, luma mapping chroma scaling (LMCS) is performed before the loop filters.

[0082] In some embodiments, the in-loop filter 745 performs post-processing operations that include the deblock filter followed by several additional filtering stages. The deblock filter revises pixel samples along processing edges (which may be block boundaries) to smooth artifacts. The filtering operations after the deblock filter are performed in a stripe-by-stripe manner and can be performed by stripe-based hardware. When referencing pixels in the same stripe, a filter operation uses the result from a previous filter. However, when referencing pixels from a different stripe, a filter operation uses the result from the deblock filter, particularly when referencing samples below the lower bound of the stripe. In some embodiments, the maximum number of samples above a processing edge to be revised by deblock filtering is set to Mpo, but reduced to Mps if the processing edge separates 64x64 deblocking process units and is close to a lower bound of a stripe. The post-processing operations of the filter stages are further described in Section I above.

[0083] FIG. 8 conceptually illustrates a process 800 that implements deblock filtering with a changing maximum number of samples above a processing edge. In some embodiments, one or more processing units (e.g., a processor) of a computing device implementing the decoder 700 performs the process 800 by executing instructions stored in a computer readable medium. In some embodiments, an electronic apparatus implementing the decoder 700 performs the process 800.

[0084] The decoder receives (at block 810) data to be decoded as a current picture of a video. The decoder reconstructs (at block 820) the current picture based on the received data. The decoder applies (at block 830) deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units (e.g., 64x64 blocks) . A processing edge may be an edge of a transform unit. In some embodiments, the deblocking operations are pipelined in hardware with each pipeline stage processing one 64x64 process unit.

[0085] The decoder applies (at block 840) a sequence of additional filtering operations on result of the deblock filtering operations for a stripe of the current picture. Each additional filtering operation uses result from a previous filtering operation in the sequence when referencing pixels in a first stripe of the current picture, and uses result from the deblock filter operation when referencing pixels in a second stripe of the current picture. A range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo or reduced to Mps when the processing edge separates two deblocking process units. Mps is 6 and Mpo is 10 for some embodiments. Mpo is 8 in some embodiments.

[0086] In some embodiments, when a first processing edge separates deblocking process units and is the closest processing edge (e.g., processing edge 422) below the lower bound of the first stripe (e.g., lower bound 410), Mps is selected to be the range of pixel positions above the first processing edge to be modified by the deblock filtering operation. In some embodiments, when Nreq is a range of pixel positions below the lower bound of the first stripe that are to be used by the sequence of additional filtering operations, Mps is less than a difference between Nreq and a stripe offset between the processing edge and the lower bound of the first stripe (Mps < stripe offset - Nreq).

[0087] In some embodiments, Mps is smaller than a range of pixel positions below the first processing edge (e.g., processing edge 422 in FIG. 4B) to be modified by the deblock filtering operation. The range of pixel positions below the first processing edge to be modified by the deblock filtering operation is Mpo. In some embodiments, Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge (e.g., processing edge 421) being above the lower bound of the first stripe, and the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.

[0088] In some embodiments, Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge (e.g., processing edge 423) being below the first processing edge (e.g., processing edge 422) and further away from the lower bound of the first stripe (e.g., stripe bound 410) , and the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.

[0089] The decoder provides (at block 850) the result of the sequence of additional filtering operations as filtered current picture for display or output.

IV. Example Electronic System

[0090] Many of the above-described features and applications are implemented as software processes that are specified as a set of instructions recorded on a computer readable storage medium (also referred to as computer readable medium) . When these instructions are executed by one or more computational or processing unit (s) (e.g., one or more processors, cores of processors, or other processing units) , they cause the processing unit (s) to perform the actions indicated in the instructions. Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, random-access memory (RAM) chips, hard drives, erasable programmable read only memories (EPROMs) , electrically erasable programmable read-only memories (EEPROMs) , etc. The computer readable media does not include carrier waves and electronic signals passing wirelessly or over wired connections.

[0091] In this specification, the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software invention described here is within the scope of the present disclosure. In some embodiments, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.

[0092] FIG. 9 conceptually illustrates an electronic system 900 with which some embodiments of the present disclosure are implemented. The electronic system 900 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc. ) , phone, PDA, or any other sort of electronic device. Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media. Electronic system 900 includes a bus 905, processing unit (s) 910, a graphics-processing unit (GPU) 915, a system memory 920, a network 925, a read-only memory 930, a permanent storage device 935, input devices 940, and output devices 945.

[0093] The bus 905 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 900. For instance, the bus 905 communicatively connects the processing unit (s) 910 with the GPU 915, the read-only memory 930, the system memory 920, and the permanent storage device 935.

[0094] From these various memory units, the processing unit (s) 910 retrieves instructions to execute and data to process in order to execute the processes of the present disclosure. The processing unit (s) may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 915. The GPU 915 can offload various computations or complement the image processing provided by the processing unit (s) 910.

[0095] The read-only-memory (ROM) 930 stores static data and instructions that are used by the processing unit (s) 910 and other modules of the electronic system. The permanent storage device 935, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 900 is off. Some embodiments of the present disclosure use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 935.

[0096] Other embodiments use a removable storage device (such as a floppy disk, flash memory device, etc., and its corresponding disk drive) as the permanent storage device. Like the permanent storage device 935, the system memory 920 is a read-and-write memory device. However, unlike storage device 935, the system memory 920 is a volatile read-and-write memory, such as a random access memory. The system memory 920 stores some of the instructions and data that the processor uses at runtime. In some embodiments, processes in accordance with the present disclosure are stored in the system memory 920, the permanent storage device 935, and / or the read-only memory 930. For example, the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit (s) 910 retrieves instructions to execute and data to process in order to execute the processes of some embodiments.

[0097] The bus 905 also connects to the input and output devices 940 and 945. The input devices 940 enable the user to communicate information and select commands to the electronic system. The input devices 940 include alphanumeric keyboards and pointing devices (also called “cursor control devices” ) , cameras (e.g., webcams) , microphones or similar devices for receiving voice commands, etc. The output devices 945 display images generated by the electronic system or otherwise output data. The output devices 945 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD) , as well as speakers or similar audio output devices. Some embodiments include devices such as a touchscreen that function as both input and output devices.

[0098] Finally, as shown in FIG. 9, bus 905 also couples electronic system 900 to a network 925 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an intranet), or a network of networks, such as the Internet. Any or all components of electronic system 900 may be used in conjunction with the present disclosure.

[0099] Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable / rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and / or solid state hard drives, read-only and recordable Blu-ray discs, ultra-density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

[0100] While the above discussion primarily refers to microprocessor or multi-core processors that execute software, many of the above-described features and applications are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) . In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In addition, some embodiments execute software stored in programmable logic devices (PLDs) , ROM, or RAM devices.

[0101] As used in this specification and any claims of this application, the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms “display” or “displaying” mean displaying on an electronic device. As used in this specification and any claims of this application, the terms “computer readable medium,” “computer readable media,” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

[0102] While the present disclosure has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the present disclosure can be embodied in other specific forms without departing from the spirit of the present disclosure. In addition, a number of the figures (including FIG. 6 and FIG. 8) conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, the process could be implemented using several sub-processes, or as part of a larger macro process. Thus, one of ordinary skill in the art would understand that the present disclosure is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.

Additional Notes

[0103] The herein-described subject matter sometimes illustrates different components contained within, or connected with, different other components. It is to be understood that such depicted architectures are merely examples, and that in fact many other architectures can be implemented which achieve the same functionality. In a conceptual sense, any arrangement of components to achieve the same functionality is effectively "associated" such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as "associated with" each other such that the desired functionality is achieved, irrespective of architectures or intermediate components. Likewise, any two components so associated can also be viewed as being "operably connected" , or "operably coupled" , to each other to achieve the desired functionality, and any two components capable of being so associated can also be viewed as being "operably couplable" , to each other to achieve the desired functionality. Specific examples of operably couplable include but are not limited to physically mateable and / or physically interacting components and / or wirelessly interactable and / or wirelessly interacting components and / or logically interacting and / or logically interactable components.

[0104] Further, with respect to the use of substantially any plural and / or singular terms herein, those having skill in the art can translate from the plural to the singular and / or from the singular to the plural as is appropriate to the context and / or application. The various singular / plural permutations may be expressly set forth herein for sake of clarity.

[0105] Moreover, it will be understood by those skilled in the art that, in general, terms used herein, and especially in the appended claims, e.g., bodies of the appended claims, are generally intended as “open” terms, e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc. It will be further understood by those within the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases "at least one" and "one or more" to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles "a" or "an" limits any particular claim containing such introduced claim recitation to implementations containing only one such recitation, even when the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an," e.g., “a” and / or “an” should be interpreted to mean “at least one” or “one or more;” the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number, e.g., the bare recitation of "two recitations," without other modifiers, means at least two recitations, or two or more recitations. Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention, e.g., “a system having at least one of A, B, and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and / or A, B, and C together, etc. In those instances where a convention analogous to “at least one of A, B, or C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention, e.g., “a system having at least one of A, B, or C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and / or A, B, and C together, etc. It will be further understood by those within the art that virtually any disjunctive word and / or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” will be understood to include the possibilities of “A” or “B” or “A and B.”

[0106] From the foregoing, it will be appreciated that various implementations of the present disclosure have been described herein for purposes of illustration, and that various modifications may be made without departing from the scope and spirit of the present disclosure. Accordingly, the various implementations disclosed herein are not intended to be limiting, with the true scope and spirit being indicated by the following claims.

Claims

1. A video coding method comprising:
receiving data to be encoded or decoded as a current picture of a video;
reconstructing the current picture based on the received data;
applying deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units;
applying a sequence of additional filtering operations on result of the deblock filtering operations for a first stripe of the current picture, wherein a range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo and reduced to Mps when the processing edge separates two deblocking process units; and
providing result of the sequence of additional filtering operations as filtered current picture to be stored or outputted.

2. The video coding method of claim 1, wherein a first processing edge separates two deblocking process units and is below the lower bound of the first stripe, and the range of pixel positions above the first processing edge to be modified by the deblock filtering operation is set to Mps, wherein the deblocking process units are 64*64 blocks.

3. The video coding method of claim 2, wherein Nreq is a range of pixel positions below the lower bound of the first stripe that are to be used by the sequence of additional filtering operations, wherein Mps is less than a difference between Nreq and a stripe offset from the processing edge to the lower bound of the first stripe.

4. The video coding method of claim 2, wherein Mps is smaller than a range of pixel positions below the first processing edge to be modified by the deblock filtering operation.

5. The video coding method of claim 1, wherein the range of pixel positions below the first processing edge to be modified by the deblock filtering operation is Mpo.

6. The video coding method of claim 2, wherein Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge being above the lower bound of the first stripe.

7. The video coding method of claim 6, wherein the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.

8. The video coding method of claim 2, wherein Mps is smaller than a range of pixel positions above a second processing edge to be modified by the deblock filtering operation, the second processing edge being below the first processing edge and further away from the lower bound of the first stripe.

9. The video coding method of claim 8, wherein the range of pixel positions above the second processing edge to be modified by the deblock filtering operation is Mpo.

10. The video coding method of claim 1, wherein Mps is 6 and Mpo is 8.

11. The video coding method of claim 1, wherein the processing edges are boundaries of transform units.

12. The video coding method of claim 1, wherein each additional filtering operation uses result from a previous filtering operation in the sequence when referencing pixels in a first stripe of the current picture, and uses result from the deblock filter operation when referencing pixels in a second stripe of the current picture.

13. An electronic apparatus comprising:
a video coder circuit configured to perform operations comprising:
receiving data to be encoded or decoded as a current picture of a video;
reconstructing the current picture based on the received data;
applying deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units;
applying a sequence of additional filtering operations on result of the deblock filtering operations for a first stripe of the current picture, wherein a range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo and reduced to Mps when the processing edge separates two deblocking process units; and
providing result of the sequence of additional filtering operations as filtered current picture to be stored or outputted.

14. A video decoding method comprising:
receiving data to be decoded as a current picture of a video;
reconstructing the current picture based on the received data;
applying deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units;
applying a sequence of additional filtering operations on result of the deblock filtering operations for a first stripe of the current picture, wherein a range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo and reduced to Mps when the processing edge separates two deblocking process units; and
providing result of the sequence of additional filtering operations as filtered current picture for display or output.

15. A video encoding method comprising:
receiving data to be encoded as a current picture of a video;
reconstructing the current picture based on the received data;
applying deblock filtering operations on processing edges of the reconstructed current picture in deblocking process units;
applying a sequence of additional filtering operations on result of the deblock filtering operations for a first stripe of the current picture, wherein a range of pixel positions above a processing edge to be modified by the deblock filtering operation is set to Mpo and reduced to Mps when the processing edge separates two deblocking process units; and
storing result of the sequence of additional filtering operations as filtered current picture for encoding subsequent pictures.
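
The selection rule recited in claims 1, 2, 3 and 10 can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the applicant's implementation. The function names select_mp and mps_fits, the use of row coordinates that increase downward, the restriction to horizontal processing edges, and the test for a deblocking process unit boundary (edge row divisible by 64) are all assumptions introduced for illustration. The caller is assumed to pass only edges in the vicinity of the stripe being processed.

```python
MPO = 8        # default range of rows modified above an edge (claim 10)
MPS = 6        # reduced range of rows modified above an edge (claim 10)
DPU_SIZE = 64  # deblocking process unit height in rows (claim 2)


def select_mp(edge_y, stripe_lower_y, dpu_size=DPU_SIZE, mpo=MPO, mps=MPS):
    """Return how many rows above a horizontal edge deblocking may modify.

    Assumption: rows are numbered top to bottom, so "below the lower
    bound of the stripe" means edge_y > stripe_lower_y.
    """
    # Assumption: an edge separates two deblocking process units when
    # its row lies on a multiple of the unit height.
    separates_dpus = edge_y % dpu_size == 0
    below_stripe_bound = edge_y > stripe_lower_y
    # Claim 2: reduce the range only for a unit-separating edge that lies
    # below the stripe's lower bound; every other edge keeps Mpo.
    return mps if (separates_dpus and below_stripe_bound) else mpo


def mps_fits(mps, nreq, stripe_offset):
    """Literal reading of claim 3: Mps < Nreq - stripe offset."""
    return mps < nreq - stripe_offset


# Edge on a unit boundary (row 64), stripe lower bound at row 60.
print(select_mp(edge_y=64, stripe_lower_y=60))  # 6
# Edge above the stripe's lower bound keeps the full range.
print(select_mp(edge_y=56, stripe_lower_y=60))  # 8
```

The stripe position used above (lower bound at row 60, giving a stripe offset of 4) is a hypothetical value chosen only to exercise both branches; the claims do not state Nreq or the stripe offset numerically.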