Time-domain gradable video encoding method for implementing real-time double-frame reference

A time-domain scalable video coding technology, applied in the field of video coding, that achieves flexible frame rate scaling and delay limitation

Inactive Publication Date: 2008-07-16
WUHAN UNIV
0 Cites 23 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0009] (1) Compatibility with non-scalable coding standards
[0010] (2) Refe...

Method used

In the above-mentioned example of the present invention, a series of sequences have been carried out coding performance test, and test result shows: adopt the real-time two-frame reference time domain scalable method that the p...

Abstract

The invention discloses a time-domain scalable video coding method that realizes real-time dual-frame reference. The method uses few or no bidirectionally predicted frames in the coding process, and the current frame uses at most two reference frames; the coded bitstream is therefore hierarchical in the time domain and can satisfy the requirements of frame rate adjustment and delay limitation. The method is realized as follows: the time-domain level of the current image is calculated from its display order; the reference image of the current frame is then obtained according to the reference frame selection strategy of the invention; after the current image is coded, the reference frame buffer is updated according to the reference frame update strategy of the invention. The invention realizes time-domain scalable coding of a real-time dual-frame reference video stream, allows the frame rate of the bitstream to be adjusted flexibly, and, compared with prior coding standards, limits the delay.



Example Embodiment

[0029] The present invention provides a time-domain scalable video encoding method with real-time dual-frame reference. Specifically, the encoding order of the video images is consistent with their display order, the current frame uses at most two reference frames during encoding, and the encoded code stream is hierarchical in the time domain. During encoding, the time-domain level of the current image is first calculated from its display order, and the reference image of the current frame is then obtained according to the reference frame selection strategy. After the current image is encoded, the reference frame buffer is updated according to the reference frame update strategy.
[0030] The present invention will be further described below in conjunction with the embodiments and drawings, but the present invention is not limited.
[0031] The invention provides a real-time dual-frame reference time-domain scalable coding method based on video streams of the existing IPP...P non-scalable video coding standard. The theoretical basis is as follows: a P frame at the current time-domain level serves as a reference frame for P frames at the same level and at the next level, so that the generation of P frames within a group of pictures forms a hierarchical iterative structure (see Figure 1). The strategy shown in Figure 6 is used to calculate the time-domain level of the current frame. When the reference frame of the current coded frame is obtained, the reference frame selection strategy shown in Figure 7 is adopted. After a frame has been encoded and decoded, the reference frame update strategy shown in Figure 8 is adopted. Encoding and decoding proceed in the display order of the frames. Compared with the existing IPP...P non-scalable video coding process, the time-domain distance between the reference frame and the coded frame is shortened, so the correlation between them can be better exploited. The frame rate becomes adjustable, and the coding and decoding delay can be greatly reduced, down to zero delay (see Figure 2 and Figure 4). The present invention is also applicable to the time-domain scalable coding structure in which the GOP size is not an integer power of 2, as shown in FIG. 2. In addition, one or two B frames can be inserted between all adjacent frames in each GOP; Figure 5 is a schematic diagram of the time-domain hierarchical structure after one B frame is inserted.
[0032] In the present invention, the code stream is layered in the time domain, and the bottom layer (base layer) is compatible with the non-scalable video coding standard. The time-domain level of every coded frame in the current group of pictures is calculated and labeled according to the display order of the frame. Coding starts from the first frame (display order 0), which is intra-frame predictively coded (an I frame). The first P frame of each time-domain level, and the P frame adjacent to the I frame at the same level, use the I frame as the reference frame. Each remaining P frame selects as references the one or two forward frames whose time-domain level is the same as or lower than its own and whose display order is closest to that of the current frame. In this process, P frames with the highest time-domain level (that is, P frames with an odd display sequence number) are not used as references. During the reference frame update, reconstructed images are stored in the reference frame buffer in order of buffer sequence number until the buffer is full. Once the buffer is full, the reconstructed image of the current frame replaces a frame in the buffer that satisfies two conditions: its time-domain level is the same as that of the current frame, and its display order is the smallest among the reference frames in the buffer.
[0033] 1. The time-domain scalable video coding method provided by the present invention adopts a method including the following steps:
[0034] (1) Time domain layering of code stream:
[0035] The code stream is divided into a base layer and enhancement layers. The base layer is encoded with the non-scalable video coding standard structured as IPP...P; it corresponds to the lowest time-domain resolution for video transmission and for terminal decoding and display, and to level zero of the time-domain hierarchy. The enhancement layers consist of P frames, whose time-domain levels are determined from their respective display orders; time-domain layer numbers and enhancement layer numbers correspond one to one, and temporal scalability is achieved through flexible selection of P frames. When a group of pictures is coded, its frames are encoded in real time in display order.
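The frame rate adjustment that this layering allows can be illustrated with a small sketch. It assumes the dyadic level assignment used in this document (GOP size equal to 2 raised to the highest level), in which case a decoder that keeps only levels 0 through k retains 2^k frames per group of pictures. The function name `frame_rate_at_level` and its parameters are hypothetical and do not appear in the original text.

```c
#include <assert.h>

/* Sketch only: frame rate obtained when frames above target_level are
 * dropped, assuming gop_size == 2^max_temporal_level so that each
 * discarded level halves the frame rate. */
double frame_rate_at_level(double full_rate, int max_temporal_level,
                           int target_level)
{
    assert(full_rate > 0.0);
    assert(0 <= target_level && target_level <= max_temporal_level);
    assert(max_temporal_level < 31);
    return full_rate / (double)(1 << (max_temporal_level - target_level));
}
```

For example, with a 30 frames-per-second source and three enhancement levels, keeping only the base layer would give 30 / 8 frames per second under these assumptions.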
[0036] (2) Check the legality of the time domain gradable parameter settings in the configuration file:
[0037] Specifically, check whether the GOP size is an integer power of 2 (GOP is the abbreviation of group of pictures). If the parameter setting is found to be illegal, the program exits and the encoding process fails.
[0038] (3) Calculate the time domain level of each coded frame in the current image group, mark the coded frame at the time domain level, and update the coding configuration parameters.
[0039] In this process, the time-domain level of the I frame and P frames of the base layer is set to 0, and the levels of the remaining P frames are calculated with the time-domain level calculation algorithm.
[0040] Updating the original encoding configuration parameters means setting the coded image type to frame coding, setting the frame-skipping frequency, and setting the number of P frames inserted between an I frame and a P frame, or between two P frames, to the size of the group of pictures minus 1. At the same time, the reference frame storage unit is updated.
[0041] (4) Obtain the reference frame of the current coded image:
[0042] If the current frame is an I frame, there is no reference frame, and the intra-frame coding and decoding process in the existing standard is directly executed; if the current frame is a P frame, the reference frame is one or two forward reference frames that meet the conditions:
[0043] 1) Its time-domain level is lower than or equal to that of the current frame;
[0044] 2) Its display order is the closest to that of the current frame.
[0045] That is, starting from the current coded frame, search forward for the one or two images that are closest to the current coded frame and whose time-domain level is lower than or equal to that of the current coded frame, and use them as its reference frames.
[0046] (5) Perform motion prediction and motion compensation, discrete cosine transform, quantization, and entropy coding of residual information, reference frame index and motion vector on the current coded image. This process is the same as that of non-scalable video coding.
[0047] (6) Save the reconstructed image of the current frame (except for P frames with the highest temporal level) into a temporary array of coded reconstructed images. This array holds the reconstructed images of all frames in a group of pictures whose temporal level is lower than the highest temporal level, together with the reconstructed I frame or P frame of the previous group of pictures, so that the reference frame can be correctly obtained in step (4);
[0048] (7) Repeat the process from step 4 to step 6 until the last image of the required time domain level is reached;
[0049] (8) Save the reconstructed image:
[0050] In this process, it is necessary to determine whether the condition for writing reconstructed frames to the reconstructed image file is met. If it is met, all reconstructed frames in the reconstructed image array of the group of pictures whose time-domain level is lower than the highest level are output, the coding of this group of pictures ends, and coding of the next group of pictures begins; if it is not met, coding of the current group of pictures continues.
[0051] 2. The specific implementation process of the time-domain scalable video coding method provided by the present invention:
[0052] (1) Corresponding to step one of the above method, consistent with the non-scalable video coding process.
[0053] (2) Check the size of the image group. Suppose the size of the image group is gop_size, and this parameter should satisfy:
[0054] gop_size = 2^x (0 ≤ x ≤ max_temporal_level)
[0055] In the above formula, max_temporal_level is the maximum number of temporal levels, and x must be an integer.
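A minimal sketch of this legality check follows. The variable names `gop_size` and `max_temporal_level` come from the text; the function name `gop_size_is_legal` is an assumption made for illustration, and the original encoder may perform the check differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch only: returns true when gop_size == 2^x for some integer x
 * with 0 <= x <= max_temporal_level. */
bool gop_size_is_legal(int gop_size, int max_temporal_level)
{
    assert(max_temporal_level >= 0 && max_temporal_level < 31);
    if (gop_size <= 0)
        return false;
    /* A power of two has exactly one bit set. */
    if ((gop_size & (gop_size - 1)) != 0)
        return false;
    /* x <= max_temporal_level is equivalent to gop_size <= 2^max. */
    return gop_size <= (1 << max_temporal_level);
}
```

An encoder following step (2) would exit with an error when this function returns false.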
[0056] Let current_temporal_level be the temporal level of the current encoded image, PicList[i].temporal_level the temporal level of a frame in the reference frame buffer, img->tr the display sequence number of the current frame, and PicList[i].pic_distance the display sequence number of a frame in the reference frame buffer.
[0057] (3) Calculate the temporal hierarchy of each coded image in the current image group, which is one of the cores of the present invention. The time domain level of the current image is determined by the display order of the current image and the number of time domain levels that need to be realized. The specific calculation method of the time domain level is:
[0058] If the display sequence number is an integer multiple n of gop_size, the time-domain level of the current frame is 0; n = 1, 2, 3, 4, 5,...;
[0059] If the above condition is not met, the level is determined as follows:
[0060] 1) For all P frames with an odd display sequence number, the time-domain level is the highest, namely the base-2 logarithm of the GOP size;
[0061] 2) For all P frames with a display sequence number of 2n, the time-domain level is the level of the P frames in step 1) minus one; n=1, 3, 5, 7,...;
[0062] 3) For all P frames with a display sequence number of 4n, the time-domain level is the level of the P frames in step 2) minus one; n=1, 3, 5, 7,...;
[0063] 4) For all P frames with a display sequence number of 8n, the time-domain level is the level of the P frames in step 3) minus one; n=1, 3, 5, 7,...;
[0064] By analogy, all frames can be assigned and labeled with a time-domain level; the image coding type of the current frame is frame coding.
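The rules above amount to subtracting one level for every factor of 2 in the display sequence number. The following is a minimal sketch of that reading, assuming gop_size is an integer power of 2 and treating display sequence number 0 (the I frame) as level 0; the function names `temporal_level` and `log2_pow2` are hypothetical.

```c
#include <assert.h>

/* Base-2 logarithm of a power-of-two value. */
static int log2_pow2(int v)
{
    int x = 0;
    while (v > 1) {
        v >>= 1;
        ++x;
    }
    return x;
}

/* Sketch only: time-domain level of the frame with display sequence
 * number tr, for a GOP size that is an integer power of 2. */
int temporal_level(int tr, int gop_size)
{
    assert(tr >= 0);
    assert(gop_size > 0 && (gop_size & (gop_size - 1)) == 0);
    if (tr % gop_size == 0)
        return 0;                     /* base layer, including tr == 0 */
    int level = log2_pow2(gop_size);  /* level of odd-numbered frames */
    while ((tr & 1) == 0) {           /* each factor of 2 lowers the level */
        tr >>= 1;
        --level;
    }
    return level;
}
```

With gop_size = 8, display sequence numbers 0 through 8 map to levels 0, 3, 2, 3, 1, 3, 2, 3, 0 under this sketch.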
[0065] (4) Obtaining the reference frame of the coded image. This process is also one of the cores of the present invention. To obtain the reference frame of the current coded image, the specific selection strategy is:
[0066] In this process, first define the structure DecodedPicture:
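[0067] typedef struct {      /* byte is assumed to be an 8-bit sample type defined elsewhere in the encoder */
[0068] byte ** imgY;         /* luma samples of the reconstructed image */
[0069] byte *** imgUV;       /* chroma samples (U and V planes) */
[0070] int pic_distance;     /* display sequence number of the frame */
[0071] int temporal_level;   /* time-domain level of the frame */
[0072] } DecodedPicture;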
[0073] When global variable memory is allocated in the encoding main program, memory must be allocated for an array of DecodedPicture type, PicList[gop_size].
[0074] This array, also called the reconstructed image array of the group of pictures, stores the reconstructed image of each coded frame in the group of pictures during encoding. When a P frame is encoded, the present invention uses a nearest-neighbor search to acquire the one or two required reference frames, subject to the following conditions:
[0075] 1) The time domain level of the reference frame is lower than or equal to the current frame;
[0076] 2) Among the frames satisfying condition 1), the reference frames are the two forward frames whose display order is closest to that of the current frame. If only one forward frame satisfies the condition, there is only one reference frame.
[0077] That is, starting from the current coded frame, search forward in the reconstructed image array for the two images that are closest to the current coded frame and whose time-domain level is lower than or equal to that of the current coded frame, and use them as its reference frames (if only one forward frame meets the condition, there is only one reference frame). After the reference frame is found, sub-pixel interpolation is performed if it is needed. If the current coded frame is not a P frame of the enhancement layer (that is, a P frame whose time-domain level is greater than 0), obtain the reference image specified for the current frame type by the non-scalable IPP...P coding standard.
[0078] (5) This step is the same as the coding process of the non-scalable video coding standard; the one or two reference frames obtained in step (4) are used following the flow of the non-scalable video coding scheme.
[0079] (6) Save the reconstructed image obtained in step (5) into the PicList array so that the reference frames of coded images at the next level can be obtained. The process of saving the reconstructed image is also one of the cores of the present invention.
[0080] The reference frame storage unit is updated as follows. In the process of saving the reconstructed image, it is necessary to determine whether the condition for writing reconstructed frames to the reconstructed image file is met. If it is met, the reconstructed image is saved according to the reference frame update method below, the coding of this group of pictures ends, and coding of the next group of pictures begins; if it is not met, coding of the current group of pictures continues. The reference frame update strategy distinguishes the following two situations:
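The nearest-neighbor search can be sketched as follows. The sketch uses a reduced, hypothetical `RefPic` record that keeps only the two fields of DecodedPicture relevant to selection, and the function name `select_reference_frames` is an assumption; sub-pixel interpolation and the base-layer case are left out.

```c
#include <assert.h>

/* Hypothetical reduced form of DecodedPicture without pixel planes. */
typedef struct {
    int pic_distance;    /* display sequence number */
    int temporal_level;  /* time-domain level */
} RefPic;

/* Sketch only: picks up to two earlier frames whose level does not
 * exceed cur_level and whose display number is closest to cur_tr.
 * ref_idx[0] is the nearest, ref_idx[1] the second nearest (-1 if
 * absent). Returns the number of references found (0, 1 or 2). */
int select_reference_frames(const RefPic *pic_list, int count,
                            int cur_tr, int cur_level, int ref_idx[2])
{
    assert(pic_list != 0 || count == 0);
    ref_idx[0] = -1;
    ref_idx[1] = -1;
    for (int i = 0; i < count; ++i) {
        if (pic_list[i].pic_distance >= cur_tr)
            continue;                 /* only frames displayed earlier */
        if (pic_list[i].temporal_level > cur_level)
            continue;                 /* level must be lower or equal */
        if (ref_idx[0] < 0 ||
            pic_list[i].pic_distance > pic_list[ref_idx[0]].pic_distance) {
            ref_idx[1] = ref_idx[0];  /* old nearest becomes second */
            ref_idx[0] = i;
        } else if (ref_idx[1] < 0 ||
                   pic_list[i].pic_distance >
                       pic_list[ref_idx[1]].pic_distance) {
            ref_idx[1] = i;
        }
    }
    return (ref_idx[0] >= 0) + (ref_idx[1] >= 0);
}
```

A linear scan is used here because the array holds at most gop_size entries; the original encoder may organize the search differently.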
[0081] 1) When the reference frame buffer is not full, reconstructed images are stored in the buffer in coding order, slot by slot, until the buffer is full;
[0082] 2) When the reference frame buffer is full, the reconstructed image of the current frame must replace a frame in the buffer. The replacement criteria are:
[0083] a) The temporal level of the replaced frame is equal to the temporal level of the reconstructed image of the current frame;
[0084] b) The display sequence number of the replaced frame is the smallest among all reference frames in the reference frame buffer.
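The two update cases can be sketched as below. The `RefPic` and `RefBuffer` types and the function name `update_reference_buffer` are hypothetical. The sketch reads criteria a) and b) together as "the frame with the smallest display sequence number among those at the same level", and it leaves the buffer unchanged when no frame at that level exists; the text does not specify that case, so this is an assumption.

```c
#include <assert.h>

/* Hypothetical reduced form of DecodedPicture without pixel planes. */
typedef struct {
    int pic_distance;    /* display sequence number */
    int temporal_level;  /* time-domain level */
} RefPic;

typedef struct {
    RefPic *slots;   /* storage for reference frames */
    int capacity;    /* total number of slots */
    int used;        /* number of slots currently filled */
} RefBuffer;

/* Sketch only: stores recon and returns the slot index used, or -1 if
 * the buffer is full and holds no frame at the same level. */
int update_reference_buffer(RefBuffer *buf, RefPic recon)
{
    assert(buf && buf->slots && buf->capacity > 0);
    assert(buf->used >= 0 && buf->used <= buf->capacity);
    if (buf->used < buf->capacity) {      /* case 1: buffer not full */
        buf->slots[buf->used] = recon;
        return buf->used++;
    }
    int victim = -1;                      /* case 2: buffer full */
    for (int i = 0; i < buf->capacity; ++i) {
        if (buf->slots[i].temporal_level != recon.temporal_level)
            continue;                     /* criterion a): same level */
        if (victim < 0 ||
            buf->slots[i].pic_distance < buf->slots[victim].pic_distance)
            victim = i;                   /* criterion b): oldest frame */
    }
    if (victim >= 0)
        buf->slots[victim] = recon;
    return victim;
}
```

Replacing within the same level keeps at least one frame of every lower level available, which is what the reference selection conditions rely on.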
[0085] (7) Repeat steps (3) to (5) until the last coded image in the display sequence has been reached.
[0086] Under low-latency conditions, the hierarchical P-frame technique of the present invention allows B frames to be inserted: one or two B frames can be inserted between two adjacent frames in each GOP, forming a more flexible coding structure.
[0087] 3. Effects achieved:
[0088] In the above example of the present invention, coding performance was tested on a series of sequences. The results show that sequences coded with the real-time dual-frame reference time-domain scalable method provided by the present invention have an adjustable frame rate and greatly reduced delay, with essentially no increase in complexity compared with the original IPPP... structure.
[0089] Analysis of the performance test curve shown in Figure 9 shows that, compared with the traditional IPP...P non-scalable coding structure, coding with hierarchical P frames yields a PSNR gain of 0.125 dB.
[0090] The time-domain scalable video coding method provided by the present invention can also be applied to a time-domain scalable coding structure in which the GOP size is not an integer power of 2.


