Method, apparatus and device for determining decoding configuration parameters, and storage medium
By adjusting the decoding and rendering configuration parameters of the encoding device, the problem of insufficient encoding and decoding quality in low-latency and high-frame-rate scenarios was solved, achieving more efficient decoding performance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date
- 2022-01-27
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies cannot meet the decoding configuration parameter requirements of video applications in low-latency and high-frame-rate scenarios by changing the type of decoding chip, resulting in insufficient encoding and decoding quality.
The initial decoding and rendering configuration information of the encoding device is determined, and the decoding and rendering parameters are adjusted when a threshold is not met, so that the resulting configuration meets the requirements of low-latency, high-frame-rate scenarios.
It improves encoding and decoding quality, especially in low-latency and high-frame-rate scenarios, achieving more efficient decoding performance.
Figure CN116567244B_ABST
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to a method, apparatus, device, and storage medium for determining decoding configuration parameters.
Background Technology
[0002] Video has many applications, such as video calls, video players, and screen sharing. Different scenarios have different requirements for video. For example, low latency is required in video calls, high resolution is required in video playback, and both high resolution and low latency are required in screen sharing.
[0003] To meet the video requirements of different scenarios, this can be achieved by configuring the decoding parameters of the decoding device. For example, by comparing the decoding latency of different decoding chips, the chip with the lowest decoding latency can be selected as the target decoding chip to achieve low latency. Alternatively, by comparing the highest resolution supported by different decoding chips, the chip with the highest resolution can be selected as the target decoding chip to achieve high resolution.
[0004] In other words, the current focus is more on the impact of the type of decoding chip on the decoding configuration. However, for low-latency and high-frame-rate scenarios, changing the type of decoding chip cannot provide the required decoding configuration parameters.
Summary of the Invention
[0005] This application provides a method, apparatus, device, and storage medium for determining decoding configuration parameters to obtain target decoding and rendering configuration information that meets the requirements of low latency and high frame rate scenarios, thereby improving decoding performance.
[0006] In a first aspect, this application provides a method for determining decoding configuration parameters, applied to an encoding device, including:
[0007] Determine the initial decoding and rendering configuration information of the encoding device, wherein the initial decoding and rendering configuration information includes the initial decoding parameters and initial rendering parameters of the encoding device;
[0008] Decode the test bitstream under the initial decoding and rendering configuration information to obtain the initial decoding output frame rate and initial single-frame decoding latency corresponding to the initial decoding and rendering configuration information;
[0009] If at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, then at least one of the initial decoding parameters and the initial rendering parameters is adjusted to obtain the target decoding rendering configuration information of the encoding device, wherein the decoding output frame rate and single-frame decoding delay corresponding to the target decoding rendering configuration both meet the corresponding threshold.
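The three steps above can be sketched as a small probe-and-adjust loop. This is a minimal illustration only, not the claimed implementation: the parameter names (`low_latency_mode`, `render_vsync`), the `probe` callback, and the threshold directions (frame rate at or above a minimum, latency at or below a maximum) are assumptions introduced here, since the text only states that the values must "meet the corresponding threshold".

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Config = Dict[str, object]
# A probe decodes the test bitstream under a configuration and returns
# (decoding output frame rate in fps, single-frame decoding latency in ms).
Probe = Callable[[Config], Tuple[float, float]]


def meets_thresholds(fps: float, latency_ms: float,
                     min_fps: float, max_latency_ms: float) -> bool:
    # Assumed threshold directions: higher frame rate and lower latency are better.
    return fps >= min_fps and latency_ms <= max_latency_ms


def find_target_config(initial: Config,
                       adjustments: Iterable[Config],
                       probe: Probe,
                       min_fps: float,
                       max_latency_ms: float) -> Optional[Config]:
    """Return the first configuration whose probe result meets both thresholds."""
    fps, latency = probe(initial)
    if meets_thresholds(fps, latency, min_fps, max_latency_ms):
        return initial
    for adjustment in adjustments:
        # Each adjustment overrides one or more decoding or rendering parameters.
        trial = {**initial, **adjustment}
        fps, latency = probe(trial)
        if meets_thresholds(fps, latency, min_fps, max_latency_ms):
            return trial
    return None  # no probed configuration met both thresholds


def fake_probe(cfg: Config) -> Tuple[float, float]:
    # Stand-in for a real decode of the test bitstream; the numbers are invented.
    return (60.0, 12.0) if cfg.get("low_latency_mode") else (45.0, 30.0)


initial = {"low_latency_mode": False, "render_vsync": True}
target = find_target_config(
    initial,
    [{"render_vsync": False}, {"low_latency_mode": True}],
    fake_probe, min_fps=60.0, max_latency_ms=16.0)
```

In this toy run the second adjustment is the first one to satisfy both thresholds, so `target` keeps the initial rendering parameter and changes only the decoding parameter.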
[0010] In a second aspect, this application provides an apparatus for determining decoding configuration parameters, applied to an encoding device, comprising:
[0011] A determining unit is configured to determine the initial decoding and rendering configuration information of the encoding device, wherein the initial decoding and rendering configuration information includes the initial decoding parameters and initial rendering parameters of the encoding device;
[0012] A detection unit is configured to decode the test bitstream under the initial decoding and rendering configuration information to obtain the initial decoding output frame rate and initial single-frame decoding latency corresponding to the initial decoding and rendering configuration information;
[0013] An adjustment unit is configured to adjust at least one of the initial decoding parameters and initial rendering parameters if at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, thereby obtaining the target decoding rendering configuration information of the encoding device, wherein the decoding output frame rate and single-frame decoding delay corresponding to the target decoding rendering configuration both meet the corresponding threshold.
[0014] In a third aspect, an electronic device is provided, comprising a processor and a memory for storing a computer program, the processor being configured to call and run the computer program stored in the memory to perform the method of the first aspect.
[0015] In a fourth aspect, a computer-readable storage medium is provided for storing a computer program that causes a computer to perform the method of the first aspect.
[0016] In a fifth aspect, a chip is provided for implementing the method of the first aspect or its various implementations described above. Specifically, the chip includes a processor for retrieving and running a computer program from a memory, causing a device equipped with the chip to perform the method of the first aspect or its various implementations described above.
[0017] In a sixth aspect, a computer program product is provided, including computer program instructions that cause a computer to perform the methods described in the first aspect or its various implementations.
[0018] In a seventh aspect, a computer program is provided that, when run on a computer, causes the computer to perform the methods described in the first aspect or its various implementations.
[0019] In summary, in this application, the encoding device determines its initial decoding and rendering configuration information, which includes initial decoding parameters and initial rendering parameters. Under this initial decoding and rendering configuration information, the test bitstream is decoded to obtain the corresponding initial decoding output frame rate and initial single-frame decoding latency. If at least one of the initial decoding output frame rate and the initial single-frame decoding latency does not meet its corresponding threshold, at least one of the initial decoding parameters and the initial rendering parameters is adjusted to obtain the target decoding and rendering configuration information of the encoding device, for which both the decoding output frame rate and the single-frame decoding latency meet their corresponding thresholds. That is, in this embodiment, both the decoding parameters and the rendering parameters of the encoding device are considered when determining the target decoding and rendering configuration information. By adjusting these parameters, target decoding and rendering configuration information that satisfies low-latency, high-frame-rate scenarios is determined, and using it for encoding and decoding can improve encoding and decoding quality. In addition, in some embodiments, the initial decoding and rendering configuration information is the optimal decoding and rendering configuration information for multiple devices. Starting from this initial information, a limited number of probes is enough to find the target decoding and rendering configuration information of the encoding device among the complex configuration combinations, which is efficient.
Attached Figure Description
[0020] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0021] Figure 1 is a schematic diagram illustrating an application scenario according to an embodiment of this application;
[0022] Figure 2 is a schematic block diagram of the video encoder provided in the embodiments of this application;
[0023] Figure 3 is a schematic block diagram of the video decoder provided in the embodiments of this application;
[0024] Figure 4 is a flowchart illustrating a method for determining decoding configuration parameters according to an embodiment of this application;
[0025] Figure 5 is a schematic diagram illustrating the detection of decoding and rendering parameters according to an embodiment of this application;
[0026] Figure 6 is a flowchart illustrating a method for determining decoding configuration parameters according to an embodiment of this application;
[0027] Figure 7 is a schematic diagram of the structure of a device for determining decoding configuration parameters provided in an embodiment of this application;
[0028] Figure 8 is a schematic block diagram of the electronic device provided in the embodiments of this application.
Detailed Implementation
[0029] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0030] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "possessing," and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or server that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or devices.
[0031] First, the relevant concepts involved in the embodiments of this application will be introduced:
[0032] Decoding frame stacking: refers to the phenomenon that a hardware decoder only starts to output decoded images after a certain number of video frames have been input.
[0033] Decoding single-frame latency: refers to the time difference between when a video frame is input to the hardware decoder and when the hardware decoder outputs that video frame. For decoding chips that do not store frames, decoding single-frame latency is the actual decoding latency. For decoding chips that store frames, decoding single-frame latency includes the storage time of the previous few frames.
[0034] Decoding input frame rate: refers to the frequency at which the video bitstream is sent to the decoder.
[0035] Decoding output frame rate: refers to the frequency at which the decoder outputs video image frames.
[0036] Single-frame reference in video encoding: means that when a video frame is encoded, only the image content of the current frame and the previous frame is referenced.
[0037] Multi-frame reference in video encoding refers to the process of referencing the image content of the current frame and several previous frames during video encoding. For some frame-stacking models, multi-frame reference in video encoding can lead to an increase in the number of frames stored by the chip.
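The definitions of single-frame latency and frame rate above can be made concrete with a small measurement sketch. It is illustrative only: the function names, the millisecond timestamps, and the example decoder that holds back two frames before producing output are assumptions made here, not values from this application.

```python
from typing import Dict, List


def single_frame_latencies(input_ms: Dict[int, int],
                           output_ms: Dict[int, int]) -> Dict[int, int]:
    """Latency of each frame: output timestamp minus input timestamp (ms)."""
    return {fid: output_ms[fid] - input_ms[fid]
            for fid in output_ms if fid in input_ms}


def frame_rate_fps(timestamps_ms: List[int]) -> float:
    """Average frame rate over a list of timestamps given in milliseconds."""
    if len(timestamps_ms) < 2:
        return 0.0
    ordered = sorted(timestamps_ms)
    span = ordered[-1] - ordered[0]
    return (len(ordered) - 1) * 1000.0 / span if span > 0 else 0.0


# Frames are fed every 10 ms. The hypothetical decoder stacks two frames,
# so frame k only comes out 2 ms after frame k + 2 has been fed in.
fed = {0: 0, 1: 10, 2: 20, 3: 30, 4: 40}
emitted = {0: 22, 1: 32, 2: 42}

latencies = single_frame_latencies(fed, emitted)
input_fps = frame_rate_fps(list(fed.values()))
output_fps = frame_rate_fps(list(emitted.values()))
```

Here every emitted frame shows a latency of 22 ms even though the assumed per-frame decode work is only 2 ms, which is the effect described for decoding chips that store frames: the measured single-frame latency includes the time spent waiting for the stacked frames.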
[0038] Figure 1 is a schematic diagram of an application scenario according to an embodiment of this application, including a cloud server 101 and a terminal device 102. The cloud server 101 can be understood as an encoding device, and the terminal device 102 can be understood as a decoding device.
[0039] The cloud server 101 is used to encode (can be understood as compression) video data to generate a bitstream and transmit the bitstream to the terminal device 102.
[0040] The cloud server 101 in this application embodiment can be understood as a device with a video encoding function, and the terminal device 102 can be understood as a device with a video decoding function. That is, the cloud server 101 and the terminal device 102 in this application embodiment cover a wide range of devices, such as smartphones, desktop computers, mobile computing devices, notebook computers (e.g., laptops), tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, in-vehicle computers, etc.
[0041] In some embodiments, cloud server 101 may transmit encoded video data (such as a bitstream) to terminal device 102 via channel 103. Channel 103 may include one or more media and / or devices capable of transmitting encoded video data from cloud server 101 to terminal device 102.
[0042] In one example, channel 103 includes one or more communication media that enable cloud server 101 to transmit encoded video data directly to terminal device 102 in real time. In this example, cloud server 101 can modulate the encoded video data according to a communication standard and transmit the modulated video data to terminal device 102. The communication media includes wireless communication media, such as radio frequency spectrum; optionally, the communication media may also include wired communication media, such as one or more physical transmission lines.
[0043] In another example, channel 103 includes a storage medium that can store video data encoded by cloud server 101. The storage medium includes various local access data storage media, such as optical discs, DVDs, flash memory, etc. In this example, terminal device 102 can retrieve the encoded video data from this storage medium.
[0044] In another example, channel 103 may include a storage server that can store the video data encoded by cloud server 101. In this example, terminal device 102 can download the stored encoded video data from the storage server. Optionally, the storage server can store and transmit the encoded video data to terminal device 102, such as a web server (e.g., for a website), a file transfer protocol (FTP) server, etc.
[0045] In some embodiments, the cloud server 101 includes a video encoder and an output interface. The output interface may include a modulator / demodulator (modem) and / or a transmitter.
[0046] In some embodiments, the cloud server 101 may include a video source in addition to a video encoder and an output interface.
[0047] The video source may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface, and a computer graphics system, wherein the video input interface is used to receive video data from a video content provider, and the computer graphics system is used to generate the video data.
[0048] A video encoder encodes video data from a video source to produce a bitstream. Video data may include one or more pictures or a sequence of pictures. The bitstream carries the encoding information of the pictures or picture sequences as a stream of bits. The encoding information may include encoded image data and associated data. Associated data may include a sequence parameter set (SPS), a picture parameter set (PPS), and other syntax structures. An SPS may contain parameters applied to one or more sequences. A PPS may contain parameters applied to one or more pictures. A syntax structure refers to a set of zero or more syntax elements arranged in a specified order within the bitstream.
[0049] The video encoder transmits the encoded video data directly to the terminal device 102 via the output interface. The encoded video data can also be stored on a storage medium or a storage server for later retrieval by the terminal device 102.
[0050] In some embodiments, the terminal device 102 includes an input interface and a video decoder.
[0051] In some embodiments, the terminal device 102 may include a display device in addition to an input interface and a video decoder.
[0052] The input interface includes a receiver and / or a modem. The input interface can receive encoded video data through the channel.
[0053] A video decoder is used to decode encoded video data to obtain decoded video data, and then transmits the decoded video data to a display device.
[0054] The display device displays the decoded video data. The display device can be integrated with the terminal device or external to the terminal device. The display device can include various display devices, such as liquid crystal displays (LCDs), plasma displays, organic light-emitting diode (OLED) displays, or other types of display devices.
[0055] Optionally, cloud server 101 can be one or more. When there are multiple cloud servers 101, at least two servers are used to provide different services, and / or at least two servers are used to provide the same service, such as providing the same service in a load-balanced manner. This application embodiment does not limit this.
[0056] Optionally, the aforementioned cloud server 101 can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms. Cloud server 101 can also become a node in a blockchain.
[0057] In some embodiments, the cloud server 101 is a cloud server with powerful computing resources, characterized by high virtualization and high distribution.
[0058] In some embodiments, this application can be applied to the fields of image encoding and decoding, video encoding and decoding, hardware video encoding and decoding, dedicated circuit video encoding and decoding, and real-time video encoding and decoding. For example, the solution of this application can be combined with the Audio Video coding Standard (AVS), or with standards such as H.264 / Advanced Video Coding (AVC), H.265 / High Efficiency Video Coding (HEVC), and H.266 / Versatile Video Coding (VVC). Alternatively, the solution of this application can be combined with other proprietary or industry standards, including ITU-T H.261, ISO / IEC MPEG-1 Visual, ITU-T H.262 or ISO / IEC MPEG-2 Visual, ITU-T H.263, ISO / IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO / IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multi-View Video Coding (MVC) extensions. It should be understood that the technology in this application is not limited to any particular codec standard or technology.
[0059] The video coding framework involved in the embodiments of this application is described below.
[0060] Figure 2 is a schematic block diagram of a video encoder provided in an embodiment of this application. It should be understood that the video encoder 200 can be used for lossy compression of images or lossless compression of images. The lossless compression can be visually lossless compression or mathematically lossless compression.
[0061] The video encoder 200 can be applied to image data in luminance-chrominance (YCbCr, YUV) format.
[0062] For example, the video encoder 200 reads video data and divides each frame into several coding tree units (CTUs). In some examples, CTUs may be called "tree blocks," "largest coding unit" (LCU), or "coding treeblock" (CTB). Each CTU can be associated with a pixel block of equal size within the image. Each pixel can correspond to one luminance (luma) sample and two chrominance (chroma) samples. Therefore, each CTU can be associated with one luminance sample block and two chrominance sample blocks. The size of a CTU is, for example, 128×128, 64×64, 32×32, etc. A CTU can be further divided into several coding units (CUs) for encoding. CUs can be rectangular or square blocks. The CU can be further divided into prediction units (PUs) and transform units (TUs), thus separating encoding, prediction, and transformation for more flexible processing. In one example, the CTU is divided into CUs using a quadtree structure, and the CUs are further divided into TUs and PUs using a quadtree structure.
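The CTU and quadtree partitioning described above can be sketched as follows. This is a simplified illustration under stated assumptions: blocks are plain `(x, y, width, height)` tuples, CTUs at the picture edge are simply clipped to the picture, and the split decision is a caller-supplied predicate (`should_split`, a hypothetical name) rather than the rate-distortion decision a real encoder would make.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height)


def split_into_ctus(width: int, height: int, ctu_size: int = 64) -> List[Block]:
    """Tile a frame into CTUs in raster order, clipping at the picture edge."""
    ctus = []
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            ctus.append((x, y, min(ctu_size, width - x), min(ctu_size, height - y)))
    return ctus


def quadtree_split(block: Block,
                   should_split: Callable[[Block], bool],
                   min_size: int = 8) -> List[Block]:
    """Recursively split a block into four equal quadrants; return the leaf CUs."""
    x, y, w, h = block
    if w <= min_size or h <= min_size or not should_split(block):
        return [block]
    half_w, half_h = w // 2, h // 2
    leaves: List[Block] = []
    for dy in (0, half_h):
        for dx in (0, half_w):
            leaves.extend(
                quadtree_split((x + dx, y + dy, half_w, half_h), should_split, min_size))
    return leaves


ctus = split_into_ctus(128, 64, ctu_size=64)
# Toy decision rule: split anything wider than 32 samples.
cus = quadtree_split(ctus[0], lambda b: b[2] > 32)
```

With this toy rule the first 64×64 CTU is split once into four 32×32 CUs, listed in the order top-left, top-right, bottom-left, bottom-right.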
[0063] The video encoder and decoder support various PU sizes. Assuming a specific CU size of 2N×2N, the video encoder and decoder can support PU sizes of 2N×2N or N×N for intra-frame prediction, and also support symmetric PUs of 2N×2N, 2N×N, N×2N, N×N, or similar sizes for inter-frame prediction. The video encoder and decoder can also support asymmetric PUs of 2N×nU, 2N×nD, nL×2N, and nR×2N for inter-frame prediction.
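The PU shapes listed above can be enumerated with a short lookup. The sketch assumes the HEVC convention for the asymmetric modes, in which the smaller part spans one quarter of the CU side (for example, 2N×nU puts a quarter-height PU on top); this application does not itself define nU, nD, nL, or nR, and the function name `pu_partitions` is introduced here for illustration.

```python
from typing import List, Tuple


def pu_partitions(cu_size: int, mode: str) -> List[Tuple[int, int]]:
    """Return (width, height) of each PU for a square CU of side cu_size (= 2N)."""
    s, half, quarter = cu_size, cu_size // 2, cu_size // 4
    table = {
        "2Nx2N": [(s, s)],
        "2NxN": [(s, half)] * 2,
        "Nx2N": [(half, s)] * 2,
        "NxN": [(half, half)] * 4,
        # Asymmetric modes (assumed HEVC-style quarter / three-quarter split).
        "2NxnU": [(s, quarter), (s, s - quarter)],
        "2NxnD": [(s, s - quarter), (s, quarter)],
        "nLx2N": [(quarter, s), (s - quarter, s)],
        "nRx2N": [(s - quarter, s), (quarter, s)],
    }
    return table[mode]


upper_asymmetric = pu_partitions(32, "2NxnU")
```

For a 32×32 CU, `upper_asymmetric` is a 32×8 PU above a 32×24 PU; in every mode the PU areas add up to the CU area.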
[0064] In some embodiments, as shown in Figure 2, the video encoder 200 may include: a prediction unit 210, a residual unit 220, a transform / quantization unit 230, an inverse transform / quantization unit 240, a reconstruction unit 250, a loop filtering unit 260, a decoded image buffer 270, and an entropy coding unit 280. It should be noted that the video encoder 200 may contain more, fewer, or different functional components.
[0065] Optionally, in this application, the current block can be referred to as the current coding unit (CU) or the current prediction unit (PU), etc. The prediction block can also be referred to as the predicted image block or the image prediction block, and the reconstructed image block can also be referred to as the reconstruction block or the image reconstruction block.
[0066] In some embodiments, the prediction unit 210 includes an inter-frame prediction unit 211 and an intra-frame estimation unit 212. Because there is a strong correlation between adjacent pixels in a frame of a video, intra-frame prediction is used in video encoding and decoding techniques to eliminate spatial redundancy between adjacent pixels. Because there is a strong similarity between adjacent frames in a video, inter-frame prediction is used in video encoding and decoding techniques to eliminate temporal redundancy between adjacent frames, thereby improving coding efficiency.
[0067] Inter-frame prediction unit 211 can be used for inter-frame prediction. Inter-frame prediction can refer to image information from different frames. Inter-frame prediction uses motion information to find reference blocks from reference frames and generates prediction blocks based on the reference blocks to eliminate temporal redundancy. The frames used for inter-frame prediction can be P-frames and / or B-frames. P-frames refer to forward prediction frames, and B-frames refer to bidirectional prediction frames. Motion information includes a list of reference frames, the reference frame index, and motion vectors. Motion vectors can be integer-pixel or fractional-pixel. If the motion vector is fractional-pixel, interpolation filtering needs to be used in the reference frame to create the required fractional-pixel blocks. Here, the integer-pixel or fractional-pixel blocks in the reference frame found based on the motion vectors are called reference blocks. Some techniques directly use the reference blocks as prediction blocks, while others process the reference blocks to generate prediction blocks. Processing the reference blocks to generate prediction blocks can also be understood as using the reference blocks as prediction blocks and then processing them to generate new prediction blocks.
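Locating a reference block with an integer-pixel motion vector, as described above, can be sketched in a few lines. This is a toy illustration: frames are nested Python lists of sample values, positions outside the reference frame are clamped to the nearest edge sample (an assumption standing in for the padding a codec would define), and fractional-pixel interpolation is left out.

```python
from typing import List, Tuple

Frame = List[List[int]]


def fetch_reference_block(ref_frame: Frame, x: int, y: int,
                          width: int, height: int,
                          mv: Tuple[int, int]) -> Frame:
    """Copy the block at (x, y) displaced by the integer motion vector mv = (dx, dy)."""
    dx, dy = mv
    rows, cols = len(ref_frame), len(ref_frame[0])
    block = []
    for j in range(height):
        # Clamp to the frame so vectors pointing outside still return samples.
        ry = min(max(y + dy + j, 0), rows - 1)
        row = []
        for i in range(width):
            rx = min(max(x + dx + i, 0), cols - 1)
            row.append(ref_frame[ry][rx])
        block.append(row)
    return block


reference = [[0, 1, 2, 3],
             [10, 11, 12, 13],
             [20, 21, 22, 23],
             [30, 31, 32, 33]]
# 2x2 block at (1, 1), moved one sample right and one sample up.
ref_block = fetch_reference_block(reference, 1, 1, 2, 2, mv=(1, -1))
```

The fetched block can then serve directly as the prediction block, or be processed further, as the paragraph above notes.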
[0068] The most commonly used inter-frame prediction methods currently include geometric partitioning mode (GPM) in the VVC video codec standard and angular weighted prediction (AWP) in the AVS3 video codec standard. These two inter-frame prediction modes share common principles.
[0069] Intra-frame estimation unit 212 refers only to information from the same frame image to predict pixel information within the current coded image block, thereby eliminating spatial redundancy. The frame used for intra-frame prediction can be an I-frame.
[0070] HEVC uses 35 intra-frame prediction modes: Planar, DC, and 33 angle modes. VVC uses 67 intra-frame prediction modes: Planar, DC, and 65 angle modes. AVS3 uses 66 intra-frame prediction modes: DC, Plane, Bilinear, and 63 angle modes.
[0071] In some embodiments, the intra-frame estimation unit 212 may be implemented using intra-frame block copying technology and intra-frame string copying technology.
[0072] The residual unit 220 can generate a residual block of the CU based on the pixel block of the CU and the prediction block of the PU of the CU. For example, the residual unit 220 can generate a residual block of the CU such that each sample in the residual block has a value equal to the difference between the sample in the pixel block of the CU and the corresponding sample in the prediction block of the PU of the CU.
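The sample-wise difference described above can be written out directly. The sketch uses nested lists for blocks and invented sample values; the helper `add_blocks` is included only to show that, without quantization, adding the residual back to the prediction recovers the original block.

```python
from typing import List

Block = List[List[int]]


def residual_block(original: Block, prediction: Block) -> Block:
    """Each residual sample = original sample minus the co-located prediction sample."""
    return [[o - p for o, p in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(original, prediction)]


def add_blocks(a: Block, b: Block) -> Block:
    """Sample-wise sum of two blocks of the same size."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


original = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
residual = residual_block(original, prediction)   # [[2, 5], [1, -1]]
restored = add_blocks(prediction, residual)       # equals the original block
```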
[0073] Transform / quantization unit 230 can quantize transform coefficients. Transform / quantization unit 230 can quantize transform coefficients associated with the TU of the CU based on the quantization parameter (QP) value associated with the CU. Video encoder 200 can adjust the degree of quantization applied to the transform coefficients associated with the CU by adjusting the QP value associated with the CU.
[0074] The inverse transform / quantization unit 240 can apply inverse quantization and inverse transform to the quantized transform coefficients to reconstruct the residual block from the quantized transform coefficients.
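How the QP value controls the degree of quantization can be illustrated with a scalar quantizer. The sketch assumes the H.264 / HEVC-style relation in which the quantization step doubles for every increase of 6 in QP, Qstep = 2^((QP - 4) / 6); this application does not specify a mapping, and real codecs use integer scaling tables rather than floating-point division. The coefficient values are invented.

```python
from typing import List


def qstep(qp: int) -> float:
    """Assumed QP-to-step mapping: the step doubles every 6 QP, with step 1.0 at QP 4."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs: List[float], qp: int) -> List[int]:
    """Map transform coefficients to integer levels (the lossy step)."""
    step = qstep(qp)
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels: List[int], qp: int) -> List[float]:
    """Inverse quantization: scale the levels back by the same step."""
    step = qstep(qp)
    return [level * step for level in levels]


coeffs = [101, -37, 13, 3]
levels = quantize(coeffs, qp=22)          # step is 8.0 at QP 22
reconstructed = dequantize(levels, qp=22)
```

At QP 22 the levels are `[13, -5, 2, 0]` and the reconstructed coefficients are `[104.0, -40.0, 16.0, 0.0]`, so the smallest coefficient is lost entirely. At QP 4, where the assumed step is 1.0, the same coefficients survive unchanged, which shows how raising QP trades fidelity for fewer bits.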
[0075] The reconstruction unit 250 can add samples of the reconstructed residual block to corresponding samples of one or more prediction blocks generated by the prediction unit 210 to produce a reconstructed image block associated with the TU. By reconstructing the sampled blocks of each TU of the CU in this way, the video encoder 200 can reconstruct the pixel blocks of the CU.
[0076] The loop filtering unit 260 can perform deblocking filtering operations to reduce the block effect of pixel blocks associated with the CU.
[0077] In some embodiments, the loop filtering unit 260 includes a deblocking filtering unit and a sample adaptive offset / adaptive loop filtering (SAO / ALF) unit, wherein the deblocking filtering unit is used to remove block effects and the SAO / ALF unit is used to remove ringing effects.
[0078] The decoded image buffer 270 can store reconstructed pixel blocks. The inter-frame prediction unit 211 can use a reference image containing the reconstructed pixel blocks to perform inter-frame prediction on PUs of other images. In addition, the intra-frame estimation unit 212 can use the reconstructed pixel blocks in the decoded image buffer 270 to perform intra-frame prediction on other PUs in the same image as the CU.
[0079] Entropy coding unit 280 can receive quantized transform coefficients from transform / quantization unit 230. Entropy coding unit 280 can perform one or more entropy coding operations on the quantized transform coefficients to produce entropy-coded data.
[0080] Figure 3 is a schematic block diagram of the video decoder provided in the embodiments of this application.
[0081] As shown in Figure 3, the video decoder 300 includes: an entropy decoding unit 310, a prediction unit 320, an inverse quantization / transformation unit 330, a reconstruction unit 340, a loop filtering unit 350, and a decoded image buffer 360. It should be noted that the video decoder 300 may contain more, fewer, or different functional components.
[0082] The video decoder 300 can receive a bitstream. The entropy decoding unit 310 can parse the bitstream to extract syntax elements from it. As part of parsing the bitstream, the entropy decoding unit 310 can parse the entropy-encoded syntax elements in the bitstream. The prediction unit 320, the dequantization / transform unit 330, the reconstruction unit 340, and the loop filtering unit 350 can decode the video data based on the syntax elements extracted from the bitstream, i.e., generate decoded video data.
[0083] In some embodiments, the prediction unit 320 includes an intra-frame prediction unit 322 and an inter-frame prediction unit 321.
[0084] Intra-prediction unit 322 can perform intra-prediction to generate prediction blocks for the PU. Intra-prediction unit 322 can use an intra-prediction mode to generate prediction blocks for the PU based on pixel blocks of spatially adjacent PUs. Intra-prediction unit 322 can also determine the intra-prediction mode of the PU based on one or more syntax elements parsed from the bitstream.
[0085] Inter-frame prediction unit 321 can construct a first reference image list (list 0) and a second reference image list (list 1) based on the syntax elements parsed from the bitstream. Furthermore, if the PU uses inter-frame prediction coding, the entropy decoding unit 310 can parse the motion information of the PU. Inter-frame prediction unit 321 can determine one or more reference blocks of the PU based on the motion information of the PU. Inter-frame prediction unit 321 can generate prediction blocks for the PU based on one or more reference blocks of the PU.
[0086] The inverse quantization / transformation unit 330 can inverse quantize (i.e., dequantize) the transform coefficients associated with the TU. The inverse quantization / transformation unit 330 can use the QP value associated with the CU of the TU to determine the degree of quantization.
[0087] After inverse quantizing the transform coefficients, the inverse quantization / transformation unit 330 can apply one or more inverse transforms to the inverse-quantized transform coefficients to generate a residual block associated with the TU.
[0088] The reconstruction unit 340 uses the residual block associated with the TU of the CU and the prediction block of the PU of the CU to reconstruct the pixel block of the CU. For example, the reconstruction unit 340 can add the sample of the residual block to the corresponding sample of the prediction block to reconstruct the pixel block of the CU, thereby obtaining the reconstructed image block.
[0089] The loop filter unit 350 can perform deblocking filtering operations to reduce the block effect of pixel blocks associated with the CU.
[0090] The video decoder 300 can store the reconstructed image of the CU in the decoded image buffer 360. The video decoder 300 can use the reconstructed image in the decoded image buffer 360 as a reference image for subsequent prediction, or transmit the reconstructed image to a display device for presentation.
[0091] The basic process of video encoding and decoding is as follows: At the encoding end, a frame image is divided into blocks. For the current block, the prediction unit 210 uses intra-frame prediction or inter-frame prediction to generate a prediction block for the current block. The residual unit 220 can calculate a residual block based on the prediction block and the original block of the current block, that is, the difference between the prediction block and the original block of the current block. This residual block can also be called residual information. The residual block is transformed and quantized by the transform / quantization unit 230, which can remove information to which the human eye is not sensitive and thereby eliminate visual redundancy. Optionally, the residual block before transformation and quantization by the transform / quantization unit 230 can be called a time-domain residual block, and the time-domain residual block after transformation and quantization by the transform / quantization unit 230 can be called a frequency residual block or a frequency-domain residual block. The entropy coding unit 280 receives the quantized transform coefficients output by the transform / quantization unit 230 and can perform entropy coding on them to output a bitstream. For example, the entropy coding unit 280 can eliminate character redundancy based on the target context model and the probability information of the binary bitstream.
[0092] At the decoding end, the entropy decoding unit 310 can parse the bitstream to obtain the prediction information and quantization coefficient matrix of the current block. The prediction unit 320 uses intra-frame prediction or inter-frame prediction to generate the prediction block of the current block based on the prediction information. The dequantization / transform unit 330 uses the quantization coefficient matrix obtained from the bitstream to perform dequantization and inverse transform on the quantization coefficient matrix to obtain the residual block. The reconstruction unit 340 adds the prediction block and the residual block to obtain the reconstructed block. The reconstructed blocks form the reconstructed image. The loop filtering unit 350 performs loop filtering on the reconstructed image based on the image or based on the blocks to obtain the decoded image. The encoding end also needs similar operations to the decoding end to obtain the decoded image. This decoded image can also be called the reconstructed image, which can be used as a reference frame for inter-frame prediction in subsequent frames.
[0093] It should be noted that the block partitioning information determined at the encoding end, as well as mode information or parameter information such as prediction, transform, quantization, entropy coding, and loop filtering, are carried in the bitstream when necessary. The decoding end determines the same block partitioning information, prediction, transform, quantization, entropy coding, and loop filtering mode information or parameter information as the encoding end by parsing the bitstream and analyzing existing information, thereby ensuring that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
[0094] The above describes the basic flow of a video codec under a block-based hybrid coding framework. With the development of technology, some modules or steps of this framework or flow may be optimized. This application is applicable to the basic flow of a video codec under this block-based hybrid coding framework, but is not limited to this framework and flow.
[0095] In some embodiments, the present invention can be applied to various scenarios that require determining decoding configuration parameters, including but not limited to cloud technology (e.g., cloud gaming), artificial intelligence, smart transportation, and assisted driving.
[0096] In some embodiments, the method of this application can be applied to edge-cloud collaborative coding. Edge-cloud collaborative coding refers to a scheme in which the cloud and the terminal collaborate to compress video. Since the computing power of video content producers (cloud) and video content consumers (terminal) differs, a relatively complex video compression task can be completed collaboratively between the two ends. This utilizes the resources and powerful computing capabilities (such as encoding capabilities) of the cloud to reduce the amount of data transmitted over the network, while also effectively utilizing the computing power (such as decoding capabilities) of the terminal. It can be used in scenarios such as cloud gaming.
[0097] In some embodiments, video encoding is coordinated, and the optimal encoding and decoding configuration and strategy are selected based on the encoding and decoding capabilities of the smart terminal, combined with the game type and user network type.
[0098] The edge-cloud collaboration protocol refers to a unified protocol for data interaction between cloud servers and smart terminals.
[0099] The smart terminal collaboration interface refers to the interface between the smart terminal software and hardware modules. Through this interface, it is possible to effectively interact with the smart terminal, configure video encoding and rendering parameters, and obtain the real-time operating performance of the hardware.
[0100] Decoding performance refers to the highest supported decoding frame rate and single-frame decoding latency for a given video size under a specific decoding protocol. Video sizes are defined as follows: 360p, 576p, 720p, 1080p, 2k, 4k. Video frame rates are defined as follows: 30fps, 40fps, 50fps, 60fps, 90fps, 120fps.
[0101] The definitions of video resolution and video frame rate for terminal devices are shown in Tables 1 and 2.
[0102] Table 1 Definition of Video Resolution for Terminal Devices
[0103]
Video resolution | Enumeration definition
360p | 0x1
576p | 0x2
720p | 0x4
1080p | 0x8
2k | 0x10
4k | 0x20
[0104] Table 2 Definition of Video Frame Rate for Terminal Devices
[0105]
Video frame rate | Enumeration definition
30fps | 0x1
40fps | 0x2
50fps | 0x4
60fps | 0x8
90fps | 0x10
120fps | 0x20
[0106] Optionally, the decoding performance supported by the terminal device is given in the form of a triple. The first element is an enumerated definition of the video resolution, the second element is an enumerated definition of the video frame rate, and the third element is the single-frame decoding latency under the video resolution and video frame rate. For example, if the single-frame decoding latency of H264 for device A at 720p@60fps is 10ms, it is represented as (4,8,10).
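The triple representation can be illustrated with a minimal Python sketch. The enumeration values are copied from Tables 1 and 2; the dictionary names and the helper `decoding_performance_triple` are hypothetical and introduced only for illustration.

```python
# Enumeration values copied from Tables 1 and 2 of this description.
RESOLUTION_ENUM = {"360p": 0x1, "576p": 0x2, "720p": 0x4,
                   "1080p": 0x8, "2k": 0x10, "4k": 0x20}
FRAME_RATE_ENUM = {"30fps": 0x1, "40fps": 0x2, "50fps": 0x4,
                   "60fps": 0x8, "90fps": 0x10, "120fps": 0x20}


def decoding_performance_triple(resolution, frame_rate, latency_ms):
    """Return (resolution enum, frame-rate enum, single-frame latency in ms)."""
    return (RESOLUTION_ENUM[resolution], FRAME_RATE_ENUM[frame_rate], latency_ms)


# Device A decoding H264 at 720p@60fps with a 10 ms single-frame latency.
triple = decoding_performance_triple("720p", "60fps", 10)
print(triple)  # (4, 8, 10)
```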
[0107] The video encoding collaborative optimization scheme involves the cloud server determining the set of encoding functions to be enabled based on the game type and network conditions, and then determining the optimal encoding configuration for the current device based on the device type and encoding capabilities reported by the smart terminal.
[0108] In some embodiments, the data structure requirements for the decoding capability of the terminal device are shown in Table 3.
[0109] Table 3 Data Structure Requirements for Terminal Device Decoding Capability
[0110]
[0111]
[0112] Based on the decoding capabilities of the smart terminal, combined with the game type and user network conditions, the cloud server determines the optimal encoding and decoding configuration for the current device, such as the decoding protocol, decoding resolution, and video frame rate, as well as encoding and decoding strategies such as the number of video encoding reference frames and whether SVC is enabled.
[0113] Currently, video applications can be broadly categorized into three types: video call applications, video player applications, and screen sharing applications.
[0114] Video call applications require low latency and do not have specific requirements for high resolution and high frame rate. Video resolution is typically between 480p and 720p, and the frame rate is 30fps. Video call applications are more concerned with single-frame decoding latency, comparing the single-frame decoding latency of different decoding chips (H.264 / H.265 / …) to find the configuration with the lowest latency. Some decoding chips exhibit frame stacking during decoding, and this factor is also considered when selecting the optimal configuration.
[0115] Video players require high resolution and only need a video frame rate of 30fps. There are no strict requirements for single-frame decoding latency, and they are not concerned about the chip's frame rate retention during decoding, as long as the decoded output frame rate remains stable. These applications are more concerned with the highest resolution supported by the decoding chip (H.264 / H.265 / ...).
[0116] Screen sharing requires high resolution and low latency, with a decoding frame rate typically required to be between 15-30fps. These applications consider not only the highest resolution supported by the decoding chip (H264 / H265 / …), but also the chip's frame rate handling capabilities.
[0117] In summary, existing technical solutions primarily consider the impact of the decoding chip type (H264 / H265 / ...) on decoding configuration. However, for low-latency and high-frame-rate scenarios, changing the decoding chip type cannot yield the required decoding configuration parameters.
[0118] To address the aforementioned technical issues, the embodiments of this application fully consider the decoding and rendering parameters of the decoding device when determining the decoding configuration parameters. By adjusting the decoding and rendering parameters, target decoding and rendering configuration information that satisfies low-latency and high-resolution scenarios is determined. Using this target decoding and rendering configuration information for encoding and decoding can improve the quality of encoding and decoding.
[0119] The technical solutions of the embodiments of this application will be described in detail below through some examples. The following embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0120] Figure 4 is a flowchart illustrating a method for determining decoding configuration parameters according to an embodiment of this application. This method can be applied to the terminal device 102 shown in Figure 1 or to the decoder shown in Figure 3. In other words, the embodiment of this application is applied to the decoding end.
[0121] As shown in Figure 4, the embodiments of this application include the following steps:
[0122] S401. Determine the initial decoding and rendering configuration information of the terminal device.
[0123] The initial decoding and rendering configuration information includes the initial decoding parameters and initial rendering parameters of the terminal device.
[0124] In this embodiment of the application, in order to meet the requirements of high resolution, high frame rate and low latency, the optimal decoding and rendering configuration is selected by detecting the decoding and rendering performance of different decoding configurations on the terminal device, so as to give full play to the current terminal device’s maximum capabilities and thus meet the needs of specific business scenarios.
[0125] In some embodiments, the decoding parameters of the terminal device include at least one of the following: decoding chip type, video resolution, and decoding input frame rate.
[0126] In some embodiments, the rendering parameters of the terminal device include at least one of the following: rendering control type, rendering frame dropping strategy, and encoding bitstream parameters.
[0127] The decoding chip types include H264, H265, VP9, and AVS.
[0128] Video resolutions include 720p, 1080p, 2K, and 4K.
[0129] Decoding input frame rates include 50fps, 60fps, etc.
[0130] Rendering control types vary across different systems. Taking Windows as an example, there are two types of rendering controls: Direct3D (Direct3D, or D3D for short, is a 3D graphics programming interface developed by Microsoft for the Microsoft Windows operating system) and OpenGL (Open Graphics Library). In some embodiments, the rendering control type is also referred to as the rendering window type.
[0131] Rendering frame dropping strategies include frame dropping and no frame dropping. Monitors refresh the screen based on the vertical sync signal. For a 60Hz display, the vertical sync signal arrives approximately every 16.7ms. Rendering frame dropping means that when multiple video images are sent in succession between two vertical sync signals, only the last image sent is displayed. Enabling frame dropping reduces the smoothness of the image. Currently, most monitors have a refresh rate of 60Hz. Without frame dropping, the maximum decoding frame rate that can be supported is 60fps. However, if there is decoding jitter or network jitter, the accumulated delay gradually increases the decoding and rendering latency. In this case, rendering frame dropping needs to be enabled to reduce overall latency. For a decoding frame rate of 50fps, which does not reach the device's maximum refresh rate, the rendering frame dropping strategy does not need to be enabled.
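The effect of rendering frame dropping can be sketched in Python as follows. This is a simplified model, not the rendering logic of any particular graphics API: frames are identified by their index, submission times are given in milliseconds, and vertical sync signals are assumed to occur at exact multiples of the refresh interval. The function names are hypothetical.

```python
def vsync_interval_ms(refresh_hz):
    """Time between two vertical sync signals, in milliseconds."""
    return 1000.0 / refresh_hz


def displayed_frames(submit_times_ms, refresh_hz, drop_frames):
    """Return the indices of the frames that reach the screen.

    With frame dropping, only the last frame submitted within each
    vertical sync interval is shown. Without it, every frame is shown.
    """
    if not drop_frames:
        return list(range(len(submit_times_ms)))
    interval = vsync_interval_ms(refresh_hz)
    last_in_interval = {}
    for index, t in enumerate(submit_times_ms):
        slot = int(t // interval)
        last_in_interval[slot] = index  # a later frame replaces an earlier one
    return [last_in_interval[slot] for slot in sorted(last_in_interval)]


# Frames 0 and 1 both arrive before the first vertical sync at about 16.7 ms,
# so frame 0 is dropped when frame dropping is enabled.
print(displayed_frames([0.0, 5.0, 20.0, 40.0], 60, drop_frames=True))   # [1, 2, 3]
print(displayed_frames([0.0, 5.0, 20.0, 40.0], 60, drop_frames=False))  # [0, 1, 2, 3]
```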
[0132] Encoded bitstream parameters include single-frame references and multi-frame references for video encoding. Specific coding semantics affect single-frame decoding latency. For some chips, the number of encoded reference frames affects the number of frames the chip decodes, thus impacting decoding latency. Single-frame references for video encoding are necessary for these types of chips.
[0133] As shown in Figure 5, in the embodiments of this application, the decoding and rendering capabilities of the terminal device can be obtained through detection, which can include static capability detection and dynamic capability detection. Static capability detection mainly involves decoding capability detection and rendering capability detection. Decoding capability detection mainly includes detecting the decoding chip type and decoding chip capabilities. Rendering capability detection mainly includes detecting the rendering control type and frame dropping strategy.
[0134] Static capability probing refers to detailed hardware parameter information that can be directly obtained from the hardware interface. Decoding parameters obtainable through static capability probing include the types of decoding chips supported by the terminal device (such as H.264 / H.265 / VP9 / AVS, etc.), the maximum resolution supported by each decoding chip, and the decoding capabilities of each chip. The decoding capabilities of a decoding chip include its Size, Profile, and Level. The Profile describes the video compression characteristics of the decoding chip (e.g., CABAC, number of color samples, etc.). The Level describes the characteristics of the decoding chip itself (e.g., bitrate, resolution, frame rate, etc.). Simply put, a higher Profile indicates more advanced compression features. A higher Level indicates higher video bitrate, resolution, and frame rate. Rendering parameters obtainable through static capability probing include the types of rendering controls supported by the terminal device (Direct3D and OpenGL on Windows platforms, etc.), and the rendering frame dropping strategies supported by each rendering control.
[0135] In this embodiment of the application, the decoding parameters and rendering parameters of the terminal device can be obtained through static capability detection. After obtaining the decoding parameters and rendering parameters of the terminal device, various decoding and rendering configuration combinations of the terminal device can be obtained.
[0136] Dynamic capability detection refers to the process of receiving a video stream in real time, decoding and rendering it on the terminal device under the current configuration, and obtaining data such as the terminal device's decoding output frame rate and single-frame decoding latency. The single-frame decoding latency of most terminal devices varies with the decoding input frame rate. Generally, the higher the decoding input frame rate, the faster the decoding speed and the lower the single-frame decoding latency, because increasing the decoding input frame rate increases the device's operating frequency. Therefore, different decoding input frame rates have different single-frame decoding latencies, and the decoding input frame rate needs to be detected as a configuration item during the terminal device's dynamic capability detection phase.
[0137] For example, on a Windows platform device, it's necessary to find the optimal decoding and rendering configuration that satisfies high resolution, high frame rate, and low latency. The device's static capabilities are as follows: decoding chip types include H.264 and H.265; decoding resolutions support 720p and 1080p; rendering control types include Direct3D and OpenGL; and frame dropping strategies support both dropping and not dropping frames, resulting in 16 possible combinations. During the dynamic probing phase, it's necessary to probe the device's performance at two decoding input frame rates (50fps and 60fps), as well as the impact of single-frame and multi-frame references in video encoding, resulting in a total of 64 combinations. To explore the chip's maximum decoding and rendering capabilities, it's necessary to try all the above decoding and rendering configurations.
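The combination counts in this example can be checked with a short Python sketch that enumerates the configuration space using `itertools.product`. The list names are hypothetical; the values are those given in the example.

```python
from itertools import product

# Static capabilities from the example above.
chip_types = ["H264", "H265"]
resolutions = ["720p", "1080p"]
render_controls = ["Direct3D", "OpenGL"]
drop_strategies = ["frame dropping", "no frame dropping"]

# Additional dimensions probed during the dynamic phase.
input_frame_rates = ["50fps", "60fps"]
reference_modes = ["single-frame reference", "multi-frame reference"]

static_combinations = list(
    product(chip_types, resolutions, render_controls, drop_strategies))
all_combinations = list(
    product(chip_types, resolutions, render_controls, drop_strategies,
            input_frame_rates, reference_modes))

print(len(static_combinations))  # 16
print(len(all_combinations))     # 64
```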
[0138] In some embodiments, the initial decoding and rendering configuration information of this application is any combination of the decoding parameters and rendering parameters of the terminal device, for example any one of the 64 combinations mentioned above. It is determined whether the decoding output frame rate and single-frame decoding latency corresponding to the initial decoding and rendering configuration information meet the corresponding thresholds. If they do not, the initial decoding and rendering configuration information is adjusted, and it is then determined whether the decoding output frame rate and single-frame decoding latency corresponding to the adjusted decoding and rendering configuration information meet the corresponding thresholds. This step is repeated until decoding and rendering configuration information is obtained for which both the decoding output frame rate and the single-frame decoding latency meet the corresponding thresholds. This decoding and rendering configuration information is saved as the target decoding and rendering configuration information of the terminal device and is loaded directly for decoding the next time the device starts up.
[0139] In some embodiments, in order to improve the speed of determining decoding and rendering configuration information, the initial decoding and rendering configuration information in this application embodiment is the target decoding and rendering configuration information with the highest probability of occurrence among the target decoding and rendering configuration information of N terminal devices, where N is a positive integer greater than 1.
[0140] In other words, the initial decoding and rendering configuration information in this application embodiment is obtained by conducting a complete test on a certain number of terminal devices in advance, acquiring the target decoding and rendering configuration information (i.e., the optimal decoding and rendering configuration information) of these devices, and selecting the target decoding and rendering configuration information that appears most frequently as the initial decoding and rendering configuration information. For example, to select the initial decoding and rendering configuration information for the Windows platform, it is necessary to conduct a complete test on a certain range of Windows platforms in advance, and select the optimal decoding and rendering configuration information that appears most frequently as the initial decoding and rendering configuration information for the Windows platform.
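Selecting the most frequent target configuration can be sketched as below, assuming each device's target configuration is recorded as a tuple. The function name and the sample records are hypothetical placeholders and not measured results.

```python
from collections import Counter


def most_frequent_configuration(target_configs):
    """Return the target configuration that occurs most often among N devices."""
    if len(target_configs) < 2:
        raise ValueError("N must be a positive integer greater than 1")
    config, _count = Counter(target_configs).most_common(1)[0]
    return config


# Placeholder records for three pre-tested devices (illustrative only).
common = ("H264", "1080p", "60fps", "D3D9",
          "no frame dropping", "multi-frame reference")
other = ("H265", "1080p", "60fps", "D3D9",
         "frame dropping", "single-frame reference")
tested_devices = [common, other, common]

initial_config = most_frequent_configuration(tested_devices)
print(initial_config)
```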
[0141] For example, the initial decoding and rendering configuration information for the Windows platform may be: H264+1080p+60fps+D3D9+no frame dropping+video encoding multi-frame reference, that is, the decoding chip type is H264, the video resolution is 1080p, the decoding input frame rate is 60fps, the rendering control type is D3D9, the frame dropping strategy is no frame dropping, and the encoding bitstream parameter is video encoding multi-frame reference.
[0142] In this embodiment of the application, the initial decoding and rendering configuration information of the terminal device can be determined according to the above method. For ease of description, the decoding parameters included in the initial decoding and rendering configuration information are denoted as initial decoding parameters, and the rendering parameters included in the initial decoding and rendering configuration information are denoted as initial rendering parameters. Assuming the initial decoding and rendering configuration information is: H264+1080p+60fps+D3D9+no frame dropping+video encoding multi-frame reference, then the initial decoding parameters are: H264+1080p+60fps, and the initial rendering parameters are D3D9+no frame dropping+video encoding multi-frame reference.
[0143] S402. Decode the test bitstream under the initial decoding and rendering configuration information to obtain the initial decoding output frame rate and initial single-frame decoding delay corresponding to the initial decoding and rendering configuration information.
[0144] After the initial decoding and rendering configuration information of the terminal device is determined, it is judged whether this information meets the requirements. Specifically, a test bitstream is input to the terminal device, and the terminal device decodes the test bitstream under the initial decoding and rendering configuration information to obtain the corresponding decoding output frame rate and single-frame decoding latency.
[0145] In this embodiment of the application, for ease of description, the decoding output frame rate corresponding to the initial decoding rendering configuration information is recorded as the initial decoding output frame rate, and the single-frame decoding delay corresponding to the initial decoding rendering configuration information is recorded as the initial single-frame decoding delay.
[0146] S403. If at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, at least one of the initial decoding parameters and initial rendering parameters is adjusted to obtain the target decoding and rendering configuration information of the terminal device.
[0147] The decoding output frame rate and single-frame decoding latency corresponding to the target decoding and rendering configuration information both meet the corresponding thresholds.
[0148] Based on the above steps, after determining the initial decoding output frame rate and initial single-frame decoding latency corresponding to the initial decoding rendering configuration information, the initial decoding output frame rate is compared with the corresponding threshold to determine whether the initial decoding output frame rate meets the corresponding threshold. The initial single-frame decoding latency is also compared with the corresponding threshold to determine whether the initial single-frame decoding latency meets the corresponding threshold.
[0149] In some embodiments, the initial decoding and rendering configuration information is the target decoding and rendering configuration information with the highest probability among the target decoding and rendering configuration information of N preset terminal devices. Since this initial decoding and rendering configuration information is derived from a certain number of devices, it has a certain degree of universality, and for most devices, the initial decoding and rendering configuration information is the optimal configuration. For the remaining devices, only some configuration adjustments need to be made to the initial decoding and rendering configuration information to find the optimal configuration. Whether configuration adjustments are needed is determined based on the decoding output frame rate and single-frame decoding latency detected each time.
[0150] If at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, then at least one of the initial decoding parameters and the initial rendering parameters is adjusted to obtain the target decoding rendering configuration information that meets the threshold.
[0151] Example 1: If the initial decoding output frame rate does not meet the corresponding threshold, the initial decoding parameters are adjusted, and the test bitstream is decoded using the adjusted decoding parameters to determine the corresponding decoding output frame rate. If this decoding output frame rate meets the corresponding threshold, the adjusted decoding parameters are fixed. Otherwise, the decoding parameters are adjusted again and the test bitstream is decoded again, and these steps are repeated until decoding parameters that meet the threshold requirements are obtained.
[0152] Example 2: If the initial single-frame decoding latency does not meet the corresponding threshold, the initial rendering parameters are adjusted, and the test stream is decoded using the adjusted rendering parameters to determine the single-frame decoding latency corresponding to the adjusted rendering parameters. It is then determined whether this single-frame decoding latency meets the corresponding threshold. If it does, the adjusted rendering parameters are fixed. If the single-frame decoding latency corresponding to the adjusted rendering parameters does not meet the corresponding threshold, the rendering parameters are adjusted again, and the test stream is decoded again using the adjusted rendering parameters to determine the single-frame decoding latency corresponding to the current adjusted rendering parameters. It is then determined whether the single-frame decoding latency corresponding to the current adjusted rendering parameters meets the corresponding threshold. If it does, the current adjusted rendering parameters are fixed; otherwise, the rendering parameters are adjusted again, and the above steps are repeated until rendering parameters that meet the threshold requirements are obtained.
[0153] Example 3: If neither the initial decoding output frame rate nor the initial single-frame decoding latency meets the corresponding threshold, both the initial decoding parameters and the initial rendering parameters are adjusted, and the test bitstream is decoded using the adjusted decoding and rendering parameters to determine the corresponding decoding output frame rate and single-frame decoding latency. If both meet the corresponding thresholds, the adjusted decoding and rendering parameters are fixed. Otherwise, the decoding and rendering parameters are adjusted again and the test bitstream is decoded again, and these steps are repeated until decoding and rendering parameters that meet the threshold requirements are obtained.
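The repeated adjust-and-test procedure common to Examples 1 to 3 can be sketched as a generic loop. This is a minimal illustration under stated assumptions: `measure` stands in for decoding the test bitstream on the device, `adjust` stands in for whatever adjustment rule is chosen, and the 55fps device limit and 10ms latency threshold are hypothetical values. None of these names come from the original text.

```python
def find_target_configuration(initial_config, measure, meets_thresholds,
                              adjust, max_rounds=64):
    """Adjust a configuration until its measurements meet the thresholds.

    measure(config) returns (decoding output frame rate, single-frame
    decoding latency in ms). adjust(config, fps, latency_ms) returns the
    next configuration to try, or None if nothing is left to adjust.
    """
    config = initial_config
    for _ in range(max_rounds):
        output_fps, latency_ms = measure(config)
        if meets_thresholds(config, output_fps, latency_ms):
            return config  # fix the parameters that satisfy the thresholds
        config = adjust(config, output_fps, latency_ms)
        if config is None:
            break
    return None


# Toy stand-ins: a hypothetical device that sustains at most 55 fps.
def measure(config):
    return (min(config["input_fps"], 55), 8)


def meets_thresholds(config, output_fps, latency_ms):
    return output_fps >= config["input_fps"] and latency_ms <= 10


def adjust(config, output_fps, latency_ms):
    if config["input_fps"] > 50:
        return {**config, "input_fps": 50}  # lower the decoding input frame rate
    return None


target = find_target_configuration(
    {"chip": "H264", "input_fps": 60}, measure, meets_thresholds, adjust)
print(target)  # {'chip': 'H264', 'input_fps': 50}
```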
[0154] In some embodiments, the decoding parameters of this application are related to the decoding output frame rate. Adjusting the decoding parameters can adjust the decoding output frame rate. The rendering parameters are related to the single-frame decoding latency. Adjusting the rendering parameters can adjust the single-frame decoding latency.
[0155] For ease of description, this embodiment denotes the threshold corresponding to the decoding output frame rate as the decoding output frame rate threshold and the threshold corresponding to the single-frame decoding delay as the single-frame decoding delay threshold.
[0156] In some embodiments, S403 above includes the following cases:
[0157] Case 1: If the initial decoding output frame rate does not meet the decoding output frame rate threshold, but the initial single-frame decoding delay meets the single-frame decoding delay threshold, then the initial decoding parameters are adjusted to obtain the target decoding parameters that meet the decoding output frame rate threshold, and the initial rendering parameters are determined as the target rendering parameters.
[0158] In case 1, if the initial single-frame decoding latency meets the single-frame decoding latency threshold, it means that the rendering parameters in the initial decoding rendering configuration information meet the requirements. Therefore, the rendering parameters are not adjusted, and the initial rendering parameters are directly determined as the target rendering parameters.
[0159] If the initial decoding output frame rate does not meet the decoding output frame rate threshold, it indicates that the initial decoding parameters in the initial decoding and rendering configuration information do not meet the requirements. These initial decoding parameters need to be adjusted to obtain target decoding parameters that meet the decoding output frame rate threshold. For the process of adjusting the initial decoding parameters, refer to the description of Example 1 above, which is not repeated here.
[0160] The embodiments of this application do not impose restrictions on the specific value of the decoding output frame rate threshold, for example, it can be a preset value.
[0161] In some embodiments, the above-mentioned decoding output frame rate threshold is the decoding input frame rate. In this case, determining whether the initial decoding output frame rate meets the decoding output frame rate threshold includes the following steps:
[0162] Step 11: If the initial decoding output frame rate is less than the decoding input frame rate, it is determined that the initial decoding output frame rate does not meet the decoding output frame rate threshold.
[0163] Step 12: If the initial decoding output frame rate is equal to the decoding input frame rate, it is determined that the initial decoding output frame rate meets the decoding output frame rate threshold.
[0164] It should be noted that in step 12 above, the initial decoding output frame rate being equal to the decoding input frame rate can be understood as the initial decoding output frame rate being approximately equal to the decoding input frame rate.
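Steps 11 and 12 can be sketched as a single comparison. The tolerance parameter `tolerance_fps` is an assumption introduced here to model "approximately equal"; the original text does not specify a tolerance value. Treating an output frame rate above the input frame rate as passing is likewise an assumption.

```python
def meets_frame_rate_threshold(output_fps, input_fps, tolerance_fps=1.0):
    """Check the decoding output frame rate against the decoding input frame rate.

    The output frame rate passes when it is approximately equal to the input
    frame rate, modeled here as being within tolerance_fps (assumed value).
    """
    return output_fps >= input_fps - tolerance_fps


print(meets_frame_rate_threshold(60.0, 60.0))  # True
print(meets_frame_rate_threshold(59.5, 60.0))  # True
print(meets_frame_rate_threshold(50.0, 60.0))  # False
```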
[0165] In this embodiment of the application, the initial decoding parameters include at least one of the decoding chip type of the terminal device, the decoding input frame rate, and the video resolution. In this case, in the above-mentioned case 1, adjusting the initial decoding parameters to obtain the target decoding parameters that meet the decoding output frame rate threshold includes the following steps:
[0166] Step A: Adjust at least one of the decoding chip type, decoding input frame rate, and video resolution to obtain target decoding parameters that meet the decoding output frame rate threshold.
[0167] In some embodiments, any one or more of the initial decoding parameters, including the decoding chip type, decoding input frame rate, and video resolution, can be adjusted simultaneously to obtain target decoding parameters that meet the decoding output frame rate threshold. For example, the decoding input frame rate and / or video resolution can be reduced first, and then it can be determined whether the decoding output frame rate corresponding to the adjusted parameters meets the decoding output frame rate threshold. If it does not meet the threshold, the chip type can be adjusted. Another example is adjusting the decoding chip type and / or decoding input frame rate, and then adjusting the video resolution. In other words, in this embodiment, the order of adjustment of the decoding chip type, decoding input frame rate, and video resolution, as well as the composition of the adjustments, are not limited and can be determined according to actual needs.
[0168] In some embodiments, step A above can be adjusted according to the adjustment sequence shown in step A1 below:
[0169] Step A1: Following the adjustment method of first adjusting the decoding chip type, then adjusting the decoding input frame rate and video resolution, adjust at least one of the decoding chip type, decoding input frame rate and video resolution to obtain the target decoding parameters.
[0170] Since different chip types have a significant impact on the decoding output frame rate, to quickly determine the target decoding parameters, if the initial decoding output frame rate does not meet the decoding output frame rate threshold, the first step is to change the decoding chip type (e.g., H.264 / H.265) and determine whether the decoding output frame rate corresponding to the adjusted decoding chip meets the threshold. In most cases, changing the decoding chip type is sufficient to meet the threshold. If the decoding output frame rate corresponding to the adjusted chip does not meet the threshold, the decoding chip with the highest decoding output frame rate is selected as the target decoding chip type for subsequent probing. Based on this, attempts are made to reduce the frame rate and resolution to find a suitable decoding configuration.
[0171] In one scheme for adjusting the decoding input frame rate and video resolution, the decoding input frame rate is first reduced, and it is determined whether the decoding output frame rate corresponding to the adjusted decoding input frame rate meets the decoding output frame rate threshold. If it does not meet the threshold, the video resolution is then reduced.
[0172] In some embodiments of this application, the decoding input frame rate and video resolution can be adjusted multiple times. Specifically, if the decoding output frame rate corresponding to the reduction of the decoding input frame rate and video resolution still does not meet the decoding output frame rate threshold, the decoding input frame rate and video resolution are reduced again until a decoding input frame rate and video resolution that meet the decoding output frame rate threshold are obtained.
[0173] For example, first reduce the decoding input frame rate, and then determine whether the corresponding decoding output frame rate meets the decoding output frame rate threshold. If not, then reduce the video resolution, and then determine whether the corresponding decoding output frame rate meets the decoding output frame rate threshold. If not, continue reducing the decoding input frame rate and video resolution in the order of first reducing the decoding input frame rate and then the video resolution, repeating the above steps until a decoding input frame rate and video resolution that meet the decoding output frame rate threshold are obtained.
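The adjustment order of step A1 (decoding chip type first, then alternately lowering the decoding input frame rate and the video resolution) might be sketched as follows. This is an illustrative sketch under stated assumptions: `measure_output_fps` is a hypothetical callback that decodes a test bitstream and returns the decoding output frame rate, the candidate lists are supplied by the caller, and the 2% tolerance is assumed.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# (decoding chip type, decoding input frame rate in fps, video resolution height)
DecodeConfig = Tuple[str, int, int]


def adjust_decoding_parameters(
    initial: DecodeConfig,
    chip_types: Sequence[str],
    lower_frame_rates: Sequence[int],   # candidates below the initial rate, descending
    lower_resolutions: Sequence[int],   # candidates below the initial height, descending
    measure_output_fps: Callable[[DecodeConfig], float],  # hypothetical probe
    tolerance: float = 0.02,            # assumed meaning of "approximately equal"
) -> Optional[DecodeConfig]:
    """Return decoding parameters that meet the output frame rate threshold, or None."""

    def meets_threshold(cfg: DecodeConfig, output_fps: float) -> bool:
        return output_fps >= cfg[1] * (1.0 - tolerance)

    _, fps, res = initial
    # Probe every chip type at the initial frame rate and resolution.
    measured: Dict[str, float] = {
        chip: measure_output_fps((chip, fps, res)) for chip in chip_types
    }
    # Keep the chip type with the highest decoding output frame rate.
    best_chip = max(measured, key=lambda chip: measured[chip])
    cfg: DecodeConfig = (best_chip, fps, res)
    if meets_threshold(cfg, measured[best_chip]):
        return cfg

    fps_steps: List[int] = list(lower_frame_rates)
    res_steps: List[int] = list(lower_resolutions)
    while fps_steps or res_steps:
        if fps_steps:  # first lower the decoding input frame rate
            cfg = (cfg[0], fps_steps.pop(0), cfg[2])
            if meets_threshold(cfg, measure_output_fps(cfg)):
                return cfg
        if res_steps:  # then lower the video resolution
            cfg = (cfg[0], cfg[1], res_steps.pop(0))
            if meets_threshold(cfg, measure_output_fps(cfg)):
                return cfg
    return None  # no candidate combination met the threshold
```

Returning `None` when the candidates are exhausted is a choice of this sketch; the application itself describes repeating the reduction until a suitable configuration is found.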
[0174] In some embodiments, the initial decoding and rendering configuration information described above may fix only some parameters, with the remaining parameters determined through dynamic probing. For example, some device models have stronger H.265 decoding capability than H.264 decoding capability, yet both H.264 and H.265 meet the requirements for decoding output frame rate and single-frame decoding latency under the default configuration. If the initial decoding and rendering configuration information specified the decoding chip type, only that default chip type would ever be selected. The decoding chip type can therefore be set to an unknown type, in which case the optimal decoding chip type on the device must be detected first to complete the default configuration before the subsequent probing operations proceed.
[0175] In other words, if the initial decoding and rendering configuration information does not include the decoding chip type, this embodiment of the application can first determine the optimal decoding chip type, add the determined decoding chip type to the initial decoding and rendering configuration information, and obtain new initial decoding and rendering configuration information. This new initial decoding and rendering configuration information includes the decoding chip type with the best decoding performance. Then, the method of this embodiment of the application is executed using this new determined initial decoding and rendering configuration information.
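As an illustrative sketch of completing an initial configuration whose decoding chip type is unknown, the following assumes the configuration is held in a dictionary with a hypothetical `chip_type` key and that `measure_output_fps` is a hypothetical probe returning the decoding output frame rate for a given configuration.

```python
from typing import Callable, Dict, Sequence


def resolve_chip_type(
    initial_config: Dict[str, object],
    available_chips: Sequence[str],
    measure_output_fps: Callable[[Dict[str, object]], float],  # hypothetical probe
) -> Dict[str, object]:
    """Fill in the chip type with the best-performing one when it is unknown."""
    if initial_config.get("chip_type") is not None:
        return dict(initial_config)  # chip type already specified; nothing to detect
    best_chip = max(
        available_chips,
        key=lambda chip: measure_output_fps({**initial_config, "chip_type": chip}),
    )
    # The returned dictionary plays the role of the new initial configuration.
    return {**initial_config, "chip_type": best_chip}
```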
[0176] Case 2: If the initial decoding output frame rate meets the decoding output frame rate threshold, but the initial single-frame decoding latency does not meet the single-frame decoding latency threshold, then the initial rendering parameters are adjusted to obtain the target rendering parameters that meet the single-frame decoding latency threshold, and the initial decoding parameters are determined as the target decoding parameters.
[0177] In case 2, if the initial decoding output frame rate meets the decoding output frame rate threshold, the decoding parameters in the initial decoding and rendering configuration information meet the requirements. Therefore, the decoding parameters are not adjusted, and the initial decoding parameters are directly determined as the target decoding parameters.
[0178] If the initial single-frame decoding latency does not meet the single-frame decoding latency threshold, it indicates that the initial rendering parameters in the initial decoding and rendering configuration information do not meet the requirements. These initial rendering parameters need to be adjusted to obtain the target rendering parameters that meet the single-frame decoding latency threshold. For the process of adjusting the initial rendering parameters to obtain rendering parameters that meet the single-frame decoding latency threshold, refer to the description in Example 2 above, which will not be repeated here.
[0179] The embodiments of this application do not impose restrictions on the specific value of the single-frame decoding delay threshold, for example, it can be a preset value.
[0180] In some embodiments, the single-frame decoding latency threshold of this application is the reciprocal of the decoding input frame rate multiplied by the sum of the number of buffered frames and 1, i.e., (1 + number of buffered frames) / decoding input frame rate. Provided the decoding output frame rate meets the requirements, for devices that decode without frame buffering, the single-frame decoding latency should be less than 1 / decoding input frame rate. For devices that decode with frame buffering, the decoding latency needs to account for the number of buffered frames. For example, for a device that decodes with 3 frames buffered, the video image of the nth frame will only be output from the decoder after the input of the (n+4)th frame video stream. In this case, the single-frame decoding latency additionally includes the buffering time of the 3 frames, i.e., (1 / decoding input frame rate) * 3. In summary, the criterion for determining whether the single-frame decoding latency meets the standard is that it is less than (1 / decoding input frame rate) * (1 + number of buffered frames).
[0181] At this point, determining whether the initial single-frame decoding delay meets the single-frame decoding delay threshold includes the following steps:
[0182] Step 21: If the initial single-frame decoding delay is less than or equal to the single-frame decoding delay threshold, then the initial single-frame decoding delay is determined to meet the single-frame decoding delay threshold.
[0183] Step 22: If the initial single-frame decoding delay is greater than the single-frame decoding delay threshold, then it is determined that the initial single-frame decoding delay does not meet the single-frame decoding delay threshold.
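The threshold and the comparison in steps 21 and 22 can be illustrated with a short Python sketch. The function names are hypothetical, and latency is assumed to be expressed in seconds.

```python
def single_frame_latency_threshold(input_fps: float, buffered_frames: int = 0) -> float:
    """Threshold in seconds: (1 + number of buffered frames) / decoding input frame rate."""
    return (1 + buffered_frames) / input_fps


def meets_latency_threshold(
    latency_s: float, input_fps: float, buffered_frames: int = 0
) -> bool:
    """Step 21 / step 22: the latency meets the threshold when it does not exceed it."""
    return latency_s <= single_frame_latency_threshold(input_fps, buffered_frames)


# At 60 fps the threshold is 1/60 s without buffering and 4/60 s with 3 buffered frames.
print(single_frame_latency_threshold(60.0))
print(single_frame_latency_threshold(60.0, buffered_frames=3))
```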
[0184] In some embodiments, if the initial rendering parameters include at least one of rendering control type and rendering frame dropping strategy, in case 2 above, adjusting the initial rendering parameters to obtain target rendering parameters that meet the single-frame decoding latency threshold includes the following step B:
[0185] Step B: Adjust at least one of the rendering control type and rendering frame dropping strategy to obtain target rendering parameters that meet the single-frame decoding delay threshold.
[0186] In this embodiment of the application, if the decoding output frame rate meets the standard (i.e., the decoding output frame rate is approximately equal to the decoding output frame rate threshold), but the single-frame decoding latency does not meet the standard (i.e., the single-frame decoding latency is greater than the single-frame decoding latency threshold), the reasons may be twofold: First, the rendering control is underperforming at the current rendering frame rate, in which case it is necessary to switch the rendering control; second, the decoding output frame rate has reached the upper limit of the device's rendering frame rate, in which case it is necessary to enable the rendering frame dropping strategy.
[0187] In some embodiments, there are no restrictions on the order and manner in which at least one of the rendering control type and the rendering frame dropping strategy is adjusted. For example, the rendering control type may be adjusted first, followed by the rendering frame dropping strategy, or the rendering frame dropping strategy may be adjusted first, followed by the rendering control type, or the rendering frame dropping strategy and the rendering control type may be adjusted simultaneously.
[0188] In some embodiments, step B above includes the following step B1:
[0189] Step B1: Following the adjustment method of first adjusting the rendering control type and then adjusting the rendering frame dropping strategy, adjust at least one of the rendering control type and rendering frame dropping strategy to obtain the target rendering parameters that meet the single-frame decoding latency threshold.
[0190] In step B1, if the initial single-frame decoding latency is greater than the single-frame decoding latency threshold, it indicates that the initial rendering parameters do not meet the requirements. First, the rendering control type is adjusted, and the single-frame decoding latency corresponding to the adjusted rendering control type is determined. If this single-frame decoding latency is less than or equal to the single-frame decoding latency threshold, the current rendering control type and rendering frame dropping strategy are determined as the target rendering parameters. If the single-frame decoding latency is greater than the single-frame decoding latency threshold, the rendering frame dropping strategy is adjusted, for example, changing from no frame dropping to frame dropping. The single-frame decoding latency corresponding to the adjusted rendering frame dropping strategy is determined. If this single-frame decoding latency is less than or equal to the single-frame decoding latency threshold, the current rendering control type and rendering frame dropping strategy are determined as the target rendering parameters. If the single-frame decoding latency is greater than the single-frame decoding latency threshold, the above steps are repeated until the target rendering parameters that meet the requirements are obtained.
[0191] In some embodiments, if, after adjusting the rendering frame dropping strategy and rendering control type according to the above method, the target rendering parameters that meet the single-frame decoding latency threshold are not obtained, the method of this application embodiment further includes:
[0192] Step C: Adjust the video encoding multi-frame reference in the initial decoding and rendering configuration information to the video encoding single-frame reference.
[0193] As can be seen from the above, the encoding parameter in the initial decoding and rendering configuration information is the video encoding multi-frame reference. Multi-frame reference requires accumulating a large number of reference frames, which increases the single-frame decoding latency. To reduce the single-frame decoding latency, the video encoding multi-frame reference in the initial decoding and rendering configuration information can be adjusted to the video encoding single-frame reference.
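The sequence of step B1 followed by step C might be sketched as follows. This is a simplified illustration, not the definitive procedure: `measure_latency` is a hypothetical probe returning the single-frame decoding latency in seconds, only one alternative rendering control is considered, and each adjustment is kept when moving on to the next one, which is an assumption of this sketch.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class RenderSettings:
    control: str                 # rendering control type, e.g. "Direct3D" or "OpenGL"
    drop_frames: bool            # rendering frame dropping strategy
    multi_frame_reference: bool  # video encoding multi-frame reference in use


def adjust_rendering(
    initial: RenderSettings,
    alternative_control: str,
    measure_latency: Callable[[RenderSettings], float],  # hypothetical probe
    threshold_s: float,
) -> Optional[RenderSettings]:
    """Try control type, then frame dropping, then single-frame reference."""
    adjustments = (
        {"control": alternative_control},   # step B1: switch the rendering control
        {"drop_frames": True},              # step B1: enable frame dropping
        {"multi_frame_reference": False},   # step C: use single-frame reference
    )
    current = initial
    for change in adjustments:
        current = replace(current, **change)
        if measure_latency(current) <= threshold_s:
            return current
    return None  # no adjustment brought the latency under the threshold
```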
[0194] Case 3: If the initial decoding output frame rate does not meet the decoding output frame rate threshold, and the initial single-frame decoding latency does not meet the single-frame decoding latency threshold, then the initial decoding parameters are adjusted to obtain the target decoding parameters that meet the decoding output frame rate threshold. Based on the target decoding parameters, the initial rendering parameters are adjusted to obtain the target rendering parameters that meet the single-frame decoding latency threshold.
[0195] In case 3, the initial decoding output frame rate does not meet the decoding output frame rate threshold, and the initial single-frame decoding latency also does not meet the single-frame decoding latency threshold. This indicates that neither the initial decoding parameters nor the initial rendering parameters in the initial decoding and rendering configuration information meet the requirements, and both need to be adjusted. In this embodiment, the initial decoding parameters are first adjusted to obtain target decoding parameters that meet the decoding output frame rate threshold, as described in case 1 above. Based on these target decoding parameters, the initial rendering parameters are then adjusted to obtain target rendering parameters that meet the single-frame decoding latency threshold, as described in case 2 above.
[0196] Based on the above three cases, target decoding parameters and target rendering parameters are obtained. These target decoding parameters and target rendering parameters constitute the target decoding and rendering configuration information of the terminal device.
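The three cases can be summarized as a small decision function. The sketch below is illustrative only; the function name and the string labels are hypothetical.

```python
from typing import List


def parameter_groups_to_adjust(frame_rate_ok: bool, latency_ok: bool) -> List[str]:
    """Map the two threshold checks to the parameter groups that need adjusting."""
    if not frame_rate_ok and latency_ok:
        return ["decoding"]               # case 1: rendering parameters are kept
    if frame_rate_ok and not latency_ok:
        return ["rendering"]              # case 2: decoding parameters are kept
    if not frame_rate_ok and not latency_ok:
        return ["decoding", "rendering"]  # case 3: decoding first, then rendering
    return []                             # both thresholds met: nothing to adjust
```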
[0197] According to the method described above, the terminal device in this application embodiment obtains target decoding and rendering configuration information. This target decoding and rendering configuration information can be understood as the optimal decoding and rendering configuration parameters of the terminal device, which the terminal device stores in...
[0198] In this embodiment, upon initial startup, the terminal device probes for the optimal decoding and rendering configuration based on the initial decoding and rendering configuration information to obtain the target decoding and rendering configuration information for the terminal device, and stores the target decoding and rendering configuration information that matches the terminal device. Subsequent startups directly use the locally stored target decoding and rendering configuration information for decoding.
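The probe-once-then-reuse behaviour described above could look like the following sketch. It assumes, purely for illustration, that the configuration is stored locally as a JSON file; the storage format, the path, and the `probe` callback are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def load_or_probe_config(
    cache_path: Path,
    probe: Callable[[], Dict[str, object]],  # hypothetical probing routine
) -> Dict[str, object]:
    """Reuse the stored target configuration, probing only on the first startup."""
    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding="utf-8"))
    config = probe()  # initial startup: run the probing procedure once
    cache_path.write_text(json.dumps(config), encoding="utf-8")
    return config
```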
[0199] The target decoding and rendering configuration information obtained in this application embodiment meets the requirements of high resolution, high frame rate and low latency, and can be understood as the optimal configuration of the terminal device.
[0200] It should be noted that in the embodiments of this application, during a parameter adjustment process, only the adjusted parameter changes, while other parameters remain unchanged.
[0201] In some embodiments, after the terminal device determines the target decoding and rendering configuration information according to the above method, it sends the target decoding and rendering configuration information to the cloud server so that the cloud server can encode according to the target decoding and rendering configuration information to obtain a bitstream that conforms to the decoding capability of the terminal device.
[0202] In the method for determining decoding configuration parameters provided in this application, a terminal device determines initial decoding and rendering configuration information, which includes initial decoding parameters and initial rendering parameters. Under this initial decoding and rendering configuration information, a test bitstream is decoded to obtain the corresponding initial decoding output frame rate and initial single-frame decoding latency. If at least one of the initial decoding output frame rate and the initial single-frame decoding latency does not meet its corresponding threshold, at least one of the initial decoding parameters and the initial rendering parameters is adjusted to obtain the target decoding and rendering configuration information for the terminal device, under which both the decoding output frame rate and the single-frame decoding latency meet their corresponding thresholds. In other words, this application embodiment fully considers the decoding and rendering parameters of the terminal device when determining the target decoding and rendering configuration. By adjusting the decoding and rendering parameters, target decoding and rendering configuration information that satisfies low-latency and high-resolution scenarios is determined. Using this target decoding and rendering configuration information for encoding and decoding can improve the quality of encoding and decoding. In addition, in some embodiments, the initial decoding and rendering configuration information is the optimal decoding and rendering configuration information for multiple devices. By performing a limited number of probes based on the initial decoding and rendering configuration information, the target decoding and rendering configuration information of the terminal device can be found efficiently among the many possible configuration combinations.
[0203] Figure 6 is a flowchart illustrating a method for determining decoding configuration parameters according to an embodiment of this application. As shown in Figure 6, one specific embodiment of this application includes:
[0204] S601. Determine the initial decoding and rendering configuration information of the terminal device.
[0205] The initial decoding and rendering configuration information includes the initial decoding parameters and initial rendering parameters of the terminal device.
[0206] Optionally, the initial decoding and rendering configuration information is the target decoding and rendering configuration information with the highest probability of occurrence among the target decoding and rendering configuration information of N preset terminal devices, where N is a positive integer greater than 1.
[0207] S602. Determine the initial decoding output frame rate and initial single-frame decoding delay corresponding to the initial decoding rendering configuration information.
[0208] For details, please refer to the description in S402 above, which will not be repeated here.
[0209] S603. Determine whether the initial decoding output frame rate meets the corresponding threshold.
[0210] If the initial decoding output frame rate meets the corresponding threshold, then execute S611.
[0211] If the initial decoding output frame rate does not meet the corresponding threshold, then execute S604 as follows.
[0212] S604. Change the decoding chip type and redetermine the decoding output frame rate.
[0213] S605. Select the optimal decoding chip based on the decoding output frame rate.
[0214] In this embodiment of the application, if the terminal device has multiple decoding chips, the decoding output frame rate corresponding to each decoding chip is determined, and the decoding chip with the optimal decoding output frame rate is selected as the optimal decoding chip. The optimal decoding output frame rate can be understood as having the smallest difference from a decoding output frame rate threshold.
[0215] S606. Determine whether the decoding output frame rate corresponding to the optimal decoding chip meets the corresponding threshold.
[0216] If the condition is met, execute S611 as follows; otherwise, execute S607 as follows.
[0217] S607. Reduce the decoding input frame rate and redetermine the decoding output frame rate.
[0218] S608. Determine whether the decoding output frame rate meets the corresponding threshold.
[0219] If the condition is met, execute S611 as follows; otherwise, execute S609 as follows.
[0220] S609. Reduce the decoding resolution and redetermine the decoding output frame rate.
[0221] S610. Determine whether the decoding output frame rate meets the corresponding threshold.
[0222] If the condition is met, execute step S611; otherwise, return to step S607.
[0223] S611. Determine whether the initial single-frame decoding delay meets the corresponding threshold.
[0224] If the condition is met, execute S616; otherwise, execute S612.
[0225] S612. Keep the decoding parameters unchanged, adjust the rendering control type, and redetermine the single-frame decoding delay.
[0226] S613. Determine whether the single-frame decoding delay meets the corresponding threshold.
[0227] If the conditions are met, execute S616; otherwise, execute S614.
[0228] S614. Adjust the rendering frame dropping strategy and redetermine the single-frame decoding latency.
[0229] S615. Determine whether the single-frame decoding delay meets the corresponding threshold.
[0230] If the conditions are met, execute S616; otherwise, execute S617.
[0231] S616. Obtain the target decoding and rendering configuration information of the terminal device.
[0232] S617. Adjust the video encoding multi-frame reference in the initial decoding and rendering configuration information to the video encoding single-frame reference.
[0233] In some embodiments, for device models that buffer frames during decoding, the step of S617 above is also performed so that the optimal encoding parameters can be obtained.
[0234] In this embodiment of the application, the optimal decoding and rendering configuration detection process is shown in Figure 6. First, the initial decoding and rendering configuration information is probed to obtain the initial decoding output frame rate and initial single-frame decoding latency. If both meet the requirements, the initial decoding and rendering configuration is considered optimal, and the optimal encoding parameters are found based on this decoding configuration. If the initial decoding output frame rate meets the requirements but the single-frame decoding latency does not, the rendering parameters need to be modified. If the initial decoding output frame rate does not meet the requirements, the decoding chip type is modified first, then the decoding input frame rate is reduced, and then the decoding resolution is reduced, until a decoding output frame rate that meets the conditions is found.
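The overall flow of S601 to S617 might be sketched end to end as follows. This is a simplified, non-authoritative illustration: `probe` is a hypothetical callback that decodes a test bitstream and returns the decoding output frame rate and the single-frame decoding latency in seconds, the candidate lists and the 2% frame rate tolerance are assumptions, and the "optimal" chip in S605 is taken to be the one with the highest decoding output frame rate.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Config:
    chip: str
    input_fps: int
    resolution: int
    control: str
    drop_frames: bool
    multi_frame_ref: bool


Probe = Callable[[Config], Tuple[float, float]]  # -> (output fps, latency in seconds)


def find_target_config(
    initial: Config, chips: Sequence[str], lower_fps: Sequence[int],
    lower_res: Sequence[int], other_control: str, probe: Probe,
    buffered_frames: int = 0, tol: float = 0.02,
) -> Tuple[Optional[Config], int]:
    """Return (target configuration or None, number of probes performed)."""
    probes = 0

    def run(cfg: Config) -> Tuple[float, float]:
        nonlocal probes
        probes += 1
        return probe(cfg)

    def fps_ok(cfg: Config, out_fps: float) -> bool:
        return out_fps >= cfg.input_fps * (1 - tol)

    def latency_ok(cfg: Config, latency: float) -> bool:
        return latency <= (1 + buffered_frames) / cfg.input_fps

    cfg = initial
    out_fps, latency = run(cfg)                                   # S602
    if not fps_ok(cfg, out_fps):                                  # S603
        results = {cfg.chip: (out_fps, latency)}
        for chip in chips:                                        # S604
            if chip not in results:
                results[chip] = run(replace(cfg, chip=chip))
        best = max(results, key=lambda c: results[c][0])          # S605
        cfg = replace(cfg, chip=best)
        out_fps, latency = results[best]
        fps_steps, res_steps = list(lower_fps), list(lower_res)
        while not fps_ok(cfg, out_fps) and (fps_steps or res_steps):  # S606-S610
            if fps_steps:
                cfg = replace(cfg, input_fps=fps_steps.pop(0))    # S607
                out_fps, latency = run(cfg)
                if fps_ok(cfg, out_fps):
                    break
            if res_steps:
                cfg = replace(cfg, resolution=res_steps.pop(0))   # S609
                out_fps, latency = run(cfg)
        if not fps_ok(cfg, out_fps):
            return None, probes
    if latency_ok(cfg, latency):                                  # S611
        return cfg, probes                                        # S616
    for change in ({"control": other_control},                   # S612
                   {"drop_frames": True},                         # S614
                   {"multi_frame_ref": False}):                   # S617
        cfg = replace(cfg, **change)
        _, latency = run(cfg)
        if latency_ok(cfg, latency):
            return cfg, probes                                    # S616
    return None, probes
```

Counting probes inside `run` makes it easy to compare the number of probes actually performed against exhaustive probing of every combination.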
[0235] Taking the Windows platform mentioned above as an example, the configurations that need to be detected are as follows: there are two types of decoding chips, H264 and H265; two types of decoding resolutions, 1080p and 720p; two types of decoding frame rates, 60fps and 50fps; two types of rendering controls, Direct3D and OpenGL; two types of rendering frame dropping strategies, frame dropping and no frame dropping; and two types of encoding parameters, single frame and multi-frame reference. There are a total of 64 combinations, and 64 detections are required to complete the detection.
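The count of 64 combinations follows from six configuration dimensions with two options each (2^6 = 64). A short Python sketch enumerating them is given below; the dictionary keys are hypothetical labels, while the option values are those listed in the example above.

```python
from itertools import product

# Candidate values per configuration dimension, as listed in the Windows example.
options = {
    "decoding_chip_type": ["H264", "H265"],
    "decoding_resolution": ["1080p", "720p"],
    "decoding_frame_rate": ["60fps", "50fps"],
    "rendering_control": ["Direct3D", "OpenGL"],
    "frame_dropping": [True, False],
    "encoding_reference": ["single-frame", "multi-frame"],
}

# Exhaustive probing would have to visit every element of this list.
all_configurations = [
    dict(zip(options, values)) for values in product(*options.values())
]
print(len(all_configurations))  # 64
```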
[0236] To reduce the number of probes, the initial decoding and rendering configuration information mentioned above is the target decoding and rendering configuration information with the highest probability of occurrence among the target decoding and rendering configuration information of N preset terminal devices. After the first probe, most devices will show that the initial decoding output frame rate meets the standard, the initial single-frame decoding latency meets the standard, and no frames are buffered during decoding. In this case, the initial decoding and rendering configuration information is the optimal configuration, and only one probe is needed. For device models that buffer frames, an additional probe of the bitstream encoded with single-frame reference is required, that is, the video encoding multi-frame reference in the initial decoding and rendering configuration information is adjusted to the video encoding single-frame reference, for a total of two probes. If the initial decoding output frame rate meets the standard but the initial single-frame decoding latency does not (i.e., decoding capability is sufficient but rendering performance is insufficient), only 3 additional probes are needed to select the optimal rendering configuration. Even in the worst case, at most 10 probes are needed to find the optimal configuration, far fewer than the 64 probes required for exhaustive probing.
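Selecting the configuration with the highest probability of occurrence among N preset terminal devices can be illustrated as picking the most frequent entry. In the sketch below, each device's target configuration is represented as a hashable tuple, and the sample data is invented purely for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def most_common_config(device_configs: Sequence[Hashable]) -> Hashable:
    """Return the target configuration that occurs most often across the devices."""
    config, _count = Counter(device_configs).most_common(1)[0]
    return config


# Hypothetical target configurations collected from N = 4 preset devices.
collected = [
    ("H265", "1080p", "60fps", "Direct3D"),
    ("H264", "1080p", "60fps", "Direct3D"),
    ("H265", "1080p", "60fps", "Direct3D"),
    ("H265", "720p", "50fps", "OpenGL"),
]
initial_config = most_common_config(collected)
```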
[0237] It should be understood that Figures 4 to 6 are merely examples of this application and should not be construed as limiting the scope of this application.
[0238] The preferred embodiments of this application have been described in detail above with reference to the accompanying drawings. However, this application is not limited to the specific details of the above embodiments. Within the scope of the technical concept of this application, various simple modifications can be made to the technical solutions of this application, and these simple modifications all fall within the protection scope of this application. For example, the various specific technical features described in the above specific embodiments can be combined in any suitable manner without contradiction. To avoid unnecessary repetition, this application will not describe the various possible combinations separately. Furthermore, various different embodiments of this application can also be arbitrarily combined, as long as they do not violate the spirit of this application, they should also be considered as the content disclosed in this application.
[0239] The method embodiments of this application have been described in detail above with reference to Figures 4 to 6, and the device embodiments of this application are described in detail below.
[0240] Figure 7 is a schematic diagram of a device for determining decoding configuration parameters according to an embodiment of this application. The device 10 is applied to a terminal device and includes:
[0241] The determining unit 11 is used to determine the initial decoding and rendering configuration information of the terminal device, wherein the initial decoding and rendering configuration information includes the initial decoding parameters and initial rendering parameters of the terminal device;
[0242] The detection unit 12 is used to decode the test bitstream under the initial decoding and rendering configuration information to obtain the initial decoding output frame rate and initial single-frame decoding delay corresponding to the initial decoding and rendering configuration information;
[0243] The adjustment unit 13 is used to adjust at least one of the initial decoding parameters and initial rendering parameters if at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, so as to obtain the target decoding rendering configuration information of the terminal device, wherein the decoding output frame rate and single-frame decoding delay corresponding to the target decoding rendering configuration both meet the corresponding threshold.
[0244] In some embodiments, the initial decoding and rendering configuration information is the target decoding and rendering configuration information with the highest probability of occurrence among the target decoding and rendering configuration information of N preset terminal devices, where N is a positive integer greater than 1.
[0245] In some embodiments, the threshold includes a decoding output frame rate threshold and a single-frame decoding delay threshold. The adjustment unit 13 is specifically used to adjust the initial decoding parameters if the initial decoding output frame rate does not meet the decoding output frame rate threshold and the initial single-frame decoding delay meets the single-frame decoding delay threshold, so as to obtain target decoding parameters that meet the decoding output frame rate threshold and determine the rendering parameters as target rendering parameters.
[0246] If the initial decoding output frame rate meets the decoding output frame rate threshold, and the initial single-frame decoding latency does not meet the single-frame decoding latency threshold, then the initial rendering parameters are adjusted to obtain target rendering parameters that meet the single-frame decoding latency threshold, and the initial decoding parameters are determined as target decoding parameters.
[0247] If the initial decoding output frame rate does not meet the decoding output frame rate threshold, and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, then the initial decoding parameters are adjusted to obtain target decoding parameters that meet the decoding output frame rate threshold. Based on the target decoding parameters, the initial rendering parameters are adjusted to obtain target rendering parameters that meet the single-frame decoding delay threshold.
[0248] The target decoding parameters and the target rendering parameters constitute the target decoding and rendering configuration information.
[0249] In some embodiments, the decoding output frame rate threshold is the decoding input frame rate. The adjustment unit 13 is further configured to determine that the initial decoding output frame rate does not meet the decoding output frame rate threshold if the initial decoding output frame rate is less than the decoding input frame rate; and to determine that the initial decoding output frame rate meets the decoding output frame rate threshold if the initial decoding output frame rate is equal to the decoding input frame rate.
[0250] In some embodiments, the initial decoding parameters include at least one of the decoding chip type, decoding input frame rate, and video resolution of the terminal device. The adjustment unit 13 is specifically used to adjust at least one of the decoding chip type, decoding input frame rate, and video resolution to obtain target decoding parameters that meet the decoding output frame rate threshold.
[0251] In some embodiments, the adjustment unit 13 is specifically used to adjust at least one of the decoding chip type, decoding input frame rate, and video resolution according to the adjustment method of first adjusting the decoding chip type, and then adjusting the decoding input frame rate and video resolution, so as to obtain the target decoding parameters.
[0252] In some embodiments, the adjustment unit 13 is further configured to, if the decoding output frame rate corresponding to the reduction of the decoding input frame rate and video resolution does not meet the decoding output frame rate threshold, reduce the decoding input frame rate and video resolution again until a decoding input frame rate and video resolution that meet the decoding output frame rate threshold are obtained.
[0253] In some embodiments, if the initial decoding rendering configuration information does not include the decoding chip type, the adjustment unit 13 is further configured to obtain M decoding chips of the terminal device, where M is a positive integer greater than 1; determine the decoding output frame rate corresponding to each of the M decoding chips under the initial decoding rendering configuration information; and add the type corresponding to the decoding chip with the largest decoding output frame rate among the M decoding chips to the initial decoding rendering configuration information to obtain new initial decoding rendering configuration information.
[0254] In some embodiments, the single-frame decoding delay threshold is the reciprocal of the decoding input frame rate multiplied by the sum of the number of buffered frames and 1, i.e., (1 + number of buffered frames) / decoding input frame rate. The adjustment unit 13 is further configured to determine that the initial single-frame decoding delay meets the single-frame decoding delay threshold if the initial single-frame decoding delay is less than or equal to the threshold; and to determine that the initial single-frame decoding delay does not meet the single-frame decoding delay threshold if the initial single-frame decoding delay is greater than the threshold.
[0255] In some embodiments, the initial rendering parameters include at least one of a rendering control type and a rendering frame dropping strategy. The adjustment unit 13 is specifically used to adjust at least one of the rendering control type and the rendering frame dropping strategy to obtain target rendering parameters that meet the single-frame decoding delay threshold.
[0256] In some embodiments, the adjustment unit 13 is specifically used to adjust at least one of the rendering control type and the rendering frame dropping strategy according to the adjustment method of first adjusting the rendering control type and then adjusting the rendering frame dropping strategy, so as to obtain the target rendering parameters that meet the single frame decoding delay threshold.
[0257] In some embodiments, if adjusting the rendering parameters fails to obtain the target rendering parameters that satisfy the single-frame decoding delay threshold, the adjustment unit 13 is further configured to adjust the video encoding multi-frame reference in the initial decoding rendering configuration information to the video encoding single-frame reference.
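The rendering adjustment described in the three preceding paragraphs can be sketched as follows. This is an illustration under stated assumptions rather than the method of this application: the configuration keys (`render_control`, `frame_drop`, `reference`), the option labels, and the `measure_delay` callback are hypothetical, each option list is assumed to be non-empty, and keeping the lowest-delay option from each stage is a design choice of the sketch.

```python
from typing import Callable, Dict, Sequence


def adjust_rendering(
    config: Dict[str, str],
    control_types: Sequence[str],
    drop_strategies: Sequence[str],
    measure_delay: Callable[[Dict[str, str]], float],
    threshold_s: float,
) -> Dict[str, str]:
    """Adjust the rendering control type first, then the frame dropping strategy."""
    current = dict(config)
    stages = (("render_control", control_types), ("frame_drop", drop_strategies))
    for key, options in stages:
        # Measure every option of this stage and keep the lowest-delay one.
        measured = [(measure_delay({**current, key: o}), {**current, key: o}) for o in options]
        delay, current = min(measured, key=lambda pair: pair[0])
        if delay <= threshold_s:
            return current
    # Neither stage met the threshold: fall back to single-frame reference.
    return {**current, "reference": "single_frame"}


# Made-up delays (seconds) standing in for measurements on a test bitstream.
DELAYS = {("a", "keep"): 0.020, ("b", "keep"): 0.012,
          ("a", "drop"): 0.015, ("b", "drop"): 0.007}


def fake_delay(cfg: Dict[str, str]) -> float:
    return DELAYS[(cfg["render_control"], cfg["frame_drop"])]


start = {"render_control": "a", "frame_drop": "keep", "reference": "multi_frame"}
adjusted = adjust_rendering(start, ["a", "b"], ["keep", "drop"], fake_delay, 1 / 120)
fallback = adjust_rendering(start, ["a", "b"], ["keep", "drop"], fake_delay, 0.001)
```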
[0258] In some embodiments, the adjustment unit 13 is further configured to send a second indication information to the cloud server, the second indication information being used to indicate the target decoding and rendering configuration information of the terminal device, so that the cloud server performs encoding according to the target decoding and rendering configuration information.
[0259] It should be understood that the apparatus embodiments and the method embodiments correspond to each other, and similar descriptions can be found in the method embodiments. To avoid repetition, further details are not provided here. Specifically, the apparatus 10 shown in Figure 7 can perform the above-described method embodiments, and the foregoing and other operations and / or functions of each module in the apparatus 10 are respectively intended to implement the method embodiments shown in Figure 4, which are not described again here for the sake of brevity.
[0260] The apparatus of this application embodiment has been described above from the perspective of functional modules in conjunction with the accompanying drawings. It should be understood that this functional module can be implemented in hardware, in software instructions, or in a combination of hardware and software modules. Specifically, the steps of the method embodiments in this application can be completed by integrated logic circuits in the processor's hardware and / or by software instructions. The steps of the method disclosed in this application embodiment can be directly embodied as being executed by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. Optionally, the software module can reside in a mature storage medium in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, etc. This storage medium is located in memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps in the above method embodiments.
[0261] Figure 8 is a schematic block diagram of an electronic device provided in an embodiment of this application. The electronic device may be the encoder and / or decoder described above.
[0262] As shown in Figure 8, the electronic device 40 may include:
[0263] A memory 41 and a processor 42. The memory 41 is used to store a computer program and to transfer the program code to the processor 42. In other words, the processor 42 can call and run the computer program from the memory 41 to implement the methods in the embodiments of this application.
[0264] For example, the processor 42 can be used to execute the above-described method embodiments according to instructions in the computer program.
[0265] In some embodiments of this application, the processor 42 may include, but is not limited to:
[0266] General-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
[0267] In some embodiments of this application, the memory 41 includes, but is not limited to:
[0268] Volatile memory and / or non-volatile memory. Non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. Volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced Synchronous DRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
[0269] In some embodiments of this application, the computer program may be divided into one or more modules, which are stored in the memory 41 and executed by the processor 42 to perform the method provided in this application. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which describe the execution process of the computer program in the video production device.
[0270] As shown in Figure 8, the electronic device 40 may further include:
[0271] A transceiver 43, which can be connected to the processor 42 or the memory 41.
[0272] The processor 42 can control the transceiver 43 to communicate with other devices; specifically, it can send information or data to other devices or receive information or data sent by other devices. The transceiver 43 may include a transmitter and a receiver. The transceiver 43 may further include antennas, and the number of antennas can be one or more.
[0273] It should be understood that the various components in the video production equipment are connected through a bus system, which includes a data bus, a power bus, a control bus, and a status signal bus.
[0274] This application also provides a computer storage medium storing a computer program thereon, which, when executed by a computer, enables the computer to perform the methods of the above-described method embodiments. Alternatively, embodiments of this application also provide a computer program product containing instructions that, when executed by a computer, cause the computer to perform the methods of the above-described method embodiments.
[0275] When implemented using software, the embodiments can be implemented entirely or partially as a computer program product. This computer program product includes one or more computer instructions. When these computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., digital video disc (DVD)), or a semiconductor medium (e.g., solid-state disk (SSD)).
[0276] Those skilled in the art will recognize that the modules and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0277] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or modules may be electrical, mechanical, or other forms.
[0278] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. For example, the functional modules in the various embodiments of this application may be integrated into one processing module, or each module may exist physically separately, or two or more modules may be integrated into one module.
[0279] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method for determining decoding configuration parameters, characterized in that, Applied to terminal devices, including: Determine the initial decoding and rendering configuration information of the terminal device. The initial decoding and rendering configuration information includes the initial decoding parameters and initial rendering parameters of the terminal device. The initial decoding and rendering configuration information is the target decoding and rendering configuration information with the highest probability of occurrence among N preset target decoding and rendering configuration information of terminal devices, where N is a positive integer greater than 1. The test bitstream is decoded under the initial decoding and rendering configuration information to obtain the initial decoding output frame rate and initial single-frame decoding latency corresponding to the initial decoding and rendering configuration information. If at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, then at least one of the initial decoding parameters and the initial rendering parameters is adjusted to obtain the target decoding rendering configuration information of the terminal device, wherein the decoding output frame rate and single-frame decoding delay corresponding to the target decoding rendering configuration both meet the corresponding threshold.
2. The method according to claim 1, characterized in that the thresholds include a decoding output frame rate threshold and a single-frame decoding delay threshold, and adjusting at least one of the initial decoding parameters and the initial rendering parameters to obtain the target decoding and rendering configuration information of the terminal device, if at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, includes: If the initial decoding output frame rate does not meet the decoding output frame rate threshold, and the initial single-frame decoding delay meets the single-frame decoding delay threshold, then the initial decoding parameters are adjusted to obtain target decoding parameters that meet the decoding output frame rate threshold, and the initial rendering parameters are determined as target rendering parameters. If the initial decoding output frame rate meets the decoding output frame rate threshold, and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, then the initial rendering parameters are adjusted to obtain target rendering parameters that meet the single-frame decoding delay threshold, and the initial decoding parameters are determined as target decoding parameters. If the initial decoding output frame rate does not meet the decoding output frame rate threshold, and the initial single-frame decoding delay does not meet the single-frame decoding delay threshold, then the initial decoding parameters are adjusted to obtain target decoding parameters that meet the decoding output frame rate threshold, and, based on the target decoding parameters, the initial rendering parameters are adjusted to obtain target rendering parameters that meet the single-frame decoding delay threshold. The target decoding parameters and the target rendering parameters constitute the target decoding and rendering configuration information.
3. The method according to claim 2, characterized in that the decoding output frame rate threshold is the decoding input frame rate, and the method further includes: If the initial decoding output frame rate is less than the decoding input frame rate, then it is determined that the initial decoding output frame rate does not meet the decoding output frame rate threshold. If the initial decoding output frame rate is equal to the decoding input frame rate, then it is determined that the initial decoding output frame rate meets the decoding output frame rate threshold.
4. The method according to claim 3, characterized in that, The initial decoding parameters include at least one of the decoding chip type, decoding input frame rate, and video resolution of the terminal device. Adjusting the initial decoding parameters to obtain target decoding parameters that satisfy the decoding output frame rate threshold includes: At least one of the decoding chip type, decoding input frame rate, and video resolution is adjusted to obtain target decoding parameters that satisfy the decoding output frame rate threshold.
5. The method according to claim 4, characterized in that, The step of adjusting at least one of the decoding chip type, decoding input frame rate, and video resolution to obtain target decoding parameters that satisfy the decoding output frame rate threshold includes: The target decoding parameters are obtained by adjusting at least one of the decoding chip type, decoding input frame rate, and video resolution, by first adjusting the decoding chip type and then adjusting the decoding input frame rate and video resolution.
6. The method according to claim 5, characterized in that the method further includes: If the decoding output frame rate corresponding to the reduced decoding input frame rate and video resolution does not meet the decoding output frame rate threshold, the decoding input frame rate and video resolution are reduced again until a decoding input frame rate and video resolution that meet the decoding output frame rate threshold are obtained.
7. The method according to any one of claims 1-6, characterized in that, If the initial decoding rendering configuration information does not include the decoding chip type, the method further includes: Obtain M decoding chips from the terminal device, where M is a positive integer greater than 1; Determine the decoding output frame rate for each of the M decoding chips under the initial decoding rendering configuration information; Add the type corresponding to the decoding chip with the highest decoding output frame rate among the M decoding chips to the initial decoding rendering configuration information to obtain new initial decoding rendering configuration information.
8. The method according to any one of claims 2-6, characterized in that, The single-frame decoding latency threshold is the reciprocal of the sum of the number of frames and 1, multiplied by the decoding input frame rate. The method further includes: If the initial single-frame decoding delay is less than or equal to the reciprocal, then the initial single-frame decoding delay is determined to satisfy the single-frame decoding delay threshold. If the initial single-frame decoding delay is greater than the reciprocal, then it is determined that the initial single-frame decoding delay does not meet the single-frame decoding delay threshold.
9. The method according to claim 8, characterized in that, The initial rendering parameters include at least one of a rendering control type and a rendering frame dropping strategy. Adjusting the initial rendering parameters to obtain target rendering parameters that satisfy the single-frame decoding latency threshold includes: Adjust at least one of the rendering control type and the rendering frame dropping strategy to obtain target rendering parameters that meet the single-frame decoding delay threshold.
10. The method according to claim 9, characterized in that, The step of adjusting at least one of the rendering control type and the rendering frame dropping strategy to obtain target rendering parameters that satisfy the single-frame decoding latency threshold includes: By first adjusting the rendering control type and then adjusting the rendering frame dropping strategy, at least one of the rendering control type and the rendering frame dropping strategy is adjusted to obtain the target rendering parameters that meet the single-frame decoding latency threshold.
11. The method according to claim 9, characterized in that, If adjusting the rendering parameters fails to obtain the target rendering parameters that satisfy the single-frame decoding latency threshold, the method further includes: Adjust the video encoding multi-frame reference in the initial decoding and rendering configuration information to a video encoding single-frame reference.
12. The method according to any one of claims 1-6, characterized in that the method further includes: Second indication information is sent to the cloud server. The second indication information is used to indicate the target decoding and rendering configuration information of the terminal device, so that the cloud server performs encoding according to the target decoding and rendering configuration information.
13. A device for determining decoding configuration parameters, characterized in that, Applied to terminal devices, including: A determining unit is used to determine the initial decoding and rendering configuration information of a terminal device. The initial decoding and rendering configuration information includes the initial decoding parameters and initial rendering parameters of the terminal device. The initial decoding and rendering configuration information is the target decoding and rendering configuration information of N terminal devices with the highest probability of occurrence, where N is a positive integer greater than 1. The detection unit is used to decode the test bitstream under the initial decoding rendering configuration information to obtain the initial decoding output frame rate and initial single-frame decoding delay corresponding to the initial decoding rendering configuration information. An adjustment unit is configured to adjust at least one of the initial decoding parameters and initial rendering parameters if at least one of the initial decoding output frame rate and the initial single-frame decoding delay does not meet the corresponding threshold, thereby obtaining the target decoding rendering configuration information of the terminal device, wherein the decoding output frame rate and single-frame decoding delay corresponding to the target decoding rendering configuration both meet the corresponding threshold.
14. An electronic device, characterized in that it includes: A processor and a memory, the memory being used to store a computer program, and the processor being used to invoke and run the computer program stored in the memory to perform the method of any one of claims 1 to 12.
15. A computer storage medium, characterized in that, It includes computer program instructions that cause a computer to perform the method of any one of claims 1 to 12.