A DNxHD VLC encoding method based on CUDA architecture

An encoding method, applied in image communication, digital video signal modification, electrical components, etc. It addresses the problem that GPU utilization is only 0.42% under slice-level parallelism, and it improves both GPU usage efficiency and VLC encoding speed.

Active Publication Date: 2020-04-07
HANGZHOU ARCVIDEO TECHNOLOGY CO LTD
Cites: 6, Cited by: 0

AI Technical Summary

Problems solved by technology

[0005] With slice-level parallelism, a video sequence with a resolution of 1920x1080 allows only 68 threads to run in parallel, one per slice. On a Tesla K10 graphics card, each GPU can run up to 16384 threads in parallel, so GPU utilization is only about 0.42% (68 / 16384).
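The 0.42% figure follows directly from the two numbers quoted above. A minimal sketch of the arithmetic, in which the function name `gpu_utilization_percent` is an illustrative assumption and not something from the patent:

```python
def gpu_utilization_percent(active_threads, max_threads):
    """Share of the GPU's parallel thread capacity that is in use, in percent."""
    return 100.0 * active_threads / max_threads


# Figures quoted in paragraph [0005]: one thread per slice, 68 slices,
# and up to 16384 parallel threads per GPU on a Tesla K10.
slice_threads = 68
max_parallel_threads = 16384

utilization = gpu_utilization_percent(slice_threads, max_parallel_threads)
print(f"{utilization:.2f}%")
```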

Examples

Detailed Description of the Embodiments

[0022] Embodiments of the present invention are described in detail below.

[0023] An embodiment of the present invention provides a DNxHD VLC encoding method based on the CUDA framework. Referring to Figure 2, the method includes the following steps:

[0024] Quantize the DCT coefficients of each 8x8 block, then load them into shared memory;

[0025] 64 threads process one macroblock synchronously;

[0026] When threadIdx = 0, thread 0 encodes the direct current coefficient (DC coefficient) using differential pulse code modulation (DPCM);

[0027] When threadIdx > 0, thread number threadIdx computes the threadIdx-th alternating current coefficient (AC coefficient) and performs VLC encoding on it;

[0028] The 64 threads process one macroblock being encoded at the same time;

[0029] After the calculation of all 8 blocks in the macroblock is completed, the encoding result is saved and encoding of the next macroblock begins.
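The division of work in steps [0025] to [0029] can be illustrated with a CPU-side emulation in which each loop iteration stands in for one CUDA thread. This is only a sketch under stated assumptions: the helper names, the placeholder AC "encoding" (the real DNxHD VLC tables and bitstream packing are not reproduced), and the single running DC predictor carried from block to block are all hypothetical and not taken from the patent.

```python
COEFFS_PER_BLOCK = 64   # one 8x8 block, one coefficient per thread
BLOCKS_PER_MB = 8       # eight 8x8 blocks per macroblock


def encode_dc_dpcm(dc, prev_dc):
    """DPCM step only: return the difference from the previous DC value."""
    return dc - prev_dc


def encode_ac_placeholder(index, value):
    """Hypothetical stand-in for the VLC table lookup of one AC coefficient."""
    return ("AC", index, value)


def encode_macroblock(blocks, prev_dc=0):
    """Emulate 64 threads working through the 8 blocks of one macroblock."""
    assert len(blocks) == BLOCKS_PER_MB
    symbols = []
    for block in blocks:
        out = [None] * COEFFS_PER_BLOCK            # stands in for shared memory
        for thread_idx in range(COEFFS_PER_BLOCK):  # one iteration per "thread"
            if thread_idx == 0:
                # Thread 0 handles the DC coefficient with DPCM.
                out[0] = ("DC", encode_dc_dpcm(block[0], prev_dc))
            else:
                # Thread i > 0 handles AC coefficient i.
                out[thread_idx] = encode_ac_placeholder(thread_idx, block[thread_idx])
        prev_dc = block[0]   # assumed predictor update, not specified in the source
        symbols.append(out)
    return symbols, prev_dc


# Toy input: each block has a DC value of 10, 20, ... and all-zero AC values.
blocks = [[(b + 1) * 10] + [0] * 63 for b in range(BLOCKS_PER_MB)]
symbols, last_dc = encode_macroblock(blocks)
print(symbols[1][0], last_dc)
```

In an actual CUDA kernel the inner loop would disappear: the 64 iterations would run as 64 threads of one thread block, with a synchronization barrier between blocks of the macroblock.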

[0030] Preferably, in the step of encoding the DC coefficient by...

Abstract

The invention discloses a DNxHD VLC encoding method based on CUDA architecture. The method decomposes and optimizes the original algorithm, refines the granularity of parallel computation, and encodes in a coefficient-level parallel manner: every 64 threads encode one macroblock, with each thread encoding one DCT coefficient at a time. The GPU can therefore run at full load, which improves GPU utilization and, in turn, VLC encoding speed.

Description

Technical field

[0001] The invention relates to a DNxHD VLC encoding method based on CUDA architecture.

Background

[0002] In the VC-3 / DNxHD standard, the input video data format is YUV 4:2:2 and the size of each block is fixed at 8x8. Each macroblock (MB for short) is divided into two parts: a 16x16 part composed of four blocks of the luminance component, and the 16x16 part composed of four blocks of the corresponding color-difference components. That is, each macroblock contains eight 8x8 blocks.

[0003] For frame coding of a video sequence with a resolution of 1920x1080, there are 120x68 macroblocks per frame. Although the concept of a slice is not explicitly defined in the standard, in actual coding the 120 macroblocks in each row constitute a slice. The VLC coding referred to in the present invention mainly means coding the DCT coefficients of the 8x8 blocks.

[0004] Usually, the slice parallel method is adopted when CPU encoding is used (in each slice, the 8x8...
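The macroblock and slice counts in the background section can be checked with a few lines of arithmetic. In the sketch below, the function name `macroblock_grid` is illustrative, and the coefficient-level thread count assumes that every macroblock in a frame gets its own group of 64 threads, which the text does not state explicitly.

```python
import math


def macroblock_grid(width, height, mb_size=16):
    """Number of macroblock columns and rows, rounding partial macroblocks up."""
    return math.ceil(width / mb_size), math.ceil(height / mb_size)


cols, rows = macroblock_grid(1920, 1080)   # 1080 / 16 = 67.5, rounded up to 68
total_macroblocks = cols * rows

# Slice-level parallelism: one thread per macroblock row (slice).
slice_level_threads = rows

# Coefficient-level parallelism: 64 threads per macroblock (assumed for all
# macroblocks of the frame at once).
coefficient_level_threads = total_macroblocks * 64

print(cols, rows, total_macroblocks, slice_level_threads, coefficient_level_threads)
```

Under that assumption, a single frame offers far more parallel work items than the 68 available with slice-level parallelism, which is the gap the method aims to close.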

Application Information

IPC(8): H04N19/176, H04N19/436, H04N19/625
CPC: H04N19/176, H04N19/436, H04N19/625
Inventors: 王伟, 黄进, 廖义
Owner: HANGZHOU ARCVIDEO TECHNOLOGY CO LTD