Method for optimizing the fast DCT algorithm based on parallel processing in AVS
A parallel-processing fast-algorithm technique, applied in the field of audio and video coding and decoding, that reduces the amount of computation and improves computation speed
Embodiment approach
[0035] The invention provides a method for optimizing the DCT fast algorithm based on parallel processing in the AVS standard, comprising the following steps:
[0036] Step 1. Data alignment:
[0037] Step 1.1, align the data to a whole-byte boundary within one cycle; 128-bit registers require 16-byte alignment;
[0038] Step 1.2, load the aligned data of the 8×8 data block, one by one, into the registers of the corresponding instruction set, such as MMX registers (64-bit) or SSE2 registers (128-bit);
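Step 1 can be illustrated with a minimal sketch in C using SSE2 intrinsics. It is not taken from the patent text: the type block8x8_t and the functions load_block_rows and store_block_rows are hypothetical names, and 16-bit coefficients are assumed so that one row of the 8×8 block fills exactly one 128-bit register.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical block type (assumption): 8x8 coefficients of 16 bits each.
   One row is 8 * 2 = 16 bytes, i.e. exactly one 128-bit SSE2 register.
   alignas(16) gives the 16-byte alignment required by Step 1.1. */
typedef struct {
    alignas(16) int16_t coeff[8][8];
} block8x8_t;

/* Step 1.2: load the eight aligned rows, one by one, into eight registers. */
static void load_block_rows(const block8x8_t *blk, __m128i rows[8])
{
    /* _mm_load_si128 requires a 16-byte-aligned address. */
    assert(((uintptr_t)blk->coeff % 16) == 0);
    for (int r = 0; r < 8; ++r)
        rows[r] = _mm_load_si128((const __m128i *)blk->coeff[r]);
}

/* Inverse helper: write the eight registers back to an aligned block. */
static void store_block_rows(block8x8_t *blk, const __m128i rows[8])
{
    assert(((uintptr_t)blk->coeff % 16) == 0);
    for (int r = 0; r < 8; ++r)
        _mm_store_si128((__m128i *)blk->coeff[r], rows[r]);
}
```

Because every row starts on a 16-byte boundary, the aligned load and store intrinsics can be used throughout; data that is not aligned would need _mm_loadu_si128 and _mm_storeu_si128 instead.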
[0039] Step 2. Temporary data storage, used when a register is needed but the register bank is full:
[0040] Step 2.1, set aside a temporary data storage space;
[0041] Step 2.2, store the data held in a register into the temporary data storage space;
[0042] Step 2.3, load the data back from the temporary data storage space;
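The three sub-steps of Step 2 can be sketched as follows, again in C with SSE2 intrinsics. The names scratch_t, spill and reload and the size of four slots are assumptions made for illustration only. When intrinsics are used, the compiler normally performs register allocation and spilling itself; explicit temporary storage of this kind mainly matters in hand-written assembly.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Step 2.1: set aside a 16-byte-aligned temporary storage space.
   Four slots of 128 bits each is an arbitrary, hypothetical size. */
typedef struct {
    alignas(16) int16_t slot[4][8];
} scratch_t;

/* Step 2.2: store the contents of a register into slot i. */
static void spill(scratch_t *s, int i, __m128i v)
{
    assert(i >= 0 && i < 4);
    _mm_store_si128((__m128i *)s->slot[i], v);
}

/* Step 2.3: load the saved data back from slot i into a register. */
static __m128i reload(const scratch_t *s, int i)
{
    assert(i >= 0 && i < 4);
    return _mm_load_si128((const __m128i *)s->slot[i]);
}
```

Keeping the temporary space 16-byte aligned means the save and restore use the same aligned instructions as the loads in Step 1.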
[0043] Step 3. Instruction pairing: execute two different, non-conflicting instructions in the same cycle;
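As an illustration of Step 3, the sketch below writes an addition and a subtraction on the same inputs back to back. Neither result depends on the other, so the two instructions do not conflict and are candidates for pairing. The butterfly form (sum and difference of two rows, common in fast DCT algorithms) and the names butterfly and lane are assumptions, not taken from the patent text. Whether the two instructions actually issue in the same cycle depends on the processor and on how the compiler schedules them; the source code can only expose the independence.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Two independent operations on the same inputs: the add (paddw) and the
   subtract (psubw) read a and b but do not read each other's result. */
static void butterfly(__m128i a, __m128i b, __m128i *sum, __m128i *diff)
{
    *sum  = _mm_add_epi16(a, b);  /* eight 16-bit additions    */
    *diff = _mm_sub_epi16(a, b);  /* eight 16-bit subtractions */
}

/* Helper for inspection: return 16-bit lane i of a register. */
static int16_t lane(__m128i v, int i)
{
    int16_t tmp[8];
    assert(i >= 0 && i < 8);
    _mm_storeu_si128((__m128i *)tmp, v);
    return tmp[i];
}
```

By contrast, an instruction that consumes the result of the one immediately before it cannot be paired with it, which is why independent operations are placed next to each other.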
[0044] Step ...