Address-aligned SIMD (Single Instruction Multiple Data) acceleration method for an array addition assembly library program

An address-alignment technique for assembly library programs, applied in fields such as machine execution devices and concurrent instruction execution, that addresses the weak optimization effect and limited practical adoption of earlier approaches.

Status: Inactive
Publication Date: 2013-05-01
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

[0010] The above methods emphasize, from different angles, the importance of address-aligned memory access to SIMD programming and propose specific solutions for specific applications. However, these solutions are limited to preprocessing arrays in a high-level language so that the compiler can optimize more effectively; the optimization effect is not obvious, and the approach is difficult to popularize and apply in practice.



Examples


Embodiment Construction

[0125] Figure 1 shows an example of the vector register format.

[0126] The illustrated vector register holds w double-precision floating-point components; that is, the vector width is w, and the data width size is the number of bytes occupied by one double-precision floating-point value.
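As a rough illustration of the register format described in [0126], the two parameters can be captured in a small C structure. This is only a sketch: the names `vec_format` and `vec_register_bytes` are hypothetical and do not come from the patent, and the total register size is simply the product of w and size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a vector register holding w lanes. */
typedef struct {
    size_t w;     /* vector width: number of double-precision components */
    size_t size;  /* data width: bytes occupied by one component */
} vec_format;

/* Total number of bytes in one vector register: w components of size bytes. */
static size_t vec_register_bytes(vec_format f) {
    assert(f.w > 0 && f.size > 0);
    return f.w * f.size;
}
```

For example, a format with w = 4 and size = `sizeof(double)` describes a register of `4 * sizeof(double)` bytes, which is also a natural candidate for the alignment boundary used in the later steps.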

[0127] Figure 2 is the general flowchart of the present invention. The invention comprises the following steps:

[0128] Step 1: Obtain the SIMD vector width w and data width size from the target architecture information.

[0129] Step 2: Calculate the array X address alignment offset.

[0130] Step 3: Calculate the array Y address alignment offset.

[0131] Step 4: Determine whether the addresses of array X and array Y are aligned according to the address alignment offset. If they are aligned, execute step 5; otherwise, execute step 6.

[0132] Step 5: Perform vector operations on X and Y.

[0133] Step 6: Perform vector assembly and hybrid operations on X and Y.

[0134] Step Seve...
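Steps 1 to 4 above can be sketched in portable C as follows. The patent text available here does not define the offset formula, so this sketch assumes that the alignment offset is the number of bytes by which an address lies past the previous w × size boundary, and that "aligned" means both offsets are zero. The names `align_offset`, `choose_path`, `PATH_VECTOR` and `PATH_HYBRID` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definition: bytes by which p lies past the previous
 * (w * size)-byte boundary. Zero means the address is aligned. */
static size_t align_offset(const void *p, size_t w, size_t size) {
    assert(w > 0 && size > 0);
    return (size_t)((uintptr_t)p % (w * size));
}

/* The two branches of Step 4. */
typedef enum { PATH_VECTOR, PATH_HYBRID } add_path;

static add_path choose_path(const double *x, const double *y, size_t w) {
    size_t size = sizeof(double);             /* Step 1: data width (w is passed in) */
    size_t off_x = align_offset(x, w, size);  /* Step 2: offset of array X */
    size_t off_y = align_offset(y, w, size);  /* Step 3: offset of array Y */
    /* Step 4: both aligned -> Step 5 (vector path), otherwise Step 6 (hybrid path). */
    return (off_x == 0 && off_y == 0) ? PATH_VECTOR : PATH_HYBRID;
}
```

In an actual assembly library routine w would be fixed by the target architecture; it is passed as a parameter here only to keep the sketch independent of any particular instruction set.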


Abstract

The invention discloses an address-aligned SIMD (Single Instruction Multiple Data) acceleration method for an array addition assembly library program, aiming to improve the execution speed of that program. The method first obtains the SIMD vector width w and the data width size from the target architecture, then calculates the address alignment offsets of array X and array Y and uses them to judge whether the addresses of the two arrays are aligned. If they are aligned, vector addition is performed directly on X and Y. If not, vector assembly and hybrid operation are performed: scalar operations on the front parts of X and Y, vector assembly and vector operations on the middle parts using a register mask, and scalar operations on the tail parts that do not meet the requirements for vector operation. The method lets the assembly library program load and store data at aligned addresses, which accelerates SIMD program execution and improves SIMD computing performance.
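The three-part split in the unaligned case (scalar front, vector middle, scalar tail) can be sketched in portable C. This is not the patent's assembly routine: it assumes the operation is the in-place update `y[i] += x[i]`, it assumes a vector width of 4 doubles, it peels scalar iterations only until `y` reaches an aligned address, and it replaces the register-mask vector assembly of the middle part with a plain inner loop standing in for one vector instruction. The names `W`, `vec_add_block` and `hybrid_add` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { W = 4 };  /* assumed vector width w, in doubles */

/* Stand-in for a single vector add over W lanes. */
static void vec_add_block(double *y, const double *x) {
    for (size_t k = 0; k < W; ++k) y[k] += x[k];
}

/* Computes y[i] += x[i] for i in [0, n).
 * Returns how many elements were handled by the vector middle part. */
static size_t hybrid_add(double *y, const double *x, size_t n) {
    size_t bytes = W * sizeof(double);
    size_t off = (size_t)((uintptr_t)y % bytes);
    assert(off % sizeof(double) == 0);  /* y must be a valid double pointer */

    /* Front part: scalar adds until y + i sits on a W * size boundary. */
    size_t head = (off == 0) ? 0 : (bytes - off) / sizeof(double);
    if (head > n) head = n;
    size_t i = 0;
    for (; i < head; ++i) y[i] += x[i];

    /* Middle part: whole blocks of W elements. */
    size_t vectorized = 0;
    for (; i + W <= n; i += W) {
        vec_add_block(y + i, x + i);
        vectorized += W;
    }

    /* Tail part: leftover elements that do not fill a vector. */
    for (; i < n; ++i) y[i] += x[i];
    return vectorized;
}
```

With `y` starting one double past a 32-byte boundary and n = 10, the sketch handles 3 elements in the front part, 4 in the middle part, and 3 in the tail. When X and Y are misaligned by different amounts, aligned loads of X in the middle part would need the vector assembly step the abstract mentions, which this sketch does not attempt to reproduce.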

Description

Technical field

[0001] The invention relates to an address-aligned SIMD (Single Instruction Multiple Data) accelerated calculation method for array addition operations, in particular to an address-aligned SIMD acceleration method for an array addition assembly library program.

Background technique

[0002] A traditional CPU scalar floating-point unit can perform only one floating-point operation at a time, while a SIMD functional unit can complete multiple floating-point operations at a time, which makes it an important means of improving microprocessor speed. To exploit it, SIMD operation programs must be designed.

[0003] Existing SIMD extension processors are very sensitive to the address behavior of data accesses. In general, SIMD extensions support only sequential, address-aligned memory loads and stores. Examples include the AltiVec extension of the PowerPC processor, the MVI extension of the ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F9/38
Inventor: 迟利华, 刘杰, 甘新标, 晏益慧, 徐涵, 胡庆丰, 龚春叶, 冯华, 蒋杰
Owner: NAT UNIV OF DEFENSE TECH