
High-performance implementation method of multi-dimensional FFT on the domestic Shenwei (Sunway) 26010 many-core processor

A technique concerning many-core processors and implementation methods, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc.

Active Publication Date: 2020-02-11
INST OF SOFTWARE - CHINESE ACAD OF SCI
5 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0011] The technical solution of the present invention addresses the lack of an open-source FFT library that can directly exploit the computing power of this platform's computing cores. It provides a high-performance implementation method of multi-dimensional FFT on the domestic Shenwei 26010 many-core processor that uses a two-layer decomposed FFT algorithm structure, the Bluestein algorithm, batched multi-row one-dimensional FFT calculation, batched multi-column one-dimensional FFT calculation, and a variety of high-performance optimization methods to improve the performance of multi-dimensional FFT.
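The Bluestein algorithm named above lets an FFT of arbitrary size be computed with power-of-two transforms only. The following is a minimal sketch of the textbook algorithm in plain Python, not the patented Shenwei implementation; the function names `fft_pow2` and `bluestein` are hypothetical and chosen for illustration.

```python
import cmath

def fft_pow2(x, sign=-1):
    """Recursive radix-2 FFT; len(x) must be a power of two.
    sign=-1 gives the forward transform, sign=+1 the unscaled inverse."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_pow2(x[0::2], sign)
    odd = fft_pow2(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def bluestein(x):
    """Forward DFT of arbitrary length n via a circular convolution
    whose length m is a power of two with m >= 2n - 1."""
    n = len(x)
    m = 1
    while m < 2 * n - 1:
        m *= 2
    # Chirp w[k] = exp(-i*pi*k^2/n); reducing k^2 mod 2n keeps the angle small.
    w = [cmath.exp(-1j * cmath.pi * ((k * k) % (2 * n)) / n) for k in range(n)]
    # Chirp-premultiplied input, zero-padded to length m.
    a = [x[k] * w[k] for k in range(n)] + [0j] * (m - n)
    # Conjugate chirp, wrapped symmetrically so negative lags land at m - k.
    b = [0j] * m
    b[0] = w[0].conjugate()
    for k in range(1, n):
        b[k] = b[m - k] = w[k].conjugate()
    # Circular convolution through three power-of-two FFTs.
    fa = fft_pow2(a)
    fb = fft_pow2(b)
    conv = fft_pow2([p * q for p, q in zip(fa, fb)], sign=+1)
    # Scale the inverse transform by 1/m and apply the output chirp.
    return [w[k] * conv[k] / m for k in range(n)]
```

For example, `bluestein([1, 2, 3, 4, 5])` transforms a length-5 sequence using only length-16 power-of-two FFTs internally.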



Examples


Specific embodiments

[0044] As shown in Figure 1, the present invention is a high-performance implementation method of multi-dimensional FFT on the domestic Shenwei 26010 many-core processor. The design framework comprises four layers: the interface layer, the main-core layer, the slave-core layer, and the core layer. The specific implementation is as follows:

[0045] 1. Interface layer: descriptor operation

[0046] (1) Set up the descriptor: set the basic information required for the FFT calculation, such as data precision, data dimensions, data size, and transform type. The data precision is either double precision or single precision; the data is multi-dimensional; the data size is the size of the input sequence and can be arbitrary; the transform type is complex-to-complex.

[0047] (2) Set the descriptor of the input data: set the input and output strides (spans) of the FFT and the calculation parameters of the batched FFT; the parameters can be specified by...
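The descriptor steps above can be sketched as follows. This is an illustrative model only: the class `FFTDescriptor`, its field names, and the helper functions are hypothetical and are not taken from the patent's interface. As a simplification, one stride is shared by input and output and the transform is done in place, whereas the text describes separate input and output strides. The sketch shows how a stride, a distance between sequences, and a batch count are enough to express both the multi-row and the multi-column one-dimensional passes of a 2-D FFT over a flat row-major buffer.

```python
import cmath
from dataclasses import dataclass

@dataclass
class FFTDescriptor:            # hypothetical; names are illustrative
    precision: str              # "double" or "single"
    shape: tuple                # size of each dimension (any size)
    transform: str = "c2c"      # complex-to-complex
    stride: int = 1             # gap between consecutive elements of one sequence
    dist: int = 0               # gap between the starts of consecutive sequences
    batch: int = 1              # number of 1-D sequences per call

def dft_1d(seq):
    """Direct O(n^2) DFT, used here only to keep the sketch short."""
    n = len(seq)
    return [sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def batched_fft(data, n, desc):
    """Transform desc.batch strided sequences of length n in place."""
    for b in range(desc.batch):
        idx = [b * desc.dist + j * desc.stride for j in range(n)]
        out = dft_1d([data[i] for i in idx])
        for i, v in zip(idx, out):
            data[i] = v

def fft_2d(data, rows, cols):
    """2-D FFT of a row-major flat buffer as two batched 1-D passes."""
    # Multi-row pass: each row is contiguous, rows start `cols` apart.
    batched_fft(data, cols,
                FFTDescriptor("double", (rows, cols), stride=1, dist=cols, batch=rows))
    # Multi-column pass: column elements are `cols` apart, columns start 1 apart.
    batched_fft(data, rows,
                FFTDescriptor("double", (rows, cols), stride=cols, dist=1, batch=cols))
    return data
```

A call such as `fft_2d([complex(i) for i in range(6)], 2, 3)` transforms a 2 x 3 array; higher dimensions would add one batched pass per extra dimension.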


Abstract

The invention proposes a high-performance implementation method of multi-dimensional FFT on the domestic Sunway 26010 many-core processor. On the Sunway 26010 platform, a two-layer decomposition algorithm for the one-dimensional FFT is applied to multi-dimensional FFT computation, and several FFT types are designed, including FFTs with input and output strides, batched multi-row one-dimensional FFTs, batched multi-column one-dimensional FFTs, and FFTs of both power-of-two and non-power-of-two sizes, improving the performance of the multi-dimensional FFT. Compared with the open-source FFTW library on the same platform, the performance of the multi-dimensional FFT is greatly improved: the average speedup is 22.283 and the maximum speedup is 30.340.
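The excerpt does not spell out the two-layer decomposition, so the following sketch assumes it resembles the standard Cooley-Tukey factorization N = N1 x N2: a first layer of N1 sub-transforms of length N2, a twiddle-factor multiplication, and a second layer of N2 sub-transforms of length N1. The names `dft` and `two_layer_fft` are hypothetical, and a direct DFT stands in for the optimized sub-transforms.

```python
import cmath

def dft(seq):
    """Direct DFT used as a stand-in for the small sub-transforms."""
    n = len(seq)
    return [sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def two_layer_fft(x, n1, n2):
    """Length n1*n2 DFT computed as two layers of smaller DFTs.

    Input index  j = j1 + n1*j2, output index k = k1*n2 + k2.
    """
    n = n1 * n2
    assert len(x) == n
    # Layer 1: n1 transforms of length n2 over the strided slices x[j1::n1].
    inner = [dft(x[j1::n1]) for j1 in range(n1)]
    # Twiddle factors exp(-2*pi*i*j1*k2/n) couple the two layers.
    for j1 in range(n1):
        for k2 in range(n2):
            inner[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)
    # Layer 2: n2 transforms of length n1, scattered to the output order.
    out = [0j] * n
    for k2 in range(n2):
        col = dft([inner[j1][k2] for j1 in range(n1)])
        for k1 in range(n1):
            out[k1 * n2 + k2] = col[k1]
    return out
```

Choosing N1 and N2 so that each sub-transform fits in fast local memory is the usual motivation for this kind of factorization; whether the patent selects the factors this way is not stated in the excerpt.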

Description

Technical field

[0001] The invention relates to the technical field of Fourier transform computation, and in particular to a high-performance implementation method of multi-dimensional FFT on the domestic Shenwei 26010 many-core processor.

Background

[0002] The Discrete Fourier Transform (DFT) plays an important role in digital signal processing and image processing. The Fast Fourier Transform (FFT) is a fast algorithm for computing the DFT and its inverse. After Cooley and Tukey proposed it in 1965, the computational complexity of the DFT was reduced from O(N^2) to O(N log N), where N is the transform size, and the algorithm spread rapidly through many fields of scientific research. Because of its complex memory access pattern and large volume of data communication, the FFT has become an integral part of the HPC Challenge benchmark, which is used to evaluate the architecture and overall performance of supercomputers.

[0003] F...
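The complexity figures quoted in paragraph [0002] can be made concrete with the standard radix-2 argument, sketched here for N a power of two. The symbols x_j, X_k, E_k, O_k, and T(N) are notation introduced for this illustration and do not appear in the source.

```latex
% DFT definition: N outputs, each a sum of N terms, hence O(N^2) operations.
X_k = \sum_{j=0}^{N-1} x_j \, e^{-2\pi i jk/N}, \qquad k = 0, \dots, N-1

% Split into even- and odd-indexed inputs (E_k, O_k are length-N/2 DFTs).
X_k = E_k + e^{-2\pi i k/N} O_k, \qquad
X_{k+N/2} = E_k - e^{-2\pi i k/N} O_k, \qquad k = 0, \dots, N/2 - 1

% Two half-size transforms plus O(N) combining work per level.
T(N) = 2\,T(N/2) + O(N) \;\Longrightarrow\; T(N) = O(N \log N)
```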


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/14
CPC: G06F17/142
Inventors: 杨超, 赵玉文, 张佳佳, 刘芳芳, 孙乔
Owner: INST OF SOFTWARE - CHINESE ACAD OF SCI