
A GPU-accelerated QR factorization method for a large number of isomorphic sparse matrices

A QR decomposition and sparse matrix technology, applied in instrumentation, computing, and concurrent instruction execution. It addresses programs that cannot fully exploit the GPU, data indexing methods that have not been studied in depth, and thread designs that have not been deeply optimized. The effects are faster floating-point calculation, less time spent on power flow calculation, and higher memory operation speed.

Active Publication Date: 2019-01-29
SOUTHEAST UNIV

AI Technical Summary

Problems solved by technology

With reasonable scheduling between the CPU and the GPU, the QR decomposition of the coefficient matrix of an equation system can be completed quickly and the sparse linear system solved. Scholars at home and abroad have begun to study methods for accelerating the solution of sparse linear systems on the GPU, but their thread designs have not been optimized in depth: threads are designed purely around how the computation is distributed, without in-depth study of per-thread calculation methods or data indexing methods, so the programs cannot fully exploit the advantages of the GPU.
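As a point of reference for how a QR decomposition solves a linear system, the following is a minimal dense sketch in plain Python. It is not the patent's GPU method: it assumes a small, square, nonsingular matrix stored as a list of rows, and the function name householder_qr_solve is hypothetical. It reduces A to an upper triangular R with Householder reflections, applies the same reflections to b, and then back-substitutes.

```python
import math


def householder_qr_solve(a, b):
    """Solve a x = b for a square, nonsingular matrix a (list of rows).

    Dense illustration only; the inputs are copied, not modified.
    """
    n = len(a)
    r = [row[:] for row in a]  # becomes the upper triangular factor R
    y = b[:]                   # becomes Q^T b
    for k in range(n):
        # Norm of column k from the diagonal downward.
        norm = math.sqrt(sum(r[i][k] ** 2 for i in range(k, n)))
        if norm == 0.0:
            continue  # nothing to eliminate (matrix would be singular)
        # Choose the sign that avoids cancellation in v[k].
        alpha = -math.copysign(norm, r[k][k])
        v = [0.0] * n
        v[k] = r[k][k] - alpha
        for i in range(k + 1, n):
            v[i] = r[i][k]
        vtv = sum(v[i] * v[i] for i in range(k, n))
        # Apply H = I - 2 v v^T / (v^T v) to the remaining columns.
        for j in range(k, n):
            s = 2.0 * sum(v[i] * r[i][j] for i in range(k, n)) / vtv
            for i in range(k, n):
                r[i][j] -= s * v[i]
        # Apply the same reflection to the right-hand side.
        s = 2.0 * sum(v[i] * y[i] for i in range(k, n)) / vtv
        for i in range(k, n):
            y[i] -= s * v[i]
    # Back substitution on R x = Q^T b.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i] - sum(r[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / r[i][i]
    return x
```

For example, calling householder_qr_solve([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]], [6.0, 10.0, 8.0]) should recover a solution close to [1, 2, 3], up to floating-point rounding.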



Examples


Embodiment Construction

[0031] The technical solution of the present invention will be further described below in conjunction with the accompanying drawings.

[0032] As shown in Figure 4, the GPU-accelerated QR decomposition method of the present invention for a large number of isomorphic sparse matrices comprises the following steps:

[0033] (1) A large number of isomorphic sparse matrices refers to a series of n-order matrices A1 to AN with the same sparse structure. Symbolic QR decomposition is performed on the sparse matrix A1 on the CPU to obtain the sparse structures of the Householder transformation matrix V1 and the upper triangular matrix R1; after symbolic decomposition, the sparse structure of A1 equals that of V1 + R1. According to the sparse structure of R1, the columns of A1 are grouped into parallel layers. Because A1 to AN have the same sparse structure, A1 to AN have the same Householder transformation matrix sparse structure...
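The parallel layering in step (1) can be pictured with a small sketch. The text above is truncated before it states the exact layering rule, so the rule used here is an assumption: column j is taken to depend on an earlier column k whenever R1 has a nonzero at position (k, j), and a column is placed one layer after the deepest column it depends on. The function name layer_columns and the set-of-pairs pattern format are hypothetical.

```python
def layer_columns(r_pattern, n):
    """Group columns 0..n-1 into layers that can be processed in parallel.

    r_pattern: set of (row, col) positions of nonzeros in the upper
    triangular factor R (diagonal entries may be included; they are ignored).
    Assumed rule: column j depends on column k < j when (k, j) is nonzero.
    """
    level = [0] * n
    for j in range(n):
        deps = [level[k] for k in range(j) if (k, j) in r_pattern]
        if deps:
            level[j] = max(deps) + 1
    layers = [[] for _ in range(max(level) + 1)] if n else []
    for j, lv in enumerate(level):
        layers[lv].append(j)
    return layers


# Example pattern: R has off-diagonal nonzeros at (0, 2), (1, 2), and (2, 3).
pattern = {(0, 0), (1, 1), (2, 2), (3, 3), (0, 2), (1, 2), (2, 3)}
print(layer_columns(pattern, 4))  # [[0, 1], [2], [3]]
```

Because the layering depends only on the sparsity pattern, it would be computed once from A1 and reused for every matrix A1 to AN, which matches the statement that all of them share the same layering result.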


Abstract

The invention discloses a GPU-accelerated QR decomposition method for a large number of isomorphic sparse matrices. The method includes the following steps: symbolic QR decomposition is performed on a sparse matrix A1 on the CPU to obtain the sparse structures of the Householder transformation matrix V1 and the upper triangular matrix R1; the columns of A1 are grouped into parallel layers, where A1 to AN share the same Householder transformation matrix sparse structure V1, the same upper triangular matrix sparse structure R1, and the same parallel layering result; the CPU transmits the data needed for QR decomposition to the GPU; task allocation and device memory optimization are carried out, in which the QR decomposition tasks for matrices A1 to AN are allocated to a large number of threads on the GPU and memory usage is optimized according to the coalesced (merged) access principle; and the layered QR decomposition kernel function Batch_QR is executed on the GPU. With the CPU controlling the program flow and the GPU handling the intensive floating-point calculation, the method can greatly increase the QR decomposition speed for a large number of isomorphic sparse matrices.
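One common way to satisfy the merged (coalesced) access principle for a batch of matrices that share one sparsity pattern is to interleave their values, so that the threads handling different matrices read neighbouring addresses. The abstract does not spell out the layout actually used, so the sketch below is an assumption for illustration; the names interleave and slot_addresses are hypothetical, and plain Python lists stand in for GPU device memory.

```python
def interleave(batch):
    """Store a batch of value arrays in slot-major (interleaved) order.

    batch[t][s] is the value of nonzero slot s in matrix t. All matrices
    are assumed to share one sparsity pattern, so every row of batch has
    the same length. In the result, slot s of matrix t sits at s * N + t,
    where N is the number of matrices.
    """
    n_mat = len(batch)
    nnz = len(batch[0])
    flat = [0.0] * (n_mat * nnz)
    for t, vals in enumerate(batch):
        if len(vals) != nnz:
            raise ValueError("all matrices must have the same number of nonzeros")
        for s, v in enumerate(vals):
            flat[s * n_mat + t] = v
    return flat


def slot_addresses(s, n_mat):
    """Addresses touched when thread t = 0..n_mat-1 each reads slot s."""
    return [s * n_mat + t for t in range(n_mat)]


batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
print(interleave(batch))     # [1.0, 10.0, 2.0, 20.0, 3.0, 30.0]
print(slot_addresses(1, 2))  # [2, 3]
```

With this layout, when thread t works on matrix t and all threads process the same nonzero slot at the same time, their reads fall on consecutive addresses, which is the access pattern GPU hardware can combine into few memory transactions.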

Description

Technical field

[0001] The invention belongs to the field of high-performance computing applications in power systems, and in particular relates to a GPU-accelerated QR decomposition method for a large number of isomorphic sparse matrices.

Background

[0002] Power flow calculation is the most widely used, most basic, and most important electrical calculation in power systems. When studying power system operation modes and planning schemes, power flow calculation is needed to compare the feasibility, reliability, and economy of operation modes or planned power supply schemes. Real-time monitoring of power system operating status requires online power flow calculation. In traditional Newton-Raphson power flow calculation, solving the correction equations accounts for 70% of the power flow calculation time, so the speed of solving the correction equations affects the overall performance of the program.

[0003] The fault power ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/38
CPC: G06F9/3877
Inventor: 周赣孙立成秦成明张旭柏瑞冯燕钧傅萌
Owner: SOUTHEAST UNIV