A Method for FPGA Accelerated Realization of Singular Value Decomposition of Matrix

A singular value decomposition technique, applied in the field of signal processing, that addresses problems such as difficult FPGA development, heavy workload, and small supported matrix size, and achieves easier FPGA development and implementation, reduced data transfer volume, and improved parallel efficiency.

Active Publication Date: 2021-12-24
ZHEJIANG LAB

AI Technical Summary

Problems solved by technology

[0003] However, singular value decomposition involves a large number of mathematical operations and loop iterations. For large-scale matrices in particular, it is both computationally intensive and storage-intensive, which places harsh requirements on the hardware and makes FPGA development extremely difficult and laborious.
In published research and inventions, because FPGA resources are limited and singular value decomposition is itself highly complex, current FPGA-based implementations generally handle only small matrices, have poor real-time performance, and suffer from shortcomings such as supporting only fixed-size matrix input.

Examples

Embodiment Construction

[0021] The present invention will be described in detail below according to the accompanying drawings and preferred embodiments, and the purpose and effect of the present invention will become clearer. It should be understood that the specific embodiments described here are only used to explain the present invention, and are not intended to limit the present invention.

[0022] The technical terms are explained first:

[0023] (1) FPGA: Field Programmable Gate Array

[0024] (2) BRAM: Block RAM, FPGA internal block RAM

[0025] (3) Jacobi: in this invention, refers specifically to the unilateral (one-sided) Jacobi rotation, which is commonly used in FPGA-based matrix singular value decomposition

[0026] (4) round-robin: round-robin scheduling, a commonly used scheduling mechanism for unilateral Jacobi rotation singular value decomposition

[0027] (5) DRAM: Dynamic Random Access Memory, here specifically refers to off-chip DRAM, such as DDR3 (or DDR4)...
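
As a rough illustration of terms (3) and (4), the following Python sketch shows a single unilateral Jacobi rotation applied to two column vectors, together with a round-robin schedule of column pairs. The names `rotate_pair` and `round_robin_rounds` are hypothetical, and the circle-method ordering used here is only one common round-robin scheme, assumed for illustration; the source does not state which ordering the invention uses.

```python
import math


def rotate_pair(a, b):
    """Apply one unilateral (one-sided) Jacobi rotation to columns a and b.

    Returns two new columns that are mutually orthogonal and span the
    same plane as the inputs. The inputs are left unchanged.
    """
    alpha = sum(x * x for x in a)               # squared norm of a
    beta = sum(y * y for y in b)                # squared norm of b
    gamma = sum(x * y for x, y in zip(a, b))    # inner product of a and b
    if gamma == 0.0:
        return list(a), list(b)                 # already orthogonal
    zeta = (beta - alpha) / (2.0 * gamma)
    # Smaller-magnitude root of t^2 + 2*zeta*t - 1 = 0 (tangent of the angle).
    t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
    c = 1.0 / math.sqrt(1.0 + t * t)            # cosine
    s = c * t                                   # sine
    new_a = [c * x - s * y for x, y in zip(a, b)]
    new_b = [s * x + c * y for x, y in zip(a, b)]
    return new_a, new_b


def round_robin_rounds(p):
    """Yield p - 1 rounds of disjoint index pairs (circle method, p even).

    Across all rounds, every pair (i, j) with i < j appears exactly once.
    """
    if p % 2 != 0:
        raise ValueError("this sketch assumes an even number of items")
    idx = list(range(p))
    for _ in range(p - 1):
        yield [tuple(sorted((idx[i], idx[p - 1 - i]))) for i in range(p // 2)]
        # Keep the first entry fixed and rotate the others by one position.
        idx = [idx[0], idx[-1]] + idx[1:-1]
```

The pairs within one round are disjoint, so their rotations touch different columns and can in principle run in parallel, which is why this kind of schedule suits hardware implementations.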

Abstract

The invention discloses an FPGA-accelerated implementation method for singular value decomposition of a matrix. The method first divides a matrix of m rows × n columns stored in off-chip DRAM into p = n / k sub-blocks, each a group of k column vectors. The p sub-blocks are combined in pairs, alternately and in order, to obtain a small matrix of m rows × 2k columns that is written into the internal BRAM of the FPGA, where a unilateral Jacobi rotation transformation is performed. Half of the column vectors in the result are written back to the off-chip DRAM, and the other half are combined with the next sub-block to obtain a new m-row × 2k-column matrix. These operations are repeated on the FPGA until all p sub-blocks have been combined in pairs, completing one full round of unilateral Jacobi rotation transformation. Such rounds are repeated until the convergence condition is satisfied, which completes the singular value decomposition of the large m-row × n-column matrix. The invention adopts a divide-and-conquer decomposition strategy and alternate combination between sub-blocks, which improves the data reuse rate, reduces frequent data transfer, and relieves the bandwidth pressure of on-chip and off-chip data transmission.
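
The blocked flow described in the abstract can be sketched in software as follows. This is a minimal model under stated assumptions, not the patented hardware design: the list `dram` stands in for off-chip DRAM, the list `bram` for the on-chip working set of 2k columns, and the function names, the tolerance `tol`, the sweep limit `max_sweeps`, and the nested block-pair order are all hypothetical choices made for illustration. It computes singular values only, as the norms of the columns after convergence.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(cols, i, j, tol):
    """Rotate columns i and j of cols so they become orthogonal.

    Returns True if a rotation was applied, False if the pair was
    already orthogonal to within the relative tolerance tol.
    """
    alpha, beta = dot(cols[i], cols[i]), dot(cols[j], cols[j])
    gamma = dot(cols[i], cols[j])
    if abs(gamma) <= tol * math.sqrt(alpha * beta):
        return False
    zeta = (beta - alpha) / (2.0 * gamma)
    t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
    c = 1.0 / math.sqrt(1.0 + t * t)
    s = c * t
    cols[i], cols[j] = (
        [c * x - s * y for x, y in zip(cols[i], cols[j])],
        [s * x + c * y for x, y in zip(cols[i], cols[j])],
    )
    return True


def block_jacobi_singular_values(columns, k, tol=1e-12, max_sweeps=60):
    """Singular values of the matrix whose columns are given, largest first.

    columns: list of n columns, each a list of m floats, n divisible by k
    and n >= 2k. The input is not modified.
    """
    dram = [list(col) for col in columns]        # model of off-chip storage
    n = len(dram)
    if n % k != 0 or n < 2 * k:
        raise ValueError("n must be a multiple of k with at least two blocks")
    p = n // k                                   # number of sub-blocks
    for _ in range(max_sweeps):
        rotated = False
        for bi in range(p):
            resident = dram[bi * k:(bi + 1) * k]     # half that stays on chip
            for bj in range(bi + 1, p):
                # Combine two sub-blocks into an m x 2k working set.
                bram = resident + dram[bj * k:(bj + 1) * k]
                for i in range(2 * k):
                    for j in range(i + 1, 2 * k):
                        rotated |= orthogonalize(bram, i, j, tol)
                dram[bj * k:(bj + 1) * k] = bram[k:]   # write one half back
                resident = bram[:k]                    # keep the other half
            dram[bi * k:(bi + 1) * k] = resident
        if not rotated:                           # convergence condition
            break
    return sorted((math.sqrt(dot(col, col)) for col in dram), reverse=True)
```

In this model each sub-block is read from and written to `dram` once per block pair, while the resident half is reused across several pairings, which mirrors the data-reuse argument made in the abstract.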

Description

Technical field

[0001] The invention relates to the field of signal processing, and in particular to an FPGA-accelerated implementation method for matrix singular value decomposition.

Background technique

[0002] Singular value decomposition is an important matrix decomposition in linear algebra and is widely used in signal processing, image compression, and deep learning. In existing research it is mainly implemented in software on a CPU or GPU. In recent years, with the rapid development of FPGA technology, using FPGAs to implement matrix singular value decomposition has gradually become popular, especially in application scenarios where FPGAs are already deployed. Implementing matrix singular value decomposition on an FPGA instead of a GPU can reduce cost and power consumption, and compared with a CPU solution it can deliver better real-time, low-latency performance.

[0003] However, singular value decomposi...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/16
CPC: G06F17/16
Inventors: 胡塘, 李相迪, 徐志伟
Owner: ZHEJIANG LAB