Generalized acceleration of matrix multiply accumulate operations

A matrix-operation technique for accelerating matrix multiply-accumulate (MMA) operations, which addresses the inefficiency of existing approaches.

Active Publication Date: 2018-11-23
NVIDIA CORP

AI Technical Summary

Problems solved by technology

Performing MMA operations with scalar arithmetic logic is inefficient, since each MMA operation must be decomposed into elementary arithmetic operations on scalar operands.


Examples

Detailed Description

[0024] Many modern applications could benefit from more efficient processor handling of matrix operations. Arithmetic operations on matrix operands are common in many algorithms, including deep learning, linear algebra, and graphics acceleration. Greater efficiency can be achieved by using parallel processing units, since matrix operations can be reduced to multiple parallel operations on different parts of the matrix operands.
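
As a rough illustration of reducing a matrix operation to independent operations on different parts of the operands, the following Python sketch splits the result of D = A x B + C into tiles and computes each tile separately. The names `mma_tile` and `mma_tiled`, the tile size, and the use of a thread pool are assumptions made for this example only; they are not taken from the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor

def mma_tile(A, B, C, rows, cols):
    """Compute D[i][j] = sum_k A[i][k] * B[k][j] + C[i][j] for one output tile."""
    K = len(B)
    return {(i, j): sum(A[i][k] * B[k][j] for k in range(K)) + C[i][j]
            for i in rows for j in cols}

def mma_tiled(A, B, C, tile=2):
    """Hypothetical helper: split the result into tiles and compute them independently."""
    M, N = len(A), len(B[0])
    # Each tile touches only some rows of A and some columns of B,
    # so the tiles can be processed in parallel without coordination.
    tiles = [(range(i, min(i + tile, M)), range(j, min(j + tile, N)))
             for i in range(0, M, tile) for j in range(0, N, tile)]
    D = [[0] * N for _ in range(M)]
    with ThreadPoolExecutor() as pool:
        for part in pool.map(lambda t: mma_tile(A, B, C, *t), tiles):
            for (i, j), value in part.items():
                D[i][j] = value
    return D

A = [[1, 2], [3, 4], [5, 6]]        # 3 x 2
B = [[1, 0, 2], [0, 1, 3]]          # 2 x 3
C = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
D = mma_tiled(A, B, C)
```

The thread pool here only stands in for parallel hardware; the point of the sketch is that no tile depends on the result of another.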

[0025] This disclosure explores a new paradigm for datapath design to accelerate matrix operations performed by a processor. The basic concept is that the datapath performs one or more dot product operations on multiple vector operands. Matrix operations can then be accelerated by reducing them to multiple dot product operations, and some dot product operations can benefit from data sharing within the datapath, which reduces the bandwidth between the register file and the inpu...
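
The reduction to dot products, and the benefit of sharing operands between them, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names (`dot`, `mma_block_shared`, `vector_loads`) and the simple load-counting model are hypothetical and are not drawn from the disclosure.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def mma_block_shared(a_rows, b_cols, c_block):
    """Compute a block of D = A x B + C from row vectors of A and column vectors of B.

    Each row vector and each column vector is passed in once and reused by
    every dot product that needs it (the assumed form of data sharing).
    """
    return [[dot(a, b) + c_block[i][j] for j, b in enumerate(b_cols)]
            for i, a in enumerate(a_rows)]

def vector_loads(m, n, shared):
    """Assumed cost model: operand vectors fetched for an m x n block of results."""
    # Without sharing, each of the m*n dot products fetches its own two vectors.
    # With sharing, m row vectors and n column vectors are fetched once each.
    return m + n if shared else 2 * m * n

# B = [[5, 6], [7, 8]] is supplied as its column vectors [5, 7] and [6, 8].
block = mma_block_shared([[1, 2], [3, 4]], [[5, 7], [6, 8]], [[0, 0], [0, 0]])
```

Under this model a 2 x 2 block of results needs 4 operand vectors with sharing and 8 without, which is one way to picture the bandwidth saving the paragraph describes.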

Abstract

A method for performing matrix multiply and accumulate (MMA) operations, a computer readable medium, and a processor are disclosed. The processor includes a datapath configured to execute the MMA operation to generate a plurality of elements of a result matrix at an output of the datapath. Each element of the result matrix is generated by calculating at least one dot product of corresponding pairs of vectors associated with matrix operands specified in an instruction for the MMA operation. A dot product operation includes the following steps: generating a plurality of partial products by multiplying each element of a first vector with a corresponding element of a second vector; aligning the plurality of partial products based on the exponents associated with each element of the first vector and each element of the second vector; and accumulating the plurality of aligned partial products into a result queue utilizing at least one adder.
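
The three dot product steps named in the abstract (partial products, exponent-based alignment, accumulation) can be sketched in Python as follows. This is an illustrative software model only. The significand width `MANT_BITS`, the guard bits `GUARD_BITS`, the choice to align to the largest exponent, and the function names are all assumptions made for the example; the abstract does not specify them.

```python
import math

MANT_BITS = 11   # assumption: significand width similar to half precision
GUARD_BITS = 3   # assumption: extra low-order bits kept during alignment

def decompose(x):
    """Return (sig, exp) such that x is approximately sig * 2**exp."""
    m, e = math.frexp(x)                      # x == m * 2**e, 0.5 <= |m| < 1 (or m == 0)
    return int(round(m * (1 << MANT_BITS))), e - MANT_BITS

def dot_aligned(a, b):
    """Hypothetical model of a dot product built from aligned integer partial products."""
    # Step 1: one partial product per element pair.
    # Significands multiply; exponents add.
    products = []
    for x, y in zip(a, b):
        sx, ex = decompose(x)
        sy, ey = decompose(y)
        products.append((sx * sy, ex + ey))
    if not products:
        return 0.0
    # Step 2: align every partial product to the largest exponent (assumed policy).
    # Bits shifted out below the guard bits are discarded.
    max_exp = max(e for _, e in products)
    aligned = [(p << GUARD_BITS) >> (max_exp - e) for p, e in products]
    # Step 3: accumulate the aligned values with plain integer addition,
    # then rescale the integer sum back to a floating-point value.
    return math.ldexp(sum(aligned), max_exp - GUARD_BITS)

result = dot_aligned([1.0, 2.0, 3.0, 4.0], [0.5, 0.25, 2.0, 1.0])
```

Once the partial products share one exponent, the accumulation needs only integer adders. The cost is that a partial product much smaller than the largest one can be shifted out entirely during alignment.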

Description

[0001] Cross-Reference to Related Applications

[0002] This application claims the benefit of U.S. Provisional Application No. 62/503,159, entitled "Generalized Acceleration of Matrix Multiply Accumulate Operations," filed May 8, 2017 (Attorney Docket No. NVIDP1157+), the entire contents of which are incorporated herein by reference.

Technical Field

[0003] The present disclosure relates to implementing arithmetic operations on a processor, and more particularly to the acceleration of matrix multiply-accumulate operations.

Background

[0004] Modern computer processors are essentially integrated circuits designed to perform logical tasks. One task that processors are particularly good at is performing arithmetic operations on numbers encoded in different formats (e.g., 8-bit integers, 32-bit integers, 32-bit floating point values, etc.). However, most processors contain logic to perform these arithmetic operations on scalar operands. For example, logic designed to p...

Application Information

IPC(8): G06F17/16, G06F7/575, G06F7/523, G06F7/50
CPC: G06F7/50, G06F7/5235, G06F7/575, G06F17/16, G06F9/3001, G06F9/30014, G06F9/30036, G06T1/20, G06F9/3851, G06F9/3012
Inventors: B. R. Boswell, M. Y. Siu, J. H. Choquette, J. M. Alben, S. Oberman
Owner: NVIDIA CORP