Processing unit, related device and tensor operation method

A technology relating to processing units and computing units, applied in the fields of computing, computing models, physical implementation, etc. It addresses the problems that bandwidth and resources cannot be fully utilized, that computing units spend much of their time idle and waiting, and that the external-environment bandwidth of the processing unit is poorly matched to its computing power.

Active Publication Date: 2022-04-26
平头哥上海半导体技术有限公司

AI Technical Summary

Problems solved by technology

When the bandwidth of the external environment is low, the multicast data input mode cannot compute efficiently, and the computing units are often left idle, waiting for data.
When the bandwidth of the external environment is high, the pulse data input mode cannot fully utilize the available bandwidth and resources.
In both cases the external-environment bandwidth is mismatched with the computing power of the processing unit, which reduces the computing energy efficiency of the processing unit.

Embodiment Construction

[0041] The present disclosure is described below based on examples, but it is not limited to these examples. The following detailed description sets forth some specific details; those skilled in the art can fully understand the present disclosure even without them. To avoid obscuring the essence of the present disclosure, well-known methods, procedures, and processes are not described in detail. Additionally, the drawings are not necessarily drawn to scale.

[0042] The following terms are used in this document.

[0043] Deep learning model: Deep learning is a new research direction in the field of machine learning (ML). It was introduced into machine learning to bring it closer to its original goal, artificial intelligence (AI). Deep learning learns the internal laws and representation levels of sample data, and the information obtained during the learnin...

Abstract

Provided are a processing unit, a related device, and a tensor operation method. The processing unit includes: a plurality of computing units forming a computing matrix with n rows and m columns, where n and m are non-zero natural numbers; and a computing unit controller configured to control the computing matrix to work in a multicast data input mode, in which data is broadcast column by column to all computing units in the corresponding column and row by row to all computing units in the corresponding row, and, when the bandwidth of the external environment does not meet a predetermined bandwidth requirement, to control the computing matrix to work in a pulse data input mode, in which each computing unit receives data from the computing unit in the same row of the previous column and from the computing unit in the same column of the previous row, so as to support tensor operations. Embodiments of the present disclosure flexibly configure the working mode of the computing matrix according to the bandwidth of the external environment of the processing unit, so that this bandwidth is matched to the computing capability and the computing energy efficiency of the processing unit is improved.
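
The two data input modes described in the abstract can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the patented hardware design: the names `InputMode`, `select_mode`, `multicast_matmul`, and `pulse_matmul` are hypothetical, the bandwidth check is reduced to a single numeric comparison, the tensor operation is assumed to be a matrix product accumulated in place by each computing unit, and the pulse mode is modeled as a conventional systolic data flow with skewed inputs at the matrix edges.

```python
from enum import Enum


class InputMode(Enum):
    MULTICAST = "multicast"
    PULSE = "pulse"


def select_mode(available_bandwidth, required_bandwidth):
    """Hypothetical controller policy: fall back to pulse mode when the
    external bandwidth does not meet the predetermined requirement."""
    if available_bandwidth >= required_bandwidth:
        return InputMode.MULTICAST
    return InputMode.PULSE


def multicast_matmul(a, b):
    """Multiply a (n x k) by b (k x m) on an n x m grid of computing units.
    In step t, a[i][t] is broadcast to every unit in row i and b[t][j] to
    every unit in column j; each unit accumulates its own product."""
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]
    for t in range(k):
        for i in range(n):
            for j in range(m):
                acc[i][j] += a[i][t] * b[t][j]
    return acc, k  # result and number of steps


def pulse_matmul(a, b):
    """Same product, but each unit only receives data from its left
    neighbour (same row, previous column) and its upper neighbour
    (previous row, same column). Edge inputs are skewed by row/column."""
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]
    a_reg = [[0] * m for _ in range(n)]  # operand arriving from the left
    b_reg = [[0] * m for _ in range(n)]  # operand arriving from above
    steps = n + m + k - 2
    for t in range(steps):
        new_a = [[0] * m for _ in range(n)]
        new_b = [[0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                if j == 0:  # left edge: fetch from outside, delayed by i
                    s = t - i
                    new_a[i][j] = a[i][s] if 0 <= s < k else 0
                else:       # otherwise take the left neighbour's old value
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:  # top edge: fetch from outside, delayed by j
                    s = t - j
                    new_b[i][j] = b[s][j] if 0 <= s < k else 0
                else:       # otherwise take the upper neighbour's old value
                    new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, steps


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    mode = select_mode(available_bandwidth=4, required_bandwidth=8)
    run = multicast_matmul if mode is InputMode.MULTICAST else pulse_matmul
    print(mode, run(a, b))
```

In this model both modes produce the same result. The multicast path finishes in k steps but needs every operand delivered from outside at the moment it is used, whereas the pulse path takes n + m + k - 2 steps and passes operands between neighbouring units. The sketch does not model bandwidth or energy; it only shows the two data movement patterns.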

Description

Technical field

[0001] The present disclosure relates to the field of chips, and in particular to a processing unit, a related device, and a tensor operation method.

Background

[0002] Deep learning is now widely used in face recognition, speech recognition, autonomous driving, and other fields. Because deep learning relies on a large number of repeated tensor operations such as convolution and matrix operations, traditional hardware executes the corresponding algorithms inefficiently. Computing architectures dedicated to executing them have therefore emerged. The deep learning processing units in these architectures employ computing matrices composed of multiple computing units. Each computing unit in the computing matrix operates on the elements involved in a convolution or matrix operation, and the operation results are then accumulated to obtain the tensor operation result. There are generally two ways to transmit the el...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04W4/06, H04W28/20, H04W84/08, G06N3/04, G06N3/063, G06N20/00
CPC: H04W4/06, H04W28/20, H04W84/08, G06N3/063, G06N20/00, G06N3/045, Y02D30/70
Inventor: 范虎, 劳懋元, 阎承洋, 李玉东
Owner: 平头哥上海半导体技术有限公司