Hardware architecture of accelerated artificial intelligence processor

A hardware architecture technology applied in the field of artificial intelligence. It addresses the unsuitability of existing processors for AI work, achieves high performance, improves scalability, and accelerates artificial intelligence work.

Status: Inactive. Publication Date: 2019-01-11

NANJING ILUVATAR COREX TECH CO LTD (DBA ILUVATAR COREX INC NANJING)

3 Cites, 5 Cited by


Problems solved by technology

In CPU and DSP solutions, the computing core is a vector processor, which is not suitable for AI pipeline engineering.



Embodiment Construction

[0023] The present invention is now described in further detail with reference to the accompanying drawings.

[0024] As shown in Figure 1, an artificial intelligence feature map can usually be described as a four-dimensional tensor [N, C, Y, X]. The four dimensions are the feature map dimensions X and Y, the channel dimension C, and the batch dimension N. A kernel can be a 4D tensor [K, C, S, R]. The AI job is: given the input feature map tensor and the kernel tensor, compute the output tensor [N, K, Y, X] according to the formula in Figure 1.
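The computation in [0024] can be sketched as a direct convolution. Figure 1 is not reproduced on this page, so the exact formula is an assumption: the sketch uses the common stride-1 form with zero padding, which keeps the output at [N, K, Y, X]. The function name `conv4d` is illustrative and does not come from the patent.

```python
def conv4d(fmap, kernel):
    """Direct convolution of fmap [N][C][Y][X] with kernel [K][C][S][R].

    Assumptions (not stated in the patent text): stride 1, zero padding
    of S//2 rows and R//2 columns, and odd S and R, so the output has
    shape [N][K][Y][X].
    """
    N, C = len(fmap), len(fmap[0])
    Y, X = len(fmap[0][0]), len(fmap[0][0][0])
    K, S, R = len(kernel), len(kernel[0][0]), len(kernel[0][0][0])
    pad_y, pad_x = S // 2, R // 2
    out = [[[[0.0] * X for _ in range(Y)] for _ in range(K)] for _ in range(N)]
    for n in range(N):
        for k in range(K):
            for y in range(Y):
                for x in range(X):
                    acc = 0.0
                    # Reduce over input channels and the S x R kernel window.
                    for c in range(C):
                        for s in range(S):
                            for r in range(R):
                                iy, ix = y + s - pad_y, x + r - pad_x
                                # Positions outside the map count as zero.
                                if 0 <= iy < Y and 0 <= ix < X:
                                    acc += fmap[n][c][iy][ix] * kernel[k][c][s][r]
                    out[n][k][y][x] = acc
    return out
```

For a 1x1x3x3 feature map of ones and a 1x1x3x3 kernel of ones, the centre output is 9.0 and each corner is 4.0, because only four kernel taps overlap the map at a corner.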

[0025] Another important operation in AI is matrix multiplication, which can also be mapped to feature map processing. In Figure 2, matrix A can be mapped to tensor [1, K, 1, M], matrix B can be mapped to tensor [N, K, 1, 1], and the result C is tensor [1, N, 1, M].
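A minimal sketch of the mapping in [0025]: with A held as a [1, K, 1, M] feature map and B as a [N, K, 1, 1] kernel, the product reduces to a 1x1 convolution over the channel dimension K. The orientation of the source matrices (A as M x K, B as K x N) and both function names are assumptions made for illustration, since Figure 2 is not reproduced here.

```python
def matmul_as_feature_map(a, b):
    """a: feature map [1][K][1][M]; b: kernel [N][K][1][1].

    Returns a tensor [1][N][1][M] where each output channel n is the
    sum over input channels k of a[0][k][0][m] * b[n][k][0][0].
    """
    K, M = len(a[0]), len(a[0][0][0])
    N = len(b)
    return [[[[sum(a[0][k][0][m] * b[n][k][0][0] for k in range(K))
               for m in range(M)]]
             for n in range(N)]]


def matmul(A, B):
    """C = A x B for A (M x K) and B (K x N), routed through the mapping."""
    M, K, N = len(A), len(B), len(B[0])
    a = [[[[A[m][k] for m in range(M)]] for k in range(K)]]   # [1][K][1][M]
    b = [[[[B[k][n]]] for k in range(K)] for n in range(N)]   # [N][K][1][1]
    c = matmul_as_feature_map(a, b)                           # [1][N][1][M]
    return [[c[0][n][0][m] for n in range(N)] for m in range(M)]
```

Calling `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`, the same result as an ordinary matrix product.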

[0026] In addition, there are other operations, such as normalization and activation, which can be supported by general-purpose hardware operators.

[0027] We propose a hardwar...


Abstract

A hardware architecture for an accelerated artificial intelligence processor includes a host, a frontal lobe engine, a parietal lobe engine, a renderer engine, an occipital lobe engine, a temporal lobe engine, and a memory. The frontal lobe engine obtains a 5D tensor from the host, divides it into several groups of tensors, and sends these groups to the parietal lobe engine. The parietal lobe engine takes a group of tensors, divides it into a plurality of tensor waves, and sends the tensor waves to the renderer engine, which executes an input feature renderer and outputs partial tensors to the occipital lobe engine. The occipital lobe engine accumulates the partial tensors and executes an output feature renderer to obtain a final tensor, which is sent to the temporal lobe engine. The temporal lobe engine compresses the data and writes the final tensor to memory. In the invention, artificial intelligence work is divided into a plurality of highly parallel parts, each allocated to an engine for processing. The number of engines is configurable, which improves scalability, and all work partitioning and distribution are implemented in the architecture, thereby achieving high performance efficiency.
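The data flow in the abstract can be sketched as a chain of five stages: split into groups, split into waves, render partial results, accumulate and finish, then compress and store. This is a toy software model only. The abstract does not say how tensors are partitioned, what the renderers compute, or which compression is used, so the flat-list "tensor", the sizes, the weighted sum, the ReLU step, the use of `zlib`, and every function name below are hypothetical stand-ins.

```python
import zlib


def split_into_groups(tensor, group_size):
    """Stage 1: divide the tensor received from the host into groups."""
    return [tensor[i:i + group_size] for i in range(0, len(tensor), group_size)]


def split_into_waves(group, wave_size):
    """Stage 2: divide one group into tensor waves."""
    return [group[i:i + wave_size] for i in range(0, len(group), wave_size)]


def render_wave(wave, weight):
    """Stage 3: stand-in input feature renderer, a weighted partial sum."""
    return sum(v * weight for v in wave)


def accumulate_and_finish(partials):
    """Stage 4: accumulate partial results, then a stand-in output step (ReLU)."""
    return max(0, sum(partials))


def compress_and_store(finals, memory, key):
    """Stage 5: compress the final values and write them to memory."""
    memory[key] = zlib.compress(repr(finals).encode())


def run(tensor, group_size=4, wave_size=2, weight=1):
    memory = {}
    finals = []
    for group in split_into_groups(tensor, group_size):
        partials = [render_wave(w, weight)
                    for w in split_into_waves(group, wave_size)]
        finals.append(accumulate_and_finish(partials))
    compress_and_store(finals, memory, "out")
    return memory
```

Because each group and each wave is processed independently until the accumulation step, the loops over groups and waves are the places where a configurable number of engines could work in parallel.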

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence, and in particular relates to a hardware architecture for accelerating an artificial intelligence processor.

Background art

[0002] Artificial intelligence (AI) processing, a hot topic these days, is both compute- and memory-intensive and requires high performance-power efficiency. Accelerating it with current devices such as CPUs and GPUs is not easy, and many solutions such as GPU + TensorCore, TPU, CPU + FPGA, and AI ASIC try to solve these problems. GPU + TensorCore mainly focuses on solving compute-intensive problems, TPU focuses on computing and data reuse, and CPU + FPGA / AI ASIC focuses on improving performance-power efficiency.

[0003] However, only one-third of the logic of the GPU is used for AI, so higher performance efficiency cannot be obtained. TPUs require more software work to reshape the data layout, split up jobs, and send them to the computing cores. As for CPU and D...


Application Information


IPC(8): G06T1/20

CPC: G06T1/20, G06N3/063, G06F13/124, G06N3/048, G06F9/5027, G06F13/00, G06N3/04

Inventors: 李云鹏, 倪岭, 邵平平, 刘伟栋, 蔡敏

Owner: NANJING ILUVATAR COREX TECH CO LTD (DBA ILUVATAR COREX INC NANJING)
