
FPGA (Field Programmable Gate Array) virtualization hardware system stack design for cloud deep learning inference

A virtualization and cloud-computing technology, applied in CAD circuit design, neural learning methods, design optimization/simulation, etc.

Active Publication Date: 2021-09-21
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

For the hardware layer, the existing multi-user multi-core virtualization accelerator [3] uses a fully connected method to give each core an equal slice of the memory bandwidth, thereby achieving performance isolation between users. For the scheduling/compilation layer, existing schedulers and compilers [2,3,4] traverse the performance of every resource-allocation and scheduling option and select the optimal schedule. For the application layer, mainstream virtualization frameworks use an application programming interface (API)-based method to remotely configure hardware-layer virtualization resources through the scheduling/compilation layer; for example, the GPU virtualization framework [5] uses the CUDA API on the client side to remotely invoke the CUDA Runtime Library on the compute node, which generates about 10^5 API calls per second and incurs up to 5× remote-access latency overhead.
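The exhaustive traversal attributed above to existing schedulers and compilers [2,3,4] can be sketched as follows. The latency model and the core-splitting scheme here are illustrative assumptions, not the patent's algorithm:

```python
from itertools import product

def best_allocation(num_cores, workloads, latency_model):
    """Try every split of `num_cores` among the workloads (each gets
    at least one core) and keep the split with the smallest predicted
    makespan. `latency_model(workload, cores)` is a hypothetical
    per-workload cost estimate."""
    best, best_makespan = None, float("inf")
    for split in product(range(1, num_cores + 1), repeat=len(workloads)):
        if sum(split) != num_cores:
            continue
        makespan = max(latency_model(w, c) for w, c in zip(workloads, split))
        if makespan < best_makespan:
            best, best_makespan = split, makespan
    return best, best_makespan

# Toy model: latency shrinks roughly linearly with assigned cores.
print(best_allocation(8, [4.0, 2.0, 2.0], lambda w, c: w / c))
# → ((4, 2, 2), 1.0)
```

The search space grows exponentially with the number of workloads, which is the cost of the traversal-based approach the prior art is said to take.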



Examples


Detailed Description of Embodiments

[0045] Embodiments of the present application are described in detail below, and examples of the embodiments are shown in the drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary, and are intended to explain the present application, and should not be construed as limiting the present application.

[0046] The following describes an FPGA virtualization hardware system stack design for cloud-based deep learning inference with reference to the accompanying drawings.

[0047] Figure 1 shows the hardware-architecture implementation of the ISA-based DNN accelerator virtualization provided by an embodiment of the present invention.

[0048] To address this problem, as shown in Figure 1, the embodiment of the first aspect of the present application provides a distributed FPGA hardware-assisted virtualization hardware archi...


Abstract

The invention discloses an FPGA (Field Programmable Gate Array) virtualization hardware system stack design for cloud deep learning inference, in the technical field of artificial intelligence. The design comprises a distributed FPGA hardware-assisted virtualization hardware architecture; CPU server nodes that run virtual machines/containers, a static compiler, and a deep neural network (DNN), where the DNN acquires user instructions and the static compiler compiles them into instruction packets; FPGA server compute nodes that run the virtualization system service and an FPGA accelerator card, where the card comprises a virtualized multi-core hardware resource pool and four double-data-rate synchronous dynamic random access memories (DDRs); and a master control node that manages every CPU server node and FPGA server compute node through a control layer. This scheme solves the technical problem that prior FPGA virtualization schemes for deep learning inference applications cannot scale to a distributed multi-node computing cluster.
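The end-to-end flow the abstract describes can be sketched as follows: a CPU node's static compiler turns a user's DNN into an instruction packet, the master control node picks an FPGA compute node with capacity, and that node's virtualization service receives the packet. All function names, fields, and the scheduling policy are illustrative assumptions:

```python
def static_compile(dnn_description):
    """Stand-in for the static compiler: DNN -> instruction packet.
    The core count is a made-up resource requirement."""
    return {"packet": f"instrs({dnn_description})", "cores_needed": 2}

def master_schedule(packet, fpga_nodes):
    """Stand-in for the master control node: lease cores on the first
    FPGA compute node that has enough free cores in its pool."""
    for node in fpga_nodes:
        if node["free_cores"] >= packet["cores_needed"]:
            node["free_cores"] -= packet["cores_needed"]
            return node["name"]
    raise RuntimeError("no FPGA node has capacity")

fpga_nodes = [{"name": "fpga-0", "free_cores": 1},
              {"name": "fpga-1", "free_cores": 4}]
pkt = static_compile("resnet50")
print(master_schedule(pkt, fpga_nodes))   # → fpga-1
```

Shipping precompiled instruction packets rather than forwarding individual API calls is consistent with the remote-access overhead the background section attributes to API-based frameworks.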

Description

Technical Field

[0001] The present invention relates to the technical field of artificial intelligence, and in particular to the design of an FPGA virtualization hardware system stack oriented to cloud-based deep learning inference.

Background

[0002] We are in an era of rapid development of artificial intelligence, and deep learning plays an increasingly important role across many fields. Among deep learning workloads, deep neural network (DNN) inference tasks account for most of the deep learning load in cloud data centers. Traditional general-purpose processors (CPUs) in data centers can no longer meet the enormous computing demands of deep learning; therefore, dedicated hardware platforms such as GPUs, FPGAs, and ASICs are now commonly used to accelerate deep learning algorithms. Thanks to the FPGA's good balance of programmability, performance, and power consumption, more and more cloud service providers, such as Amazon, Alibaba, and Baidu, have begun to de...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F30/34; G06F30/27; G06N3/04; G06N3/08
CPC: G06F30/34; G06F30/27; G06N3/08; G06N3/045
Inventors: 曾书霖, 戴国浩, 杨昕昊, 刘军, 汪玉
Owner TSINGHUA UNIV