192 results about "Systolic array" patented technology

In parallel computer architectures, a systolic array is a homogeneous network of tightly coupled data processing units (DPUs) called cells or nodes. Each node or DPU independently computes a partial result as a function of the data received from its upstream neighbors, stores the result within itself, and passes it downstream. Systolic arrays were invented by H. T. Kung and Charles Leiserson, who described arrays for many dense linear algebra computations (matrix product, solving systems of linear equations, LU decomposition, etc.) for banded matrices. Early applications include computing greatest common divisors of integers and polynomials. They are sometimes classified as multiple-instruction single-data (MISD) architectures under Flynn's taxonomy, but this classification is questionable, because a strong argument can be made to distinguish systolic arrays from each of Flynn's four categories: SISD, SIMD, MISD, and MIMD.
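
To make the dataflow concrete, here is a minimal software emulation of an output-stationary systolic array computing a matrix product, one of the dense linear algebra workloads mentioned above. The function name and cycle-by-cycle scheduling are illustrative assumptions, not a description of any specific hardware.

```python
# Minimal sketch: output-stationary systolic array computing C = A @ B.
# Cell (i, j) receives A-values from the left and B-values from above,
# accumulates a*b, and forwards both operands (right and down) on the
# next cycle. Input streams are skewed so matching operands meet at the
# right cell on the right cycle. Illustrative only.

def systolic_matmul(A, B):
    n, k = len(A), len(A[0])
    m = len(B[0])
    acc = [[0] * m for _ in range(n)]     # per-cell accumulators (stationary outputs)
    a_reg = [[0] * m for _ in range(n)]   # operand latched in each cell, forwarded right
    b_reg = [[0] * m for _ in range(n)]   # operand latched in each cell, forwarded down
    for t in range(k + n + m - 2):        # enough cycles to drain the array
        # sweep bottom-right to top-left so each cell reads the values its
        # neighbours latched on the previous cycle
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                a_in = a_reg[i][j - 1] if j > 0 else (A[i][t - i] if 0 <= t - i < k else 0)
                b_in = b_reg[i - 1][j] if i > 0 else (B[t - j][j] if 0 <= t - j < k else 0)
                acc[i][j] += a_in * b_in
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
    return acc

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert systolic_matmul(A, B) == [[19, 22], [43, 50]]
```

Hardware implementations perform all cell updates of a cycle in parallel; the point of the sketch is only that each cell touches its own accumulator and the operands latched by its nearest neighbours.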

A universal convolutional neural network accelerator based on a one-dimensional systolic array

The invention discloses a universal convolutional neural network accelerator based on a one-dimensional systolic array. An AXI4 bus interface loads the mode configuration instructions, reads the data to be calculated, and sends result data in batches. A mode configurator uses the mode configuration instructions to set each functional module to the corresponding working type. A data scheduling module concurrently caches data to be calculated, reads calculation data, and caches, processes, and outputs convolution results. A convolution calculation module performs convolution in a one-dimensional systolic array fashion. A buffer for the data to be calculated, a convolution result buffer, and an output result FIFO cache the corresponding data, and a result processing module performs the result processing operations common in convolutional neural networks. The accelerator is compatible with the different calculation types found in a convolutional neural network, accelerates effectively through highly parallel computation, and requires only a low off-chip memory access bandwidth and a small amount of on-chip memory resources.
Owner:SOUTHEAST UNIV +1
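
As a rough illustration of the one-dimensional systolic computation pattern this accelerator relies on, the sketch below emulates a weight-stationary 1-D systolic convolution. The PE count, dataflow, and names are assumptions for illustration; the patented circuit's configuration modes, scheduling, and buffering are not modelled.

```python
# Minimal sketch of a weight-stationary one-dimensional systolic
# convolution (valid correlation y[i] = sum_j w[j] * x[i+j]). Each PE
# holds one weight; each cycle the new input sample is broadcast while
# partial sums march one PE to the right.

def systolic_conv1d(x, w):
    K = len(w)
    psum = [0] * K                    # partial sum latched in each PE
    out = []
    for t, xt in enumerate(x):        # one input sample per cycle
        psum = [0] + psum[:-1]        # partial sums shift right, a fresh sum enters PE 0
        for j in range(K):            # every PE adds its stationary weight times the sample
            psum[j] += w[j] * xt
        if t >= K - 1:                # PE K-1 now holds the completed output y[t-(K-1)]
            out.append(psum[K - 1])
    return out

assert systolic_conv1d([1, 2, 3, 4, 5], [1, 0, -1]) == [-2, -2, -2]
```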

Systolic array architecture for fast IP lookup

This invention first presents SRAM-based pipelined IP lookup architectures, including an SRAM-based systolic array architecture that exploits multi-pipeline parallelism, and elaborates on it as the base architecture, highlighting its advantages. In this base architecture, a multitude of intersecting pipelines of different lengths are constructed in a circular fashion on a two-dimensional array of processing elements. The architecture supports any type of prefix tree rather than only the conventional binary prefix tree. The invention secondly proposes a novel use of an alternative, more advantageous prefix tree based on a binomial spanning tree to achieve a substantial performance increase. The new approach, enhanced with further extensions including four-side input and three-pointer implementations, considerably increases the parallelism and search capability of the base architecture and provides a much higher throughput than existing IP lookup approaches, making, for example, a router IP lookup front-end speed of 7 Tbps possible. Although the theoretical worst-case lookup delay in this systolic array structure is high, the average delay is quite low, with large delays observed only rarely. The structure in its new form is scalable in terms of processing elements and is also well suited to the IPv6 addressing scheme.
Owner:BAZLAMACCI CUNEYT +1
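
To convey the base idea of SRAM-based pipelined IP lookup that this architecture builds on, the sketch below stores each level of a binary prefix trie in its own stage memory, so successive lookups can occupy successive stages. This is a simplified, assumption-laden model; the patented two-dimensional circular multi-pipeline organisation, the binomial-spanning-tree prefix tree, and the four-side input and three-pointer extensions are not represented.

```python
# Minimal sketch: a binary prefix trie cut into levels, one level per
# pipeline stage (its own memory), performing longest-prefix match.

class Stage:
    """One pipeline stage = one trie level held in its own memory."""
    def __init__(self):
        self.nodes = {}   # node id -> [next_hop_or_None, left_child_id, right_child_id]

def build_pipeline(prefixes, width=32):
    """prefixes: dict mapping (value, length) -> next hop."""
    stages = [Stage() for _ in range(width + 1)]
    stages[0].nodes[0] = [None, None, None]          # root node
    next_id = 1
    for (value, length), hop in prefixes.items():
        node = 0
        for depth in range(length):
            bit = (value >> (width - 1 - depth)) & 1
            entry = stages[depth].nodes[node]
            if entry[1 + bit] is None:               # create the missing child
                entry[1 + bit] = next_id
                stages[depth + 1].nodes[next_id] = [None, None, None]
                next_id += 1
            node = entry[1 + bit]
        stages[length].nodes[node][0] = hop          # store the next hop at the prefix end
    return stages

def lookup(stages, addr, width=32):
    """Walk one stage per 'cycle', remembering the longest match seen."""
    node, best = 0, None
    for depth in range(width + 1):
        entry = stages[depth].nodes[node]
        if entry[0] is not None:
            best = entry[0]
        if depth == width:
            break
        bit = (addr >> (width - 1 - depth)) & 1
        if entry[1 + bit] is None:
            break
        node = entry[1 + bit]
    return best

stages = build_pipeline({(0xC0A80000, 16): "A", (0xC0A80100, 24): "B"})
assert lookup(stages, 0xC0A80105) == "B"   # 192.168.1.5 matches the /24
assert lookup(stages, 0xC0A8FF01) == "A"   # 192.168.255.1 falls back to the /16
```

Because each level lives in a separate memory, a new lookup can enter stage 0 every cycle while earlier lookups proceed through deeper stages, which is the source of the pipeline throughput the abstract refers to.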

Method and circuit of accelerated operation of pooling layer of neural network

The invention belongs to the technical field of integrated circuit design and particularly relates to a method and circuit for accelerated operation of the pooling layer of a neural network. The method decomposes a two-dimensional pooling operation into two one-dimensional pooling operations: one along the width direction and one along the height direction. The circuit structure comprises five parts, including a graph-layer segmentation module for graph-layer segmentation and data reading, a horizontal-pooling module for pooling along the width direction, a vertical-pooling module for pooling along the height direction, and an output control module responsible for writing data back. Compared with traditional methods, the method reduces the amount of calculation; all modules in the circuit process a data stream, so large on-chip buffers are not needed to store temporary results and chip area is saved; and the circuit uses a systolic array structure that keeps every hardware unit working in each clock cycle, which raises hardware utilization and thus the working efficiency of the circuit.
Owner:FUDAN UNIV
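
The decomposition of two-dimensional pooling into a width-direction pass followed by a height-direction pass can be checked with the short sketch below (max pooling with an illustrative 2x2 window and stride of 2; the streaming systolic circuit itself is not modelled).

```python
# Minimal sketch: 2-D max pooling applied as 1-D pooling along the width
# followed by 1-D pooling along the height.

def pool1d(row, k, stride):
    return [max(row[i:i + k]) for i in range(0, len(row) - k + 1, stride)]

def pool2d_separable(x, k=2, stride=2):
    h = [pool1d(row, k, stride) for row in x]                # horizontal pass (width)
    cols = list(zip(*h))                                     # transpose the intermediate result
    v = [pool1d(list(col), k, stride) for col in cols]       # vertical pass (height)
    return [list(r) for r in zip(*v)]                        # transpose back to row-major

x = [[1, 3, 2, 4],
     [5, 6, 1, 0],
     [7, 2, 9, 8],
     [4, 4, 3, 5]]
# direct 2x2, stride-2 max pooling of x gives [[6, 4], [7, 9]]
assert pool2d_separable(x) == [[6, 4], [7, 9]]
```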