551 results for patented technology related to "Memory bandwidth"

Memory bandwidth is the rate at which data can be read from or stored into a semiconductor memory by a processor. Memory bandwidth is usually expressed in units of bytes/second, though this can vary for systems with natural data sizes that are not a multiple of the commonly used 8-bit bytes.
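
As a quick worked illustration (not drawn from any of the patents below), peak bandwidth follows directly from the transfer rate and the bus width. The minimal Python sketch below assumes a common DDR4-3200 channel configuration, chosen only as an example:

    # Peak bandwidth of one 64-bit DDR4-3200 channel (illustrative figures).
    transfers_per_second = 3200e6      # DDR4-3200: 3200 million transfers/s
    bus_width_bytes = 64 // 8          # 64-bit bus = 8 bytes per transfer
    peak = transfers_per_second * bus_width_bytes
    print(f"{peak / 1e9:.1f} GB/s")    # prints: 25.6 GB/s

Sustained bandwidth in practice is lower than this peak, since refresh cycles, row activations, and access patterns all cost transfer slots.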

Artificial neural network calculating device and method for sparse connection

Status: Active · Publication: CN105512723A
Benefits: solves the problems of insufficient computing performance and high front-end decoding overhead; adds support for multi-layer neural network operations
Classifications: Memory architecture accessing/allocation; Digital data processing details
Keywords: Activation function; Memory bandwidth
An artificial neural network computing device for sparse connections comprises a mapping unit that converts input data into a storage format in which input neurons and weight values correspond one to one, a storage unit that stores data and instructions, and an operation unit that executes the corresponding operations on the data according to the instructions. The operation unit mainly performs three steps: first, the input neurons are multiplied by the weight data; second, an adder-tree operation sums the weighted output neurons from the first step level by level, or adds a bias to the output neurons to obtain biased output neurons; third, an activation function is applied to obtain the final output neurons. The device addresses the insufficient computing performance and high front-end decoding overhead of CPUs and GPUs, effectively improves support for multi-layer artificial neural network algorithms, and mitigates the problem of memory bandwidth becoming a bottleneck for multi-layer artificial neural network inference and training.
Owner: CAMBRICON TECH CO LTD
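
The three operation steps above translate naturally into software. Below is a minimal NumPy sketch, assuming the mapping unit has already paired each output neuron's nonzero weights with the indices of its input neurons; the function and variable names are illustrative, not the patent's interfaces:

    import numpy as np

    def sparse_forward(inputs, weights, connections, bias, act=np.tanh):
        # connections[j]: indices of the input neurons feeding output neuron j
        # weights[j]:     the matching nonzero weights (paired by the mapping unit)
        outputs = np.empty(len(connections))
        for j, idx in enumerate(connections):
            products = inputs[idx] * weights[j]    # step 1: multiply
            outputs[j] = products.sum() + bias[j]  # step 2: adder-tree sum, plus bias
        return act(outputs)                        # step 3: activation function

    # Usage: 4 input neurons, 2 output neurons, 2 connections each.
    x = np.array([0.5, -1.0, 2.0, 0.1])
    conn = [np.array([0, 2]), np.array([1, 3])]
    w = [np.array([0.3, -0.7]), np.array([1.1, 0.4])]
    b = np.array([0.0, 0.1])
    print(sparse_forward(x, w, conn, b))

In the patented device, step 2 is performed by a hardware adder tree that sums the products level by level rather than sequentially, which is where the speedup over a CPU/GPU loop comes from.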

Separable array-based reconfigurable accelerator and realization method thereof

The invention provides a reconfigurable accelerator based on separable arrays and a method for realizing it. The reconfigurable accelerator comprises a scratchpad-memory cache area, separable calculation arrays, and a register cache area. The scratchpad-memory cache area enables reuse of data across convolution calculations and sparse fully-connected calculations. The separable calculation arrays comprise multiple reconfigurable calculation units and are divided into a convolution calculation array and a sparse fully-connected calculation array. The register cache area, a storage region formed by multiple registers, supplies the input data, weight data, and corresponding output results for both kinds of calculation: the input and weight data of a convolution are fed to the convolution array, which outputs the convolution result, while the input and weight data of a sparse fully-connected calculation are fed to the sparse fully-connected array, which outputs its result. By fusing the characteristics of the two neural network layer types, utilization of the chip's computing resources and memory bandwidth is improved.
Owner: TSINGHUA UNIV
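
The division of labor described above can be mimicked in software: one routine plays the role of the convolution array, another the sparse fully-connected array. This is a rough sketch under those assumptions, not the hardware's actual operation, and all names are illustrative:

    import numpy as np

    def conv_array(x, k):
        # Stand-in for the convolution calculation array:
        # direct 2-D sliding-window correlation, "valid" padding.
        H, W = x.shape
        kh, kw = k.shape
        out = np.empty((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = (x[i:i + kh, j:j + kw] * k).sum()
        return out

    def sparse_fc_array(x, vals, rows, cols, n_out):
        # Stand-in for the sparse fully-connected array: only nonzero
        # weights (stored as COO triplets) are multiplied and accumulated.
        out = np.zeros(n_out)
        for v, r, c in zip(vals, rows, cols):
            out[r] += v * x[c]
        return out

    # Usage: each layer type is routed to its own array.
    x = np.arange(16.0).reshape(4, 4)
    k = np.array([[1.0, 0.0], [0.0, -1.0]])
    print(conv_array(x, k))
    print(sparse_fc_array(np.ones(4), vals=[2.0, -1.0], rows=[0, 1], cols=[3, 0], n_out=2))

Hosting both array types on one chip lets convolutions reuse data through the scratchpad while sparse layers skip zero weights entirely, which is what improves compute-resource and memory bandwidth utilization.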

Method and device for tracking error propagation and refreshing a video stream

A method and device for tracking error propagation and refreshing a video stream are provided. The proposed subject matter comprises an error-propagation tracking method that works in the sub-sampled domain to reduce computation cycles and memory bandwidth. The tracking-based update of the error-propagation metric is done differently for static and non-static regions, avoiding unnecessary refresh of static areas. Through suitable thresholding of the metric at the macroblock (MB) level, a set of refresh MBs is selected for each frame. These refresh MBs are coded either as intra MBs or as inter MBs predicted from one or more reliable reference frames (frames known to be available at the decoder with negligible errors). Such inter coding of refresh MBs improves compression efficiency compared with pure intra coding of refresh MBs. Variants of the threshold selection are also presented: one yields a temporally uniform distribution of the number of refresh MBs, and a strict refresh scheme guarantees that all MBs have negligible errors within a committed refresh period following a packet loss. In addition to the error-propagation metric, spatial connectivity to already-chosen refresh MBs is used when selecting additional refresh MBs within a frame and across frames; this reduces error propagation caused by part of a macroblock predicting from older, erroneous neighboring MBs, which would otherwise require more refresh MBs per frame on average.
Owner: ITTIAM SYST P
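
The abstract does not give the update rule, so the following is only a guessed sketch of how a per-MB error-propagation metric might be updated and thresholded each frame; the additive propagation model, the static-region handling, and all names are assumptions made for illustration:

    import numpy as np

    def select_refresh_mbs(metric, propagated, static_mask, threshold):
        # metric:     per-macroblock error-propagation metric (2-D array)
        # propagated: error newly propagated into each MB via inter prediction
        # Static MBs keep their old metric, so static areas are not
        # refreshed unnecessarily; non-static MBs accumulate error.
        metric = np.where(static_mask, metric, metric + propagated)
        refresh = metric > threshold   # MBs to code as intra / reliable-inter
        metric[refresh] = 0.0          # a refreshed MB is error-free again
        return refresh, metric

    # Usage on a 4x4 grid of macroblocks:
    rng = np.random.default_rng(0)
    m = rng.random((4, 4))
    refresh, m = select_refresh_mbs(m, propagated=0.3 * np.ones((4, 4)),
                                    static_mask=m < 0.2, threshold=0.9)
    print(refresh)

Lowering the threshold refreshes more MBs per frame (more bits, faster recovery); the patent's variants instead adjust the selection so the refresh load is spread evenly across frames.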

DSP (digital signal processing) architecture with a wide memory bandwidth and a memory mapping method thereof

A DSP (digital signal processing) architecture with a wide memory bandwidth and a memory mapping method thereof. The DSP architecture includes: a first communication port; first, second, and third memory devices, connected to the first communication port and arranged along a first row of the DSP architecture; a fourth memory device, a calculation element, and a fifth memory device, arranged along a second row below the first row; and sixth, seventh, and eighth memory devices, connected to the first communication port and arranged along a third row, wherein the calculation element is connected to the first through eighth memory devices. The calculation element and the first through eighth memory devices form one arrangement unit, with the calculation element at the center of the unit and the eight memory devices connected to it, and multiple arrangement units are tiled along the row and column directions of the DSP architecture. Because a wide data bandwidth is provided between the calculation element and its memory devices, the number of memory accesses needed to process data is reduced, enabling high-data-rate workloads such as high-resolution moving images.
Owner: SAMSUNG ELECTRONICS CO LTD
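
The arrangement unit is easiest to picture as a 3x3 tile with the calculation element at the center. The toy function below, with purely illustrative grid coordinates, lists the eight memory devices one element connects to:

    def surrounding_memories(row, col):
        # Memory devices occupy the eight cells around the calculation
        # element at (row, col): three above, one on each side, three below.
        return [(row + dr, col + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)]

    # The element at (1, 1) reads/writes the memories at:
    print(surrounding_memories(1, 1))
    # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)]

Eight independent local memories per element amount to eight concurrent ports, which is the wide aggregate bandwidth the title refers to; tiling the unit along rows and columns scales that bandwidth with the number of calculation elements.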