2,043 results about how to "Improve parallelism" in patented technology

Method for accelerating convolutional neural network hardware and AXI bus IP core thereof

The invention discloses a method for accelerating convolutional neural network hardware and an AXI bus IP core thereof. The method comprises: first, converting the convolution layer operation into a multiplication of a matrix A with m rows and K columns by a matrix B with K rows and n columns; second, dividing the result matrix into sub-blocks of m rows and n columns; third, starting the matrix multiplier to prefetch the operands of a sub-block; and fourth, having the matrix multiplier compute the sub-block and write the result back to main memory. The IP core comprises an AXI bus interface module, a prefetch unit, a flow mapper and a matrix multiplier. The matrix multiplier comprises a chained DMA and a processing-unit array; the array consists of a plurality of processing units arranged in a chain, with the unit at the head of the chain connected to the chained DMA. The method supports various convolutional neural network structures and has the advantages of high computational efficiency and performance, low demands on on-chip storage resources and off-chip memory bandwidth, small communication overhead, easy upgrading and improvement of the unit components, and good generality.
Owner:NAT UNIV OF DEFENSE TECH
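
The lowering described above is the familiar im2col-plus-GEMM scheme; a minimal NumPy sketch of the idea (function names, tile sizes and shapes are illustrative assumptions, not taken from the patent) could look like this:

```python
import numpy as np

def im2col(x, kh, kw):
    """Unfold a (C, H, W) input into a (K, n) matrix, K = C*kh*kw,
    n = number of output positions (stride 1, no padding)."""
    c, h, w = x.shape
    oh, ow = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[:, i:i + kh, j:j + kw].ravel()
    return cols

def blocked_matmul(a, b, tile_m=4, tile_n=4):
    """Compute A (m x K) @ B (K x n) one (tile_m x tile_n) sub-block
    at a time, mimicking the prefetch / compute / write-back loop."""
    m, _ = a.shape
    _, n = b.shape
    c = np.zeros((m, n))
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            # "prefetch" the operands of this sub-block, then compute
            a_blk = a[i0:i0 + tile_m, :]
            b_blk = b[:, j0:j0 + tile_n]
            c[i0:i0 + tile_m, j0:j0 + tile_n] = a_blk @ b_blk  # write back
    return c

# filters flattened into rows of A, image patches into columns of B
x = np.random.rand(3, 8, 8)         # C=3 input image
w = np.random.rand(16, 3, 3, 3)     # 16 filters of 3x3
b_mat = im2col(x, 3, 3)             # K x n
a_mat = w.reshape(16, -1)           # m x K
out = blocked_matmul(a_mat, b_mat)  # m x n output feature maps
assert np.allclose(out, a_mat @ b_mat)
```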

Reconfigurable data path processor

A reconfigurable data path processor comprises a plurality of independent processing elements, each advantageously having an identical architecture. Each processing element comprises a plurality of data processing means for generating a potential output, and each can also pass an input through as a potential output with little or no processing. Each processing element comprises a conditional multiplexer having a first conditional multiplexer input, a second conditional multiplexer input and a conditional multiplexer output. A first potential output value is transmitted to the first conditional multiplexer input, and a second potential output value is transmitted to the second conditional multiplexer input. The conditional multiplexer couples either the first or the second input to its output according to an output control command, which is generated by processing a set of arithmetic status bits through a logical mask. The conditional multiplexer output is coupled to a first processing element output. A first set of arithmetic status bits is generated by the processing of the first processable value, and a second set may be generated by a second processing operation. An arithmetic-status-bit multiplexer selects the desired set of arithmetic status bits from among the first and second sets. The conditional multiplexer evaluates the selected arithmetic status bits according to a logical mask defining an algorithm for evaluating them.
Owner:STC UNM +1
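
As a rough software analogue of the conditional-output selection, the following sketch models the status bits, the logical mask, and the conditional multiplexer; the status-bit encoding and mask semantics are assumptions for illustration, not the patent's definitions:

```python
# Assumed status-bit positions for illustration only
ZERO, NEG = 0, 1

def status_bits(value):
    """Derive an arithmetic status word from a datapath result."""
    bits = 0
    if value == 0:
        bits |= 1 << ZERO
    if value < 0:
        bits |= 1 << NEG
    return bits

def conditional_mux(in1, in2, status, mask):
    """Couple in1 or in2 to the output depending on whether any
    masked status bit is set (the 'output control command')."""
    return in1 if (status & mask) else in2

# two candidate outputs: a processed value and a passed-through input
a, b = 7, -3
result = a + b                       # first processing operation
out = conditional_mux(result, b, status_bits(result), 1 << NEG)
print(out)                           # selects per the NEG-bit mask
```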

Decoupled parallel scheduling method for dependent tasks in cloud computing

The invention belongs to the field of cloud computing applications and relates to methods for describing, decoupling and parallel-scheduling task dependency relations in cloud services. Dependency relations among tasks are modeled, and a decoupled parallel scheduling method for dependent tasks is constructed. The method first decouples the dependency relations by extracting the tasks whose in-degree is zero to build a ready-task set, dynamically describing the tasks that can be scheduled in parallel at any given moment. It then schedules the ready-task set in a distributed, multi-objective manner according to real-time resource availability, effectively improving scheduling parallelism. During task distribution, the ratio of task execution cost to inter-task communication cost (E/C) is further considered to decide whether task duplication should replace the transmission of dependent data, thereby reducing communication overhead. The overall method can dynamically schedule multiple tasks in the ready set in parallel, balances performance indexes including real-time performance, parallelism, communication overhead and load balance, and effectively improves overall system performance through its dynamic scheduling strategy.
Owner:DALIAN UNIV OF TECH
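
The in-degree-zero decoupling step resembles a Kahn-style topological scheduling loop; the sketch below adds a simple per-edge E/C test for task duplication. The cost model, data structures and names are illustrative assumptions, not the patented method:

```python
from collections import deque

def schedule(tasks, deps, exec_cost, comm_cost):
    """tasks: iterable of ids; deps: dict task -> set of predecessors.
    Repeatedly extracts the ready set (in-degree zero) and 'runs' it in
    parallel, deciding per edge whether duplicating the predecessor is
    cheaper than transmitting its dependent data."""
    indeg = {t: len(deps.get(t, set())) for t in tasks}
    succ = {}
    for t, ps in deps.items():
        for p in ps:
            succ.setdefault(p, set()).add(t)
    ready = deque(t for t in tasks if indeg[t] == 0)
    while ready:
        wave = list(ready)          # tasks schedulable in parallel now
        ready.clear()
        for t in wave:
            for p in deps.get(t, set()):
                # duplicate p on t's node if re-executing it is cheaper
                # than shipping its output (the E/C test)
                if exec_cost[p] < comm_cost[(p, t)]:
                    print(f"duplicate {p} for {t}")
            print(f"run {t}")
            for s in succ.get(t, set()):
                indeg[s] -= 1
                if indeg[s] == 0:
                    ready.append(s)

# toy DAG: A -> B, A -> C, (B, C) -> D
schedule(
    tasks=["A", "B", "C", "D"],
    deps={"B": {"A"}, "C": {"A"}, "D": {"B", "C"}},
    exec_cost={"A": 1, "B": 2, "C": 2, "D": 1},
    comm_cost={("A", "B"): 3, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1},
)
```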

Hardware structure for realizing forward calculation of convolutional neural network

The present application discloses a hardware structure for realizing the forward calculation of a convolutional neural network. The hardware structure comprises: a data off-chip caching module, used for caching the data of each externally input to-be-processed picture until it is read by the multi-level pipeline acceleration module; the multi-level pipeline acceleration module, connected to the data off-chip caching module and used for reading data from it so as to realize the core calculation of the convolutional neural network; a parameter reading arbitration module, connected to the multi-level pipeline acceleration module and used for arbitrating the multiple parameter-read requests issued by that module so that it obtains the required parameters; and a parameter off-chip caching module, connected to the parameter reading arbitration module and used for storing the parameters required for the forward calculation of the convolutional neural network. The present application realizes the algorithm with a parallel, pipelined hardware architecture, thereby achieving higher resource utilization and higher performance.
Owner:智擎信息系统(上海)有限公司
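
One way to picture the parameter reading arbitration is as an arbiter serializing the pipeline stages' read requests onto a single off-chip cache port; the toy model below assumes a round-robin policy and invented stage names, neither of which comes from the application:

```python
from itertools import cycle

class ParamCache:
    """Stands in for the parameter off-chip caching module."""
    def __init__(self, params):
        self.params = params
    def read(self, key):
        return self.params[key]

class Arbiter:
    """Round-robin arbitration among pipeline stages' read requests,
    so one cache port serves the whole multi-level pipeline."""
    def __init__(self, cache, stages):
        self.cache = cache
        self.order = cycle(stages)
        self.pending = {s: [] for s in stages}
    def request(self, stage, key):
        self.pending[stage].append(key)
    def tick(self):
        # grant at most one request per cycle, rotating among stages
        for _ in self.pending:
            stage = next(self.order)
            if self.pending[stage]:
                return stage, self.cache.read(self.pending[stage].pop(0))
        return None

cache = ParamCache({"conv1.w": "weights1", "conv2.w": "weights2"})
arb = Arbiter(cache, stages=["conv1", "conv2"])
arb.request("conv2", "conv2.w")
arb.request("conv1", "conv1.w")
print(arb.tick())  # serves one stage this cycle
print(arb.tick())  # serves the other stage next cycle
```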

GPS signal large-scale parallel fast acquisition method and module thereof

The invention discloses a large-scale parallel fast acquisition method for GPS signals. The method comprises the following steps: configuring, in the system CPU, large-scale parallel fast acquisition module firmware comprising sub-modules such as a multiplier, a data block cache, parallel partial correlation processing, frequency-domain transformation, post-processing, a numerically controlled oscillator and a code generator; converting the low/intermediate-frequency digital signal into a baseband signal and assembling it into a data block; zero-padding each equal-length data segment of the data block; performing parallel partial correlation (PPC) processing between each extended data segment and the local spreading code on the basis of FFT computation, and applying an FFT to each row of the resulting PPC matrix to obtain a result matrix; and coherently or non-coherently integrating the result matrices formed from multiple data blocks to increase the processing gain and improve the acquisition sensitivity, coarsely determining the code phase and Doppler frequency of the GPS signal and achieving two-dimensional parallel fast acquisition. The method has high processing efficiency and a high acquisition speed, and can be applied to various GPS positioning and navigation devices.
Owner:杭州中科微电子有限公司
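
The FFT-based search underlying the PPC processing is related to classic parallel code-phase acquisition; the sketch below shows plain circular-correlation acquisition over a Doppler grid rather than the patent's partial-correlation matrix, with all signal parameters chosen for illustration:

```python
import numpy as np

def acquire(baseband, code, fs, dopplers):
    """Parallel code-phase search: for each Doppler bin, wipe off the
    carrier and circularly correlate with the local code via FFT.
    Returns a (doppler x code-phase) power matrix."""
    n = len(code)
    t = np.arange(n) / fs
    code_fft = np.conj(np.fft.fft(code))
    power = np.empty((len(dopplers), n))
    for i, fd in enumerate(dopplers):
        wiped = baseband[:n] * np.exp(-2j * np.pi * fd * t)
        corr = np.fft.ifft(np.fft.fft(wiped) * code_fft)
        power[i] = np.abs(corr) ** 2
    return power

# toy example: code delayed by 100 samples, carried at 1 kHz Doppler
fs = 2.046e6
code = np.sign(np.random.randn(2046))   # stand-in spreading code
t = np.arange(2046) / fs
sig = np.roll(code, 100) * np.exp(2j * np.pi * 1000 * t)
p = acquire(sig, code, fs, dopplers=np.arange(-5000, 5001, 500))
d, ph = np.unravel_index(p.argmax(), p.shape)
print(d, ph)  # coarse Doppler bin and code phase (100)
# non-coherent integration over several blocks would sum such matrices
```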

Power distribution network fault diagnosis system and method based on topology knowledge

Active CN103675600A. Effects: saves manpower, financial resources and time; increases topology speed. Keywords: fault location, information technology support system, wiring diagram, electric network.
The invention relates to a power distribution network fault diagnosis system and method based on topology knowledge. The system comprises a resource layer, a knowledge layer and a diagnosis layer. The resource layer comprises fault recorders, monitoring terminals, a distribution equipment ledger, a topology file and a database: the monitoring terminals collect switch state information after a fault; the fault recorders capture post-fault waveforms; the topology file provides the electrical wiring diagram of the whole network; and the distribution equipment ledger records the operating condition of the distribution equipment during the fault. The knowledge layer comprises a knowledge base that supplies information to the diagnosis layer. The diagnosis layer comprises a fault diagnosis server provided with a fault diagnosis program module. The method realizes topology knowledge formation and fault diagnosis through the above system. The system and method analyze faults after they occur and provide references for operating personnel, saving manpower, financial resources and time; in addition, the system runs efficiently and rapidly, with good usage results.
Owner:STATE GRID CORP OF CHINA +3
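
As a toy illustration of combining post-fault switch states with topology knowledge, a breadth-first trace over closed switches from the source can flag the de-energized region as the candidate fault area; the graph model and the single rule used here are simplified assumptions, not the patent's diagnosis program:

```python
from collections import deque

def energized(nodes, edges, switch_closed, source):
    """edges: dict switch -> (node_a, node_b). A node is energized if
    it is reachable from the source through closed switches only."""
    adj = {n: [] for n in nodes}
    for sw, (a, b) in edges.items():
        if switch_closed[sw]:
            adj[a].append(b)
            adj[b].append(a)
    seen, q = {source}, deque([source])
    while q:
        for nb in adj[q.popleft()]:
            if nb not in seen:
                seen.add(nb)
                q.append(nb)
    return seen

# feeder: SRC -S1- N1 -S2- N2 -S3- N3; S2 tripped after the fault
nodes = ["SRC", "N1", "N2", "N3"]
edges = {"S1": ("SRC", "N1"), "S2": ("N1", "N2"), "S3": ("N2", "N3")}
state = {"S1": True, "S2": False, "S3": True}
live = energized(nodes, edges, state, "SRC")
suspect = [n for n in nodes if n not in live]
print(suspect)  # ['N2', 'N3'] -> fault downstream of tripped S2
```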