1692 results for "Concurrent computation" patented technology

Concurrent computing is a form of computing in which several computations are executed during overlapping time periods (concurrently) instead of sequentially (one completing before the next starts). This is a property of a system, which may be an individual program, a computer, or a network, and there is a separate execution point or "thread of control" for each computation ("process").
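As a minimal illustration of this definition (not drawn from any patent below), the following Python sketch runs two computations over overlapping time periods, each with its own thread of control:

```python
import threading
import time

def computation(name, delay):
    # Each computation has its own "thread of control".
    for step in range(3):
        time.sleep(delay)          # stand-in for real work
        print(f"{name}: step {step}")

# Start both computations; their execution periods overlap.
t1 = threading.Thread(target=computation, args=("A", 0.10))
t2 = threading.Thread(target=computation, args=("B", 0.15))
t1.start(); t2.start()
t1.join(); t2.join()
```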

Novel massively parallel supercomputer

A novel massively parallel supercomputer operating at the scale of hundreds of teraOPS includes node architectures based upon System-On-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements, each of which consists of a central processing unit (CPU) and a plurality of floating-point processors to enable an optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node may be used individually or simultaneously to work on any combination of computation or communication as required by the particular algorithm being solved or executed at any point in time. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that optimally maximize packet communication throughput and minimize latency. In the preferred embodiment, the multiple networks include three high-speed networks for parallel algorithm message passing: a Torus, a Global Tree, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be utilized collaboratively or independently according to the needs or phases of an algorithm to optimize algorithm processing performance. For particular classes of parallel algorithms, or parts of parallel calculations, this architecture exhibits exceptional computational performance and may enable calculations for new classes of parallel algorithms. Additional networks are provided for external connectivity and are used for Input/Output, System Management and Configuration, and Debug and Monitoring functions. Special node packaging techniques implementing midplanes and other hardware devices facilitate partitioning of the supercomputer into multiple networks for optimizing supercomputing resources.
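The abstract's point that each node's processors may work simultaneously on any mix of computation and communication can be sketched in miniature. The following Python sketch is purely illustrative: the queue name and toy workload are assumptions, and real nodes would use the hardware networks described above rather than an in-process queue:

```python
import threading
import queue

# One "node": a compute thread produces partial results while a
# communication thread ships them out concurrently.
outbox = queue.Queue()

def compute(chunks):
    for chunk in chunks:
        partial = sum(chunk)       # stand-in for floating-point work
        outbox.put(partial)        # hand the result to the comm thread
    outbox.put(None)               # sentinel: computation finished

def communicate():
    while (msg := outbox.get()) is not None:
        print(f"sending partial result {msg} over the network")

data = [[1, 2], [3, 4], [5, 6]]
workers = [threading.Thread(target=compute, args=(data,)),
           threading.Thread(target=communicate)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```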
Owner:INT BUSINESS MACH CORP

Efficient de-quantization in a digital video decoding process using a dynamic quantization matrix for parallel computations

An efficient digital video (DV) decoder process that utilizes a specially constructed quantization matrix, allowing an inverse quantization subprocess to perform parallel computations, e.g., using SIMD processing, to efficiently produce a matrix of DCT coefficients. The present invention utilizes a first look-up table (for 8x8 DCT) which produces a 15-value array of quantization scales based on class number information and a QNO number for an 8x8 data block ("data matrix") from an input encoded digital bit stream to be decoded. The 8x8 data block is produced by a deframing and variable-length decoding subprocess. An individual 8-value segment of the 15-value output array is multiplied by an individual 8-value segment, e.g., a row, of the 8x8 data matrix to produce an individual row of the 8x8 matrix of DCT coefficients ("DCT matrix"). These eight multiplications can be performed in parallel using a SIMD architecture to simultaneously generate a row of eight DCT coefficients. In this way, eight passes through the 8x8 block produce the entire 8x8 DCT matrix, in one embodiment consuming only 33 instructions per 8x8 block. After each pass, the 15-value output array is shifted by one value position for proper alignment with its associated row of the data matrix. The DCT matrix is then processed by an inverse discrete cosine transform subprocess that generates decoded display data. A second look-up table can be used for 2x4x8 DCT processing.
Owner:SONY ELECTRONICS INC +1

Multi-thread parallel processing method based on multi-thread programming and message queue

Status: Active. Publication: CN102902512A. Benefits: fast and efficient multi-threaded transformation; reduced running time. Technology topics: concurrent instruction execution; computer architecture; concurrent computation.
The invention provides a multi-thread parallel processing method based on multi-thread programming and message queues, belonging to the field of high-performance computing. Traditional single-threaded serial software is parallelized using modern multi-core CPU computing equipment, pthread multi-thread parallel computing technology, and message queues for inter-thread communication. The method comprises the following steps: within a single node, establishing three types of pthread threads (reading threads, computing threads, and writing threads), where the number of threads of each type is flexibly configurable; employing multiple buffers and establishing four queues for inter-thread communication; and allocating computing tasks and managing buffer space resources. The method is widely applicable to fields that require multi-thread parallel processing, and it guides software developers in retrofitting existing software with multiple threads so as to optimize the use of system resources. Hardware resource utilization is markedly improved, as are the computational efficiency and overall performance of the software.
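A minimal sketch of the read/compute/write pipeline described above, using Python threads and queues in place of pthreads (the patent uses four queues and configurable thread counts; this sketch simplifies to two queues and one thread of each type):

```python
import threading
import queue

# Bounded queues provide the multi-buffering between pipeline stages.
work_q = queue.Queue(maxsize=4)
result_q = queue.Queue(maxsize=4)

def reader(items):
    for item in items:
        work_q.put(item)
    work_q.put(None)                      # sentinel: no more input

def computer():
    while (item := work_q.get()) is not None:
        result_q.put(item * item)         # stand-in for real computation
    result_q.put(None)

def writer():
    while (res := result_q.get()) is not None:
        print("writing", res)

threads = [threading.Thread(target=reader, args=(range(5),)),
           threading.Thread(target=computer),
           threading.Thread(target=writer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```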
Owner:LANGCHAO ELECTRONIC INFORMATION IND CO LTD

Method and apparatus for a parallel data storage and processing server

The present invention concerns a parallel multiprocessor-multidisk storage server which offers low delays and high throughputs when accessing and processing one-dimensional and multi-dimensional file data such as pixmap images, text, sound, or graphics. The invented parallel multiprocessor-multidisk storage server may be used as a server offering its services to a host computer, to client stations residing on a network, or to a parallel host system to which it is connected. The parallel storage server comprises (a) a server interface processor interfacing the storage system with a host computer, with a network, or with a parallel computing system; (b) an array of disk nodes, each disk node being composed of one processor electrically connected to at least one disk; and (c) an interconnection network for connecting the server interface processor with the array of disk nodes. Multi-dimensional data files such as 3-D images (for example, tomographic images) and 2-D images (for example, scanned aerial photographs) are segmented into 3-D and 2-D file extents, respectively, with extents being striped onto different disks. One-dimensional files are segmented into 1-D file extents. File extents of a given file may have a fixed or a variable size. The storage server is based on a parallel image and multiple media file storage system. This file storage system includes a file server process which receives file creation, file opening, file closing, and file deleting commands from the high-level storage server process. It further includes extent serving processes running on disk node processors, which receive commands from the file server process to update directory entries and to open existing files, and commands from the storage interface server process to read data from a file or to write data into a file. It also includes operation processes responsible for applying geometric transformations and image processing operations in parallel to data read from the disks, and a redundancy file creation process responsible for creating redundant parity extent files for selected data files.
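The extent-striping idea described above can be sketched as a simple placement function; the round-robin policy, extent size, and node count below are illustrative assumptions, not the patent's actual layout algorithm:

```python
# Split a file's data into fixed-size extents and assign each extent
# to a disk node (round-robin is one simple placement policy).
def stripe_extents(data: bytes, extent_size: int, num_nodes: int):
    placement = {}
    for i in range(0, len(data), extent_size):
        extent = data[i:i + extent_size]
        node = (i // extent_size) % num_nodes   # round-robin over nodes
        placement.setdefault(node, []).append(extent)
    return placement

layout = stripe_extents(b"0123456789abcdef", extent_size=4, num_nodes=3)
for node, extents in sorted(layout.items()):
    print(f"disk node {node}: {extents}")
```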
Owner:AXS TECH

Generation and search method for reachability chain list of directed graph in parallel environment

The invention belongs to the field of data processing for large graphs and relates to a method for generating and searching the reachability chain list of a directed graph in a parallel environment. The method includes: distributing the directed graph across the processors, each of which stores nodes of the graph and the child nodes corresponding to those nodes; compressing the graph data assigned to each processor; computing a backbone-node reachability code for the backbone graph; building a chain index; building a skip list on the chain index; enabling data communication among the processors; having each processor send its skip list information to the other processors; having each processor update its own skip list information; and building a reachability index for the whole graph. Through the use of graph reachability compression in the parallel environment, the size of the graph data is greatly reduced, the system's computing load is lowered, and the system can process graph data on a larger scale. The method reads data from disk faster, which indirectly increases search speed; it guarantees the accuracy of search results; and it greatly reduces the network communication cost and search time of a parallel computing system during searching.
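For reference, the kind of query the chain index and skip lists accelerate is a plain reachability test; a baseline depth-first-search version (with an illustrative toy graph) looks like this:

```python
# Baseline reachability test by depth-first search. The chain index
# and skip lists described above exist to answer such queries without
# traversing the graph at query time.
def reachable(graph, src, dst):
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

g = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"]}
print(reachable(g, "d", "c"))   # True
print(reachable(g, "c", "a"))   # False
```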
Owner:NORTHEASTERN UNIV

Large-scale resource scheduling system and large-scale resource scheduling method based on deep learning neural network

The invention discloses a large-scale resource scheduling system and a large-scale resource scheduling method based on a deep learning neural network. The system comprises at least one scheduling control module and at least two execution modules. The scheduling control module is used for receiving a use request, allocating scheduling resources, and performing parallel computation of state feedback; the execution modules are used for receiving task requests sent by the scheduling control module, opening up memory space, and performing computation. The system and method provide a user task request interface: a scheduler receives the submitted task request information, uses the deep learning neural network to predict and judge whether a task satisfies the user's expectations for the task completion conditions, and consequently determines the initialization parameters of the resource scheduling policy. The scheduler segments the task according to the resource scheduling policy and allocates it to the execution modules to complete the computation. While performing and finalizing the computation for the task, the execution modules feed resource information back to the scheduling control module so that the user task is completed in a unified manner.
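The scheduling loop described above can be sketched schematically; in the sketch below the predictor is a stub standing in for the deep learning neural network, and all names, the splitting policy, and the feedback format are illustrative assumptions:

```python
# Schematic scheduling loop: predict feasibility, segment the task,
# dispatch pieces to execution modules, and collect feedback.
def predict_meets_expectation(task):
    return task["size"] <= 100          # stub for the neural network

def schedule(task, executors):
    if not predict_meets_expectation(task):
        return None                     # reject: expected to miss the goal
    # Segment the task and allocate the pieces across execution modules.
    per_exec = task["size"] // len(executors)
    feedback = []
    for name in executors:
        result = per_exec * 2           # stand-in for remote computation
        feedback.append((name, result)) # executors report state back
    return feedback

print(schedule({"size": 90}, ["exec-1", "exec-2", "exec-3"]))
```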
Owner:WUHAN UNIV OF TECH

Hybrid-domain full waveform inversion method using central processing unit (CPU)/graphics processing unit (GPU) synergetic parallel computing

Status: Inactive. Publication: CN103135132A. Benefits: solves convergence problems; avoids the problem of insufficient storage and occupancy. Technology topics: seismic signal processing; internal memory; full wave.
The invention discloses a hybrid-domain full waveform inversion method using central processing unit (CPU)/graphics processing unit (GPU) synergetic parallel computing. Compared with traditional methods, the CPU/GPU synergetic parallel computing markedly improves computational efficiency. The method places the forward-modeling part of full waveform inversion in the time domain, i.e., forward modeling is conducted in the time domain, while the inversion is conducted in the frequency domain: the discrete Fourier transform (DFT) is used to extract the wavefield components corresponding to the inversion frequencies, and the inversion proceeds in the frequency domain from low frequency to high frequency. The method effectively resolves the convergence problem of standard time-domain methods and avoids the prohibitive memory requirements of standard three-dimensional frequency-domain and Laplace-domain waveform inversion. The method has few steps, reduces computational cost, and, because it supports CPU/GPU synergetic parallel computing, greatly improves the speedup of the computation.
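The hybrid-domain trick described above, accumulating a discrete Fourier transform on the fly during time stepping to obtain the wavefield at each inversion frequency, can be sketched as follows (the "wavefield" is a toy signal; a real implementation would time-step the wave equation on CPU/GPU instead):

```python
import numpy as np

dt, nt = 0.001, 2000
times = np.arange(nt) * dt
freqs = np.array([2.0, 4.0, 8.0])       # inversion frequencies, Hz

# Accumulate one DFT component per inversion frequency while "time
# stepping"; no full time history needs to be stored.
u_hat = np.zeros(len(freqs), dtype=complex)
for t in times:
    u_t = np.sin(2 * np.pi * 4.0 * t)   # stand-in for one forward step
    u_hat += u_t * np.exp(-2j * np.pi * freqs * t) * dt

print(np.abs(u_hat))    # the 4 Hz component dominates, as expected
```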
Owner:INST OF GEOLOGY & GEOPHYSICS CHINESE ACAD OF SCI