36 results about "Single instruction, multiple threads" patented technology

Single instruction, multiple threads (SIMT) is an execution model used in parallel computing in which single instruction, multiple data (SIMD) execution is combined with multithreading.

Configurable matrix register unit for supporting multi-width SIMD and multi-granularity SIMT

The invention relates to a configurable matrix register unit that supports multi-width single instruction, multiple data (SIMD) and multi-granularity single instruction, multiple threads (SIMT) operation. The unit comprises a matrix register and a control register SR. The matrix register, of size N*N, is divided into M*M blocks, where N is a positive integer and a power of 2, and M is an integer greater than or equal to 0 and a power of 2. The control register records the block mode of the matrix register and the number of threads processed simultaneously by the vector processing unit. The width of the control register is log2(C) + log2(T), where C is the number of block modes of the matrix register and T is the number of multi-thread modes that the vector processor can handle. The advantages of the configurable matrix register unit are that its principle is simple, it is easy to operate, the block size and thread count can be configured flexibly, and vector data can be accessed in both multi-width SIMD and multi-granularity SIMT modes.
Owner:NAT UNIV OF DEFENSE TECH
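
The width formula log2(C) + log2(T) in the abstract above can be illustrated with a minimal Python sketch. It assumes that C and T are both powers of 2 and that the block-mode field sits in the high bits of SR; neither detail is stated in the abstract, and the function names are hypothetical.

```python
def log2_exact(n: int) -> int:
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n < 1 or n & (n - 1):
        raise ValueError(f"{n} is not a positive power of two")
    return n.bit_length() - 1


def control_register_width(num_block_modes: int, num_thread_modes: int) -> int:
    """Width of SR in bits: log2(C) + log2(T)."""
    return log2_exact(num_block_modes) + log2_exact(num_thread_modes)


def pack_sr(block_mode: int, thread_mode: int,
            num_block_modes: int, num_thread_modes: int) -> int:
    """Pack both mode indices into one SR value.

    Assumption: block mode occupies the high bits, thread mode the low bits.
    """
    if not 0 <= block_mode < num_block_modes:
        raise ValueError("block_mode out of range")
    if not 0 <= thread_mode < num_thread_modes:
        raise ValueError("thread_mode out of range")
    return (block_mode << log2_exact(num_thread_modes)) | thread_mode


def unpack_sr(value: int, num_block_modes: int, num_thread_modes: int):
    """Split an SR value back into (block_mode, thread_mode)."""
    thread_bits = log2_exact(num_thread_modes)
    block_mode = (value >> thread_bits) & (num_block_modes - 1)
    thread_mode = value & ((1 << thread_bits) - 1)
    return block_mode, thread_mode
```

With C = 4 block modes and T = 8 thread modes, for example, the sketch gives a 5-bit control register: 2 bits select the block mode and 3 bits select the thread mode.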

Method for data correlation in parallel solving process based on cloud elimination equation of GPU (Graphics Processing Unit)

The invention discloses a method for data correlation in the parallel solving process based on a cloud elimination equation of a GPU (Graphics Processing Unit), with the aim of increasing data reusability and access efficiency. The technical scheme is as follows: eliminate the data correlation between threads within a warp by using the SIMT (Single-Instruction Multiple-Thread) parallel mechanism; construct a warp block from the grids processed by the 32 threads of a warp; determine the organization mode of the warp block; restrict the three dimensions of the block and the grid according to the capacity of shared memory; discretize the whole simulation area with the warp block as the basic unit; divide the global task into 8 groups such that there is no data correlation between any two warp blocks in the same group; and launch the kernel 8 times, each call completing the update of the current density for 1/8 of the grids in the whole simulation area. The method ensures that no two threads update the current density of the same grid at the same time, eliminates the data correlation between adjacent grids, achieves data reuse and efficient access, and increases the running speed of the CUDA (Compute Unified Device Architecture) program.
Owner:NAT UNIV OF DEFENSE TECH
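
One way to obtain 8 groups in which no two warp blocks are neighbours is to colour the blocks by the parity of their three block coordinates. The Python sketch below illustrates that idea only; the abstract does not say how its 8 groups are formed, so the parity rule, the adjacency test, and all names are assumptions.

```python
from itertools import product


def group_of(bx: int, by: int, bz: int) -> int:
    """Assumed rule: group index 0..7 from the parity of each coordinate."""
    return (bx & 1) | ((by & 1) << 1) | ((bz & 1) << 2)


def partition(dims):
    """Split all warp-block coordinates of a dims[0] x dims[1] x dims[2]
    domain into 8 groups, one group per kernel launch."""
    groups = [[] for _ in range(8)]
    for block in product(range(dims[0]), range(dims[1]), range(dims[2])):
        groups[group_of(*block)].append(block)
    return groups


def are_adjacent(a, b) -> bool:
    """Two distinct blocks touch if every coordinate differs by at most 1."""
    return a != b and all(abs(p - q) <= 1 for p, q in zip(a, b))


def run_all_groups(dims, update_block):
    """Process the groups one after another, mirroring 8 kernel launches.
    update_block is a caller-supplied function applied to each block."""
    for group in partition(dims):
        for block in group:  # these blocks could run concurrently
            update_block(block)
```

Blocks in the same group share the parity of every coordinate, so two different blocks in a group are at least two blocks apart along some axis and never write to neighbouring grids during the same launch.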

Single-instruction multi-thread shader cluster structure of unified shader architecture graphics processor

The invention belongs to the field of graphics processor design and relates to a single-instruction multi-thread shader cluster structure for a graphics processor with a unified shader architecture. The structure comprises a CU (Control Unit) (3), an FDU (Fetch Decode Unit) (2), an I$ (Instruction Cache) unit (4), a plurality of SPUs (Shader Processing Units) (1), an SSRAM (Synchronous Static Random Access Memory) unit (8), a RAC (RAM Access Control) unit (7), an LSU (Load/Store Unit) (6) and a C$ (Constant Cache) unit (5). The CU (3) controls and schedules the shader cluster (SSC); the FDU (2) fetches and decodes instructions; the I$ unit (4) speeds up instruction access; the SPUs execute shader programs; the SSRAM unit (8) shares data among the SPUs; the RAC unit (7) performs decoding and arbitration control for internal memory access; the LSU (6) exchanges data among the SSRAM unit (8), the internal memories of the SPUs and the RF (Register File) unit; and the C$ unit (5) speeds up constant access. With this structure, single-instruction multi-thread processing is realized.
Owner:XIAN AVIATION COMPUTING TECH RES INST OF AVIATION IND CORP OF CHINA

Multi-core processing device and power consumption control method thereof

The embodiment of the invention discloses a multi-core processing device and a power consumption control method thereof. Through a power consumption control unit, the device obtains the workload of each of one or more single-instruction multi-thread processing units in a power consumption adjustment period and judges whether that workload is smaller than a power consumption adjustment threshold. When the workload of any one of the single-instruction multi-thread processing units in a power consumption adjustment period is smaller than the threshold, at least some of the thread bundles in that unit are instructed to execute a power consumption adjustment instruction, which increases the workload of that unit. According to the embodiment, the multi-core processing device maintains relatively stable current and voltage during its effective working period and sudden voltage drops in the device are reduced, so that the overall power consumption of the chip is reduced and the processing frequency and performance are improved.
Owner:METAX INTEGRATED CIRCUITS (SHANGHAI) CO LTD
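
The per-period decision described in this abstract can be sketched in Python as follows. The workload scale, the fraction of thread bundles chosen, and the function name are assumptions made for illustration; the abstract only says that "at least part" of the thread bundles execute the power consumption adjustment instruction.

```python
def plan_power_adjustment(workloads, threshold, bundles_per_unit, fraction=0.5):
    """Decide, for one power consumption adjustment period, which units
    should run the power consumption adjustment instruction.

    workloads        -- measured workload of each SIMT processing unit
    threshold        -- power consumption adjustment threshold
    bundles_per_unit -- thread bundles available in each unit
    fraction         -- assumed share of bundles to use (not from the source)

    Returns {unit_index: number_of_thread_bundles_to_instruct}.
    """
    plan = {}
    for unit, load in enumerate(workloads):
        if load < threshold:  # strictly below the threshold, as in the text
            plan[unit] = max(1, int(bundles_per_unit * fraction))
    return plan
```

For instance, with workloads of 0.9, 0.2 and 0.5 and a threshold of 0.5, only the second unit falls below the threshold and is asked to run the adjustment instruction on some of its thread bundles.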

Processor device, instruction execution method thereof and computing equipment

The embodiment of the invention discloses a processor device, an instruction execution method thereof and computing equipment. The device comprises one or more single-instruction multi-thread processing units, each of which comprises: one or more thread bundles for executing instructions; a shared register group containing a plurality of general purpose registers shared among the thread bundles; and a predicate base address register provided for each thread bundle, which indicates the base address, within the shared register group, of the group of general purpose registers used as that thread bundle's predicate registers. Each thread bundle performs predicated execution of instructions based on the predicate values held in the general purpose registers that serve as its predicate registers. According to the embodiment, the dedicated predicate registers that each thread bundle has in the original processor architecture can be eliminated and the predicate register resources of each thread bundle can be expanded dynamically, so that processor resources are fully utilized, the overhead of switching instructions is reduced, and instruction processing performance is improved.
Owner:METAX INTEGRATED CIRCUITS (SHANGHAI) CO LTD
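
The predicate base address mechanism described in this abstract can be modelled with a short Python sketch. It assumes that each predicate register holds one bit per lane and that a lane whose bit is 0 keeps its old value; the class and method names are hypothetical and are not taken from the patent.

```python
class SharedRegisterGroup:
    """General purpose registers shared among all thread bundles."""

    def __init__(self, num_registers: int):
        self.regs = [0] * num_registers


class ThreadBundle:
    """A thread bundle whose predicate registers live in the shared group."""

    def __init__(self, shared, pred_base: int, num_pred: int, width: int = 32):
        if pred_base < 0 or pred_base + num_pred > len(shared.regs):
            raise ValueError("predicate window exceeds the shared group")
        self.shared = shared
        self.pred_base = pred_base  # models the predicate base address register
        self.num_pred = num_pred    # window size, adjustable per bundle
        self.width = width          # number of lanes (threads) in the bundle

    def _address(self, index: int) -> int:
        if not 0 <= index < self.num_pred:
            raise IndexError("predicate index out of range")
        return self.pred_base + index

    def set_predicate(self, index: int, mask: int) -> None:
        """Store a per-lane bit mask in predicate register `index`."""
        self.shared.regs[self._address(index)] = mask & ((1 << self.width) - 1)

    def predicate(self, index: int) -> int:
        return self.shared.regs[self._address(index)]

    def execute_predicated(self, index: int, op, values):
        """Apply op only in lanes whose predicate bit is set."""
        mask = self.predicate(index)
        return [op(v) if (mask >> lane) & 1 else v
                for lane, v in enumerate(values)]
```

Because a bundle's predicates are just a window of the shared group, giving a bundle more predicate registers amounts to choosing a different base address or window size rather than adding dedicated hardware registers.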