96 results about "Simd processor" patented technology

Simty is a massively multi-threaded processor core that dynamically assembles SIMD instructions from scalar multi-threaded code. It runs the RISC-V (RV32-I) instruction set. Unlike existing SIMD or SIMT processors like GPUs, Simty runs binaries compiled for general-purpose processors without any instruction set extension or compiler changes.

SIMD processor and addressing method

Inactive | US20060047937A1 | Without unduly consuming processor resources | Memory addressing/allocation/relocation | Micro-instruction address formation | Memory address | Processor register

A single instruction, multiple data (SIMD) processor including a plurality of addressing register sets, used to flexibly calculate effective operand source and destination memory addresses is disclosed. Two or more address generators calculate effective addresses using the register sets. Each register set includes a pointer register, and a scale register. An address generator forms effective addresses from a selected register set's pointer register and scale register; and an offset. For example, the effective memory address may be formed by multiplying the scale value by an offset value and summing the pointer and the scale value multiplied by the offset value.

Owner:AVAGO TECH INT SALES PTE LTD
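
The effective-address calculation described in the abstract above (pointer plus scale times offset) can be sketched as follows. This is a minimal illustration, not the patented hardware: the class name `AddressRegisterSet`, the function `effective_address`, and the example register values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AddressRegisterSet:
    """Hypothetical model of one addressing register set."""
    pointer: int  # base address held in the pointer register
    scale: int    # multiplier held in the scale register


def effective_address(regs: AddressRegisterSet, offset: int) -> int:
    """Form pointer + scale * offset, as in the abstract's example."""
    return regs.pointer + regs.scale * offset


# Two register sets, e.g. one for a source operand and one for a destination.
src = AddressRegisterSet(pointer=0x1000, scale=4)
dst = AddressRegisterSet(pointer=0x2000, scale=8)

src_addrs = [effective_address(src, i) for i in range(4)]
dst_addrs = [effective_address(dst, i) for i in range(4)]
```

Keeping the scale in a register, rather than fixing it in the instruction, lets the same instruction step through operands of different strides.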

Flexible vector modes of operation for SIMD processor

Inactive | US20100274988A1 | General purpose stored program computer | Machine execution arrangements | Computer architecture | Vector element

In addition to the usual modes of SIMD processor operation, where corresponding elements of two source vector registers are used as input pairs to be operated upon by the execution unit, or where one element of a source vector register is broadcast for use across the elements of another source vector register, the new system provides several other modes of operation for the elements of one or two source vector registers. Improving upon the time-costly moving of elements for an operation such as DCT, the present invention defines a more general set of modes of vector operations. In one embodiment, these new modes of operation use a third vector register to define how each element of one or both source vector registers is mapped, in order to pair these mapped elements as inputs to a vector execution unit. Furthermore, the decision to write an individual vector element result to a destination vector register, for each individual element produced by the vector execution unit, may be selectively disabled, enabled, or made to depend upon a selectable condition flag or a mask bit.

Owner:MIMAR TIBET
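
A rough sketch of the mapped-operand mode with per-element write control described in the abstract above. It assumes, for illustration only, that the control vector is split into two index lists (`map_a`, `map_b`) and that write control is a simple mask; the function name `mapped_vector_op` is hypothetical.

```python
import operator


def mapped_vector_op(op, src_a, src_b, map_a, map_b, dest, write_mask):
    """Pair src_a[map_a[i]] with src_b[map_b[i]] in lane i.

    Lanes whose write_mask entry is falsy keep the old dest value.
    """
    out = list(dest)
    for lane in range(len(dest)):
        if write_mask[lane]:
            out[lane] = op(src_a[map_a[lane]], src_b[map_b[lane]])
    return out


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
identity = [0, 1, 2, 3]   # usual element-to-element pairing
reverse = [3, 2, 1, 0]    # pair lane i of a with lane 3 - i of b

result = mapped_vector_op(operator.add, a, b, identity, reverse,
                          dest=[0, 0, 0, 0], write_mask=[1, 1, 0, 1])
```

With an all-equal map such as `[2, 2, 2, 2]` the same routine reproduces the broadcast mode, which is why the mapped mode can be seen as a generalization of the two usual modes.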

Method and apparatus for enable/disable control of SIMD processor slices

Active | US20060155964A1 | Avoid writing | Energy efficient ICT | Digital data processing details | PathPing | Computer architecture

Methods and apparatus provide for disabling at least some data path processing circuits of a SIMD processing pipeline, in which the processing circuits are organized into a matrix of slices and stages, in response to one or more enable flags during a given cycle.

Owner:SONY COMPUTER ENTERTAINMENT INC

Method for performing random read access to a block of data using parallel lut read instruction in vector processors

Inactive | US20160124651A1 | Solve excessive overhead | Memory architecture accessing/allocation | Input/output to record carriers | Processor register | Simd processor

This invention addresses the problem of parallelizing random read access within a reasonably sized block of data for a vector SIMD processor. The invention sets up plural parallel look up tables, moves data from main memory to each of the parallel look up tables, and then employs a look up table read instruction to simultaneously move data from each parallel look up table to a corresponding part of a vector destination register. This enables data processing by vector single instruction multiple data (SIMD) operations. This vector destination register load can be repeated if the tables store more used data. New data can be loaded into the original tables if appropriate. A level one memory is preferably partitioned as part data cache and part directly addressable memory. The look up table memory is stored in the directly addressable memory.

Owner:TEXAS INSTR INC
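
The parallel look up table read described above might be modeled like this. The sketch assumes that every table holds a full copy of the data block, which the abstract does not state explicitly; `parallel_lut_read` is a hypothetical name.

```python
def parallel_lut_read(tables, indices):
    """Lane i reads tables[i][indices[i]]; all lanes form one conceptual instruction."""
    return [table[idx] for table, idx in zip(tables, indices)]


# A small data block in "main memory" (values are illustrative).
block = [v * v for v in range(16)]

# Assumption: one private copy of the block per lane, so that random
# reads from different lanes never contend for the same table.
lanes = 4
tables = [list(block) for _ in range(lanes)]

dest = parallel_lut_read(tables, [3, 0, 15, 7])
```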

Method for programmable motion estimation in a SIMD processor

Inactive | US7126991B1 | Television system details | Image analysis | Slide window | Simd processor

The present invention provides a 16×16-sliding window using vector register file with zero overhead for horizontal or vertical shifts to incorporate motion estimation into SIMD vector processor architecture. SIMD processor's vector load mechanism, vector register file with shifting of elements capability, and 16×16 parallel SAD calculation hardware and instruction are used. Vertical shifts of all sixteen-vector registers occur in a ripple-through fashion when the end vector register is loaded. The parallel SAD calculation hardware can calculate one 16-by-16-block match per clock cycle in a pipelined fashion. In addition, hardware for best-match SAD value comparisons and maintaining their pixel location reduces the software overhead. Block matching for less than 16 by 16 block areas is supported using a mask register to mask selected elements, thereby reducing search area to any block size less than 16 by 16.

Owner:MIMAR TIBET
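
As an illustration of the block matching the abstract describes, the sketch below computes a masked sum of absolute differences (SAD) and tracks the best match. It is sequential software rather than the patent's 16×16 parallel hardware, and the function names and the tiny 2×2 blocks are hypothetical.

```python
def masked_sad(cur, ref, mask):
    """Sum of absolute differences over the positions enabled in mask."""
    return sum(
        abs(c - r)
        for row_c, row_r, row_m in zip(cur, ref, mask)
        for c, r, m in zip(row_c, row_r, row_m)
        if m
    )


def best_match(cur, candidates, mask):
    """Return (sad, position) of the first candidate with the lowest SAD."""
    best = None
    for pos, ref in candidates:
        sad = masked_sad(cur, ref, mask)
        if best is None or sad < best[0]:
            best = (sad, pos)
    return best


cur = [[1, 2], [3, 4]]
candidates = [((0, 0), [[1, 2], [3, 9]]), ((1, 0), [[1, 2], [3, 4]])]
full_mask = [[1, 1], [1, 1]]
partial_mask = [[1, 1], [1, 0]]  # ignore the bottom-right pixel

best_full = best_match(cur, candidates, full_mask)
best_partial = best_match(cur, candidates, partial_mask)
```

Masking out elements is how a fixed-size SAD unit can serve smaller block sizes: disabled positions simply contribute nothing to the sum.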

Fast and flexible scan conversion and matrix transpose in a SIMD processor

Inactive | US6963341B1 | Efficient and flexible | Quick implementation | Program control | Architecture with single central processing unit | Present method | Scan conversion

The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.

Owner:MIMAR TIBET
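
Both operations named in the abstract above can be expressed as a single element selection (a "multiplex") driven by an index vector. The sketch below assumes the common JPEG-style zigzag order; the helper names are hypothetical and the patent's actual instruction encoding is not modeled.

```python
def vector_mux(src, select):
    """Output lane i takes src[select[i]]."""
    return [src[i] for i in select]


def transpose_select(n):
    """Index vector that transposes a row-major n x n block."""
    return [c * n + r for r in range(n) for c in range(n)]


def zigzag_select(n):
    """Index vector for a JPEG-style zigzag scan of a row-major n x n block."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk anti-diagonals; alternate direction on odd and even diagonals.
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
    return [r * n + c for r, c in cells]


block = list(range(16))                       # 4 x 4 block, row-major
zigzag = vector_mux(block, zigzag_select(4))
transposed = vector_mux([1, 2, 3, 4], transpose_select(2))
```

Because only the index vector changes, other scan orders can reuse the same selection step.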

System and method for processing image data relative to a focus of attention within the overall image

Inactive | US20110211726A1 | Process time saving | Save data storage overhead | Television system details | Program control using wired connections | Sensor array | Digital data

This invention provides a system and method for processing discrete image data within an overall set of acquired image data based upon a focus of attention within that image. The result of such processing is to operate upon a more limited subset of the overall image data to generate output values required by the vision system process. Such output value can be a decoded ID or other alphanumeric data. The system and method is performed in a vision system having two processor groups, along with a data memory that is smaller in capacity than the amount of image data to be read out from the sensor array. The first processor group is a plurality of SIMD processors and at least one general purpose processor, co-located on the same die with the data memory. A data reduction function operates within the same clock cycle as data-readout from the sensor to generate a reduced data set that is stored in the on-die data memory. At least a portion of the overall, unreduced image data is concurrently (in the same clock cycle) transferred to the second processor while the first processor transmits at least one region indicator with respect to the reduced data set to the second processor. The region indicator represents at least one focus of attention for the second processor to operate upon.

Owner:COGNEX CORP

Volume rendering apparatus and process

Active | US20070040830A1 | 3D-image rendering | 3D modelling | Graphics | Volumetric data

A computer automated process is presented for accelerating the rendering of sparse volume data on Graphics Processing Units (GPUs). GPUs are typically SIMD processors, and thus well suited to processing continuous data and not sparse data. The invention allows GPUs to process sparse data efficiently through the use of scatter-gather textures. The invention can be used to accelerate the rendering of sparse volume data in medical imaging or other fields.

Owner:TOSHIBA MEDICAL VISUALIZATION SYST EURO

Systems and methods for accelerating sub-pixel interpolation in video processing applications

Active | US20070070080A1 | Increase precision supported for all operation | Improve accuracy | Geometric image transformation | Cathode-ray tube indicators | Video processing | Diagonal

A data path for a SIMD-based microprocessor is used to perform different simultaneous filter sub-operations in parallel data lanes of the SIMD-based microprocessor. Filter operations for sub-pixel interpolation are performed simultaneously on separate lanes of the SIMD processor's data path. Using a dedicated internal data path, precision higher than the native precision of the SIMD unit may be achieved. Through the data path according to this invention, a single instruction may be used to generate the value of two adjacent sub-pixels located diagonally with respect to integer pixel positions.

Owner:SYNOPSYS INC

Method and apparatus for data alignment and parsing in SIMD computer architecture

Inactive | US7275147B2 | Reduce in quantity | Increase speed | General purpose stored program computer | Specific program execution arrangements | Processor register | Simd processor

Execution of a single stand-alone instruction manipulates two n bit strings of data to pack data or align the data. Decoding of the single instruction identifies two registers of n bits each and a shift value, preferably as parameters of the instruction. A first and a second subset of data of less than n bits are selected, by logical shifting, from the two registers, respectively, based solely upon the shift value. Then, the subsets are concatenated, preferably by a logical OR, to obtain an output of n bits. The output may be aligned data or packed data, particularly useful for performing a single operation on multiple sets of the data through parallel processing with a SIMD processor.

Owner:HITACHI LTD
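
The shift-and-OR alignment in the abstract above can be sketched in a few lines. The shift directions chosen here (left for the first register, right for the second) and the function name `align` are assumptions made for illustration; the abstract only says that both subsets are selected by logical shifting from a single shift value.

```python
def align(reg_a, reg_b, shift, n=32):
    """Concatenate the low (n - shift) bits of reg_a with the high shift bits of reg_b.

    shift is given in bits, with 0 <= shift <= n.
    """
    mask = (1 << n) - 1
    high = (reg_a << shift) & mask        # logical left shift, truncated to n bits
    low = (reg_b & mask) >> (n - shift)   # logical right shift
    return high | low                     # the subsets do not overlap, so OR concatenates


aligned = align(0x11223344, 0x55667788, shift=8)
```

This is the typical pattern for reading an unaligned n-bit word that straddles two aligned registers.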

SIMD processor with scalar arithmetic logic units

Active | US7146486B1 | Shorten the time | Avoid conflict | Concurrent instruction execution | Architecture with single central processing unit | Arithmetic logic unit | Scalar processor

A scalar processor that includes a plurality of scalar arithmetic logic units and a special function unit. Each scalar unit performs, in a different time interval, the same operation on a different data item, where each different time interval is one of a plurality of successive, adjacent time intervals. Each unit provides an output data item in the time interval in which the unit performs the operation and provides a processed data item in the last of the successive, adjacent time intervals. The special function unit provides a special function computation for the output data item of a selected one of the scalar units, in the time interval in which the selected scalar unit performs the operation, so as to avoid a conflict in use among the scalar units. A vector processing unit includes an input data buffer, the scalar processor, and an output orthogonal converter.

Owner:S3 GRAPHICS

Method for efficient and parallel color space conversion in a programmable processor

Inactive | US20110072236A1 | Easy to operate | Effective coloring | Instruction analysis | Program control using wired connections | Parallel computing | Simd processor

The present invention relates to an efficient implementation of color space conversion in a SIMD processor as part of converting output of video decompression to interface to a display unit.

Owner:MIMAR TIBET

Histogram generation with vector operations in SIMD and VLIW processor by consolidating LUTs storing parallel update incremented count values for vector data elements

Inactive | US7506135B1 | Program control | Architecture with multiple processing units | Scalar processor | Simd processor

The present invention provides histogram calculation for images and video applications using a SIMD and VLIW processor with vector Look-Up Table (LUT) operations. This provides a speed up of histogram calculation by a factor of N times over a scalar processor where the SIMD processor could perform N LUT operations per instruction. Histogram operation is partitioned into a vector LUT operation, followed by vector increment, vector LUT update, and at the end by reduction of vector histogram components. The present invention could be used for intensity, RGBA, YUV, and other types of multi-component images.

Owner:MIMAR TIBET
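
The four phases listed in the abstract (vector LUT read, vector increment, vector LUT update, final reduction) can be illustrated as below. The sketch assumes one private count table per lane so that two lanes never update the same entry; that layout and the name `vector_histogram` are assumptions, not details taken from the patent.

```python
def vector_histogram(pixels, bins, lanes):
    """Histogram built from per-lane tables that are summed at the end."""
    tables = [[0] * bins for _ in range(lanes)]
    for start in range(0, len(pixels), lanes):
        vec = pixels[start:start + lanes]
        # Vector LUT read: each lane fetches its current count.
        counts = [tables[lane][v] for lane, v in enumerate(vec)]
        # Vector increment.
        counts = [c + 1 for c in counts]
        # Vector LUT update: each lane writes its count back.
        for lane, (v, c) in enumerate(zip(vec, counts)):
            tables[lane][v] = c
    # Reduction of the per-lane histogram components.
    return [sum(t[b] for t in tables) for b in range(bins)]


hist = vector_histogram([0, 1, 1, 3, 3, 3, 2, 0, 1], bins=4, lanes=4)
```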

SIMD processor executing min/max instructions

Active | US20060095739A1 | Computation using non-contact making devices | General purpose stored program computer | Parallel computing | Simd processor

A SIMD processor responds to a single min / max instruction to find the minimum or maximum valued data unit in an array of data units. The determined minimum / maximum value and an associated index value thereto may be output. Alternatively, the value of a data unit in another array may be output at a corresponding location. A further single instruction executable by the SIMD processor, may be applied to results obtained using such a single min / max instruction, to allow such instructions to operate on two dimensional arrays.

Owner:AVAGO TECH INT SALES PTE LTD
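
A software model of the min/max behavior described in the abstract above: return the extreme value together with its index, or optionally the element at the same position in a second array. The function name and the tie-breaking rule (first occurrence wins) are assumptions.

```python
def simd_min_max(values, find_max=False, other=None):
    """Return (value, index) of the min or max element.

    If other is given, return (other[index], index) instead.
    Ties resolve to the lowest index (an assumption of this sketch).
    """
    pick = max if find_max else min
    index = pick(range(len(values)), key=values.__getitem__)
    if other is not None:
        return other[index], index
    return values[index], index


data = [7, 2, 9, 2]
lowest = simd_min_max(data)
highest = simd_min_max(data, find_max=True)
paired = simd_min_max(data, other=["a", "b", "c", "d"])
```

Applying the same routine to the per-row results of a matrix gives one way to extend the search to two-dimensional arrays.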

Method and system for parallel histogram calculation in a simd and vliw processor

Inactive | US20090276606A1 | Program control using stored programs | General purpose stored program computer | Scalar processor | Parallel computing

The present invention provides histogram calculation for images and video applications using a SIMD and VLIW processor with vector Look-Up Table (LUT) operations. This provides a speed up of histogram calculation by a factor of N times over a scalar processor where the SIMD processor could perform N LUT operations per instruction. Histogram operation is partitioned into a vector LUT operation, followed by vector increment, vector LUT update, and at the end by reduction of vector histogram components. The present invention could be used for intensity, RGBA, YUV, and other types of multi-component images.

Owner:MIMAR TIBET

Volume rendering apparatus and process

Active | US7333107B2 | 3D-image rendering | 3D modelling | Computational science | Graphics

A computer automated process is presented for accelerating the rendering of sparse volume data on Graphics Processing Units (GPUs). GPUs are typically SIMD processors, and thus well suited to processing continuous data and not sparse data. The invention allows GPUs to process sparse data efficiently through the use of scatter-gather textures. The invention can be used to accelerate the rendering of sparse volume data in medical imaging or other fields.

Owner:TOSHIBA MEDICAL VISUALIZATION SYST EURO

SIMD operation system capable of designating plural registers

Inactive | US20020026570A1 | Maximize the effect | Maintain accuracy | Register arrangements | Concurrent instruction execution | Operational system | Network packet

To alleviate factors that limit the benefit of SIMD operation in a high-speed SIMD processor, such as in-register data alignment, the register file is divided into four banks and a single operand can designate a plurality of registers. Four registers can thus be accessed simultaneously, so that a large amount of data can be supplied to a data alignment operation pipe 211 and data alignment can be carried out at high speed. Further, newly defined data pack, data unpack, and data permutation instructions allow the large volume of supplied data to be aligned efficiently. These features also make it possible to define a multiply-accumulate instruction that maximizes SIMD parallelism.

Owner:RENESAS ELECTRONICS CORP

Compilation for a SIMD RISC processor

Inactive | US7840954B2 | Software engineering | General purpose stored program computer | Data processing system | Scalar Value

A computer implemented method, data processing system, and computer usable code are provided for generating code to perform scalar computations on a Single-Instruction Multiple-Data (SIMD) Reduced Instruction Set Computer (RISC) architecture. The illustrative embodiments generate code directed at loading at least one scalar value and generate code using at least one vector operation to generate a scalar result, wherein all scalar computation for integer and floating point data is performed in a SIMD vector execution unit.

Owner:INT BUSINESS MASCH CORP

System and method for processing thread groups in a SIMD architecture

Active | US20070130447A1 | Reduce clock frequency | Low hardware requirements | General purpose stored program computer | Memory systems | Graphics | Datapath

A SIMD processor efficiently utilizes its hardware resources to achieve higher data processing throughput. The effective width of a SIMD processor is extended by clocking the instruction processing side of the SIMD processor at a fraction of the rate of the data processing side and by providing multiple execution pipelines, each with multiple data paths. As a result, higher data processing throughput is achieved while an instruction is fetched and issued once per clock. This configuration also allows a large group of threads to be clustered and executed together through the SIMD processor so that greater memory efficiency can be achieved for certain types of operations like texture memory accesses performed in connection with graphics processing.

Owner:NVIDIA CORP

SIMD operation system capable of designating plural registers via one register designating field

Inactive | US7043627B2 | Maximize the effect | Maintain accuracy | Register arrangements | Concurrent instruction execution | Network packet | Processor register

To alleviate factors that limit the benefit of SIMD operation in a high-speed SIMD processor, such as in-register data alignment, the register file is divided into four banks and a single operand can designate a plurality of registers. Four registers can thus be accessed simultaneously, so that a large amount of data can be supplied to a data alignment operation pipe 211 and data alignment can be carried out at high speed. Further, newly defined data pack, data unpack, and data permutation instructions allow the large volume of supplied data to be aligned efficiently. These features also make it possible to define a multiply-accumulate instruction that maximizes SIMD parallelism.

Owner:RENESAS ELECTRONICS CORP

Merged control/process element processor for executing VLIW simplex instructions with SISD control/SIMD process mode bit

Inactive | US6874078B2 | Low implementation cost | Single instruction multiple data multiprocessors | Register arrangements | Operation mode | Simd processor

A highly parallel data processing system includes an array of n processing elements (PEs) and a controller sequence processor (SP) wherein at least one PE is combined with the controller SP to create a Dynamic Merged Processor (DP) which supports two modes of operation. In its first mode of operation, the DP acts as one of the PEs in the array and participates in the execution of single-instruction-multiple-data (SIMD) instructions. In the second mode of operation, the DP acts as the controlling element for the array of PEs and executes non-array instructions. To support these two modes of operation, the DP includes a plurality of execution units and two general-purpose register files. The execution units are “shared” in that they can execute instructions in either mode of operation. With very long instruction word (VLIW) capability, both modes of operation can be in effect on a cycle by cycle basis for every VLIW executed. This structure allows the controlling element in a highly parallel SIMD processor to be reused as one of the processing elements in the array to reduce the overall number of transistors and wires in the SIMD processor while maintaining its capabilities and performance.

Owner:ALTERA CORP

Method and apparatus for the efficient representation of interpolated video frames for motion-compensated coding

Inactive | US20050047502A1 | Improve execution speed | Easy to use | Television system details | Color television with pulse code modulation | Parallel computing | Diagonal

An efficient representation of an interpolated video frame when used with Single-Instruction-Multiple-Data processors implementing motion-compensated coding. The pixel data of a half-pixel interpolated video frame is stored in memory so that the full-pixels of the original image are stored in a first set of contiguous memory locations; the half-pixels interpolated between horizontally adjacent full-pixels are stored in a second set of contiguous memory locations; the half-pixels interpolated between vertically adjacent full-pixels are stored in a third set of contiguous memory locations; and the half-pixels interpolated between diagonally adjacent full-pixels are stored in a fourth set of contiguous memory locations.

Owner:LUCENT TECH INC
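
The four-plane storage layout described in the abstract above can be sketched as follows. It assumes the interpolated frame arrives as a 2H × 2W row-major grid in which even rows and columns hold the original full pixels; that input format and the name `split_planes` are hypothetical.

```python
def split_planes(interp):
    """Split a half-pixel interpolated frame into four contiguous planes."""
    rows, cols = len(interp), len(interp[0])
    full = [interp[y][x] for y in range(0, rows, 2) for x in range(0, cols, 2)]
    horiz = [interp[y][x] for y in range(0, rows, 2) for x in range(1, cols, 2)]
    vert = [interp[y][x] for y in range(1, rows, 2) for x in range(0, cols, 2)]
    diag = [interp[y][x] for y in range(1, rows, 2) for x in range(1, cols, 2)]
    return full, horiz, vert, diag


interp = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
full, horiz, vert, diag = split_planes(interp)
```

Keeping each class of pixel contiguous means that a block at any given half-pixel displacement can be fetched with unit-stride vector loads from a single plane.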

Method for efficient DCT calculations in a programmable processor

Inactive | US20110072065A1 | Easy to operate | Efficient implementation | Conditional code generation | Digital computer details | Parallel computing | Simd processor

The present invention relates to an efficient implementation of integer and fractional 8-length or 4-length, or 8×8 or 4×4 DCT in a SIMD processor as part of MPEG and other video compression standards.

Owner:MIMAR TIBET

Method and apparatus for enable/disable control of SIMD processor slices

Active | US7644255B2 | Avoid writing | Energy efficient ICT | Digital data processing details | PathPing | Computer architecture

Methods and apparatus provide for disabling at least some data path processing circuits of a SIMD processing pipeline, in which the processing circuits are organized into a matrix of slices and stages, in response to one or more enable flags during a given cycle.

Owner:SONY COMPUTER ENTERTAINMENT INC

SIMD processor with exchange sort instruction operating or plural data elements simultaneously

Inactive | US7500089B2 | Increase speed | Image enhancement | Concurrent instruction execution | Processor element | Processor register

An SIMD type microprocessor having a plurality of processor elements, wherein data stored in a specific register included in each processor element and data stored in an operand-designated source register are compared based on a first type of instruction; after the comparison, a larger data is stored in the specific register; and a smaller data is stored in the source register or an operand-designated destination register other than the source register.

Owner:RICOH KK
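
One step of the compare-and-exchange described in the abstract above, modeled per processor element (PE). The sketch writes the smaller value back to the source register, which is one of the two options the abstract mentions; `exchange_sort_step` is a hypothetical name.

```python
def exchange_sort_step(specific, source):
    """Per PE: keep the larger value in the specific register, the smaller in source."""
    larger = [max(s, r) for s, r in zip(specific, source)]
    smaller = [min(s, r) for s, r in zip(specific, source)]
    return larger, smaller


# Each list position represents one processor element.
specific_regs = [5, 1, 7]
source_regs = [3, 4, 7]
specific_regs, source_regs = exchange_sort_step(specific_regs, source_regs)
```

Repeating the step against successive source registers leaves the running maximum in the specific register, as in a bubble-style pass of a sort.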

Reconfigurable Logic in Processors

Inactive | US20080189514A1 | Easy access | Easy to use | Architecture with single central processing unit | Machine execution arrangements | Logic cell | Parallel computing

A data processor comprises an array of processing elements (PEn 4), each element in the array comprising a respective configurable logic unit (CLU 11), whereby the logic capability of each processing element can be reconfigured at will. Memory (14, FIGS. 3, 4 not shown) may be pre-loaded with configuration instructions, whereby the configuration state of each processing element can be automatically sequenced from the pre-loaded memory. The memory may be global, in which case the CLUs may be reconfigured in parallel, to perform the same function. Alternatively, the memory may be local to each processing element so that different CLUs implement different functions. Configuration may be carried out under program control at a thread switch. Each respective processing element may select, at run time, a specific configuration from a number of configurations in a microcode store. The processor is preferably a SIMD processor.

Owner:CLEARSPEED TECH

Audio and video processing apparatus

Inactive | US20090316798A1 | Effective way | Color television with pulse code modulation | Color television with bandwidth reduction | Semiconductor chip | Video processing

The present invention performs video and audio compression/decompression, video input and output scaling, video input and output processing for enhancement, and system layer functions on a single semiconductor chip. The media processor comprises a video processor with a SIMD vector engine, an audio processor, a stream processor, a system processor, video scalers, LUTs, and a hardware blender. A unified memory architecture is used, in which these four processors share a memory for data and instructions. Data transfers between multiple processors use multiple packet-based unidirectional communication channels via hardware-assisted circular queues in unified memory. The video processor is a SIMD processor coupled to a regular RISC processor as a dual-issue processor. Such integrated and programmable functionality supports implementation of multiple video and audio compression standards and programmable video enhancement. Important applications of this include Digital TV, IP Video Phone, and Digital Camcorder/Camera.

Owner:MIMAR TIBET

Gate-Level Logic Simulator Using Multiple Processor Architectures

Active | US20110257955A1 | CAD circuit design | Special data processing applications | Simd processor | Cluster group

Techniques for simulating operation of a connectivity level description of an integrated circuit design are provided, for example, to simulate logic elements expressed through a netlist description. The techniques utilize a host processor selectively partitioning and optimizing the descriptions of the integrated circuit design for efficient simulation on a parallel processor, more particularly a SIMD processor. The description may be segmented into cluster groups, for example macro-gates, formed of logic elements, where the cluster groups are sized for parallel simulation on the parallel processor. Simulation may occur in an oblivious as well as event-driven manner, depending on the implementation.

Owner:RGT UNIV OF MICHIGAN

Vector SIMD processor

Inactive | US7028066B2 | Easy to process | Single instruction multiple data multiprocessors | Multiple digital computer combinations | Operational system | Execution unit

A data processor whose level of operation parallelism is enhanced by composing floating-point inner product execution units to be compatible with single instruction multiple data (SIMD) and thereby enhancing the operation processing capability is made possible. An operating system that can significantly enhance the level of operation parallelism per instruction while maintaining the efficiency of the floating-point length-4 vector inner product execution units is to be implemented. The floating-point length-4 vector inner product execution units are defined in the minimum width (32 bits for single precision) even where an extensive operating system becomes available, and compose the inner product execution units to be compatible with SIMD. The mutually augmenting effects of the inner product execution units and SIMD-compatible composition enhances the level of operation parallelism dramatically. Composition of the floating-point length-4 vector inner product execution units to calculate the sum of the inner product of length-4 vectors and scalar to be compatible with SIMD of four in parallel results in a processing capability of 32 FLOPS per cycle.

Owner:RENESAS ELECTRONICS CORP
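
The 32 FLOPS per cycle figure quoted in the abstract above can be checked with simple arithmetic. The operation count below assumes that one length-4 inner product plus a scalar costs four multiplies and four additions, and that each unit completes one such operation per cycle; the variable names are illustrative.

```python
VECTOR_LENGTH = 4   # length-4 vectors
SIMD_WIDTH = 4      # four inner product units in parallel

multiplies = VECTOR_LENGTH            # one product per element pair
additions = (VECTOR_LENGTH - 1) + 1   # sum the four products, then add the scalar

flops_per_unit = multiplies + additions
flops_per_cycle = flops_per_unit * SIMD_WIDTH
```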

Reconfigurable simd processor and method for controlling its instruction execution

Inactive | US20100174891A1 | Increase resources | Improve performance | Single instruction multiple data multiprocessors | Specific program execution arrangements | General purpose | Parallel computing

In a reconfigurable SIMD processor, a unit of operation for executing an instruction corresponds to one group, and the one group that includes a plurality of PEs implements at least a part of an operation unit that executes at least one of an integer divide instruction; a floating decimal point add/subtract instruction; a floating decimal point multiply instruction; and a floating decimal point divide instruction, using operation units and general purpose registers provided in a plurality of the PEs. The number of the PEs that compose the one group is varied in accordance with the instruction.

Owner:NEC CORP

© 2025 PatSnap. All rights reserved.