Methods and apparatus for efficient synchronous MIMD operations with IVLIW PE-TO-PE communication

A synchronous MIMD and iVLIW technology, applied in the fields of program control, instruments, and multiple processing units. It addresses two problems: not all algorithms can make efficient use of the parallelism available in the processor, and it is difficult to efficiently synchronize processors so that they cooperate.

Status: Inactive
Publication Date: 2010-09-14
ALTERA CORP
Cites: 8; Cited by: 24

AI Technical Summary

Benefits of technology

[0007] To increase the efficiency of certain algorithms running on the ManArray, it is possible to operate indirectly on VLIW instructions stored in a VLIW memory with the indirect execution initiated by an execute VLIW (XV) instruction and with different VLIW instructions stored in the multiple PEs at the same VLIW memory address. When the SP instruction causes this set of iVLIWs to execute concurrently across all PEs, Synchronous MIMD or SMIMD operation occurs. A one-to-many mapping exists between the XV instruction and the multiple different iVLIWs that exist in each PE. No specialized synchronization mechanism is necessary since the multiple different iVLIW executions are instigated synchronously by the single controlling point SP with the issuance of the XV instruction. Due to the use of a Receive Model to govern communication between PEs and a ManArray network, the communication latency characteristic common to MIMD operations is avoided as discussed further below. Additionally, since there is only one synchronous locus of execution, additional MIMD hardware for separate program flow in each PE is not required. In this way, the machine is organized to support SMIMD operations at a reduced hardware cost while minimizing communication latency.
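
The one-to-many mapping described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the ManArray implementation: the `PE` class, the `xv` function, the register names, and the per-PE operations are all hypothetical, and a VLIW is reduced to a single Python callable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a "VLIW" is modelled as one callable acting on a PE's registers.
Vliw = Callable[[Dict[str, int]], None]


@dataclass
class PE:
    """One processing element with private registers and its own VIM."""
    regs: Dict[str, int] = field(default_factory=dict)
    vim: Dict[int, Vliw] = field(default_factory=dict)


def xv(pes: List[PE], vim_address: int) -> None:
    """Model a single XV issued by the SP.

    Every PE executes whatever VLIW it holds at the same VIM address,
    so one instruction triggers a different operation in each PE.
    """
    for pe in pes:
        pe.vim[vim_address](pe.regs)


pes = [PE(regs={"r0": 10}) for _ in range(4)]

# The same VIM address (27) holds a different, hypothetical operation per PE.
pes[0].vim[27] = lambda r: r.update(r1=r["r0"] + 1)
pes[1].vim[27] = lambda r: r.update(r1=r["r0"] - 1)
pes[2].vim[27] = lambda r: r.update(r1=r["r0"] * 2)
pes[3].vim[27] = lambda r: r.update(r1=r["r0"] // 2)

xv(pes, 27)  # one controlling instruction, four different operations
results = [pe.regs["r1"] for pe in pes]
print(results)  # [11, 9, 20, 5]
```

The model has no per-PE program counter: the only sequencing is the single call to `xv`, which mirrors the single locus of execution the paragraph describes.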

Problems solved by technology

The problem with SIMD machines is that not all algorithms can make efficient use of the parallelism available in the processor.
The amount of parallelism inherent in different algorithms varies, which makes it difficult to implement a wide variety of algorithms efficiently on SIMD machines.
The problem with MIMD machines is the latency of communications between multiple processors, which makes it difficult to synchronize processors efficiently so that they cooperate on the processing of an algorithm.
Typically, MIMD machines also cost more to implement than SIMD machines, since each MIMD PE must have its own instruction sequencing mechanism, which can require a significant amount of hardware.
MIMD machines also have an inherently greater complexity of programming control required to manage the independent parallel processing elements.
Consequently, programming complexity and communication latency arise in a variety of contexts when parallel processing elements are employed.

Examples

example 1-1

Loading Synchronous MIMD iVLIWs into PE VIMs

[0063]

! first load in instructions common to PEs 1, 2, 3
lim.s.h0 SCR1, 1             ! mask off PE0 in order to load in 1, 2, 3
lim.s.h0 VAR, 0              ! load VIM base address reg v0 with zero
lv.p v0, 27, 2, d=, f=       ! load VIM entry v0+27 (=27) with the
                             ! next two instructions; disable no
                             ! instrs; default flag setting to ALU
  li.p.w R1, A1+, A7         ! load instruction into LU
  fmpy.pm.1fw R6, R3, R31    ! mpy instruction into MAU
lv.p v0, 28, 2, d=, f=       ! load VIM entry v0+28 (=28) with the
                             ! next two instructions; disable no
                             ! instrs; default flag setting to ALU
  li.p.w R2, A1+, A7         ! load instruction into LU
  fmpy.pm.1fw R4, R1, R31    ! mpy instruction into MAU
lv.p v0, 29, 2, d=, f=       ! load VIM entry v0+29 (=29) with the
                             ! next two instructions; disable no
                             ! instrs; default flag setting to ALU
  li.p.w R3, A1+, A7         ! load instruction into LU
  fmpy.pm.1fw R5, R2, R31    ! mpy instruction into MAU
! now load in instructions unique to PE0
lim.s.h0 SCR1, 14            ! mask off PEs 1, 2, 3 to load PE0
nop...
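
The masked loading in this listing can be sketched in Python. The bit convention is an assumption inferred from the listing's comments (a value of 1 masks off PE0 and a value of 14 masks off PEs 1, 2, 3, which suggests that bit i set means PE i is masked off). The helper names `unmasked_pes` and `load_vim` are hypothetical, and the PE0-specific entry is a labeled placeholder because the source listing is truncated before it.

```python
from typing import Dict, List, Tuple

NUM_PES = 4  # assumption: a four-PE array, matching PEs 0..3 in the listing


def unmasked_pes(mask: int, num_pes: int = NUM_PES) -> List[int]:
    """Return the PEs that will accept a VIM load.

    Assumed convention: bit i of the mask set means PE i is masked off.
    """
    return [i for i in range(num_pes) if not (mask >> i) & 1]


def load_vim(vims: List[Dict[int, Tuple[str, ...]]], mask: int,
             address: int, instructions: List[str]) -> None:
    """Store the given instructions at `address` in every unmasked PE's VIM."""
    for pe in unmasked_pes(mask):
        vims[pe][address] = tuple(instructions)


vims: List[Dict[int, Tuple[str, ...]]] = [dict() for _ in range(NUM_PES)]

# Mask value 1: PE0 is masked off, so PEs 1, 2, 3 receive the common entry.
load_vim(vims, 1, 27, ["li.p.w R1, A1+, A7", "fmpy.pm.1fw R6, R3, R31"])

# Mask value 14 (binary 1110): PEs 1, 2, 3 are masked off, so only PE0 loads.
# The real PE0 instructions are elided in the source; this is a placeholder.
load_vim(vims, 14, 27, ["<PE0-specific instruction, elided in source>"])

print(unmasked_pes(1))   # [1, 2, 3]
print(unmasked_pes(14))  # [0]
```

After both loads, VIM address 27 holds one entry in PEs 1 to 3 and a different entry in PE0, which is the precondition for the synchronous MIMD execution shown in the next example.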

example 1-2

Executing Synchronous MIMD iVLIWs from PE VIMs

[0064]

! address register, loop, and other setup would be here
. . .
! startup VLIW execution
! f= parameter indicates default to LV flag setting
xv.p v0, 27, e=l, f=         ! execute VIM entry V0+27, LU only
xv.p v0, 28, e=lm, f=        ! execute VIM entry V0+28, LU, MAU only
xv.p v0, 29, e=lm, f=        ! execute VIM entry V0+29, LU, MAU only
xv.p v0, 27, e=lmd, f=       ! execute VIM entry V0+27, LU, MAU, DSU only
xv.p v0, 28, e=lamd, f=      ! execute VIM entry V0+28, all units except SU
xv.p v0, 29, e=lamd, f=      ! execute VIM entry V0+29, all units except SU
xv.p v0, 27, e=lamd, f=      ! execute VIM entry V0+27, all units except SU
xv.p v0, 28, e=lamd, f=      ! execute VIM entry V0+28, all units except SU
xv.p v0, 29, e=lamd, f=      ! execute VIM entry V0+29, all units except SU
! loop body - mechanism to enable looping has been previously set up
loop_begin: xv.p v0, 27, e=slamd, f=   ! execute v0+27, all units
            xv.p v0, 28, e=slamd, f=   ! execute v0+28, all units
loop_end:   xv.p v0, 29, e=slamd, f=   ! execute v0+2...
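
The `e=` fields in this listing enable progressively more execution units before the loop body runs with all of them, which reads like the ramp-up of a software pipeline. A minimal Python sketch of decoding the field follows. The letter-to-unit mapping is an assumption inferred from the listing's comments (`l` for LU, `m` for MAU, `d` for DSU, `a` for ALU, `s` for SU), and `enabled_units` is a hypothetical helper, not part of any ManArray tool.

```python
from typing import List

# Assumed mapping, inferred from the comments in the listing above.
UNIT_BY_LETTER = {"s": "SU", "l": "LU", "a": "ALU", "m": "MAU", "d": "DSU"}


def enabled_units(e_field: str) -> List[str]:
    """Decode an e= field such as 'lamd' into the enabled unit names."""
    return [UNIT_BY_LETTER[letter] for letter in e_field]


# Distinct e= fields in the order they first appear in the start-up sequence.
startup = ["l", "lm", "lmd", "lamd"]
for e_field in startup:
    print(f"e={e_field:<5} -> {enabled_units(e_field)}")

# The loop body enables every unit.
all_units = enabled_units("slamd")
print(all_units)  # ['SU', 'LU', 'ALU', 'MAU', 'DSU']
```

Under this reading, a VIM entry is loaded once with instructions for several units, and each XV chooses at issue time which of those slots actually execute.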

Abstract

A SIMD machine employing a plurality of parallel processing elements (PEs) in which communications hazards are eliminated in an efficient manner. An indirect Very Long Instruction Word instruction memory (VIM) is employed along with execute and delimiter instructions. A masking mechanism may be employed to control which PEs have their VIMs loaded. Further, a receive model of operation is preferably employed. In one aspect, each PE operates to control a switch that selects from which PE it receives. The present invention addresses a better machine organization for execution of parallel algorithms that reduces hardware cost and complexity while maintaining the best characteristics of both SIMD and MIMD machines and minimizing communication latency. This invention brings a level of MIMD computational autonomy to SIMD indirect Very Long Instruction Word (iVLIW) processing elements while maintaining the single thread of control used in the SIMD machine organization. Consequently, the term Synchronous-MIMD (SMIMD) is used to describe the present approach.
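
The receive model mentioned in the abstract, in which each PE controls a switch that selects the PE it receives from, can be sketched as follows. This is an illustrative model only: the function `receive_step`, the four-PE ring pattern, and the data values are hypothetical, and the real network topology is not described in this excerpt.

```python
from typing import List


def receive_step(outputs: List[int], select: List[int]) -> List[int]:
    """Model one communication step under a receive model.

    outputs[i] is the value PE i makes available to the network.
    select[i] is the source PE chosen by PE i's own switch.
    Each PE picks its source itself, so all transfers complete in one step.
    """
    return [outputs[src] for src in select]


outputs = [100, 101, 102, 103]

# Hypothetical pattern: each PE receives from the next PE in a ring.
select = [1, 2, 3, 0]

received = receive_step(outputs, select)
print(received)  # [101, 102, 103, 100]
```

Because the receiver alone decides where its data comes from, two PEs may select the same source without conflict; that is one plausible reading of how this model avoids communication hazards.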

Description

RELATED APPLICATIONS

[0001] The present application is a continuation of Ser. No. 09/187,539, filed on Nov. 6, 1998, now U.S. Pat. No. 6,151,668.

[0002] The present invention claims the benefit of U.S. Provisional Application Ser. No. 60/064,619 entitled “Methods and Apparatus for Efficient Synchronous MIMD VLIW Communication” and filed Nov. 7, 1997.

FIELD OF THE INVENTION

[0003] For any Single Instruction Multiple Data stream (SIMD) machine with a given number of parallel processing elements, there will exist algorithms which cannot make efficient use of the available parallel processing elements, or in other words, the available computing resources. Multiple Instruction Multiple Data stream (MIMD) class machines execute some of these algorithms with more efficiency but require additional hardware to support a separate instruction stream on each processor and lose performance due to communication latency with lightly coupled program implementations. The present invention addresses a bette...

Application Information

IPC(8): G06F15/16, G06F9/38, G06F15/80
CPC: G06F9/30145, G06F9/30167, G06F9/3802, G06F9/3853, G06F9/3885, G06F9/3887, G06F9/3889, G06F9/00
Inventors: PECHANEK, GERALD GEORGE; DRABENSTOTT, THOMAS L.; REVILLA, JUAN GUILLERMO; STRUBE, DAVID; MORRIS, GRAYSON
Owner: ALTERA CORP