A data processor comprises a plurality of
processing elements (PEs), with memory local to at least one of the
processing elements, and a data packet-switched network interconnecting the
processing elements and the memory to enable any of the PEs to access the memory. The network consists of nodes arranged linearly or in a grid, e.g., in a
SIMD array, so as to connect the PEs and their
local memories to a common controller. Transaction-enabled PEs and nodes set flags, which are maintained until the transaction is completed and
which signal status to the controller, e.g., over a series of OR-gates. The processor performs memory accesses on data stored in the memory in response to control signals sent by the controller to the memory. The
local memories share the same
memory map or space. External memory may also be connected to the “end” nodes
interfacing with the network, e.g., to provide a cache. One or more further processors may similarly be connected to the network so that all the PE memories from all the processors share the same
memory map or space. The packet-switched network supports multiple concurrent transfers between PEs and memory. Memory accesses include block and/or broadcast read and write operations, in which data can be replicated within the nodes and, according to the operation, written into the
shared memory or into the local PE memory.
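
The arrangement described above can be illustrated with a minimal software sketch. It is not the claimed hardware: it assumes a linear chain of nodes, a fixed local memory size, and a simple address split of node index and local offset. The names `LinearNetwork`, `Node`, `Packet`, `LOCAL_WORDS`, `issue_write`, `step`, `busy` and `broadcast_write` are all hypothetical and chosen only for illustration. Each node's flag is modelled as a count of outstanding transactions, and the controller's status signal as a logical OR over those flags.

```python
from dataclasses import dataclass, field

LOCAL_WORDS = 4  # assumed words of local memory per PE (illustrative value)


@dataclass
class Packet:
    source: int    # index of the node that issued the transaction
    position: int  # node the packet currently occupies
    address: int   # global address in the shared memory map
    value: int     # data word carried by the packet


@dataclass
class Node:
    memory: list = field(default_factory=lambda: [0] * LOCAL_WORDS)
    pending: int = 0  # flag is "set" while pending > 0


class LinearNetwork:
    """Hypothetical model: nodes in a line, one shared address space."""

    def __init__(self, num_nodes):
        self.nodes = [Node() for _ in range(num_nodes)]
        self.in_flight = []

    def issue_write(self, source, address, value):
        """Inject a write packet at `source` and set that node's flag."""
        self.nodes[source].pending += 1
        self.in_flight.append(Packet(source, source, address, value))

    def step(self):
        """Move every in-flight packet one hop; deliver those that arrive."""
        remaining = []
        for packet in self.in_flight:
            target, offset = divmod(packet.address, LOCAL_WORDS)
            if packet.position != target:
                packet.position += 1 if target > packet.position else -1
            if packet.position == target:
                self.nodes[target].memory[offset] = packet.value
                self.nodes[packet.source].pending -= 1  # transaction done
            else:
                remaining.append(packet)
        self.in_flight = remaining

    def busy(self):
        """Controller view: OR of all node flags."""
        return any(node.pending > 0 for node in self.nodes)

    def read(self, address):
        """Read a word through the shared memory map (no routing modelled)."""
        node, offset = divmod(address, LOCAL_WORDS)
        return self.nodes[node].memory[offset]

    def broadcast_write(self, offset, value):
        """Replicate one word into the same offset of every local memory."""
        for node in self.nodes:
            node.memory[offset] = value
```

In this sketch, two writes issued from opposite ends of a four-node chain travel concurrently, and `busy()` remains true until both have been delivered. A real implementation would also need arbitration, buffering and read replies, none of which are modelled here.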