Improved method and apparatus for parallel processing. One embodiment provides a multiprocessor computer system that includes a first node controller and a second node controller, a number of processors connected to each node controller, a memory connected to each node controller, a first input/output system connected to the first node controller, and a communications network connected between the node controllers. The first node controller includes a crossbar unit to which are connected a memory port, an input/output port, a network port, and a plurality of independent processor ports. A first and a second processor port are connected between the crossbar unit and a first subset and a second subset, respectively, of the processors. In some embodiments of the system, the first node controller is fabricated onto a single integrated-circuit chip. Optionally, the memory is packaged on pluggable memory/directory cards, wherein each card includes a plurality of memory chips including a first subset dedicated to holding memory data and a second subset dedicated to holding directory data. Further, the memory port includes a memory data port, including a memory data bus and a memory address bus coupled to the first subset of memory chips, and a directory data port, including a directory data bus and a directory address bus coupled to the second subset of memory chips. In some such embodiments, the ratio of (memory data space) to (directory data space) on each card is set to a value that is based on a size of the multiprocessor computer system.
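
By way of illustration only, the port arrangement and the size-dependent ratio described above can be sketched as follows. The class name NodeController, the function memory_to_directory_ratio, and the sizing rule itself (one presence bit per node plus a fixed number of state bits per cache line, with a 128-byte line) are assumptions introduced for this sketch. The abstract states only that the ratio is based on system size; it does not specify how.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class NodeController:
    """Hypothetical model of one node controller built around a crossbar unit."""
    name: str
    # Each processor port serves its own subset of the node's processors.
    processor_ports: list = field(default_factory=list)

    def crossbar_ports(self) -> list:
        # Ports named in the abstract: memory, input/output, network,
        # plus one independent processor port per processor subset.
        ports = ["memory", "io", "network"]
        ports += [f"processor{i}" for i in range(len(self.processor_ports))]
        return ports


def memory_to_directory_ratio(num_nodes: int,
                              line_bytes: int = 128,
                              state_bits: int = 8) -> Fraction:
    """Ratio of memory data space to directory data space per cache line.

    ASSUMPTION: a bit-vector directory with one presence bit per node and
    `state_bits` of fixed state per line. This rule is illustrative and is
    not taken from the source.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    directory_bits = num_nodes + state_bits
    return Fraction(line_bytes * 8, directory_bits)


if __name__ == "__main__":
    node = NodeController("node0", [["cpu0", "cpu1"], ["cpu2", "cpu3"]])
    print(node.crossbar_ports())
    # Under the assumed rule, a larger system needs more directory bits per
    # line, so the memory-to-directory ratio on each card shrinks.
    for nodes in (8, 56, 120):
        print(nodes, memory_to_directory_ratio(nodes))
```

Under this assumed rule, a card intended for a small system would dedicate relatively few chips to directory data, while a card intended for a large system would dedicate proportionally more.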