Embodiments of both a non-volatile main memory (NVMM) 
single node and a multi-node computing 
system are disclosed. One embodiment of the NVMM 
single node system has a cache subsystem composed entirely of 
DRAM and a large main memory subsystem composed entirely of NAND flash, and provides different address-mapping policies for each 
software application. The NVMM 
memory controller provides high, sustained bandwidths for 
client processor requests by managing the 
DRAM cache as a large, highly banked 
system with multiple ranks and multiple 
DRAM channels, and with large cache blocks to accommodate large NAND flash pages. Multi-node systems organize the NVMM single nodes in a large, interconnected, low-latency cache/flash main memory network. The entire interconnected flash system exports a single 
address space to the 
client processors and, like a unified cache, the flash system is shared in a way that allows it to be divided unevenly among its 
client processors: client processors that need more memory resources receive them at the expense of processors that need less storage. Multi-node systems have numerous configurations, from board-area networks to multi-board networks, with all nodes connected in various Moore graph topologies. Overall, the disclosed 
memory architecture dissipates less power per GB than traditional DRAM architectures, provides an extremely large 
solid-state capacity of a terabyte or more of main memory per 
CPU socket, and offers a cost per bit approaching that of 
NAND flash memory with performance approaching that of an all-DRAM system.
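
As an illustration of the Moore graph topologies mentioned above, the following minimal sketch computes the Moore bound (the largest node count possible for a given per-node link count and network diameter) and checks that the 10-node Petersen graph attains it. The Petersen graph is a well-known Moore graph chosen here only as an example; the abstract does not state which Moore graphs the embodiments use, and the function names are hypothetical.

```python
from collections import deque


def moore_bound(degree, diameter):
    """Upper bound on node count for a graph of this degree and diameter."""
    return 1 + degree * sum((degree - 1) ** i for i in range(diameter))


def petersen_graph():
    """Adjacency sets of the Petersen graph (illustrative 10-node topology)."""
    adj = {n: set() for n in range(10)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for i in range(5):
        link(i, (i + 1) % 5)          # outer 5-cycle
        link(i, i + 5)                # spoke from outer node to inner node
        link(i + 5, (i + 2) % 5 + 5)  # inner pentagram
    return adj


def diameter(adj):
    """Longest shortest-path hop count, via BFS from every node.

    Assumes the graph is connected.
    """
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    g = petersen_graph()
    links_per_node = {len(neighbors) for neighbors in g.values()}
    print("nodes:", len(g))
    print("links per node:", links_per_node)
    print("diameter:", diameter(g))
    print("Moore bound for degree 3, diameter 2:", moore_bound(3, 2))
```

In this example, each node needs only three links, yet any node reaches any other in at most two hops, which is the property that makes such topologies attractive for a low-latency memory network.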