To improve the performance of a mask ROM, the bit line is divided into multiple segments to reduce its capacitance, and multi-stage sense amps are used for reading: a local sense amp receives the output of a memory cell through the bit line, and a global sense amp receives the output of the local sense amp. The sense amps convert a voltage difference on the bit line into a time difference that distinguishes data “1” from data “0”. For example, data “1” is transferred quickly to an output latch circuit through the high-gain sense amps, whereas data “0” is rejected by a locking signal that is generated from data “1” as a reference signal. Furthermore, a buffered data path, which includes a forwarding write line and a returning read line, is used for transferring data. Additionally, alternative circuits and memory cell structures for implementing the mask ROM are described.
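
The time-domain sensing described above can be illustrated with a small behavioral sketch. It is not the disclosed circuit: the abstract gives no delay values, so the names `SenseTiming`, `read_bit`, and all timing numbers below are hypothetical. The sketch assumes that the output latch is reset to “0” before each read, that a reference path behaving like a data “1” cell generates the locking signal after a small margin, and that only data arriving before the lock is latched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenseTiming:
    """Hypothetical delays; none of these values come from the source."""
    fast_delay_ns: float = 1.0   # assumed arrival time of data "1" at the latch
    slow_delay_ns: float = 3.0   # assumed arrival time of data "0" at the latch
    lock_margin_ns: float = 0.5  # assumed delay between reference "1" and the lock


def read_bit(stored_bit: int, timing: SenseTiming = SenseTiming()) -> int:
    """Return the latched value for one cell under this toy timing model."""
    # The reference path is assumed to behave like a cell storing "1",
    # so the locking signal fires shortly after a "1" would arrive.
    lock_time_ns = timing.fast_delay_ns + timing.lock_margin_ns

    # The voltage difference on the bit line is modeled only as a
    # difference in arrival time at the output latch.
    if stored_bit == 1:
        arrival_ns = timing.fast_delay_ns
    else:
        arrival_ns = timing.slow_delay_ns

    latch = 0  # latch assumed reset to "0" before the read
    if arrival_ns < lock_time_ns:
        latch = 1  # data arrived before the lock and is captured
    return latch  # data arriving after the lock is rejected
```

In this model a read is correct only while the lock margin is smaller than the gap between the fast and slow arrival times; if the slow path arrives before the lock, a stored “0” is misread as “1”.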