Memory controller, system on chip, and electronic device including address mapping table
By introducing a memory request queue, address converter, and physical layer into the memory controller and optimizing the data transmission path using multiple address mapping tables, the data transmission latency problem caused by the separation of the processor and memory in three-dimensional memory devices is solved, and data processing efficiency is improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SAMSUNG ELECTRONICS CO LTD
- Filing Date
- 2020-10-19
- Publication Date
- 2026-06-23
Abstract
Description
[0001] Cross-reference to related applications
[0002] This application is based on and claims priority to Korean Patent Application No. 10-2020-0005882, filed with the Korean Intellectual Property Office on January 16, 2020, the entire disclosure of which is incorporated herein by reference.
Technical Field
[0003] The present invention relates to a memory controller, system-on-a-chip, and electronic device capable of storing multiple address mapping tables.
Background Technology
[0004] Recently, multiple semiconductor dies have been stacked to increase the integration density of memory devices. This allows for high-speed processing of large amounts of data in three-dimensional memory devices. To achieve this three-dimensional structure, through-silicon vias (TSVs) can be used to stack multiple semiconductor dies. However, even with increased data processing speeds, the separation of the processor and memory leads to latency in data transfer between them. To address this issue, processing-in-memory (PIM), which integrates both the processor and memory, has become a focus of attention.
Summary of the Invention
[0005] Embodiments of the present invention provide a memory controller, system-on-a-chip, and electronic device capable of storing multiple address mapping tables.
[0006] According to an exemplary aspect of this disclosure, a memory controller is provided, comprising: a memory request queue configured to store memory requests associated with a memory device including a first memory die and a second memory die having a shared channel; an address translator configured to: select one of a first address mapping table and a second address mapping table based on bits of the physical address of the memory request, and translate the physical address into a memory address for the first memory die and the second memory die based on the selected address mapping table in the first address mapping table and the second address mapping table; and a physical layer configured to send the memory address to the memory device via the shared channel.
[0007] According to another exemplary aspect of this disclosure, a system-on-chip is provided, comprising: a processor configured to generate a memory request; and a memory controller configured to: select one of a first address mapping table and a second address mapping table based on bits of a physical address of the memory request; translate the physical address into a memory address of a memory device based on the selected one of the first and second address mapping tables; and access one of a first memory die and a second memory die via a shared channel based on the memory address.
[0008] According to another exemplary aspect of this disclosure, an electronic device is provided, comprising: a memory device including a first memory die, a second memory die, and a shared channel for the first memory die and the second memory die; and a system-on-chip including: a processor configured to generate a memory request; and a memory controller configured to: select one of a first address mapping table and a second address mapping table based on bits of the physical address of the memory request; translate the physical address into a memory address of the memory device based on the selected one of the first address mapping table and the second address mapping table; and access one of the first memory die and the second memory die via the shared channel based on the memory address.
[0009] According to another exemplary aspect of this disclosure, an address converter circuit is provided, comprising: an address range register configured to store bits of a physical address of a memory request; a plurality of address conversion circuits, each configured to convert the physical address to a memory address based on one of a plurality of address mapping tables; and a mapping selection circuit configured to select one of the plurality of address conversion circuits based on the values of the bits stored in the address range register.
Attached Figure Description
[0010] The above and other aspects, objects, and features of the inventive concept will become clear from the exemplary embodiments of the inventive concept illustrated in detail with reference to the accompanying drawings.
[0011] Figure 1 shows an electronic device according to an example embodiment of the present invention.
[0012] Figure 2 shows an electronic device according to another example embodiment of the present invention.
[0013] Figure 3 shows a block diagram of the memory controller of the system-on-a-chip of Figure 1 and Figure 2.
[0014] Figure 4 shows a block diagram of the address converter of Figure 3. Figure 5 shows a block diagram of the system-on-a-chip of Figure 1 and Figure 2.
[0015] Figure 6 illustrates the memory device of Figure 1 and Figure 2 in detail.
[0016] Figure 7 illustrates the memory device of Figure 1 and Figure 2 in detail.
[0017] Figure 8 shows an example in which a physical address is translated into a memory address by the memory controller of Figure 5.
[0018] Figure 9 shows a block diagram of the memory bank group of Figure 6. Figure 10 shows a block diagram of the PIM core of Figure 6.
[0019] Figure 11 and Figure 12 show examples of data arranged in a PIM die according to an address mapping table.
[0020] Figure 13 shows an example of a computation performed by the PE according to the data arrangement of Figure 12.
[0021] Figure 14 shows an example in which the processor of Figure 5 accesses the memory controller and the memory controller accesses the memory device.
[0022] Figure 15 shows an electronic device according to another example embodiment of the present invention.
Detailed Implementation
[0023] Figure 1 shows an electronic device 100a according to an example embodiment of the present invention. The electronic device 100a may include a system-on-a-chip (SoC) 1000, a memory device 2000, and an interposer 3000. The electronic device 100a may also be referred to as a “computing system” or an “electronic system”.
[0024] The system-on-chip 1000 can execute applications supported by the electronic device 100a by using the memory device 2000. The system-on-chip 1000 may also be referred to as a "host" or "application processor (AP)". The system-on-chip 1000 may include a memory controller 1100 that controls the memory device 2000 and performs operations to input data to and/or output data from the memory device 2000. For example, the memory controller 1100 may access the memory device 2000 in a direct memory access (DMA) manner. The memory controller 1100 may include a physical layer (PHY) 1130 that is electrically connected to the PHY 2930 of the memory device 2000 via the interposer 3000.
[0025] Memory device 2000 may include processing-in-memory (PIM) dies 2100 to 2800 and a buffer die 2900. Each of the PIM dies 2100 to 2800 may also be referred to as a "memory die," "core die," "function-in-memory (FIM) die," or "controller die," and buffer die 2900 may also be referred to as an "interface die," "logic die," or "controller die." A die may also be referred to as a "chip." PIM die 2100 may be stacked on buffer die 2900, and PIM die 2200 may be stacked on PIM die 2100. Memory device 2000 may have a three-dimensional memory structure in which multiple dies 2100 to 2900 are stacked. To stack dies 2100 to 2900, memory device 2000 may include through-silicon vias (TSVs) penetrating dies 2100 to 2900, and microbumps (BPs) electrically connecting the TSVs. The TSVs and BPs can provide electrical and physical paths between dies 2100 to 2900 in memory device 2000. The number of TSVs and BPs is not limited to the example shown in Figure 1.
[0026] The memory device 2000 may relate to a PIM or FIM and may perform data processing operations in addition to reading and writing data. The memory device 2000 may correspond to a computing memory device including random access memory (RAM) and processing elements (PE) integrated in the same die. Each of the PIM dies 2100 to 2800 of the memory device 2000 may include a memory cell array (MCA) and a processing element (PE) for performing data processing operations, wherein the memory cell array (MCA) is used for reading and writing data and includes multiple memory cells. For example, the PE may also be referred to as a "processor" or "processing circuitry".
[0027] Stack identifier SID0 can be assigned to PIM dies 2100 to 2400, and stack identifier SID1 can be assigned to PIM dies 2500 to 2800. Stack identifiers SID0 and SID1 can be used to identify or distinguish the multiple PIM dies 2100 to 2800 stacked on buffer die 2900. For example, memory controller 1100 can access PIM dies 2100 to 2400 by using stack identifier SID0, or can access PIM dies 2500 to 2800 by using stack identifier SID1. The total number of PIM dies 2100 to 2800 is not limited to the example shown in Figure 1. Additionally, the number of PIM dies 2100 to 2400 for stack identifier SID0 and/or the number of PIM dies 2500 to 2800 for stack identifier SID1 is not limited to the example shown in Figure 1.
[0028] Buffer die 2900 can operate as an interface circuit between memory controller 1100 and PIM dies 2100 to 2800. Buffer die 2900 can receive commands, data, signals, etc., sent from memory controller 1100 via interposer 3000, and can transmit the received commands, data, signals, etc., to PIM dies 2100 to 2800 via the through-silicon vias (TSVs) and microbumps (BPs). Buffer die 2900 can also receive data output from PIM dies 2100 to 2800 via the TSVs and BPs, and can transmit the received data to memory controller 1100 via interposer 3000. Buffer die 2900 may include a PHY 2930, buffer circuitry, or interface circuitry that receives and amplifies the aforementioned signals.
[0029] In an example embodiment, memory device 2000 may be a general-purpose dynamic random access memory (DRAM) such as double data rate synchronous dynamic random access memory (DDR SDRAM), a mobile DRAM device such as low power double data rate (LPDDR) SDRAM, a graphics DRAM device such as graphics double data rate (GDDR) synchronous graphics random access memory (SGRAM), or a DRAM device that provides high capacity and high bandwidth such as wide I / O, high bandwidth memory (HBM), HBM2, HBM3, or hybrid memory cube (HMC).
[0030] The interposer 3000 connects the system-on-chip 1000 and the memory device 2000. The interposer 3000 provides a physical path connecting the PHY 2930 of the memory device 2000 and the PHY 1130 of the system-on-chip 1000, and is formed of a conductive material for electrical connection. A substrate or printed circuit board (PCB) can be used instead of the interposer 3000.
[0031] Figure 2 shows an electronic device 100b according to another example embodiment of the present invention. Electronic device 100b may include a system-on-chip 1000 and a memory device 2000. Whereas the system-on-chip 1000 and memory device 2000 in electronic device 100a may be interconnected via an interposer 3000, the memory device 2000 of electronic device 100b may be stacked on the system-on-chip 1000. The system-on-chip 1000 may also include a through-silicon via (TSV) for providing an electrical connection to the memory device 2000, and the PHY 1130 of the system-on-chip 1000 and the PHY 2930 of the memory device 2000 may be electrically interconnected via microbumps (BPs).
[0032] Figure 3 shows a block diagram of the memory controller of the system-on-chip of Figure 1 and Figure 2. The memory controller 1100 may include a memory request queue 1110, an address translator 1120, and a PHY 1130. For example, the memory request queue 1110, the address translator 1120, and the PHY 1130, as well as other components of the memory controller 1100, may be implemented in hardware.
[0033] Memory request queue 1110 can receive and store memory requests generated within the system-on-chip 1000. A memory request associated with memory device 2000 can request an operation on memory device 2000 (e.g., a read operation, write operation, refresh operation, or processing operation) and can include a physical address of memory device 2000. The physical address can be used to access memory device 2000 and, according to an example embodiment, unlike a virtual address, can be limited based on the capacity of memory device 2000. The rate at which memory requests are generated at the system-on-chip 1000 can be higher than the rate at which memory device 2000 processes memory requests, so memory request queue 1110 can store multiple memory requests.
[0034] Address translator 1120 can translate the physical address PA of a memory request stored in memory request queue 1110 into a memory address MA. The memory address MA can be used to access memory device 2000 and can indicate a specific region of memory device 2000. According to an example embodiment, the specific region can be a specific die, or a register or memory cell within a specific die. Address translator 1120 can translate the physical address PA into the memory address MA based on multiple address mapping tables. For example, the multiple address mapping tables may include a first address mapping table AMT1 and a second address mapping table AMT2.
[0035] For example, address translator 1120 can select one of the address mapping tables AMT1 and AMT2 based on the logical value (e.g., 0 or 1) of a bit of physical address PA, which corresponds to the stack identifier SID of memory address MA, and can translate physical address PA into memory address MA based on the selected address mapping table. In the example embodiment shown in Figure 3, the number of address mapping tables is two, namely AMT1 and AMT2. Thus, the number of bits used to select between the address mapping tables AMT1 and AMT2 is one. However, the number of address mapping tables is not limited to two and can be greater than two, in which case the number of bits used to select an address mapping table can be greater than one. As another example, the most significant bit (MSB) of the physical address PA can be used to select one of the address mapping tables AMT1 and AMT2.
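The relationship between the number of address mapping tables and the number of selection bits can be sketched as follows. This is a minimal illustration, not the disclosed circuit: the names `selector_width` and `select_table`, the 30-bit physical address width, and the assumptions that the selection bits are the uppermost bits of the physical address and that selector value i picks the i-th table are all hypothetical.

```python
def selector_width(num_tables):
    """Number of physical-address bits needed to choose among num_tables tables."""
    return (num_tables - 1).bit_length()


def select_table(pa, pa_width, tables):
    """Pick a table using the uppermost selector bits of the physical address.

    Assumes (hypothetically) that len(tables) is a power of two and that
    selector value i picks tables[i].
    """
    m = selector_width(len(tables))
    index = pa >> (pa_width - m)
    return tables[index]


PA_WIDTH = 30  # hypothetical physical-address width in bits
tables = ["AMT1", "AMT2"]
low_half = select_table(0x0000_1000, PA_WIDTH, tables)   # MSB (bit 29) is 0
high_half = select_table(0x2000_1000, PA_WIDTH, tables)  # MSB (bit 29) is 1
```

With two tables a single bit suffices; with four tables the same helper would consume two bits.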
[0036] According to an example embodiment, address converter 1120 can examine a first logical value of the bit of physical address PA, where the first logical value corresponds to the stack identifier SID of memory address MA, and can select address mapping table AMT1. Address converter 1120 can partition or classify the bits of physical address PA into fields F1 to F6 based on address mapping table AMT1. The bits of field F1 can be higher-order bits than the bits of field F6. For example, the bits of field F1 can include the MSB of physical address PA. Address mapping table AMT1 can map the bits of field F1 to the stack identifier SID, the bits of field F2 to the row address, the bits of field F3 to the column address, the bits of fields F4 and F5 to the bank addresses BA0 to BA3, and the bits of field F6, which are lower-order bits than the bits of fields F1 to F5, to the cache line CL. Based on address mapping table AMT1, address converter 1120 can accordingly translate the bits of field F1 into the stack identifier SID, the bits of field F2 into the row address, the bits of field F3 into the column address, and the bits of fields F4 and F5 into the bank addresses BA0 to BA3. The memory address MA can include the stack identifier SID, the bank addresses BA0 to BA3, the row address, and the column address. The stack identifier SID can be used to identify the PIM dies 2100 to 2800 described with reference to Figure 1 and Figure 2. The bank addresses BA0 to BA3 can be used to identify a memory bank of the memory cell array MCA of each of the PIM dies 2100 to 2800. The row and column addresses can be used to identify memory cells within the memory bank. The cache line CL can correspond to a cache unit of the system-on-chip 1000 and can include data associated with memory device 2000 (e.g., read data read from memory device 2000 or write data to be written to memory device 2000). The bits of fields F1 to F5 can constitute the memory address MA.
Address mapping table AMT1 can map the high-order bits of physical address PA to the stack identifier SID, the row address Row, and the column address Column of memory address MA, and can map the low-order bits of physical address PA to the bank addresses BA0 to BA3 of memory address MA. When address mapping table AMT1 is selected, the bits of physical address PA that are translated into the bank addresses BA0 to BA3 can be lower-order bits than the bits of physical address PA that are translated into the row address and the column address. The number of bits included in each of fields F1 to F6 can be one or more.
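The field layout described for address mapping table AMT1 can be illustrated with a short sketch. The field widths below (a 1-bit SID, 14-bit row, 5-bit column, 4 bank bits, and a 6-bit cache line offset) and the names `take_low_bits` and `translate_amt1` are assumptions chosen only for illustration; the description states only that each field has one or more bits.

```python
# Hypothetical field widths in bits; the text only says each field has one or more bits.
SID_BITS = 1    # F1: stack identifier
ROW_BITS = 14   # F2: row address
COL_BITS = 5    # F3: column address
BANK_BITS = 4   # F4 and F5 together: bank address bits BA0 to BA3
CL_BITS = 6     # F6: offset inside a 64-byte cache line


def take_low_bits(value, width):
    """Return (low `width` bits of value, value shifted right by `width`)."""
    return value & ((1 << width) - 1), value >> width


def translate_amt1(pa):
    """Split PA laid out as [F1 SID | F2 row | F3 column | F4,F5 bank | F6 CL]."""
    _, rest = take_low_bits(pa, CL_BITS)          # F6 is not part of the memory address
    bank, rest = take_low_bits(rest, BANK_BITS)
    column, rest = take_low_bits(rest, COL_BITS)
    row, rest = take_low_bits(rest, ROW_BITS)
    sid, _ = take_low_bits(rest, SID_BITS)
    return {"sid": sid, "bank": bank, "row": row, "column": column}


# Consecutive 64-byte cache lines land in consecutive banks under this layout.
banks = [translate_amt1(line * 64)["bank"] for line in range(4)]
```

Because the bank bits sit directly above the cache line offset in this layout, sequential accesses are spread across banks before the column or row changes.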
[0037] According to an example embodiment, address converter 1120 can examine a second logical value of the bit of physical address PA, where the second logical value corresponds to the stack identifier SID of memory address MA, and can select address mapping table AMT2. Address converter 1120 can divide or classify the bits of physical address PA into fields F1 to F7 based on address mapping table AMT2. Address mapping table AMT2 can map the bits of field F1 to the stack identifier SID, the bits of field F3 to the row address, the bits of field F5 to the column address, the bits of fields F2, F4, and F6 to the bank addresses BA0 to BA3, and the bits of field F7, which are lower-order bits than the bits of fields F1 to F6, to the cache line CL. Based on address mapping table AMT2, address converter 1120 can accordingly translate the bits of field F1 into the stack identifier SID, the bits of field F3 into the row address, the bits of field F5 into the column address, and the bits of fields F2, F4, and F6 into the bank addresses BA0 to BA3. The bits of fields F1 to F6 can constitute the memory address MA. When address mapping table AMT2, instead of AMT1, is selected, at least one bit of physical address PA that is translated into the bank addresses BA0 to BA3 (BA1 in Figure 3, but not limited thereto) can be a higher-order bit than the bits of physical address PA that are translated into the column address Column. Furthermore, at least one bit of physical address PA that is translated into the bank addresses BA0 to BA3 (BA0 in Figure 3, but not limited thereto) can be a higher-order bit than the bits of physical address PA that are translated into the row address Row.
For example, the bits of physical address PA that are translated into the row address Row or the column address Column can be located between the bits of physical address PA that are translated into the bank addresses BA0 to BA3. The position, within physical address PA, of the bits of field F1 that are translated into the stack identifier SID through address mapping table AMT1 can be the same as the position of the bits of field F1 that are translated into the stack identifier SID through address mapping table AMT2. Likewise, the position, within physical address PA, of the bits of field F6 that correspond to the cache line CL according to address mapping table AMT1 can be the same as the position of the bits of field F7 that correspond to the cache line CL according to address mapping table AMT2.
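The interleaved layout described for address mapping table AMT2 can be sketched in a table-driven form. The field widths, the name `translate`, and the order in which the bank bits of fields F2, F4, and F6 are concatenated (most significant first, in order of appearance) are assumptions made for illustration only.

```python
# Hypothetical AMT2 layout, most significant field first: (target, width in bits).
AMT2_LAYOUT = [
    ("sid", 1),      # F1
    ("bank", 1),     # F2: bank bit placed above the row bits
    ("row", 14),     # F3
    ("bank", 1),     # F4: bank bit placed above the column bits
    ("column", 5),   # F5
    ("bank", 2),     # F6: remaining bank bits
    ("cl", 6),       # F7: cache line offset, not part of the memory address
]


def translate(pa, layout):
    """Translate a physical address into memory-address fields using a layout."""
    shift = sum(width for _, width in layout)
    fields = {}
    for name, width in layout:
        shift -= width
        bits = (pa >> shift) & ((1 << width) - 1)
        # Fields with the same target are concatenated in order of appearance.
        fields[name] = (fields.get(name, 0) << width) | bits
    fields.pop("cl", None)
    return fields


# Walk six consecutive 64-byte cache lines and record (bank, column).
walk = []
for line in range(6):
    ma = translate(line * 64, AMT2_LAYOUT)
    walk.append((ma["bank"], ma["column"]))
```

Under these assumed widths, sequential cache lines cycle through only four banks before the column advances, so more consecutive data stays within a smaller set of banks than in a layout where all bank bits are the lowest bits of the memory address.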
[0038] According to the above example embodiment, address converter 1120 checks the first logical value of the bits of physical address PA and selects address mapping table AMT1, and address converter 1120 checks the second logical value of the bits of physical address PA and selects address mapping table AMT2. However, this disclosure is not limited thereto. For example, according to another example embodiment, address converter 1120 may check the second logical value of the bits of physical address PA and select address mapping table AMT1, and address converter 1120 may check the first logical value of the bits of physical address PA and select address mapping table AMT2.
[0039] The mappings in address mapping table AMT1 and address mapping table AMT2 can be different from each other and can be independent of each other. For example, the positions of the bits of physical address PA that are translated into the bank addresses BA0 to BA3 of memory address MA through address mapping table AMT1 can be different from the positions of the bits of physical address PA that are translated into the bank addresses BA0 to BA3 through address mapping table AMT2. In other words, the positions within physical address PA of the bits corresponding to the bank addresses BA0 to BA3 can change depending on which of the address mapping tables AMT1 and AMT2 is used.
[0040] Address converter 1120 can simultaneously support the different address mapping tables AMT1 and AMT2. The way data is arranged in memory device 2000 when address mapping table AMT1 is selected can differ from the way data is arranged in memory device 2000 when address mapping table AMT2 is selected. For example, memory controller 1100 can select address mapping table AMT1 for a PIM die, among the PIM dies 2100 to 2800, that is intended to perform read or write operations without executing its PE (e.g., the PIM dies 2100 to 2400 with SID0 in Figure 1 and Figure 2). For example, memory controller 1100 can select address mapping table AMT2 for a PIM die, among the PIM dies 2100 to 2800, that is intended to execute its PE (e.g., the PIM dies 2500 to 2800 with SID1 in Figure 1 and Figure 2). Depending on which address mapping table is selected, the data arrangement in a PIM die where only read or write operations are performed can differ from the data arrangement in a PIM die where the PE is executed. Therefore, when the memory controller 1100 according to an embodiment of the present invention dynamically selects one of the address mapping tables AMT1 and AMT2 based on whether the PE is executed in the PIM die, the memory controller 1100 can arrange data to suit read or write operations in a PIM die where the PE is not executed, and can arrange data to suit the PE in a PIM die where the PE is executed.
[0041] PHY 1130 can access memory device 2000 based on memory address MA of address translator 1120. PHY 1130 can also be referred to as a "memory interface circuit". For example, PHY 1130 can generate and output command and address signals CA based on memory requests in memory request queue 1110 and memory address MA of address translator 1120. PHY 1130 can send memory commands and memory address MA based on memory requests to memory device 2000. PHY 1130 can change the logic values of command and address signals CA differently according to memory requests in memory request queue 1110 and memory address MA of address translator 1120. PHY 1130 can generate and output data input / output signals DQ based on memory requests in memory request queue 1110, or it can receive data input / output signals DQ sent from memory device 2000. Data input / output signals DQ can include write data to be written to memory device 2000 or read data to be read from memory device 2000.
[0042] Command and address signals CA and data input / output signals DQ can be provided for each of channels CH1 to CH4. For example, memory controller 1100 can access PIM dies 2100 and 2500 through channel CH1, PIM dies 2200 and 2600 through channel CH2, PIM dies 2300 and 2700 through channel CH3, and PIM dies 2400 and 2800 through channel CH4. PIM dies 2100 and 2500 can share channel CH1, PIM dies 2200 and 2600 can share channel CH2, PIM dies 2300 and 2700 can share channel CH3, and PIM dies 2400 and 2800 can share channel CH4.
[0043] The memory controller 1100 can select one of a plurality of PIM dies allocated to a channel by using the stack identifier SID of the memory address MA. The memory controller 1100 can access one of the plurality of PIM dies allocated to a channel based on the memory address MA. For example, when the stack identifier SID has a first logic value (i.e., SID0), the command and address signals CA and data input/output signals DQ transmitted through channels CH1 to CH4 can be associated with PIM dies 2100 to 2400. For example, when the stack identifier SID has a second logic value (i.e., SID1), the command and address signals CA and data input/output signals DQ transmitted through channels CH1 to CH4 can be associated with PIM dies 2500 to 2800. The number of PIM dies allocated to each channel, the number of channels, the number of channels allocated to a single PIM die, etc., are not limited to the example shown in Figure 3. For example, a portion of the bits of the physical address PA (e.g., higher-order bits above field F1) can indicate which of the channels CH1 to CH4 the memory address MA is associated with, and can be used to distinguish the channels CH1 to CH4 from one another.
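The channel sharing described above can be summarized in a small lookup. The die and channel numbers are taken from the description of Figure 3; the names `DIES_BY_CHANNEL` and `target_die`, and the use of the SID value as an index, are hypothetical conveniences for this sketch.

```python
# Each channel is shared by one die with SID0 and one die with SID1,
# following the assignment described for Figure 3.
DIES_BY_CHANNEL = {
    "CH1": (2100, 2500),
    "CH2": (2200, 2600),
    "CH3": (2300, 2700),
    "CH4": (2400, 2800),
}


def target_die(channel, sid):
    """Return the reference numeral of the PIM die addressed on a shared channel."""
    return DIES_BY_CHANNEL[channel][sid]


same_channel = (target_die("CH1", 0), target_die("CH1", 1))
```

The same channel thus reaches two different dies, and only the stack identifier carried in the memory address decides which one responds.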
[0044] Figure 4 shows a block diagram of the address converter of Figure 3. The address converter 1120 may include an address range register 1121, a mapping selection circuit 1122, and address translation circuits 1123_1 to 1123_2^M. For example, the address range register 1121, the mapping selection circuit 1122, the address translation circuits 1123_1 to 1123_2^M, and other components of the address converter 1120 can be implemented in hardware.
[0045] Address range register 1121 can store bits PA[N:N-M+1] of physical address PA[N:0] (where N is a natural number, "N+1" corresponds to the number of bits of the physical address PA, and M is a natural number). As in Figure 3, when the address converter 1120 includes two address mapping tables AMT1 and AMT2, M can be 1 and the address range register 1121 can store bit PA[N]. The address converter 1120 can include two or more address mapping tables, and the address range register 1121 can store one or more bits. As mentioned above, the bits PA[N:N-M+1] stored in the address range register 1121 can correspond to the stack identifier SID of the memory address MA.
[0046] Mapping selection circuit 1122 can select one of the address translation circuits 1123_1 to 1123_2^M based on the value of bits PA[N:N-M+1] stored in address range register 1121. For example, based on the value of bits PA[N:N-M+1] stored in the address range register 1121, the mapping selection circuit 1122 can activate one of the enable signals EN_1 to EN_2^M of the address translation circuits 1123_1 to 1123_2^M and can deactivate the remaining enable signals.
[0047] Address translation circuit 1123_1 can be activated by enable signal EN_1 and can translate the physical address PA[N:0] into the memory address MA[L:0] (where L is a natural number less than N) based on address mapping table AMT1 (the same as address mapping table AMT1 of Figure 3). Address translation circuit 1123_2^M can be activated by enable signal EN_2^M and can translate the physical address PA[N:0] into the memory address MA[L:0] based on address mapping table AMT2 (the same as address mapping table AMT2 of Figure 3). The address converter 1120 may include 2^M address mapping tables and 2^M address translation circuits 1123_1 to 1123_2^M, and the 2^M address translation circuits 1123_1 to 1123_2^M can translate the physical address PA[N:0] into the memory address MA[L:0] based on the 2^M address mapping tables, respectively.
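The interplay of the address range register, the mapping selection circuit, and the translation circuits can be modeled in software as a rough sketch. The names `mapping_select` and `convert`, the choice of N = 29 and M = 1, the stand-in translation callables, and the assumption that stored value i activates the (i+1)-th enable signal are all hypothetical; the description allows either logical value to select either table.

```python
def mapping_select(range_value, m):
    """Return one-hot enable signals [EN_1, ..., EN_2^M] for the stored bits."""
    return [1 if i == range_value else 0 for i in range(1 << m)]


def convert(pa, n, m, circuits):
    """Model the converter for PA[N:0] with 2^M translation circuits.

    circuits[i] is a callable standing in for address translation circuit
    1123_(i+1). Which stored value enables which circuit is an assumption.
    """
    if len(circuits) != 1 << m:
        raise ValueError("expected 2^M translation circuits")
    range_register = pa >> (n - m + 1)        # bits PA[N:N-M+1]
    enables = mapping_select(range_register, m)
    active = enables.index(1)                 # exactly one enable signal is active
    return enables, circuits[active](pa)


# Stand-in circuits that only tag the address with the table they would use.
circuits = [lambda pa: ("AMT1", pa), lambda pa: ("AMT2", pa)]
enables_lo, result_lo = convert(0x0000_0040, n=29, m=1, circuits=circuits)
enables_hi, result_hi = convert(0x2000_0040, n=29, m=1, circuits=circuits)
```

In hardware all translation circuits could exist side by side, with the one-hot enable determining whose output is used; the list of callables here only mimics that selection.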
[0048] Figure 5 shows a block diagram of the system-on-chip of Figure 1 and Figure 2. The system-on-chip 1000 may include a memory controller 1100, a processor 1200, an on-chip memory 1300, and a system bus 1400.
[0049] The memory controller 1100 may include a memory request queue 1110, an address converter 1120, a PHY 1130, a control register 1141, a memory bank status register 1142, a system bus interface circuit 1150, a memory command queue 1160, a command scheduler 1170, a command sequencer 1180, a read buffer 1191, and a write buffer 1192. The memory request queue 1110, the address converter 1120 supporting the different address mapping tables AMT1 and AMT2, and the PHY 1130 have been described in detail with reference to Figure 3 and Figure 4. Therefore, additional descriptions associated with the memory request queue 1110, the address converter 1120, and the PHY 1130 will be omitted to avoid redundancy.
[0050] Control register 1141 can store and provide control information for various components in memory controller 1100, including memory request queue 1110, address converter 1120, PHY 1130, control register 1141, memory bank status register 1142, system bus interface circuit 1150, memory command queue 1160, command scheduler 1170, command sequencer 1180, read buffer 1191, and write buffer 1192. The control information stored in control register 1141 can be modified by processor 1200 or by user requests. Memory request queue 1110, address converter 1120, PHY 1130, control register 1141, memory bank status register 1142, system bus interface circuit 1150, memory command queue 1160, command scheduler 1170, command sequencer 1180, read buffer 1191, and write buffer 1192 can operate based on the various control information stored in control register 1141.
[0051] Memory bank status register 1142 can store status information about multiple memory banks in memory device 2000 (described in detail with reference to Figure 7 and Figure 8). For example, the status information can indicate whether a memory bank is activated or precharged.
[0052] System bus interface circuit 1150 can receive memory requests from multiple cores 1210, 1220, 1230, and 1240 in processor 1200 via system bus 1400, based on the communication protocol of system bus 1400. System bus interface circuit 1150 can provide, send, or write the received memory requests to memory request queue 1110. According to another example embodiment, the multiple cores in processor 1200 are not limited to the cores 1210, 1220, 1230, and 1240 shown in Figure 5.
[0053] The memory command queue 1160 can store memory commands for memory requests stored in the memory request queue 1110, as well as memory addresses translated by the address translator 1120. The command scheduler 1170 can adjust the processing order of the memory commands and memory addresses stored in the memory command queue 1160 based on the memory bank status information stored in the memory bank status register 1142. The command scheduler 1170 can perform scheduling on the memory commands and memory addresses stored in the memory command queue 1160. The command sequencer 1180 can output or provide the memory commands and memory addresses stored in the memory command queue 1160 to the PHY 1130 based on the scheduling order by the command scheduler 1170.
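One way a scheduler might use bank status to reorder queued commands is sketched below. The description states only that the processing order is adjusted based on the memory bank status information; the row-hit-first policy shown here (in the spirit of first-ready, first-come-first-served scheduling) is an assumption substituted for illustration, as are the name `schedule`, the dictionary-based command format, and the idea that the status information also records which row is open in an activated bank.

```python
def schedule(command_queue, bank_status):
    """Reorder commands so that row hits go first, keeping arrival order otherwise.

    bank_status maps a bank number to its open row, or to None if the bank is
    precharged. This policy is illustrative and is not taken from the source.
    """
    def is_row_hit(command):
        open_row = bank_status.get(command["bank"])
        return open_row is not None and open_row == command["row"]

    # sorted() is stable, so commands with equal priority keep their queue order.
    return sorted(command_queue, key=lambda command: not is_row_hit(command))


queue = [
    {"id": 0, "bank": 0, "row": 5},
    {"id": 1, "bank": 1, "row": 7},
    {"id": 2, "bank": 0, "row": 9},
]
status = {0: 9, 1: None}   # bank 0 has row 9 open, bank 1 is precharged
order = [command["id"] for command in schedule(queue, status)]
```

A command sequencer would then issue the commands in the returned order; other policies could be swapped in by changing the sort key.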
[0054] PHY 1130 may include a clock (CK) generator 1131, a command and address (CA) generator 1132, a receiver 1133, and a transmitter 1134. Clock generator 1131 generates a clock signal CK that is output to memory device 2000. For example, memory device 2000 may be a synchronous memory device operating based on clock signal CK. Command and address generator 1132 can receive memory commands and memory addresses from command sequencer 1180 and can send command and address signals CA, including the memory commands and memory addresses, to memory device 2000. Receiver 1133 can receive data input/output signals DQ, including read data, sent from memory device 2000. Receiver 1133 can provide the received read data to read buffer 1191. Transmitter 1134 can receive write data from write buffer 1192. Transmitter 1134 can send data input/output signals DQ, including the write data, to memory device 2000. The channel CH of Figure 5 can correspond to one of the channels CH1 to CH4 of Figure 3. The PHY 1130 can generate and output the clock signal CK and the command and address signals CA for each of channels CH1 to CH4, and can exchange the data input/output signals DQ for each of channels CH1 to CH4 with the memory device 2000.
[0055] Read buffer 1191 can store read data provided by receiver 1133. For example, read buffer 1191 can provide as much read data as cache line CL to system bus interface circuit 1150, and system bus interface circuit 1150 can send the read data to processor 1200 or on-chip memory 1300 via system bus 1400. Write buffer 1192 can receive and store write data provided by system bus interface circuit 1150 for sending the write data to memory device 2000. Write buffer 1192 can provide as much write data as the data input / output unit of memory device 2000 to transmitter 1134.
[0056] Processor 1200 can execute various software (e.g., applications, operating systems, file systems, and device drivers) loaded onto on-chip memory 1300. Processor 1200 may include multiple homogeneous cores or multiple heterogeneous cores, and may include multiple cores 1210 to 1240. For example, each core among cores 1210 to 1240 may include at least one of the following: a central processing unit (CPU), an image signal processing unit (ISP), a digital signal processing unit (DSP), a graphics processing unit (GPU), a vision processing unit (VPU), a tensor processing unit (TPU), and a neural processing unit (NPU). Each core among cores 1210 to 1240 can generate memory requests associated with memory device 2000. Memory requests generated by each core among cores 1210 to 1240 may include the aforementioned physical address PA. Applications, operating systems, file systems, device drivers, etc., used to drive electronic devices 100a / 100b can be loaded onto on-chip memory 1300. For example, on-chip memory 1300 may be static RAM (SRAM) with a higher data input / output speed than memory device 2000, or it may be a cache memory shared by cores 1210 to 1240, but the inventive concept is not limited thereto. System bus 1400 provides a communication path between memory controller 1100, processor 1200, and on-chip memory 1300. For example, system bus 1400 may be an Advanced High-performance Bus (AHB), Advanced System Bus (ASB), Advanced Peripheral Bus (APB), or Advanced eXtensible Interface (AXI) based on the Advanced Microcontroller Bus Architecture (AMBA).
[0057] Figure 6 illustrates in detail the memory device of Figures 1 and 2. The memory controller 1100 can access the memory device 2000 through channels CH1 to CHK (K being a natural number of 2 or greater). For example, PIM dies 2100 and 2500 can be assigned to channel CH1, and PIM dies 2400 and 2800 can be assigned to channel CHK. As described above, the remaining dies 2200, 2300, 2600, and 2700 can be assigned to other channels. PIM dies 2100 and 2500 assigned to the same channel can be identified by the stack identifier SID0 / SID1. The memory device 2000 can include paths Path_1 to Path_K, which correspond to channels CH1 to CHK respectively, and signals transmitted through channels CH1 to CHK are transmitted through paths Path_1 to Path_K. Paths Path_1 to Path_K can provide electrical connection paths between buffer die 2900 and PIM dies 2100 to 2800, and can include the through-silicon vias (TSV) and bumps (BP) described with reference to Figures 1 and 2.
[0058] PIM die 2500 may include memory bank groups BG0 to BG3, data buses DB0 and DB1, memory bank controllers BCTRL0 and BCTRL1, PE controllers PECTRL0 and PECTRL1, a command and address decoder CADEC, and a data input / output circuit DATAIO. Although only PIM die 2500 is described and shown in detail, the configuration and operation of the remaining PIM dies 2100 to 2400 and 2600 to 2800 may be similar to or substantially the same as those of PIM die 2500.
[0059] Memory bank groups BG0 to BG3 can be identified using memory bank address bits BA2 and BA3 of memory bank address BA0 to BA3 (or "memory bank address bits BA0 to BA3"). For example, memory bank group BG0 can be selected when BA2 = 0 and BA3 = 0. Memory bank group BG0 can include memory banks BK0 to BK3. Memory banks within a memory bank group can be identified using memory bank address bits BA0 and BA1 of memory bank address bits BA0 to BA3. For example, memory bank BK0 can be selected when BA0 = 0, BA1 = 0, BA2 = 0, and BA3 = 0. The memory cell array MCA of Figures 1 and 2 can be divided into memory banks BK0 to BK15. Each of the memory banks BK0, BK2, BK4, BK6, BK8, BK10, BK12, and BK14, which can be selected when the memory bank address bit BA0 corresponding to the LSB of memory bank address bits BA0 to BA3 is "0", can be referred to as a top (or even) memory bank. Each of the memory banks BK1, BK3, BK5, BK7, BK9, BK11, BK13, and BK15, which can be selected when the memory bank address bit BA0 is "1", can be referred to as a bottom (or odd) memory bank. For example, each of memory banks BK0 to BK15 may include the same number of memory cells, and each of memory bank groups BG0 to BG3 may include the same number of memory banks. For example, memory bank groups BG0 to BG3 can be implemented identically, and memory banks BK0 to BK15 can be implemented identically.
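The bank address decoding in paragraph [0059] can be sketched as follows. The sketch assumes that the index n of memory bank BKn equals the 4-bit value formed by BA3, BA2, BA1, BA0 with BA0 as the LSB, which agrees with the examples in the description (for instance "0100" selecting BK4 of BG1) but is not stated as a general rule. The function name `decode_bank_address` is hypothetical.

```python
def decode_bank_address(ba: int) -> dict:
    """Decode a 4-bit bank address (BA3 BA2 BA1 BA0, BA0 = LSB).

    Assumption: bank index BKn is the integer value of the four bits.
    """
    if not 0 <= ba <= 0b1111:
        raise ValueError("bank address must fit in 4 bits")
    group = (ba >> 2) & 0b11       # BA3:BA2 select the bank group
    bank_in_group = ba & 0b11      # BA1:BA0 select the bank within the group
    position = "top" if (ba & 1) == 0 else "bottom"  # BA0 = 0 -> top (even)
    return {
        "group": f"BG{group}",
        "bank": f"BK{ba}",
        "bank_in_group": bank_in_group,
        "position": position,
    }


example = decode_bank_address(0b0100)  # BA3=0, BA2=1, BA1=0, BA0=0
```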
[0060] Memory bank group BG0 may include PE0 and PE1. For example, PE0 can perform calculations on data in memory banks BK0 and BK1, while PE1 can perform calculations on data in memory banks BK2 and BK3. Memory bank group BG1 may include PE2, which performs calculations on data in memory banks BK4 and BK5, and PE3, which performs calculations on data in memory banks BK6 and BK7. As with memory bank groups BG0 and BG1, memory bank groups BG2 and BG3 may include PE4 through PE7. For example, PE0 through PE7 may correspond to the PE of each of the PIM dies 2100 to 2800 of Figures 1 and 2, or may constitute the PE of each of the PIM dies 2100 to 2800 of Figures 1 and 2.
[0061] The number of memory bank groups included in PIM die 2500 and the number of memory banks per memory bank group are not limited to the example of Figure 6. Figure 6 shows an example in which one channel CH1 is assigned to PIM die 2500, and memory bank groups BG0 to BG3 and memory banks BK0 to BK15 are assigned to channel CH1, but the inventive concept is not limited thereto. Different channels may be further assigned to PIM die 2500, and PIM die 2500 may also include memory bank groups and memory banks assigned to the different channels. For example, PIM die 2500 may include memory bank groups BG0 to BG15 and memory banks BK0 to BK63 assigned to four channels CH1 to CH4; as illustrated for channel CH1 in Figure 6, memory bank groups and memory banks for each channel can be implemented in PIM die 2500. The following description assumes that a memory bank group includes two PEs and that one PE is assigned to two memory banks; however, a memory bank group may include the same number of PEs as memory banks, that is, one PE may be assigned to one memory bank. In any case, the inventive concept is not limited to the above values.
[0062] Data bus DB0 may include data input / output paths associated with memory bank groups BG0 and BG1. For example, data to be written to memory banks BK0 through BK3 or memory banks BK4 through BK7, data to be read from memory banks BK0 through BK3 or memory banks BK4 through BK7, data to be processed by PE0 and PE1 or PE2 and PE3, and data processed by PE0 and PE1 or PE2 and PE3 can be transmitted via data bus DB0. Data bus DB1 may include data input / output paths associated with memory bank groups BG2 and BG3. Except for the assigned memory bank groups, data buses DB0 and DB1 can be implemented identically or integrated together.
[0063] Memory bank controller BCTRL0 can control memory banks BK0 to BK7 of memory bank groups BG0 and BG1 under the control of command and address decoder CADEC. Memory bank controller BCTRL1 can control memory banks BK8 to BK15 of memory bank groups BG2 and BG3 under the control of command and address decoder CADEC. For example, memory bank controllers BCTRL0 and BCTRL1 can activate or precharge memory banks BK0 to BK15. Except for the assigned memory banks, memory bank controllers BCTRL0 and BCTRL1 can be implemented identically or integrated together.
[0064] PE controller PECTRL0, under the control of command and address decoder CADEC, controls PE0 to PE3 of memory bank groups BG0 and BG1. PE controller PECTRL1, under the control of command and address decoder CADEC, controls PE4 to PE7 of memory bank groups BG2 and BG3. For example, PE controllers PECTRL0 and PECTRL1 can select data to be processed by PE0 to PE7 or data already processed by PE0 to PE7, or they can control the timing of initiating or terminating computations by PE0 to PE7. Except for the assigned PEs, PE controllers PECTRL0 and PECTRL1 can be implemented identically or integrated together.
[0065] Based on the clock signal CK (see Figure 5) sent via channel CH1 and path Path_1, the command and address decoder CADEC can receive the command and address signals CA (see Figure 5) sent via channel CH1 and path Path_1. The command and address decoder CADEC can decode the command and address signals CA. Based on the decoding results, the command and address decoder CADEC can control the components of the PIM die 2500.
[0066] Under the control of the command and address decoder CADEC, the data input / output circuit DATAIO can receive the data input / output signal DQ (see Figure 5) sent through channel CH1 and path Path_1, and can provide the write data included in the data input / output signal DQ to memory banks BK0 to BK15 of memory bank groups BG0 to BG3. The data input / output circuit DATAIO can receive read data output from memory banks BK0 to BK15 and PE0 to PE7 of memory bank groups BG0 to BG3, and can output the data input / output signal DQ, including the read data. The data input / output signal DQ, including the read data, can be sent to the memory controller 1100 via path Path_1 and channel CH1.
[0067] Figure 7 illustrates in detail the memory device of Figures 1 and 2. The description focuses on the differences between the memory device 2000 of Figure 6 and the memory device 2000 of Figure 7. Memory device 2000 may include memory dies 2100 to 2400 and PIM dies 2500 to 2800. Each of PIM dies 2500 to 2800 may be substantially the same as the PIM die 2500 of Figure 6. Each of the memory dies 2100 to 2400 may differ from the PIM die 2500 of Figure 6. The memory die 2300 may include the memory bank groups BG0 to BG3, memory banks BK0 to BK15, data buses DB0 and DB1, memory bank controllers BCTRL0 and BCTRL1, command and address decoder CADEC, and data input / output circuit DATAIO described with reference to Figure 6. Memory die 2300 may not include PE0 to PE7 and PE controllers PECTRL0 and PECTRL1, and therefore may not be referred to as a "PIM die". The configuration and operation of each of the remaining memory dies 2100, 2200, and 2400 may be substantially the same as those of memory die 2300.
[0068] Figure 8 shows an example in which the memory controller of Figure 5 translates a physical address into a memory address. The memory controller 1100 can translate a physical address PA whose stack identifier value is SID0 into a memory address MA based on address mapping table AMT1. The memory controller 1100 can also translate a physical address PA whose stack identifier value is SID1 into a memory address MA based on address mapping table AMT2, which is different from address mapping table AMT1. For example, the range of physical addresses PA whose stack identifier value is SID0 can be from 0x000000 to 0x000FFF, while the range of physical addresses PA whose stack identifier value is SID1 can be from 0x100000 to 0x100FFF. However, the inventive concept is not limited to these values.
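The table selection in paragraph [0068] can be sketched as follows. The sketch assumes that the stack identifier is carried by bit 20 of the physical address, which is only an inference from the example ranges 0x000000 to 0x000FFF and 0x100000 to 0x100FFF; the names `SID_BIT`, `stack_id`, and `select_table` are hypothetical.

```python
SID_BIT = 20  # assumption inferred from the example address ranges


def stack_id(pa: int) -> int:
    """Return the stack identifier (0 for SID0, 1 for SID1) of a physical address."""
    return (pa >> SID_BIT) & 1


def select_table(pa: int) -> str:
    """Pick the address mapping table by stack identifier: SID0 -> AMT1, SID1 -> AMT2."""
    return "AMT1" if stack_id(pa) == 0 else "AMT2"


table_for_low = select_table(0x000040)
table_for_high = select_table(0x100040)
```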
[0069] According to address mapping table AMT1, the bits in physical address PA "0x000040" that correspond to memory bank address bits BA0 to BA3 can be "0000" (binary). The memory controller 1100 can map the physical address PA "0x000040" to memory bank BK0 of memory bank group BG0 based on the address mapping table AMT1. For example, the difference between physical address PA "0x000040" and physical address PA "0x000080" can correspond to the size of cache line CL (e.g., 64 bytes). According to the address mapping table AMT1, the bits in physical address PA "0x000080" corresponding to memory bank address bits BA0 to BA3 can be "0100" (binary). The memory controller 1100 can map the physical address PA "0x000080" to memory bank BK4 of memory bank group BG1 based on the address mapping table AMT1. As described above, the memory controller 1100 can translate the physical address PA whose stack identifier value is SID0 into the memory address MA based on the address mapping table AMT1. The physical addresses PA, sequentially increasing from "0x000040" to "0x000400", can be mapped in order to memory banks BK0, BK4, BK8 and BK12 corresponding to the top memory banks T_BK, memory banks BK1, BK5, BK9 and BK13 corresponding to the bottom memory banks B_BK, memory banks BK2, BK6, BK10 and BK14 corresponding to the top memory banks T_BK, and memory banks BK3, BK7, BK11 and BK15 corresponding to the bottom memory banks B_BK.
[0070] According to address mapping table AMT2, the bits in physical address PA "0x100040" that correspond to memory bank address bits BA0 to BA3 can be "0000" (binary). The memory controller 1100 can map the physical address PA "0x100040" to memory bank BK0 of memory bank group BG0 based on the address mapping table AMT2. According to the address mapping table AMT2, the bits in the physical address PA "0x100080" corresponding to memory bank address bits BA0 to BA3 can be "0100" (binary). The memory controller 1100 can map the physical address PA "0x100080" to memory bank BK4 of memory bank group BG1 based on the address mapping table AMT2. As described above, the memory controller 1100 can translate the physical address PA whose stack identifier value is SID1 into the memory address MA based on the address mapping table AMT2. The physical addresses PA, sequentially increasing from "0x100040" to "0x100400", can be mapped in order to memory banks BK0, BK4, BK8, BK12, BK2, BK6, BK10 and BK14 corresponding to the top memory banks T_BK, and then to memory banks BK1, BK5, BK9, BK13, BK3, BK7, BK11, and BK15 corresponding to the bottom memory banks B_BK. Except for the stack identifiers SID0 / SID1, the physical addresses PA increasing sequentially from "0x100040" to "0x100400" are the same as those increasing from "0x000040" to "0x000400". However, because address mapping tables AMT1 and AMT2 are different, the order in which memory banks are mapped to the physical addresses PA increasing sequentially from "0x100040" to "0x100400" differs from the order in which memory banks are mapped to the physical addresses PA increasing sequentially from "0x000040" to "0x000400".
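The worked example in paragraphs [0069] and [0070] can be reproduced with a small lookup sketch. It does not model the actual bit assignment of AMT1 or AMT2, which the text does not fully specify; it only replays the bank sequences listed above for the sixteen cache lines from offset 0x40 to 0x400. The bottom-bank part of the AMT2 sequence is read as BK1, BK5, BK9, BK13, BK3, BK7, BK11, BK15, matching the vector B banks in paragraph [0079]. The stack identifier at bit 20 and all names are assumptions.

```python
CACHE_LINE = 0x40  # 64-byte cache line, per the example
SID_BIT = 20       # assumption inferred from the example address ranges

# Bank sequences for offsets 0x40, 0x80, ..., 0x400 as listed in the text.
AMT1_ORDER = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
AMT2_ORDER = [0, 4, 8, 12, 2, 6, 10, 14, 1, 5, 9, 13, 3, 7, 11, 15]


def bank_for(pa: int) -> int:
    """Return the bank index BKn for a physical address in the example range."""
    sid = (pa >> SID_BIT) & 1
    offset = pa & ((1 << SID_BIT) - 1)
    index = offset // CACHE_LINE - 1  # offset 0x40 is the first listed address
    if offset % CACHE_LINE or not 0 <= index < 16:
        raise ValueError("address outside the 0x40..0x400 example range")
    return (AMT2_ORDER if sid else AMT1_ORDER)[index]


amt1_banks = [bank_for(0x000000 + k * CACHE_LINE) for k in range(1, 17)]
amt2_banks = [bank_for(0x100000 + k * CACHE_LINE) for k in range(1, 17)]
```

Under AMT2 the first eight addresses all land in top (even) banks and the last eight in bottom (odd) banks, whereas AMT1 alternates between top and bottom banks every four addresses.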
[0071] Figure 9 shows a block diagram of the memory bank group of Figure 6. Although only memory bank group BG0 is shown in detail in Figure 9, as mentioned above, the remaining memory bank groups BG1 to BG3 can be implemented in a manner similar to or substantially the same as memory bank group BG0.
[0072] The memory bank group BG0 may include a row decoder RD0 and a column decoder CD0. The row decoder RD0 can decode the row address of memory address MA and can select and activate the word line WL0 of memory bank BK0. For example, when word line WL0 is activated, memory bank BK0 can be in an active state. Conversely, when word line WL0 is deactivated, memory bank BK0 can be in a precharge state. As described above, the status information of memory bank BK0 can be stored in the memory bank status register 1142. The column decoder CD0 can decode the column address of memory address MA and can select and activate the column select line CSL0 of memory bank BK0. Memory bank BK0 may include memory cells MC0 accessed via word line WL0 and column select line CSL0. Memory bank BK0 may also include memory cells accessed via other word lines and other column select lines.
[0073] The memory bank group BG0 may further include an input / output sense amplifier IOSA0, a write driver WDRV0, a memory bank local input / output gating circuit BLIOGT0, a memory bank global input / output gating circuit BGIOGT0, and a data bus input / output gating circuit DBIOGT0. The input / output sense amplifier IOSA0 senses and amplifies read data output from the memory cell MC0 via the cell input / output line CIO0, and can output the read data to the memory bank local input / output line BLIO0. The write driver WDRV0 receives write data sent via the memory bank local input / output line BLIO0, and can write write data to the memory cell MC0 via the cell input / output line CIO0. The memory bank local input / output gating circuit BLIOGT0 can electrically connect the write driver WDRV0 and the memory bank local input / output line BLIO0, or electrically disconnect the write driver WDRV0 from the memory bank local input / output line BLIO0. The memory bank local input / output gating circuit BLIOGT0 can electrically connect the input / output sense amplifier IOSA0 and the memory bank local input / output line BLIO0, or electrically disconnect the input / output sense amplifier IOSA0 from the memory bank local input / output line BLIO0. The memory bank global input / output gating circuit BGIOGT0 can electrically connect the memory bank local input / output line BLIO0 and the memory bank global input / output line BGIO0, or electrically disconnect the memory bank local input / output line BLIO0 from the memory bank global input / output line BGIO0. The memory bank global input / output line BGIO0 can be shared by memory banks BK0 to BK3 in memory bank group BG0. The data bus input / output gating circuit DBIOGT0 can electrically connect the memory bank global input / output line BGIO0 and the data bus DB0, or electrically disconnect the memory bank global input / output line BGIO0 from the data bus DB0. 
The data bus DB0 can be shared by memory bank groups BG0 and BG1. For example, each of the memory bank local input / output gating circuit BLIOGT0, the memory bank global input / output gating circuit BGIOGT0, and the data bus input / output gating circuit DBIOGT0 can operate as an input / output multiplexer or a switch. The aforementioned components RD0, CD0, IOSA0, WDRV0, BLIOGT0, and BGIOGT0 can be used for data input / output of memory bank BK0. Similarly, for the data input / output of memory banks BK1 to BK3, memory bank group BG0 may also include row decoders RD1 to RD3, column decoders CD1 to CD3, input / output sense amplifiers IOSA1 to IOSA3, write drivers WDRV1 to WDRV3, memory bank local input / output gating circuits BLIOGT1 to BLIOGT3, and memory bank global input / output gating circuits BGIOGT1 to BGIOGT3.
[0074] PE0 may include an input multiplexer IMUX, a PE array PEA, a register REG, and an output multiplexer OMUX. The input multiplexer IMUX can receive data (or write or read data) from memory bank BK0 via memory bank local input / output line BLIO0, receive data (or write or read data) from memory bank BK1 via memory bank local input / output line BLIO1, receive data from memory bank group BG0 via memory bank global input / output line BGIO0, and receive data from register output line RO0. The input multiplexer IMUX can provide at least one of the above data to the PE array PEA based on the input control signal ICTRL0. For example, the above data can be provided to the PE array PEA as operands OPA to OPD. The PE array PEA can perform calculations on at least one of the above data based on the processing control signal PCTRL0. For example, the calculations that can be performed by the PE array PEA can be various arithmetic or logical operations, such as addition, subtraction, multiplication, division, shifting, AND, NAND, OR, NOR, XNOR, and XOR. The register REG can receive and store the calculation results of the PE array PEA via register input line RI0 based on the register control signal RCTRL0. The register REG can output the stored calculation results as data to register output line RO0 based on the register control signal RCTRL0. The output multiplexer OMUX can output the data stored in register REG to at least one of the following based on the output control signal OCTRL0: memory local input / output line BLIO0, memory local input / output line BLIO1, register output line RO0, and memory global input / output line BGIO0. The configuration and operation of PE1 are basically the same as those of PE0, except that PE1 is connected to the local input / output lines BLIO2 and BLIO3 of the memory bank and receives control signals ICTRL1, PCTRL1, RCTRL1 and OCTRL1.
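The data flow through PE0 (input multiplexer IMUX, PE array PEA, register REG, register output line RO0) can be modeled with a minimal behavioral sketch. This is not the circuit described in the patent: the two-operand restriction, the string-valued control signals, and the class and method names are all assumptions made for illustration.

```python
import operator

# A few of the operations named in the text; selection by string is an assumption.
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


class ProcessingElement:
    """Behavioral stand-in for one PE: IMUX -> PEA -> REG."""

    def __init__(self) -> None:
        self.reg = 0  # register REG, readable through register output line RO

    def step(self, lines: dict, ictrl: tuple, pctrl: str) -> int:
        # IMUX: ictrl names the two sources routed to the PE array.
        # "RO" selects the value fed back from the register output line.
        a, b = (self.reg if src == "RO" else lines[src] for src in ictrl)
        # PEA: pctrl selects the operation; the result is latched in REG.
        self.reg = OPS[pctrl](a, b)
        return self.reg


pe0 = ProcessingElement()
first = pe0.step({"BLIO0": 3, "BLIO1": 4}, ("BLIO0", "BLIO1"), "add")
second = pe0.step({"BLIO0": 2}, ("RO", "BLIO0"), "mul")
```

The second call reuses the stored result through the register output path, mirroring how the text lets REG feed data back into IMUX.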
[0075] Figure 10 shows a block diagram of the PIM die of Figure 6. The command and address decoder CADEC decodes the command and address signals CA and controls the memory bank controllers BCTRL0 and BCTRL1, the PE controllers PECTRL0 and PECTRL1, and the data input / output circuit DATAIO. Memory bank controller BCTRL0 controls the read and write operations of memory cells in memory bank groups BG0 and BG1. Memory bank controller BCTRL1 controls the read and write operations of memory cells in memory bank groups BG2 and BG3. Each of the PE controllers PECTRL0 and PECTRL1 can include a control register storing control information. Under the control of the command and address decoder CADEC, PE controller PECTRL0 generates, based on the control information in the control register, control signals ICTRL0, ICTRL1, PCTRL0, PCTRL1, RCTRL0, RCTRL1, OCTRL0, and OCTRL1 to be provided to PE0 and PE1 of memory bank group BG0. Likewise, under the control of the command and address decoder CADEC, the PE controller PECTRL0 generates control signals to be provided to PE2 and PE3 of memory bank group BG1 based on the control information in the control register. The PE controller PECTRL1, under the control of the command and address decoder CADEC, generates control signals to be provided to PE4 to PE7 of memory bank groups BG2 and BG3 based on the control information in the control register. The data input / output circuit DATAIO can output the data of the data input / output signal DQ to the data buses DB0 and DB1, or it can output the data input / output signal DQ including the data from the data buses DB0 and DB1.
[0076] Figures 11 and 12 show examples in which data is arranged in a PIM die according to the address mapping tables of Figure 3. In Figures 11 and 12, it is assumed that a memory request is a write command for memory device 2000, that memory requests are input into memory request queue 1110, that the physical address PA of the memory requests increases sequentially from a minimum value (e.g., 0x000000 of Figure 8) to a maximum value (e.g., 0xFFFFFF), and that each of the memory banks BK0 to BK15 includes memory cells MC accessible via four word lines WL[0:3] and four column select lines CSL[0:3]. However, the inventive concept is not limited to the above values.
[0077] The physical address PA with stack identifier SID0 can be translated into a memory address MA through address mapping table AMT1. Address translator 1120 can select address mapping table AMT1, and PHY 1130 can access PIM die 2100 through channel CH1 and can activate at least one of memory banks BK0 to BK15 of PIM die 2100 based on memory address MA. According to address mapping table AMT1, the bits of physical address PA corresponding to the row address and column address can be higher-order bits than the bits of physical address PA corresponding to memory bank address bits BA0 to BA3. Referring to Figure 11, data can be arranged (or written) in the order of the memory cells MC of memory banks BK0, BK4, BK8, BK12, BK1, BK5, BK9, BK13, BK2, BK6, BK10, BK14, BK3, BK7, BK11, and BK15 selected by word line WL[0] and column selection line CSL[0]. Next, data can be arranged (or written) in the order of the memory cells MC of memory banks BK0, BK4, BK8, BK12, BK1, BK5, BK9, BK13, BK2, BK6, BK10, BK14, BK3, BK7, BK11, and BK15 selected by word line WL[0] and column selection line CSL[1]. When the above process is repeated, data can be arranged in all the memory cells MC of memory banks BK0 to BK15.
[0078] The physical address PA with stack identifier SID1 can be translated into a memory address MA through address mapping table AMT2. Address translator 1120 can select address mapping table AMT2, and PHY 1130 can access PIM die 2500 through channel CH1 and can activate at least one of memory banks BK0 to BK15 and at least one of PE0 to PE7 of PIM die 2500 based on memory address MA. According to address mapping table AMT2, at least one bit (e.g., BA0 or BA1; see Figure 3) of the bits of physical address PA corresponding to memory bank address bits BA0 to BA3 can be a higher-order bit than the bits of physical address PA corresponding to the column address. Referring to Figure 12, data can be written or arranged in the memory cell MC of memory bank BK0 selected by word line WL[0] and column selection line CSL[0]. Next, data can be written or arranged in the memory cell MC of memory bank BK0 selected by word line WL[0] and column selection line CSL[1]. When the above process is repeated, data can be written or arranged in all the memory cells MC of memory bank BK0 selected by word line WL[0]. In the same way that data is arranged in the memory cells MC of memory bank BK0 selected by word line WL[0] and column selection lines CSL[0:3], data can then be arranged in the order of the memory cells MC of memory banks BK4, BK8, BK12, BK2, BK6, BK10, BK14, BK1, BK3, BK5, BK7, BK9, BK11, BK13 and BK15 selected by word line WL[0] and column selection lines CSL[0:3]. After the data is arranged in the memory cells MC of all memory banks BK0 to BK15 selected by word line WL[0], the data can be arranged, as described above, in the memory cells MC of all memory banks BK0 to BK15 selected by word lines WL[1:3]. Unlike the data arrangement according to address mapping table AMT1 of Figure 11, in the data arrangement according to address mapping table AMT2 of Figure 12, the arrangement of data in the memory cells MC selected by word lines WL[1:3] can begin only after the data is completely arranged in the memory cells MC selected by word line WL[0].
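The two fill orders described for Figures 11 and 12 can be sketched as nested loops over (bank, word line, column select line). Bank sequences are copied from paragraphs [0077] and [0078]. For AMT1 the text only states that the bank changes fastest and that CSL[1] follows CSL[0]; advancing the word line after all four columns is an assumption. The function names are hypothetical.

```python
AMT1_BANKS = [0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15]
AMT2_BANKS = [0, 4, 8, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]  # as listed in [0078]
NUM_WL = 4   # word lines WL[0:3]
NUM_CSL = 4  # column select lines CSL[0:3]


def amt1_fill_order():
    """AMT1: bank changes fastest, then column, then word line (last step assumed)."""
    for wl in range(NUM_WL):
        for csl in range(NUM_CSL):
            for bank in AMT1_BANKS:
                yield (bank, wl, csl)


def amt2_fill_order():
    """AMT2: column changes fastest, then bank, then word line."""
    for wl in range(NUM_WL):
        for bank in AMT2_BANKS:
            for csl in range(NUM_CSL):
                yield (bank, wl, csl)


amt1 = list(amt1_fill_order())
amt2 = list(amt2_fill_order())
```

Both orders visit the same 256 cells; under AMT2 every cell on WL[0] of all sixteen banks is filled (the first 64 writes) before any cell on WL[1] is touched.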
[0079] Figure 13 shows an example in which the PEs perform computations on data arranged according to Figure 12. For instance, vector A can be arranged in memory banks BK0, BK4, BK8, BK12, BK2, BK6, BK10, and BK14 corresponding to the top memory banks T_BK, and vector B can be arranged in memory banks BK1, BK5, BK9, BK13, BK3, BK7, BK11, and BK15 corresponding to the bottom memory banks B_BK. PE0 through PE7 can generate vector C by adding vector A arranged in the top memory banks T_BK and vector B arranged in the bottom memory banks B_BK. Any other computation can be performed instead of addition. Vector C can be stored in the registers REG of PE0 to PE7. Referring to Figure 13, because each of PE0 through PE7 is shared by two memory banks, for the calculations performed by PE0 through PE7 it is more advantageous to arrange data in all memory cells MC selected by a single word line according to address mapping table AMT2 than to arrange data in memory cells MC according to address mapping table AMT1. For example, memory controller 1100 can apply address mapping table AMT1 to PIM die 2100 and address mapping table AMT2 to PIM die 2500 independently of each other. Therefore, when processor 1200 performs PIM calculations using PE0 to PE7 of PIM die 2500, memory controller 1100 can independently apply address mapping table AMT1 to PIM die 2100 and address mapping table AMT2 to PIM die 2500, instead of applying address mapping table AMT2 to both PIM dies 2100 and 2500. This prevents performance degradation during normal operation of processor 1200, which is performed using PIM die 2100, where PE0 to PE7 are not activated. Normal operation may include data input / output operations on PIM die 2100 based on read or write commands without executing PE0 to PE7. Furthermore, because the memory controller 1100 supports different address mapping tables AMT1 and AMT2, the overhead of data rearrangement that would otherwise be executed by the processor 1200 when performing PIM calculations using PE0 to PE7 of the PIM die 2500 can be reduced, and memory-level parallelism can be utilized during normal operation.
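The vector addition of Figure 13 can be sketched as follows, assuming (per paragraph [0060]) that PE k serves memory banks BK(2k) and BK(2k+1), so that each PE sees one top bank holding a slice of vector A and one bottom bank holding a slice of vector B. Modeling a bank as a Python list and the function name `pim_vector_add` are illustrative assumptions only.

```python
NUM_PES = 8


def pim_vector_add(banks: dict) -> dict:
    """Add the top-bank slice (vector A) to the bottom-bank slice (vector B) per PE.

    banks maps a bank index 0..15 to a list of integers. The result maps each
    PE index to the slice of vector C that its register REG would hold.
    """
    regs = {}
    for pe in range(NUM_PES):
        top = banks[2 * pe]         # even bank: slice of vector A
        bottom = banks[2 * pe + 1]  # odd bank: slice of vector B
        regs[pe] = [a + b for a, b in zip(top, bottom)]
    return regs


# Illustrative data: every cell of bank n holds the value n.
banks = {bk: [bk] * 4 for bk in range(16)}
vector_c = pim_vector_add(banks)
```

Because both operands of each PE already sit in its own pair of banks, no data has to cross bank-group boundaries, which is the benefit the text attributes to the AMT2 arrangement.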
[0080] Figure 14 shows an example in which the processor of Figure 5 accesses the memory controller and the memory controller accesses the memory device. For example, processor 1200 may access memory controller 1100 in a memory-mapped I / O (MMIO) manner. The system address space (or region) may include space allocated to memory controller 1100. According to example embodiments, the system address space may also include space (or region) allocated to any other component in the system-on-chip 1000 (e.g., on-chip memory 1300, intellectual property (IP) blocks, and controllers). Processor 1200 may access and control memory controller 1100 and any other component in the system-on-chip 1000 using the same system address space. Processor 1200 may access the space in the system address space allocated to memory controller 1100 and may write values to that space using write instructions. Memory controller 1100 may respond to the value and, for example, may receive memory requests from processor 1200. Memory controller 1100 may ignore values written to the remaining space in the system address space other than the space allocated to memory controller 1100.
[0081] The space allocated to memory controller 1100 in the system address space can be a physical address space, can correspond to a physical address space, or can be mapped to a physical address space. The physical address space can correspond to the range of physical addresses PA associated with memory requests. The physical address space can include space allocated to control register 1141 and space allocated to memory device 2000. The space allocated to memory controller 1100 in the system address space can include space corresponding to the space allocated to control register 1141 in the physical address space, and processor 1200 can change the value (or information) of control register 1141 by accessing the space corresponding to the space allocated to control register 1141. Similarly, the space allocated to memory controller 1100 in the system address space can include space corresponding to the space allocated to memory device 2000 in the physical address space, and processor 1200 can access memory device 2000 by accessing the space corresponding to the space allocated to memory device 2000.
[0082] The memory controller 1100 can also access the memory device 2000 in MMIO mode. The physical address space may include the space allocated to the memory device 2000. The memory controller 1100 can access the space in the physical address space allocated to the memory device 2000, and can translate the physical address PA of the space in the physical address space allocated to the memory device 2000 into a memory address MA.
[0083] The space allocated to memory device 2000 in the physical address space can be a memory address space, can correspond to a memory address space, or can be mapped to a memory address space. The memory address space can correspond to the range of memory addresses MA. The memory address space may include space allocated to the control registers of PE controllers PECTRL0 and PECTRL1, and space allocated to memory cells. The space allocated to memory device 2000 in the physical address space may include space corresponding to the space allocated to the control registers of PE controllers PECTRL0 and PECTRL1 in the memory address space, and memory controller 1100 can change the values (or information) of the control registers of PE controllers PECTRL0 and PECTRL1 by accessing the space corresponding to the space allocated to the control registers of PE controllers PECTRL0 and PECTRL1. As described above, the space allocated to memory device 2000 in the physical address space may include space corresponding to the space allocated to memory cells in the memory address space, and memory controller 1100 can access memory cells by accessing the space corresponding to the space allocated to memory cells. The memory cells of each of the memory banks BK0 to BK15 of each of the PIM dies 2100 to 2800 of the memory device 2000, as well as the control registers of the PE controllers PECTRL0 and PECTRL1, can be mapped to the memory address associated with each of the PIM dies 2100 to 2800 of the memory device 2000.
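The nesting of address spaces in paragraphs [0080] to [0083] (system address space, physical address space, memory address space) can be sketched as a simple routing function. Every base address and size below is hypothetical, and the translation from physical address to memory address is reduced to a plain offset subtraction, whereas the described controller uses an address mapping table; the sketch only illustrates which region an access falls into.

```python
# All bases and sizes are hypothetical values chosen for illustration.
MC_BASE = 0x8000_0000     # start of the space allocated to memory controller 1100
MC_SIZE = 0x0100_0000     # size of that space (the physical address space)
CTRL_REG_SIZE = 0x1000    # control register 1141, placed first in the physical space
PE_CTRL_REG_SIZE = 0x100  # PE controller control registers, placed first in the memory space


def route(system_addr: int):
    """Classify a system address and return (target, offset within target)."""
    if not MC_BASE <= system_addr < MC_BASE + MC_SIZE:
        return ("ignored_by_memory_controller", None)
    pa = system_addr - MC_BASE              # position in the physical address space
    if pa < CTRL_REG_SIZE:
        return ("control_register_1141", pa)
    ma = pa - CTRL_REG_SIZE                 # simplified stand-in for PA -> MA translation
    if ma < PE_CTRL_REG_SIZE:
        return ("pe_controller_control_registers", ma)
    return ("memory_cells", ma - PE_CTRL_REG_SIZE)


outside = route(0x1000_0000)
control = route(0x8000_0010)
pe_regs = route(0x8000_1004)
cells = route(0x8000_1100)
```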
[0084] Figure 15 shows an electronic device 100c according to another embodiment of the present invention. The electronic device 100c may include a system-on-a-chip 1000, memory devices 2000_1 to 2000_4, an interposer 3000, and a package board 4000. Each of the memory devices 2000_1 to 2000_4 may correspond to the aforementioned memory device 2000, and the number of memory devices is not limited to the example shown in Figure 15. The interposer 3000 may include paths for multiple channels that allow the system-on-a-chip 1000 to access the memory devices 2000_1 to 2000_4. For example, the interposer 3000 may be stacked on the package board 4000. In another example, the system-on-a-chip 1000 and the memory devices 2000_1 to 2000_4 may be stacked on the package board 4000 without the interposer 3000.
[0085] The memory controller according to an embodiment of the present invention can simultaneously support multiple address mapping tables for translating physical addresses into memory addresses. The memory controller can dynamically select among the multiple address mapping tables based on whether a processing element (PE) of the PIM die is to be executed, and can arrange data in a layout suited to the execution of the PE.
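The table selection summarized in paragraph [0085] can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the field widths, the 24-bit physical address, the names `TABLE_1`, `TABLE_2`, and `translate`, and the exact bit layouts are all hypothetical. The only properties taken from the text are that the most significant bit acts as a stack identifier that selects the table, and that the bits converted to the bank address sit at different positions in the two tables.

```python
from typing import NamedTuple


class MemoryAddress(NamedTuple):
    stack_id: int
    bank: int
    row: int
    column: int


# Hypothetical field widths; the most significant bit is the stack identifier.
ROW_BITS, BANK_BITS, COL_BITS = 14, 4, 5
PA_BITS = 1 + ROW_BITS + BANK_BITS + COL_BITS  # 24 bits in this sketch

# Each table lists (field, width) pairs from the least significant bit upward.
# Assumed layout for a die without a PE: bank bits below the column bits.
TABLE_1 = (("bank", BANK_BITS), ("column", COL_BITS), ("row", ROW_BITS))
# Assumed layout for a PIM die: bank bits above the column bits.
TABLE_2 = (("column", COL_BITS), ("bank", BANK_BITS), ("row", ROW_BITS))

TABLES = {0: TABLE_1, 1: TABLE_2}


def translate(pa: int) -> MemoryAddress:
    """Translate a physical address using the table chosen by its MSB."""
    if not 0 <= pa < (1 << PA_BITS):
        raise ValueError(f"physical address {pa:#x} out of range")
    stack_id = pa >> (PA_BITS - 1)       # mapping selection by stack identifier
    fields = {}
    shift = 0
    for name, width in TABLES[stack_id]:  # walk the selected table's layout
        fields[name] = (pa >> shift) & ((1 << width) - 1)
        shift += width
    return MemoryAddress(stack_id, fields["bank"], fields["row"], fields["column"])
```

With these assumed layouts, the same low-order bits of a physical address land in the bank field for stack identifier 0 and in the column field for stack identifier 1, so consecutive addresses spread across banks in one case and stay within a bank in the other.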
[0086] At least one of the components, elements, modules, or units (collectively referred to as "components" in this paragraph) indicated by boxes in the accompanying drawings (e.g., processors and memory controllers) can be embodied as various numbers of hardware, software, and / or firmware structures performing the various functions described above according to exemplary embodiments. For example, at least one of these components can use direct circuit structures, such as memory, hardware processors, logic circuits, lookup tables, etc., which can perform various functions under the control of one or more microprocessors or other control devices. Furthermore, at least one of these components can be embodied by a module, program, or portion of code containing one or more executable instructions for performing specific logical functions and executed by one or more microprocessors or other control devices. Furthermore, at least one of these components can include or be implemented as a processor, microprocessor, etc., such as a central processing unit (CPU) performing various functions. Two or more of these components can be combined into a single component performing all the operations or functions of the combined two or more components. Furthermore, at least a portion of the function of at least one of these components can be performed by another of these components. Moreover, although a bus is not shown in the above block diagrams, communication between components can be performed via a bus. The functional scheme of the above example embodiments can be implemented as an algorithm executed on one or more processors. Furthermore, the components or processing steps represented by the boxes can utilize any number of related technologies for electronic configuration, signal processing and / or control, data processing, etc.
[0087] Although the inventive concept has been described with reference to exemplary embodiments thereof, it will be apparent to those skilled in the art that various changes and modifications may be made thereto without departing from the spirit and scope of the inventive concept as set forth in the appended claims.
Claims
1. A memory controller, comprising: a first address mapping table and a second address mapping table; a memory request queue configured to store a memory request associated with a memory device, the memory device including a first memory die and a second memory die that share a channel; an address translator configured to select one of the first address mapping table and the second address mapping table based on bits of a physical address of the memory request, and to translate the physical address into a memory address of the memory device based on the selected one of the first address mapping table and the second address mapping table; and a physical layer configured to access one of the first memory die and the second memory die via the shared channel based on the memory address, wherein a first mapping of the first address mapping table is different from a second mapping of the second address mapping table, and the first memory die is a memory die that does not include a processing element (PE), and the second memory die is a processing-in-memory (PIM) die that includes the PE.
2. The memory controller according to claim 1, wherein the bits of the physical address of the memory request correspond to a stack identifier that identifies the first memory die and the second memory die.
3. The memory controller according to claim 2, wherein the bits of the physical address of the memory request are the most significant bits.
4. The memory controller according to claim 1, wherein each of the first memory die and the second memory die includes a plurality of memory banks, which are identified by a bank address of the memory address, and wherein a first position of a first group of bits of the physical address that is converted to the bank address through the first address mapping table is different from a second position of a second group of bits of the physical address that is converted to the bank address through the second address mapping table.
5. The memory controller according to claim 4, wherein the memory address includes the bank address, and a row address and a column address that identify memory cells of each of the plurality of memory banks, and wherein at least one bit of the second group of bits of the physical address, which is converted to the bank address through the second address mapping table, corresponds to a high-order bit of a plurality of bits of the physical address that are converted to the column address through the second address mapping table.
6. The memory controller according to claim 1, wherein the address translator is further configured to select the first address mapping table, and the physical layer is further configured to access the first memory die through the shared channel based on the memory address, and to activate at least one of a plurality of memory banks of the first memory die.
7. The memory controller according to claim 1, wherein the address translator is further configured to select the second address mapping table, and the physical layer is further configured to access the second memory die through the shared channel based on the memory address, and to activate the PE of the second memory die and at least one of a plurality of memory banks of the second memory die.
8. The memory controller according to claim 1, wherein the address translator includes: an address range register configured to store the bits of the physical address of the memory request; a first address translation circuit configured to translate the physical address into the memory address based on the first address mapping table; a second address translation circuit configured to translate the physical address into the memory address based on the second address mapping table; and a mapping selection circuit configured to select one of the first address translation circuit and the second address translation circuit based on the value of the bits stored in the address range register.
9. The memory controller according to claim 1, further comprising: a memory command queue configured to store a memory command associated with the memory request and the memory address; a memory bank status register configured to store status information of each of a plurality of memory banks in each of the first memory die and the second memory die; a command scheduler configured to adjust, based on the status information of the plurality of memory banks, an order in which memory commands and memory addresses stored in the memory command queue are processed; a command sequencer configured to output the memory commands and the memory addresses to the physical layer based on the order; a system bus interface circuit configured to receive the memory request via a system bus and to provide the memory request to the memory request queue; a read buffer configured to store read data sent from the memory device; and a write buffer configured to store write data to be sent to the memory device.
10. The memory controller according to claim 1, wherein a plurality of address mapping tables are stored in the memory controller, and the plurality of address mapping tables include the first address mapping table and the second address mapping table.
11. The memory controller according to claim 1, wherein a physical address having a first stack identifier is translated into a memory address through the first address mapping table, and a physical address having a second stack identifier is translated into a memory address through the second address mapping table.
12. A system-on-a-chip, comprising: a processor configured to generate a memory request; and a memory controller including a first address mapping table and a second address mapping table, the memory controller being configured to: select one of the first address mapping table and the second address mapping table based on bits of a physical address of the memory request; translate the physical address into a memory address of a memory device based on the selected one of the first address mapping table and the second address mapping table; and access one of a first memory die and a second memory die via a shared channel based on the memory address, wherein a first mapping of the first address mapping table is different from a second mapping of the second address mapping table, and the first memory die is a memory die that does not include a processing element (PE), and the second memory die is a processing-in-memory (PIM) die that includes the PE.
13. The system-on-a-chip according to claim 12, wherein each of the first memory die and the second memory die includes a plurality of memory banks, and each of the plurality of memory banks includes a plurality of memory cells, and the memory address includes: a stack identifier that identifies the first memory die and the second memory die; a bank address that identifies the plurality of memory banks; and a row address and a column address that identify the plurality of memory cells in each of the plurality of memory banks.
14. The system-on-a-chip according to claim 13, wherein the memory controller is further configured to: convert a first bit of the physical address into the stack identifier, the first bit being the bit of the physical address of the memory request on which the selection is based; convert second bits of the physical address into the bank address; convert third bits of the physical address into the row address; and convert fourth bits of the physical address into the column address.
15. The system-on-a-chip according to claim 14, wherein the first bit is the most significant bit of the physical address.
16. The system-on-a-chip according to claim 14, wherein the position of the second bits in the physical address differs depending on which of the first address mapping table and the second address mapping table is used.
17. The system-on-a-chip according to claim 14, wherein, in the first address mapping table, the second bits correspond to lower bits of the fourth bits, and, in the second address mapping table, at least one of the second bits corresponds to a high-order bit of the fourth bits.
18. The system-on-a-chip according to claim 12, wherein the memory controller includes an address translator, and the address translator includes: an address range register configured to store the bits of the physical address of the memory request; a first address translation circuit configured to translate the physical address into the memory address based on the first address mapping table; a second address translation circuit configured to translate the physical address into the memory address based on the second address mapping table; and a mapping selection circuit configured to select one of the first address translation circuit and the second address translation circuit based on the value of the bits stored in the address range register.
19. An electronic device, comprising: a memory device including a first memory die, a second memory die, and a shared channel for the first memory die and the second memory die; and a system-on-a-chip including: a processor configured to generate a memory request; and a memory controller configured to: select one of a first address mapping table and a second address mapping table based on bits of a physical address of the memory request; translate the physical address into a memory address of the memory device based on the selected one of the first address mapping table and the second address mapping table; and access one of the first memory die and the second memory die through the shared channel based on the memory address, wherein a first mapping of the first address mapping table is different from a second mapping of the second address mapping table, and the first memory die is a memory die that does not include a processing element (PE), and the second memory die is a processing-in-memory (PIM) die that includes the PE.
20. The electronic device according to claim 19, wherein the first memory die is different from the second memory die, the second memory die includes a plurality of first memory cells, and the first memory die includes a plurality of second memory cells.