Memory controller, memory access method, chip and electronic device

By introducing a shared control logic circuit and a command copier into the memory controller, the coordinated control of the zeroth virtual channel and the first virtual channel is achieved, which solves the problem of the large number of control logic circuits in the memory controller and reduces overhead and power consumption.

CN117951052B · Active · Publication Date: 2026-06-23 · HYGON INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HYGON INFORMATION TECH CO LTD
Filing Date
2024-01-31
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing memory controllers require separate control logic circuits for each virtual channel, resulting in a large number of control logic circuits, a large area footprint, and high power consumption, especially in the case of multiple channels and multiple memory controllers.

Method used

A shared control logic circuit and a command replicator are introduced. The shared control logic circuit processes memory access requests and replicates the request command according to preset conditions after the final arbitrator outputs, thereby realizing the coordinated control of the zeroth virtual channel and the first virtual channel and reducing the number of control logic circuits.

Benefits of technology

By using a shared control logic circuit and command replicator design, the overhead of the memory controller is reduced, the number of control logic circuits is decreased, and the overall chip overhead and power consumption are reduced.

✦ Generated by Eureka AI based on patent content.

[Figure: CN117951052B_ABST]

Abstract

Embodiments of the present application provide a memory controller, a memory access method, a chip and an electronic device. The memory controller comprises: a common control logic circuit configured to determine a request command corresponding to a memory access request and send the request command to a final arbiter; a command replicator configured to replicate the request command and send the replicated request command to the final arbiter when a command output by the final arbiter is a non-replicated request command and meets a preset replication condition; and the final arbiter configured to perform final arbitration on the sent command. If the final arbitration passes the command, the final arbiter outputs the request command and instructs transmission through a zeroth virtual channel. In at least one clock cycle thereafter, the final arbitration passes the corresponding replicated request command, and the final arbiter outputs the replicated request command and instructs transmission through a first virtual channel. The memory controller of the present application cooperatively controls the zeroth virtual channel and the first virtual channel through a set of control logic, thereby reducing the overhead of the memory controller.

Description

Technical Field

[0001] This application relates to the field of chip technology, specifically to a memory controller, a memory access method, a chip, and an electronic device.

Background Technology

[0002] To reduce the bandwidth loss and latency caused by the speed difference between the cache and the hard disk (e.g., non-volatile memory), memory (e.g., dynamic random access memory) can be placed between the cache and the hard disk.

[0003] As a crucial component for managing and controlling memory, the memory controller is responsible for coordinating, controlling, and executing tasks related to memory access operations (such as read or write operations). The overhead of the memory controller is a significant performance indicator, and how to improve the memory controller to reduce its overhead has become a pressing technical problem for those skilled in the art.

Summary of the Invention

[0004] In view of this, embodiments of this application provide a memory controller, a memory access method, a chip, and an electronic device to reduce the overhead of the memory controller.

[0005] To achieve the above objectives, the embodiments of this application provide the following technical solutions.

[0006] In a first aspect, embodiments of this application provide a memory controller that exchanges data with memory via a memory channel, the memory channel including a zeroth virtual channel and a first virtual channel; the memory controller includes: a shared control logic circuit, a final arbitrator, and a command copier;

[0007] The shared control logic circuit is shared by the zeroth virtual channel and the first virtual channel. The shared control logic circuit is used to determine the request command corresponding to the memory access request and send the request command to the final arbitrator.

[0008] The command copier is used to copy the request command when the command output by the final arbitrator is a non-copying request command and the request command meets the preset copying conditions, so as to obtain the corresponding copying request command; and to send the copying request command to the final arbitrator.

[0009] The final arbitrator is used to perform final arbitration on commands sent to the final arbitrator; if the command that passes the final arbitration is a non-copying request command, the request command is output and indicated to be transmitted through the zeroth virtual channel; and at least one clock cycle after the output of the non-copying request command, if the final arbitration passes the corresponding copying request command, the copying request command is output and indicated to be transmitted through the first virtual channel.

[0010] The non-copying request command comes from the shared control logic circuit, while the copying request command comes from the command copier.

[0011] In a second aspect, embodiments of this application provide a memory access method applied to the memory controller described in the first aspect above, wherein the memory controller and memory transmit data through a memory channel, the memory channel including a zeroth virtual channel and a first virtual channel; the method includes:

[0012] Determine the request command corresponding to the memory access request, wherein the request command corresponding to the memory access request participates in the final arbitration;

[0013] When the command output by the final arbitration is a non-copying request command and the request command meets the preset copying conditions, copy the request command to obtain the corresponding copying request command, wherein the copying request command participates in the final arbitration;

[0014] The command participating in the final arbitration is subject to final arbitration; if the command that passes the final arbitration is a non-copying request command, the request command is output and indicated to be transmitted through the zeroth virtual channel; and at least one clock cycle after the output of the non-copying request command, if the final arbitration passes the corresponding copying request command, the copying request command is output and indicated to be transmitted through the first virtual channel.

[0015] In a third aspect, embodiments of this application provide a chip, including: at least one processor core, at least one memory controller, a memory physical layer, and memory;

[0016] The at least one processor core is connected to the at least one memory controller; the memory physical layer includes at least one memory channel, which includes a zeroth virtual channel and a first virtual channel; the memory controller and the memory transmit data through the memory channel;

[0017] The memory controller is the memory controller described in the first aspect above.

[0018] In a fourth aspect, embodiments of this application provide an electronic device, including a chip as described in the third aspect above, or a memory controller as described in the first aspect above.

[0019] In this embodiment, the non-copying request command output by the final arbitrator (from the shared control logic circuit) is transmitted through the zeroth virtual channel. For a non-copying request command output by the final arbitrator that meets the preset copying conditions, the command copier can copy the request command; the final arbitrator can then output the corresponding copied request command, and indicate that it is to be transmitted through the first virtual channel, at least one clock cycle after outputting the non-copying request command. As can be seen, when the zeroth virtual channel is configured to transmit non-copy request commands and the first virtual channel to transmit copy request commands, the memory controller can perform coordinated control of the zeroth and first virtual channels. In other words, the memory controller can set up one shared control logic circuit for the zeroth and first virtual channels. By copying request commands with the command copier, outputting non-copy request commands on the zeroth virtual channel, and outputting copy request commands on the first virtual channel, the shared control logic circuit can treat the zeroth and first virtual channels as a single channel for control. This allows the memory controller to achieve coordinated control of the zeroth and first virtual channels through a single set of control logic, reducing the number of control logic circuits used by the memory controller and lowering its overhead.

Attached Figure Description

[0020] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of this application. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0021] Figure 1 is an example diagram of a chip provided in an embodiment of this application.

[0022] Figure 2 is an example diagram of a memory controller provided in an embodiment of this application.

[0023] Figure 3 is a flowchart of a memory access method provided in an embodiment of this application.

[0024] Figure 4A is another example diagram of a memory controller provided in an embodiment of this application.

[0025] Figure 4B is another flowchart of the memory access method provided in the embodiments of this application.

[0026] Figure 4C is another flowchart of the memory access method provided in the embodiments of this application.

[0027] Figure 4D is another flowchart of the memory access method provided in the embodiments of this application.

[0028] Figure 5A is yet another flowchart of the memory access method provided in the embodiments of this application.

[0029] Figure 5B is yet another flowchart of the memory access method provided in the embodiments of this application.

[0030] Figure 5C is another flowchart of the memory access method provided in the embodiments of this application.

[0031] Figure 6 is a flowchart illustrating the timing parameter counting of commands in an embodiment of this application.

[0032] Figure 7 is another example diagram of a memory controller provided in an embodiment of this application.

[0033] Figure 8 is a flowchart for determining the request command of a memory access request provided in an embodiment of this application.

[0034] Figure 9 is yet another example diagram of a memory controller provided in an embodiment of this application.

Detailed Implementation

[0035] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0036] The memory controller acts as a bridge between the processor core (such as the CPU core) and the memory. On the one hand, the memory controller can be connected to the system bus through an interface, thereby connecting to the processor core through the system bus. On the other hand, it can be connected to the physical layer (the physical layer connected to the memory controller can be called the memory physical layer) through an interface, thereby connecting to the memory through the memory physical layer. In addition, the memory controller can be controlled by the processor core to configure the mode register, perform memory access operations (such as read or write operations), and issue control commands (such as memory refresh, precharge, and other control commands).

[0037] To enable data transfer between the memory controller and the memory, the memory physical layer can have multiple memory channels for this data transfer. Taking HBM3 (High Bandwidth Memory 3) technology as an example, the memory controller can be called an HBM3 memory controller, the memory channel can be called an HBM3 channel, and the memory itself can be called HBM3 memory. One HBM3 memory (such as a DRAM memory) can support 16 HBM3 channels, and each HBM3 channel can have independent command and data interfaces. As an example, a DRAM (Dynamic Random Access Memory) die supports 4 HBM3 channels, so a DRAM memory stacked with 4 DRAM dies can support 16 HBM3 channels.

[0038] Memory channels (such as HBM3 channels) can be logically divided into zeroth virtual channels and first virtual channels using virtual channel mode. A memory channel can be divided into at least one zeroth virtual channel and at least one first virtual channel. For example, a memory channel can be logically divided into two virtual channels (one zeroth virtual channel and one first virtual channel). Data transfer supported by the memory channel can be handled through the divided virtual channels.

[0039] Without including the ECC (Error Checking and Correcting) bit, the data length that a virtual channel can support for transmission (i.e., the amount of data transmitted by a virtual channel through a single command) can be defined as the first data length. The total data length that the virtual channels included in the memory channel can support for transmission can be defined as the second data length (for example, the second data length is the total data length that the zeroth virtual channel and the first virtual channel included in the memory channel can support for transmission). The multiple of the second data length relative to the first data length can correspond to the number of virtual channels divided in a memory channel; for example, if a memory channel is divided into two virtual channels (one zeroth virtual channel and one first virtual channel), then the second data length is twice the first data length.

[0040] Taking HBM3 technology as an example, a single read or write operation in HBM3 is achieved by continuously operating on 8 columns of data in a specified row, with each column operating on 32 bits of data. In other words, the data width of a virtual channel refers to the number of bits of data bus a virtual channel has; for example, in HBM3, the data width of a virtual channel is 32 bits. The data length that a virtual channel can transmit refers to the amount of data that a virtual channel can transmit through a single column command. Since a single column command is implemented by continuously operating on 8 columns of data in a specified row, the data length that a virtual channel can transmit can be 32 bits × 8 = 32 bytes, where 1 byte = 8 bits. Based on this, without including ECC bits, the data length that a virtual channel can transmit can be 32 bytes; that is, one example of the first data length is 32 bytes. When an HBM3 channel is divided into a zeroth virtual channel and a first virtual channel, the second data length is twice the first data length, which can be 64 bytes.
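As a non-limiting illustration, the arithmetic in the preceding paragraph can be sketched in Python. The constant and function names are assumptions introduced here for illustration; only the numeric values (a 32-bit data width, 8 columns per column command, and two virtual channels per memory channel) come from the text.

```python
# Illustrative arithmetic only. Names are assumptions; values come from the text.
BITS_PER_BYTE = 8
VC_DATA_WIDTH_BITS = 32   # data width of one virtual channel in HBM3
COLUMNS_PER_COMMAND = 8   # one column command operates on 8 consecutive columns
VCS_PER_CHANNEL = 2       # one zeroth and one first virtual channel


def first_data_length_bytes(width_bits=VC_DATA_WIDTH_BITS,
                            columns=COLUMNS_PER_COMMAND):
    """Data moved by one virtual channel per column command, ECC bits excluded."""
    return width_bits * columns // BITS_PER_BYTE


def second_data_length_bytes(num_vcs=VCS_PER_CHANNEL):
    """Total data moved by all virtual channels of one memory channel."""
    return num_vcs * first_data_length_bytes()


print(first_data_length_bytes())   # 32
print(second_data_length_bytes())  # 64
```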

[0041] It should be noted that since virtual channels are logically divided channels of memory channels (such as HBM3 channels), virtual channels are also called pseudo channels.

[0042] For ease of understanding, Figure 1 shows an example diagram of a chip provided in an embodiment of this application. As shown in Figure 1, the chip may include a processor core 110, a memory controller 120, a memory physical layer 130, and memory 140.

[0043] The processor core 110 is the computing core of the processor, used for data calculation and program execution, including but not limited to the computing core of a CPU (Central Processing Unit) and the computing core of a GPU (Graphics Processing Unit). There may be one or more processor cores 110; the embodiments of this application do not limit the number.

[0044] The memory controller 120 and the processor core 110 can be connected via a system bus. For example, the interface of the memory controller 120 can be connected to the system bus, thereby connecting to the processor core 110 via the system bus. The system bus may include a command bus and a data bus, so that the memory controller 120 can receive memory access requests from the processor core 110 via the command bus and exchange data with the processor core 110 via the data bus. There may be one or more memory controllers 120; the embodiments of this application do not limit the number.

[0045] It's important to note that the processor core can send memory access requests to the memory controller to request read, write, and other memory access operations. These requests can be categorized into read requests and write requests. A read request is the memory access request corresponding to a read operation, used to request data to be read from memory and returned to the processor core; the data read from memory by a read request is called the read data, and correspondingly, the read data is the memory access data for the read request. A write request is the memory access request corresponding to a write operation, used to request write data to be written to memory; the data to be written to memory by a write request is called the write data, and correspondingly, the write data is the memory access data for the write request.

[0046] The memory physical layer 130 is responsible for the hardware-level interface and connection between the memory controller 120 and the memory 140, providing a physical connection for data transfer between them. The memory physical layer 130 may have memory channels 131 for data transfer between the memory controller and the memory, and the number of memory channels 131 may be one or more. By configuring at least one memory channel for a memory controller, a memory controller can control the configured at least one memory channel. In one example, HBM3 technology can support configuring 16 HBM3 channels for a memory (e.g., a DRAM memory), setting up at least 16 memory controllers, and each memory controller can control at least one HBM3 channel.

[0047] Memory channels can include zeroth virtual channels and first virtual channels. A virtual channel can be considered a logical path obtained by dividing a memory channel, used to transfer data between the memory controller and memory. As shown in Figure 1, taking the division of a memory channel 131 into two virtual channels as an example, a memory channel 131 may include a zeroth virtual channel 0 and a first virtual channel 1. The data length that a virtual channel supports for transmitting data can be a first data length (e.g., 32 bytes).

[0048] Memory 140 can be responsible for temporarily storing the data and programs required for the operation of processor core 110. Memory 140 can be in the form of DRAM (Dynamic Random Access Memory). There are no restrictions on the technology used in memory 140, including but not limited to HBM3, DDR (Double Data Rate), etc.

[0049] As can be seen, the memory physical layer contains memory channels (e.g., HBM3 channels) controlled by a memory controller (e.g., the HBM3 memory controller). These memory channels are divided into zeroth virtual channels and first virtual channels. The zeroth and first virtual channels have independent data buses but share a command bus. When the memory controller controls the memory channels, it needs to configure control logic to control the virtual channels within the memory channels.

[0050] One way the memory controller controls virtual channels is as follows:

[0051] When the request command and memory access data of a memory access request are transmitted through the virtual channel corresponding to that request, the memory controller can set up an independent control logic circuit for each virtual channel, so that each virtual channel is controlled independently. For example, the memory controller can include a zeroth control logic circuit that controls the zeroth virtual channel and a first control logic circuit that controls the first virtual channel.

[0052] It should be noted that the memory access request command refers to the request command generated by the memory controller based on the processor core's memory access request, such as the read command corresponding to a read request and the write command corresponding to a write request. When the memory access request indicates a virtual channel, the transmission of the memory access request command and memory access data through the virtual channel corresponding to the memory access request means that the memory access request command and memory access data are transmitted through the virtual channel indicated by the memory access request, and not through other virtual channels. It should be further noted that the virtual channel corresponding to the memory access request can be indicated by the virtual channel address of the memory access request (the virtual channel address of the memory access request can be carried in the physical address of the memory access request). If the virtual channel address of the memory access request belongs to the zeroth virtual channel, then the memory access request is a memory access request for the zeroth virtual channel; if the virtual channel address of the memory access request belongs to the first virtual channel, then the memory access request is a memory access request for the first virtual channel.
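As a non-limiting illustration, the selection of a virtual channel from the physical address of a memory access request can be sketched as follows. The text states only that the virtual channel address can be carried in the physical address; the bit position used below (`VC_ADDRESS_BIT`) and the function name are hypothetical and chosen purely for illustration.

```python
# Hypothetical sketch: the position of the virtual channel address bit is an
# assumption, not something specified in the text.
VC_ADDRESS_BIT = 5


def virtual_channel_of(physical_address: int, vc_bit: int = VC_ADDRESS_BIT) -> int:
    """Return 0 for the zeroth virtual channel, 1 for the first virtual channel."""
    return (physical_address >> vc_bit) & 1


print(virtual_channel_of(0x00))  # 0
print(virtual_channel_of(0x20))  # 1
```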

[0053] For ease of understanding, take as an example a memory access request sent by the processor core that corresponds to the zeroth virtual channel. Regardless of the length of the memory access data, the request command and memory access data of that request are transmitted through the zeroth virtual channel, and not through the first virtual channel. Similarly, if the memory access request sent by the processor core corresponds to the first virtual channel, then regardless of the length of the memory access data, the request command and memory access data of that request are transmitted through the first virtual channel, and not through the zeroth virtual channel.

[0054] Because the request command and memory access data of a memory access request are transmitted in the virtual channel indicated by the memory access request, if the data length of the memory access data is large (for example, greater than the first data length supported by a virtual channel), the request command and memory access data need to be transmitted multiple times in the virtual channel corresponding to the memory access request.

[0055] For example, suppose a virtual channel supports 32-byte data transfer, that is, a single virtual channel in the memory physical layer transfers 32 bytes at a time (as a virtual channel does in HBM3 technology). If a memory access request sent by the processor core carries a large amount of memory access data (64 bytes or more), for instance a 64-byte memory access request (a memory access request with 64 bytes of memory access data), then after the virtual channel corresponding to the memory access request is determined, the request command and the memory access data are transmitted twice consecutively on that virtual channel.

[0056] For example, when the memory access request is a write request, the memory controller needs to divide the 64-byte write request sent by the processor core into two 32-byte write commands. One 32-byte write command is used to write 32 bytes of write data to memory. Therefore, the memory controller continuously transmits two 32-byte write commands and their corresponding write data to memory through the virtual channel corresponding to the write request, thus writing 64 bytes of write data to memory. Similarly, when the memory access request is a read request, the memory controller needs to divide the 64-byte read request sent by the processor core into two 32-byte read commands. One 32-byte read command is used to read 32 bytes of read data from memory. Therefore, the memory controller continuously transmits two 32-byte read commands to memory through the virtual channel corresponding to the read request, and continuously retrieves two 32-byte read data entries returned from memory through the same virtual channel, thus reading 64 bytes of read data from memory.
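As a non-limiting illustration of the approach just described, the following sketch splits one memory access request into 32-byte commands that all stay on the request's own virtual channel. The `Command` type, the `split_request` function, and the assumption that the second command addresses the next 32 bytes are introduced here for illustration and are not taken from the text.

```python
from dataclasses import dataclass

FIRST_DATA_LENGTH = 32  # bytes per command on one virtual channel (from the text)


@dataclass(frozen=True)
class Command:
    kind: str             # "read" or "write"
    address: int          # assumed: starting byte address of this command
    length: int           # bytes moved by this command
    virtual_channel: int  # 0 or 1


def split_request(kind, address, length, virtual_channel):
    """Split one request into 32-byte commands on the same virtual channel."""
    if length <= 0 or length % FIRST_DATA_LENGTH != 0:
        raise ValueError("length must be a positive multiple of 32 bytes")
    return [
        Command(kind, address + offset, FIRST_DATA_LENGTH, virtual_channel)
        for offset in range(0, length, FIRST_DATA_LENGTH)
    ]


# A 64-byte write request for the zeroth virtual channel becomes two commands.
for cmd in split_request("write", 0x1000, 64, virtual_channel=0):
    print(cmd)
```

Both commands carry the same virtual channel, which is why this approach needs control logic per virtual channel.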

[0057] It should be noted that HBM3 technology performs a read or write operation as 8 consecutive read or write operations, 4 of which occur per clock cycle, so one read or write command occupies the data bus of its virtual channel for 2 clock cycles. The zeroth virtual channel and the first virtual channel have independent data buses but share a command bus. The minimum interval between read or write commands transmitted on the same virtual channel is 2 clock cycles. In other words, if a read or write command is transmitted on the zeroth virtual channel in the current clock cycle, another read or write command cannot be transmitted on the zeroth virtual channel in the first clock cycle following the current clock cycle; the next read or write command on the zeroth virtual channel can be transmitted no earlier than the second clock cycle following the current clock cycle. Therefore, the first clock cycle following the current clock cycle can be used to transmit a read or write command on the first virtual channel.
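As a non-limiting illustration, the timing rule above can be expressed as a small schedule check. The function name is hypothetical, and the sketch assumes as a simplification that the shared command bus carries at most one read or write command per clock cycle; the 2-cycle minimum interval on the same virtual channel comes from the text.

```python
MIN_SAME_VC_INTERVAL = 2  # clock cycles, from the text


def schedule_is_legal(schedule):
    """Check a list of (cycle, virtual_channel) pairs for read/write commands."""
    last_cycle = {}      # virtual channel -> cycle of its most recent command
    used_cycles = set()  # cycles already occupied on the shared command bus
    for cycle, vc in sorted(schedule):
        if cycle in used_cycles:
            return False  # assumed: one command per cycle on the command bus
        if vc in last_cycle and cycle - last_cycle[vc] < MIN_SAME_VC_INTERVAL:
            return False  # same virtual channel used again too soon
        used_cycles.add(cycle)
        last_cycle[vc] = cycle
    return True


# Alternating virtual channels keeps the shared command bus busy every cycle.
print(schedule_is_legal([(0, 0), (1, 1), (2, 0), (3, 1)]))  # True
# Back-to-back commands on the zeroth virtual channel violate the interval.
print(schedule_is_legal([(0, 0), (1, 0)]))                  # False
```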

[0058] As can be seen, in the above method, the memory access request command and memory access data are transmitted in the virtual channel corresponding to the memory access request, and not in other virtual channels. Therefore, when a memory channel is divided into the zeroth virtual channel and the first virtual channel, the memory controller needs to set up independent control logic circuits for each virtual channel in order to achieve independent control of each virtual channel.

[0059] However, assigning an independent control logic circuit to each virtual channel results in a large number of control logic circuits within the memory controller (for example, the memory controller needs two sets of control logic circuits, one for the zeroth virtual channel and one for the first virtual channel). This leads to significant overhead for the memory controller, such as a larger area and higher power consumption. Furthermore, if the memory supports a large number of memory channels and the chip has a large number of memory controllers (for example, HBM3 supports 16 HBM3 channels, each HBM3 channel includes two virtual channels, and the chip supports at least 16 HBM3 memory controllers), then, because each memory controller must set up an independent control logic circuit for each virtual channel, the overall chip overhead attributable to memory controllers increases further.

[0060] Based on this, embodiments of this application provide an improved memory controller design to reduce the overhead of the memory controller. Unlike the method where the request command and memory access data are transmitted in the virtual channel corresponding to the memory access request, embodiments of this application introduce a copy request command method. The command transmitted in the zeroth virtual channel is set to a non-copy request command, while the command transmitted in the first virtual channel is a copy request command. This allows for coordinated control of the zeroth and first virtual channels, enabling the memory controller to internally implement coordinated control of the zeroth and first virtual channels through a single set of control logic. This reduces the number of control logic circuits used by the memory controller and lowers its overhead. It should be noted that when memory channels are divided into zeroth and first virtual channels (the number of zeroth virtual channels may be one or more, and similarly, the number of first virtual channels may be one or more), the zeroth and first virtual channels can be adjusted by changing settings within the memory channels and are not fixed.
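As a non-limiting illustration of the overhead comparison, the count of control logic circuits under the two approaches can be sketched as follows. The function name is hypothetical, and the sketch assumes that each memory controller controls exactly one HBM3 channel with two virtual channels; the figure of 16 memory controllers comes from the example in the text.

```python
def control_logic_circuits(num_controllers, vcs_per_controller, shared):
    """Count control logic circuits across all memory controllers on a chip."""
    per_controller = 1 if shared else vcs_per_controller
    return num_controllers * per_controller


# Assumed configuration: 16 controllers, each with one channel of 2 virtual channels.
print(control_logic_circuits(16, 2, shared=False))  # 32
print(control_logic_circuits(16, 2, shared=True))   # 16
```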

[0061] Based on the above ideas, Figure 2 shows an example diagram of a memory controller provided in an embodiment of this application. As shown in Figure 2, the memory controller may include: a shared control logic circuit 210, a command copier 220, and a final arbitrator 230.

[0062] The shared control logic circuit 210 is a control logic circuit shared by the zeroth virtual channel and the first virtual channel in the memory controller. In this embodiment, the shared control logic circuit 210 can process memory access requests without distinguishing the virtual channel address of the memory access request, thereby determining the request command corresponding to the memory access request sent to the final arbitrator for final arbitration. As an optional implementation, the shared control logic circuit 210 may include at least one logic component for processing memory access requests.

[0063] As an optional main function, the shared control logic circuit 210 can be used to determine the request command corresponding to the memory access request and send the request command to the final arbitrator, wherein the memory access request may come from the processor core.

[0064] In the case where a copy request command is introduced in the embodiments of this application, a command copier 220 may be provided in the memory controller. The command copier 220 may be used to: copy the request command to obtain the corresponding copied request command when the command output by the final arbitrator is a non-copy request command and the request command meets the preset copy conditions; and send the copied request command to the final arbitrator.

[0065] In other words, the final arbitrator 230, as the device in the memory controller that performs the final arbitration of commands, receives at least the request commands provided by the shared control logic circuit 210 and the request commands copied by the command copier 220. For ease of explanation, in this embodiment, the request commands from the shared control logic circuit are referred to as non-copied request commands, and the request commands from the command copier are referred to as copied request commands.

[0066] As an optional implementation, when the zeroth virtual channel is set to transmit non-copied request commands and the first virtual channel is set to transmit copied request commands, the final arbitrator 230 can be used to:

[0067] perform final arbitration on the commands sent to the final arbitrator; if the command that passes the final arbitration is a non-copied request command (i.e., a request command from the shared control logic circuit), output the request command and indicate that it is transmitted through the zeroth virtual channel; and, at least one clock cycle after the non-copied request command is output, if the corresponding copied request command (i.e., the request command from the command copier) passes the final arbitration, output the copied request command and indicate that it is transmitted through the first virtual channel.

[0068] As can be seen, in this embodiment, a non-copied request command output by the final arbitrator (from the shared control logic circuit) is transmitted through the zeroth virtual channel. For a non-copied request command output by the final arbitrator that meets the preset copy conditions, the command copier can copy the request command; the final arbitrator can then output the corresponding copied request command, at least one clock cycle after outputting the non-copied request command, and transmit it through the first virtual channel. Thus, when the zeroth virtual channel is configured to transmit non-copied request commands and the first virtual channel to transmit copied request commands, the memory controller can perform coordinated control of the zeroth and first virtual channels. In other words, the memory controller can provide one shared control logic circuit for the zeroth and first virtual channels. By copying request commands with the command copier, transmitting non-copied request commands through the zeroth virtual channel, and transmitting copied request commands through the first virtual channel, the shared control logic circuit can treat the zeroth and first virtual channels as a single channel for control. This allows the memory controller to achieve coordinated control of the two virtual channels through a single set of control logic, reducing the number of control logic circuits used by the memory controller and lowering its overhead.

[0069] For ease of understanding, based on the structure of the memory controller shown in Figure 2, Figure 3 shows an exemplary flowchart of the memory access method provided in an embodiment of this application. With reference to Figure 2 and Figure 3, the method flow may include the following steps.

[0070] In step S310, the request command corresponding to the memory access request is determined, wherein the request command corresponding to the memory access request participates in the final arbitration.

[0071] In an optional implementation, step S310 can be executed by a shared control logic circuit.

[0072] In an optional implementation, the shared control logic circuitry can determine the request command corresponding to the memory access request through initial arbitration. For example, the shared control logic circuitry can determine the command participating in the initial arbitration of the memory access request based at least on the page hit status of the memory access request, and then determine the command that passes the initial arbitration from the commands participating in the initial arbitration as the request command corresponding to the memory access request.

[0073] It should be noted that memory can comprise multiple blocks (banks). A block can be viewed as a data block of a predetermined size within memory; for example, a block can be an array of rows and columns. A row, as one part of a memory address (e.g., a DRAM address), corresponds to a group of storage units arranged in a row within memory, also known as a page. The number of storage units contained in a row can be fixed (by the memory protocol specification), and this number corresponds to the row size. A column, as another part of a memory address (e.g., a DRAM address), corresponds to the location of a storage unit within a row. Therefore, a specific storage unit can be located in memory using both the row address and the column address.

[0074] It should be further noted that memory blocks have buffers, and reading and writing data within a block requires passing through the block buffer. Since the block buffer loads only one row of the block at a time, when a row is accessed (e.g., read or written) and the requested row is not loaded in the buffer, a precharge operation is needed to write the data in the buffer back to the previously loaded row, and an activation operation then loads the requested row into the buffer before data can be read or written. Therefore, based on the row loading status of the block buffer, the page hit situation of a memory access request can be categorized as page hit, page miss, or page conflict. In common usage, a page hit means the requested row is already loaded in the buffer, a page miss means no row is loaded, and a page conflict means a different row is loaded.

[0075] As an optional implementation, embodiments of this application can generate different commands for participating in the initial arbitration of memory access requests based on different page hit situations such as page hit, page miss, and page conflict; then, the command that passes the initial arbitration is determined from the commands participating in the initial arbitration, which is used as the request command corresponding to the memory access request and sent to the final arbitrator for final arbitration.
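The page-hit classification described above can be sketched as follows. This is a minimal illustration rather than the circuit of this application. The definitions of page hit, page miss, and page conflict follow common DRAM usage, and the mapping from page state to the next command is an assumption. The names `PageState`, `classify_page_state`, and `candidate_command`, and the mnemonics `RD` and `WR`, are hypothetical.

```python
from enum import Enum
from typing import Optional


class PageState(Enum):
    HIT = "page hit"
    MISS = "page miss"
    CONFLICT = "page conflict"


def classify_page_state(open_row: Optional[int], requested_row: int) -> PageState:
    """Classify a request by the row currently loaded in the block buffer."""
    if open_row is None:
        return PageState.MISS        # no row is loaded in the buffer
    if open_row == requested_row:
        return PageState.HIT         # the requested row is already loaded
    return PageState.CONFLICT        # a different row is loaded


def candidate_command(state: PageState, is_write: bool) -> str:
    """Pick the next command entering initial arbitration (assumed mapping)."""
    if state is PageState.HIT:
        return "WR" if is_write else "RD"   # column command can go directly
    if state is PageState.MISS:
        return "ACT"                        # load the requested row first
    return "PCHGpb"                         # close the other row first
```

Under these assumptions, a page conflict needs a precharge and then an activation before the read or write can be issued, which matches the sequence described in paragraph [0074].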

[0076] In step S311, the command output by the final arbitration is monitored. If the command output by the final arbitration is a non-copied request command and the request command meets the preset copy conditions, the request command is copied to obtain the corresponding copied request command. The copied request command participates in the final arbitration.

[0077] In an optional implementation, step S311 can be executed by the command copier.

[0078] In an optional implementation, based on the array of rows and columns formed by blocks in memory, the request commands can be divided into row commands that control rows and column commands that perform memory access operations on columns.

[0079] It should be noted that the two virtual channels (the zeroth virtual channel and the first virtual channel) of a memory channel can share the command bus and address bus but have independent data buses. Memory blocks are organized in rows and columns. The command bus can include a row command interface and a column command interface. The row command interface can transmit row commands, which are any commands other than read and write commands. The column command interface can transmit column commands, which are only read and write commands, such as the column address strobe (CAS) command for reading data and the column address strobe command for writing data.

[0080] As an optional implementation, column commands can be used to perform memory access operations (such as read or write operations) on columns in the row specified by the memory access request, including but not limited to: column commands for read operations (i.e., read commands) and column commands for write operations (i.e., write commands). Column commands, also known as column strobe commands, are used to select columns in a row to read or write data to the selected columns. For example, a column strobe command can select columns in a block's row, thereby allowing concurrent reading or writing of data in a specified starting column and subsequent columns.

[0081] As an optional implementation, row commands can be considered as any command other than column commands such as read commands and write commands. Row commands can be used to control at least the block where the row pointed to by the memory access request is located, including but not limited to any of the following: row activation (ACT) command, per bank pre-charge (PCHGpb) command, refresh command, RFMab (All Bank Refresh Management) command, RFMpb (Per Bank Refresh Management) command, etc.

[0082] It should be noted that the row activation command, also known as the row address strobe (RAS) command, is used to load a row into the block buffer. The precharge command is used to precharge the block corresponding to the row; precharge commands can be divided into per-bank precharge (PCHGpb) commands and all-banks precharge (PCHGab) commands. A per-bank precharge command precharges a single block of memory, while an all-banks precharge command precharges all blocks of memory. The refresh command is used to refresh memory to prevent data loss; for example, DRAM and similar types of memory leak charge and must be refreshed periodically. Refresh commands can be divided into all-banks refresh (REFab) commands and per-bank refresh (REFpb) commands. An all-banks refresh command refreshes all blocks in memory simultaneously, while a per-bank refresh command refreshes memory block by block, with each block receiving the refresh command independently. Compared with the all-banks refresh command, the per-bank refresh command allows a specific block or a subset of blocks to be refreshed more flexibly.
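The split between row commands and column commands can be written down as a small lookup. This is a sketch only: the row command mnemonics are those named in the text, while `RD` and `WR` for the read and write commands and the function name `command_kind` are hypothetical labels.

```python
# Column commands are only read and write commands (mnemonics assumed).
COLUMN_COMMANDS = frozenset({"RD", "WR"})

# Row commands named in the text; the list is not exhaustive.
ROW_COMMANDS = frozenset(
    {"ACT", "PCHGpb", "PCHGab", "REFab", "REFpb", "RFMab", "RFMpb"}
)


def command_kind(mnemonic: str) -> str:
    """Return "column" for read/write commands and "row" for the listed others."""
    if mnemonic in COLUMN_COMMANDS:
        return "column"
    if mnemonic in ROW_COMMANDS:
        return "row"
    raise ValueError(f"unknown command mnemonic: {mnemonic!r}")
```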

[0083] Based on the classification of request commands into row commands and column commands, as an optional implementation, the non-copied row commands output by the final arbitrator (i.e., row commands from the shared control logic circuit) can be copied. Row commands control the block where the row pointed to by the memory access request is located and therefore affect the page state of the block, so copying them keeps the page states of the zeroth virtual channel and the first virtual channel consistent. In other words, when the command output by the final arbitrator is a non-copied row command, this embodiment can copy the row command to obtain the corresponding copied row command, ensuring that the zeroth virtual channel (transmitting the non-copied row command) and the first virtual channel (transmitting the copied row command) transmit the same row command. In addition, if the column command output by the final arbitrator originates from a memory access request of the first data length (e.g., 32 bytes), the column command does not carry an automatic precharge instruction, which also keeps the page states of the zeroth virtual channel and the first virtual channel consistent.
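The effect of copying row commands on page state can be illustrated with a toy model. In this sketch, each virtual channel tracks which row is open in each block. This bookkeeping and the names `apply_row_command` and `issue_row_command` are assumptions made for illustration, and only the ACT, PCHGpb, and PCHGab commands are modelled.

```python
from typing import Dict, Optional

OpenRows = Dict[int, Optional[int]]  # block index -> open row (None if closed)


def apply_row_command(open_rows: OpenRows, mnemonic: str, block: int,
                      row: Optional[int] = None) -> None:
    """Update one virtual channel's page state for a row command."""
    if mnemonic == "ACT":
        open_rows[block] = row
    elif mnemonic == "PCHGpb":
        open_rows[block] = None
    elif mnemonic == "PCHGab":
        for b in open_rows:
            open_rows[b] = None


def issue_row_command(vc0: OpenRows, vc1: OpenRows, mnemonic: str, block: int,
                      row: Optional[int] = None) -> None:
    """Send the non-copied row command on VC0, then its copy on VC1."""
    apply_row_command(vc0, mnemonic, block, row)  # non-copied command
    apply_row_command(vc1, mnemonic, block, row)  # copied command, a later cycle
```

Because both channels see the same sequence of row commands, their page state tables stay equal after every pair, which is what lets one set of control logic serve both.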

[0084] As an optional implementation, column commands perform memory access operations that read or write columns in the row pointed to by the memory access request, and a single virtual channel supports transmitting data of the first data length (e.g., 32 bytes). If the command output by the final arbitrator is a non-copied column command, the data length of the corresponding memory access request is the first data length (e.g., 32 bytes), and the virtual channel indicated by the memory access request is the zeroth virtual channel, then the column command does not need to be copied in this embodiment, because the zeroth virtual channel is itself set to transmit non-copied request commands. If the final arbitrator outputs a non-copied column command, the data length of the corresponding memory access request is the first data length (e.g., 32 bytes), and the virtual channel indicated by the memory access request is the first virtual channel, then the column command output by the final arbitrator must be discarded, because the first virtual channel is set to transmit copied request commands rather than non-copied request commands. At the same time, the command copier must copy the column command so that the copied column command is transmitted through the first virtual channel, thereby satisfying the memory access request of the first data length (e.g., 32 bytes).

[0085] In a further optional implementation, to distinguish column commands that need to be discarded from those that actually need to be sent to memory, the final arbitrator can mark each output column command with real or false marking information. If the request command output by the final arbitrator is a non-copied column command, the data length of the corresponding memory access request is the first data length, and the virtual channel indicated by that memory access request is the first virtual channel (which conflicts with the setting that the first virtual channel transmits copied request commands), then the final arbitrator can mark the column command as false when outputting it. The column command is then not actually sent to memory; instead, the corresponding copied column command, marked as real, is sent to memory to satisfy the memory access request.

[0086] Except for marking the column commands output by the final arbitrator as false in the aforementioned situations, the final arbitrator can mark the output column commands as true in all other cases. Based on this, embodiments of this application can set conditions for marking column commands as false. Thus, when the request command output by the final arbitrator is a column command and the conditions are met, the output column command is marked as false. These conditions may include: the column command output by the final arbitrator is a non-replicated column command, the data length of the memory access request corresponding to the column command is a first data length, and the virtual channel indicated by the memory access request corresponding to the column command is a first virtual channel. Correspondingly, when the request command output by the final arbitrator is a column command and the above conditions are not met, the output column command is marked as true.
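The marking rule in the preceding paragraphs reduces to a single predicate. The sketch below assumes the 32-byte example value for the first data length. The function name `is_marked_real` and its parameters are hypothetical.

```python
FIRST_DATA_LENGTH = 32  # bytes; the example value used in the text


def is_marked_real(is_copied: bool, data_length: int, virtual_channel: int) -> bool:
    """Return True if an output column command is marked real, False if false.

    A column command is marked false only when it is non-copied, comes from a
    request of the first data length, and that request indicates the first
    virtual channel. Every other column command is marked real.
    """
    marked_false = (
        not is_copied
        and data_length == FIRST_DATA_LENGTH
        and virtual_channel == 1
    )
    return not marked_false
```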

[0087] For example, if the final arbitrator outputs a non-copied column command, the data length of the corresponding memory access request is the first data length (e.g., 32 bytes), and the virtual channel indicated by that memory access request is the zeroth virtual channel (consistent with the setting that the zeroth virtual channel transmits non-copied request commands), then the final arbitrator can mark the column command as real when outputting it, so that the column command is actually sent to memory.

[0088] It should be noted that, in the optional implementation, if the column command output by the final arbiter originates from a memory access request of the first data length, the column command does not carry an automatic precharge instruction; for example, if the read or write command output by the final arbiter is triggered by a 32-byte memory access request, the read or write command cannot carry an automatic precharge instruction. If the column command output by the final arbiter originates from a memory access request of the second data length, the column command may or may not carry an automatic precharge instruction; for example, if the read or write command output by the final arbiter is triggered by a 64-byte memory access request, the read or write command may or may not carry an automatic precharge instruction.

[0089] It should be noted that if the request command output by the final arbitrator is a non-copied column command and the data length of the corresponding memory access request is the second data length (e.g., 64 bytes), the virtual channel indicated by the memory access request (i.e., the virtual channel address of the memory access request) is disregarded. The final arbitrator can first output the column command marked as real and indicate transmission through the zeroth virtual channel; the command copier then copies this column command, so that the final arbitrator can output the copied column command, also marked as real, and indicate transmission through the first virtual channel. In other words, when the command output by the final arbitrator is a non-copied column command that is marked as real and originates from a memory access request of the second data length, the command copier needs to copy it. The non-copied column command is transmitted through the zeroth virtual channel and the corresponding copied column command through the first virtual channel; since each virtual channel supports data transmission of the first data length, this satisfies the memory access request of the second data length.

[0090] Based on the above description, if the command output by the final arbitrator is a non-copied column command (i.e., a column command from the shared control logic circuit), the data length of the corresponding memory access request is the first data length, and the virtual channel indicated by that memory access request is the zeroth virtual channel, then the column command performs the read or write operation of a memory access request of the first data length and is transmitted on the zeroth virtual channel, which is itself set to transmit non-copied request commands. The column command therefore does not need to be copied. In all other cases, the command copier can copy the command output by the final arbitrator. In other words, when the command output by the final arbitrator is a non-copied request command other than a column command that is marked as real and originates from a memory access request of the first data length, the embodiments of this application can copy it to obtain the corresponding copied request command. For example, when the final arbitrator outputs a non-copied request command other than a real-marked read or write command originating from a 32-byte memory access request, the command copier can copy the request command to obtain the corresponding copied request command.
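The preset copy condition summarized above can be sketched as a predicate over the command that the final arbitrator outputs. The `OutputCommand` record and its field names are hypothetical, and the first data length is taken as the 32-byte example value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputCommand:
    kind: str                 # "row" or "column"
    is_copied: bool           # True if it came from the command copier
    data_length: int = 0      # bytes; only meaningful for column commands
    marked_real: bool = True  # only meaningful for column commands


def should_copy(cmd: OutputCommand, first_data_length: int = 32) -> bool:
    """Decide whether the command copier copies an output command."""
    if cmd.is_copied:
        return False  # only non-copied request commands are copied
    exempt = (
        cmd.kind == "column"
        and cmd.marked_real
        and cmd.data_length == first_data_length
    )
    return not exempt
```

With this predicate, every non-copied row command is copied, and the only non-copied column command left uncopied is the real-marked one from a request of the first data length.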

[0091] In step S312, final arbitration is performed on the commands participating in the final arbitration; if the command that passes the final arbitration is a non-copied request command, the request command is output with an indication that it is transmitted through the zeroth virtual channel; and, at least one clock cycle after the non-copied request command is output, if the corresponding copied request command passes the final arbitration, the copied request command is output with an indication that it is transmitted through the first virtual channel.

[0092] In an optional implementation, step S312 can be performed by the final arbitrator. In a further optional implementation, a copied request command that has been output by the final arbitrator can be deleted from the command copier. For example, the command copier can save each copied request command that it sends to the final arbitrator for final arbitration, and delete a copied request command from the saved commands once the final arbitrator has output it.
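The save-then-delete bookkeeping of the command copier could look like the sketch below. The class layout, the method names, and the use of a string identifier for each command are assumptions made for illustration.

```python
from typing import List, Tuple


class CommandCopier:
    """Toy model of the copier's storage of copied request commands."""

    def __init__(self) -> None:
        self._pending: List[str] = []

    def copy(self, command_id: str) -> str:
        """Save a copy and return it as the command sent to final arbitration."""
        self._pending.append(command_id)
        return command_id

    def on_final_output(self, command_id: str) -> None:
        """Delete a copied command once the final arbitrator has output it."""
        if command_id in self._pending:
            self._pending.remove(command_id)

    def pending(self) -> Tuple[str, ...]:
        """Copied commands still waiting to pass final arbitration."""
        return tuple(self._pending)
```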

[0093] As an optional implementation, the request commands are divided into row commands and column commands. Figure 4A shows another example diagram of the memory controller provided in the embodiments of this application. With reference to Figure 2 and Figure 4A, the final arbitrator 230 in the memory controller may include a column command arbitrator 41 and a row command arbitrator 42, where the column command arbitrator performs final arbitration on column commands and the row command arbitrator performs final arbitration on row commands.

[0094] The following paragraphs introduce optional implementations of the final arbitrator from two perspectives: final arbitration of column commands and final arbitration of row commands.

[0095] As an optional implementation, the column command arbitrator can be used to: perform final arbitration on the column commands sent to the final arbitrator; if the column command that passes the final arbitration is a non-copied column command, output the column command and indicate that it is transmitted through the zeroth virtual channel; and, if the output non-copied column command has a corresponding copied column command, pass the corresponding copied column command in final arbitration at least one clock cycle after the non-copied column command is output, output the copied column command, and indicate that it is transmitted through the first virtual channel.

[0096] As an optional implementation, the non-copied column commands output by the column command arbitrator may fall into the following cases:

[0097] Case 1: The non-copied column command originates from a memory access request of the first data length (i.e., the data length of the memory access request corresponding to the non-copied column command is the first data length), and the virtual channel indicated by the memory access request is the zeroth virtual channel (i.e., the virtual channel address of the memory access request belongs to the zeroth virtual channel). For example, the non-copied column command originates from a 32-byte memory access request, and the virtual channel address carried by the physical address of the memory access request belongs to the zeroth virtual channel.

[0098] Case 2: The non-copied column command originates from a memory access request of the first data length, and the virtual channel indicated by the memory access request is the first virtual channel. For example, the non-copied column command originates from a 32-byte memory access request, and the virtual channel address carried by the physical address of the memory access request belongs to the first virtual channel.

[0099] Case 3: The non-copied column command originates from a memory access request of the second data length, and the virtual channel indicated by the memory access request is the zeroth virtual channel or the first virtual channel. For example, the non-copied column command originates from a 64-byte memory access request, and the virtual channel address carried by the physical address of the memory access request belongs to the zeroth virtual channel or the first virtual channel. It should be noted that when the data length of the memory access request is the second data length, this embodiment does not consider the virtual channel address of the memory access request. Therefore, memory access requests of the second data length are grouped into Case 3 regardless of whether they indicate the zeroth or the first virtual channel.

[0100] As an optional implementation, the command copier copies the request command when the command output by the final arbitrator is a non-copied request command other than a column command that is marked as real and originates from a memory access request of the first data length. A non-copied column command that is marked as real and originates from a memory access request of the first data length corresponds to Case 1 above, so this embodiment does not perform command copying in Case 1 but performs it in Cases 2 and 3. In other words, when the column command arbitrator outputs a non-copied column command that is not a real-marked column command originating from a memory access request of the first data length (i.e., the column command is marked as false, or the data length of the corresponding memory access request is not the first data length), the column command is copied to obtain the corresponding copied column command.
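The three cases and the copy decision for each can be sketched as follows. The sketch assumes the example values of 32 bytes and 64 bytes for the first and second data lengths. The function names `classify_case` and `copies_in_case` are hypothetical.

```python
def classify_case(data_length: int, virtual_channel: int) -> int:
    """Map a non-copied column command's originating request to Case 1, 2 or 3."""
    if data_length == 64:
        return 3  # second data length: the virtual channel address is ignored
    if data_length != 32:
        raise ValueError("data length must be 32 or 64 bytes in this sketch")
    if virtual_channel == 0:
        return 1
    if virtual_channel == 1:
        return 2
    raise ValueError("virtual channel must be 0 or 1")


def copies_in_case(case: int) -> bool:
    """The command copier copies the column command in Cases 2 and 3 only."""
    return case in (2, 3)
```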

[0101] In an optional implementation, for Case 1, Figure 4B shows another exemplary flowchart of the memory access method provided in the embodiments of this application. With reference to Figure 4B, this method flow can be implemented by the column command arbitrator and can include the following steps.

[0102] In step S410, the column command arbitrator performs final arbitration on the column commands sent to the final arbitrator.

[0103] The column commands sent to the final arbitrator may be non-copied column commands (from the shared control logic circuit) or copied column commands (from the command copier). In an optional implementation, the column command arbitrator can perform final arbitration on the column commands sent to the final arbitrator according to a certain arbitration strategy, such as a strategy based on command priority and the order in which the commands were sent. In an optional implementation, since a copied request command necessarily has a corresponding non-copied request command that the final arbitrator has already output, the copied request command can have the highest priority during final arbitration. This ensures that after the non-copied request command is transmitted through the zeroth virtual channel, the corresponding copied request command can be transmitted immediately through the first virtual channel, keeping the states of the zeroth and first virtual channels consistent.
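One possible reading of this arbitration strategy is sketched below: copied commands win first, then the higher priority, then the earlier arrival. The text only says that arbitration may be based on command priority and sending order, so the tie-breaking order, the `Candidate` record, and its fields are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    is_copied: bool  # True if sent by the command copier
    priority: int    # larger value = more urgent (assumed convention)
    arrival: int     # smaller value = sent earlier


def final_arbitrate(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick the command that passes final arbitration, or None if idle."""
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda c: (not c.is_copied, -c.priority, c.arrival),
    )
```

Giving copied commands the first sort key means a pending copy is always output before any new non-copied command, so the first virtual channel follows the zeroth without other commands in between.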

[0104] In step S411, if the column command arbitrator arbitrates a non-copied column command, and the data length of the memory access request corresponding to the column command is the first data length, and the virtual channel indicated by the memory access request is the zeroth virtual channel, then the column command whose virtual channel address belongs to the zeroth virtual channel and is marked as real is output in the current clock cycle.

[0105] In an optional implementation, if the column command arbitrator arbitrates a non-copied column command and the corresponding memory access request is 32 bytes, then when the virtual channel address of the memory access request belongs to the zeroth virtual channel (which is set to transmit non-copied column commands), the column command arbitrator can output, in the current clock cycle, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as real (this column command does not carry an automatic precharge instruction).

[0106] It should be noted that when the non-copied column command arbitrated by the column command arbitrator originates from a memory access request of the first data length (e.g., 32 bytes) and the virtual channel address of the memory access request belongs to the zeroth virtual channel, the column command arbitrator outputs, in the current clock cycle, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as real. The column command can therefore be sent to memory through the zeroth virtual channel, transmitting the memory access data of the first data length and satisfying the memory access request. Consequently, the command copier does not need to copy the column command, and the first virtual channel does not need to transmit a corresponding copied column command.

[0107] Based on this, in the embodiments of this application, the command copier can occupy the final arbitrator in the next clock cycle while the final arbitrator outputs no column command, which ensures data integrity. That is, when the column command arbitrator outputs a non-copied column command, the data length of the corresponding memory access request is the first data length, and the virtual channel indicated by the memory access request is the zeroth virtual channel, the final arbitrator is occupied by the command copier in the next clock cycle but outputs no column command in that cycle, so that the first virtual channel does not transmit a copied column command. If the column command arbitrator instead output the corresponding copied column command in the next clock cycle and indicated transmission through the first virtual channel, a read operation would read data redundantly and waste power, and a write operation could corrupt data already written to memory and cause write errors.

[0108] As an example, when the arbitrated column command comes from the shared control logic circuit, the data length of the corresponding memory access request is 32 bytes, and the virtual channel address of the memory access request belongs to the zeroth virtual channel, the column command arbitrator can output, in the current clock cycle, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as real (this column command does not carry an automatic precharge instruction). In the next clock cycle, the command copier can occupy the final arbitrator, but the final arbitrator (specifically, its column command arbitrator) does not output any column command.

[0109] In an optional implementation, for Case 2, Figure 4C shows an exemplary flowchart of the memory access method provided in the embodiments of this application. With reference to Figure 4C, the method flow may include the following steps.

[0110] In step S420, the column command arbitrator performs final arbitration on the column commands sent to the final arbitrator.

[0111] In step S421, if the column command that passes final arbitration by the column command arbitrator is a non-copied column command, the data length of the corresponding memory access request is the first data length, and the virtual channel indicated by the memory access request is the first virtual channel, then in the current clock cycle the column command arbitrator outputs a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as false.

[0112] It should be noted that column commands marked as real can be sent to memory after being output by the final arbitrator, while column commands marked as false must be discarded (i.e., not sent to memory) after being output by the final arbitrator.

[0113] In step S422, the command copier copies the non-copied column command output by the column command arbitrator to obtain the corresponding copied column command, and sends the copied column command to the final arbitrator.

[0114] When a non-copied column command arbitrated by the column command arbitrator originates from a memory access request of the first data length (e.g., 32 bytes) and the virtual channel address of the memory access request belongs to the first virtual channel, the non-copied column command would be transmitted through the zeroth virtual channel as configured, which is inconsistent with the first virtual channel indicated by the memory access request. Therefore, the column command arbitrator can output, in the current clock cycle, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as false, so that this non-copied column command is discarded on the basis of its false marking. At the same time, the command copier can copy the non-copied column command output by the column command arbitrator and send the corresponding copied column command to the final arbitrator, so that the final arbitrator can output the copied column command, marked as real, in the next clock cycle and indicate that it is transmitted through the first virtual channel, satisfying the memory access request of the first data length.

[0115] In step S423, in the next clock cycle, the column command arbitrator passes the corresponding replicated column command in final arbitration and outputs it as a replicated column command whose virtual channel address belongs to the first virtual channel and which is marked as real.

[0116] For example, when the column command that passes final arbitration comes from the shared control logic circuit, the data length of the corresponding memory access request is 32 bytes, and the virtual channel address of the memory access request belongs to the first virtual channel, the column command arbitrator can output, in the current clock cycle, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as false. The command replicator can then copy the column command output by the column command arbitrator and send the corresponding copied column command to the column command arbitrator. In the next clock cycle, the column command arbitrator can arbitrate the copied column command from the command replicator and output a copied column command whose virtual channel address belongs to the first virtual channel and which is marked as real (this column command does not carry an automatic precharge instruction), thereby meeting the memory access requirement of the 32-byte data length.

[0117] As can be seen, for a 32-byte memory access request, this embodiment uses the virtual channel address of the memory access request; that is, the virtual channel address of the 32-byte memory access request determines the virtual channel on which the column command is actually sent. At the same time, this embodiment sets the principle that the zeroth virtual channel transmits non-copied request commands and the first virtual channel transmits copied request commands. Accordingly, when the column command arbitrator outputs a non-copied column command for a 32-byte memory access request and the virtual channel address indicated by the memory access request belongs to the first virtual channel (for example, the column command provided by the shared control logic circuit to the column command arbitrator has a virtual channel address belonging to the first virtual channel), the column command arbitrator can, following the principle that non-copied column commands are transmitted through the zeroth virtual channel, output a non-copied column command whose virtual channel address belongs to the zeroth virtual channel but which is marked as false, indicating that the output column command is invalid and must be discarded. Subsequently outputting the corresponding copied column command, which belongs to the first virtual channel and is marked as real, satisfies the memory access requirement of the memory access request while ensuring that the zeroth virtual channel and the first virtual channel transmit commands according to the set principle.

[0118] In an optional implementation, for case three, Figure 4D shows an exemplary flowchart of another memory access method provided in an embodiment of this application. Referring to Figure 4D, the method process may include the following steps.

[0119] In step S430, the column command arbitrator performs final arbitration on the column commands sent to the final arbitrator.

[0120] In step S431, if the command that passes final arbitration by the column command arbitrator is a non-copied column command and the data length of the corresponding memory access request is the second data length, then in the current clock cycle the column command arbitrator outputs a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as real.

[0121] In step S432, the command replicator copies the non-replicated column commands output by the column command arbitrator to obtain the corresponding replicated column commands, and sends the replicated column commands to the final arbitrator.

[0122] When a non-copied column command arbitrated by the column command arbitrator originates from a memory access request of the second data length (e.g., 64 bytes), this embodiment does not consider the virtual channel address indicated by the memory access request. Instead, based on the principle that the zeroth virtual channel transmits non-copied request commands and the first virtual channel transmits copied request commands, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as real is first output in the current clock cycle. The command replicator then copies the column command output by the column command arbitrator, and the column command arbitrator can output the corresponding copied column command (virtual channel address belonging to the first virtual channel, marked as real) in the next clock cycle. In this way, the non-copied column command output in the current clock cycle (zeroth virtual channel, marked as real) and the corresponding copied column command output in the next clock cycle (first virtual channel, marked as real) together satisfy the memory access requirement of the memory access request of the second data length.

[0123] In step S433, in the next clock cycle, the column command arbitrator arbitrates the corresponding copied column command and outputs it as a copied column command whose virtual channel address belongs to the first virtual channel and which is marked as real.

[0124] As an optional implementation, the column command arbitrator can, in the next clock cycle, prohibit column commands from the common control logic circuit from participating in the final arbitration (i.e., prohibit non-replicated column commands from participating in the final arbitration).

[0125] For example, when the arbitrated column command comes from a shared logic circuit and the data length of the memory access request corresponding to the column command is 64 bytes, the column command arbitrator does not consider the virtual channel address indicated by the memory access request. Instead, it outputs a column command in the current clock cycle whose virtual channel address belongs to the zeroth virtual channel and is marked as real (this command may have an automatic precharge instruction). Thus, the command replicator can copy the column command output by the column command arbitrator and send the corresponding copied column command to the column command arbitrator. Furthermore, in the next clock cycle, the column command arbitrator can prohibit column commands from the shared control logic circuit from participating in the final arbitration, and arbitrate the column command copied by the command replicator. It then outputs a column command whose virtual channel address belongs to the first virtual channel and is marked as real (this column command may have an automatic precharge instruction to maintain consistency with the column command transmitted by the zeroth virtual channel output by the column command arbitrator in the previous clock cycle) to meet the memory access requirement of 64 bytes of data length.

[0126] Based on the arbitration scheduling scheme for column commands provided in the embodiments of this application, column commands (for read operations or for write operations) can, as an example, be processed as follows:

[0127] When the column command arbitrator arbitrates a column command from the shared control logic circuit, if the data length of the memory access request corresponding to the column command is 64 bytes, the column command arbitrator can output, in the current clock cycle, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as real. The command replicator can then copy the column command output by the column command arbitrator and send the copied column command to the column command arbitrator. In the next clock cycle, the column command arbitrator arbitrates the column command copied by the command replicator and outputs a copied column command whose virtual channel address belongs to the first virtual channel and which is marked as real.

[0128] When the column command arbitrator arbitrates a column command from the shared control logic circuit, if the data length of the memory access request corresponding to the column command is 32 bytes, then when the virtual channel address of the memory access request belongs to the zeroth virtual channel, the column command arbitrator can output, in the current clock cycle, a column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as real. The command replicator does not replicate the column command output by the column command arbitrator but still occupies the final arbitrator in the next clock cycle, during which the final arbitrator does not output any column command.

[0129] When the column command arbitrator arbitrates a column command from the shared control logic circuit, if the data length of the memory access request corresponding to the column command is 32 bytes, then when the virtual channel address of the memory access request belongs to the first virtual channel, the column command arbitrator can output a column command in the current clock cycle whose virtual channel address belongs to the zeroth virtual channel and is marked as false. Thus, the command replicator can copy the column command output by the column command arbitrator and send the copied column command to the column command arbitrator. Then, in the next clock cycle, the column command arbitrator arbitrates the column command copied by the command replicator and outputs a column command whose virtual channel address belongs to the first virtual channel and is marked as true.
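
The three cases in paragraphs [0127] to [0129] can be summarized in a short Python sketch. The function name `column_schedule` and the tuple layout are assumptions made for illustration; "T0" stands for the current clock cycle and "T1" for the next one.

```python
def column_schedule(data_length, request_vc):
    """Return the final-arbitrator outputs for one arbitrated non-copied column command.

    Each entry is (cycle, virtual_channel, marking). This is an illustrative
    model of the described behavior, not the patented circuit itself.
    """
    if data_length == 64:
        # The requested virtual channel is ignored: real on VC0, then a real copy on VC1.
        return [("T0", 0, "real"), ("T1", 1, "real")]
    if data_length == 32:
        if request_vc == 0:
            # No copy is made; nothing is output in the next clock cycle.
            return [("T0", 0, "real")]
        if request_vc == 1:
            # False placeholder on VC0, then the real copy on VC1.
            return [("T0", 0, "false"), ("T1", 1, "real")]
        raise ValueError("request_vc must be 0 or 1")
    raise ValueError("data_length must be 32 or 64 in this sketch")


schedule_64 = column_schedule(64, request_vc=1)
schedule_32_vc0 = column_schedule(32, request_vc=0)
schedule_32_vc1 = column_schedule(32, request_vc=1)
```

In every case the non-copied command is indicated for the zeroth virtual channel and any copied command for the first virtual channel; only the real/false marking and the presence of a copy vary.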

[0130] The following describes the arbitration scheduling process for row commands. As an optional implementation, the row command arbitrator can be used to: perform final arbitration on the row commands sent to the final arbitrator; if the row command that passes final arbitration is a non-copied row command, output the row command and indicate that it is transmitted through the zeroth virtual channel; and, at least one clock cycle after outputting the non-copied row command, output the corresponding copied row command and indicate that it is transmitted through the first virtual channel.

[0131] It should be noted that since row commands control the pre-charging and row activation of the block containing the row, and do not involve data reading or writing to columns, to ensure the consistency of the page states of the blocks in the zeroth virtual channel and the first virtual channel, this embodiment can be configured to require the zeroth virtual channel and the first virtual channel to transmit the same row commands. That is, non-copied row commands transmitted through the zeroth virtual channel need to be copied, and the corresponding copied row commands need to be transmitted through the first virtual channel to ensure the consistency of the page states of the zeroth virtual channel and the first virtual channel. Therefore, any non-copied row command output by the row command arbitrator (i.e., any row command output by the row command arbitrator from the shared control logic circuit) can be copied by the command copier to obtain the corresponding copied row command.
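
As a rough illustration of why mirroring row commands keeps page states consistent, the following Python sketch applies each row command to both virtual channels. The mnemonics "ACT" and "PRE", the dictionary-based page-state model, and the function names are assumptions for illustration; in the described hardware the copy is issued at least one clock cycle later, which this sketch does not model.

```python
def apply_row_command(page_state, command, bank, row=None):
    """Update a {bank: open_row} page-state map for one row command."""
    if command == "ACT":      # row activation opens a row in the bank
        page_state[bank] = row
    elif command == "PRE":    # precharge closes whatever row is open
        page_state.pop(bank, None)
    else:
        raise ValueError(f"unknown row command: {command}")


def issue_with_copy(vc0_pages, vc1_pages, command, bank, row=None):
    """Issue the non-copied command on VC0 and its copy on VC1."""
    apply_row_command(vc0_pages, command, bank, row)  # non-copied row command
    apply_row_command(vc1_pages, command, bank, row)  # copied row command


vc0_pages, vc1_pages = {}, {}
issue_with_copy(vc0_pages, vc1_pages, "ACT", bank=3, row=0x1A)
issue_with_copy(vc0_pages, vc1_pages, "ACT", bank=5, row=7)
issue_with_copy(vc0_pages, vc1_pages, "PRE", bank=3)
```

Because every row command is applied to both maps, the two page states never diverge, which is the property the shared control logic relies on.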

[0132] It should be noted that, since non-copied row commands need to be transmitted through the zeroth virtual channel and the corresponding copied row commands need to be transmitted through the first virtual channel, there is no situation in which row commands (non-copied row commands and corresponding copied row commands) need to be discarded. Therefore, in optional implementations, the embodiments of this application may omit the real/false marking information for the row commands (non-copied row commands and corresponding copied row commands) output by the row command arbitrator.

[0133] In an optional implementation, if the command that passes final arbitration by the row command arbitrator is a non-copied row command, the row command arbitrator can, based on the number of clock cycles that the output non-copied row command occupies on the command bus, select the row commands that participate in final arbitration (e.g., from the non-copied row commands sent by the common control logic circuit and the copied row commands sent by the command replicator) at least one clock cycle after outputting the non-copied row command, output the copied row command, and indicate that it is transmitted through the first virtual channel.

[0134] As an optional implementation, different types of commands may occupy different numbers of clock cycles on the command bus. For example, for technologies like HBM3, a command can occupy 0.5 clock cycles, 1 clock cycle, or 1.5 clock cycles on the command bus. Based on this, in an optional implementation, a non-copied row command output by the row command arbitrator may occupy 1.5 clock cycles, 1 clock cycle, or 0.5 clock cycles on the command bus. The row command arbitration scheduling methods for these different occupancies are explained below.

[0135] As an optional implementation, Figure 5A shows an exemplary flowchart of another memory access method provided in an embodiment of this application. This method can be implemented by the row command arbitrator. Referring to Figure 5A, the method process may include the following steps.

[0136] In step S510, if the command that passes final arbitration by the row command arbitrator is a non-copied row command, the row command arbitrator outputs the row command and indicates that it is transmitted through the zeroth virtual channel.

[0137] Correspondingly, the command replicator can copy the non-copied row command output by the row command arbitrator (i.e., a row command from the common control logic circuit output by the row command arbitrator) to obtain the corresponding copied row command, and send the copied row command to the row command arbitrator.

[0138] In step S511, if the non-copied row command output by the row command arbitrator occupies 1.5 clock cycles of the command bus, then in the first clock cycle after the current clock cycle in which the non-copied row command is output, a row command that does not conflict with the output row command and occupies 0.5 clock cycles of the command bus is selected to participate in final arbitration.

[0139] In a further optional implementation, the selected row command occupying 0.5 clock cycles of the command bus can be output by the row command arbitrator 0.5 clock cycles after the first clock cycle. The output row command is copied by the command replicator, and the corresponding copied row command can be output by the row command arbitrator 0.5 clock cycles after the third clock cycle.

[0140] In step S512, the row command arbitrator prohibits non-copied row commands from participating in final arbitration during the second to third clock cycles following the current clock cycle.

[0141] In this embodiment, the clock cycle in which the row command arbitrator outputs the non-copied row command is taken as the current clock cycle (denoted as T0). If the non-copied row command output by the row command arbitrator at T0 occupies 1.5 clock cycles of the command bus, then in the first clock cycle after T0 (i.e., the next clock cycle after the current clock cycle, denoted as T1), the row command selected by the row command arbitrator to participate in final arbitration should meet the following conditions: it does not conflict with the non-copied row command output at T0, and it occupies 0.5 clock cycles of the command bus, so that the row command participating in final arbitration at T1 corresponds to the non-copied row command output at T0 that occupies 1.5 clock cycles of the command bus.

[0142] Furthermore, during the second and third clock cycles after T0 (the second clock cycle after T0 is denoted as T2, and the third clock cycle after T0 is denoted as T3), the row command arbitrator prohibits non-copied row commands from participating in final arbitration. That is, during T2 to T3 the row command arbitrator prohibits row commands that do not come from the command replicator from participating in arbitration (such row commands can be regarded as row commands from the shared control logic circuit).

[0143] In step S513, from the second clock cycle to the third clock cycle, the row command arbitrator outputs the copied row commands in sequence, in the order in which they were sent to it from the current clock cycle to the first clock cycle, and indicates that the virtual channel address belongs to the first virtual channel.

[0144] Since the non-copied row command output by the row command arbitrator at T0 occupies 1.5 clock cycles of the command bus, the row command arbitrator does not output any row command at T1 (the row command output at T0 still occupies the command bus at T1) and must wait until T2 to output row commands. Therefore, the row commands output by the row command arbitrator during T2 to T3 are the copied row commands sent to it by the command replicator during T0 to T1. Based on this, as an optional implementation, during T2 to T3 the row command arbitrator outputs the copied row commands sequentially, in the order in which they were sent to it during T0 to T1, and indicates that the output copied row commands are transmitted through the first virtual channel.
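
One possible reading of the 1.5-clock-cycle case (steps S510 to S513 together with the further optional implementation in [0139]) is sketched below in Python. The command labels "A" and "B", the tuple layout, and both function names are hypothetical; the start times are taken from the text (T0, T1 plus 0.5, T2, and T3 plus 0.5) and are expressed in clock cycles relative to T0.

```python
def timeline_1p5t(first="A", follower="B"):
    """Bus timeline as (start_cycle, command, virtual_channel, is_copy, bus_cycles)."""
    return [
        (0.0, first, 0, False, 1.5),     # T0: non-copied 1.5T row command on VC0
        (1.5, follower, 0, False, 0.5),  # T1 + 0.5: non-conflicting 0.5T row command on VC0
        (2.0, first, 1, True, 1.5),      # T2: copy of the 1.5T row command on VC1
        (3.5, follower, 1, True, 0.5),   # T3 + 0.5: copy of the 0.5T row command on VC1
    ]


def bus_is_gap_free(timeline):
    """True if each command starts exactly when the previous one releases the bus."""
    return all(
        start + length == nxt[0]
        for (start, _, _, _, length), nxt in zip(timeline, timeline[1:])
    )


timeline = timeline_1p5t()
copies_in_order = [cmd for _, cmd, _, is_copy, _ in timeline if is_copy]
```

Under these assumptions the copies leave in the same order as the originals, and pairing a 1.5T command with a 0.5T command keeps the command bus continuously occupied from T0 to T4.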

[0145] As an optional implementation, Figure 5B shows an exemplary flowchart of yet another memory access method provided in an embodiment of this application. This method can be implemented by the row command arbitrator. Referring to Figure 5B, the method process may include the following steps.

[0146] In step S520, if the command that passes final arbitration by the row command arbitrator is a non-copied row command, the row command arbitrator outputs the row command and indicates that it is transmitted through the zeroth virtual channel.

[0147] In step S521, if the non-copied row command output by the row command arbitrator occupies one clock cycle of the command bus, then in the first clock cycle after the current clock cycle in which the non-copied row command is output, non-copied row commands are prohibited from participating in final arbitration.

[0148] In step S522, in the first clock cycle after the current clock cycle, the row command arbitrator outputs the corresponding copied row command and indicates that the virtual channel address belongs to the first virtual channel.

[0149] Since the non-copied row commands output by the row command arbitrator at T0 occupy one clock cycle of the command bus, the row command arbitrator can output the corresponding copied row commands at T1 to maintain the page state consistency between the zeroth virtual channel and the first virtual channel. Based on this, in the first clock cycle after T0 (i.e., T1), the row command arbitrator should prohibit non-copied row commands from participating in the final arbitration. That is, the row command arbitrator performs the final arbitration of the row commands from the command replicator at T1, and the row command that passes the final arbitration is the copied row command corresponding to the row command output at T0; furthermore, the row command arbitrator outputs the corresponding copied row command at T1 and indicates that the virtual channel address belongs to the first virtual channel.

[0150] As an optional implementation, Figure 5C shows an exemplary flowchart of another memory access method provided in an embodiment of this application. This method can be implemented by the row command arbitrator. Referring to Figure 5C, the method process may include the following steps.

[0151] In step S530, if the command that passes final arbitration by the row command arbitrator is a non-copied row command, the row command arbitrator outputs the row command and indicates that it is transmitted through the zeroth virtual channel.

[0152] In step S531, if the non-copied row command output by the row command arbitrator occupies 0.5 clock cycles of the command bus, then 0.5 clock cycles after the current clock cycle in which the row command is output, a row command that does not conflict with the output row command and occupies 0.5 clock cycles of the command bus is selected to participate in final arbitration.

[0153] In step S532, the row command arbitrator prohibits non-copied row commands from participating in arbitration in the first clock cycle after the current clock cycle.

[0154] If the non-copied row command output by the row command arbitrator occupies 0.5 clock cycles of the command bus, the row command arbitrator can select a row command to participate in final arbitration 0.5 clock cycles after T0 (i.e., half a clock cycle after the current clock cycle, denoted as T0.5). The row command selected to participate in final arbitration should meet the following conditions: it does not conflict with the non-copied row command output at T0, and it occupies 0.5 clock cycles of the command bus, so that the row command participating in final arbitration at T0.5 corresponds to the non-copied row command output at T0 that occupies 0.5 clock cycles of the command bus.

[0155] Furthermore, in T1 following T0, the row command arbitrator prohibits non-copied row commands from participating in final arbitration; that is, in T1 the row command arbitrator prohibits row commands that do not come from the command replicator from participating in arbitration.

[0156] In a further optional implementation, the selected row command occupying 0.5 clock cycles of the command bus can be output by the row command arbitrator in the first clock cycle, the output row command is copied by the command copier, and the corresponding copied row command can be output by the row command arbitrator 0.5 clock cycles after the first clock cycle.

[0157] In step S533, in the first clock cycle after the current clock cycle, the row command arbitrator outputs the copied row commands in sequence, in the order in which they were sent to it in the current clock cycle, and indicates that the virtual channel address belongs to the first virtual channel.

[0158] Since the non-copied row command output by the row command arbitrator at T0 occupies 0.5 clock cycles of the command bus, the row command arbitrator can output copied row commands at T1. At T1, the row command arbitrator can output the copied row commands sequentially, in the order in which they were sent to it at T0, and indicate that the output copied row commands are transmitted through the first virtual channel.
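
The three occupancy cases (Figures 5A, 5B, and 5C) differ in which clock cycles are reserved for copied row commands. A minimal Python sketch of that mapping follows; the function name is hypothetical, and the returned integers are clock-cycle offsets from T0, the cycle in which the non-copied row command is output.

```python
def non_copied_blocked_cycles(bus_cycles):
    """Cycles (offsets from T0) in which non-copied row commands are barred from final arbitration."""
    if bus_cycles == 1.5:
        return [2, 3]  # step S512: second to third clock cycles
    if bus_cycles == 1.0:
        return [1]     # step S521: first clock cycle
    if bus_cycles == 0.5:
        return [1]     # step S532: first clock cycle
    raise ValueError("bus occupancy must be 0.5, 1.0, or 1.5 clock cycles")


blocked = {occupancy: non_copied_blocked_cycles(occupancy) for occupancy in (0.5, 1.0, 1.5)}
```

During the blocked cycles only commands from the command replicator can win final arbitration, so the copies are guaranteed a slot on the first virtual channel.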

[0159] In a further optional implementation, to ensure the timing correctness of memory accesses, the shared control logic circuit, when determining the request command sent to the final arbitrator, can consider the timing parameters of the command output by the final arbitrator, so as to ensure that the timing of the request command sent to the final arbitrator corresponds and matches the timing of the command output by the final arbitrator. As an optional implementation, the shared control logic circuit can also be used to: determine memory access-related timing parameters (such as recording and detecting various timing parameters used in memory access requests) to ensure the correctness of memory access operations. It should be noted that memory access-related timing parameters can be various time-related parameters involved in the memory access operation. Timing parameters are designed to ensure the correctness of memory access operations; that is, in a computer system, data reading and writing in memory need to be performed within a certain time sequence to ensure correct data exchange and storage.

[0160] In an optional implementation, the shared control logic circuit can determine memory access-related timing parameters by counting timing parameters according to a timing parameter counting basis. For example, when the final arbitrator outputs a non-copied command, the shared control logic circuit can sample the command output by the final arbitrator and start timing parameter counting for the sampled command according to the timing parameter counting basis. As an optional implementation, the counted timing parameters of the command can be used by the shared control logic circuit to determine the request command sent to the final arbitrator, so that the timing of the request command sent to the final arbitrator matches the timing of the command output by the final arbitrator, ensuring the timing correctness of memory access.

[0161] As an optional implementation, the clock cycle of the command bus occupied by the command can include 0.5 clock cycles, 1 clock cycle, or 1.5 clock cycles; in this embodiment, the timing parameter counting can be based on the command occupying 1.5 clock cycles of the command bus as the starting point for timing parameter counting.

[0162] In an optional implementation, to make the timing parameter counting basis more convenient and timely, for commands that occupy 1.5 clock cycles of the command bus, the timing parameter counting is performed starting from the first rising edge of the clock output by the final arbiter, so that commands that occupy 1.5 clock cycles of the command bus are counted one clock cycle earlier than the standard clock cycle.

[0163] It should be noted that, under the requirements of memory protocols such as HBM3, the timing parameter counting of a command starts from the second rising edge of the clock (i.e., the start of the standard clock cycle can be regarded as the second rising edge of the clock output by the final arbiter). In this embodiment of the application, for a command that occupies 1.5 clock cycles of the command bus, the timing parameter counting starts from the first rising edge of the clock output by the final arbiter, which is equivalent to advancing the start time of the timing parameter counting of the command by 1 clock cycle.
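
The counting basis described above can be written down as a tiny Python sketch. Rising edges are numbered from 1 relative to the clock on which the final arbiter outputs the command; the constant and function names are hypothetical, and the value 2 for the standard start edge is taken from the description in this paragraph rather than from any protocol document.

```python
STANDARD_START_EDGE = 2  # per the text: counting normally starts at the second rising edge


def counting_start_edge(bus_cycles):
    """Rising edge at which timing parameter counting starts for a sampled command."""
    # Commands occupying 1.5 clock cycles start counting at the first rising edge.
    return 1 if bus_cycles == 1.5 else STANDARD_START_EDGE


def cycles_advanced(bus_cycles):
    """How many clock cycles earlier than the standard start the counting begins."""
    return STANDARD_START_EDGE - counting_start_edge(bus_cycles)


advance_by_occupancy = {occ: cycles_advanced(occ) for occ in (0.5, 1.0, 1.5)}
```

Only the 1.5-clock-cycle command is advanced, by exactly one clock cycle, which is the offset the later transition rules have to compensate for.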

[0164] In other words, for commands that occupy 1.5 clock cycles of the command bus, this embodiment starts timing from the first rising edge of the clock output by the final arbiter (the starting point of timing parameter counting), instead of waiting for one clock cycle before starting timing parameter counting. This makes timing for commands occupying 1.5 clock cycles of the command bus more convenient. Furthermore, timing for commands occupying 1.5 clock cycles of the command bus in advance allows for earlier timing updates, thereby allowing for earlier filtering of commands that do not meet timing conditions and improving the timeliness of filtering commands that do not meet timing conditions.

[0165] As an optional implementation, the timing parameter counting basis (for commands occupying 1.5 clock cycles of the command bus, timing parameter counting is performed starting from the first rising edge of the clock output by the final arbiter) can be used as a basis in this application embodiment when sampling non-copying commands output by the final arbiter and starting timing parameter counting for the sampled commands. The timing parameter counting can be performed on the currently sampled command based on the timing parameter counting basis and according to the change in the clock cycles occupied by adjacent sampled commands on the command bus.

[0166] As an optional implementation, a command can occupy 0.5 clock cycles, 1 clock cycle, or 1.5 clock cycles on the command bus. The number of clock cycles occupied by adjacent sampled commands (e.g., two successive non-copied commands output by the final arbiter) may therefore change as follows:

[0167] 1 clock cycle to 1 clock cycle, 0.5 clock cycles to 0.5 clock cycles, 1.5 clock cycles to 1.5 clock cycles, 1 clock cycle to 0.5 clock cycles, 0.5 clock cycles to 1 clock cycle, 1.5 clock cycles to 0.5 clock cycles, 1.5 clock cycles to 1 clock cycle, 0.5 clock cycles to 1.5 clock cycles, 1 clock cycle to 1.5 clock cycles, etc.
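
The nine transitions listed above are exactly the ordered pairs of the three possible occupancies, which a short Python sketch can confirm. The names `OCCUPANCIES` and `transitions` are introduced here only for illustration.

```python
from itertools import product

# Possible command-bus occupancies, in clock cycles.
OCCUPANCIES = (0.5, 1.0, 1.5)

# Every (previous, current) pair of adjacent sampled commands.
transitions = list(product(OCCUPANCIES, repeat=2))
```

With three occupancies there are 3 x 3 = 9 ordered pairs, matching the nine transitions enumerated in the text.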

[0168] The following describes optional implementations of timing parameter counting for sampled commands in this application under the different clock cycle changes. As an optional implementation, Figure 6 shows an exemplary flowchart of timing parameter counting for commands in an embodiment of this application. This method can be implemented by the shared control logic circuit. Referring to Figure 6, the method process may include the following steps.

[0169] In step S610, the change in the number of clock cycles that adjacent sampled commands occupy on the command bus is determined.

[0170] In step S611, if the change in the number of clock cycles that adjacent sampled commands occupy on the command bus is any one of 1 clock cycle to 1 clock cycle, 0.5 clock cycles to 0.5 clock cycles, 1.5 clock cycles to 1.5 clock cycles, 1 clock cycle to 0.5 clock cycles, or 0.5 clock cycles to 1 clock cycle, then, based on the timing parameter counting basis, the timing parameters of the currently sampled command are counted according to the standard clock cycle.

[0171] For ease of explanation, the clock cycle can be defined as T. Thus, a command that occupies 0.5T (0.5 clock cycles) of the command bus can be called a 0.5T command, a command that occupies 1T (1 clock cycle) of the command bus can be called a 1T command, and a command that occupies 1.5T (1.5 clock cycles) of the command bus can be called a 1.5T command.

[0172] If the adjacent commands sampled are 1T commands to 1T commands (i.e., the clock cycle occupied by the adjacent commands sampled varies from 1 clock cycle to 1 clock cycle), the embodiments of this application can count the timing parameters of the currently sampled command based on the timing parameter counting basis described above, according to the standard clock cycle, for example, according to the standard timing parameters required by memory technologies such as HBM3, such as using the second rising edge of the clock of the command output as the starting position for timing parameter counting.

[0173] Similarly, for adjacent commands sampled as 0.5T to 0.5T, 1.5T to 1.5T (since the timing parameter counting for the 1.5T command is performed 1T earlier, the timing parameters of the 1.5T to 1.5T commands can cancel each other out, thus maintaining the timing requirements as calculated according to the standard timing parameters), 1T to 0.5T, and 0.5T to 1T commands, the embodiments of this application can perform timing parameter counting for the currently sampled command according to the standard clock cycle based on the timing parameter counting basis described above.

[0174] In step S612, if the change in the number of clock cycles that adjacent sampled commands occupy on the command bus is either 1.5 clock cycles to 0.5 clock cycles or 1.5 clock cycles to 1 clock cycle, then, based on the timing parameter counting basis, the timing parameters of the currently sampled command are counted by adding 2 clock cycles to the standard clock cycle.

[0175] If the adjacent commands sampled are 1.5T commands to 0.5T commands (i.e., the clock cycle occupied by the adjacent commands on the command bus varies from 1.5 clock cycles to 0.5 clock cycles), the embodiments of this application can count the timing parameters of the currently sampled commands based on the timing parameter counting basis described above, by adding 2 clock cycles to the standard clock cycle, for example, by adding 2 clock cycles to the standard timing parameters required by memory such as HBM3.
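
Steps S611 and S612 amount to a lookup from the transition between adjacent sampled commands to an adjustment of the standard count. The Python sketch below covers only the transitions those two steps name; the set and function names are hypothetical, and transitions into a 1.5-clock-cycle command are deliberately left unhandled because steps S611 and S612 do not specify them.

```python
# (previous, current) occupancies counted at the standard clock cycle (step S611).
STANDARD_TRANSITIONS = {(1.0, 1.0), (0.5, 0.5), (1.5, 1.5), (1.0, 0.5), (0.5, 1.0)}

# (previous, current) occupancies counted at standard + 2 clock cycles (step S612).
PLUS_TWO_TRANSITIONS = {(1.5, 0.5), (1.5, 1.0)}


def extra_cycles(previous, current):
    """Clock cycles added to the standard timing parameter for the current command."""
    pair = (previous, current)
    if pair in STANDARD_TRANSITIONS:
        return 0
    if pair in PLUS_TWO_TRANSITIONS:
        return 2
    raise NotImplementedError(f"transition {pair} is not covered by steps S611 and S612")


adjust_1p5_to_0p5 = extra_cycles(1.5, 0.5)
adjust_1_to_1 = extra_cycles(1.0, 1.0)
```

A table-driven form like this keeps the per-transition rule explicit and makes the uncovered cases fail loudly instead of silently defaulting.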

[0176] It should be noted that in this embodiment of the application, the starting point of timing parameter counting for the 1.5T command is the first rising edge of the clock at which the final arbiter outputs the command, which is one clock cycle earlier than the standard timing parameters. Therefore, for the case of a 1.5T command to a 0.5T command, since the time interval is 1T, timing parameter counting should first add one clock cycle to the standard timing parameters. At the same time, under the arbitration scheduling method described above for the 1.5T command case (as shown in Figure 5A), if only one clock cycle were added to the standard timing parameters when counting the timing parameters for the 1.5T to 0.5T case, the timing parameter count for the corresponding replicated commands transmitted through the first virtual channel would be one clock cycle less than the standard timing parameters. To compensate for this, in this embodiment, timing parameter counting for the 1.5T to 0.5T case adds two clock cycles to the standard timing parameters.

[0177] In other words, timing parameter counting for the 1.5T command begins on the first rising edge at which the command is output by the final arbiter, which is 1T earlier than the second rising edge required by memory protocols such as HBM3. For example, if the 1.5T command is output by the final arbiter at time T0, then timing parameter counting for the 1.5T command begins at time T1. Furthermore, according to the scheme provided in the embodiments of this application, if the final arbiter outputs a non-replicated 1.5T command (whose virtual channel address belongs to the zeroth virtual channel) at time T0, then the final arbiter starts outputting the replicated 1.5T command (whose virtual channel address belongs to the first virtual channel) only at time T2; for an optional implementation, refer to Figure 5A. Assuming the output interval between the 1.5T command and the 0.5T command is N clock cycles, then at time T(N+1) the final arbiter can start outputting a non-replicated 0.5T command indicating the zeroth virtual channel. It can be seen that this 0.5T command is N clock cycles away from time T1, when timing parameter counting of the 1.5T command begins, and N+1 clock cycles away from time T0, when the 1.5T command is output.

[0178] Meanwhile, under the command replication rules of this embodiment, the final arbiter needs to output the corresponding replicated 0.5T command, indicating the first virtual channel, at time T(N+2). The corresponding replicated 1.5T command transmitted through the first virtual channel is output at time T2 (memory protocols such as HBM3 require that a 1.5T command start from a rising edge of the clock), and timing parameter counting for the replicated 1.5T command starts at time T3. In other words, the non-replicated 1.5T command and the non-replicated 0.5T command each have a corresponding replicated command (i.e., a replicated 1.5T command and a replicated 0.5T command). The replicated 1.5T command starts timing parameter counting at time T3, and the replicated 0.5T command is output by the final arbiter at time T(N+2). The interval between time T3 and time T(N+2) is N-1 clock cycles, which is less than the timing requirement of N clock cycles. Therefore, in this embodiment, for the 1.5T to 0.5T case, timing parameter counting uses the standard timing parameters plus 2 clock cycles, so that the timing requirements of the corresponding replicated commands in the first virtual channel are met.
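The arithmetic in paragraphs [0177] and [0178] can be checked with a short sketch. It is only an illustration under stated assumptions: time indices are clock cycles relative to T0 (the cycle in which the non-replicated 1.5T command is output), the added cycles are counted on top of the standard interval N measured from T0, and the function name `replicated_interval` is hypothetical.

```python
def replicated_interval(n: int, added_cycles: int) -> int:
    """Interval, in clock cycles, seen by the replicated command pair.

    Assumption: times are cycle indices relative to T0, the cycle in which
    the final arbiter outputs the non-replicated 1.5T command.
    """
    # The non-replicated 0.5T command is output at T(n + added_cycles).
    non_replicated_05t_out = n + added_cycles
    # Its replica for the first virtual channel follows one cycle later.
    replicated_05t_out = non_replicated_05t_out + 1
    # The replicated 1.5T command is output at T2 and starts counting at T3.
    replicated_15t_count_start = 3
    return replicated_05t_out - replicated_15t_count_start


n = 8  # illustrative value for the standard timing parameter N
print(replicated_interval(n, added_cycles=1))  # n - 1: one cycle short
print(replicated_interval(n, added_cycles=2))  # n: requirement met
```

With one added cycle the replicated pair sees only N-1 cycles; with two added cycles it sees the required N cycles, which matches the compensation described above.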

[0179] Similarly, if the adjacent sampled commands go from a 1.5T command to a 1T command, the embodiments of this application can, on the timing parameter counting basis described above, count the timing parameters of the currently sampled command by adding 2 clock cycles to the standard clock cycle.

[0180] In step S613, if the clock cycles occupied on the command bus by the adjacent sampled commands change from 0.5 clock cycles to 1.5 clock cycles, or from 1 clock cycle to 1.5 clock cycles, then, on the timing parameter counting basis, the timing parameters of the currently sampled command are counted by subtracting 1 clock cycle from the standard clock cycle.

[0181] If the adjacent sampled commands go from a 0.5T command to a 1.5T command (i.e., the clock cycles occupied on the command bus by the adjacent commands go from 0.5 clock cycles to 1.5 clock cycles), the embodiments of this application can, on the timing parameter counting basis described above, count the timing parameters of the currently sampled command by subtracting 1 clock cycle from the standard clock cycle, for example, by subtracting 1 clock cycle from the standard timing parameters required by memory such as HBM3.

[0182] It should be noted that timing parameter counting for the 1.5T command starts from the first rising edge of the clock at which the final arbiter outputs the command, which is one clock cycle earlier than the standard timing parameters. Therefore, taking into account the timing interval between the 0.5T command and the 1.5T command, timing parameter counting for the 0.5T to 1.5T case should subtract one clock cycle from the standard timing parameters.

[0183] Similarly, if the adjacent sampled commands go from a 1T command to a 1.5T command, the embodiments of this application can, on the timing parameter counting basis described above, count the timing parameters of the currently sampled command by subtracting 1 clock cycle from the standard clock cycle.
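The adjustment rules for adjacent command widths described above can be summarized as a small lookup. The following is a minimal sketch, not the patented circuit: the function name `timing_adjustment` and the string encoding of command widths are assumptions made for illustration.

```python
def timing_adjustment(prev_width: str, cur_width: str) -> int:
    """Cycles to add to the standard timing parameter (negative = subtract).

    prev_width / cur_width are the bus occupancy of the adjacent sampled
    commands, written as "0.5T", "1T" or "1.5T" (hypothetical encoding).
    """
    valid = {"0.5T", "1T", "1.5T"}
    if prev_width not in valid or cur_width not in valid:
        raise ValueError("unknown command width")
    if prev_width == "1.5T" and cur_width != "1.5T":
        return 2   # step S612: 1.5T to 0.5T, or 1.5T to 1T
    if prev_width != "1.5T" and cur_width == "1.5T":
        return -1  # step S613: 0.5T to 1.5T, or 1T to 1.5T
    return 0       # equal widths, 1T to 0.5T, 0.5T to 1T: standard count


for pair in [("1T", "1T"), ("1.5T", "0.5T"), ("1T", "1.5T"), ("1.5T", "1.5T")]:
    print(pair, timing_adjustment(*pair))
```

A shared timing check module could apply such an offset once per sampled non-replicated command, which is what lets a single counter serve both virtual channels.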

[0184] As an optional implementation, a shared timing check module, shared by the zeroth virtual channel and the first virtual channel, can be provided in the shared control logic circuit to execute the process of determining the memory access-related timing parameters described above. For example, the process in which the shared control logic circuit samples the non-replicated command output by the final arbiter and starts timing parameter counting for the sampled command can be implemented by the shared timing check module in the shared control logic circuit. By adjusting the timing parameter count in the above manner, timing only needs to be counted on the zeroth virtual channel to satisfy the timing requirements of both the zeroth and first virtual channels, which simplifies the timing logic.

[0185] The timing parameter counting method provided in this application can assist the memory controller in the coordinated control of the zeroth virtual channel and the first virtual channel, keeping the timing of the two virtual channels aligned and ensuring timing correctness. Furthermore, it simplifies the timing check logic of the memory controller. In other words, the timing parameter counting method can be combined with the command replication method and the command arbitration scheduling method described above to further realize the coordinated control of the two virtual channels. This allows the memory controller to manage and control the zeroth virtual channel and the first virtual channel as a single channel while ensuring that the commands transmitted by each virtual channel conform to the timing and page state requirements of that virtual channel, thus guaranteeing the correctness of memory access.

[0186] Specifically, in this embodiment, the non-replicated commands output by the final arbiter are sampled, and timing parameter counting is started for the sampled commands on the timing parameter counting basis. At the same time, when counting, the timing parameters of the sampled command are counted according to the change in the clock cycles occupied on the command bus by the adjacent commands. This ensures the timing correctness of the non-replicated commands transmitted on the zeroth virtual channel and maintains the timing correctness of the replicated commands transmitted on the first virtual channel.

[0187] Furthermore, this embodiment uses command replication to keep the page states of the zeroth virtual channel and the first virtual channel consistent. For example, for row commands (commands other than column commands such as read commands and write commands), this embodiment replicates the row commands transmitted by the zeroth virtual channel, so that the row commands transmitted by the first virtual channel are consistent with those transmitted by the zeroth virtual channel. Thus, when a row command affects the page state, its effect on the page states of the zeroth virtual channel and the first virtual channel is consistent.

[0188] Furthermore, consider column commands such as read and write commands. For a column command triggered by a memory access request of the second data length (e.g., 64 bytes), this embodiment can transmit a column command marked as real in both the zeroth virtual channel and the first virtual channel (the real column command transmitted in the first virtual channel can be obtained by replication), and the column commands transmitted in the two virtual channels can carry consistent auto-precharge indications (auto-precharge affects the page state). This ensures that column commands (carrying auto-precharge indications) triggered by memory access requests of the second data length have a consistent effect on the page states of the zeroth and first virtual channels. For a column command triggered by a memory access request of the first data length (e.g., 32 bytes), this embodiment can transmit the column command marked as real in the virtual channel indicated by the virtual channel address of the memory access request, and the column command does not carry an auto-precharge indication; the column command marked as real therefore does not affect the page state, while the memory access requirement of the first data length is still met.
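The column command handling in paragraph [0188] can be sketched as follows. This is an illustration only: the `ColumnCommand` type and the `column_commands` function are hypothetical, and sending a copy marked as not real on the other virtual channel for 32-byte requests is an assumption, since this passage only states where the real command goes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ColumnCommand:
    virtual_channel: int   # 0 = zeroth, 1 = first virtual channel
    real: bool             # marked real (True) or not real (False)
    auto_precharge: bool   # auto-precharge affects the page state


def column_commands(length_bytes: int, vc_address: int,
                    auto_precharge: bool = False) -> List[ColumnCommand]:
    """Column commands per virtual channel for one memory access request."""
    if length_bytes == 64:
        # Real command on both channels, carrying the same auto-precharge flag.
        return [ColumnCommand(vc, True, auto_precharge) for vc in (0, 1)]
    if length_bytes == 32:
        # Real command only on the addressed channel; never auto-precharge,
        # so the page state of neither channel is changed.
        # Assumption: the other channel receives a copy marked as not real.
        return [ColumnCommand(vc, vc == vc_address, False) for vc in (0, 1)]
    raise ValueError("unsupported access length")


print(column_commands(64, vc_address=0, auto_precharge=True))
print(column_commands(32, vc_address=1))
```

Because auto-precharge is either applied identically on both channels or not at all, the page states of the two channels cannot diverge in this sketch.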

[0189] As can be seen, the solution provided in this application embodiment can realize the coordinated control of the zeroth virtual channel and the first virtual channel through a set of control logic inside the memory controller, thereby reducing the number of control logic circuits used by the memory controller and reducing the overhead of the memory controller; and, while ensuring the memory access requirements of the memory access request, it can also ensure the correctness of the memory access (such as ensuring the consistency of the timing and page state of the zeroth virtual channel and the first virtual channel).

[0190] In a further optional implementation, the shared control logic circuit can also implement at least one of the following functions: storing memory access request information, recording block information of blocks in memory, and managing memory refresh operations. In an optional implementation, Figure 7 shows an exemplary diagram of another memory controller provided in an embodiment of this application. As shown in Figure 2, Figure 4A, and Figure 7, the shared control logic circuit 210 in the memory controller may include: a shared command queue 710, a shared page record module 720, a shared timing check module 730, a shared refresh control module 740, and a shared queue arbitrator 750.

[0191] The shared command queue 710 is a command queue shared by the zeroth virtual channel and the first virtual channel in the memory controller, and is used at least to store the request information of memory access requests. In an optional implementation, the memory controller can perform address decoding on a memory access request issued by the processor core and store the decoded request information in the shared command queue 710. For example, the memory controller can include an address decoding module that decodes the memory access request issued by the processor core and sends the decoded request information to the shared command queue 710 for storage. Optionally, a set bit of the physical address of the memory access request can carry the virtual channel address; for example, bit 5 of the physical address can carry the virtual channel address. Therefore, during address decoding, the virtual channel address can be determined from the set bit (e.g., bit 5) of the physical address, and the virtual channel to which that address belongs is taken as the virtual channel corresponding to the memory access request.
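Extracting the virtual channel address from the physical address, as in the bit 5 example above, can be sketched as below. The constant `VC_BIT` and the function `virtual_channel_of` are hypothetical names, and treating "bit 5" as a zero-based bit index is an assumption.

```python
VC_BIT = 5  # assumption: zero-based index of the bit carrying the VC address


def virtual_channel_of(physical_address: int) -> int:
    """Return 0 for the zeroth virtual channel, 1 for the first."""
    return (physical_address >> VC_BIT) & 1


for addr in (0x00, 0x20, 0x3F, 0x40):
    print(hex(addr), virtual_channel_of(addr))
```

Under this assumption, consecutive 32-byte regions of the address space alternate between the two virtual channels.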

[0192] The request information stored in the shared command queue may include the virtual channel address of the memory access request. When the final arbiter outputs a column command, this virtual channel address is used to indicate whether the column command is marked as real or fake. It should be noted that, in this embodiment, the virtual channel address may serve no purpose other than indicating this marking; that is, it is not used in the other functional modules of the memory controller.

[0193] As can be seen, unlike the approach of providing independent control logic circuits for each virtual channel, this embodiment does not need an independent command queue for each virtual channel (that is, separate command queues for the zeroth virtual channel and the first virtual channel). Instead, the request information of memory access requests is stored uniformly in the shared command queue 710, without distinguishing the virtual channel address. This embodiment can therefore reduce the number of command queues. For example, compared with providing independent command queues for the zeroth and first virtual channels, this embodiment saves one command queue, reducing the amount of control logic and the overhead at the command queue level.

[0194] The shared page record module 720 is a page record module shared by the zeroth virtual channel and the first virtual channel in the memory controller, and is used to record block information of blocks in memory. It can record this block information without considering or using virtual channel addresses. As an optional implementation, the block information may include, but is not limited to, block address information and page state information. The page state information of a block may include at least the buffer state of the block, which records whether the block's buffer currently has a row loaded and, if so, which row is loaded. Optionally, the block address information may include the block address, stack ID, and row address. For example, the block information may include the block address, stack ID, row address, and whether the block has a row loaded. In this embodiment, the shared page record module does not record the virtual channel address of the memory access request, which can halve the number of recording units in the shared page record module. Furthermore, by using the block address, stack ID, and row address as the address information in the recorded block information, the amount of address information recorded can be reduced, saving the area that the shared page record module occupies in the memory controller.

[0195] As an optional implementation, after a memory access request enters the memory controller, the memory controller can, through address decoding, query the block information of the target block in the shared page record module and check whether the target block has a row loaded and whether the loaded row is the row accessed by the memory access request, thereby determining the page hit status of the memory access request, such as page hit, page miss, or page conflict.

[0196] As an optional implementation, when the row command output by the final arbiter does not come from the command replicator (i.e., it is a non-replicated row command), the shared page record module 720 can sample the row command issued by the final arbiter and update the block information of the block corresponding to the row pointed to by the row command.
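A shared page record that is keyed without any virtual channel address, and that is updated only from non-replicated row commands, might look like the sketch below. The class `SharedPageRecord`, its `sample` method, and the `"ACT"` mnemonic for row activation are assumptions for illustration; only activation and single block precharge are modeled.

```python
from typing import Dict, Optional, Tuple

# Key: (stack_id, block_address). No virtual channel address is recorded.
BlockKey = Tuple[int, int]


class SharedPageRecord:
    def __init__(self) -> None:
        # Loaded row per block; None means the buffer has no row loaded.
        self.open_row: Dict[BlockKey, Optional[int]] = {}

    def sample(self, kind: str, key: BlockKey, row: Optional[int] = None,
               replicated: bool = False) -> None:
        """Update block information from a row command output by the arbiter."""
        if replicated:
            return  # commands from the command replicator are not sampled
        if kind == "ACT":        # row activation loads a row into the buffer
            self.open_row[key] = row
        elif kind == "PCHGpb":   # single block precharge closes the row
            self.open_row[key] = None


record = SharedPageRecord()
record.sample("ACT", (0, 3), row=17)
record.sample("ACT", (0, 3), row=17, replicated=True)  # ignored
print(record.open_row[(0, 3)])
```

One entry per block therefore describes the page state of both virtual channels, which is where the halving of recording units comes from.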

[0197] As can be seen, unlike the approach of providing independent control logic circuits for each virtual channel, the embodiments of this application do not need an independent page record module for each virtual channel (that is, separate page record modules for the zeroth virtual channel and the first virtual channel). Instead, a page record module shared by the zeroth and first virtual channels records the block information of blocks in memory, without distinguishing the virtual channel address of the memory access request. The embodiments of this application can therefore reduce the number of page record modules. For example, compared with providing independent page record modules for the zeroth and first virtual channels, the embodiments of this application save one page record module, reducing the amount of control logic and the overhead at the page record module level.

[0198] The shared timing check module 730 is a timing check module shared by the zeroth and first virtual channels in the memory controller. It is used to determine memory access-related timing parameters (such as recording and checking the various timing parameters used by memory access requests) to ensure the correctness of memory access operations. The shared timing check module can determine memory access-related timing parameters without considering or using virtual channel addresses. For example, when the final arbiter outputs a non-replicated command, the shared timing check module samples that command and starts timing parameter counting for it. Details of timing parameter counting can be found in the corresponding description above and are not repeated here.

[0199] As can be seen, unlike the approach of providing independent control logic circuits for each virtual channel, the embodiments of this application do not need an independent timing check module for each virtual channel (that is, separate timing check modules for the zeroth virtual channel and the first virtual channel). Instead, a timing check module shared by the zeroth and first virtual channels determines the memory access-related timing parameters, without distinguishing the virtual channel address of the memory access request. The embodiments of this application can therefore reduce the number of timing check modules. For example, compared with providing independent timing check modules for the zeroth and first virtual channels, the embodiments of this application save one timing check module, reducing the amount of control logic and the overhead at the timing check module level.

[0200] The shared refresh control module 740 is a shared refresh control module for the zeroth and first virtual channels in the memory controller. It manages memory refresh operations, such as controlling memory refresh operations and generating corresponding refresh commands. The shared refresh control module can manage memory refresh operations without needing to consider or use virtual channel addresses. It should be noted that refresh commands are used to refresh the memory to prevent data loss; for example, for DRAM and similar types of memory, due to charge leakage, periodic refreshes are necessary to prevent data loss. Optionally, refresh commands can be divided into full-block refresh commands and single-block refresh commands.

[0201] In a further optional implementation, the shared refresh control module 740 can generate any one of the following commands as needed: single block precharge (PCHGpb) command, full block precharge (PCHGab) command, single block refresh (REFpb) command, full block refresh (REFab) command, RFMab command, and RFMpb command.

[0202] In a further optional implementation, the shared refresh control module 740 can also be used to monitor the cumulative number of times a block in memory is activated.

[0203] In an optional implementation, when the final arbiter outputs a non-replicated row command, if the row command is any of the following commands: single block refresh (REFpb) command, full block refresh (REFab) command, RFMab command, or RFMpb command, the shared refresh control module can sample the row command issued by the final arbiter and update the state of the row command. Optionally, once the row command has been sent to memory, the state of the row command can represent information such as the address of the block corresponding to the row indicated by the row command.

[0204] It should be noted that refresh commands must adhere to a refresh interval. Therefore, the shared refresh control module needs to sample the non-replicated row commands issued by the final arbiter to determine whether the final arbiter has arbitrated a refresh command for sending to memory, thus ensuring that refresh commands are generated according to the refresh interval. For example, if a single-block refresh (REFpb) command refreshes a specific block in memory, then after the REFpb command for that block is output, a certain refresh interval must elapse before a REFpb command for that block is output again. The shared refresh control module therefore samples the non-replicated row commands issued by the final arbiter to confirm whether the final arbiter has sent the REFpb command to memory, so that the block is next refreshed only after the refresh interval. At this point, the shared refresh control module can update the state of the REFpb command, thereby ensuring that the REFpb commands sent to memory refresh a given block at the required refresh interval.
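The refresh interval bookkeeping for REFpb commands can be sketched as follows. This is a simplified model under stated assumptions: the class `RefreshTracker`, its methods, and the use of a plain cycle counter are hypothetical, and the interval value is illustrative rather than taken from any memory standard.

```python
from typing import Dict


class RefreshTracker:
    """Tracks when each block was last refreshed by a sampled REFpb command."""

    def __init__(self, interval_cycles: int) -> None:
        self.interval_cycles = interval_cycles
        self.last_refresh: Dict[int, int] = {}  # block -> cycle of last REFpb

    def sample(self, cycle: int, block: int, replicated: bool = False) -> None:
        # Only non-replicated row commands from the final arbiter are sampled.
        if not replicated:
            self.last_refresh[block] = cycle

    def may_refresh(self, cycle: int, block: int) -> bool:
        """True if a new REFpb for this block respects the refresh interval."""
        last = self.last_refresh.get(block)
        return last is None or cycle - last >= self.interval_cycles


tracker = RefreshTracker(interval_cycles=100)
tracker.sample(cycle=10, block=2)
print(tracker.may_refresh(cycle=50, block=2))   # interval not yet elapsed
print(tracker.may_refresh(cycle=110, block=2))  # interval elapsed
```

Since replicated commands are ignored, one tracker covers the refresh state of both virtual channels.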

[0205] As can be seen, unlike the approach of providing independent control logic circuits for each virtual channel, the embodiments of this application do not need an independent refresh control module for each virtual channel (that is, separate refresh control modules for the zeroth virtual channel and the first virtual channel). Instead, a refresh control module shared by the zeroth and first virtual channels handles memory refresh operations, without distinguishing the virtual channel address of the memory access request. The embodiments of this application can therefore reduce the number of refresh control modules. For example, compared with providing independent refresh control modules for the zeroth and first virtual channels, the embodiments of this application save one refresh control module, reducing the amount of control logic and the overhead at the refresh control module level.

[0206] The shared queue arbitrator 750 is a queue arbitrator shared by the zeroth virtual channel and the first virtual channel in the memory controller, and can determine the request command that is sent to the final arbiter for final arbitration. As an optional implementation, given the shared page record module 720, the shared timing check module 730, and the shared refresh control module 740, the shared queue arbitrator can determine the request command corresponding to a memory access request, and send it to the final arbiter, based on information provided by at least one of these modules together with the request information stored in the shared command queue 710. Specifically, the shared page record module 720 provides at least the block information of blocks in memory, the shared timing check module 730 provides at least the timing parameters of the non-replicated commands output by the final arbiter, and the shared refresh control module 740 provides at least refresh commands (it can also provide precharge commands, RFMab commands, RFMpb commands, etc.).

[0207] As an optional implementation, based on information provided by at least one of the shared page record module 720, the shared timing check module 730, and the shared refresh control module 740, the shared queue arbitrator can determine the command with which a memory access request participates in the initial arbitration of the shared queue arbitrator; it can then select, from the commands participating in the initial arbitration, the request command to be sent to the final arbiter. In other words, the shared queue arbitrator determines through initial arbitration the request command to be sent to the final arbiter, and when selecting the commands that participate in the initial arbitration, it can draw on information provided by at least one of modules 720, 730, and 740.

[0208] It should be noted that the strategy and method by which the shared queue arbitrator determines the commands participating in the initial arbitration, based on at least one of the block information provided by the shared page record module, the timing parameters provided by the shared timing check module, and the commands provided by the shared refresh control module, can be defined and set according to actual circumstances; the embodiments of this application impose no limitation. For example, the page hit status of a memory access request is determined from the buffer state of the block provided by the shared page record module; then, if the timing parameters provided by the shared timing check module are satisfied, different commands for participating in the initial arbitration are generated according to the page hit status.

[0209] In one optional implementation, the shared queue arbitrator can determine the page hit status of a memory access request based at least on the block information recorded by the shared page record module and the address information in the request information stored in the shared command queue. Based on the page hit status, it can determine the command with which the memory access request participates in the initial arbitration of the shared queue arbitrator (furthermore, this command can be determined from the page hit status on the condition that the timing parameters provided by the shared timing check module are satisfied). Then, from the commands participating in the initial arbitration, it can determine the request command to be sent to the final arbiter.

[0210] As an optional implementation, Figure 8 shows an exemplary flowchart, provided in an embodiment of this application, for determining the request command of a memory access request. This process can be implemented by the shared queue arbitrator. As shown in Figure 8, the process may include the following steps.

[0211] In step S810, the page hit status of the memory access request is determined based on the address information of the memory access request.

[0212] In an optional implementation, the page state information of a block includes the buffer state. As an optional way for the shared queue arbitrator to determine the page hit status of a memory access request, the shared queue arbitrator can determine the block corresponding to the address information of the memory access request, query the buffer state of that block as recorded by the shared page record module, and then determine the page hit status of the memory access request from the queried buffer state.

[0213] For example, if the queried buffer state of the block indicates that the buffer has no row loaded, the page hit status of the memory access request is a page miss; if the buffer state indicates that the buffer has a row loaded and the loaded row is the row pointed to by the memory access request, the page hit status is a page hit; if the buffer state indicates that the buffer has a row loaded but the loaded row differs from the row pointed to by the memory access request, the page hit status is a page conflict.
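The three-way classification in paragraph [0213] can be written compactly. This is a minimal sketch: the `PageStatus` enum and the `page_status` function are hypothetical names, and the buffer state is modeled simply as the loaded row number or `None`.

```python
from enum import Enum
from typing import Optional


class PageStatus(Enum):
    MISS = "page miss"
    HIT = "page hit"
    CONFLICT = "page conflict"


def page_status(loaded_row: Optional[int], requested_row: int) -> PageStatus:
    """Classify a request against the buffer state of its target block."""
    if loaded_row is None:
        return PageStatus.MISS       # buffer has no row loaded
    if loaded_row == requested_row:
        return PageStatus.HIT        # loaded row is the requested row
    return PageStatus.CONFLICT       # a different row is loaded


print(page_status(None, 7).value)
print(page_status(7, 7).value)
print(page_status(3, 7).value)
```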

[0214] In step S811, if the page hit status is a page miss, the command with which the memory access request participates in the initial arbitration is determined to be a row activation command.

[0215] As an optional implementation, if the page hit status of the memory access request is a page miss, since the block buffer is not loaded with any rows at this time, a row activation command is needed to load the row pointed to by the memory access request into the block buffer before the column strobe command can be used to read or write data in the row pointed to by the memory access request. Therefore, when it is determined that the page hit status of the memory access request is a page miss, a row activation command can be generated to participate in the initial arbitration.

[0216] In step S812, if the page hit status is a page hit, the command with which the memory access request participates in the initial arbitration is determined to be a column command.

[0217] If the memory access request is determined to be a page hit, since the row pointed to by the memory access request has already been loaded into the block's buffer, data can be directly read or written to that row. Therefore, in this embodiment, a read command or write command can be generated based on the command flag (e.g., read / write flag) of the memory access request and participate in the initial arbitration. For example, when the memory access request is a read request, a read command can be generated and participate in the initial arbitration based on the read request's command flag being read; when the memory access request is a write request, a write command can be generated and participate in the initial arbitration based on the write request's command flag being write.

[0218] In one implementation example, since the storage units of a row in a block store data by column, once the row pointed to by the memory access request has been loaded into the block's buffer, a column strobe command can select columns in the row so that data is read from or written to the selected columns. For example, a column strobe command can select columns in a row of the block, thereby concurrently reading or writing data at a specified starting column and the subsequent columns. Therefore, when the page hit status of the memory access request is a page hit, a column strobe command can be generated to participate in the initial arbitration. For instance, when the page hit status is a page hit, if the memory access request is a read request, a column strobe command for a read operation (i.e., a read command) can be generated to participate in the initial arbitration; if the memory access request is a write request, a column strobe command for a write operation (i.e., a write command) can be generated to participate in the initial arbitration.

[0219] In step S813, if the page hit is a page conflict, the command for the memory access request to participate in the initial arbitration is determined to be a precharge command.

[0220] As an optional implementation, if the memory access request results in a page conflict, since the row pointed to by the memory access request is inconsistent with the row loaded in the block's buffer, the block needs to be precharged first. Only then can the row pointed to by the memory access request be loaded into the block's buffer via a row activation command, and finally, data can be read and written to the row pointed to by the memory access request via a column strobe command. Therefore, when a page conflict is determined in the memory access request, a precharge command can be generated (the precharge command is used to precharge the block corresponding to the row pointed to by the memory access request) and participate in the initial arbitration. The generated precharge command can be a single-block precharge command, indicating that the block to be precharged is the block containing the row pointed to by the memory access request.

[0221] It should be noted that, for a memory access request sent by the processor core, the same memory access request may generate a precharge command, a row activation command, and a column strobe command in sequence to participate in the initial arbitration. For example, if the page hit condition of a memory access request is a page conflict, a precharge command can be generated to participate in the initial arbitration, so that the block containing the row pointed to by the memory access request is precharged. The page hit condition of the memory access request may then change to a page miss, so a row activation command can be generated to participate in the initial arbitration, so that the row pointed to by the memory access request is loaded into the block's buffer. Furthermore, when the page hit condition of the memory access request changes to a page hit, a column strobe command can be generated to participate in the initial arbitration, allowing data to be read or written in the selected columns. Of course, in the embodiments of this application, only a row activation command and a column strobe command may be generated in sequence for the same memory access request; the corresponding situations can be understood from the foregoing description and will not be repeated here.
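The command selection in steps S812 and S813, together with the sequence described in paragraph [0221], can be sketched as follows. This is a minimal illustration only: the names `PageStatus`, `command_for_initial_arbitration`, and `command_sequence` are hypothetical, and the sketch assumes that a page miss leads to a row activation command, as paragraph [0221] indicates.

```python
from enum import Enum
from typing import List


class PageStatus(Enum):
    HIT = "page hit"            # requested row is already in the block's buffer
    MISS = "page miss"          # assumed: requested row still has to be activated
    CONFLICT = "page conflict"  # a different row is loaded in the block's buffer


def command_for_initial_arbitration(status: PageStatus, is_read: bool) -> str:
    """Return the command one request contributes to the initial arbitration."""
    if status is PageStatus.CONFLICT:
        return "PRECHARGE"   # step S813: precharge the block first
    if status is PageStatus.MISS:
        return "ACTIVATE"    # row activation loads the requested row
    # step S812: page hit, so a column (read or write) command is generated
    return "READ" if is_read else "WRITE"


def command_sequence(initial: PageStatus, is_read: bool) -> List[str]:
    """Commands generated over time as the page status moves toward a hit."""
    order = [PageStatus.CONFLICT, PageStatus.MISS, PageStatus.HIT]
    start = order.index(initial)
    return [command_for_initial_arbitration(s, is_read) for s in order[start:]]
```

For a read request that starts as a page conflict, `command_sequence(PageStatus.CONFLICT, True)` yields a precharge, then a row activation, then a read command, which matches the order given in the text.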

[0222] In step S814, at most one row activation command, one column command, and one precharge command are simultaneously determined from the commands participating in the initial arbitration as request commands to be sent to the final arbitrator.

[0223] When performing initial arbitration on the commands participating in the initial arbitration, the shared queue arbitrator can select at most one row activation command, one column command, and one precharge command from those commands as request commands to be sent to the final arbitrator. Furthermore, the shared queue arbitrator does not consider the virtual channel address of the memory access request corresponding to a command when performing initial arbitration. The strategy used by the shared queue arbitrator for initial arbitration can be set according to actual conditions, and this application embodiment does not impose limitations.
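As an illustration of step S814, the sketch below selects at most one command of each kind without ever reading the virtual channel address. The source leaves the arbitration strategy open, so the oldest-first policy, the `Candidate` type, and its field names are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    kind: str             # "ACTIVATE", "COLUMN", or "PRECHARGE"
    age: int              # hypothetical: cycles spent waiting in the shared queue
    virtual_channel: int  # stored with the request, but never read below


def initial_arbitration(candidates: List[Candidate]) -> Dict[str, Candidate]:
    """Pick at most one command per kind (assumed policy: oldest first)."""
    winners: Dict[str, Candidate] = {}
    for cand in candidates:
        best = winners.get(cand.kind)
        if best is None or cand.age > best.age:
            winners[cand.kind] = cand
    return winners
```

Because the function never inspects `virtual_channel`, one arbitrator can serve requests for both the zeroth and the first virtual channel, which is the point made in paragraph [0224].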

[0224] As can be seen, unlike the approach of setting independent control logic circuits for each virtual channel, the embodiments of this application do not require setting independent queue arbitrators for each virtual channel. For example, it does not require setting independent queue arbitrators for the zeroth virtual channel or the first virtual channel. Instead, without distinguishing the virtual channel address of the memory access request, the queue arbitrator shared by the zeroth and first virtual channels is used to determine the request command for the memory access request to be sent to the final arbitrator. Therefore, the embodiments of this application can reduce the number of queue arbitrators. For example, compared to setting independent queue arbitrators for the zeroth and first virtual channels, the embodiments of this application can save one queue arbitrator, thereby reducing the number of control logic and overhead at the queue arbitrator level.

[0225] In a further optional implementation, commands generated by the shared refresh control module can also be sent to the final arbitrator to participate in the final arbitration. For example, commands generated by the shared refresh control module may include refresh commands (single block refresh command, full block refresh command, etc.), and these refresh commands can be sent to the final arbitrator to participate in the final arbitration. As another example, commands generated by the shared refresh control module may also include precharge commands (single block precharge command, full block precharge command, etc.), RFMab commands, RFMpb commands, etc., and any of these commands can be sent to the final arbitrator to participate in the final arbitration.

[0226] Therefore, in an optional implementation, a command that the shared control logic circuit sends to the final arbiter may come from the shared queue arbiter or from the shared refresh control module. Furthermore, the final arbiter can perform final arbitration on the commands sent by the shared queue arbiter, the shared refresh control module, and the command replicator, determine the command that passes the final arbitration, and output it. For how the final arbiter arbitrates and schedules commands, refer to the description in the corresponding section above, which will not be repeated here.

[0227] It can be seen that the logic components in the shared control logic circuit include, but are not limited to, a shared command queue, a shared queue arbitrator, a shared page record module, a shared refresh control module, and a shared timing check module. Furthermore, the shared command queue stores the virtual channel address of the memory access request only to indicate whether a command is to be marked as real or fake, and does not use it for any other purpose. Therefore, these logic components can be shared by the zeroth virtual channel and the first virtual channel without considering the virtual channel address of the memory access request (i.e., without paying attention to or using the virtual channel address of the memory access request). This further reduces the number of control logic components in the memory controller, thereby further reducing the overhead of the memory controller. It should be further noted that functional modules (e.g., logic components) responsible for other functions in the memory controller can also implement their functions without paying attention to or using virtual channel addresses; for example, low-power modules can implement low-power control without paying attention to or using virtual channel addresses.

[0228] In a further optional implementation, Figure 9 shows an exemplary diagram of another memory controller provided in an embodiment of this application. As shown in Figure 2, Figure 4A, Figure 7 and Figure 9, the memory controller may also include: a first interface 910, a write data cache 920, an address decoding module 930, a trigger and response queue 940, and a second interface 950.

[0229] The first interface 910 can be an interface between the memory controller and the processor core. For example, the first interface 910 is connected to the system bus, thereby connecting to the processor core through the system bus. The first interface 910 can receive memory access requests from upstream modules of the memory controller (such as the processor core).

[0230] As an optional implementation, the type of memory access request issued by the processor core can be determined according to the processor architecture, memory type, and chip design, and this application embodiment does not impose any limitations. In one example, the memory access request may include any of the following: a read request and a write request.

[0231] The write data cache 920 is used to store the write data that the write request needs to write to memory when the memory access request is a write request.

[0232] The address decoding module 930 can perform address decoding processing on the memory access request received by the first interface 910, and send the memory access request information after address decoding processing to the shared control logic circuit, such as sending it to the shared command queue. In an optional implementation, the address decoding module can map the physical address of the memory access request to the standard memory address according to the address mapping rules provided by the configuration register, so as to realize the address decoding processing of the memory access request; wherein, the address mapping rules can record the mapping rules between the physical address of the memory access request and the standard memory address, and the mapping rules can be in the form of a mapping relationship or a mapping function.

[0233] It should be noted that the standard memory address can be considered the standard address format used by memory, such as the standard address format of HBM3 memory, which corresponds to the actual storage location of data in memory. The physical address of the memory access request sent by the processor core is the upstream address, and its format is defined by the processor core. For example, the physical address of the memory access request is an address in the processor core's physical address space, determined by the processor core's architecture and design. By mapping the address in the processor core's physical address space (the physical address of the memory access request) to the actual storage location of the data in memory (the standard memory address), it can be ensured that the accessed data is correctly manipulated (e.g., correctly read from or written to memory). Furthermore, the memory controller can set configuration registers to record the above address mapping rules; for example, the configuration register can be controlled by the memory controller and record a series of configuration information, including the above address mapping rules.

[0234] In an optional implementation, the request information obtained by the address decoding module 930 after performing address decoding processing on the memory access request may include, but is not limited to: the address information of the memory access request, the command flag of the memory access request, the data length (also known as the data size) of the memory access data, and the priority of the memory access request. The above-mentioned memory access request information can be sent to a shared control logic circuit (e.g., a shared command queue in the shared control logic circuit).

[0235] In an optional implementation, the address information of the memory access request may include the standard memory address after the memory access request has been decoded.

[0236] The command flags for memory access requests are used to indicate the type of memory access operation requested, such as whether the memory access operation is a read operation or a write operation; correspondingly, the command flags can also be called read / write flags.

[0237] Memory access data refers to the data corresponding to the memory access operation of a memory access request, such as the write data for a write request or the read data for a read request. Correspondingly, the length of the memory access data includes, for example, the length of the write data or the length of the read data. The length of the memory access data corresponding to a memory access request can correspond to the second data length supported by a memory channel (e.g., 64 bytes) or the first data length supported by a virtual channel (e.g., 32 bytes).

[0238] The priority of a memory access request can be the priority at which the memory controller responds to the memory access request. The processor core can define the priority of a memory access request based on factors such as the urgency and importance of the memory access request when issuing the request.

[0239] In a further optional implementation, when the address decoding module 930 performs address decoding processing on the memory access request, it can determine the virtual channel corresponding to the memory access request. For example, it can determine the virtual channel address carried by the physical address of the memory access request (the virtual channel address carried by the physical address of the memory access request can be simply referred to as the virtual channel address of the memory access request). Thus, the virtual channel to which the virtual channel address of the memory access request belongs can be regarded as the virtual channel corresponding to the memory access request. As an example of an optional implementation, the set bits of the physical address of the memory access request can carry the virtual channel address. For example, the fifth bit (bit5) of the physical address of the memory access request can carry the virtual channel address. Thus, when performing address decoding processing on the memory access request, the virtual channel address can be determined from the set bits (e.g., bit5) of the physical address of the memory access request, and the virtual channel to which the virtual channel address belongs can be regarded as the virtual channel corresponding to the memory access request. For example, if the virtual channel address of the memory access request belongs to the zeroth virtual channel, then the virtual channel corresponding to the memory access request is the zeroth virtual channel; if the virtual channel address of the memory access request belongs to the first virtual channel, then the virtual channel corresponding to the memory access request is the first virtual channel.

[0240] It should be noted that the data granularity controlled by the physical address setting bit (e.g., bit 5) of the memory access request can correspond to the amount of data transferred in a single data transfer within a virtual channel. In one example, taking a 32-byte data length for a single read or write operation in HBM3 technology as an example, the data granularity controlled by the physical address setting bit (e.g., bit 5) of the memory access request can be 32 bytes, corresponding to the amount of data transferred in a single read or write operation within a virtual channel of HBM3 technology.
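The bit5 example in paragraphs [0239] and [0240] can be expressed as a one-line extraction. The sketch below is illustrative only: `VC_BIT`, `VC_GRANULARITY`, and `virtual_channel_of` are hypothetical names, and bit 5 is simply the example position given in the text.

```python
VC_BIT = 5                    # example bit position from the text (bit5)
VC_GRANULARITY = 1 << VC_BIT  # 2**5 = 32 bytes per virtual-channel transfer


def virtual_channel_of(physical_address: int) -> int:
    """Return 0 for the zeroth virtual channel, 1 for the first."""
    return (physical_address >> VC_BIT) & 1
```

With this layout, addresses 0x00 to 0x1F map to the zeroth virtual channel and 0x20 to 0x3F map to the first, so consecutive 32-byte pieces of a 64-byte aligned region alternate between the two virtual channels.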

[0241] It should be noted that although the shared control logic circuit does not distinguish the virtual channel address of the memory access request, when the memory access data of the memory access request is of a first data length (e.g., 32 bytes), this embodiment of the application needs to use the virtual channel address of the memory access request to determine the virtual channel for the actual transmission of the column command when the final arbiter schedules the output column command. Therefore, the virtual channel address of the memory access request can also be used as part of the request information of the memory access request (e.g., part of the address information of the memory access request). In other words, the request information of the memory access request may include the virtual channel address of the memory access request, which is used to indicate whether the column command is marked as real or fake when the final arbiter outputs the column command.

[0242] The trigger and response queue 940 can be used to retrieve commands output by the final arbitrator; that is, commands output by the final arbitrator can be sent to the trigger and response queue 940. Thus, the trigger and response queue 940 can send the commands output by the final arbitrator to memory according to the virtual channel indicated by the commands output by the final arbitrator (e.g., the virtual channel address of the commands output by the final arbitrator). In an optional implementation, the trigger and response queue 940 can send commands to memory through a second interface 950 and the memory physical layer, wherein the second interface 950 is connected to the memory physical layer.

[0243] In a further optional implementation, if the command output by the final arbiter is a column command for a write operation, the trigger and response queue 940 can send the column command for the write operation output by the final arbiter and the write data stored in the write data cache 920 to memory according to the virtual channel indicated by the column command for the write operation output by the final arbiter.

[0244] In a further optional implementation, if the command output by the final arbiter is a column command for a read operation, the trigger and response queue 940 can send the column command for the read operation output by the final arbiter to memory according to the virtual channel indicated by the column command for the read operation output by the final arbiter.

[0245] It should be noted that, for column commands output by the final arbiter, the trigger and response queue 940 sends the column command to memory according to the virtual channel indicated by the column command when the column command is marked as true (and also sends the write data to memory when the column command is a write command). However, when the column command output by the final arbiter is marked as false, the trigger and response queue 940 discards the falsely marked column command and does not send it to memory. The cases of true and false column command markings can be referred to the corresponding descriptions above, and will not be elaborated upon here.
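The handling of real and fake column commands can be sketched as below. The sketch follows the marking conditions stated in the claims (a non-copied column command is marked false only when its request has the first data length and targets the first virtual channel); the type `ColumnOutput`, the function names, and the 32-byte and 64-byte values are illustrative assumptions, not the circuit itself.

```python
from dataclasses import dataclass
from typing import List

FIRST_LEN = 32   # example: bytes one virtual channel transfers at a time
SECOND_LEN = 64  # example: bytes one memory channel transfers at a time


@dataclass
class ColumnOutput:
    virtual_channel: int  # channel the output command indicates
    copied: bool          # True if it came from the command replicator
    real: bool            # False means the queue discards the command


def column_outputs(data_length: int, request_vc: int) -> List[ColumnOutput]:
    """Final-arbiter outputs for one column command, in clock-cycle order."""
    if data_length == SECOND_LEN:
        # 64-byte request: real command on VC0, then real copy on VC1.
        return [ColumnOutput(0, False, True), ColumnOutput(1, True, True)]
    if request_vc == 0:
        # 32-byte request for VC0: one real command, no copy is output.
        return [ColumnOutput(0, False, True)]
    # 32-byte request for VC1: fake command on VC0, then real copy on VC1.
    return [ColumnOutput(0, False, False), ColumnOutput(1, True, True)]


def sent_to_memory(outputs: List[ColumnOutput]) -> List[int]:
    """Virtual channels actually used; fake commands are discarded."""
    return [o.virtual_channel for o in outputs if o.real]
```

In all three cases memory only ever sees commands on the virtual channels the request actually needs, while the shared control logic observes the same two-cycle pattern whenever a copy is produced.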

[0246] In a further optional implementation, since row commands carry no real or fake marking, this application embodiment needs to ensure that the row commands output by the final arbitrator (non-copied row commands and copied row commands) are actually sent to memory. Therefore, when the command output by the final arbitrator is a row command, the trigger and response queue 940 can send the row command output by the final arbitrator to memory according to the virtual channel indicated by that row command.

[0247] In a further optional implementation, in addition to sending the command output by the final arbiter to memory, the trigger and response queue 940 can also receive read data returned from memory after sending the read command to memory. The read data corresponds to the column command (i.e., read command) of the read operation output by the final arbiter. Thus, the trigger and response queue 940 can decide whether to return the received read data directly to the processor core or wait to collect a sufficient amount of read data before returning the read data to the processor core, based on the data length of the memory access request corresponding to the column command of the read operation.

[0248] In one implementation example, if the data length of the memory access request corresponding to the column command of a read operation is the second data length (e.g., 64 bytes), then, because a virtual channel can transmit only the first data length (e.g., 32 bytes) at one time, the trigger and response queue 940 can first cache the read data corresponding to one column command of a read operation, and then return the read data corresponding to the next column command of a read operation (i.e., the read data returned in the next clock cycle) to the processor core together with the cached read data. Each column command of a read operation reads read data of the first data length.

[0249] In other words, for a read request of the second data length (e.g., 64 bytes), the read data of the first data length (e.g., 32 bytes) returned by the first read command of the read request can be cached. The read data of the first data length (e.g., 32 bytes) returned in the next clock cycle by the second read command (which may be a copied read command) need not be cached. Instead, the read data returned by the second read command is sent to the processor core together with the cached read data from the first read command, satisfying the processor core's read requirement for the second data length (e.g., 64 bytes). In an optional implementation, a read data cache can be set in the trigger and response queue 940 to cache the read data. Since the read data cache needs to cache data of the first data length (e.g., 32 bytes), its depth must support at least the column command of a single read operation, and its width must correspond at least to the first data length. For example, the size of the read data cache can be set to correspond to the first data length (e.g., 32 bytes). Of course, in this embodiment, the size of the read data cache can also be greater than the first data length (e.g., 32 bytes).

[0250] Furthermore, if the data length of the memory access request corresponding to the column command of the read operation is the first data length, then the trigger and response queue 940 can directly return the read data to the processor core.

[0251] In a further optional implementation, as an example, the trigger and response queue 940 can send the command output from the final arbitrator and the write data (for write commands) obtained from the write data cache 920 to the second interface 950 according to rules (e.g., the trigger and response queue 940 decodes the command into a memory-required format such as HBM3 and sends it to the second interface 950 according to the rules of the HBM3 memory requirement), thereby delivering it to memory through the second interface 950 and the memory physical layer. For read requests issued by the processor core, the trigger and response queue 940 also receives read data returned from memory. If it is a 64-byte read request, the trigger and response queue 940 first stores the returned 32 bytes of read data in its internal read data cache. After collecting the next 32 bytes of read data, it sends it to the processor core along with the cached 32 bytes of read data. If it is a 32-byte read request, the trigger and response queue 940 can send the returned 32 bytes of read data directly to the processor core. It should be noted that since the read command corresponding to a 64-byte read request is sent to memory over two consecutive clock cycles (one clock cycle sends a non-copy read command to memory, and the next clock cycle sends a copy read command to memory), the trigger and response queue 940 can be configured with a read data buffer of depth 1 (e.g., supporting a single read command) and width 32 bytes. Furthermore, for column commands marked as false in the final arbitrator output, the trigger and response queue 940 can discard them instead of sending them to memory.
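The read-return behavior of the trigger and response queue 940 can be sketched with a depth-1 read data cache. The class `ReadDataCollector` and its method name are hypothetical, and the order in which the two 32-byte pieces are concatenated is an assumption made for illustration.

```python
from typing import Optional

FIRST_LEN = 32   # example: bytes returned per read command on one virtual channel
SECOND_LEN = 64  # example: bytes requested by a full memory-channel read


class ReadDataCollector:
    """Depth-1, 32-byte-wide read data cache, as described in the text."""

    def __init__(self) -> None:
        self._cached: Optional[bytes] = None

    def on_read_data(self, data: bytes, request_length: int) -> Optional[bytes]:
        """Return data for the processor core, or None while still collecting."""
        if request_length == FIRST_LEN:
            return data                 # 32-byte request: forward directly
        if self._cached is None:
            self._cached = data         # first half of a 64-byte request
            return None
        merged = self._cached + data    # second half arrives one cycle later
        self._cached = None
        return merged
```

A single cache entry is sufficient here because, per the text, the non-copied and copied read commands of a 64-byte request are sent in consecutive clock cycles, so their read data also returns back to back.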

[0252] The second interface 950 is the interface for the memory controller to connect to the memory physical layer. The second interface 950 can adopt the DFI (DDR PHY Interface) standard, etc. DFI can be regarded as an interface standard for connecting the memory controller and the memory physical layer. As a bridge between the memory controller and the memory physical layer, it can ensure normal communication and mutual coordination between the memory controller and the memory physical layer.

[0253] It should be noted that the memory access requests in this application embodiment are not limited to 64 bytes and 32 bytes. For example, this application embodiment can also support 128-byte memory access requests. For instance, a 128-byte memory access request can be coordinated and scheduled across two 64-byte memory channels: the request is distributed across the two memory channels, and each memory channel handles a 64-byte memory access request, thereby satisfying the 128-byte memory access requirement. For the way the memory controller controls the zeroth virtual channel and the first virtual channel of a memory channel to handle a 64-byte memory access request, refer to the description in the corresponding section above, which will not be repeated here.

[0254] This application embodiment can achieve coordinated control of the zeroth virtual channel and the first virtual channel through a single set of control logic within the memory controller, reducing the number of control logic circuits used by the memory controller (for example, the timing check modules, page record modules, queue arbitrators, refresh control modules, and other arbitration-related control logic can be halved), thereby reducing the overhead of the memory controller. At the same time, this application embodiment can maintain the correctness of memory access by keeping the timing and page states of the zeroth and first virtual channels consistent. Furthermore, for memory access requests of the second data length (e.g., 64 bytes), the two pieces of read data of the first data length (e.g., 32 bytes) are returned from memory only one clock cycle apart. Therefore, the depth of the read data cache can be set to support a single read command, reducing the depth of the read data cache.
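The distribution of a larger request can be illustrated as follows. The source does not specify how offsets are assigned to memory channels, so the rule used here (consecutive 64-byte pieces go to consecutive memory channels, and within each piece the lower 32 bytes use the zeroth virtual channel) is purely an assumption, as is the name `plan_transfers`.

```python
from typing import List, Tuple

FIRST_LEN = 32   # example: bytes per virtual-channel transfer
SECOND_LEN = 64  # example: bytes per memory-channel transfer


def plan_transfers(length: int) -> List[Tuple[int, int, int]]:
    """Split a request into (memory_channel, virtual_channel, offset) pieces."""
    if length <= 0 or length % SECOND_LEN != 0:
        raise ValueError("sketch only handles positive multiples of 64 bytes")
    plan = []
    for offset in range(0, length, FIRST_LEN):
        memory_channel = offset // SECOND_LEN        # assumed channel assignment
        virtual_channel = (offset // FIRST_LEN) % 2  # VC0 first, then VC1
        plan.append((memory_channel, virtual_channel, offset))
    return plan
```

Under these assumptions a 128-byte request becomes four 32-byte transfers, two per memory channel, with each memory channel handling its 64-byte share through its zeroth and first virtual channels.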

[0255] This application also provides a chip, such as a system-on-a-chip. As shown in Figure 1, the chip may include: at least one processor core, at least one memory controller, a memory physical layer, and memory.

[0256] In this embodiment, at least one processor core is connected to at least one memory controller; the memory physical layer includes at least one memory channel, which includes a zeroth virtual channel and a first virtual channel; the memory controller and the memory transmit data through the memory channel; the memory controller in the chip can be the memory controller provided in the embodiments of this application.

[0257] This application also provides an electronic device, such as a terminal device or a server device, which may include the chip provided in this application embodiment, or the memory controller provided in this application embodiment.

[0258] The foregoing describes multiple embodiment schemes provided by the embodiments of this application. The optional methods described in each embodiment scheme can be combined and cross-referenced with each other without conflict, thereby extending to a variety of possible embodiment schemes. These can all be considered as the embodiment schemes disclosed and published by the embodiments of this application.

[0259] While the embodiments disclosed above are described in this application, this application is not limited thereto. Any person skilled in the art can make various modifications and alterations without departing from the spirit and scope of this application; therefore, the scope of protection of this application should be determined by the scope defined in the claims.

Claims

1. A memory controller, characterized in that the memory controller transmits data to the memory via a memory channel, which includes a zeroth virtual channel and a first virtual channel. The memory controller includes: a shared control logic circuit, a final arbiter, and a command copier. The shared control logic circuit is shared by the zeroth virtual channel and the first virtual channel. The shared control logic circuit is used to determine the request command corresponding to the memory access request and send the request command to the final arbiter. The command copier is used to copy the request command when the command output by the final arbiter is a non-copied request command and the request command meets the preset copying conditions, so as to obtain the corresponding copied request command, and to send the copied request command to the final arbiter. The final arbiter is used to perform final arbitration on commands sent to the final arbiter; if the command that passes the final arbitration is a non-copied request command, the request command is output and indicated to be transmitted through the zeroth virtual channel; and at least one clock cycle after the output of the non-copied request command, if the final arbitration passes the corresponding copied request command, the copied request command is output and indicated to be transmitted through the first virtual channel. The non-copied request command comes from the shared control logic circuit, while the copied request command comes from the command copier.

2. The memory controller according to claim 1, characterized in that, The memory includes multiple blocks, which are arrays of rows and columns; the request commands corresponding to the memory access request are divided into row commands and column commands; the row commands are used to control at least the block where the row pointed to by the memory access request is located, and the column commands are used to perform memory access operations on the columns in the row pointed to by the memory access request.

3. The memory controller according to claim 2, characterized in that the command copier being used to copy a request command when the command output by the final arbiter is a non-copied request command and the request command meets preset copying conditions, so as to obtain the corresponding copied request command, includes: when the final arbiter outputs a non-copied request command, or a column command other than a column command that is marked as true and originates from a memory access request of the first data length, the request command is copied to obtain the corresponding copied request command; the first data length is the data length that a virtual channel supports for transmitting data.

4. The memory controller according to claim 3, characterized in that the final arbiter is further configured to mark the output column command as false when the output request command is a column command and the conditions are met; the conditions include: the column command output by the final arbiter is a non-copied column command, the data length of the memory access request corresponding to the column command is the first data length, and the virtual channel indicated by the memory access request corresponding to the column command is the first virtual channel; when the output request command is a column command and the conditions are not met, the output column command is marked as true. Column commands marked as true are sent to memory after the final arbiter outputs them, while column commands marked as false are discarded after the final arbiter outputs them.

5. The memory controller according to claim 3, characterized in that the final arbiter includes a column command arbiter and a row command arbiter. The column command arbiter is used to perform final arbitration on the column commands sent to the final arbiter; if the column command that passes the final arbitration is a non-copied column command, the column command is output and indicated to be transmitted through the zeroth virtual channel; and if there is a corresponding copied column command for the output non-copied column command, the corresponding copied column command passes the final arbitration at least one clock cycle after the output of the non-copied column command, and the copied column command is output and indicated to be transmitted through the first virtual channel. The row command arbiter is used to perform final arbitration on the row commands sent to the final arbiter; if the row command that passes the final arbitration is a non-copied row command, the row command is output and indicated to be transmitted through the zeroth virtual channel; and at least one clock cycle after the output of the non-copied row command, the corresponding copied row command is output and indicated to be transmitted through the first virtual channel.

6. The memory controller according to claim 5, characterized in that the command replicator, configured to replicate the request command to obtain the corresponding replicated request command when the command output by the final arbiter is a non-replicated request command, or is a column command other than a column command that is marked as true and originates from a memory access request of the first data length, is configured to: when the column command arbiter outputs a non-replicated column command, or when the output column command is a column command other than a column command that is marked as true and originates from a memory access request of the first data length, replicate the column command to obtain the corresponding replicated column command.

7. The memory controller according to claim 6, characterized in that the column command arbiter, configured to output the column command and indicate transmission through the zeroth virtual channel if the column command that passes final arbitration is a non-replicated column command, is configured to: if the column command that passes final arbitration is a non-replicated column command, the data length of the memory access request corresponding to the column command is the first data length, and the virtual channel indicated by the memory access request is the zeroth virtual channel, output, in the current clock cycle, the column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as true; wherein, in the next clock cycle, the final arbiter is occupied by the command replicator but does not output a column command.

8. The memory controller according to claim 6, characterized in that the column command arbiter, configured to output the column command and indicate transmission through the zeroth virtual channel if the column command that passes final arbitration is a non-replicated column command, is configured to: if the column command that passes final arbitration is a non-replicated column command, the data length of the memory access request corresponding to the column command is the first data length, and the virtual channel indicated by the memory access request is the first virtual channel, output, in the current clock cycle, the column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as false; and the column command arbiter, configured to, if the output non-replicated column command has a corresponding replicated column command, pass the corresponding replicated column command in final arbitration at least one clock cycle after the non-replicated column command is output, and output the replicated column command and indicate transmission through the first virtual channel, is configured to: if the output column command is a non-replicated column command, the data length of the memory access request corresponding to the column command is the first data length, and the virtual channel indicated by the memory access request is the first virtual channel, pass the corresponding replicated column command in final arbitration in the next clock cycle, and output the replicated column command whose virtual channel address belongs to the first virtual channel and which is marked as true.

9. The memory controller according to claim 6, characterized in that the column command arbiter, configured to output the column command and indicate transmission through the zeroth virtual channel if the column command that passes final arbitration is a non-replicated column command, is configured to: if the column command that passes final arbitration is a non-replicated column command and the data length of the memory access request corresponding to the column command is the second data length, output, in the current clock cycle, the column command whose virtual channel address belongs to the zeroth virtual channel and which is marked as true, wherein the second data length is the total data length that the zeroth virtual channel and the first virtual channel included in the memory channel support for data transmission; and the column command arbiter, configured to, if the output non-replicated column command has a corresponding replicated column command, pass the corresponding replicated column command in final arbitration at least one clock cycle after the non-replicated column command is output, and output the replicated column command and indicate transmission through the first virtual channel, is configured to: if the output column command is a non-replicated column command and the data length of the memory access request corresponding to the column command is the second data length, pass the corresponding replicated column command in final arbitration in the next clock cycle, and output the replicated column command whose virtual channel address belongs to the first virtual channel and which is marked as true.
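
As an illustration only, and not part of the claims, the two-cycle column command behaviour described in claims 7 to 9 can be tabulated as follows. The function `column_outputs` and the `Output` tuple are hypothetical names. The sketch assumes the non-replicated command of the second data length is output with a zeroth-virtual-channel address, as the parent clause of claim 9 indicates.

```python
from typing import List, NamedTuple, Optional


class Output(NamedTuple):
    virtual_channel: int  # virtual channel address carried by the output command
    replicated: bool      # True for the command produced by the replicator
    marked_true: bool     # True: sent to memory; False: discarded


def column_outputs(is_first_length: bool, requested_vc: int) -> List[Optional[Output]]:
    """Return what the column command arbiter outputs in the current and next cycle.

    None means that no column command is output in that cycle.
    """
    if requested_vc not in (0, 1):
        raise ValueError("requested_vc must be 0 or 1")
    if is_first_length and requested_vc == 0:
        # Claim 7: true command on channel 0, then an occupied cycle with no output.
        return [Output(0, False, True), None]
    if is_first_length and requested_vc == 1:
        # Claim 8: false command on channel 0, then the true replica on channel 1.
        return [Output(0, False, False), Output(1, True, True)]
    # Claim 9 (second data length): true command on channel 0, true replica on channel 1.
    return [Output(0, False, True), Output(1, True, True)]
```

In every case the arbiter spends two consecutive clock cycles on one memory access request, which is what allows a single set of control logic to serve both virtual channels.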

10. The memory controller according to claim 5, characterized in that, if the column command output by the column command arbiter originates from a memory access request of the first data length, the column command does not carry an automatic precharge instruction; and if it originates from a memory access request of the second data length, the column command may or may not carry an automatic precharge instruction.

11. The memory controller according to claim 5, characterized in that the command replicator, configured to replicate the request command to obtain the corresponding replicated request command when the command output by the final arbiter is a non-replicated request command, or is a column command other than a column command that is marked as true and originates from a memory access request of the first data length, is configured to: when the row command arbiter outputs a non-replicated row command, replicate the row command to obtain the corresponding replicated row command; and the row command arbiter, configured to output the corresponding replicated row command and indicate transmission through the first virtual channel at least one clock cycle after the non-replicated row command is output, is configured to: based on the clock cycles that the output non-replicated row command occupies on the command bus, select, at least one clock cycle after the non-replicated row command is output, the row commands that participate in final arbitration, and output the replicated row command and indicate transmission through the first virtual channel.

12. The memory controller according to claim 11, characterized in that the row command arbiter, configured to select, at least one clock cycle after the non-replicated row command is output, the row commands that participate in final arbitration based on the clock cycles that the output non-replicated row command occupies on the command bus, is configured to: if the output non-replicated row command occupies 1.5 clock cycles of the command bus, select, in the first clock cycle after the current clock cycle in which the non-replicated row command is output, a row command that does not conflict with the output row command and occupies 0.5 clock cycles of the command bus to participate in final arbitration; and prohibit non-replicated row commands from participating in final arbitration in the second to third clock cycles after the current clock cycle; and the row command arbiter, configured to output the replicated row command and indicate transmission through the first virtual channel, is configured to: in the second to third clock cycles, output the replicated row commands sequentially in the order in which they were sent from the current clock cycle to the first clock cycle, and indicate that the virtual channel address belongs to the first virtual channel; wherein the selected row command occupying 0.5 clock cycles of the command bus is output by the row command arbiter 0.5 clock cycles after the first clock cycle, the output row command is replicated by the command replicator, and the corresponding replicated row command is output by the row command arbiter 0.5 clock cycles after the third clock cycle.

13. The memory controller according to claim 11, characterized in that the row command arbiter, configured to select, at least one clock cycle after the non-replicated row command is output, the row commands that participate in final arbitration based on the clock cycles that the output non-replicated row command occupies on the command bus, is configured to: if the output non-replicated row command occupies 1 clock cycle of the command bus, prohibit non-replicated row commands from participating in final arbitration in the first clock cycle after the current clock cycle in which the non-replicated row command is output; and the row command arbiter, configured to output the replicated row command and indicate transmission through the first virtual channel, is configured to: in the first clock cycle after the current clock cycle, output the corresponding replicated row command and indicate that the virtual channel address belongs to the first virtual channel.
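
As an illustration only, and not part of the claims, the simplest case (claim 13, a row command occupying 1 clock cycle of the command bus) can be sketched as an alternating schedule. The function name and the tuple layout are hypothetical, and the sketch assumes that the row commands are issued back to back with no other timing constraint in force.

```python
def schedule_single_cycle_rows(row_commands):
    """Schedule row commands that each occupy exactly 1 command-bus clock cycle.

    Returns a list of (cycle, name, virtual_channel, replicated) tuples.
    In the cycle after each non-replicated command, non-replicated row
    commands are barred from arbitration and the replica is output instead.
    """
    schedule = []
    cycle = 0
    for name in row_commands:
        schedule.append((cycle, name, 0, False))     # original, zeroth virtual channel
        schedule.append((cycle + 1, name, 1, True))  # replica, first virtual channel
        cycle += 2
    return schedule
```

For two hypothetical commands `"ACT_A"` and `"ACT_B"`, the schedule places the originals in cycles 0 and 2 on the zeroth virtual channel and their replicas in cycles 1 and 3 on the first virtual channel.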

14. The memory controller according to claim 11, characterized in that the row command arbiter, configured to select, at least one clock cycle after the non-replicated row command is output, the row commands that participate in final arbitration based on the clock cycles that the output non-replicated row command occupies on the command bus, is configured to: if the output non-replicated row command occupies 0.5 clock cycles of the command bus, select, 0.5 clock cycles after the current clock cycle in which the row command is output, a row command that does not conflict with the output row command and occupies 0.5 clock cycles of the command bus to participate in final arbitration; and prohibit non-replicated row commands from participating in final arbitration in the first clock cycle after the current clock cycle; and the row command arbiter, configured to output the replicated row command and indicate transmission through the first virtual channel, is configured to: in the first clock cycle after the current clock cycle, output the replicated row commands sequentially in the order in which they were sent in the current clock cycle, and indicate that the virtual channel address belongs to the first virtual channel; wherein the selected row command occupying 0.5 clock cycles of the command bus is output by the row command arbiter in the first clock cycle, the output row command is replicated by the command replicator, and the corresponding replicated row command is output by the row command arbiter 0.5 clock cycles after the first clock cycle.

15. The memory controller according to any one of claims 1-14, characterized in that the shared control logic circuit is further configured to sample the command output by the final arbiter when the final arbiter outputs a non-replicated command, and to start timing parameter counting for the sampled command based on a counting basis of the timing parameters; wherein the counting basis of the timing parameters is: for a command that occupies 1.5 clock cycles of the command bus, timing parameter counting starts from the first rising clock edge at which the final arbiter outputs the command, so that a command that occupies 1.5 clock cycles of the command bus starts timing parameter counting one clock cycle earlier than under the standard clock cycle.

16. The memory controller according to claim 15, characterized in that the shared control logic circuit, configured to start timing parameter counting for the sampled command based on the counting basis of the timing parameters, is configured to: based on the counting basis of the timing parameters, count the timing parameters of the currently sampled command according to the change in the clock cycles that adjacent sampled commands occupy on the command bus.

17. The memory controller according to claim 16, characterized in that the shared control logic circuit, configured to count the timing parameters of the currently sampled command based on the counting basis of the timing parameters and according to the change in the clock cycles that adjacent sampled commands occupy on the command bus, is configured to: determine the change in the clock cycles that adjacent sampled commands occupy on the command bus; if the change is any one of 1 clock cycle to 1 clock cycle, 0.5 clock cycles to 0.5 clock cycles, 1.5 clock cycles to 1.5 clock cycles, 1 clock cycle to 0.5 clock cycles, and 0.5 clock cycles to 1 clock cycle, count the timing parameters of the currently sampled command according to the standard clock cycle, based on the counting basis of the timing parameters; if the change is from 1.5 clock cycles to 0.5 clock cycles or from 1.5 clock cycles to 1 clock cycle, count the timing parameters of the currently sampled command according to the standard clock cycle plus 2 clock cycles, based on the counting basis of the timing parameters; and if the change is from 0.5 clock cycles to 1.5 clock cycles or from 1 clock cycle to 1.5 clock cycles, count the timing parameters of the currently sampled command according to the standard clock cycle minus 1 clock cycle, based on the counting basis of the timing parameters.
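
As an illustration only, and not part of the claims, the three cases of claim 17 reduce to a small lookup on the command-bus occupancy of two adjacent sampled commands. The function name `count_adjustment` is hypothetical, and the sketch assumes that "adjacent" means the previously sampled command followed by the currently sampled one.

```python
VALID_OCCUPANCIES = (0.5, 1.0, 1.5)  # clock cycles a command may occupy on the command bus


def count_adjustment(prev_occupancy: float, curr_occupancy: float) -> int:
    """Return the number of clock cycles added to the standard count.

    0  : count according to the standard clock cycle
    +2 : the occupancy changes from 1.5 cycles to 0.5 or 1 cycle
    -1 : the occupancy changes from 0.5 or 1 cycle to 1.5 cycles
    """
    if prev_occupancy not in VALID_OCCUPANCIES or curr_occupancy not in VALID_OCCUPANCIES:
        raise ValueError("occupancy must be 0.5, 1 or 1.5 clock cycles")
    if prev_occupancy == 1.5 and curr_occupancy != 1.5:
        return 2
    if prev_occupancy != 1.5 and curr_occupancy == 1.5:
        return -1
    return 0
```

All nine combinations of the three occupancies are covered: the five that do not cross the 1.5-cycle boundary return 0, and the remaining four return +2 or -1 as claimed.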

18. The memory controller according to claim 15, characterized in that the shared control logic circuit includes a shared command queue, a shared page record module, a shared timing check module, a shared refresh control module, and a shared queue arbiter, each shared by the zeroth virtual channel and the first virtual channel; the shared command queue is configured to store request information of memory access requests; the shared page record module is configured to record bank information of the banks in the memory; the shared timing check module is configured to perform the steps of sampling the command output by the final arbiter when the final arbiter outputs a non-replicated command, and starting timing parameter counting for the sampled command based on the counting basis of the timing parameters; the shared refresh control module is configured to manage memory refresh operations; and the shared queue arbiter is configured to determine the request command corresponding to the memory access request and send it to the final arbiter, based on information provided by at least one of the shared page record module, the shared timing check module, and the shared refresh control module, and on the request information of the memory access request stored in the shared command queue.

19. The memory controller according to claim 18, characterized in that the request information of the memory access request includes the virtual channel address of the memory access request; the virtual channel address of the memory access request is used to determine whether the column command is marked as true or false when the final arbiter outputs the column command.

20. The memory controller according to claim 18, characterized in that the memory controller further includes a trigger and response queue configured to send the command output by the final arbiter to the memory according to the virtual channel indicated by that command; wherein, if the command output by the final arbiter is a column command marked as false, the trigger and response queue discards the column command marked as false; and if the command output by the final arbiter is a column command marked as true, the trigger and response queue sends the column command marked as true to the memory.

21. The memory controller according to claim 20, characterized in that the trigger and response queue is further configured to receive read data returned from the memory, the read data corresponding to a column command of a read operation output by the final arbiter; if the data length of the memory access request corresponding to the column command of the read operation is the second data length, the read data corresponding to the column command of one read operation is cached, and the read data corresponding to the column command of the next read operation is then returned to the processor core together with the cached read data, wherein the column command of one read operation reads read data of the first data length, the first data length is the data length that one virtual channel supports for data transmission, and the second data length is the data length that the memory channel supports for data transmission; if the data length of the memory access request corresponding to the column command of the read operation is the first data length, the read data is returned directly to the processor core; and the trigger and response queue is provided with a read data cache whose depth supports at least one column command of a read operation and whose width corresponds at least to the first data length.
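
As an illustration only, and not part of the claims, the read data handling of claim 21 can be sketched with a cache of depth one. The class `ReadResponsePath` and its method are hypothetical. The sketch assumes that the two read data items of a request of the second data length arrive consecutively and that the cached item is placed first in the merged result; neither ordering is stated in the claim.

```python
from typing import Optional


class ReadResponsePath:
    """Read data cache of depth 1, holding one item of the first data length."""

    def __init__(self) -> None:
        self._cached: Optional[bytes] = None

    def on_read_data(self, data: bytes, is_second_length: bool) -> Optional[bytes]:
        """Return the data to hand to the processor core, or None if it is cached."""
        if not is_second_length:
            # First data length: return the read data directly.
            return data
        if self._cached is None:
            # First half of a second-data-length request: cache it and wait.
            self._cached = data
            return None
        # Second half: return it together with the cached half, then clear the cache.
        merged = self._cached + data
        self._cached = None
        return merged
```

A request of the second data length therefore produces one response to the processor core after two read column commands, while a request of the first data length produces a response after one.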

22. The memory controller according to any one of claims 1-14, or any one of claims 16-21, characterized in that, The zeroth virtual channel and the first virtual channel included in the memory channel have independent data buses, but share a command bus.

23. A memory access method, characterized in that the method is applied to the memory controller according to any one of claims 1-22, wherein the memory controller and the memory transmit data via a memory channel, and the memory channel includes a zeroth virtual channel and a first virtual channel; the method includes: determining the request command corresponding to a memory access request, wherein the request command corresponding to the memory access request participates in final arbitration; when the command output by the final arbitration is a non-replicated request command and the request command meets a preset replication condition, replicating the request command to obtain the corresponding replicated request command, wherein the replicated request command participates in final arbitration; and performing final arbitration on the commands participating in final arbitration; if the command that passes final arbitration is a non-replicated request command, outputting the request command and indicating transmission through the zeroth virtual channel; and, at least one clock cycle after the non-replicated request command is output, if the corresponding replicated request command passes final arbitration, outputting the replicated request command and indicating transmission through the first virtual channel.

24. The method according to claim 23, characterized in that the step of replicating the request command when the command output by the final arbitration is a non-replicated request command and the request command meets the preset replication condition includes: when the command output by the final arbitration is a non-replicated request command, or is a column command other than a column command that is marked as true and originates from a memory access request of the first data length, replicating the request command to obtain the corresponding replicated request command; wherein the first data length is the data length that one virtual channel supports for data transmission.

25. The method according to claim 24, characterized in that the method further includes: when the request command output by the final arbitration is a column command and the following conditions are all met, marking the column command output by the final arbitration as false: the column command output by the final arbitration is a non-replicated column command, the data length of the memory access request corresponding to the column command is the first data length, and the virtual channel indicated by the memory access request corresponding to the column command is the first virtual channel; and when the request command output by the final arbitration is a column command and the conditions are not met, marking the column command output by the final arbitration as true; wherein a column command marked as true is sent to the memory after being output by the final arbitration, and a column command marked as false is discarded after being output by the final arbitration.

26. The method according to claim 24, characterized in that the method further includes: when the command output by the final arbitration is a non-replicated command, sampling the command output by the final arbitration and starting timing parameter counting for the sampled command based on a counting basis of the timing parameters; wherein the counting basis of the timing parameters is: for a command that occupies 1.5 clock cycles of the command bus, timing parameter counting starts from the first rising clock edge at which the command is output after final arbitration, so that a command that occupies 1.5 clock cycles of the command bus starts timing parameter counting one clock cycle earlier than under the standard clock cycle.

27. A chip, characterized in that it includes: at least one processor core, at least one memory controller, a memory physical layer, and a memory; the at least one processor core is connected to the at least one memory controller; the memory physical layer includes at least one memory channel, and the memory channel includes a zeroth virtual channel and a first virtual channel; the memory controller and the memory transmit data through the memory channel; and the memory controller is the memory controller according to any one of claims 1-22.

28. An electronic device, characterized in that it includes the chip according to claim 27, or the memory controller according to any one of claims 1-22.