Method for releasing cache units by group and its memory allocator

By having a memory allocator agent, instead of the CPU, poll the memory allocator, and by releasing cache units by group, the problems of resource consumption and of cache unit release after a failure restart in the existing technology are solved, and more efficient cache unit management is achieved.

CN122309121A · Pending · Publication Date: 2026-06-30 · BEIJING STARBLAZE TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: BEIJING STARBLAZE TECH CO LTD
Filing Date: 2024-12-31
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

In the prior art, when the CPU uses the memory allocator to allocate cache units, it needs to access the status register through polling, which leads to high resource consumption and performance degradation. At the same time, the cache units cannot be effectively released after the device fails and restarts, introducing additional time overhead.

Method used

A memory allocator agent is introduced, and the memory allocator is encapsulated as a bus device. The CPU accesses the memory allocator agent through the bus, and the agent polls the status register in place of the CPU. The cache units are divided into multiple groups and released by group, reducing polling time and the overhead of releasing cache units after a failure restart.

Benefits of technology

It reduces the CPU's resource consumption in using the memory allocator, decreases polling time, and can quickly release cache unit groups after a device failure and restart, avoiding additional time overhead.


Abstract

This application provides a method for releasing cache units in groups and a memory allocator thereof. The memory allocator divides the cache units it manages and allocates into multiple cache unit groups. The method includes: in response to receiving a first cache unit release request sent by a first device, obtaining a first value stored in a release register and a first Group value stored in a Group register in the memory allocator; in response to the first value being a specified value, determining the first cache unit group corresponding to the first device based on the first Group value; and releasing all cache units in the first cache unit group. This application enables batch release of cache units in groups, allowing devices using the memory allocator to obtain the required cache units based on the batch-released cache units after a fault restart. Furthermore, batch release of cache units does not introduce additional time overhead, avoiding the problem of long release times for multiple cache units.

Description

Technical Field

[0001] This application relates to the field of integrated circuit design, and more particularly to a method for releasing cache units in groups and a memory allocator thereof.

Background Technology

[0002] Large-scale integrated circuits typically include a large amount of memory resources. These resources are usually organized into cache units, and memory utilization efficiency is improved by allocating cache units on demand. Software-managed allocation and deallocation of cache units is a common existing technology. Hardware-implemented (rather than software-implemented) memory allocators also exist for managing the allocation and deallocation of cache units. Unless otherwise stated, the memory allocator mentioned in this document refers to a hardware-implemented memory allocator.

[0003] Figure 1 illustrates how the CPU uses the memory allocator to obtain available cache units.

[0004] For example, as shown in Figure 1, the memory allocator includes three interface registers: a request register, an index register, and a status register. These three interface registers are connected to the CPU, and the CPU accesses the memory allocator through them to obtain the cache units it allocates. Specifically, the request register records requests, written by the CPU or by other devices using the memory allocator, to allocate cache units; the index register records the index of the cache unit allocated by the memory allocator; and the status register records the status of the allocation, such as Ready, indicating that allocation is complete, or UnReady, indicating that it is not. The cache units are provided by memory such as DRAM (Dynamic Random Access Memory) or SRAM (Static Random-Access Memory) and have a specified size, such as 4KB. The memory allocator allocates one cache unit at a time, and each cache unit has an index, which determines the address used to access that cache unit.

[0005] The specific process by which the memory allocator allocates cache units to the CPU based on the CPU's request is as follows:

[0006] The CPU writes data to the memory allocator's request register, requesting the memory allocator to allocate a cache unit. The size of the data that the request register can hold is, for example, 1 bit. In response to the CPU's request, the memory allocator acquires an available cache unit (an unallocated cache unit), writes the index of the cache unit to the index register, and sets the status register to a specified value (such as 1) to indicate that the status register is in the Ready state.

[0007] The CPU reads data from the status register. If the status register indicates a Ready state, it knows that the cache allocation is complete. The CPU then reads the index in the index register and uses that index to access the allocated cache unit. If the status register indicates an UnReady state, the CPU delays and reads the status register again, repeating until the status register indicates a Ready state, at which point it reads the index register to obtain the allocated cache unit.

Summary of the Invention

[0008] During the process of the memory allocator allocating cache units to the CPU, the CPU cannot know when the memory allocator will complete the allocation. Therefore, it needs to use a polling method to access the status register. When the status register indicates "Ready," it is determined that the memory allocator has completed the allocation, and the CPU retrieves the index of the allocated cache unit from the index register. However, the CPU's polling operation requires repeatedly executing multiple CPU instructions, which consumes a lot of CPU resources and reduces the performance of the CPU when processing other tasks.
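The prior-art flow described in paragraphs [0006] to [0008] can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `MemoryAllocatorModel`, the `latency` parameter, the `max_polls` bound, and the encoding Ready = 1 / UnReady = 0 are hypothetical and chosen for illustration; the text only gives 1 as an example value for Ready.

```python
from dataclasses import dataclass, field

READY, UNREADY = 1, 0  # assumed encodings; the text gives 1 as an example for Ready


@dataclass
class MemoryAllocatorModel:
    """Toy software model of the three interface registers (hypothetical)."""
    free_indexes: list = field(default_factory=lambda: [0, 1, 2, 3])
    latency: int = 3        # assumed: number of status reads before allocation completes
    request: int = 0        # request register
    index: int = -1         # index register
    status: int = UNREADY   # status register
    _countdown: int = 0

    def write_request(self, value: int) -> None:
        self.request = value
        self.status = UNREADY
        self._countdown = self.latency

    def read_status(self) -> int:
        # Each status read advances the model by one step.
        if self.request and self._countdown > 0:
            self._countdown -= 1
            if self._countdown == 0 and self.free_indexes:
                self.index = self.free_indexes.pop(0)
                self.status = READY
                self.request = 0
        return self.status

    def read_index(self) -> int:
        return self.index


def cpu_allocate(alloc, max_polls=100):
    """Prior-art flow: write the request, poll the status, then read the index."""
    alloc.write_request(1)
    polls = 0
    while polls < max_polls:
        polls += 1
        if alloc.read_status() == READY:
            return alloc.read_index(), polls
    raise TimeoutError("allocator never became Ready")
```

Every loop iteration in `cpu_allocate` stands for CPU instructions spent waiting, which is the cost the application sets out to remove from the CPU.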

[0009] This application addresses the resource consumption, and the resulting loss of performance on other tasks, that a target device incurs when it must poll registers while allocating cache units through a memory allocator. By introducing a memory allocator agent, the memory allocator is encapsulated as a bus device, allowing the target device (such as a CPU) to access the memory allocator agent through the bus in order to use the memory allocator. The memory allocator agent polls the status register in place of the target device, reducing the overhead the target device incurs in using the memory allocator. Furthermore, the memory allocator agent is directly connected to the memory allocator (instead of being connected through the bus), allowing it to obtain the value of the status register more quickly and thus obtain the allocated cache unit, reducing the time required for the polling process.

[0010] Furthermore, the memory allocator can allocate different cache resources to different devices. For example, when allocating cache units, it divides them into multiple groups, each corresponding to a device that will use the memory allocator to allocate cache units, such as the CPU or the host command processing unit. Taking the CPU as an example, under normal circumstances, each cache unit requested by the CPU will subsequently be released. However, the CPU (or other devices using the memory allocator) may malfunction. After a malfunction, the CPU may restart. The restarted CPU does not know which cache units it allocated when the malfunction occurred, making it unable to effectively inform the memory allocator which cache units need to be released. Therefore, the memory allocator will consider these cache units not to be released, resulting in a reduction in the number of available cache units. Moreover, even if the CPU knows which cache units it has allocated, releasing all cache units introduces additional time overhead.

[0011] Based on this, the embodiments of this application add a new function to the memory allocator, enabling it to release cache units by group. With group release, regardless of how many cache units in a specified group have been allocated, and even if the CPU does not know which cache units were previously allocated, the CPU can release all cache units in the group simply by specifying the group. Moreover, the time required to release all cache units in a group is equal to the time required to release a single cache unit, so no additional time overhead is introduced.

[0012] In a first aspect, embodiments of this application provide a cache unit management method applied to a memory allocator, wherein the memory allocator divides the cache units it manages and allocates into multiple cache unit groups, the method comprising:

[0013] In response to receiving a first cache unit release request from a first device, the first value stored in the release register and the first Group value stored in the Group register in the memory allocator are obtained;

[0014] In response to the first value being a specified value, the first cache unit group corresponding to the first device is determined based on the first Group value;

[0015] Release all cache units in the first cache unit group.

[0016] Optionally, the first device writes a specified value to the release register in the memory allocator and writes a first Group value to the Group register in the memory allocator through the first cache unit release request.
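The register-level behaviour of paragraphs [0013] to [0016] can be sketched as follows. This is an illustrative software model, not the hardware design: the class `GroupedAllocator`, the helper `device_release_group`, the use of Python sets, and the choice of 1 as the specified value are all assumptions made for the example.

```python
SPECIFIED_VALUE = 1  # assumed encoding of "release the whole group"


class GroupedAllocator:
    """Hypothetical model: cache units partitioned into disjoint groups."""

    def __init__(self, groups):
        # groups: {group_value: iterable of cache unit indexes}
        self.groups = {g: frozenset(units) for g, units in groups.items()}
        self.allocated = {g: set() for g in groups}
        self.release_reg = 0   # release register
        self.group_reg = 0     # Group register

    def allocate(self, group):
        free = sorted(self.groups[group] - self.allocated[group])
        if not free:
            return None
        self.allocated[group].add(free[0])
        return free[0]

    def handle_release_request(self):
        """Act on the values currently held in the two registers."""
        group = self.group_reg
        if self.release_reg == SPECIFIED_VALUE:
            self.allocated[group].clear()  # release every unit in the group
            return True
        return False  # single-unit release is a separate path, not modelled here


def device_release_group(alloc, group_value):
    """What a restarted device might do: it only needs its own Group value."""
    alloc.release_reg = SPECIFIED_VALUE
    alloc.group_reg = group_value
    return alloc.handle_release_request()
```

The point of the sketch is that the releasing device never names an individual cache unit, so it does not need to remember what it allocated before the restart.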

[0017] Optionally, the method further includes:

[0018] In response to receiving a second cache unit release request from a second device, the second value stored in the release register and the second Group value stored in the Group register in the memory allocator are obtained;

[0019] In response to the second value being a specified value, the second cache unit group corresponding to the second device is determined based on the second Group value;

[0020] Release all cache units in the second cache unit group.

[0021] Optionally, the first cache unit group and the second cache unit group do not share cache units.

[0022] Optionally, the memory allocator maintains a cache unit index table, each entry of which records the cache unit index of one of the cache units allocated by the memory allocator.

[0023] The cache unit index table includes multiple index groups, each corresponding one-to-one with a cache unit group; each index group includes multiple entries in the cache unit index table, recording the cache unit indexes of all cache units in its corresponding cache unit group.

[0024] Optionally, within each index group, the cache units indicated by the cache unit indexes of adjacent entries are contiguous in storage space;

[0025] For adjacent index groups, the cache unit index of the last entry in the previous index group and the cache unit index of the first entry in the next index group indicate cache units that are contiguous in storage space.

[0026] Optionally, within each index group, adjacent entries are stored in adjacent locations.

[0027] Optionally, the memory allocator maintains a pair of read pointers and write pointers for each index group;

[0028] After initialization and before allocating cache units, the read pointer and write pointer of each index group in the cache unit index table point to the same specified entry.

[0029] Optionally, during initialization, the memory allocator sets the read pointer and write pointer of each index group in the cache unit index table to point to the entry corresponding to the starting address in the storage space storing the index group;

[0030] The memory allocator records the indexes of each cache unit in each cache unit group into the respective entries of each index group in a specified order.
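One possible layout consistent with paragraphs [0022] to [0030] is a flat index table cut into consecutive index groups, each with its own read and write pointer. The sketch below is an assumption-laden illustration: the function name `init_index_table`, the dictionary fields, and the choice of ascending order as the "specified order" are hypothetical.

```python
def init_index_table(group_sizes):
    """Build a flat index table split into consecutive index groups.

    group_sizes: number of cache units per group (assumed configuration).
    Returns (table, groups); groups[g] holds the base entry, the size, and
    the read/write pointers of index group g.
    """
    table = []
    groups = []
    next_index = 0
    for size in group_sizes:
        base = len(table)
        # Adjacent entries hold consecutive cache unit indexes, so the
        # units they name are contiguous in storage space.
        table.extend(range(next_index, next_index + size))
        next_index += size
        # After initialization both pointers reference the group's first entry.
        groups.append({"base": base, "size": size, "rd": base, "wr": base})
    return table, groups
```

With `init_index_table([4, 2, 3])`, the last entry of group 0 holds index 3 and the first entry of group 1 holds index 4, matching the contiguity across adjacent index groups described in [0025].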

[0031] Optionally, the method further includes:

[0032] In response to the cache unit allocation request sent by the first device, the first Group value stored in the Group register of the memory allocator is obtained;

[0033] The first cache unit group is determined based on the first Group value, and the entry indicated by the read pointer in the index group corresponding to the first cache unit group is read.

[0034] The first cache unit, indicated by the cache unit index recorded in the entry pointed to by the read pointer, is allocated to the first device, and the read pointer in the index group corresponding to the first cache unit group is updated to indicate the next entry.

[0035] Optionally, in response to a third cache unit release request sent by the first device indicating the release of the first cache unit, the third value stored in the release register and the first Group value stored in the Group register in the memory allocator are obtained;

[0036] In response to the third value being a value other than the specified value, the first cache unit group is determined based on the first Group value;

[0037] Write the first cache unit index into the entry pointed to by the write pointer in the index group corresponding to the first cache unit group to release the first cache unit. The first cache unit index is the cache unit index corresponding to the first cache unit.

[0038] Update the write pointer in the index group corresponding to the first cache unit group to indicate the next entry.

[0039] Optionally, the read pointer and write pointer in the index group corresponding to the first cache unit group are initialized; and

[0040] Initialize each entry in the index group corresponding to the first cache unit group to release all cache units in the first cache unit group.
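The allocation, single-unit release, and group release steps of paragraphs [0031] to [0040] can be sketched with one index group modelled as a circular queue. Several details are assumptions not stated in the text: that the pointers wrap around at the end of the group, that a `free` counter tracks remaining units, the names `IndexGroup` and `handle_release`, and 1 as the specified value. Note also that this software `reset` rewrites the entries one by one, whereas the text states that the hardware group release takes the same time as a single release.

```python
class IndexGroup:
    """One index group treated as a circular queue of free cache unit indexes."""

    def __init__(self, first_index, size):
        self.first_index = first_index
        self.size = size
        self.reset()

    def reset(self):
        """Group release: re-initialize the entries and both pointers."""
        self.entries = list(range(self.first_index, self.first_index + self.size))
        self.rd = 0
        self.wr = 0
        self.free = self.size

    def allocate(self):
        if self.free == 0:
            return None                       # nothing left in this group
        index = self.entries[self.rd]
        self.rd = (self.rd + 1) % self.size   # advance to the next entry
        self.free -= 1
        return index

    def release_one(self, index):
        self.entries[self.wr] = index         # record the freed index
        self.wr = (self.wr + 1) % self.size
        self.free += 1


SPECIFIED_VALUE = 1  # assumed encoding for "release the whole group"


def handle_release(groups, release_reg, group_reg, index=None):
    """Dispatch a release request on the release and Group register values."""
    group = groups[group_reg]
    if release_reg == SPECIFIED_VALUE:
        group.reset()
    else:
        group.release_one(index)
```

A device that has lost track of its allocations would call `handle_release(groups, SPECIFIED_VALUE, its_group)`, after which every unit of that group is allocatable again while the other groups are untouched.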

[0041] Optionally, the memory allocator is coupled to a first bus via a memory allocator agent, and the first device accesses the memory allocator agent via the first bus to access the memory allocator;

[0042] The data written by the first device to the memory allocator agent via the first bus includes a release identifier indicating the release of a cache unit group, or the memory allocator agent receives a bus command indicating the release of a cache unit group.

[0043] Optionally, the memory allocator and the first device are located in the first control unit. The memory allocator is a first memory allocator and manages the allocation of multiple third cache units. The first control unit and the second control unit are connected via an inter-chip interconnect unit. The second control unit includes a second memory allocator and a third device. The second memory allocator manages the allocation of multiple fourth cache units. Both the first control unit and the second control unit include an internal bus for interaction between internal devices. The method includes:

[0044] In response to a first bus access request sent by the third device indicating the allocation of a third cache unit, the first bus access request is received through the internal bus of the second control unit, the inter-chip interconnect unit, and the internal bus of the first control unit.

[0045] In response to the first bus access request, the first memory allocator allocates a third cache unit and sends the index of the allocated third cache unit to the third device through the internal bus of the first control unit, the inter-chip interconnect unit, and the internal bus of the second control unit.

[0046] Optionally, in response to a second bus access request sent by the third device instructing the writing of data to the third cache unit allocated to it, the second control unit sends the second bus access request to the first control unit via the internal bus of the second control unit and the inter-chip interconnect unit;

[0047] The first control unit writes data to the allocated third cache unit based on the second bus access request.

[0048] Optionally, in response to a third bus access request sent by the third device instructing the reading of data written in the third cache unit, the second control unit sends the third bus access request to the first control unit via the internal bus of the second control unit and the inter-chip interconnect unit;

[0049] The first control unit reads the data in the allocated third cache unit based on the third bus access request and feeds it back to the third device.

[0050] Optionally, the plurality of third cache units managed by the first memory allocator in the first control unit and the plurality of fourth cache units managed by the second memory allocator in the second control unit are located in the same memory;

[0051] After obtaining the index of the third cache unit corresponding to the third cache unit allocated to it, the third device directly accesses the third cache unit to write data to the third cache unit.

[0052] When it is necessary to read data from the third cache unit, the third device directly accesses the third cache unit to read the data.

[0053] In a second aspect, embodiments of this application provide a memory allocator that divides the cache units it manages and allocates into multiple cache unit groups. The memory allocator includes a release register and a Group register.

[0054] In response to a first cache unit release request sent by a first device, the memory allocator obtains a first value stored in the release register and a first Group value stored in the Group register;

[0055] In response to the first value being a specified value, the memory allocator determines the first cache unit group corresponding to the first device based on the first Group value and releases all cache units in the first cache unit group.

[0056] Optionally, the first device writes a specified value to the release register in the memory allocator and writes a first Group value to the Group register in the memory allocator through the first cache unit release request.

[0057] Optionally, in response to receiving a second cache unit release request from a second device, the memory allocator obtains the second value stored in the release register and the second Group value stored in the Group register;

[0058] In response to the second value being a specified value, the memory allocator determines the second cache unit group corresponding to the second device based on the second Group value and releases all cache units in the second cache unit group.

[0059] Optionally, the first cache unit group and the second cache unit group do not share cache units.

[0060] Optionally, the memory allocator maintains a cache unit index table, each entry of which records the cache unit index of one of the cache units allocated by the memory allocator.

[0061] The cache unit index table includes multiple index groups, each corresponding one-to-one with a cache unit group; each index group includes multiple entries in the cache unit index table, recording the cache unit indexes of all cache units in its corresponding cache unit group.

[0062] Optionally, within each index group, the cache units indicated by the cache unit indexes of adjacent entries are contiguous in storage space;

[0063] For adjacent index groups, the cache unit index of the last entry in the previous index group and the cache unit index of the first entry in the next index group indicate cache units that are contiguous in storage space.

[0064] Optionally, within each index group, adjacent entries are stored in adjacent locations.

[0065] Optionally, the memory allocator maintains a pair of read pointers and write pointers for each index group;

[0066] After initialization and before allocating cache units, the read pointer and write pointer of each index group in the cache unit index table point to the same specified entry.

[0067] Optionally, during initialization, the memory allocator sets the read pointer and write pointer of each index group in the cache unit index table to point to the entry corresponding to the starting address in the storage space storing the index group;

[0068] The memory allocator records the indexes of each cache unit in each cache unit group into the respective entries of each index group in a specified order.

[0069] Optionally, the memory allocator, in response to a cache unit allocation request sent by the first device, obtains the first Group value stored in the Group register;

[0070] The first cache unit group is determined based on the first Group value, and the entry indicated by the read pointer in the index group corresponding to the first cache unit group is read.

[0071] The first cache unit, indicated by the cache unit index recorded in the entry pointed to by the read pointer, is allocated to the first device, and the read pointer in the index group corresponding to the first cache unit group is updated to indicate the next entry.

[0072] Optionally, in response to a third cache unit release request sent by the first device indicating the release of a first cache unit, the memory allocator obtains the third value stored in the release register and the first Group value stored in the Group register;

[0073] In response to the third value being a value other than the specified value, the memory allocator determines the first cache unit group based on the first Group value, writes the first cache unit index into the entry indicated by the write pointer in the index group corresponding to the first cache unit group to release the first cache unit, and updates the write pointer in the index group corresponding to the first cache unit group to indicate the next entry; wherein the first cache unit index is the cache unit index corresponding to the first cache unit.

[0074] Optionally, the memory allocator initializes the read and write pointers in the index group corresponding to the first cache unit group; and

[0075] Initialize each entry in the index group corresponding to the first cache unit group to release all cache units in the first cache unit group.

[0076] Optionally, the memory allocator is coupled to a first bus via a memory allocator agent, and the first device accesses the memory allocator agent via the first bus to access the memory allocator;

[0077] The data written by the first device to the memory allocator agent via the first bus includes a release identifier indicating the release of a cache unit group, or the memory allocator agent receives a bus command indicating the release of a cache unit group.

[0078] In a third aspect, embodiments of this application provide a control component, which is a first control component. The first control component includes a memory allocator as described in the second aspect and a first device. The memory allocator is a first memory allocator and manages the allocation of multiple third cache units. The first control component is connected to a second control component through an inter-chip interconnect unit. The second control component includes a second memory allocator and a third device. The second memory allocator manages the allocation of multiple fourth cache units. Both the first control component and the second control component include an internal bus for interaction between internal devices.

[0079] In response to a first bus access request sent by the third device indicating the allocation of a third cache unit, the second control unit sends the first bus access request to the first memory allocator via the internal bus of the second control unit, the inter-chip interconnect unit, and the internal bus of the first control unit.

[0080] In response to the first bus access request, the first memory allocator allocates a third cache unit and sends the index of the allocated third cache unit to the third device through the internal bus of the first control unit, the inter-chip interconnect unit, and the internal bus of the second control unit.

[0081] Optionally, in response to a second bus access request sent by the third device instructing the writing of data to the third cache unit allocated to it, the second control unit sends the second bus access request to the first control unit via the internal bus of the second control unit and the inter-chip interconnect unit;

[0082] The first control unit writes data to the allocated third cache unit based on the second bus access request.

[0083] Optionally, in response to a third bus access request sent by the third device instructing the reading of data written in the third cache unit, the second control unit sends the third bus access request to the first control unit via the internal bus of the second control unit and the inter-chip interconnect unit;

[0084] The first control unit reads the data in the allocated third cache unit based on the third bus access request and feeds it back to the third device.

[0085] Optionally, the plurality of third cache units managed by the first memory allocator in the first control unit and the plurality of fourth cache units managed by the second memory allocator in the second control unit are located in the same memory;

[0086] After obtaining the index of the third cache unit corresponding to the third cache unit allocated to it, the third device directly accesses the third cache unit to write data to the third cache unit.

[0087] When it is necessary to read data from the third cache unit, the third device directly accesses the third cache unit to read the data.

[0088] According to an embodiment of this application, a memory allocator agent receives a bus command sent by a device requesting cache unit allocation, sends a cache unit allocation request to the memory allocator in response to the bus command, monitors the status register in the memory allocator, obtains the index of at least one cache unit allocated by the memory allocator or cache unit allocation failure information based on the status register, generates a bus command response based on the index of at least one cache unit or cache unit allocation failure information, and feeds it back to the device requesting cache unit allocation. This allows the device requesting cache unit allocation to use the memory allocator by accessing the memory allocator agent, with the memory allocator agent polling the status register, reducing the overhead of the device using the memory allocator and reducing the time required for the polling process.
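The agent behaviour summarized in paragraph [0088] can be sketched in software as follows. This is a hedged illustration only: the classes `Allocator` and `AllocatorAgent`, the dictionary-shaped bus response, the `latency` parameter, and the `max_polls` bound used to decide on allocation failure are assumptions introduced for the example and do not come from the text.

```python
READY = 1  # assumed encoding of the Ready state


class Allocator:
    """Minimal stand-in for the hardware allocator (hypothetical)."""

    def __init__(self, free_indexes, latency=2):
        self.free_indexes = list(free_indexes)
        self.latency = latency   # assumed: status reads before allocation completes
        self.status = 0
        self.index = None
        self._pending = 0

    def write_request(self, value):
        self.status = 0
        self._pending = self.latency if value else 0

    def read_status(self):
        if self._pending:
            self._pending -= 1
            if self._pending == 0 and self.free_indexes:
                self.index = self.free_indexes.pop(0)
                self.status = READY
        return self.status


class AllocatorAgent:
    """Sits between the bus and the allocator and polls on the device's behalf."""

    def __init__(self, allocator, max_polls=16):
        self.allocator = allocator
        self.max_polls = max_polls

    def handle_bus_command(self):
        """Return a bus response: the allocated index, or a failure marker."""
        self.allocator.write_request(1)
        for _ in range(self.max_polls):
            if self.allocator.read_status() == READY:
                return {"ok": True, "index": self.allocator.index}
        return {"ok": False, "index": None}  # allocation failure information
```

From the requesting device's point of view there is a single call and a single response; the polling loop runs inside the agent, which is the shift of work the paragraph describes.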

[0089] In a further embodiment, by adding a release register to the memory allocator, batch release of cache units can be achieved in units of cache unit groups. This allows devices using the memory allocator to obtain the required cache units based on the batch-released cache units after a fault restart. Furthermore, batch release of all cache units within a cache unit group does not introduce additional time overhead, avoiding the problem of long release times for multiple cache units in the prior art.

[0090] In a further embodiment, in a scenario where a connection between two control components is established through an inter-chip interconnect unit, a request is made to the memory allocator of the other control component to allocate cache units based on the interaction between the current control component and other control components. This allows the cache units in the other control component to be allocated using a memory allocator that does not belong to the current control component, thereby achieving effective occupancy of cache units.

Attached Figure Description

[0091] Figure 1 is a schematic diagram of how the CPU obtains available cache units using the memory allocator;

[0092] Figure 2 is a hardware block diagram of a CPU acquiring cache units through a memory allocator agent according to an embodiment of this application;

[0093] Figure 3 is a flowchart of the memory allocator agent processing bus write commands in the embodiment shown in Figure 2;

[0094] Figure 4 is a flowchart of the memory allocator agent processing bus read commands in the embodiment shown in Figure 2;

[0095] Figure 5 is a hardware block diagram of two target devices acquiring cache units through a memory allocator agent according to another embodiment of this application;

[0096] Figure 6 is a schematic diagram of the memory allocator agent caching the indexes of cache units for different devices according to an embodiment of this application;

[0097] Figure 7 is a hardware block diagram of two target devices acquiring cache units through a memory allocator agent according to yet another embodiment of this application;

[0098] Figure 8 is a hardware block diagram of a target device obtaining a cache unit via a single bus read command according to an embodiment of this application;

[0099] Figure 9 is a schematic diagram of the address carried by a bus read command according to an embodiment of this application;

[0100] Figure 10 is a processing flowchart of the memory allocator agent in the embodiment shown in Figure 8;

[0101] Figure 11 is a hardware block diagram of a target device obtaining a cache unit via a single bus read command according to another embodiment of this application;

[0102] Figure 12 is a processing flowchart of the memory allocator agent in the embodiment shown in Figure 11;

[0103] Figure 13 is a hardware block diagram of a target device obtaining a cache unit via a single bus read command according to yet another embodiment of this application;

[0104] Figure 14 is a first hardware block diagram of a memory allocator managing cache units according to an embodiment of this application;

[0105] Figure 15 is a schematic diagram of updating the read and write pointers based on allocation requests and release requests for a single cache unit according to an embodiment of this application;

[0106] Figure 16 is a schematic diagram of initializing the corresponding group of the cache unit index table based on a release request indicating a group according to an embodiment of this application;

[0107] Figure 17 is a second hardware block diagram of a memory allocator managing cache units according to an embodiment of this application;

[0108] Figure 18 is a hardware block diagram of a storage device;

[0109] Figure 19A is a hardware block diagram of the connection of the control components according to an embodiment of this application;

[0110] Figure 19B is a schematic diagram of the interaction between the control components according to an embodiment of this application.

Detailed Implementation

[0111] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0112] Figure 2 shows a hardware block diagram of a CPU acquiring cache units through a memory allocator agent, according to an embodiment of this application.

[0113] For example, as shown in Figure 2, a memory allocator agent is deployed between the memory allocator and the CPU, and the CPU connects to the memory allocator agent via a bus. By deploying the memory allocator agent, the memory allocator is encapsulated as a bus device, and the CPU accesses the memory allocator agent via the bus to use the memory allocator.

[0114] The memory allocator agent includes a request register 310 and an index register 312, and occupies two addresses in the bus address space (denoted as Addr_req and Addr_buffer_index, respectively). Addr_req represents the address of the request register 310 on the bus, and Addr_buffer_index represents the address of the index register 312 on the bus. The CPU accesses the request register 310 and index register 312 included in the memory allocator agent through the bus. The memory allocator agent is directly connected to the memory allocator. The CPU accesses the memory allocator, which acts as a bus device, by reading and writing to Addr_req and Addr_buffer_index.

[0115] The memory allocator includes a request register 320, an index register 322, and a status register 324. The memory allocator in Figure 2 is, for example, the same as the memory allocator shown in Figure 1. The memory allocator agent accesses the memory allocator in Figure 2 in the same way that the CPU accesses the memory allocator in Figure 1.

[0116] According to the embodiment of Figure 2, the process by which the CPU, through the memory allocator agent, uses the memory allocator to allocate cache units is as follows:

[0117] The CPU sends a bus write command to the bus address Addr_req via the bus, writing a specified value to indicate a request for cache units. For example, writing 1 to the bus address Addr_req represents a request for cache units. In response to the specified value being written to the bus address Addr_req, the memory allocator agent writes a specified value to the request register 320 in the memory allocator. After the specified value is written to the request register 320, the memory allocator allocates cache units according to existing technology.

[0118] The memory allocator agent monitors the status of the memory allocator's status register 324. When the status register 324 indicates the Ready state, the memory allocator agent reads the index from the memory allocator's index register 322 and caches it in the memory allocator agent's index register 312.

[0119] In Figure 1, after the CPU reads the status register and obtains the UnReady state, it needs to poll the status register. In the embodiment of Figure 2, the memory allocator agent performs the polling operation on the status register 324 instead of the CPU. The memory allocator agent stops polling once it reads the Ready state from the status register 324, and then retrieves the index of the cache unit from the memory allocator's index register 322 and writes it to the memory allocator agent's index register 312. Because the memory allocator agent performs the polling operation instead of the CPU, it offloads this work from the CPU. The polling of the status register 324 by the memory allocator agent is performed by hard-wired logic circuitry. For example, in response to the Ready state indicated by the status register 324, the writing of the value of the index register 322 to the index register 312 can be completed within one clock cycle, so its polling performance is significantly higher than that of polling performed by the CPU.

[0120] After writing a specified value to the bus address Addr_req, the CPU also issues a bus read command to the bus address Addr_buffer_index to read the value of index register 312 in the memory allocator agent. Upon receiving the bus read command for Addr_buffer_index, the memory allocator agent sends the index of the cache unit, obtained from the memory allocator and cached in index register 312, to the CPU as the response to the bus read command. Because the index of the cache unit obtained from the memory allocator is cached in index register 312, the memory allocator agent can respond promptly to the bus read command based on the cached index.

[0121] It should be noted that when the memory allocator allocates cache units, it needs to search for unallocated available cache units among multiple cache units. This operation takes some time. The CPU, when issuing a bus read command for bus address Addr_buffer_index, is unaware of whether the memory allocator's allocation operation has been completed. Therefore, there is a possibility that the memory allocator agent may receive a bus read command for bus address Addr_buffer_index (corresponding to the previous bus write command) from the CPU before the status register 324 is in the Ready state. In this case, although the CPU's bus read command has been received, because the memory allocator has not successfully allocated a cache unit, the memory allocator agent continues to monitor the status register 324 of the memory allocator and temporarily does not respond to the bus read command.

[0122] After the memory allocator successfully allocates a cache unit, the status register 324 indicates the Ready state. The memory allocator agent then retrieves the index of the allocated cache unit from the memory allocator's index register 322 and generates a response to the bus read command to provide the retrieved cache unit index to the CPU. Thus, the CPU can obtain the index of the allocated cache unit through the bus read command (without polling), enabling the memory allocator agent to handle the polling of the memory allocator's status register 324 on behalf of the CPU.

[0123] The memory allocator agent has a time limit for responding to bus read commands. If the memory allocator fails to allocate a cache unit for an extended period, the memory allocator agent cannot wait indefinitely. Instead, the memory allocator agent responds to the bus read command with a specified value that indicates to the CPU that cache unit allocation has failed.

[0124] Figure 3 shows a flowchart of the memory allocator agent's processing of bus write commands, as provided by the embodiment illustrated in Figure 2.

[0125] Step 301: Receive the bus write command, sent by the CPU via the bus, that writes the specified value to the bus address Addr_req.

[0126] Step 302: Write the specified value indicated by the bus write command into the request register 320 in the memory allocator.

[0127] Step 303: Read the status value of status register 324 in the memory allocator.

[0128] Step 304: Identify whether the status register 324 is in the Ready state. If the status register is in the UnReady state, proceed to step 305. If the status register indicates the Ready state, proceed to step 307.

[0129] Step 305: Determine whether the time spent waiting for the status register 324 to change from the UnReady state to the Ready state exceeds the specified time. If it does, proceed to step 306; otherwise, return to step 303.

[0130] Step 306: Determine that the memory allocator failed to allocate cache units.

[0131] Step 307: Read the index of the allocated cache unit from the index register 322 of the memory allocator and cache it in the index register 312.

[0132] It should be understood that the flowchart shown in Figure 3 illustrates the operations performed by the memory allocator agent when processing a bus write command that writes data to request register 310, but does not show the steps of the CPU reading index register 312 via the bus.

[0133] Figure 4 shows a flowchart of the memory allocator agent's processing of bus read commands, as provided by the embodiment illustrated in Figure 2.

[0134] Step 410: Receive, via the bus, the bus read command sent by the CPU that corresponds to the bus write command.

[0135] Step 420: Generate a read command response based on the index of the cache unit cached in index register 312 or the cache unit allocation failure information, and send the read command response to the CPU.

[0136] It is important to note that if the status register 324 is in the Ready state, the memory allocator has successfully allocated the cache unit, and the memory allocator agent directly returns the index of the cache unit in response to the bus read command. If the status register 324 is in the UnReady state, the memory allocator agent needs to wait. If the status register changes to the Ready state during the wait, the index of the cache unit is returned. If it does not change to the Ready state during the wait, a specified value is returned to indicate that the cache unit allocation has failed.

[0137] It should be understood that steps 301 to 307 and steps 410 to 420 described above are all executed by the memory allocator agent. By introducing the memory allocator agent to monitor the status register, the memory allocator agent replaces the CPU in polling the status register, reducing the overhead the CPU incurs when using the memory allocator. Furthermore, since the memory allocator agent is connected to the memory allocator directly rather than via a bus, the memory allocator agent can poll the status register efficiently, obtaining the cache unit index in a timely manner and reducing the time required for the polling process.
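The write-command flow of steps 301 to 307 and the read-command flow of steps 410 to 420 can be sketched as a small software model. This is only an illustration under stated assumptions: the source describes hard-wired logic, whereas the class names `AllocatorModel` and `AgentModel`, the failure value `ALLOC_FAILED`, the use of a poll count as the "specified time", and the sequential (non-concurrent) execution are all hypothetical.

```python
READY, UNREADY = 1, 0
ALLOC_FAILED = 0xFFFFFFFF  # assumed "specified value" that signals allocation failure


class AllocatorModel:
    """Hypothetical stand-in for the memory allocator (registers 320, 322, 324)."""

    def __init__(self, free_indices, busy_polls=0):
        self.free = list(free_indices)
        self.busy_polls = busy_polls   # polls that see UnReady before allocation completes
        self.index_reg = None          # index register 322
        self.status_reg = UNREADY      # status register 324
        self._requested = False
        self._countdown = 0

    def write_request(self, value):
        """Model a write to request register 320; the value 1 requests a cache unit."""
        if value == 1:
            self._requested = True
            self._countdown = self.busy_polls
            self.status_reg = UNREADY

    def read_status(self):
        """Model a read of status register 324, advancing the pending allocation."""
        if self._requested:
            if self._countdown > 0:
                self._countdown -= 1
            elif self.free:
                self.index_reg = self.free.pop(0)
                self.status_reg = READY
                self._requested = False
        return self.status_reg


class AgentModel:
    """Hypothetical stand-in for the memory allocator agent (index register 312)."""

    def __init__(self, allocator, max_polls):
        self.allocator = allocator
        self.max_polls = max_polls     # assumed timeout, expressed as a poll count
        self.index_312 = None

    def handle_bus_write(self, value):
        self.allocator.write_request(value)                 # step 302
        for _ in range(self.max_polls):                     # steps 303 and 305
            if self.allocator.read_status() == READY:       # step 304
                self.index_312 = self.allocator.index_reg   # step 307
                return
        self.index_312 = ALLOC_FAILED                       # step 306

    def handle_bus_read(self):
        # Steps 410 and 420: respond with the cached index or the failure value.
        return self.index_312
```

In this model the CPU side reduces to one write followed by one read, for example `agent.handle_bus_write(1)` and then `agent.handle_bus_read()`; all polling happens inside the agent.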

[0138] Figure 5 shows a hardware block diagram of two target devices obtaining cache units through a memory allocator agent, according to another embodiment of this application.

[0139] For example, as shown in Figure 5, a memory allocator agent is deployed between the memory allocator and the target devices. A target device, acting as a bus master, can issue bus read / write commands to access the registers of the memory allocator agent. The target devices include, for example, a CPU and a bus master device 2. The CPU and bus master device 2 are connected to the memory allocator agent via the bus, and the memory allocator agent is directly connected to the memory allocator. The memory allocator can be accessed by the two masters on the bus (the CPU and bus master device 2) through the memory allocator agent.

[0140] Multiple cache units managed by a memory allocator can be grouped. When a target device requests cache unit allocation, it can describe a Group in the request, indicating which Group the cache unit needs to be allocated from. Accordingly, the memory allocator allocates cache units from the corresponding Group, not from other Groups. For example, if the multiple cache units managed by the memory allocator are divided into two groups, Group1 and Group2, and the CPU requests cache unit allocation from Group1, the memory allocator will allocate cache units from Group1, not from Group2. By grouping the managed cache units and allocating cache units only from matching Groups to the target device, resource conflicts are reduced. For example, for the CPU, cache units can be allocated only from Group1, and for bus master device 2, only from Group2. This way, even if many commands are received from bus master device 2, its cache unit needs can be met based on Group2, while reserving all cache units in Group1 for the CPU.

[0141] In Figure 5, the memory allocator agent includes Group register 311 and index register 312. The CPU and bus master device 2 access Group register 311 and index register 312 of the memory allocator agent via the bus. The memory allocator agent is directly connected to the memory allocator. Group register 311 takes over the function of request register 310 in the embodiment of Figure 2, namely receiving cache unit allocation requests, and additionally records the Group value (or Group ID) indicated in the allocation request. The memory allocator includes a Group register 321, a Port register 323, a status register 324, and an index register 322. The Group register 321 records the Group ID indicated by the write command; the Port register 323 records a value that indicates the target device from which the current cache unit allocation request originates (e.g., the CPU or bus master device 2), and can also be used to characterize specific requirements of the cache unit allocation request.

[0142] It should be noted that the number of target devices is not limited to two; it can also be three, four, or other numbers, allowing the memory allocator to be accessed by multiple devices on the bus. The process of allocating cache units to different target devices is the same. The following section uses the CPU as an example to describe the process of the memory allocator allocating cache units to the CPU:

[0143] The memory allocator agent occupies two bus addresses in the bus address space: Addr_req and Addr_buffer_index. Addr_req is the address of Group register 311 on the bus, and Addr_buffer_index is the address of index register 312 on the bus. The CPU writes a specified Group ID to Addr_req via a bus write command (e.g., Group ID = 1, indicating a request for a cache unit from Group 1). Since Addr_req is the address of Group register 311 on the bus, writing a specified Group ID to Addr_req via a bus write command actually writes that Group ID to Group register 311.

[0144] In response to the bus address Addr_req being written with a specified Group ID, the memory allocator agent writes data to the memory allocator's Group register 321 and Port register 323. The value written to Group register 321 is the Group ID written by the CPU, and the value written to Port register 323 is determined based on the issuer of the current bus write command. For example, if the issuer of the bus write command is the CPU, the value written to Port register 323 is 0; if the issuer of the bus write command is bus master device 2, the value written to Port register 323 is 2. In response to the data written to Group register 321 and Port register 323, the memory allocator allocates cache units for the CPU. For example, in response to Group ID = 1 written to Group register 321 and Port = 0 written to Port register 323, cache units are allocated for the CPU from Group 1.

[0145] Optionally, the Group ID is used as a hint rather than a mandatory constraint. For example, the Group ID indicates that cache units should be allocated from Group 1, but if the cache units in Group 1 are exhausted, the memory allocator can allocate cache units from Group 2.
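The difference between a mandatory Group and a Group used as a hint can be illustrated with a minimal sketch. The function name `allocate_from_group`, the dictionary of free indices, and the ascending-order fallback are assumptions made for illustration; the source does not specify how the allocator picks the fallback group.

```python
def allocate_from_group(free_by_group, group_id, group_is_hint=False):
    """Return (group, index) for an allocated cache unit, or None if none is available.

    free_by_group maps a Group ID to a list of free cache unit indices (hypothetical layout).
    """
    pool = free_by_group.get(group_id, [])
    if pool:
        return group_id, pool.pop(0)
    if group_is_hint:
        # Assumed fallback policy: try the other groups in ascending Group ID order.
        for other_id in sorted(free_by_group):
            if free_by_group[other_id]:
                return other_id, free_by_group[other_id].pop(0)
    return None
```

With `free = {1: [10], 2: [20, 21]}`, a second mandatory request for Group 1 fails once index 10 has been handed out, while the same request with `group_is_hint=True` is served from Group 2.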

[0146] The memory allocator agent monitors the status register 324 of the memory allocator. When the status register 324 changes to a specified value (e.g., 1), indicating a Ready state, the memory allocator agent reads the index of the allocated cache unit stored in the index register 322 of the memory allocator and caches the read cache unit index in the index register 312. If the status register 324 indicates an UnReady state, it means that the memory allocator has not successfully allocated a cache unit, and the memory allocator agent needs to continue monitoring the status register 324 of the memory allocator.

[0147] The CPU issues a bus read command corresponding to the aforementioned bus write command to the bus address Addr_buffer_index. Since the bus address Addr_buffer_index is the address of index register 312 on the bus, the CPU issuing a bus read command to the bus address Addr_buffer_index is actually issuing a bus read command to index register 312. If the memory allocator successfully allocates a cache unit, the memory allocator agent, upon receiving the bus read command, will send the index of the cache unit obtained from the memory allocator as a response to the bus read command to the CPU.

[0148] The memory allocator agent's response time to bus read commands is limited. If the memory allocator fails to allocate a cache unit for an extended period, the memory allocator agent cannot wait indefinitely. Instead, the memory allocator agent responds to the bus read command with a specified value that indicates to the CPU that cache unit allocation has failed.

[0149] Because the memory allocator is accessed by one or more devices on the bus, the memory allocator agent may receive multiple bus write commands from different devices. Optionally, the memory allocator processes only a single cache allocation request at a time, while the memory allocator agent handles multiple bus write commands concurrently, so that other bus write commands can be processed by the memory allocator agent before a bus read command A corresponding to bus write command A is received. For example, the memory allocator agent can process bus write commands continuously without waiting for the corresponding bus read command to appear after processing a bus write command before processing subsequent bus write commands. The processing of bus write commands by the memory allocator agent does not mean that it replaces the memory allocator in allocating cache units, but rather, based on the requirements of the bus protocol, it receives the data to be written by the bus write command and provides a response indicating that the bus write command processing is complete. Furthermore, the memory allocator agent buffers the received multiple bus write commands and sends cache allocation requests to the memory allocator one by one.

[0150] For any bus write command, after writing data to the Group register 321 and Port register 323 of the memory allocator based on the bus write command, and monitoring the status register 324 to determine whether the memory allocator successfully or unsuccessfully allocated a cache unit for that bus write command, the memory allocator agent caches the cache unit allocation result, and then writes data to the Group register 321 and Port register 323 of the memory allocator according to the next bus write command. If the memory allocator successfully allocates a cache unit for a bus write command, the memory allocator agent records the source of the current bus write command (indicated by the Port register 323) when reading the index of the allocated cache unit, so that when a bus read command is received subsequently, the matching index can be fed back.

[0151] Figure 6 shows a schematic diagram of the indices of cache units of different devices held by a memory allocator agent, as provided in an embodiment of this application.

[0152] For example, as shown in Figure 6, the CPU and bus master device 2 each send four bus write commands to the memory allocator agent in sequence. The memory allocator agent obtains the indices (index 1 to index 4) of the four cache units associated with the CPU from the memory allocator based on Port=0, and obtains the indices (index a to index d) of the four cache units associated with bus master device 2 from the memory allocator based on Port=2. If a bus read command is received from the CPU, the memory allocator agent responds with index 1; if another bus read command is received from the CPU, the memory allocator agent responds with index 2; and if a bus read command is received from bus master device 2, the memory allocator agent responds with index a.
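The bookkeeping in the Figure 6 example behaves like one first-in, first-out queue per Port value. The following sketch assumes that structure; the class name `IndexCache`, its methods, and the `None` result for an empty queue are hypothetical and are not taken from the source.

```python
from collections import defaultdict, deque


class IndexCache:
    """Hypothetical per-source store of allocated cache unit indices."""

    def __init__(self):
        self._by_port = defaultdict(deque)

    def record(self, port, index):
        """Remember an allocated index together with the Port it was requested for."""
        self._by_port[port].append(index)

    def respond(self, port):
        """Return the oldest index recorded for this Port, or None if there is none."""
        queue = self._by_port[port]
        return queue.popleft() if queue else None


cache = IndexCache()
for name in ("index 1", "index 2", "index 3", "index 4"):
    cache.record(0, name)          # Port = 0: requests from the CPU
for name in ("index a", "index b", "index c", "index d"):
    cache.record(2, name)          # Port = 2: requests from bus master device 2
```

Reads from the CPU then drain the Port 0 queue in order, independently of reads from bus master device 2.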

[0153] The following example illustrates the process of the CPU and bus master device 2 requesting cache unit allocation. The memory allocator agent receives a bus write command 1 from the CPU, instructing it to write Group ID = 1 to the bus address Addr_req, and receives a bus write command 2 from bus master device 2, instructing it to write Group ID = 2 to the bus address Addr_req. In response to bus write command 1, the memory allocator agent writes Group ID = 1 to the Group register 321 of the memory allocator and writes a specified value 0 (identifying the CPU) to the Port register 323 of the memory allocator. The memory allocator, in response to Group ID = 1 being written to Group register 321 and the specified value 0 (identifying the CPU) being written to Port register 323, allocates a cache unit for the CPU from Group 1. The memory allocator agent reads the status value of the status register 324 in the memory allocator. If the status register 324 indicates the Ready state, the agent reads the index of the cache unit allocated to the CPU from the index register 322 of the memory allocator and caches it, while also recording the source of the current bus write command (Port = 0). If the status register 324 remains in the UnReady state after the specified time, it is determined that the memory allocator failed to allocate a cache unit to the CPU.

[0154] In response to bus write command 2, Group ID = 2 is written to the Group register 321 of the memory allocator, and a specified value 2 identifying bus master device 2 is written to the Port register 323 of the memory allocator. The memory allocator, in response to Group ID = 2 being written to Group register 321 and the specified value 2 identifying bus master device 2 being written to Port register 323, allocates a cache unit for bus master device 2 from Group 2. The memory allocator agent reads the status value of the status register 324 in the memory allocator. If the status register 324 indicates the Ready state, the agent reads the index of the cache unit allocated to bus master device 2 from the index register 322 of the memory allocator and caches it, also recording the source of the current bus write command. If the status register 324 is still in the UnReady state after the specified time, it is determined that the memory allocator failed to allocate a cache unit for bus master device 2.

[0155] In response to bus read command 1, which corresponds to bus write command 1 and is sent by the CPU, the memory allocator agent returns to the CPU either the index of the cache unit allocated to the CPU or cache unit allocation failure information. In response to bus read command 2, which corresponds to bus write command 2 and is sent by bus master device 2, the memory allocator agent returns to bus master device 2 either the index of the cache unit allocated to bus master device 2 or cache unit allocation failure information.

[0156] Figure 7 shows a hardware block diagram of two target devices obtaining cache units through a memory allocator agent, according to another embodiment of this application.

[0157] For example, as shown in Figure 7, a memory allocator agent is deployed between the memory allocator and the CPU, and between the memory allocator and bus master device 2. The CPU and bus master device 2 share the memory allocator agent and the memory allocator. The CPU and bus master device 2 are connected to the memory allocator agent via the bus. The memory allocator agent is directly connected to the memory allocator. The memory allocator can be accessed by the CPU and bus master device 2 on the bus through the memory allocator agent.

[0158] The memory allocator and memory allocator agent shown in Figure 7 are largely the same as those of the embodiment of Figure 5. In Figure 7, the memory allocator manages multiple cache units, which are divided into two groups (distinguished by Group values). Each cache unit group can be further subdivided based on the value indicated by Port register 323. Since the value recorded in Port register 323 indicates the target device from which the cache unit allocation request originates, this can be viewed as a further grouping of each cache unit group by the different devices on the bus. When allocating cache units, the memory allocator can allocate cache units from the corresponding Port subgroup based on the device from which the cache unit allocation request originates. Optionally, the cache unit allocation request carries a Group value but not a Port value. The Port value is determined by the memory allocator agent based on the source of the current cache unit allocation request. Thus, users of bus master devices can specify the Group value when issuing a cache unit allocation request, but the Port value is determined objectively and cannot be specified by the user. The memory allocator can provide diverse cache unit allocation strategies based on the Group and Port values. For example, cache units can be divided into different groups based on the Group and Port values, each bus master device can have a different priority relative to each group, and some groups can be reserved for specific bus master devices.

[0159] For example, the CPU writes Group ID = 1 to the bus address Addr_req. In response to the Group ID = 1 written to the bus address Addr_req, the memory allocator agent writes data to the Group register 321 and the Port register 323 of the memory allocator. Writing Group ID = 1 to Group register 321 indicates a request for a cache unit from Group 1, and writing Port = 0 to Port register 323 indicates that the cache unit allocation request originates from the CPU. Based on the data written to Group register 321 and Port register 323, the memory allocator selects an available cache unit from the two cache units corresponding to Port 0 of Group 1 and allocates it to the CPU.

[0160] Bus master device 2 writes Group ID = 2 to the bus address Addr_req. In response to the Group ID = 2 written to Addr_req, the memory allocator agent writes data to the Group register 321 and Port register 323 of the memory allocator. Specifically, writing Group ID = 2 to Group register 321 indicates a request for a cache unit from Group 2, and writing Port = 2 to Port register 323 indicates that the cache unit allocation request originates from bus master device 2. Based on the data written to Group register 321 and Port register 323, the memory allocator selects an available cache unit from the two cache units corresponding to Port 2 of Group 2 and allocates it to bus master device 2.

[0161] Optionally, the Port is used as a hint rather than a mandatory constraint. For example, the value of Port register 323 indicates that a cache unit should be allocated from the subgroup corresponding to Port 0. If the cache units of the subgroup corresponding to Port 0 are exhausted, the memory allocator can allocate a cache unit from the subgroup corresponding to Port 2.
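The subdivision of each Group by Port can be pictured as free lists keyed by a (Group, Port) pair. The sketch below is an assumption-laden illustration: the function name `allocate_from_subgroup`, the dictionary layout, and the choice to fall back only to other Ports within the same Group are hypothetical, since the source does not state which subgroup is tried first.

```python
def allocate_from_subgroup(free, group, port, port_is_hint=False):
    """Return ((group, port), index) for an allocated cache unit, or None.

    free maps a (Group, Port) pair to a list of free cache unit indices (hypothetical layout).
    """
    pool = free.get((group, port), [])
    if pool:
        return (group, port), pool.pop(0)
    if port_is_hint:
        # Assumed policy: stay inside the requested Group and try its other Port subgroups.
        for key in sorted(free):
            if key[0] == group and free[key]:
                return key, free[key].pop(0)
    return None
```

For example, with `free = {(1, 0): [0], (1, 2): [2, 3], (2, 2): [8]}`, a CPU request (Port 0) for Group 1 is served from subgroup (1, 0) first and, if the Port is only a hint, from subgroup (1, 2) once (1, 0) is exhausted.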

[0162] In the aforementioned embodiments of this application, the target device (such as a CPU) needs to issue one bus write command and one bus read command to obtain one cache unit. When the CPU needs to obtain a large number of cache units (e.g., N), the CPU needs to issue 2N bus commands proportionally, resulting in excessive CPU resource consumption and a large bus occupation, which limits the overall system performance. To further improve performance, it is necessary to achieve a large allocation of cache units with lower resource overhead. Based on this, a new processing strategy is proposed, which combines sending one bus write command and one bus read command into sending only one bus read command. A single bus read command is used instead of the one bus write command plus one bus read command method in the aforementioned embodiments to complete the allocation of cache units, thereby reducing the resource overhead of the cache unit allocation process.

[0163] Figure 8 shows a hardware block diagram in which a target device obtains a cache unit via a single bus read command, according to an embodiment of this application.

[0164] For example, as shown in Figure 8, a memory allocator agent is deployed between the memory allocator and the CPU, and between the memory allocator and bus master device 2. The CPU and bus master device 2 share the memory allocator agent and the memory allocator. The CPU and bus master device 2 are connected to the memory allocator agent via the bus. The memory allocator agent is directly connected to the memory allocator. The memory allocator can be accessed by the CPU and bus master device 2 on the bus through the memory allocator agent.

[0165] In this embodiment, the memory allocator agent occupies multiple addresses or multiple address segments in the bus address space (in contrast, in the embodiments of Figures 2 to 7, the memory allocator agent occupies two bus addresses, Addr_req and Addr_buffer_index, in the bus address space). In this embodiment, the number of bus addresses occupied by the memory allocator agent in the bus address space is, for example, 2^(W1+W2), where W1 is the bit width (e.g., 5 bits) of the Group value carried in the bus address information of the bus read command, and W2 is the bit width (e.g., 8 bits) of the Port value carried in the bus address information of the bus read command. In this case, the number of bus addresses occupied by the memory allocator agent in the bus address space is 2^13. As an example, if the base address of the memory allocator agent is B, and the Group value and Port value are located in the least significant bits of the bus address in the bus read command, then all read and write commands accessing the address space [B, B+8KB) on the bus are handled by the memory allocator agent. Optionally, the Group / Port value carried by the bus read command through the bus address information may omit some bits for compression or other purposes, or include additional bits for alignment or other purposes, so that the bit width of the Group / Port value differs from the bit width of the Group / Port register of the memory allocator agent. Those skilled in the art will understand that this difference can exist and, consequently, how the number and location of the bus addresses occupied by the memory allocator agent in the bus address space are calculated.
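The address arithmetic above can be worked through in a few lines. The sketch assumes one particular layout that the text permits but does not mandate: the Group value occupies the lowest W1 bits of the offset from base address B and the Port value the next W2 bits, with B aligned to the window size. The names `read_address` and `handled_by_agent` are hypothetical.

```python
W1 = 5                              # Group bit width (example value from the text)
W2 = 8                              # Port bit width (example value from the text)
WINDOW_SIZE = 1 << (W1 + W2)        # 2^13 = 8192 addresses, i.e. 8 KB of byte addresses


def read_address(base, group, port):
    """Build the bus read address that encodes a Group and a Port (assumed layout)."""
    if not (0 <= group < (1 << W1) and 0 <= port < (1 << W2)):
        raise ValueError("Group or Port value does not fit its bit width")
    return base + ((port << W1) | group)


def handled_by_agent(addr, base):
    """True if the address falls inside the window occupied by the agent."""
    return base <= addr < base + WINDOW_SIZE
```

With a hypothetical base of `0x10000`, the largest encodable address is `read_address(0x10000, 31, 255)`, which is the last address of the 8 KB window; the address one past it is no longer handled by the agent.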

[0166] It should be understood that when the Group / Port value is located in some of the least significant bits of the bus address information carried by the bus read command, the multiple bus addresses occupied by the memory allocator agent in the bus address space are contiguous. Conversely, when the Group / Port value is not located in the least significant bits of the bus address information carried by the bus read command, the multiple bus addresses occupied by the memory allocator agent in the bus address space are not contiguous. Whether or not they are contiguous, the address range corresponding to the multiple bus addresses occupied by the memory allocator agent in the bus address space can always be identified. Bus read commands whose bus addresses fall within that address range are delivered to the memory allocator agent for processing.

[0167] It should also be understood that the bus read command can carry either the Group value or the Port value through its bus address information, rather than both simultaneously. Accordingly, the number and location of the bus addresses occupied by the memory allocator agent in the bus address space are determined by the Group value or the Port value. The Group value and / or Port value are carried by designated bits of the bus address. For example, for a 32-bit bus address, the first 5 bits starting from the most significant bit are used to record the Group value, and the next 8 bits are used to record the Port value. As another example, for a 32-bit bus address, the first 5 bits starting from the least significant bit are used to record the Group value, while the middle bits [17:10] are used to record the Port value. Optionally, the positions of the multiple bits used to record the Group value or Port value do not need to be consecutive.
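The two 32-bit layouts given as examples can be decoded with a generic bit-field helper. In this sketch the helper names are hypothetical, and the most-significant-bit layout is read as Group in bits [31:27] and Port in bits [26:19], which is one interpretation of "the first 5 bits starting from the most significant bit" followed by "the next 8 bits".

```python
def extract_bits(value, high, low):
    """Return the bit field value[high:low], inclusive on both ends."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def decode_msb_layout(addr):
    """Group in bits [31:27], Port in bits [26:19] (assumed reading of the first example)."""
    return extract_bits(addr, 31, 27), extract_bits(addr, 26, 19)


def decode_split_layout(addr):
    """Group in bits [4:0], Port in bits [17:10] (the second example in the text)."""
    return extract_bits(addr, 4, 0), extract_bits(addr, 17, 10)
```

In the split layout, bits [9:5] between the two fields carry neither value, which is why the addresses occupied by the agent are not contiguous in that case.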

[0168] exist Figure 8 In this configuration, the memory allocator agent includes Group register 311, Port register 313, and Index register 312. The CPU and bus master 2 are connected via a bus to Group register 311, Port register 313, and Index register 312. The memory allocator also includes Group register 321, Port register 323, Status register 324, and Index register 322. The memory allocator corresponds to two cache unit groups and allocates cache units to the matching cache unit groups based on instructions from the CPU and bus master 2.

[0169] In the embodiment of Figure 8, the CPU and bus master 2 each access the memory allocator agent with a single bus read command. Based on that single bus read command, the memory allocator agent requests the memory allocator to allocate a cache unit, so that cache unit allocation is achieved through a single bus read command. The following describes the process by which the memory allocator allocates a cache unit to the CPU, taking the CPU as an example:

[0170] When the CPU needs to request cache unit allocation from the memory allocator, it issues a bus read command. Specified bits of the bus address (R_addr) carried in the bus read command describe the Group value and the Port value. The Group value indicates which Group the cache unit should be allocated from, while the Port value indicates which device the current cache unit allocation request originates from; for example, Port = 0 indicates that the request originates from the CPU. The other bits of the bus address carried in the bus read command are filled with specified values (e.g., all 0s, all 1s, or other values), and these specified values are determined by the base address B. Optionally, the bus address carried by the bus read command does not include the Port value, and the memory allocator agent generates the Port value based on the source of the bus read command. In this case, the Port value is independent of the multiple bus addresses occupied by the memory allocator agent in the bus address space. As a further alternative, the address carried by the bus read command does not include the Port value, and the memory allocator does not need to use the Port value (see the example of Figure 2).

[0171] Continuing with Figure 8, the bus recognizes that the bus read command should be processed by the memory allocator agent because the bus address R_Addr carried by the bus read command falls into the address range corresponding to the multiple bus addresses occupied by the memory allocator agent in the bus address space (R_Addr is one of the multiple bus addresses occupied by the memory allocator agent in the bus address space).

[0172] In response to a received bus read command, the memory allocator agent does not immediately issue a response. Instead, it extracts the Group and Port values from the bus address (R_Addr) carried by the command and writes them to the memory allocator's Group register 321 and Port register 323. After extracting the Group and Port values, the memory allocator agent can first write them to Group register 311 and Port register 313, and then write the contents of Group register 311 and Port register 313 to Group register 321 and Port register 323.

[0173] Based on the data written to Group register 321 and Port register 323, the memory allocator selects available cache units from the corresponding cache unit groups and allocates them to the CPU. It writes the index of the allocated cache unit to index register 322 and updates the status register 324. The memory allocator agent monitors the status register 324. When the status register 324 indicates a Ready state, it reads the index of the cache unit allocated to the CPU from the memory allocator's index register 322 and caches it. The memory allocator agent then returns the read index to the CPU via the bus as a response to a bus read command. Thus, the CPU can allocate cache units with a single bus read command, omitting a bus write command compared to the previous implementation.

[0174] The bus specifies a processing time for bus read commands. If the memory allocator fails to provide a valid cache unit within the specified time, status register 324 indicates an Unready state. However, the memory allocator agent still needs to respond to the bus read command, using specified data as the response. This specified data represents the memory allocator's failure to allocate a cache unit. After the CPU obtains this specified data and determines that the memory allocator has failed to allocate a cache unit, it can respond to the memory allocator's failure to allocate a cache unit by issuing another bus read command to request the allocation of a cache unit.

[0175] The bus address (R_addr) carried by the bus read command is of a fixed size, such as 32 bits. For example, 13 bits of the 32-bit R_addr are used to describe the Group and Port values, and the remaining bits are used to identify the multiple bus addresses occupied by the memory allocator agent in the bus address space. Whether the bus read command is processed by the memory allocator agent is determined by whether the address described by R_addr falls within the address range corresponding to those bus addresses. Optionally, the actual range of Group and Port values in the bus address (R_addr) carried by bus read commands issued by the CPU is much smaller than the address space occupied by the memory allocator agent on the bus; that is, the number of Group and Port values actually used is much smaller than the number of bus addresses occupied by the memory allocator agent in the bus address space. For example, the address space occupied by the memory allocator agent is 8KB, but the Group and Port values may take only 2, 4, or 16 possible values. Although this approach wastes bus address space, it improves cache unit allocation efficiency: by using part of the bus address bits carried by the bus read command as the Group / Port value of the cache unit allocation request, the allocation is completed with a single bus read command.
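The address-window check and the resulting sparse use of the window can be sketched as follows. This is only an illustration under stated assumptions: the base address `BASE`, the placement of the 13 field bits in the least significant bits (Group in bits [12:8], Port in bits [7:0]), and byte addressing are all hypothetical choices, not values given in the application.

```python
FIELD_BITS = 13                 # 13 address bits carry the Group and Port values
WINDOW_SIZE = 1 << FIELD_BITS   # 8192 addresses, i.e. an 8KB window if byte-addressed
BASE = 0x4000_0000              # hypothetical base address B, aligned to the window

def hits_agent(r_addr):
    """True if r_addr lies in the window occupied by the memory allocator agent."""
    return (r_addr & ~(WINDOW_SIZE - 1)) == BASE

def split(r_addr):
    """Assumed layout: Group in bits [12:8], Port in bits [7:0]."""
    offset = r_addr & (WINDOW_SIZE - 1)
    return offset >> 8, offset & 0xFF

def used_fraction(n_groups, n_ports):
    """Share of the window that is actually addressed by real Group/Port pairs."""
    return (n_groups * n_ports) / WINDOW_SIZE

r_addr = BASE | (2 << 8) | 1    # Group = 2, Port = 1
print(hits_agent(r_addr), split(r_addr), used_fraction(2, 2))
```

With two Groups and two Ports, only 4 of the 8192 addresses in this assumed window are ever used, which is the address-space cost that the paragraph above accepts in exchange for single-command allocation.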

[0176] Optionally, as in Figure 2, the memory allocator can allocate cache units without using Group and Port values. In the embodiment of Figure 8, the bus read command carries a bus address including a Group value and / or a Port value. However, where cache units do not need to be allocated based on the Group value / Port value, the memory allocator agent may not provide the Group value and / or Port value to the memory allocator, and may instead write to the request register 320 to indicate the cache unit allocation request in the manner of Figure 2.

[0177] It should be noted that the specified bits in R_addr that describe the Group value are used to indicate one or more Group values. For example, when they indicate a single Group value (such as Group1), the memory allocator allocates cache units from Group1. As another example, as shown in Figure 9, the specified bits describing the Group value in the bus address (R_addr) carried by the bus read command indicate two Group values, Group1 and Group2. Therefore, the memory allocator can allocate cache units from either Group1 or Group2. Furthermore, the positions of Group1 and Group2 in R_addr (e.g., their order) indicate priority to the memory allocator; for example, the memory allocator preferentially allocates cache units from Group1 and then from Group2.
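The priority behavior described above can be sketched as follows, assuming the free cache units of each Group are modeled as a simple list. The function name `allocate_by_priority` and the data model are hypothetical illustrations, not the hardware mechanism of the application.

```python
def allocate_by_priority(free_by_group, groups_in_priority_order):
    """Return (group, index) from the first listed Group that still has a free
    cache unit, or None if every listed Group is exhausted."""
    for group in groups_in_priority_order:
        free = free_by_group.get(group)
        if free:
            return group, free.pop(0)
    return None

# Group1 is preferred; Group2 is used only once Group1 has no free unit left.
free_units = {1: [0], 2: [4, 5]}
print(allocate_by_priority(free_units, [1, 2]))
print(allocate_by_priority(free_units, [1, 2]))
```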

[0178] In the bus address R_Addr of Figure 9, the specified value is used to enable the bus or the memory allocator agent to identify that the current R_Addr falls into the address range corresponding to the multiple bus addresses occupied by the memory allocator agent in the bus address space.

[0179] Figure 10 shows the processing flowchart of the memory allocator agent in the embodiment illustrated in Figure 8.

[0180] Step 1001: Receive the bus read command from the CPU carrying the bus address (R_addr). The bus address R_addr is in the format [xxx, group, port], where xxx identifies the multiple bus addresses occupied by the memory allocator agent in the bus address space (i.e., the specified value shown in Figure 9), and group and port are variable values. Bus read commands carrying addresses in the format [xxx, group, port] are delivered to the memory allocator agent. Bus read commands carrying other addresses are sent by the bus to other devices.

[0181] Step 1002: Write the Group value and Port value from the bus address carried by the bus read command into the Group register 321 and Port register 323 of the memory allocator.

[0182] Step 1003: Read the status value of status register 324 in the memory allocator.

[0183] Step 1004: Identify whether the status register 324 is in the Ready state. If the status register 324 is in the UnReady state, proceed to step 1005. If the status register 324 is in the Ready state, proceed to step 1007.

[0184] Step 1005: Check if the time it takes for the status register 324 to change from UnReady to Ready exceeds the specified duration. If it does, proceed to step 1006; otherwise, return to step 1003.

[0185] Step 1006: Determine that the memory allocator failed to allocate a cache unit, and then proceed to step 1008.

[0186] Step 1007: Read the index of the allocated cache unit from the index register 322 of the memory allocator and cache it in the index register 312, and then execute step 1008.

[0187] Step 1008: Generate a read command response based on the index of the allocated cache unit or the cache unit allocation failure information, and send the read command response to the CPU.

[0188] In the process shown in Figure 10, each step is executed by the memory allocator agent. In response to a single bus read command provided by the bus, the memory allocator agent writes the Group value and Port value in the address carried by the bus read command into the Group register 321 and Port register 323 of the memory allocator. The memory allocator then allocates a cache unit, so that the allocation of a cache unit is achieved through a single bus read command.
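Steps 1001 to 1008 can be sketched in software as follows. This is a behavioral illustration only: `FakeAllocator`, the failure marker `FAIL`, the [group, port] bit positions, and the use of a poll count in place of the specified duration are all assumptions introduced for the example.

```python
FAIL = 0xFFFF  # hypothetical "allocation failed" marker returned to the CPU

class FakeAllocator:
    """Stand-in for the memory allocator; becomes Ready after a few polls."""
    def __init__(self, free, polls_until_ready):
        self.free = list(free)
        self.polls_until_ready = polls_until_ready
        self.polls = 0

    def write_group_port(self, group, port):   # Group register 321 / Port register 323
        self.polls = 0                          # a new request restarts the allocation

    def status_ready(self):                     # Status register 324
        self.polls += 1
        return bool(self.free) and self.polls >= self.polls_until_ready

    def read_index(self):                       # Index register 322
        return self.free.pop(0)

def handle_bus_read(allocator, r_addr, max_polls):
    # Step 1001: take Group/Port from the address (assumed bits [12:8] and [7:0]).
    group, port = (r_addr >> 8) & 0x1F, r_addr & 0xFF
    allocator.write_group_port(group, port)     # Step 1002
    for _ in range(max_polls):                  # Steps 1003 to 1005: poll with a limit
        if allocator.status_ready():            # Step 1004
            return allocator.read_index()       # Steps 1007 and 1008
    return FAIL                                 # Steps 1006 and 1008

print(handle_bus_read(FakeAllocator([5], 3), 0x0100, max_polls=10))
```

The polling loop runs inside the agent, so the requesting device sees only one read command and one response.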

[0189] Figure 11 shows a hardware block diagram in which a target device obtains cache units via a single bus read command, according to another embodiment of this application.

[0190] The hardware block diagram shown in Figure 11 is basically the same as that of Figure 8; the difference is that the index register 312 of the memory allocator agent can hold multiple indices. In the embodiment of Figure 11, a special Port value is provided to indicate that the current bus read command intends to acquire multiple cache units. This special Port value is carried by the bus address corresponding to the bus read command. For example, Port = 100 indicates that two cache units need to be acquired. The target device (CPU or bus master 2) sends a bus read command to the memory allocator agent. The specified bits describing the Port value in the address carried by the bus read command indicate the special Port value, enabling the allocation of multiple cache units with a single bus read command. This improves cache unit allocation efficiency compared to allocating one cache unit with a single bus read command.

[0191] For the memory allocator agent, in response to a bus read command, it writes data multiple times to the Group register 321 and Port register 323 of the memory allocator, with the same data written each time. The memory allocator allocates cache units based on the data written each time, providing indices for multiple cache units. After obtaining the indices for multiple cache units, the memory allocator agent provides them to the target device via a single bus transfer. Optionally, the data written to the Port register 323 of the memory allocator does not need to be a special Port value, but can be obtained using the method described in one or more of the preceding embodiments. Alternatively, the special Port value can also be written to the Port register 323 of the memory allocator.

[0192] As an example, a special Port value indicates that two cache units are needed. The process of the memory allocator allocating two cache units to the CPU is described below:

[0193] When the CPU needs to request cache units from the memory allocator, it issues a bus read command. The specified bits of the bus address (R_addr) carried in the bus read command describe the Group value and Port value. In response to the special Port value indicated in the bus address carried by the received bus read command, the memory allocator agent requests the memory allocator twice to allocate a cache unit for the CPU.

[0194] For the first allocation request, the memory allocator agent provides the Group and Port values to the memory allocator's Group register 321 and Port register 323. Based on the data written to the Group register 321 and Port register 323, the memory allocator allocates a cache unit for the CPU from the available cache units. The memory allocator agent monitors the status register 324. When the status register 324 indicates a Ready state, the memory allocator agent reads the index (e.g., index 1) of the cache unit allocated to the CPU from the memory allocator's index register 322, and uses the read index 1 as part A of the read data of the bus read command.

[0195] Next, the memory allocator agent initiates a second allocation request, again providing the Group and Port values to the Group register 321 and Port register 323 of the memory allocator. Based on the data written to the Group register 321 and Port register 323, the memory allocator allocates another cache unit for the CPU from the available cache units. The memory allocator agent monitors the status register 324. When the status register 324 indicates the Ready state, it retrieves the index of the cache unit allocated to the CPU (e.g., index 2) and uses the read index 2 as part B of the read data of the bus read command.

[0196] After obtaining index 1 as part A and index 2 as part B, the memory allocator agent combines index 1 and index 2 as a response to the bus read command and sends the response to the CPU via the bus.

[0197] In this embodiment, the CPU allocates two cache units through a single bus read command, which improves the allocation efficiency of cache units compared to the allocation method of obtaining one cache unit through a single bus read command.

[0198] It should be noted that the memory allocator may fail to allocate cache units to the CPU. The memory allocator agent's second allocation request is unrelated to the initial cache unit allocation result. Because the bus read command sent by the CPU carries a special Port value in its bus address, indicating that two cache units are to be acquired, the success or failure of the initial cache unit allocation will not affect the second cache unit allocation request. In other words, the memory allocator agent will initiate a second cache unit allocation request regardless of the success or failure of the initial allocation.

[0199] If the memory allocator agent successfully allocates cache units in both cache unit allocation requests, it determines the response to a single bus read command based on the indices of the allocated cache units. The agent then provides the acquired index to the CPU via the bus, enabling the CPU to acquire the two cache units allocated by the memory allocator through a single bus read command. If the memory allocator agent fails to allocate cache units in both cache unit allocation requests, it sends a cache unit allocation failure message to the CPU via the bus. If the memory allocator agent successfully allocates a cache unit in only one of the two requests, it sends a cache unit allocation failure message and the index of the acquired cache unit to the CPU via the bus. Furthermore, when sending a response to the CPU, flag A can indicate successful cache unit allocation, and flag B can indicate failure. The cache unit allocation failure message sent by the memory allocator agent is represented by flag B. For example, if the memory allocator successfully allocates cache units for both allocation requests, the response from the memory allocator agent is (idx A, flag A, idx B, flag A), where idx A represents the index of the cache unit allocated in the first allocation and idx B represents the index of the cache unit allocated in the second allocation. If the memory allocator successfully allocates a cache unit for the first allocation request but fails to allocate a cache unit for the second allocation request, the response from the memory allocator agent is (idx A, flag A, idx B, flag B), where idx A represents the index of the cache unit allocated in the first allocation and idx B is null.
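The response format (idx A, flag, idx B, flag) can be illustrated with a short sketch. The numeric encodings of flag A and flag B and the use of `None` for a null index are assumptions made for the example.

```python
FLAG_A = 1   # hypothetical encoding of "allocation succeeded"
FLAG_B = 0   # hypothetical encoding of "allocation failed"

def build_response(first_index, second_index):
    """Build (idx A, flag, idx B, flag); a failed allocation is passed as None."""
    response = []
    for index in (first_index, second_index):
        response.append(index)
        response.append(FLAG_A if index is not None else FLAG_B)
    return tuple(response)

print(build_response(3, 9))      # both allocations succeeded
print(build_response(3, None))   # second allocation failed, idx B is null
```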

[0200] The number of indices that can be returned for a single bus read command is limited by the index size and the allowed data width of the bus read command. That is, the number of indices that can be returned for a single bus read command must be less than or equal to the set value. For example, if the allowed data width for a single bus read command is 32 bits and the index size of the cache unit is 16 bits, then a single bus read command can return 2 indices; or, if the index size of the cache unit is 10 bits, then a single bus read command can return 3 indices.
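The limit described above amounts to an integer division of the data width by the index size. The sketch below also packs the indices into one data word; placing the first index in the least significant bits is an assumption, since the application does not specify the ordering.

```python
def max_indices(data_width_bits, index_bits):
    """Number of whole indices that fit in one bus read response."""
    return data_width_bits // index_bits

def pack_indices(indices, index_bits, data_width_bits):
    """Pack indices into one word, first index in the low bits (assumed order)."""
    if len(indices) > max_indices(data_width_bits, index_bits):
        raise ValueError("too many indices for one bus read response")
    mask = (1 << index_bits) - 1
    word = 0
    for position, index in enumerate(indices):
        word |= (index & mask) << (position * index_bits)
    return word

print(max_indices(32, 16), max_indices(32, 10))
print(hex(pack_indices([1, 2], 16, 32)))
```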

[0201] Figure 12 shows the processing flowchart of the memory allocator agent in the embodiment illustrated in Figure 11.

[0202] Step 1201: Receive the bus read command sent by the CPU. The specified bits of the bus address carried in the bus read command describe the Group value and the Port value. The Port value is a special Port value and indicates that two cache units need to be acquired.

[0203] Step 1202: Provide the Group value and Port value indicated by the bus read command to the Group register 321 and Port register 323 of the memory allocator. The memory allocator allocates cache units for the CPU in the available cache units based on the data written to the Group register 321 and Port register 323.

[0204] Step 1203: Read the status value of status register 324 in the memory allocator.

[0205] Step 1204: Identify whether the status register 324 is in the Ready state. If the status register 324 is in the UnReady state, proceed to step 1205. If the status register 324 is in the Ready state, proceed to step 1207.

[0206] Step 1205: Check if the time it takes for the status register 324 to change from UnReady to Ready exceeds the specified duration. If it does, proceed to step 1206; otherwise, return to step 1203.

[0207] Step 1206: Determine that the memory allocator failed to allocate a cache unit, and then proceed to step 1208.

[0208] Step 1207: Read the index 1 of the allocated cache unit from the index register 322 of the memory allocator and cache it in the index register 312, and then execute step 1208.

[0209] Step 1208: Provide the Group value and Port value indicated by the bus read command to the Group register 321 and Port register 323 of the memory allocator again.

[0210] Step 1209: Read the status value of status register 324 in the memory allocator.

[0211] Step 1210: Identify whether the status register 324 is in the Ready state. If the status register 324 is in the UnReady state, proceed to step 1211. If the status register 324 is in the Ready state, proceed to step 1213.

[0212] Step 1211: Check if the time it takes for the status register 324 to change from the UnReady state to the Ready state exceeds the specified time. If it does, proceed to step 1212; otherwise, return to step 1209.

[0213] Step 1212: Determine that the memory allocator failed to allocate a cache unit, and then proceed to step 1214.

[0214] Step 1213: Read the index 2 of the allocated cache unit from the index register 322 of the memory allocator and cache it in the index register 312, and then execute step 1214.

[0215] Step 1214: Generate a response to the bus read command based on the allocation results of the first allocation and the second allocation, and provide it to the CPU via the bus.

[0216] In the process shown in Figure 12, each step is executed by the memory allocator agent. In response to a single bus read command, the memory allocator agent writes the Group value and Port value (a special Port value indicating the acquisition of 2 cache units) in the bus address carried by the bus read command into the Group register 321 and Port register 323 of the memory allocator twice. The memory allocator allocates cache units twice, and the allocation of two cache units is achieved through a single bus read command.

[0217] Figure 13 shows a hardware block diagram in which a target device obtains cache units via a single bus read command, according to another embodiment of the present application.
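The control flow of Figure 12 can be condensed into a loop, shown below as a sketch. The table `SPECIAL_PORT_COUNTS`, the reading of Port = 100 as a decimal value, and the callback `try_allocate` (standing in for one write-poll-read round against the memory allocator) are assumptions for illustration.

```python
SPECIAL_PORT_COUNTS = {100: 2}   # assumed: special Port value 100 requests two units

def serve_bus_read(port, try_allocate):
    """Run one allocation attempt per requested cache unit.

    try_allocate() returns an index, or None when the Status register did not
    reach Ready within the specified duration.
    """
    count = SPECIAL_PORT_COUNTS.get(port, 1)
    results = []
    for _ in range(count):
        # A failed attempt does not cancel the following attempt.
        results.append(try_allocate())
    return results

outcomes = iter([None, 7])       # first attempt times out, second succeeds
print(serve_bus_read(100, lambda: next(outcomes)))
```

The loop makes explicit that the second request is issued whether or not the first one succeeded, as stated for this embodiment.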

[0218] The hardware block diagram shown in Figure 13 is basically the same as that of Figure 11; the difference is that in Figure 13 the index register 312 of the memory allocator agent is used to store multiple pieces of data (such as R_Data1 and R_Data2). In the embodiment of Figure 13, a single bus read command received by the memory allocator agent instructs the reading of multiple data items at consecutive addresses, so that the indices of multiple allocated cache units are obtained based on a single bus read command. From the perspective of the target device, the bus read command it issues instructs the reading of multiple consecutive data items starting from bus address R_addr in the address space (e.g., the data at addresses R_addr and R_addr+1), thus reading multiple consecutive data items with a single bus read command.

[0219] For the memory allocator agent, in response to a single bus read command, it writes data multiple times consecutively to the Group register 321 and Port register 323 of the memory allocator, with each write containing the same data. The memory allocator allocates cache units based on the data written each time, providing indices for multiple cache units. For example, the memory allocator agent feeds back the indices obtained from the memory allocator twice, as R_Data1 and R_Data2, to the target device in response to the bus read command.

[0220] Taking the CPU as the target device as an example, the process of the memory allocator allocating two cache units to the CPU is described below:

[0221] The CPU issues a bus read command indicating that two consecutive data items need to be read. The specified bits of the bus addresses (R_addr and R_addr+1) carried in the bus read command describe the Group and Port values. The memory allocator agent provides the Group and Port values to the memory allocator's Group register 321 and Port register 323, and the memory allocator allocates a cache unit. The memory allocator agent reads the index (e.g., index 1) of the cache unit allocated to the CPU from the memory allocator's index register 322 and caches the read index 1 as R_Data1.

[0222] Since the bus read command indicates that two consecutive copies of data need to be read, after obtaining R_Data1, the memory allocator agent provides the Group value and Port value to the Group register 321 and Port register 323 of the memory allocator again, so that the memory allocator can allocate cache units. The memory allocator agent obtains the index of the cache unit allocated to the CPU (such as index 2) and caches the read index 2 as R_Data2.

[0223] After obtaining R_Data1 and R_Data2, the memory allocator agent combines R_Data1 and R_Data2 and feeds them back to the CPU via the bus. In this way, the memory allocator agent obtains a cache unit index from the memory allocator twice based on a single bus read command, and responds to the bus read command issued by the CPU.

[0224] Based on a single bus read command indicating the reading of multiple consecutive data sets, if the Port value described by the specified bits of the bus address carried by the bus read command is a special Port value, and the special Port value indicates that multiple cache units need to be acquired, then more cache units can be allocated based on a single bus read command.

[0225] For example, the CPU issues a bus read command indicating that two consecutive data sets need to be read. The specified bits of the bus address carried by the bus read command describe the Group value and Port value. The Port value is a special Port value that indicates that two cache units need to be acquired. During the first data read, the memory allocator agent provides the Group value and Port value to the Group register 321 and Port register 323 of the memory allocator. The memory allocator allocates cache units to the CPU based on the Group value and Port value. Then, it provides the Group value and Port value to the Group register 321 and Port register 323 of the memory allocator again, and the memory allocator allocates cache units again. This allows the memory allocator agent to obtain the indices corresponding to the two cache units allocated by the memory allocator during the first data read, and combine the two obtained indices (such as index 1 and index 2) as R_Data1.

[0226] During the second data read, the memory allocator agent again provides the Group and Port values to the Group register 321 and Port register 323 of the memory allocator. The memory allocator allocates a cache unit to the CPU based on the Group and Port values. The memory allocator agent then provides the Group and Port values to the Group register 321 and Port register 323 once more, and the memory allocator allocates another cache unit, allowing the memory allocator agent to obtain the indices of the two cache units allocated during the second data read. These two indices (e.g., indices 3 and 4) are used as R_Data2. The memory allocator agent combines R_Data1 and R_Data2 and provides them to the CPU in one go via the bus.

[0227] In this embodiment, R_Data1 and R_Data2 are of fixed size; for example, both R_Data1 and R_Data2 are 32 bits. R_Data1 and R_Data2 can be concatenated into 64 bits of data and provided to the CPU in a single bus transfer. The memory allocator allocates cache units to the CPU from the available cache units based on the data written to Group register 321 and Port register 323. Even if the same data is written to Group register 321 and Port register 323 twice, this does not affect the memory allocator allocating cache units to the CPU from the available cache units.

[0228] In this embodiment, the CPU allocates four cache units with a single bus read command. Compared to the previous method of allocating two cache units with a single bus read command, this further improves the allocation efficiency of cache units.

[0229] It's important to note that the memory allocator may fail to allocate cache units to the CPU. If a cache unit allocation fails, this failure can be indicated in R_Data. For example, four flags can be used to indicate whether four cache units were successfully allocated, or a special value (such as 0xff) can be used to indicate a cache unit allocation failure.
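The combination of R_Data1 and R_Data2 into 64 bits, with a special value marking failed allocations, can be sketched as follows. The 16-bit index width, the failure marker `FAIL = 0xFFFF` (the text gives 0xff only as one possible special value), and the placement of R_Data1 in the low 32 bits are assumptions for illustration.

```python
INDEX_BITS = 16
FAIL = 0xFFFF   # hypothetical special value meaning "this allocation failed"

def pack_pair(first, second):
    """Build one 32-bit R_Data word from two indices (None marks a failure)."""
    encode = lambda value: FAIL if value is None else value
    return encode(first) | (encode(second) << INDEX_BITS)

def combine(r_data1, r_data2):
    """Concatenate two 32-bit words into 64 bits, R_Data1 in the low half."""
    return r_data1 | (r_data2 << 32)

def unpack(word64):
    """Recover the four indices seen by the CPU; failures come back as None."""
    indices = []
    for position in range(4):
        value = (word64 >> (position * INDEX_BITS)) & 0xFFFF
        indices.append(None if value == FAIL else value)
    return indices

response = combine(pack_pair(1, 2), pack_pair(None, 4))
print(unpack(response))
```

A marker value of this kind removes one index from the usable range, whereas separate flags cost additional response bits; the application leaves the choice open.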

[0230] Figure 14 shows a hardware block diagram of a memory allocator managing cache units according to an embodiment of the present application.

[0231] In Figure 14, the cache units managed by the memory allocator are divided into two groups. The groups do not share cache units, and each cache unit belongs to exactly one group. Each device using the memory allocator (such as the CPU and bus master 2) corresponds to a Group. For example, cache unit allocation requests from the CPU are served only from Group 1, and cache unit allocation requests from bus master 2 are served only from Group 2.

[0232] The memory allocator includes a Group register, an Index register, a Status register, and a newly added interface register (called the Release register). The value of the Release register indicates whether cache units should be released, while the value of the Group register indicates the Group value (or Group ID) of the Group to be released. For example, a value of 1 in the Release register indicates that cache units need to be released; a value of 0 indicates that cache units should not be released. The CPU (or another device using the memory allocator) instructs the memory allocator to release all cache units within a specific Group by writing a specified value to the Release register and a Group ID to the Group register. For example, if the CPU writes a value of 1 to the Release register and the Group ID of Group1 to the Group register, the memory allocator releases all cache units within Group1 based on the values in the Release and Group registers.

[0233] The memory allocator maintains a cache unit index table, which is implemented in hardware as a memory. Each entry of the cache unit index table stores the cache unit index of one of the cache units managed by the memory allocator; that is, a memory is used to store the cache unit indices corresponding to the cache units managed by the memory allocator.
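The register-level behavior can be modeled with a small sketch in which each Group simply tracks the set of currently allocated cache units. The class `AllocatorRegisters` and its method names are hypothetical; the value 1 as the release trigger follows the example in the text.

```python
class AllocatorRegisters:
    """Toy model of the Group register and Release register interface."""
    def __init__(self, group_ids):
        self.allocated = {group_id: set() for group_id in group_ids}
        self.group_reg = None

    def write_group(self, group_id):
        self.group_reg = group_id

    def write_release(self, value):
        # 1 means "release every cache unit of the Group named in the Group
        # register"; 0 means "do not release".
        if value == 1:
            self.allocated[self.group_reg].clear()

regs = AllocatorRegisters([1, 2])
regs.allocated[1].update({0, 2})   # cache units currently held from Group1
regs.allocated[2].add(5)           # cache unit currently held from Group2
regs.write_group(1)
regs.write_release(1)
print(regs.allocated)
```

Only the named Group is affected, so a device that restarts after a fault can reclaim its own Group without disturbing units held by other devices.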

[0234] The cache unit index table is likewise divided into multiple groups (e.g., index groups), and each group corresponds one-to-one with a cache unit Group. For example, Group 1 of the cache unit index table records the cache unit indices of all cache units in cache unit group 1, and Group 2 records the cache unit indices of all cache units in cache unit group 2. Optionally, the cache unit indices are arranged consecutively within a group of the cache unit index table, but this arrangement is not mandatory. For example, as shown in Figure 14, Group 1 of the cache unit index table includes four entries (entry 0 to entry 3), which record Buffer index 0, Buffer index 1, Buffer index 2, and Buffer index 3, respectively. Group 2 of the cache unit index table includes four entries (entry 4 to entry 7), which record Buffer index 4, Buffer index 5, Buffer index 6, and Buffer index 7, respectively. It should be understood that the cache unit indices are arranged consecutively within each group at initialization; after multiple allocations and releases of cache units, they are not necessarily arranged consecutively.

[0235] Optionally, within a group of cache unit index tables, the storage locations of each entry are contiguous, and the next entry can be obtained from one entry by incrementing the address.

[0236] The memory allocator maintains a read pointer and a write pointer for each group in the cache unit index table (the read pointer is denoted RPTR and the write pointer WPTR). In the cache unit index table of Figure 14, G1_RPTR represents the read pointer for group 1, and G1_WPTR represents the write pointer for group 1; G2_RPTR represents the read pointer for group 2, and G2_WPTR represents the write pointer for group 2. After the memory allocator is initialized and before any cache unit is allocated, the RPTR and WPTR of each group in the cache unit index table point to the first entry in that group. For example, in Figure 14, for Group 1, both G1_WPTR and G1_RPTR point to the entry at the starting address of the storage space that holds Group 1 of the cache unit index table (such as entry 0). For Group 2, both G2_WPTR and G2_RPTR point to the entry where Buffer index 4 is located (such as entry 4).

[0237] When the memory allocator receives a cache unit allocation request indicating a single cache unit, it reads the entry indicated by the RPRT in the specified cache unit index table group. The specified cache unit index table group refers to the group corresponding to the device that sent the cache unit allocation request; for example, if the CPU sends a cache unit allocation request, the specified cache unit index table group is the group corresponding to Group 1. The cache unit represented by the cache unit index recorded in the entry indicated by the read pointer in that group is used as the allocated cache unit, thus allocating a single cache unit. Subsequently, the RPRT is incremented (here, the increment instruction adds 1 to the address indicated by the current RPRT entry), for example, +1, to point to the next entry. For example, as shown... Figure 15 As shown, after receiving a cache unit allocation request from the CPU indicating a single cache unit, the memory allocator reads entry 0 of group 1 of the cache unit index table based on G1_RPTR, allocates the cache unit 0 indicated by the cache unit index Buffer index 0 recorded in entry 0 to the CPU, and then increments G1_RPTR (here, the increment indicates that the address pointed to by the current G1_RPTR is increased by 1) so that G1_RPTR points to entry 1.

[0238] When the memory allocator receives a cache unit release request indicating a single cache unit, it writes the cache unit index of the cache unit to be released into the entry pointed to by the WPTR of the specified group, and then increments the WPTR, for example by 1. For example, as shown in Figure 15, after receiving the release request for cache unit 6 sent by bus device 2, the memory allocator writes the cache unit index Buffer index 6 corresponding to cache unit 6 into entry 4, which is pointed to by G2_WPTR in Group 2, thereby releasing cache unit 6. G2_WPTR is then incremented (increased by 1) to point to entry 5.
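
The single-unit allocation and release described in [0237] and [0238] can be sketched in Python as follows. This is a minimal illustration, not the hardware implementation: the class name `IndexTable`, the per-group `free` counter, and the wraparound of a pointer at the end of its group are assumptions introduced here; the text only states that a pointer is incremented to point to the next entry.

```python
class IndexTable:
    """Toy model of a cache unit index table split into groups."""

    def __init__(self, group_ranges):
        # group_ranges: {group_id: (first_entry, last_entry)}, inclusive.
        self.ranges = dict(group_ranges)
        size = max(last for _, last in self.ranges.values()) + 1
        # Initially entry i records Buffer index i (as in Figure 14).
        self.entries = list(range(size))
        # After initialization RPTR and WPTR point to the group's first entry.
        self.rptr = {g: first for g, (first, _) in self.ranges.items()}
        self.wptr = {g: first for g, (first, _) in self.ranges.items()}
        # Assumption: a free counter per group detects an empty or full group.
        self.free = {g: last - first + 1
                     for g, (first, last) in self.ranges.items()}

    def _next(self, group, ptr):
        # Assumption: the pointer wraps to the group's first entry.
        first, last = self.ranges[group]
        return first if ptr == last else ptr + 1

    def allocate(self, group):
        if self.free[group] == 0:
            return None                      # nothing left in this group
        index = self.entries[self.rptr[group]]
        self.rptr[group] = self._next(group, self.rptr[group])
        self.free[group] -= 1
        return index

    def release(self, group, index):
        first, last = self.ranges[group]
        if self.free[group] == last - first + 1:
            raise ValueError("group has no allocated cache units")
        self.entries[self.wptr[group]] = index
        self.wptr[group] = self._next(group, self.wptr[group])
        self.free[group] += 1


table = IndexTable({1: (0, 3), 2: (4, 7)})
first_unit = table.allocate(1)      # CPU request served from Group 1
for _ in range(3):                  # bus device 2 takes units 4, 5 and 6
    table.allocate(2)
table.release(2, 6)                 # written to entry 4, G2_WPTR moves to 5
```

The last four statements replay the Figure 15 example: the CPU receives cache unit 0 from Group 1, and the release of cache unit 6 lands in entry 4 of Group 2.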

[0239] The above describes how the read and write pointers operate when a single cache unit is allocated or released. The following describes how the read and write pointers operate when all cache units within a cache unit group are released.

[0240] When the memory allocator receives a request to release all cache units within a cache unit group (implemented through the release register and the Group register), it initializes the RPTR and WPTR corresponding to that group and fills each entry of the corresponding group in the cache unit index table with its initial value. This completes the release of all cache units within that group.

[0241] As an example, the CPU writes a specified value to the release register and a specified Group ID (such as Group1) to the Group register via a bus write command to instruct the memory allocator to release all cache units within Group1. In response to the data written to the release register and the Group register, the memory allocator releases all cache units within Group1 for the CPU.

[0242] As an example, when releasing the cache units allocated to the CPU in Group 1, the memory allocator initializes the RPTR and WPTR corresponding to Group 1 regardless of which entries of Group 1 of the cache unit index table the WPTR and RPTR currently indicate. For example, as shown in Figure 16, both the RPTR and the WPTR corresponding to Group 1 are set to point to the first entry (entry 0) of Group 1 in the cache unit index table. Furthermore, each entry in Group 1 of the cache unit index table is filled with its initial value; for example, the initial values are as shown in Figure 14, so the entries in Group 1 are filled with cache unit index 0 to cache unit index 3, that is, the cache unit indexes of all cache units corresponding to this group. This completes the release of all cache units in Group 1.
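
The group release in [0240] to [0242] amounts to resetting both pointers and refilling the entries of the group. The Python sketch below models this under stated assumptions: the function name `release_group` and the dictionary layout are hypothetical, and entry i is assumed to hold Buffer index i initially, as drawn in Figure 14.

```python
def release_group(entries, pointers, group, first, last):
    """Release every cache unit of one group in place.

    entries     : list modelling the cache unit index table
    pointers    : {group: {"rptr": int, "wptr": int}}
    first, last : inclusive entry range that belongs to the group
    """
    # Both pointers return to the first entry of the group, whatever
    # entries they indicated before the release.
    pointers[group]["rptr"] = first
    pointers[group]["wptr"] = first
    # Each entry gets its initial value back. Assumption: entry i initially
    # records Buffer index i, as drawn in Figure 14.
    for entry in range(first, last + 1):
        entries[entry] = entry


# Group 1 owns entries 0..3. Units 0, 1 and 2 were allocated and unit 2 was
# released, so pointers and entry contents are no longer in their initial state.
entries = [2, 1, 2, 3, 4, 5, 6, 7]
pointers = {1: {"rptr": 3, "wptr": 1}, 2: {"rptr": 4, "wptr": 4}}
release_group(entries, pointers, group=1, first=0, last=3)
```

In this model, one call restores the whole group, so the device does not need to issue a separate release request for each cache unit it held, and Group 2 is left untouched.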

[0243] In this embodiment, the cache unit index table in the memory allocator is implemented in hardware as a memory; that is, the groups within the cache unit index table are implemented by the memory. Group 1 and Group 2 of the cache unit index table are adjacent in the storage space of the memory, such that when the RPTR/WPTR points to, for example, entry 3 storing Buffer index 3 in Figure 14, incrementing the RPTR/WPTR makes it point to entry 4 storing Buffer index 4. As a result, the memory allocator can easily be configured to manage cache units with different numbers of groups.

[0244] For example, the memory allocator provides only one group to manage eight cache units. During memory allocator initialization, both the RPTR and the WPTR point to the first entry in the memory.

[0245] For example, the memory allocator provides three groups, each managing two cache units: Group 0 (managing cache units 0 and 1), Group 1 (managing cache units 2 and 3), and Group 2 (managing cache units 4 and 5). Three RPTR/WPTR pairs are provided, which during initialization point to the first, third, and fifth entries of the cache unit index table (memory) respectively. The method of allocating and releasing cache units using the RPTR/WPTR remains unchanged, so a memory allocator that manages three groups is implemented.
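
Because the groups are adjacent in one memory, the initial pointer positions in [0244] and [0245] follow directly from the group sizes. The following Python sketch computes them; the function name `initial_pointers` and the zero-based entry numbering are assumptions made for illustration.

```python
from itertools import accumulate


def initial_pointers(group_sizes):
    """Return the initial RPTR/WPTR of each group in contiguous memory."""
    # Group k starts where the previous groups end.
    starts = [0] + list(accumulate(group_sizes))[:-1]
    return [{"rptr": s, "wptr": s} for s in starts]


one_group = initial_pointers([8])            # single group of eight units
three_groups = initial_pointers([2, 2, 2])   # Groups 0, 1, 2 of two units each
```

For three groups of two units, the zero-based start entries are 0, 2, and 4, which correspond to the first, third, and fifth entries of the table.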

[0246] The method shown in Figure 14, which releases all cache units within a group on a per-group basis, is applicable to the cache unit grouping examples of Figure 5, Figure 7, Figure 8, Figure 11, and Figure 13. In the implementation shown in Figure 14, in which the memory allocator manages the cache units, there is no memory allocator agent. Devices using the memory allocator, such as the CPU and the host command processing unit (bus master device 2), can operate the memory allocator directly without going through a memory allocator agent. For hardware structures that have a memory allocator agent (such as those shown in Figure 5, Figure 7, Figure 8, Figure 11, and Figure 13), Figure 17 provides a hardware block diagram of the memory allocator managing cache units, adapted to the memory allocator releasing cache units in groups. With reference to Figure 2, in Figure 17 the memory allocator is coupled to the bus through a memory allocator agent. The memory allocator agent may add a 1-bit flag to the data written from the bus to its Group register to indicate a release operation on the Group, or a bus-accessible CMD (command) register may be added to the memory allocator agent to receive commands indicating the release of the Group.
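
The first option, a 1-bit flag carried in the value written to the agent's Group register, can be illustrated with a small Python sketch. The bit position of the flag and the width of the Group ID field are not given in the text; `RELEASE_FLAG`, `GROUP_MASK`, and both function names are assumptions chosen only to make the example concrete.

```python
RELEASE_FLAG = 1 << 31   # assumed position of the 1-bit release flag
GROUP_MASK = 0xFF        # assumed width of the Group ID field


def encode_group_write(group, release):
    """Build the value a bus master would write to the Group register."""
    return (group & GROUP_MASK) | (RELEASE_FLAG if release else 0)


def decode_group_write(value):
    """Split a bus write to the Group register into (group, release)."""
    return value & GROUP_MASK, bool(value & RELEASE_FLAG)


normal = decode_group_write(encode_group_write(1, release=False))
group_release = decode_group_write(encode_group_write(1, release=True))
```

With this encoding a single bus write both selects the group and requests its release; the alternative CMD register would carry the release command in a separate write.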

[0247] Figure 18 shows a hardware block diagram of the storage device.

[0248] In Figure 18, the storage device includes two control units connected via an inter-chip interconnect. This structure allows for multiple operating modes. For example, in one mode, host command processing unit 1 of control unit 1 operates and processes host I/O commands from the host, while host command processing unit 2 of control unit 2 does not operate. Host command processing unit 1 may write data from host I/O commands to the cache of control unit 2 and further write the data to an NVM chip coupled to control unit 2. Therefore, in this mode, the cache of control unit 2 needs to be allocated via the memory allocator of control unit 2. That is, the cache of another control unit is allocated through that control unit's memory allocator, so the device using the memory allocator and the cache allocated by the memory allocator belong to different control units.

[0249] The cache units allocated by the memory allocator can be located in SRAM or DRAM. Each control unit has its own SRAM, and each control unit can have its own dedicated DRAM, or the two control units shown in Figure 18 can share DRAM.

[0250] To enable a control unit to use the memory allocator of another control unit, this application provides a solution based on the connection established between two control units through an inter-chip interconnect unit: through the interaction between the control units, a device requests the memory allocator of the other control unit to allocate a cache unit, and can then write data to that cache unit of the other control unit.

[0251] Figure 19A shows a hardware block diagram of the connection of the control units provided in an embodiment of this application.

[0252] In Figure 19A, each control unit has its own memory allocator and memory allocator agent. For example, control unit 0 includes memory allocator 0, and control unit 1 includes memory allocator 1. Memory allocator 0 and memory allocator 1 are connected to the bus through their respective memory allocator agents.

[0253] The cache units allocated by memory allocator 0 are located in DRAM0 coupled to control unit 0. That is, CPU 0 and host command processing unit 0 can access the cache units in DRAM0 to write data to or read data from the cache units. The cache units allocated by memory allocator 1 are located in DRAM1 coupled to control unit 1. That is, CPU 1 and host command processing unit 1 can access the cache units in DRAM1 to write data to or read data from the cache units.

[0254] If the CPU is used as the device accessing the memory allocator, in the aforementioned embodiments, CPU0 accesses memory allocator agent 0 via bus 0, and then uses memory allocator 0 to allocate cache units; CPU1 accesses memory allocator agent 1 via bus 1, and then uses memory allocator 1 to allocate cache units. That is, the CPU, as a device using the memory allocator, can only use the memory allocator within the corresponding control unit.

[0255] In the embodiment illustrated in Figure 19A, the two control units interact through an inter-chip interconnect unit. CPU0 can also access memory allocator agent 1, located within control unit 1, via bus 0, and then use memory allocator 1 to allocate cache units. The cache units allocated by memory allocator 1 belong to control unit 1, thereby enabling CPU0 to access control unit 1 and to write data to or read data from the cache units belonging to control unit 1.

[0256] When CPU0 accesses memory allocator agent 1, it issues a bus access request via bus 0. This request is transmitted to control unit 1 via the inter-chip interconnect unit. Inside control unit 1, the request is provided to memory allocator agent 1 via bus 1, allowing CPU0 to interact with memory allocator agent 1 through bus 0, the inter-chip interconnect unit, and bus 1, thereby enabling CPU0 to use memory allocator 1. Memory allocator 1 allocates cache units belonging to control unit 1 to CPU0 and, based on the interaction between CPU0 and memory allocator agent 1, provides CPU0 with the cache unit index corresponding to the allocated cache unit.

[0257] After allocating a cache unit belonging to control unit 1 to CPU0 using memory allocator 1, CPU0 can write data to the allocated cache unit. CPU0 can then issue another bus access request to write data to the cache unit allocated by memory allocator 1; this request is transmitted to control unit 1 via the inter-chip interconnect unit. Control unit 1, after receiving the request to write data to the allocated cache unit based on bus 1, can directly write data to the corresponding cache unit in DRAM1 based on the interaction between bus 1 and DRAM1. After completing the data writing to the allocated cache unit, CPU0 can continue to access the allocated cache unit to read the data written to that cache unit, thus realizing data writing and reading.

[0258] Figure 19B shows a schematic diagram of the interaction between the control units provided in an embodiment of this application.

[0259] Take as an example host command processing unit 0 in control unit 0 needing to use a cache unit belonging to control unit 1. In Figure 19B, host command processing unit 0 issues bus access request 1 (represented by (1) in Figure 19B), which requests memory allocator 1 in control unit 1 to allocate a cache unit. Control unit 0 and control unit 1 interact through the inter-chip interconnect unit. The inter-chip interconnect unit of control unit 0 forwards bus access request 1 issued by host command processing unit 0 to control unit 1 (represented by (2) in Figure 19B).

[0260] Inside control unit 1, bus access request 1 is sent to memory allocator agent 1 (represented by (3) in Figure 19B). In response to bus access request 1, memory allocator agent 1 requests memory allocator 1 to allocate a cache unit. Memory allocator agent 1 can monitor the status register of memory allocator 1. When the status register indicates the Ready status, memory allocator agent 1 reads the index in the index register of memory allocator 1 and caches it in the index register of memory allocator agent 1.
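
The role of the agent in [0260], polling the status register and caching the index on behalf of the requesting device, can be sketched in Python. Everything below is a simplified software model under assumptions: the class and method names, the `"BUSY"`/`"READY"` strings, the number of busy reads, and the poll limit are hypothetical and do not describe the actual register interface.

```python
class MemoryAllocator:
    """Stand-in allocator that becomes Ready after a few status reads."""

    def __init__(self, free_indexes, busy_reads=2):
        self.free_indexes = list(free_indexes)
        self.busy_reads = busy_reads     # assumed latency, in status reads
        self.index_register = None
        self._countdown = 0

    def request_allocation(self):
        self._countdown = self.busy_reads

    def read_status(self):
        if self._countdown > 0:
            self._countdown -= 1
            return "BUSY"
        if self.index_register is None and self.free_indexes:
            self.index_register = self.free_indexes.pop(0)
        return "READY" if self.index_register is not None else "BUSY"

    def read_index(self):
        index, self.index_register = self.index_register, None
        return index


class MemoryAllocatorAgent:
    """Polls the allocator so that the requesting bus master does not."""

    def __init__(self, allocator, max_polls=100):
        self.allocator = allocator
        self.max_polls = max_polls
        self.index_register = None       # agent-side copy of the index
        self.polls = 0

    def handle_bus_request(self):
        self.allocator.request_allocation()
        for _ in range(self.max_polls):
            self.polls += 1
            if self.allocator.read_status() == "READY":
                self.index_register = self.allocator.read_index()
                return self.index_register
        return None                      # allocator never became Ready


agent = MemoryAllocatorAgent(MemoryAllocator(free_indexes=[4, 5, 6, 7]))
response = agent.handle_bus_request()
```

In this model the requesting device issues one bus access and receives the index as the response, while the repeated status reads stay between the agent and the allocator.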

[0261] After obtaining, through memory allocator 1, the cache unit index corresponding to the allocated cache unit, memory allocator agent 1 uses the cache unit index as the response to bus access request 1 (represented by (4) in Figure 19B). Control unit 1 provides the response to bus access request 1 to host command processing unit 0 via the inter-chip interconnect unit and bus 0 (represented by (5) in Figure 19B).

[0262] After obtaining the cache unit index corresponding to the allocated cache unit, host command processing unit 0 determines the address of the allocated cache unit on the bus based on the cache unit index. Host command processing unit 0 then issues bus access request 2 (represented by (6) in Figure 19B) to write data to the allocated cache unit. The inter-chip interconnect unit forwards bus access request 2 from control unit 0 to control unit 1 (represented by (7) in Figure 19B), and control unit 1 writes the data to the allocated cache unit based on bus access request 2 (represented by (8) in Figure 19B).

[0263] This completes the process of allocating a cache unit belonging to control unit 1 to host command processing unit 0 in control unit 0 and writing data to the allocated cache unit. After the data writing is completed, host command processing unit 0 can issue bus access request 3 to read the data written to the allocated cache unit.
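
The request sequence of [0259] to [0263] can be traced with a short Python model. It is only a sketch: the cache unit size `UNIT_SIZE`, the base address `DRAM1_BASE`, the back-to-back address layout, and all class and field names are assumptions, since the text does not state how the bus address is derived from the cache unit index.

```python
UNIT_SIZE = 4096                       # assumed cache unit size in bytes
DRAM1_BASE = 0x8000_0000               # assumed bus address of DRAM1


class ControlUnit1:
    """Remote side: owns memory allocator 1 and DRAM1."""

    def __init__(self):
        self.free_indexes = [0, 1, 2, 3]
        self.dram1 = {}                # bus address -> data

    def handle(self, request):
        if request["op"] == "alloc":   # steps (3) and (4)
            return self.free_indexes.pop(0)
        if request["op"] == "write":   # step (8)
            self.dram1[request["addr"]] = request["data"]
            return True
        if request["op"] == "read":
            return self.dram1.get(request["addr"])
        raise ValueError("unknown operation")


class Interconnect:
    """Forwards requests from control unit 0 to control unit 1."""

    def __init__(self, remote):
        self.remote = remote
        self.forwarded = 0

    def forward(self, request):        # steps (2), (5) and (7)
        self.forwarded += 1
        return self.remote.handle(request)


def bus_address(index):
    # Assumption: cache units are laid out back to back from DRAM1_BASE.
    return DRAM1_BASE + index * UNIT_SIZE


link = Interconnect(ControlUnit1())
index = link.forward({"op": "alloc"})                            # request 1
addr = bus_address(index)
link.forward({"op": "write", "addr": addr, "data": b"payload"})  # request 2
data = link.forward({"op": "read", "addr": addr})                # request 3
```

Each of the three bus access requests crosses the inter-chip interconnect once in this model, which is the relay cost that the shared-DRAM alternative avoids for data accesses.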

[0264] As an example, neither the CPU1 of the control unit 1 nor the host command processing unit 1 is working. All host commands issued by the host are processed by the host command processing unit 0 of the control unit 0. For host commands that need to access the NVM chip connected to the control unit 1, the host command processing unit 0 needs to allocate a cache unit through the memory allocator 1 in accordance with the processing method described above, so as to access the cache unit belonging to the control unit 1 and further write the data to the NVM chip coupled to the control unit 1.

[0265] In an alternative implementation, control unit 0 and control unit 1 are coupled to the same DRAM (e.g., in Figure 19B, control unit 0 and control unit 1 are both coupled to DRAM1, or DRAM0 and DRAM1 in Figure 19B are the same DRAM), so that control unit 0 shares the DRAM with control unit 1. After host command processing unit 0 obtains the cache unit index corresponding to the allocated cache unit through (5) in Figure 19B, since the allocated cache unit is located in DRAM1 and control unit 0 shares DRAM1 with control unit 1, host command processing unit 0 can access DRAM1 directly to write data to the allocated cache unit (represented by (6a) in Figure 19B), without relaying bus access requests through the inter-chip interconnect unit. Compared with relaying bus access requests, this improves the access efficiency of the cache unit.

[0266] The storage device disclosed in Chinese patent application No. 2023112407535 includes multiple control units, two of which are connected through an inter-chip interconnect unit. In the embodiments of this application, the operation of the multiple control units included in the storage device is similar to that described in that patent application, the content of which is incorporated herein by reference.

[0267] Although preferred embodiments of this application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of this application. Clearly, those skilled in the art can make various alterations and variations to this application without departing from its spirit and scope. Thus, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.

Claims

1. A cache unit management method, applied to a memory allocator, characterized in that the memory allocator divides the cache units it manages and allocates into multiple cache unit groups, the method including: in response to receiving a first cache unit release request from a first device, obtaining a first value stored in a release register and a first Group value stored in a Group register in the memory allocator; in response to the first value being a specified value, determining the first cache unit group corresponding to the first device based on the first Group value; and releasing all cache units in the first cache unit group.

2. The method according to claim 1, characterized in that the memory allocator maintains a cache unit index table, and each entry in the cache unit index table records the cache unit index of one of the cache units allocated by the memory allocator; the cache unit index table includes multiple index groups, each corresponding one-to-one with a cache unit group; and each index group includes multiple entries in the cache unit index table, recording the cache unit indexes of all cache units in its corresponding cache unit group.

3. The method according to claim 2, characterized in that the memory allocator maintains a read pointer and a write pointer for each index group; and after initialization and before allocating cache units, the read pointer and write pointer of each index group in the cache unit index table point to the same specified entry.

4. The method according to claim 3, characterized in that the method further includes: in response to a cache unit allocation request sent by the first device, obtaining the first Group value stored in the Group register of the memory allocator; determining the first cache unit group based on the first Group value, and reading the entry indicated by the read pointer in the index group corresponding to the first cache unit group; and allocating the first cache unit, indicated by the cache unit index recorded in the entry pointed to by the read pointer, to the first device, and updating the read pointer in the index group corresponding to the first cache unit group to indicate the next entry.

5. The method according to claim 4, characterized in that the method further includes: in response to a third cache unit release request sent by the first device indicating the release of the first cache unit, obtaining a third value stored in the release register and the first Group value stored in the Group register in the memory allocator; in response to the third value not being the specified value, determining the first cache unit group based on the first Group value; writing the first cache unit index into the entry pointed to by the write pointer in the index group corresponding to the first cache unit group to release the first cache unit, the first cache unit index being the cache unit index corresponding to the first cache unit; and updating the write pointer in the index group corresponding to the first cache unit group to indicate the next entry.

6. The method according to any one of claims 3 to 5, characterized in that the method includes: initializing the read pointer and the write pointer in the index group corresponding to the first cache unit group; and initializing each entry in the index group corresponding to the first cache unit group, so as to release all cache units in the first cache unit group.

7. The method according to any one of claims 1 to 6, characterized in that the memory allocator and the first device are located in a first control unit; the memory allocator is a first memory allocator and manages the allocation of multiple third cache units; the first control unit and a second control unit are connected through an inter-chip interconnect unit; the second control unit includes a second memory allocator and a third device; the second memory allocator manages the allocation of multiple fourth cache units; both the first control unit and the second control unit include an internal bus for interaction between internal devices; and the method includes: in response to a first bus access request sent by the third device indicating the allocation of a third cache unit, receiving the first bus access request through the internal bus of the second control unit, the inter-chip interconnect unit, and the internal bus of the first control unit; and in response to the first bus access request, allocating, by the first memory allocator, a third cache unit, and sending the index of the third cache unit corresponding to the allocated third cache unit to the third device through the internal bus of the first control unit, the inter-chip interconnect unit, and the internal bus of the second control unit.

8. The method according to claim 7, characterized in that, in response to a second bus access request sent by the third device instructing the writing of data to the third cache unit allocated to it, the second control unit sends the second bus access request to the first control unit via the internal bus of the second control unit and the inter-chip interconnect unit; and the first control unit writes data to the allocated third cache unit based on the second bus access request.

9. The method according to claim 7, characterized in that the multiple third cache units managed by the first memory allocator in the first control unit and the multiple fourth cache units managed by the second memory allocator in the second control unit are located in the same memory; after obtaining the index of the third cache unit corresponding to the third cache unit allocated to it, the third device directly accesses the third cache unit to write data to the third cache unit; and when it is necessary to read data from the third cache unit, the third device directly accesses the third cache unit to read the data.

10. A memory allocator, characterized in that the memory allocator divides the cache units it manages and allocates into multiple cache unit groups, and the memory allocator includes a release register and a Group register; in response to a first cache unit release request sent by a first device, the memory allocator obtains a first value stored in the release register and a first Group value stored in the Group register; and in response to the first value being a specified value, the memory allocator determines the first cache unit group corresponding to the first device based on the first Group value and releases all cache units in the first cache unit group.