Preloading method and apparatus for a consistency directory, and related equipment

By grouping the preload address sequence of a consistency directory and selecting a preload mode for each group, the method addresses the low preload efficiency of existing technology, enables flexible preload operation, and improves the system's resource utilization and task processing efficiency.

CN122240530A · Pending · Publication Date: 2026-06-19 · HAIGUANG INFORMATION TECH (SUZHOU) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HAIGUANG INFORMATION TECH (SUZHOU) CO LTD
Filing Date
2026-03-23
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Current consistency directory preloading is inefficient: the directory of optimal granularity cannot be selected dynamically, so preloading a large number of cache states is slow.

Method used

The method obtains a preload address sequence, groups its addresses based on address information, creates a preload task and allocates cache information for each group, and determines for each task whether it runs in the coarse-grained or the fine-grained consistency directory preload mode, so that the appropriate preload operation can be selected flexibly.

Benefits of technology

It improves the preloading efficiency of the consistency directory and can flexibly select the appropriate preloading mode under different conditions, thereby improving the system's resource utilization and task processing efficiency.


Abstract

This application provides a method, apparatus, and related equipment for preloading a consistency directory. The method includes: obtaining a preload address sequence; grouping the addresses in the preload address sequence based on address information in the preload address sequence; creating a preload task for each group of preload address sequences and allocating cache information for each group of preload address sequences; the cache information includes at least the access granularity of the group of preload address sequences and the current state of the addresses in the group of preload address sequences; determining the execution mode of the preload task corresponding to each group of preload address sequences based on the access granularity of each group of preload address sequences and the current state of the addresses in each group of preload address sequences; wherein the execution mode of the preload task includes a coarse-grained consistency directory preload mode and a fine-grained consistency directory preload mode; and performing a preload operation based on the execution mode of each preload task. This application can improve the preloading efficiency of the consistency directory.

Description

Technical Field

[0001] This application relates to the field of computer technology, specifically to a method, apparatus, and related equipment for preloading a consistency directory.

Background Technology

[0002] In multiprocessor systems, cache consistency protocols are crucial mechanisms for ensuring data accuracy and reliability. A system typically comprises multiple processor domains, each consisting of multiple processor cores, and each domain is equipped with an independent cache to accelerate data access within its cores. However, when processor cores in different domains access or modify the same data, inconsistencies in the cached data copies can easily arise. The cache consistency protocol ensures that these data copies remain consistent. The consistency directory, as a core component of the cache consistency protocol, records cache information such as the consistency status of the data copies held by the cache. Cached data consistency is managed by accessing and updating the directory.

[0003] However, the preloading efficiency of the current consistency directory still needs to be improved.

Summary of the Invention

[0004] In view of this, embodiments of this application provide a method, apparatus, and related equipment for preloading a consistency directory to improve the preloading efficiency of the consistency directory.

[0005] To achieve the above objectives, the embodiments of this application provide the following technical solutions.

[0006] In a first aspect, embodiments of this application provide a method for preloading a consistency directory, including: obtaining a preload address sequence; grouping the addresses in the preload address sequence based on address information in the preload address sequence; creating a preload task for each group of preload address sequences and allocating cache information for each group of preload address sequences, where the cache information includes at least the access granularity of the group and the current state of the addresses in the group; determining the execution mode of the preload task corresponding to each group of preload address sequences based on the access granularity of the group and the current state of the addresses in the group, where the execution mode of the preload task includes a coarse-grained consistency directory preload mode and a fine-grained consistency directory preload mode; and performing a preload operation based on the execution mode of each preload task.

[0007] Optionally, the coarse-grained consistency directory preload mode indicates preloading page by page, and the fine-grained consistency directory preload mode indicates preloading cache line by cache line.

[0008] Optionally, the step of determining the execution mode of the preload task corresponding to each group of preload address sequences based on the access granularity of the group and the current state of the addresses in the group includes: if the access granularity of the group is coarse-grained and the current state of its addresses is a preset state, the execution mode of the corresponding preload task is either the coarse-grained consistency directory preload mode or the fine-grained consistency directory preload mode; if the access granularity of the group is coarse-grained and the current state of its addresses is not a preset state, the execution mode is the coarse-grained consistency directory preload mode; if the access granularity of the group is fine-grained, the execution mode is the fine-grained consistency directory preload mode.
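The three-branch rule in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `Granularity`, `Mode`, and `allowed_modes` are hypothetical, and the text does not say which states count as "preset", so the `PRESET_STATES` value is a placeholder.

```python
from enum import Enum


class Granularity(Enum):
    COARSE = "coarse"  # page-level access
    FINE = "fine"      # cache-line-level access


class Mode(Enum):
    COARSE_PRELOAD = "coarse-grained consistency directory preload"
    FINE_PRELOAD = "fine-grained consistency directory preload"


# Assumption: the source does not list the preset states; "S" is a placeholder.
PRESET_STATES = frozenset({"S"})


def allowed_modes(granularity, state, preset_states=PRESET_STATES):
    """Return the set of execution modes permitted for one address group."""
    if granularity is Granularity.FINE:
        return {Mode.FINE_PRELOAD}
    if state in preset_states:
        # Coarse-grained access in a preset state: either mode may be chosen.
        return {Mode.COARSE_PRELOAD, Mode.FINE_PRELOAD}
    # Coarse-grained access in any other state: coarse mode only.
    return {Mode.COARSE_PRELOAD}
```

Returning a set rather than a single mode keeps the first branch faithful to the text, which permits either mode without saying how the final choice is made.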

[0009] Optionally, the step of performing the preload operation based on the execution mode of each preload task includes: performing the preload operation for preload tasks whose execution mode is the coarse-grained consistency directory preload mode; and performing the preload operation for preload tasks whose execution mode is the fine-grained consistency directory preload mode.

[0010] Optionally, the step of grouping the addresses in the preload address sequence based on the address information in the preload address sequence includes: analyzing the distribution characteristics of the addresses in the preload address sequence; and grouping addresses with the same characteristics into the same group based on the distribution characteristics.

[0011] Optionally, the addresses with the same characteristics include: addresses belonging to the same page; addresses with the same consistency directory index granularity.

[0012] Optionally, the step of obtaining the preload address sequence includes: selecting, based on preset conditions, a preload address sequence from the local memory range of the consistency master node corresponding to the consistency directory.

[0013] Optionally, the preset conditions include at least one of the following: the type of the index; whether the data access permission needs to be downgraded; the update ratio of the coarse-grained consistency directory and the fine-grained consistency directory.

[0014] Optionally, in the step of performing the preload operation based on the execution mode of each preload task, if the current state of the preload task is different from the state supported in the consistency directory, or if the entries in the consistency directory are full, then conflict handling is performed.
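The two trigger conditions named in this paragraph can be expressed as a simple check. The sketch below only detects the conflict; the text does not describe what conflict handling does, so none is shown. The function name and all parameters are hypothetical.

```python
def needs_conflict_handling(task_state, supported_states, used_entries, capacity):
    """Return True when either condition named in the text holds."""
    # Condition 1: the task's state is not one the directory supports.
    state_unsupported = task_state not in supported_states
    # Condition 2: the directory has no free entries left.
    directory_full = used_entries >= capacity
    return state_unsupported or directory_full
```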

[0015] In a second aspect, embodiments of this application provide a preloading apparatus for a consistency directory, comprising: an acquisition module, used to obtain the preload address sequence; a grouping module, used to group the addresses in the preload address sequence based on the address information in the preload address sequence; a creation module, used to create a preload task for each group of preload address sequences and allocate cache information for each group of preload address sequences, where the cache information includes at least the access granularity of the group and the current state of the addresses in the group; a determination module, used to determine the execution mode of the preload task corresponding to each group of preload address sequences based on the access granularity of the group and the current state of the addresses in the group, where the execution mode of the preload task includes a coarse-grained consistency directory preload mode and a fine-grained consistency directory preload mode; and an execution module, used to perform preload operations based on the execution mode of each preload task.

[0016] In a third aspect, embodiments of this application provide an electronic device including at least one memory and at least one processor, wherein the memory stores one or more computer-executable instructions, and the processor invokes the one or more computer-executable instructions to execute the preloading method for a consistency directory as described in the first aspect above.

[0017] In a fourth aspect, embodiments of this application provide a storage medium that stores one or more computer-executable instructions, which, when executed, implement the preloading method for a consistency directory as described in the first aspect above.

[0018] In a fifth aspect, embodiments of this application provide a computer program product including one or more computer-executable instructions, which, when executed, implement the preloading method for a consistency directory as described in the first aspect above.

[0019] This application provides a preloading method for a consistency directory, comprising: obtaining a preloading address sequence; grouping the addresses in the preloading address sequence based on address information in the preloading address sequence; creating a preloading task for each group of preloading address sequences and allocating cache information for each group of preloading address sequences; the cache information includes at least the access granularity of the group of preloading address sequences and the current state of the addresses in the group of preloading address sequences; determining the execution mode of the preloading task corresponding to each group of preloading address sequences based on the access granularity of each group of preloading address sequences and the current state of the addresses in each group of preloading address sequences; wherein the execution mode of the preloading task includes a coarse-grained consistency directory preloading mode and a fine-grained consistency directory preloading mode; and performing a preloading operation based on the execution mode of each preloading task.

[0020] As can be seen, the preloading method for the consistency directory provided in this application determines the execution mode of the preloading task for each preloading address sequence based on the access granularity of each preloading address sequence and the current state of the addresses in each preloading address sequence. The execution mode of the preloading task includes a coarse-grained consistency directory preloading mode and a fine-grained consistency directory preloading mode, which allows for flexible selection of either the coarse-grained or fine-grained consistency directory preloading mode for different situations, thereby improving the preloading efficiency of the consistency directory.

Attached Figure Description

[0021] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of this application. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0022] Figure 1 is an architecture diagram of a computer system with a multiprocessor system;
Figure 2 is an optional flowchart of the preloading method for the consistency directory provided in the embodiments of this application;
Figure 3 is a schematic diagram of the structure of the entries in the coarse-grained consistency directory and the fine-grained consistency directory provided in the embodiments of this application;
Figure 4 is a schematic diagram of the structure of the index for accessing the coarse-grained consistency directory and the fine-grained consistency directory provided in the embodiments of this application;
Figure 5 is a schematic diagram of an optional structure of the preloading apparatus for the consistency directory provided in the embodiments of this application;
Figure 6 is an optional block diagram of the electronic device provided in the embodiments of this application.

Detailed Implementation

[0023] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of the embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0024] For ease of understanding, refer to Figure 1, which shows an exemplary architecture of a computer system with a multiprocessor system. As shown in Figure 1, the computer system includes multiple processor domains 110, and a processor domain may include multiple processor cores. A processor core, such as a CPU (Central Processing Unit) core, is the computing core of the processor domain and is responsible for executing instructions and processing data.

[0025] Each processor domain has a cache 120. The cache is an intermediate layer between the processor domain and memory, used to accelerate data access by the processor cores in the processor domain. In terms of the cache hierarchy, the cache of any processor domain can be divided into private caches and shared caches. A private cache, such as an L1 cache or L2 cache, is an independent cache belonging to a single processor core in the processor domain. A shared cache, such as the LLC (Last Level Cache), belongs to the processor domain and is used by multiple processor cores in the domain to share data.

[0026] Multiple consistency master nodes (130) are core components for implementing cache consistency protocols, responsible for managing cache data consistency across processor domains. For example, when a processor core in one processor domain needs to access data in the cache of another processor domain, the consistency master node can coordinate data access to ensure that the latest state of the data is correctly obtained. Consistency master nodes, as hardware or logical components coordinating cache data consistency operations in multiprocessor systems (i.e., multiprocessor domain systems) or multicore systems, can vary in form and implementation depending on the computer system architecture. For instance, based on consistency protocols such as MESI (Modified Exclusive Shared Invalid), the bus controller can act as a consistency master node.

[0027] Multiple consistency directories 140 are used to record data caching information, including but not limited to: data owner information (e.g., the processor core and / or processor domain where the cache storing the data is located, i.e., the cache location of the data), data consistency status, etc.; the consistency directories work together with the consistency master node, so that the consistency master node obtains data owner information and consistency status and other caching information by querying the consistency directories.

[0028] Specifically, for intra-domain data access, if the data accessed by a processor core exists only in the cache of its own processor domain and not in the caches of other processor domains, the processor core can read or write it directly. For cross-domain data access, when the data accessed by a processor core is not in the cache of its own processor domain but is in the cache of another processor domain, the consistency master node queries the consistency directory to obtain the data's owner information and consistency status information. The consistency master node then transmits the data over the bus from the target processor domain (i.e., the processor domain where the cache storing the data is located) to the requesting processor domain (i.e., the processor domain where the processor core requesting access to the data is located). Subsequently, the consistency master node updates the cache information of the relevant data in the consistency directory to ensure consistency for subsequent accesses.

[0029] The computer system further includes a bus 150 and multiple system memories 160. The bus connects all processor domains, consistency master nodes, and consistency directories and serves as the main channel for data transmission; cross-domain data transfers and consistency maintenance operations are carried over the bus. System memory is the final storage location of data: when data is not cached in a processor domain or cannot be found in other processor domains, the system memory is accessed through the consistency master nodes and the bus. System memory includes, for example, the memory installed in the computer system.
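The intra-domain and cross-domain paths described in this paragraph can be sketched as a toy lookup. Everything here is a simplification for illustration: `access`, `caches`, and `directory` are hypothetical names, the directory records only an owner per address, and the way the entry is updated after a transfer (the requester becomes the recorded owner) is an assumption, since the text says only that the cache information is updated.

```python
def access(address, requester, caches, directory):
    """Return the domain that supplies the data, or None for system memory.

    caches:    domain ID -> set of addresses cached in that domain
    directory: address -> owning domain ID (stand-in for the consistency directory)
    """
    if address in caches[requester]:
        return requester  # intra-domain hit: read or write directly
    owner = directory.get(address)
    if owner is None:
        return None  # not cached in any domain: served from system memory
    # Cross-domain access: the data moves from the owner to the requester,
    # then the directory is updated for subsequent accesses.
    caches[requester].add(address)
    directory[address] = requester  # assumed update policy
    return owner
```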

[0030] The consistency directory, as a component for recording cached data information, establishes a logical mapping between itself and the cache via addresses (e.g., cache addresses). This logical mapping allows computer systems to quickly query and maintain cached data information through the consistency directory. Specifically, entries in the consistency directory are the basic units for recording cached data information. Each entry contains information associated with the cached data, enabling the consistency directory to track the state and maintain the consistency of cached data. In other words, an entry is a record unit in the consistency directory used to store cached data information, including but not limited to: the data's consistency status, owner information, and tags. The data's consistency status includes, for example, Modified, Exclusive, Shared, and Invalid states; the data's owner information, i.e., the cache location, indicates the processor core or processor domain where the cache storing the data resides; and tags are used to identify the data associated with the entry.
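As a rough illustration of the entry described above, the sketch below models an entry with the three fields the text names (consistency status, owner information, tag). The class and method names are hypothetical, and a dictionary keyed by tag stands in for whatever lookup structure a real directory uses.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


@dataclass
class DirectoryEntry:
    tag: int      # identifies the data this entry describes
    state: State  # consistency state of the cached data
    owner: int    # hypothetical ID of the core or domain holding the data


class ConsistencyDirectory:
    """Toy directory: maps a tag to the entry recording its cache information."""

    def __init__(self):
        self._entries = {}

    def update(self, tag, state, owner):
        self._entries[tag] = DirectoryEntry(tag, state, owner)

    def lookup(self, tag):
        # Returns None when the directory holds no entry for the tag.
        return self._entries.get(tag)
```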

[0031] Furthermore, a cache line is the basic storage unit of the cache, used to store a small, contiguous segment of data. The size of each cache line can be set, such as 64 bytes, 128 bytes, etc. A key parameter of a consistency directory is the number of cache lines corresponding to one entry, i.e., the number of cache lines whose cache information is recorded in one entry. On this basis, consistency directories can be divided into fine-grained consistency directories and coarse-grained consistency directories. In a fine-grained consistency directory, one entry records the cache information of one cache line; that is, one entry corresponds to one cache line. In a coarse-grained consistency directory, one entry records the cache information of multiple cache lines, so a coarse-grained consistency directory has fewer entries than a fine-grained one. However, the specific number of entries in a coarse-grained consistency directory depends on its granularity (the number of cache lines corresponding to one entry) and on the ratio between the coarse-grained consistency directory and the cache lines to be implemented. Therefore, the number of entries in a coarse-grained consistency directory is not necessarily less than the number of cache lines in the computer system.
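A short worked example of the entry-count arithmetic, assuming both directories cover the same amount of memory. The 64-byte line size is one of the sizes mentioned above; the 4 KB coarse granularity and the 1 MiB tracked range are assumptions chosen for the example, and the function names are hypothetical.

```python
def lines_per_coarse_entry(region_bytes, line_bytes):
    """Number of cache lines whose cache information one coarse entry records."""
    return region_bytes // line_bytes


def entries_needed(tracked_bytes, bytes_per_entry):
    """Entries required to cover tracked_bytes when each entry covers bytes_per_entry."""
    return -(-tracked_bytes // bytes_per_entry)  # ceiling division


LINE = 64          # bytes per cache line
PAGE = 4 * 1024    # bytes per coarse entry, assuming page granularity
TRACKED = 1 << 20  # 1 MiB of tracked memory, an arbitrary example size

print(lines_per_coarse_entry(PAGE, LINE))  # 64
print(entries_needed(TRACKED, LINE))       # 16384 fine-grained entries
print(entries_needed(TRACKED, PAGE))       # 256 coarse-grained entries
```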

[0032] Before conducting simulation verification, in order to ensure that the cache and consistency directory have the initial conditions of specific states when simulating the real operating environment, it is necessary to perform a pre-loading operation of the consistency directory, that is, to store consistency state information such as shared, exclusive, and invalid in the consistency directory in advance, so as to provide a basis for maintaining cache consistency when subsequent tasks are executed.

[0033] However, the inventors found that the current consistency directory uses a single-granularity directory structure to track the status and owner of cache lines. During cache preloading or consistency state initialization, operations are only performed on a single-granularity directory. The preloading requires first randomly determining a consistency state, and then determining the position of the cache line in the cache and consistency directory based on the determined state. This makes it difficult to fully utilize the advantages of different granularity directories, resulting in poor scalability. Furthermore, each preloading is only performed on a single-granularity directory. When performing large-scale cache state preloading, it is impossible to dynamically select the optimal granularity directory, leading to low preloading efficiency.

[0034] In view of this, embodiments of this application provide a preloading method for a consistency directory, comprising: obtaining a preloading address sequence; grouping the addresses in the preloading address sequence based on address information in the preloading address sequence; creating a preloading task for each group of preloading address sequences and allocating cache information for each group of preloading address sequences; the cache information includes at least the access granularity of the group of preloading address sequences and the current state of the addresses in the group of preloading address sequences; determining the execution mode of the preloading task corresponding to each group of preloading address sequences based on the access granularity of each group of preloading address sequences and the current state of the addresses in each group of preloading address sequences; wherein the execution mode of the preloading task includes a coarse-grained consistency directory preloading mode and a fine-grained consistency directory preloading mode; and performing a preloading operation based on the execution mode of each preloading task.

[0035] As can be seen, the preloading method for the consistency directory provided in this application determines the execution mode of the preloading task for each preloading address sequence based on the access granularity of each preloading address sequence and the current state of the addresses in each preloading address sequence. The execution mode of the preloading task includes a coarse-grained consistency directory preloading mode and a fine-grained consistency directory preloading mode, which allows for flexible selection of either the coarse-grained or fine-grained consistency directory preloading mode for different situations, thereby improving the preloading efficiency of the consistency directory.

[0036] The technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments.

[0037] Refer to Figure 2, which is an optional flowchart of the preloading method for the consistency directory provided in the embodiments of this application. As shown in Figure 2, the preloading method for the consistency directory may include the following steps:

Step S100: Obtain the preload address sequence.

[0038] In an optional implementation, the step of obtaining the preload address sequence may include: selecting a preload address sequence from the local memory range of the consistency master node corresponding to the consistency directory based on preset conditions, thereby providing a data basis for subsequent preload operations. Here, the local memory of the consistency master node refers to the memory resources owned by the consistency master node itself, used to store data, instructions, and runtime state information.

[0039] Understandably, if the selected preload address sequence contains too many preload addresses, exceeding the capacity of the consistency directory, some state information will not be traceable, thus compromising consistency. Therefore, the number of preload addresses in the selected preload address sequence should not exceed the capacity of the consistency directory.
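The capacity constraint can be enforced when the sequence is built. The sketch below is one simple way to do so and is not taken from the source: it assumes each distinct address occupies one directory entry, and the function name is hypothetical.

```python
def bound_to_capacity(addresses, directory_capacity):
    """Drop duplicate addresses, then keep at most directory_capacity of them, in order."""
    unique = list(dict.fromkeys(addresses))  # preserves first-seen order
    return unique[:directory_capacity]
```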

[0040] In a specific implementation, the preset conditions include at least one of the following: the type of index; whether the data permissions need to be downgraded; and the update ratio of the coarse-grained consistency directory and the fine-grained consistency directory. This application embodiment can select a preload address sequence from the local memory range of the consistency master node corresponding to the consistency directory based on these conditions. In a data structure, an index is an auxiliary structure used to accelerate data lookup. In the consistency directory, the index is used to quickly locate the consistency status information of a specific memory block (such as a cache line or page).

[0041] Specifically, accessing an index in the coarse-grained consistency directory requires checking or updating the consistency state information of the entire page, while accessing an index in the fine-grained consistency directory requires checking or updating the consistency state information of a single cache line. The selected preload address sequence can include the required page addresses or cache line addresses.

[0042] In a fine-grained consistency directory, when other nodes need to read or modify cache lines, the master node must relinquish its exclusive privileges. At this point, the data's privileges need to be downgraded from high to low. By including the addresses of the required cache lines in the selected preload address sequence, cache lines that trigger degradation can be identified and processed in advance, transforming passive degradation (handling during real-time response to requests) into proactive preprocessing (completing degradation operations in advance), thereby reducing real-time latency.
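The idea of completing the downgrade ahead of time can be sketched as a pass over the preload sequence. This is an illustration under assumptions: the text says only that permissions go from high to low, so treating Modified and Exclusive as "high" and Shared as the lowered state is a guess, and the function name and data layout are hypothetical.

```python
def downgrade_in_advance(line_states, preload_addresses):
    """Lower the permission of exclusively held lines before any request arrives.

    line_states maps a cache line address to "M", "E", "S" or "I".
    Returns the addresses whose permission was lowered.
    """
    downgraded = []
    for addr in preload_addresses:
        if line_states.get(addr) in ("M", "E"):
            # Assumption: the lowered permission is Shared; writing back
            # Modified data is outside the scope of this sketch.
            line_states[addr] = "S"
            downgraded.append(addr)
    return downgraded
```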

[0043] During consistency maintenance, the update ratio of the coarse-grained consistency directory to the fine-grained consistency directory, for example 70% coarse-grained and 30% fine-grained, indicates that the system prefers to maintain consistency page by page and maintains consistency cache line by cache line only for a small amount of critical data. The selected preload address sequence can include the required page addresses or cache line addresses.
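One way to honor an update ratio when building the sequence is to split the requested count between page-aligned and cache-line-aligned addresses. The sketch below assumes random selection, 4 KB pages, 64-byte lines, and a page-aligned `base`; none of these are stated as the method's selection policy, and `select_preload_addresses` is a hypothetical name. It does not prevent a fine-grained address from falling inside a page that was also selected.

```python
import random

PAGE_SIZE = 4096  # assumed page size
LINE_SIZE = 64    # assumed cache line size


def select_preload_addresses(base, size, count, coarse_ratio, seed=0):
    """Pick `count` addresses from [base, base + size): a coarse_ratio share
    page-aligned (coarse-grained), the rest cache-line-aligned (fine-grained)."""
    rng = random.Random(seed)
    n_coarse = round(count * coarse_ratio)
    pages = range(base, base + size, PAGE_SIZE)
    lines = range(base, base + size, LINE_SIZE)
    coarse = rng.sample(pages, min(n_coarse, len(pages)))
    fine = rng.sample(lines, min(count - n_coarse, len(lines)))
    return coarse, fine


# 70% coarse-grained, 30% fine-grained, as in the example ratio above.
coarse, fine = select_preload_addresses(0x100000, 1 << 20, 10, 0.7)
```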

[0044] Step S200: Based on the address information in the preloaded address sequence, group the addresses in the preloaded address sequence.

[0045] In an optional implementation, the step of grouping addresses in the preloaded address sequence based on address information may include: analyzing the distribution characteristics of addresses in the preloaded address sequence; and grouping addresses with the same characteristics into the same group based on the distribution characteristics. This application can identify addresses with specific characteristics by analyzing the distribution characteristics of addresses in the preloaded address sequence. Then, based on the distribution characteristics, addresses with the same characteristics are grouped into the same group, thereby grouping addresses in the preloaded address sequence and enabling more efficient management and processing of addresses in the preloaded address sequence.

[0046] The addresses with the same characteristics may include: addresses belonging to the same page (e.g., addresses within a continuous 4KB range); addresses with the same consistency directory index granularity (e.g., addresses with the same coarse-grained consistency directory index, or addresses with the same fine-grained consistency directory index).

[0047] Coarse-grained consistency directories are indexed by pages (typically 4KB). Addresses within a consecutive 4KB range may belong to the same page and share a consistent state. Therefore, addresses belonging to the same page can be grouped together for page-level consistency maintenance. For example, the high-order bits of the address (such as the page number) can be checked; if they are the same, they belong to the same page.
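The page-number check described above amounts to a right shift by the page-offset width. A minimal sketch, assuming 4 KB pages (so a 12-bit offset); `group_by_page` is a hypothetical name.

```python
from collections import defaultdict

PAGE_SHIFT = 12  # 4 KB pages: the low 12 bits are the offset within the page


def group_by_page(addresses):
    """Group addresses by page number (the address bits above the page offset)."""
    groups = defaultdict(list)
    for addr in addresses:
        groups[addr >> PAGE_SHIFT].append(addr)
    return dict(groups)


groups = group_by_page([0x1000, 0x1040, 0x2FC0, 0x1FFF])
# 0x1000, 0x1040 and 0x1FFF share page number 0x1; 0x2FC0 is on page 0x2.
```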

[0048] Coarse-grained consistency directory indexes point to the consistency status of remote memory pages, and pages with the same index need to be maintained synchronously. Therefore, addresses with the same coarse-grained consistency directory index can be grouped together for page-level consistency maintenance. For example, the corresponding field in the coarse-grained consistency directory index of an address can be checked; if they are the same, they are grouped together.

[0049] Fine-grained consistency directory indexes point to the state of local cache lines, and cache lines with the same index share a cache set. Therefore, addresses with the same fine-grained consistency directory index can be grouped together for cache line-level consistency maintenance. For example, you can check the fields corresponding to the fine-grained consistency directory indexes in the addresses; if they are the same, they are grouped together.
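Grouping by directory index follows the same pattern, with the index field extracted from the address. The field position and width below (64-byte lines, a 10-bit index) are assumptions for illustration only; the actual index layout is the one shown in Figure 4, which is not reproduced here.

```python
from collections import defaultdict

LINE_SHIFT = 6   # assumed 64-byte cache lines
INDEX_BITS = 10  # assumed index width (1024 directory sets)


def fine_index(addr):
    """Extract the assumed fine-grained directory index field from an address."""
    return (addr >> LINE_SHIFT) & ((1 << INDEX_BITS) - 1)


def group_by_index(addresses, index_of=fine_index):
    """Group addresses whose directory index field is the same."""
    groups = defaultdict(list)
    for addr in addresses:
        groups[index_of(addr)].append(addr)
    return dict(groups)
```

Passing a different `index_of` function would give the coarse-grained grouping described in the preceding paragraph.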

[0050] Step S300: Create a preload task for each group of preload address sequences and allocate cache information for each group of preload address sequences.

[0051] After grouping the addresses in the preloaded address sequence, a preload task can be created for each group, and cache information can be allocated to each group. In a specific implementation, a task structure can be generated for each group, and cache information can be allocated within the task structure. The cache information includes at least the access granularity of the group and the current state of the addresses in the group.

[0052] The access granularity refers to the level of granularity at which access is performed (e.g., coarse-grained or fine-grained). If the addresses in the preload address sequence are page-level, the access granularity of the preload address sequence is assigned as coarse-grained. If the addresses in the preload address sequence are cache line-level, the access granularity of the preload address sequence is assigned as fine-grained.

[0053] The current state refers to the current consistency state of the page or cache line corresponding to the table entry (consistency state can be simply referred to as state). For example, in consistency protocols such as MESI, the state may be Modified (M), Shared (S), Exclusive (E), Invalid (I), etc. If the addresses in the preload address sequence are page-level, then page-level states (such as Modified, Shared) are assigned. If the addresses in the preload address sequence are cache line-level, then cache line-level states (such as Exclusive, Invalid) are assigned.

[0054] In an optional implementation, the cache information may further include owner information, i.e., the cache location of the page or cache line corresponding to the entry, indicating the processor core or processor domain holding the page or cache line. If the addresses in the preload address sequence are page-level, a unified device ID is assigned to the addresses in the preload address sequence to ensure consistent page permissions. If the addresses in the preload address sequence are cache line-level, different device IDs can be assigned to the addresses in the preload address sequence.
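One possible shape for the task structure and its cache information is sketched below. The class and field names (`PreloadTask`, `Granularity`, `create_task`) are hypothetical, and the single-letter state strings are only a convenient encoding of the MESI-style states mentioned above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Granularity(Enum):
    COARSE = "coarse"  # page-level addresses
    FINE = "fine"      # cache line-level addresses


@dataclass
class PreloadTask:
    addresses: List[int]
    granularity: Granularity      # access granularity of the group
    state: str                    # current state, e.g. "M", "E", "S", "I"
    owner: Optional[int] = None   # optional owner (device or core ID)


def create_task(addresses, page_level, state, owner=None):
    """Build a task for one group and fill in its cache information."""
    granularity = Granularity.COARSE if page_level else Granularity.FINE
    return PreloadTask(list(addresses), granularity, state, owner)


task = create_task([0x1000, 0x1040], page_level=True, state="S", owner=3)
```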

[0055] In an optional implementation, to facilitate the execution of subsequent preload operations, the preload task can be converted into a specific hardware or software executable request. For example, a preload request including cache information can be generated based on the cache information allocated for each set of preload address sequences.

[0056] Step S400: Based on the access granularity of each group of preloaded address sequences and the current state of the addresses in each group of preloaded address sequences, determine the execution mode of the preloaded task corresponding to each group of preloaded address sequences.

[0057] The preloading task can be executed in two modes: a coarse-grained consistent directory preloading mode and a fine-grained consistent directory preloading mode. The coarse-grained consistent directory preloading mode indicates preloading on a page-by-page basis and is suitable for processing contiguous addresses or large blocks of data. The fine-grained consistent directory preloading mode indicates preloading on a cache line-by-cache line basis and is suitable for processing individual cache lines.

[0058] In an optional implementation, the step of determining the execution mode of the preload task corresponding to each preload address sequence based on the access granularity of each preload address sequence and the current state of the addresses in each preload address sequence may include: If the access granularity of the preload address sequence is coarse-grained and the current state of the addresses in the preload address sequence is a preset state, then the execution mode of the preload task corresponding to the preload address sequence is either coarse-grained consistent directory preload mode or fine-grained consistent directory preload mode. For example, the access granularity of this preloaded address sequence is coarse-grained, and the current state of the addresses in this preloaded address sequence is exclusive. Exclusivity itself is a cache line-level state, but if the data belongs to a complete page and that page is not shared by other nodes, then the entire page can be considered logically exclusive. For example, if all cache lines of a page are in the Exclusive state, and no other node holds a copy of that page, then the page can be considered "coarse-grained exclusive." Therefore, pages in the exclusive state can be placed in either the coarse-grained consistency directory or the cache lines within that page can be placed in the fine-grained consistency directory. For example, if the system primarily performs coarse-grained operations (such as large file transfers), placing exclusive pages in the coarse-grained consistency directory can reduce the number of directory entries and lower maintenance overhead. If the system requires frequent access to specific cache lines within a page (such as database indexes), placing exclusive cache lines in the fine-grained consistency directory can improve locality and concurrency performance.

[0059] If the access granularity of the preload address sequence is coarse-grained and the current state of the addresses in the preload address sequence is not a preset state, then the execution mode of the preload task corresponding to the preload address sequence is coarse-grained consistent directory preload mode. If the access granularity of the preload address sequence is fine-grained, then the execution mode of the preload task corresponding to the preload address sequence is fine-grained consistent directory preload mode.
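The three-way decision in paragraphs [0058] and [0059] can be sketched as a small function. This is an illustrative sketch: `PRESET_STATES` assumes that Exclusive is the preset state (following the example in [0058]), and the `prefer_fine` flag is a hypothetical stand-in for whatever policy the system uses to choose between the two permitted modes.

```python
PRESET_STATES = {"E"}  # assumption: Exclusive is the preset state


def determine_mode(granularity, state, prefer_fine=False):
    """Return "coarse" or "fine" for one group of preload addresses."""
    if granularity == "fine":
        return "fine"
    # Coarse granularity with a preset state: either mode is allowed,
    # so the (hypothetical) policy flag decides.
    if state in PRESET_STATES:
        return "fine" if prefer_fine else "coarse"
    # Coarse granularity with any other state.
    return "coarse"
```

For instance, a workload dominated by large transfers might leave `prefer_fine` unset, while a workload that repeatedly touches individual lines within a page might set it.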

[0060] In optional implementations, resource allocation can be increased for execution modes with a large number of preloaded tasks to improve system performance and processing efficiency. For example, if the number of preloaded tasks in the coarse-grained consistent directory preload mode is large (e.g., exceeding 70%), resource allocation for the coarse-grained consistent directory can be increased.
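As a rough sketch of the threshold check, the following function reports which execution mode, if any, accounts for more than a given share of the tasks. The function name and the use of a simple task count are assumptions made for illustration; the 70% figure is the example value from the text.

```python
from collections import Counter

THRESHOLD = 0.70  # example share from the text


def dominant_mode(task_modes, threshold=THRESHOLD):
    """Return the mode whose share of tasks exceeds the threshold, else None."""
    if not task_modes:
        return None
    mode, count = Counter(task_modes).most_common(1)[0]
    return mode if count / len(task_modes) > threshold else None
```

A caller could then increase the resources of the directory that corresponds to the returned mode.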

[0061] In an optional implementation, resource conflicts can occur when multiple preload tasks request the same physical address range. In this case, the groups can be processed in priority order. Specifically, based on the priorities defined in the simulation sequence, groups with higher importance can be processed first. For example, critical tasks or frequently accessed groups can be allocated resources preferentially. Alternatively, resource allocation can be adjusted between the coarse-grained and fine-grained consistency directories so that the allocation is more reasonable and better matches actual needs.

[0062] Step S500: Perform preloading operations based on the execution mode of each preloaded task.

[0063] In a coarse-grained consistency directory, a single entry can record cache information for multiple cache lines. In a fine-grained consistency directory, a single entry can record cache information for a single cache line. For example, as shown in Figure 3, the table structure of a coarse-grained consistency directory may include the following fields: Tag, State, SecVal, and Owner; the table structure of a fine-grained consistency directory may include the following fields: Tag, State, LDV, RSV, and Owner.

[0064] In the table structure of the coarse-grained consistency directory, the Tag is used to identify the uniqueness of a page (such as the high-order part of the page number); the State is used to indicate the consistency status of the page (such as Modified, Shared, Invalid); the SecVal is used to store the values of multiple sub-sections (Sections), reducing the number of directory entries and achieving compression; and the Owner is used to indicate the device or processor core ID that holds the permissions for the page.

[0065] In the table structure of the fine-grained consistency directory, the Tag is used to identify the uniqueness of the cache line (such as the tag part of the address); the State is used to indicate the consistency status of the cache line (such as Exclusive, Shared, Invalid); the LDV (Local Data Version) is used to indicate the local data version number for tracking modifications; the RSV (Reserved) is a reserved field for future expansion or specific protocol use; and the Owner is used to indicate the device or processor core ID that holds the permissions for the cache line.
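The two entry layouts can be written down as plain records for illustration. The field types are assumptions: the application does not give field widths, and representing SecVal as an integer bitmask with one bit per sub-section is a guess made here for the sake of a concrete example.

```python
from dataclasses import dataclass


@dataclass
class CoarseEntry:
    tag: int      # identifies the page
    state: str    # e.g. "M", "S", "I"
    sec_val: int  # assumption: bitmask, one bit per sub-section
    owner: int    # device or processor core ID


@dataclass
class FineEntry:
    tag: int      # identifies the cache line
    state: str    # e.g. "E", "S", "I"
    ldv: int      # local data version
    rsv: int      # reserved
    owner: int    # device or processor core ID


coarse = CoarseEntry(tag=0x12, state="S", sec_val=0b0101, owner=1)
fine = FineEntry(tag=0x345, state="E", ldv=0, rsv=0, owner=1)
```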

[0066] As shown in Figure 4, the fields for accessing the index of the coarse-grained consistency directory can include rpf_sec, rpf_bank, rpf_index, and rpf_tag. rpf_bank determines which storage block in the coarse-grained consistency directory the address maps to; rpf_sec determines which group within that block; rpf_index determines the specific entry within that group; and rpf_tag identifies the uniqueness of the page, with a hit determined by comparing it with a query tag. If multiple addresses have the same rpf_tag, rpf_index, rpf_bank, State, and Owner, but different SecVal values, their SecVal values can be merged into the same entry in the coarse-grained consistency directory, thus achieving compression.
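The SecVal merging rule can be illustrated with the short sketch below. It assumes, purely for illustration, that SecVal is a bitmask so that merging is a bitwise OR; the function name and the tuple layout of a request are hypothetical.

```python
def merge_sec_vals(requests):
    """Merge requests that share tag, index, bank, state, and owner.

    requests: iterable of (rpf_tag, rpf_index, rpf_bank, state, owner, sec_val).
    Returns a dict mapping the shared key to the merged SecVal.
    """
    merged = {}
    for tag, index, bank, state, owner, sec_val in requests:
        key = (tag, index, bank, state, owner)
        # Assumption: SecVal is a bitmask, so merging is a bitwise OR.
        merged[key] = merged.get(key, 0) | sec_val
    return merged


reqs = [
    (0x12, 4, 1, "S", 0, 0b0001),
    (0x12, 4, 1, "S", 0, 0b0100),  # same key, different SecVal: merged
    (0x12, 4, 1, "S", 7, 0b0010),  # different owner: separate entry
]
entries = merge_sec_vals(reqs)
```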

[0067] Fields for accessing the index of the fine-grained consistency directory can include lpf_bank, lpf_index, and lpf_tag. lpf_bank determines which storage block in the fine-grained consistency directory the address maps to; lpf_index determines the specific entry within that storage block; and lpf_tag identifies the uniqueness of a cache line, used to determine a hit by comparing it with a query tag.
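Extracting the fine-grained index fields from an address might look like the following sketch. The bit widths are assumptions chosen only to make the example concrete (64-byte cache lines, 2 bank bits, 10 index bits); the application does not specify them.

```python
OFFSET_BITS = 6   # assumption: 64-byte cache lines
BANK_BITS = 2     # assumption
INDEX_BITS = 10   # assumption


def fine_index_fields(addr):
    """Split an address into (lpf_bank, lpf_index, lpf_tag)."""
    line = addr >> OFFSET_BITS                        # drop the byte offset
    lpf_bank = line & ((1 << BANK_BITS) - 1)          # selects the storage block
    lpf_index = (line >> BANK_BITS) & ((1 << INDEX_BITS) - 1)  # entry in block
    lpf_tag = line >> (BANK_BITS + INDEX_BITS)        # compared with query tag
    return lpf_bank, lpf_index, lpf_tag


def same_fine_index(a, b):
    """True if two addresses map to the same bank and index."""
    return fine_index_fields(a)[:2] == fine_index_fields(b)[:2]
```

Two addresses for which `same_fine_index` returns true would fall into the same group under the fine-grained grouping rule.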

[0068] In a specific implementation, the step of performing preloading operations based on the execution mode of each preloading task may include: performing preloading operations on preloading tasks with a coarse-grained consistent directory preloading mode; and performing preloading operations on preloading tasks with a fine-grained consistent directory preloading mode.

[0069] In this application, to improve preloading efficiency, preloading operations can be performed first on preloading tasks in the coarse-grained consistency directory preloading mode. Specifically, preloading operations of consecutive addresses that can be mapped to the same coarse-grained consistency directory entry are merged into one operation. Each group in a coarse-grained consistency directory has a variable capacity; the number of cache lines it stores is not fixed but dynamically adjusted according to system configuration or actual needs, accommodating a minimum of 8 cache lines and a maximum of 128 cache lines. When a group in the coarse-grained consistency directory reaches its maximum capacity (e.g., full of 128 cache lines) and there are still remaining preloading tasks to process, the system can temporarily store these excess preloading tasks in the preloading task queue of the fine-grained consistency directory. This ensures that when coarse-grained consistency directory resources are scarce, preloading tasks will not be discarded due to capacity limitations but will be buffered and processed through the fine-grained consistency directory, thereby improving system resource utilization and task processing flexibility.
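The overflow behavior can be sketched as follows: lines are placed into a coarse-grained group until it reaches capacity, and the remainder is buffered in the fine-grained task queue. The function name, the use of a `deque` as the queue, and the flat list of line addresses are assumptions for illustration; only the 128-line maximum comes from the text.

```python
from collections import deque

MAX_LINES_PER_GROUP = 128  # maximum group capacity stated in the text


def preload_coarse(line_addrs, group_fill, fine_queue,
                   capacity=MAX_LINES_PER_GROUP):
    """Place lines into a coarse-grained group and spill the excess.

    Returns the new fill level of the group. Lines that do not fit are
    appended to fine_queue instead of being discarded.
    """
    room = max(capacity - group_fill, 0)
    accepted, overflow = line_addrs[:room], line_addrs[room:]
    fine_queue.extend(overflow)
    return group_fill + len(accepted)


fine_queue = deque()
fill = preload_coarse(list(range(130)), group_fill=0, fine_queue=fine_queue)
```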

[0070] Understandably, since the coarse-grained consistency directory does not support false positives (i.e., it cannot produce incorrect positive judgments), the state of the cache lines in the upstream processor needs to be consistent with the state in the coarse-grained consistency directory to ensure data consistency.

[0071] After the preload tasks in the coarse-grained consistent directory preload mode are processed, the preload tasks in the fine-grained consistent directory preload mode can be processed next. Since shared-mode is more common in multi-core systems, to improve overall system performance, the preload operations for shared-mode preload tasks can be performed first.

[0072] Specifically, when the state of an entry in the fine-grained consistency directory is F (indicating free or allocatable), the system uses the random bit of the LDV / RSV field to randomly select a cache line in an upstream processor within the legal range, and determines the initial state of the preload (such as the S state or the I state) based on the random value. For example, if the random bit indicates that the preload is in the S state, the data is loaded into the target cache and marked as shared, while the directory information is updated to maintain consistency.
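A loose software model of this selection step is sketched below. It is an assumption-heavy illustration: the random bit of the LDV / RSV field is modeled with a `random.Random` instance, and the function name and return shape are hypothetical.

```python
import random


def pick_initial_state(entry_state, upstream_lines, rng):
    """For a free entry (state "F"), pick an upstream line and initial state.

    Returns (line, state) with state "S" or "I", or None if the entry
    is not free.
    """
    if entry_state != "F":
        return None
    line = rng.choice(upstream_lines)              # random line in legal range
    state = "S" if rng.getrandbits(1) else "I"     # random bit picks the state
    return line, state


rng = random.Random(0)  # seeded only to make the example repeatable
choice = pick_initial_state("F", [0x00, 0x40, 0x80, 0xC0], rng)
```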

[0073] In the specific implementation, during the preloading operation step based on the execution mode of each preloading task, if the current state of the preloading task differs from the states supported in the consistency directory, or if the entries in the consistency directory are full, conflict handling is performed. Considering that entries in the fine-grained consistency directory only support specific states (e.g., only Invalid, Exclusive, Shared), but the states required by preloading tasks in the coarse-grained consistency directory may exceed its supported range (e.g., S1 or F1), forcibly writing unsupported states into the fine-grained consistency directory may cause consistency errors. Therefore, conflict handling is necessary. For example, if S1 / F1 is a variant of a state supported in the fine-grained consistency directory, it can be mapped to a state supported in the fine-grained consistency directory, and sub-state information can be recorded through additional fields (e.g., LDV / RSV). If the states are completely incompatible, preloading can be rejected and an error reported.
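The state-compatibility check can be sketched as a lookup with a fallback. The supported set follows the example in the text; the variant table passed by the caller is hypothetical, since the application does not define how S1 or F1 map to supported states.

```python
FINE_SUPPORTED = {"I", "E", "S"}  # example set from the text


class IncompatibleStateError(Exception):
    """Raised when a state cannot be mapped to a supported one."""


def resolve_state(state, variant_map):
    """Return (supported_state, sub_state) or raise if no mapping exists.

    variant_map: caller-supplied table such as {"S1": ("S", 1)}; the
    sub_state would be recorded in an additional field (e.g. LDV / RSV).
    """
    if state in FINE_SUPPORTED:
        return state, 0
    if state in variant_map:
        return variant_map[state]
    # Completely incompatible: reject the preload and report an error.
    raise IncompatibleStateError(state)
```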

[0074] If a group in the fine-grained consistency directory is full and cannot accept new preload tasks, the tasks will remain in the directory, potentially causing preload delays or data inconsistencies. Therefore, conflict resolution is necessary. For example, an LRU (Least Recently Used) or random replacement algorithm can be used to evict an entry in the group to free up space. Alternatively, the task can be moved to another available group within the same directory.
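LRU eviction from a full group can be illustrated with an ordered dictionary. The `FineGroup` class and its method names are hypothetical, and the number of entries per group is a parameter because the application does not state it.

```python
from collections import OrderedDict


class FineGroup:
    """One group of a fine-grained directory with LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.entries = OrderedDict()  # tag -> state, least recent first

    def touch(self, tag):
        """Mark an existing entry as most recently used."""
        self.entries.move_to_end(tag)

    def insert(self, tag, state):
        """Insert an entry; return the evicted tag, or None if none."""
        evicted = None
        if tag not in self.entries and len(self.entries) >= self.ways:
            evicted, _ = self.entries.popitem(last=False)  # evict LRU entry
        self.entries[tag] = state
        self.entries.move_to_end(tag)
        return evicted
```

A random replacement policy would differ only in how the victim is chosen.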

[0075] As can be seen, the preloading method for the consistency directory provided in this application determines the execution mode of the preloading task for each preloading address sequence based on the access granularity of each preloading address sequence and the current state of the addresses in each preloading address sequence. The execution mode of the preloading task includes a coarse-grained consistency directory preloading mode and a fine-grained consistency directory preloading mode, which allows for flexible selection of either the coarse-grained or fine-grained consistency directory preloading mode for different situations, thereby improving the preloading efficiency of the consistency directory.

[0076] The following describes the preloading device for the consistency directory provided in the embodiments of this application. The device described below can be regarded as the software or hardware functional modules required to implement the preloading method for the consistency directory provided in the embodiments of this application. The description of the device below and the description of the method above can be read with reference to each other.

[0077] In an optional implementation, Figure 5 shows an exemplary schematic diagram of an optional structure of the consistency directory preloading device provided in an embodiment of this application. The device is used to implement the consistency directory preloading method provided in an embodiment of this application. As shown in Figure 5, the preloading device for the consistency directory may include: an acquisition module 11, used to obtain the preload address sequence; a grouping module 12, used to group the addresses in the preload address sequence based on the address information in the preload address sequence; a creation module 13, used to create a preload task for each group of preload address sequences and allocate cache information for each group of preload address sequences, where the cache information includes at least the access granularity of the group of preload address sequences and the current state of the addresses in the group of preload address sequences; a determination module 14, used to determine the execution mode of the preload task corresponding to each group of preload address sequences based on the access granularity of each group of preload address sequences and the current state of the addresses in each group of preload address sequences, where the execution mode of the preload task includes a coarse-grained consistent directory preload mode and a fine-grained consistent directory preload mode; and an execution module 15, used to perform preload operations based on the execution mode of each preload task.

[0078] Optionally, the coarse-grained consistent directory preloading mode is used to indicate preloading on a page-by-page basis, and the fine-grained consistent directory preloading mode is used to indicate preloading on a cache line-by-cache line basis.

[0079] Optionally, the determining module 14 is used to determine the execution mode of the preload task corresponding to each preload address sequence based on the access granularity of each group of preload address sequences and the current state of the addresses in each group of preload address sequences, including: If the access granularity of the preload address sequence is coarse-grained and the current state of the addresses in the preload address sequence is a preset state, then the execution mode of the preload task corresponding to the preload address sequence is either coarse-grained consistent directory preload mode or fine-grained consistent directory preload mode. If the access granularity of the preload address sequence is coarse-grained and the current state of the addresses in the preload address sequence is not a preset state, then the execution mode of the preload task corresponding to the preload address sequence is coarse-grained consistent directory preload mode. If the access granularity of the preload address sequence is fine-grained, then the execution mode of the preload task corresponding to the preload address sequence is fine-grained consistent directory preload mode.

[0080] Optionally, the execution module 15 is used to perform preloading operations based on the execution mode of each preloading task, including: For preload tasks with the execution mode of coarse-grained consistent directory preload mode, perform the preload operation; For preload tasks with the execution mode of fine-grained consistent directory preload mode, perform the preload operation.

[0081] Optionally, the grouping module 12 is used to group the addresses in the preloaded address sequence based on the address information in the preloaded address sequence, including: Analyze the distribution characteristics of addresses in the preloaded address sequence; Based on the distribution characteristics, addresses with the same characteristics are grouped together.

[0082] Optionally, the addresses with the same characteristics include: Addresses belonging to the same page; Addresses with the same consistent directory index granularity.

[0083] Optionally, the acquisition module 11 is used to acquire a preload address sequence, including: Based on preset conditions, a preloaded address sequence is selected from the local memory range of the consistency master node corresponding to the consistency directory.

[0084] Optionally, the preset conditions include at least one of the following: the type of the index; whether the data access permission needs to be downgraded; and the update ratio between the coarse-grained consistency directory and the fine-grained consistency directory.

[0085] Optionally, the execution module 15 is further used to perform conflict handling during the preloading operation based on the execution mode of each preloading task: if the current state of the preloading task is different from the states supported by the consistency directory, or if the entries in the consistency directory are full, conflict handling is performed.

[0086] This application also provides an electronic device that may include at least one memory and at least one processor. The memory stores one or more computer-executable instructions, and the processor invokes the one or more computer-executable instructions to execute the preloading method for the consistency directory provided in this application.

[0087] As an optional implementation, Figure 6 is an optional block diagram of the electronic device provided in the embodiments of this application. As shown in Figure 6, the electronic device may include: at least one processor 21, at least one communication interface 22, at least one memory 23, and at least one communication bus 24.

[0088] In this embodiment, the number of processor 21, communication interface 22, memory 23 and communication bus 24 is at least one, and processor 21, communication interface 22 and memory 23 communicate with each other through communication bus 24.

[0089] Optionally, the processor 21 may be a CPU (Central Processing Unit), GPU (Graphics Processing Unit), NPU (Neural-network Processing Unit), FPGA (Field Programmable Gate Array), TPU (Tensor Processing Unit), AI chip, ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement the embodiments of this application.

[0090] Optionally, the communication interface 22 can be an interface for a communication module used for network communication.

[0091] The memory 23 may include high-speed RAM, and may also include non-volatile memory, such as at least one disk storage device. The memory 23 stores one or more computer-executable instructions, which the processor 21 invokes to execute the preloading method for the consistency directory provided in the embodiments of this application.

[0092] This application also provides a storage medium that stores one or more computer-executable instructions. When the one or more computer-executable instructions are executed, the preloading method for the consistency directory provided in this application is implemented.

[0093] This application also provides a computer program product that may include one or more computer-executable instructions. When the one or more computer-executable instructions are executed, they implement the preloading method for the consistency directory provided in this application.

[0094] The foregoing describes multiple embodiment schemes provided by the embodiments of this application. The optional methods described in each embodiment scheme can be combined and cross-referenced with each other without conflict, thereby extending to a variety of possible embodiment schemes. These can all be considered as the embodiment schemes disclosed and published by the embodiments of this application.

[0095] While the embodiments disclosed above are described in this application, this application is not limited thereto. Any person skilled in the art can make various modifications and alterations without departing from the spirit and scope of this application; therefore, the scope of protection of this application should be determined by the scope defined in the claims.

Claims

1. A method for preloading a consistency directory, characterized in that it includes: obtaining a preload address sequence; grouping the addresses in the preload address sequence based on address information in the preload address sequence; creating a preload task for each group of preload address sequences and allocating cache information for each group of preload address sequences, wherein the cache information includes at least the access granularity of the group of preload address sequences and the current state of the addresses in the group of preload address sequences; determining the execution mode of the preload task corresponding to each group of preload address sequences based on the access granularity of each group of preload address sequences and the current state of the addresses in each group of preload address sequences, wherein the execution mode of the preload task includes a coarse-grained consistent directory preload mode and a fine-grained consistent directory preload mode; and performing a preload operation based on the execution mode of each preload task.

2. The preloading method for the consistency directory according to claim 1, characterized in that the coarse-grained consistent directory preload mode is used to indicate preloading on a page-by-page basis, while the fine-grained consistent directory preload mode is used to indicate preloading on a cache line-by-cache line basis.

3. The preloading method for the consistency directory according to claim 1, characterized in that, The step of determining the execution mode of the preload task corresponding to each preload address sequence based on the access granularity of each preload address sequence and the current state of the addresses in each preload address sequence includes: If the access granularity of the preload address sequence is coarse-grained and the current state of the addresses in the preload address sequence is a preset state, then the execution mode of the preload task corresponding to the preload address sequence is either coarse-grained consistent directory preload mode or fine-grained consistent directory preload mode. If the access granularity of the preload address sequence is coarse-grained and the current state of the addresses in the preload address sequence is not a preset state, then the execution mode of the preload task corresponding to the preload address sequence is coarse-grained consistent directory preload mode. If the access granularity of the preload address sequence is fine-grained, then the execution mode of the preload task corresponding to the preload address sequence is fine-grained consistent directory preload mode.

4. The preloading method for the consistency directory according to claim 3, characterized in that, The steps for performing pre-loading operations based on the execution mode of each pre-loading task include: For preload tasks with the execution mode of coarse-grained consistent directory preload mode, perform the preload operation; For preload tasks with the execution mode of fine-grained consistent directory preload mode, perform the preload operation.

5. The preloading method for the consistency directory according to claim 1, characterized in that, The step of grouping addresses in the preloaded address sequence based on address information in the preloaded address sequence includes: Analyze the distribution characteristics of addresses in the preloaded address sequence; Based on the distribution characteristics, addresses with the same characteristics are grouped together.

6. The preloading method for a consistency directory according to claim 5, characterized in that, The addresses with the same characteristics include: Addresses belonging to the same page; Addresses with the same consistent directory index granularity.

7. The preloading method for a consistency directory according to claim 1, characterized in that, The step of obtaining the preload address sequence includes: Based on preset conditions, a preloaded address sequence is selected from the local memory range of the consistency master node corresponding to the consistency directory.

8. The preloading method for a consistency directory according to claim 7, characterized in that the preset conditions include at least one of the following: the type of the index; whether the data access permission needs to be downgraded; and the update ratio between the coarse-grained consistency directory and the fine-grained consistency directory.

9. The preloading method for a consistency directory according to claim 1, characterized in that, in the step of performing preloading operations based on the execution mode of each preloading task, if the current state of the preloading task is different from the states supported by the consistency directory, or if the entries in the consistency directory are full, then conflict handling is performed.

10. A preloading device for a consistency directory, characterized in that it includes: an acquisition module, used to obtain the preload address sequence; a grouping module, used to group the addresses in the preload address sequence based on the address information in the preload address sequence; a creation module, used to create a preload task for each group of preload address sequences and to allocate cache information for each group of preload address sequences, wherein the cache information includes at least the access granularity of the group of preload address sequences and the current state of the addresses in the group of preload address sequences; a determination module, used to determine the execution mode of the preload task corresponding to each group of preload address sequences based on the access granularity of each group of preload address sequences and the current state of the addresses in each group of preload address sequences, wherein the execution mode of the preload task includes a coarse-grained consistent directory preload mode and a fine-grained consistent directory preload mode; and an execution module, used to perform preload operations based on the execution mode of each preload task.

11. An electronic device, characterized in that it includes at least one memory and at least one processor, the memory storing one or more computer-executable instructions, and the processor invoking the one or more computer-executable instructions to perform the preloading method for the consistency directory as described in any one of claims 1 to 9.

12. A storage medium, characterized in that, The storage medium stores one or more computer-executable instructions, which, when executed, implement the preloading method of the consistency directory as described in any one of claims 1 to 9.

13. A computer program product, characterized in that it includes one or more computer-executable instructions, which, when executed, implement the preloading method for the consistency directory as described in any one of claims 1 to 9.