A Ceph-based hierarchical cache method, device, equipment and product
By optimizing data processing in the caching and storage layers of Ceph's tiered cache using data segments as the granularity of operations, the problem of low performance in Ceph's tiered cache is solved, resulting in more efficient data read and write performance and reduced write latency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA TELECOM CLOUD TECH CO LTD
- Filing Date
- 2022-07-29
- Publication Date
- 2026-06-26
AI Technical Summary
Ceph's tiered caching scheme operates at the object level, which can lead to lower performance in certain scenarios, especially in RBD cache pools and EC pools, where performance can be even lower than not using tiered caching at all.
A Ceph-based hierarchical caching approach is adopted, with the data segment of an object as the operation granularity. The data segment is cached in memory through the caching layer and the metadata is stored on disk. The memory settings ensure that the data is up-to-date and can be recovered in the event of a power failure. The caching layer and the storage layer work together to ensure the full amount of data and optimize the operation granularity to improve performance.
It effectively reduces the amplification of user read and write operations, reduces write data latency, and improves the performance of tiered caching.
Figure CN115454888B_ABST
Description
Technical Field
[0001] This invention relates to the field of data processing, and specifically to a Ceph-based hierarchical caching method, apparatus, device, and product.
Background Technology
[0002] Ceph is a distributed storage system whose decentralized design offers better performance, reliability, and scalability. Currently, Ceph's tiered caching scheme operates at the object level, caching the entire object based on different patterns. In certain scenarios, this can even result in lower performance than not using tiered caching at all.
[0003] Therefore, improving the performance of Ceph's tiered storage is an important issue that urgently needs to be addressed.
Summary of the Invention
[0004] In view of this, embodiments of the present invention provide a Ceph-based hierarchical caching method, apparatus, device, and product to solve the problem of low performance of Ceph-based hierarchical caching.
[0005] According to a first aspect, embodiments of the present invention provide a Ceph-based hierarchical caching method applied to a server, the method comprising:
[0006] The caching layer determines the client's user request and the requested data segment within that request.
[0007] Determine the object corresponding to the requested data segment and the layer where the corresponding object resides; the cache layer caches the object's data segment in memory and stores metadata on disk, and based on the data segment's popularity, the cache layer passes the cached data segment in memory to the storage layer;
[0008] Based on the identified objects and layers, the requested data segments are read and written.
[0009] In conjunction with the first aspect, in the first embodiment of the first aspect, when the user request is a user write request, the method includes:
[0010] The client's user write request is determined by the caching layer, and the user write request is formatted as a log; the user write request carries the object to be written, the data segment to be written, and the data to be written.
[0011] Based on the logs, determine the objects to be written and their metadata;
[0012] Write logs to a log object; the log object is set in a cache pool;
[0013] Write the log object and its metadata to the cache layer's disk;
[0014] Once the log write is confirmed to be successful, the memory is updated based on the log content, a successful write flag is generated, and the successful write flag is returned to the client.
[0015] In conjunction with the first aspect, in the second embodiment of the first aspect, when the user request is a user read request, the method includes:
[0016] The cache layer determines the client's user read request and identifies the requested data segment within that read request.
[0017] Determine the object corresponding to the requested data segment and the layer in which the corresponding object resides;
[0018] If the layer where the corresponding object resides is only the cache layer, the requested data segment is determined from the memory of the cache layer, and the determined requested data segment is returned to the client.
[0019] If the layers where the corresponding object resides include the storage layer, the requested data segment is retrieved from the cache layer's memory as the first data segment, the requested data segment is retrieved from the storage layer as the second data segment, the first and second data segments are concatenated to obtain the requested data segment, and the concatenated requested data segment is returned to the client.
[0020] In conjunction with the first embodiment of the first aspect, in the third embodiment of the first aspect, when the user request is a user metadata request, the method includes:
[0021] The caching layer determines the client's user metadata request and identifies the requested data segment within that user metadata request.
[0022] Determine the object corresponding to the requested data segment and the layer in which the corresponding object resides;
[0023] Based on the identified objects and layers, metadata is read and written.
[0024] In conjunction with the first aspect, in the fourth embodiment of the first aspect, the method further includes the following steps:
[0025] Identify the log object and log corresponding to the transmitted data segment, and delete the corresponding log from the corresponding log object; the data segment is transferred from the cache layer's memory to the storage layer.
[0026] In conjunction with the first aspect, in the fifth embodiment of the first aspect, the method further includes the following steps:
[0027] Confirm the successful restart signal and identify all log objects in the cache layer corresponding to the restarted node;
[0028] Parse the logs from the identified log objects and reconstruct the memory data segments based on the parsed logs.
[0029] According to a second aspect, embodiments of the present invention also provide a Ceph-based hierarchical caching device applied to a server, the device comprising:
[0030] The first determining module is used to determine the client's user request and the data segment requested in the user request through the cache layer;
[0031] The second determination module is used to determine the object corresponding to the requested data segment and the layer where the corresponding object is located; the cache layer caches the data segment of the object in memory and stores the metadata on disk, and based on the popularity of the data segment, the cache layer passes the data segment cached in memory to the storage layer.
[0032] The data read/write module is used to read and write the requested data segments based on the determined objects and layers.
[0033] According to a third aspect, embodiments of the present invention also provide an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of any of the Ceph-based hierarchical caching methods described above.
[0034] According to a fourth aspect, embodiments of the present invention also provide a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the Ceph-based hierarchical caching method as described above.
[0035] According to a fifth aspect, embodiments of the present invention also provide a computer program product, including a computer program that, when executed by a processor, implements the steps of any of the Ceph-based hierarchical caching methods described above.
[0036] The Ceph-based hierarchical caching method, apparatus, device, and product provided by this invention no longer operate on the entire object, but on the data segment of the object. By optimizing the operation granularity of Ceph-based hierarchical caching, and by caching the object's data segment in memory through the cache layer and storing metadata on disk through the cache layer, the memory settings ensure that the cached data is up-to-date and that the memory data is recoverable after power failure. The cache layer and storage layer together ensure that all data is available. This application can effectively reduce the amplification of user read and write operations and reduce the latency when users write data, thereby improving the performance of hierarchical caching.
Attached Figure Description
[0037] The features and advantages of the invention will be more clearly understood by referring to the accompanying drawings, which are schematic and should not be construed as limiting the invention in any way. In the drawings:
[0038] Figure 1 is the first flowchart of the Ceph-based hierarchical caching method provided by the present invention.
[0039] Figure 2 is the second flowchart of the Ceph-based hierarchical caching method provided by the present invention.
[0040] Figure 3 is the third flowchart of the Ceph-based hierarchical caching method provided by the present invention.
[0041] Figure 4 is the fourth flowchart of the Ceph-based hierarchical caching method provided by the present invention.
[0042] Figure 5 is the fifth flowchart of the Ceph-based hierarchical caching method provided by the present invention.
[0043] Figure 6 is a schematic diagram of the structure of the Ceph-based hierarchical caching device provided by the present invention.
[0044] Figure 7 is a schematic diagram of the structure of the electronic device provided by the present invention.
Detailed Implementation
[0045] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0046] Ceph is a distributed storage system whose decentralized design provides better performance, reliability, and scalability. The core and foundation of Ceph is the underlying RADOS (Reliable Autonomic Distributed Object Store) cluster. RADOS provides object storage services; client data is stored in the RADOS cluster as objects. Therefore, the lowest-level storage unit in Ceph is the object, and each object contains both raw data and metadata. A Ceph client uses a hash algorithm to calculate which Placement Group (PG) an object belongs to, and then uses the CRUSH algorithm to calculate which Object Storage Daemon (OSD) that PG resides on. The client then communicates directly with that OSD to transfer data.
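The two-step placement described above (object to PG, then PG to OSD) can be illustrated with a minimal sketch. It is not Ceph's implementation: SHA-256 stands in for Ceph's object hash, and a simple highest-score selection (rendezvous hashing) stands in for CRUSH. The function names `object_to_pg` and `pg_to_osds` are hypothetical.

```python
import hashlib


def object_to_pg(object_name: str, pg_num: int) -> int:
    """Step 1: hash the object name onto one of pg_num placement groups.

    Assumption: SHA-256 is only a stand-in for Ceph's own hash function.
    """
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % pg_num


def pg_to_osds(pg_id: int, osd_ids: list[int], replicas: int) -> list[int]:
    """Step 2: choose `replicas` distinct OSDs for a placement group.

    Assumption: this is rendezvous hashing, used here in place of CRUSH.
    Every OSD gets a pseudo-random score for this PG; the top scores win.
    """
    def score(osd_id: int) -> int:
        key = f"{pg_id}:{osd_id}".encode("utf-8")
        return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

    return sorted(osd_ids, key=score, reverse=True)[:replicas]


pg = object_to_pg("object_A", pg_num=128)
osds = pg_to_osds(pg, osd_ids=[0, 1, 2, 3, 4, 5], replicas=3)
print(f"object_A -> PG {pg} -> OSDs {osds}")
```

Because both steps are pure functions of the object name and the cluster description, any client can compute the location without consulting a central lookup table, which is the property the paragraph relies on.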
[0047] Stored data is not accessed uniformly: most applications touch only a small portion of the data, and this frequently accessed portion is called "hot" data. Please refer to Figure 1. Ceph itself provides a tiered caching scheme: the cache tier provides better I/O performance for Ceph clients, while data is stored in the storage tier. In actual deployments, pools created on relatively fast/expensive devices (such as SSDs) are commonly used as the cache tier, while erasure coding (EC) pools or pools created on relatively slow/inexpensive devices (such as HDDs) are used as economical storage pools. Ceph's object manager is responsible for the location of objects, and the tiering agent determines when to migrate data between the two pools. Therefore, the cache and storage tiers are completely transparent to Ceph clients. Frequently accessed hot data in the backend data pool can be promoted to a high-performance, small-capacity cache tier, while the large volume of cold data stays in an economical, large-capacity storage pool. The cache tier improves the performance of accessing hot data while reducing storage costs.
[0048] The tiering agent is therefore essentially a data migration mechanism for the tiered cache. It is responsible for automatically migrating data between the cache layer and the storage layer. It allows cold data in the cache layer to be migrated freely and safely to the storage layer, saving storage costs, and it can automatically migrate hot data from the storage layer to the cache layer, improving the performance of accessing hot data. Tiered caching can provide users with better I/O performance, and administrators can configure how the data is migrated.
[0049] However, Ceph's tiered caching design has a flaw: its basic unit of operation is the object. The cache pool stores complete objects, and cache promotion, flushing, and eviction operations are all performed at the object level. In certain scenarios, such as when the RBD (RADOS Block Device) cache pool is a replicated data pool or an EC (erasure coding) pool, performance is even lower than without a tiered cache.
[0050] The following section explains several main patterns of tiered caching:
[0051] Writeback mode: For write requests, once the write completes at the cache layer, the cache layer's agent thread (a background thread) is responsible for flushing the data to the storage layer. For read requests, the cache layer is checked first. On a hit, the data is read directly from the cache layer. On a miss, the request can be redirected to the storage layer; if the object has been accessed recently, it is considered hot data and can be promoted to the cache layer.
[0052] Read-only mode: Write requests are redirected straight to the storage layer. Read requests that hit the cache layer are served directly; on a miss, the object is first promoted from the storage layer to the cache layer to complete the read, so that the next read hits the cache.
[0053] Read proxy mode: Similar to writeback mode, except that on a read the cache layer retrieves the data from the storage layer and responds to the client directly without saving the retrieved object. In other words, both read and write requests are proxied: rather than forwarding them, the cache layer performs the operation on behalf of the client and does not save the data itself.
[0054] As can be seen, in a tiered cache, when a client accesses an object, it prioritizes accessing the object in the cache layer. If a hot object is in the storage layer, it will be promoted to the cache layer for further processing. Cold objects in the cache layer are flushed to the storage layer according to the eviction algorithm. Furthermore, regardless of the mode, Ceph's tiered cache operates at the object level. This leads to significant cache amplification; for example, if a client wants to read a 4KB data fragment from a 4MB object, the entire object must first be read from the storage layer and then cached before being returned to the client. Simultaneously, memory (cache) capacity utilization is low, as the cache layer caches the entire object, while in reality, hot data may only be a portion of the object.
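The amplification in the 4KB-from-4MB example can be made concrete with a short calculation. This is only an illustrative sketch: the helper `read_amplification` is hypothetical, KB and MB are taken as binary units, and the segment-level figure assumes the idealized case in which exactly the requested bytes are fetched.

```python
KIB = 1024
MIB = 1024 * KIB


def read_amplification(requested_bytes: int, fetched_bytes: int) -> float:
    """Ratio of bytes moved from the storage layer to bytes the client asked for."""
    return fetched_bytes / requested_bytes


# Object-level caching: the whole 4 MiB object is promoted to serve a 4 KiB read.
object_level = read_amplification(requested_bytes=4 * KIB, fetched_bytes=4 * MIB)

# Segment-level caching (idealized): only the requested 4 KiB segment is fetched.
segment_level = read_amplification(requested_bytes=4 * KIB, fetched_bytes=4 * KIB)

print(object_level, segment_level)  # 1024.0 1.0
```

The same ratio applies to cache memory: under object-level caching the 4 KiB of hot data occupies 4 MiB of cache capacity.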
[0055] The Ceph-based hierarchical caching method of the present invention is described below with reference to Figure 1. This method is applied on the server side. First, a Ceph-based hierarchical cache is built on the server side: after the cluster is deployed, two pools are configured, one serving as the storage layer and the other as the caching layer, and the two pools are associated. The method includes:
[0056] S10. Determine the client's user request through the caching layer. In this application, the caching layer is responsible for directly interacting with the client, receiving the user request from the client, and determining the requested data segment in the user request when the user request reaches the caching layer.
[0057] In this method, the user request includes read / write requests. Regardless of the type of request, these requests contain information about the data segment of this request. For example, if the user request is a write request, more specifically, the request is to write data ABCD sequentially to positions 0-3 of object (Object) A. That is, the data segment requested in the user write request is positions 0-3 of object A, the object to be written is object A, and the data to be written is ABCD.
[0058] S20. Determine the object corresponding to the requested data segment and the layer where the corresponding object resides. The cache layer caches the object's data segment in memory and stores metadata on disk. Furthermore, based on how frequently each data segment is accessed (its hotness), the cache layer transfers cached data segments from memory to the storage layer; for example, it migrates infrequently accessed (cold) data from the cache layer to the storage layer.
[0059] S30. Based on the determined object and layer, read and write the requested data segment.
[0060] The Ceph-based hierarchical caching method of this invention no longer operates on the entire object, but on the data segment of the object. By optimizing the operation granularity of Ceph-based hierarchical caching, and by caching the object's data segment in memory (i.e., caching the object's data in memory) and storing metadata on disk in the cache layer, the memory settings ensure that the cached data is up-to-date and that the memory data is recoverable after power failure. The cache layer and the storage layer together ensure that all data is available. This application can effectively reduce the amplification of user read and write operations and reduce the latency when users write data, thereby improving the performance of hierarchical caching.
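Steps S10 to S30 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `SegmentRequest`, `Layer`, and `locate` are hypothetical, a segment is modeled as an (offset, length) byte range, and cached ranges are tracked in a plain dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    CACHE = "cache"
    STORAGE = "storage"


@dataclass(frozen=True)
class SegmentRequest:
    """S10: what the cache layer extracts from a user request."""
    op: str            # "read" or "write"
    object_name: str
    offset: int        # first byte position of the segment in the object
    length: int        # number of bytes in the segment
    data: bytes = b""  # payload, only used for writes


def locate(req: SegmentRequest,
           cached: dict[str, list[tuple[int, int]]]) -> set[Layer]:
    """S20: decide which layer(s) hold the requested segment.

    `cached` maps an object name to the (offset, length) ranges held in
    cache-layer memory. Writes always target the cache layer.
    """
    if req.op == "write":
        return {Layer.CACHE}
    wanted = set(range(req.offset, req.offset + req.length))
    held: set[int] = set()
    for off, length in cached.get(req.object_name, []):
        held.update(range(off, off + length))
    layers: set[Layer] = set()
    if wanted & held:
        layers.add(Layer.CACHE)    # at least part of the segment is cached
    if wanted - held:
        layers.add(Layer.STORAGE)  # at least part must come from storage
    return layers


# The example from the text: write "ABCD" to positions 0-3 of object A.
write_req = SegmentRequest("write", "A", offset=0, length=4, data=b"ABCD")
read_req = SegmentRequest("read", "A", offset=0, length=4)
print(locate(write_req, cached={}))
print(locate(read_req, cached={"A": [(0, 2)]}))
```

S30 then dispatches on the returned set: a result containing only the cache layer is served from memory, while a result that includes the storage layer requires fetching the missing part of the segment.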
[0061] The Ceph-based hierarchical caching method of the present invention is described below with reference to Figure 2, taking data writing as an example. In these embodiments, when the user request is a write request, the method includes:
[0062] A10. Determine the client's user write request through the caching layer, and format / serialize the user write request into a log. In this embodiment, the user write request carries information such as the object to be written, the data segment to be written, and the data to be written.
[0063] In this method, when a user write request reaches the cache layer, the cache layer formats/serializes the user write request into a log. All data to be written can be considered "hot" data. In this application, the objects to be written are all located in the cache layer, and the data segment of a user write request can only belong to an object located entirely in the cache layer. The granularity of cache layer operations is no longer the entire object, but rather the data segment of the object.
[0064] A20. Based on the logs, determine the objects to be written and the metadata.
[0065] A30. Write the logs to a log object (Log_Object). In this application, the log object is set in a cache pool and is an internal object of the cache pool, meaning the log object is not visible to the client.
[0066] In this application, Ceph is also provided with a logging module (CacheLogger), which is deployed in the cache pool. Specifically, the logging module can allocate the log object used for this log write.
[0067] In this embodiment, the log object is append-only and does not support overwriting. The log object provides contiguous, unbounded space for writing logs, and each log entry in the log object can be used to reconstruct memory after a power outage. The log objects are managed uniformly by the caching layer. Once the data segment of an object cached in memory has been transferred to the storage pool, as described below, the logs in the corresponding log object can be cleaned up.
[0068] A40. Write the log object and its metadata to the cache layer's disk.
[0069] In this application, when writing data, metadata is generated and written to the disk of the cache pool in the original object format.
[0070] A50. Confirm that the log was written successfully. Based on the log content, update the memory and generate a successful write flag, such as generating a successful write signal. Then, return the generated successful write flag to the client.
[0071] Since the log is obtained by the caching layer from the formatting / serialization of user write requests, the log content contains various information required for this caching, such as the object being written, the location being written, and the data being written. In this method, the data of the object is cached in memory by the caching layer, and the memory settings ensure that the cached data is the latest data.
[0072] In this embodiment, the successful write flag can be displayed visually on the client side. From the successful write flag returned by the server, users know that the data has been written successfully and can continue writing data through the client.
[0073] In some possible implementations, a log write failure is determined, and a failure write flag is generated, such as by generating a failure write signal. This generated failure write flag is then returned to the client. Similarly, in these implementations, the failure write flag can also be visualized on the client side. Users can learn that data writing has failed based on the failure write flag returned by the server and take appropriate measures, including checking which step went wrong and caused the data to fail to write.
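The write path A10 to A50, including the failure flag, can be sketched as below. This is a toy model under stated assumptions: the class `CacheLayer` and its attributes are hypothetical, a Python list stands in for the append-only log object on the cache-layer disk, JSON is an arbitrary choice of log format, and `disk_healthy` exists only to simulate a failed log write.

```python
import json


class CacheLayer:
    """Toy cache layer: an append-only log object plus in-memory segments."""

    def __init__(self) -> None:
        self.log_object: list[str] = []                # stand-in for Log_Object on disk
        self.metadata: dict[str, dict[str, int]] = {}  # per-object metadata on disk
        self.memory: dict[str, dict[int, int]] = {}    # object -> {position: byte}
        self.disk_healthy = True                       # flip to simulate a disk failure

    def _persist(self, object_name: str, record: str, end: int) -> None:
        # A30/A40: append the log record and store the object's metadata on disk.
        if not self.disk_healthy:
            raise OSError("simulated cache-layer disk failure")
        self.log_object.append(record)
        meta = self.metadata.setdefault(object_name, {"size": 0})
        meta["size"] = max(meta["size"], end)

    def write(self, object_name: str, offset: int, data: bytes) -> bool:
        # A10/A20: serialize the request; the record names the object and segment.
        record = json.dumps({"obj": object_name, "off": offset, "data": data.hex()})
        try:
            self._persist(object_name, record, offset + len(data))
        except OSError:
            return False  # failure flag: memory is left untouched
        # A50: the log is durable, so update memory and report success.
        segment = self.memory.setdefault(object_name, {})
        for i, byte in enumerate(data):
            segment[offset + i] = byte
        return True


cache = CacheLayer()
ok = cache.write("A", 0, b"ABCD")  # the example from the text
print(ok, len(cache.log_object))   # True 1
```

Updating memory only after the log append succeeds is what keeps memory recoverable: anything visible in memory is guaranteed to have a log record behind it.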
[0074] The Ceph-based hierarchical caching method of the present invention is described below with reference to Figure 3, taking data reading as an example. In these embodiments, when the user request is a read request, the method includes:
[0075] B10. Determine the client's user read request through the caching layer, and identify the data segment requested in the user read request.
[0076] B20. Determine the object corresponding to the requested data segment and the layer in which the corresponding object resides.
[0077] B30. Determine that the corresponding object is located only in the cache layer, that is, the user read request fully hits the memory of the cache layer. Determine the requested data segment from the memory of the cache layer and return the determined requested data segment to the client.
[0078] Step B30 corresponds to the situation where the user read request fully hits the cache layer. In this case, the corresponding data is read from memory and returned directly to the client. For example, by traversing the information in memory, all requested data segments are read, and the data segments read are then packaged and returned to the client.
[0079] B40. Determine that the layers where the corresponding object resides include the storage layer, that is, the user read request partially hits the cache layer's memory or misses it entirely. Determine the requested data segment from the cache layer's memory and use it as the first data; the first data is the data that hits in the cache layer. Then, determine the requested data segment from the storage layer and use it as the second data; the second data is the data that is missing from the cache layer. Concatenate the first data and the second data to obtain the requested data segment, and return the concatenated requested data segment to the client.
[0080] Step B40 corresponds to the user's read request partially hitting or completely missing the cache layer. In this case, the requested data segment is read from the cache layer's memory and used as the first data. The first data is either a part of the data segment requested in the user's read request or empty. The cache layer then pulls the missing part of the requested data segment from the storage layer. What is pulled is not the entire object in the storage layer, but only the data segment contained in the object.
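The full-hit case (B30) and the partial-hit or miss case (B40) can be sketched in one function. This is an illustrative model only: `read_segment` is a hypothetical name, cache memory is modeled as a position-to-byte map, the storage layer as a dictionary of whole objects, and the merge is done byte by byte for clarity rather than efficiency.

```python
def read_segment(object_name: str, offset: int, length: int,
                 cache_memory: dict[str, dict[int, int]],
                 storage: dict[str, bytes]) -> bytes:
    """Return bytes [offset, offset + length) of an object.

    cache_memory maps object -> {position: byte} held in cache-layer memory;
    storage maps object -> full object content in the storage layer.
    """
    cached = cache_memory.get(object_name, {})
    positions = range(offset, offset + length)
    missing = [p for p in positions if p not in cached]

    if not missing:
        # B30: full hit, serve everything from cache-layer memory.
        return bytes(cached[p] for p in positions)

    # B40: fetch only the missing positions from storage (the "second data"),
    # never the whole object, then merge them with the cache hits ("first data").
    backing = storage[object_name]
    fetched = {p: backing[p] for p in missing}
    return bytes(cached[p] if p in cached else fetched[p] for p in positions)


# Positions 0-1 were recently rewritten and live in cache memory;
# the storage layer still holds the older full object.
cache_memory = {"A": {0: ord("a"), 1: ord("b")}}
storage = {"A": b"ABCDEFGH"}
print(read_segment("A", 0, 4, cache_memory, storage))  # b'abCD'
```

Cache hits take precedence over storage content for the same position, which reflects the requirement that memory always holds the latest data.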
[0081] The Ceph-based hierarchical caching method of the present invention is described below with reference to Figure 4. Whether cached data is hot changes over time. Subsequently, the memory can promote, flush, and evict the object's data according to the eviction algorithm. For example, the memory can flush all or part of the data segments of an object to the storage layer. Alternatively, when the cache pool capacity is insufficient or there are too many log objects in the cache layer, the memory can flush all or part of the data segments of an object to the storage layer while releasing the corresponding cache layer log objects. The object's metadata continues to be stored on disk and is not flushed to the storage layer. Therefore, user requests can also be metadata requests. In these embodiments, the method further includes the following steps:
[0082] C10. Determine the client's user metadata request through the caching layer. It should be noted that the user metadata request can be a metadata read / write request. When the user metadata request reaches the caching layer, determine the data segment requested in the user metadata request.
[0083] C20. Determine the object corresponding to the requested data segment and the layer in which the corresponding object resides.
[0084] C30. Based on the identified objects and layers, perform metadata reading and writing.
[0085] Since the metadata in this application is written to the disk of the cache pool in the original object format, regardless of the type of metadata request, the corresponding object and the layer where the corresponding object is located are all cache layers. This application can complete the operation directly in the cache layer, that is, the reading and writing of metadata are all performed in the cache layer.
[0086] The Ceph-based hierarchical caching method of the present invention is described below with reference to Figure 5. The method further includes the following steps:
[0087] D10. After the data segments of objects cached in memory are passed to the storage layer, the log objects and logs corresponding to the passed data segments are identified, and the corresponding logs are deleted from the corresponding log objects. Thus, once the data segments have been passed, the logs in the log objects are cleaned up, so the log objects can continue to offer contiguous, unbounded space for new logs.
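Flushing followed by the log cleanup of step D10 can be sketched as follows. This is a simplified model under stated assumptions: `LogEntry` and `flush_object` are hypothetical names, whole objects are flushed at once (the text also allows flushing only part of an object's segments), and the storage layer is a dictionary of byte strings.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    object_name: str
    offset: int
    data: bytes


def flush_object(object_name: str,
                 log_object: list[LogEntry],
                 memory: dict[str, dict[int, int]],
                 storage: dict[str, bytes]) -> int:
    """Flush one object's cached segments to storage, then drop its logs.

    Returns the number of log entries removed. The object's metadata is not
    modeled here; per the text it stays on the cache layer's disk.
    """
    segments = memory.pop(object_name, {})
    backing = bytearray(storage.get(object_name, b""))
    for pos in sorted(segments):
        if pos >= len(backing):
            # Grow the stored object with zero padding up to this position.
            backing.extend(b"\x00" * (pos + 1 - len(backing)))
        backing[pos] = segments[pos]
    storage[object_name] = bytes(backing)

    # D10: the flushed data is now durable in storage, so its logs can go.
    remaining = [e for e in log_object if e.object_name != object_name]
    removed = len(log_object) - len(remaining)
    log_object[:] = remaining
    return removed


log = [LogEntry("A", 0, b"AB"), LogEntry("B", 0, b"x"), LogEntry("A", 2, b"CD")]
memory = {"A": {0: 65, 1: 66, 2: 67, 3: 68}, "B": {0: 120}}
storage: dict[str, bytes] = {}
removed = flush_object("A", log, memory, storage)
print(removed, storage["A"], [e.object_name for e in log])  # 2 b'ABCD' ['B']
```

The ordering matters: logs are removed only after the data has reached the storage layer, so at every moment each byte is recoverable from either the log or the storage layer.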
[0088] In this application, the data segment, i.e., the object's data, is held in the cache layer's memory, so the cached data is lost when power fails. To enable rapid data recovery, the method also includes the following steps:
[0089] E10. Confirm a successful restart signal, for example when a node restarts after a power outage, and then identify all log objects in the cache layer corresponding to the restarted node. These operations are still performed by the cache layer, which hosts both the memory and the log objects.
[0090] E20. Parse the log from the identified log object, and reconstruct the data segment in memory based on the parsed log, that is, replay the memory based on the parsed log.
[0091] Based on the characteristics of Ceph's layered storage, it can be seen that the log objects of a certain node are written to Ceph in a redundant manner, and the logs in the log objects are written in an append-only manner. Therefore, even if the data of the current node is corrupted, the data of the current node can be recovered from other nodes, achieving distributed caching and consistency with the cluster failure domain.
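The recovery steps E10 and E20 amount to replaying an append-only log in order. The sketch below assumes a hypothetical JSON record format with fields `obj`, `off`, and hex-encoded `data`; the patent does not specify the log format, and `replay` is an illustrative name.

```python
import json


def replay(log_records: list[str]) -> dict[str, dict[int, int]]:
    """E20: rebuild cache-layer memory from the surviving log records.

    Records are applied in append order, so a later write to the same
    position overwrites an earlier one, exactly as it did before the restart.
    """
    memory: dict[str, dict[int, int]] = {}
    for line in log_records:
        record = json.loads(line)
        segment = memory.setdefault(record["obj"], {})
        for i, byte in enumerate(bytes.fromhex(record["data"])):
            segment[record["off"] + i] = byte
    return memory


# E10: after the restart signal, the node's log objects are collected.
# Here a single log object holds two writes to object A.
surviving_log = [
    json.dumps({"obj": "A", "off": 0, "data": b"ABCD".hex()}),
    json.dumps({"obj": "A", "off": 1, "data": b"xy".hex()}),
]
memory = replay(surviving_log)
print(bytes(memory["A"][i] for i in range(4)))  # b'AxyD'
```

Because the log is append-only, replay needs no conflict resolution beyond applying records in order.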
[0092] In this application, when the user request is a write request, logs, log objects, and memory are introduced. The caching layer of the hierarchical cache stores the log objects. The caching layer receives the user write request from the client, formats it into a log, records the log in the log object, and then persists the log object on the caching layer's disk. Finally, the memory, i.e., the caching layer, responds directly to the client with the write status of the request. Thus, when this application writes data to the server, the operation granularity is the data segment of the object.
[0093] When a user request is a read request, the system first checks if all data segments are hit in memory. If not, it reads the missing data segments from the data pool. It can be seen that when this application reads data from the server, the operation granularity is the data segments of the object.
[0094] When a node restarts, the memory recovers the cached data from the log object before providing services to the client.
[0095] Therefore, this application improves the performance of hierarchical caching by modifying the operation granularity of hierarchical caching to the data segment of the object.
[0096] The Ceph-based hierarchical caching device provided by the present invention is described below. The device described below and the Ceph-based hierarchical caching method described above may be read in correspondence with each other.
[0097] The Ceph-based hierarchical caching device of the present invention, applied to a server, is described below with reference to Figure 6. On the server, a Ceph-based hierarchical cache is first built: after the cluster is deployed, two pools are configured, one serving as the storage layer and the other as the caching layer, and the two pools are associated. The device includes:
[0098] The first determining module 10 is used to determine the user request of the client through the cache layer. That is, in this application, the cache layer is responsible for interacting directly with the client, receiving the user request from the client, and determining the data segment requested in the user request when the user request reaches the cache layer.
[0099] In this device, user requests include read / write requests. Regardless of the type of request, these requests contain information about the data segment requested. For example, a user request is a write request. More specifically, the request is to write data ABCD sequentially to positions 0-3 of object (Object) A. That is, the data segment requested in the user write request is positions 0-3 of object A, the object to be written is object A, and the data to be written is ABCD.
[0100] The second determining module 20 is used to determine the object corresponding to the requested data segment and the layer where the corresponding object resides. Specifically, the cache layer caches the object's data segment in memory and stores metadata on disk. Furthermore, based on the data segment's frequency of use, the cache layer transfers the cached data segment from memory to the storage layer; for example, it migrates infrequently used data from the cache layer to the storage layer.
[0101] The data read / write module 30 is used to read and write requested data segments based on the determined objects and layers.
[0102] The Ceph-based hierarchical caching device of the present invention no longer operates on the entire object, but on the data segment of the object. By optimizing the operation granularity of the Ceph-based hierarchical caching, and by caching the object's data segment in memory through the cache layer (i.e., caching the object's data in memory) and storing metadata on disk through the cache layer, the memory settings ensure that the cached data is up-to-date and that the memory data is recoverable after power failure. The cache layer and the storage layer together ensure that all data is available. This application can effectively reduce the amplification of user read and write operations and reduce the latency when users write data, thereby improving the performance of the hierarchical caching.
[0103] Figure 7 is a schematic diagram of the physical structure of an electronic device. As shown in Figure 7, the electronic device may include: a processor 810, a communications interface 820, a memory 830, and a communication bus 840, wherein the processor 810, the communications interface 820, and the memory 830 communicate with each other via the communication bus 840. The processor 810 can call logical instructions in the memory 830 to execute a Ceph-based hierarchical caching method, which includes:
[0104] The caching layer determines the client's user request and the requested data segment within that request.
[0105] Determine the object corresponding to the requested data segment and the layer where the corresponding object resides; the cache layer caches the object's data segment in memory and stores metadata on disk, and based on the data segment's popularity, the cache layer passes the cached data segment in memory to the storage layer;
[0106] Based on the identified objects and layers, the requested data segments are read and written.
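For the read case, claim 3 distinguishes an object that resides only in the cache layer from one that also resides in the storage layer, where the parts obtained from each layer are concatenated. A minimal sketch of that lookup follows. It merges position by position, which is a simplification of the two-segment concatenation in the claim, and it assumes that cached data takes precedence over stored data. The function name `read_segment` and the dictionary layouts are hypothetical.

```python
def read_segment(cache, storage, object_id, start, end):
    """Return the bytes at positions start..end (inclusive) of an object.

    cache:   {object_id: {position: byte value}}, sparse data held in memory
    storage: {object_id: bytes}, stand-in for the storage layer
    """
    cached = cache.get(object_id, {})
    stored = storage.get(object_id, b"")
    out = bytearray()
    for pos in range(start, end + 1):
        if pos in cached:
            out.append(cached[pos])   # part served from cache-layer memory
        elif pos < len(stored):
            out.append(stored[pos])   # part served from the storage layer
        else:
            raise KeyError(f"position {pos} of {object_id!r} not found")
    return bytes(out)


# Positions 0-1 of object A are cached; positions 2-3 exist only in storage.
cache = {"A": {0: ord("A"), 1: ord("B")}}
storage = {"A": b"..CD"}
result = read_segment(cache, storage, "A", 0, 3)
print(result)  # b'ABCD'
```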
[0107] Furthermore, the logical instructions in the aforementioned memory 830 can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0108] In another aspect, the present invention also provides a computer program product, the computer program product comprising a computer program that can be stored on a non-transitory computer-readable storage medium, wherein when the computer program is executed by a processor, the computer executes the Ceph-based hierarchical caching method provided above, the method comprising:
[0109] The caching layer determines the client's user request and the requested data segment within that request.
[0110] Determine the object corresponding to the requested data segment and the layer where the corresponding object resides; the cache layer caches the object's data segment in memory and stores metadata on disk, and based on the data segment's popularity, the cache layer passes the cached data segment in memory to the storage layer;
[0111] Based on the identified objects and layers, the requested data segments are read and written.
[0112] In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the Ceph-based hierarchical caching method provided above, the method comprising:
[0113] The caching layer determines the client's user request and the requested data segment within that request.
[0114] Determine the object corresponding to the requested data segment and the layer where the corresponding object resides; the cache layer caches the object's data segment in memory and stores metadata on disk, and based on the data segment's popularity, the cache layer passes the cached data segment in memory to the storage layer;
[0115] Based on the identified objects and layers, the requested data segments are read and written.
[0116] The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Those skilled in the art can understand and implement this without any creative effort.
[0117] Through the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course, it can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence, or the part of them that contributes over the prior art, can be embodied in the form of a software product. This computer software product can be stored in a computer-readable storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in the various embodiments or some parts of the embodiments.
[0118] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A Ceph-based hierarchical caching method, applied to the server side, characterized in that, The method includes: The caching layer determines the client's user request and the requested data segment within that request. Determine the object corresponding to the requested data segment and the layer where the corresponding object resides; the cache layer caches the object's data segment in memory and stores metadata on disk, and based on the data segment's popularity, the cache layer passes the cached data segment in memory to the storage layer; Based on the identified objects and layers, read and write the requested data segments.
2. The Ceph-based hierarchical caching method according to claim 1, characterized in that, When the user request is a write request, the method includes: The client's user write request is determined through the caching layer, and the user write request is formatted as a log; the user write request carries the object to be written, the data segment to be written, and the data to be written. Based on the log, determine the object to be written and its metadata; Write the log to a log object; the log object is set in a cache pool; Write the log object and its metadata to the cache layer's disk; Once the log write is confirmed to be successful, the memory is updated based on the log content, a successful write flag is generated, and the successful write flag is returned to the client.
3. The Ceph-based hierarchical caching method according to claim 1, characterized in that, When the user request is a read request, the method includes: The caching layer determines the client's user read request and identifies the requested data segment within that read request. Determine the object corresponding to the requested data segment and the layer in which the corresponding object resides; If the layer where the corresponding object resides is only the cache layer, the requested data segment is determined from the memory of the cache layer, and the determined requested data segment is returned to the client. If the layers where the corresponding object resides include the storage layer, the requested data segment is retrieved from the cache layer's memory as the first data segment, the requested data segment is retrieved from the storage layer as the second data segment, the first and second data segments are concatenated to obtain the requested data segment, and the concatenated requested data segment is returned to the client.
4. The Ceph-based hierarchical caching method according to claim 2, characterized in that, When the user request is a user metadata request, the method includes: The caching layer determines the client's user metadata request and identifies the requested data segment within that user metadata request. Determine the object corresponding to the requested data segment and the layer in which the corresponding object resides; Based on the identified objects and layers, metadata is read and written.
5. The Ceph-based hierarchical caching method according to claim 1, characterized in that, The method also includes the following steps: Identify the log object and log corresponding to the transferred data segment, and delete the corresponding log from the corresponding log object; the transferred data segment is a data segment that has been passed from the cache layer's memory to the storage layer.
6. The Ceph-based hierarchical caching method according to claim 1, characterized in that, The method also includes the following steps: Confirm the successful restart signal and identify all log objects in the cache layer corresponding to the restarted node; Parse the logs from the identified log objects and reconstruct the memory data segments based on the parsed logs.
7. A Ceph-based hierarchical caching device, applied to the server side, characterized in that, The device includes: The first determining module is used to determine the client's user request and the data segment requested in the user request through the cache layer; The second determining module is used to determine the object corresponding to the requested data segment and the layer where the corresponding object is located; the cache layer caches the data segment of the object in memory and stores the metadata on disk, and based on the popularity of the data segment, the cache layer passes the data segment cached in memory to the storage layer; The data read / write module is used to read and write requested data segments based on the determined objects and layers.
8. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, When the processor executes the program, it implements the steps of the Ceph-based hierarchical caching method as described in any one of claims 1 to 6.
9. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by the processor, it implements the steps of the Ceph-based hierarchical caching method as described in any one of claims 1 to 6.
10. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by the processor, it implements the steps of the Ceph-based hierarchical caching method as described in any one of claims 1 to 6.