Task scheduling method, magneto-electric disk, storage system, and program product
By grouping task requests into queues based on priority and characteristics in the tape storage system, the problem of poor continuity in tape task execution is solved, achieving efficient task scheduling and bandwidth enhancement, and meeting the requirements of the service level agreement.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Application (China)
- Current Assignee / Owner: HUAWEI TECH CO LTD
- Filing Date: 2024-12-30
- Publication Date: 2026-06-30
AI Technical Summary
In existing tape storage systems, the first-in-first-out (FIFO) strategy used by the drives results in poor continuity of tape task execution, failure to guarantee maximum bandwidth under the service level agreement, and untimely task scheduling.
The controller or processor groups task requests into multiple request queues based on their priority and characteristics, ensuring that task requests in the same queue share the same magneto-electric disk identifier. High-priority tasks are scheduled first, reducing magneto-electric disk switching latency and improving task execution continuity and bandwidth.
The method improves the timeliness of task scheduling, safeguards the bandwidth of the storage system and the magneto-electric disk, meets the requirements of the service level agreement, and improves the continuity and efficiency of task processing.
Description
Technical Field
[0001] This application relates to the field of storage technology, and in particular to a task scheduling method, a magneto-electric disk, a storage system, and a program product.
Background Technology
[0002] Magnetic tape is the primary storage medium for backup and archiving, offering low cost, large capacity, long lifespan, high reliability, and security. During data access, drives in the storage system manage multiple magneto-electric disks through a single input/output (I/O) channel, employing a first-come, first-served (FCFS) strategy. An I/O channel is a processor capable of handling I/O requests and executing channel programs to control I/O operations. At any given time, a drive can mount only one magneto-electric disk through this I/O channel. However, the FCFS strategy does not account for the latency of switching between magneto-electric disks during access. This disrupts the continuity of tape task execution in the storage system, fails to guarantee maximum bandwidth under the service-level agreement (SLA), and causes untimely task scheduling.
Summary of the Invention
[0003] This application provides a task scheduling method, a magneto-electric disk, a storage system, and a program product, which solve the problem of poor execution continuity of tape tasks caused by the drive using the FCFS strategy to process multiple requests. They also improve the bandwidth of the storage system and the magneto-electric disk while ensuring the priority of different task requests, which helps improve the timeliness of task scheduling.
[0004] The technical solution adopted in this application is as follows.
[0005] Firstly, this application provides a task scheduling method. The task scheduling method is applied to a storage system that includes a controller, a cache, and multiple disks. The following description uses execution of the task scheduling method by the controller as an example. The task scheduling method includes the following steps. The controller acquires multiple task requests, where the task indicated by a task request is any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task. The controller obtains multiple request queues based on the priority and characteristics of the multiple task requests, where the characteristics indicate one or a combination of the following: disk identifier, execution deadline, and access address. One of the multiple request queues includes at least one task request, among the multiple task requests, that has the same disk identifier and belongs to the same priority. Furthermore, the controller determines the scheduling order of the multiple request queues based on the cache state, and executes the tasks indicated by the multiple request queues according to the scheduling order.
[0006] In the first aspect of this application, the controller combines multiple task requests according to their priority and characteristics to obtain multiple request queues. Since task requests in the same request queue belong to the same priority, the task scheduling process ensures the priority of different task requests, that is, the storage system can batch process task requests with the same SLA or priority. Moreover, task requests in the same request queue have the same disk identifier. Thus, during the execution of all task requests in the request queue, the controller does not need to switch the target disk to be operated, reducing disk switching latency and ensuring the continuity of task execution in the storage system. This is beneficial for improving the timeliness of task scheduling and increasing the bandwidth of the storage system.
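The grouping described above can be illustrated with a minimal Python sketch. It is not the implementation of this application: the `TaskRequest` fields, the convention that a smaller number means a higher priority, and the sample requests are all assumptions made for illustration. The sketch groups requests by disk identifier and priority, then compares how often the target disk changes under arrival (FCFS) order and under grouped order.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRequest:
    name: str
    disk_id: str   # identifier of the target disk (hypothetical field name)
    priority: int  # assumption: smaller number = higher priority


def group_requests(requests):
    """Group requests so each queue holds one disk identifier and one priority."""
    queues = defaultdict(list)
    for req in requests:
        queues[(req.disk_id, req.priority)].append(req)
    return dict(queues)


def count_disk_switches(ordered_requests):
    """Count how often consecutive requests target different disks."""
    switches = 0
    for prev, cur in zip(ordered_requests, ordered_requests[1:]):
        if prev.disk_id != cur.disk_id:
            switches += 1
    return switches


# Requests in arrival order (illustrative data only).
requests = [
    TaskRequest("r1", "disk-A", 0),
    TaskRequest("r2", "disk-B", 0),
    TaskRequest("r3", "disk-A", 0),
    TaskRequest("r4", "disk-B", 1),
    TaskRequest("r5", "disk-A", 1),
    TaskRequest("r6", "disk-B", 0),
]

queues = group_requests(requests)
# Higher-priority queues first; ties are broken by disk identifier.
ordered_keys = sorted(queues, key=lambda key: (key[1], key[0]))
grouped_order = [req for key in ordered_keys for req in queues[key]]

fcfs_switches = count_disk_switches(requests)
grouped_switches = count_disk_switches(grouped_order)
print(fcfs_switches, grouped_switches)
```

With this sample data, arrival order changes the target disk on every request, while the grouped order changes it only at queue boundaries.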
[0007] In conjunction with the task scheduling method provided in the first aspect, in one optional implementation, the aforementioned user operation tasks include one or both of a tape import task and a tape export task. Specifically, the tape import task includes importing a first tape into a first magneto-electric disk among the multiple magneto-electric disks. The tape export task includes exporting a second tape from a second magneto-electric disk among the multiple magneto-electric disks. In the first aspect of this application, the controller in the storage system can schedule not only tasks related to data access services but also tasks such as tape import and tape export, ensuring the continuity of task execution in the storage system while meeting the SLA.
[0008] In conjunction with the task scheduling method provided in the first aspect, in one optional implementation, the scheduling order of multiple request queues includes: the scheduling order of the request queue with higher priority is before the scheduling order of the request queue with lower priority. In the first aspect of this application, the controller prioritizes scheduling the request queue with higher priority, which is beneficial for maintaining the SLA adopted by different applications, enabling timely scheduling of higher priority tasks, and reducing the task execution latency of the storage system.
[0009] In conjunction with the task scheduling method provided in the first aspect, in one optional implementation, the controller obtaining multiple request queues based on the priority and characteristics of the multiple task requests includes: the controller obtains the characteristics of the multiple task requests and divides the multiple task requests into multiple first candidate request queues based on the magneto-electric disk identifiers indicated by the characteristics, where each first candidate request queue includes one or more task requests having the same magneto-electric disk identifier. Furthermore, the controller classifies and combines the multiple first candidate request queues according to the priority of the multiple task requests to obtain the multiple request queues.
[0010] In the first aspect of this application, task requests belonging to the same request queue have the same magneto-electric disk identifier. During the execution of all task requests in the request queue, the controller does not need to switch the target magneto-electric disk, which reduces magneto-electric disk switching latency and ensures the continuity of task execution in the storage system. This helps improve the timeliness of task scheduling and increase the bandwidth of the storage system.
[0011] In conjunction with the task scheduling method provided in the first aspect, in one optional implementation, the controller classifying and combining the multiple first candidate request queues according to the priority of the multiple task requests to obtain the multiple request queues includes: the controller divides the multiple first candidate request queues into multiple second candidate request queues according to the priority of the multiple task requests, where each second candidate request queue includes M task requests that have the same magneto-electric disk identifier and belong to the same priority. Furthermore, the controller uses the execution deadline and access address indicated by the characteristics as cluster centers to process the multiple second candidate request queues and obtain the multiple request queues. A first request queue among the multiple request queues includes N task requests that belong to the same priority and have a first magneto-electric disk identifier. Among the N task requests, the difference between the execution deadlines of any two task requests is less than or equal to a first threshold. The first magneto-electric disk identifier indicates a first magneto-electric disk among the multiple magneto-electric disks, and the tape regions indicated by the access addresses of the N task requests are adjacent in position within the first magneto-electric disk.
[0012] In an optional scenario, the tape regions indicated by the access addresses of the N task requests being adjacent within the first magneto-electric disk means that the first magneto-electric disk includes a first magnetic tape, and the tape regions indicated by the access addresses of the N task requests are distributed along a first direction of the first magnetic tape, where the first direction is either the length direction or the width direction of the first magnetic tape.
[0013] In the first aspect of this application, the controller groups and aggregates task requests based on their priority and characteristics, ensuring that task requests with the same priority and the same target (magneto-electric disk identifier) are executed in the same request queue. This improves the continuity of task execution in the storage system. Furthermore, because the tape regions corresponding to different task requests within the same request queue are adjacent, the distance the tape must be wound in the magneto-electric disk during task execution decreases, reducing access latency and further improving task execution efficiency in the storage system.
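The two conditions on a request queue (execution deadlines within a threshold of one another, and adjacent tape regions) can be sketched as follows. The application describes clustering with the execution deadline and access address as cluster centers but does not specify the algorithm, so this sketch substitutes a simple two-pass greedy split. The field names, units, and both gap parameters are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRequest:
    name: str
    deadline: float  # execution deadline in seconds (hypothetical unit)
    address: int     # access address as a tape-region index (hypothetical)


def split_by_deadline(requests, max_deadline_gap):
    """Split into windows in which any two deadlines differ by <= max_deadline_gap."""
    windows = []
    for req in sorted(requests, key=lambda r: r.deadline):
        # Compare against the earliest deadline in the current window.
        if windows and req.deadline - windows[-1][0].deadline <= max_deadline_gap:
            windows[-1].append(req)
        else:
            windows.append([req])
    return windows


def split_by_address(requests, max_address_gap):
    """Split into runs whose neighbouring addresses are <= max_address_gap apart."""
    runs = []
    for req in sorted(requests, key=lambda r: r.address):
        if runs and req.address - runs[-1][-1].address <= max_address_gap:
            runs[-1].append(req)
        else:
            runs.append([req])
    return runs


def cluster_queue(requests, max_deadline_gap, max_address_gap):
    """Greedy stand-in for the clustering step applied to one second candidate queue."""
    queues = []
    for window in split_by_deadline(requests, max_deadline_gap):
        queues.extend(split_by_address(window, max_address_gap))
    return queues


# One second candidate queue: same disk identifier and priority (illustrative data).
second_candidate_queue = [
    TaskRequest("a", deadline=10.0, address=100),
    TaskRequest("b", deadline=12.0, address=101),
    TaskRequest("c", deadline=11.0, address=500),
    TaskRequest("d", deadline=60.0, address=102),
]

clusters = cluster_queue(second_candidate_queue, max_deadline_gap=5.0, max_address_gap=2)
cluster_names = [[req.name for req in queue] for queue in clusters]
print(cluster_names)
```

Requests "a" and "b" end up together because their deadlines are close and their regions are neighbours; "c" is split off by address and "d" by deadline.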
[0014] In conjunction with the task scheduling method provided in the first aspect, in one optional implementation, the controller using the execution deadline and access address indicated by the characteristics as cluster centers to process the multiple second candidate request queues and obtain the multiple request queues includes: the controller clusters the multiple second candidate request queues, using the execution deadline and access address indicated by the characteristics as cluster centers, to obtain multiple third candidate request queues. Furthermore, the controller aggregates the multiple third candidate request queues based on the total data volume and on multiple estimated execution times corresponding to the multiple task requests, to obtain the multiple request queues. Here, one estimated execution time corresponds to one task request, the first request queue includes at least two of the multiple third candidate request queues, and the estimated execution time of the first request queue is less than the sum of the estimated execution times of those at least two third candidate request queues.
[0015] In an optional scenario, the multiple request queues may also include a second request queue that corresponds to one of the multiple third candidate request queues.
[0016] In the first aspect of this application, the controller can aggregate two or more third candidate request queues based on the estimated execution time of different queues, thereby reducing the overall execution time of multiple task requests, improving the execution effect of task scheduling, and helping to improve the execution efficiency of tasks in the storage system.
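The aggregation criterion, merging candidate queues only when the merged estimate is below the sum of the separate estimates, can be sketched as below. The cost model is an assumption introduced for illustration (a fixed positioning cost per queue, a winding cost proportional to the address span, and a transfer cost proportional to data volume); the application does not specify how execution time is estimated, and all constants are hypothetical.

```python
POSITION_COST = 10.0  # seconds to locate the first region of a queue (assumed)
WIND_COST = 0.5       # seconds to wind past one tape region (assumed)
BANDWIDTH = 100.0     # data units transferred per second (assumed)


def estimated_time(queue):
    """Estimate the execution time of a queue of (address, data_volume) requests."""
    addresses = [address for address, _ in queue]
    volume = sum(size for _, size in queue)
    span = max(addresses) - min(addresses)
    return POSITION_COST + span * WIND_COST + volume / BANDWIDTH


def aggregate(queues):
    """Greedily merge address-neighbouring queues when merging lowers the estimate."""
    if not queues:
        return []
    ordered = sorted(queues, key=lambda q: min(address for address, _ in q))
    result = [list(ordered[0])]
    for queue in ordered[1:]:
        merged = result[-1] + list(queue)
        separate = estimated_time(result[-1]) + estimated_time(queue)
        if estimated_time(merged) < separate:
            result[-1] = merged          # merging saves time: keep one queue
        else:
            result.append(list(queue))   # merging does not help: keep it apart
    return result


# Three third candidate queues of (address, data_volume) pairs (illustrative data).
third_candidate_queues = [
    [(100, 200.0), (102, 300.0)],
    [(110, 100.0)],
    [(900, 400.0)],
]

request_queues = aggregate(third_candidate_queues)
print(len(request_queues), estimated_time(request_queues[0]))
```

Under these assumed costs the first two candidate queues merge, because one positioning step is saved and the extra winding is short, while the distant third queue stays on its own, matching the case of a request queue that corresponds to a single third candidate request queue.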
[0017] In conjunction with the task scheduling method provided in the first aspect, in one optional implementation, the controller determines the scheduling order of multiple request queues based on the cache state. This includes: for one request queue among the multiple request queues, the controller takes the cache state and the queue characteristics of the request queue as input to a queuing model and outputs a queuing score for the request queue. The queue characteristics include one or more combinations of the following: the amount of data corresponding to a request queue, seek time, maximum tape wear information, and the maximum execution wait time for executing a request queue. This queuing model is used to determine the queuing score of each request queue among the multiple request queues. Furthermore, the controller determines the scheduling order of the multiple request queues based on the queuing score of each request queue among the multiple request queues.
[0018] In the first aspect of this application, the controller calls the queuing model to calculate a queuing score for each request queue. For example, the queuing model calculates the queuing score based on information such as data volume, seek time, maximum tape wear information, and maximum execution wait time, so that different request queues carry different scheduling weights in the task scheduling process. In this way, scheduling under multi-dimensional characteristics is driven by a quantifiable expected benefit (the queuing score), which helps improve the scheduling of different task requests in the storage system and reduce task execution latency in the storage system.
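As a rough illustration of a queuing model, the sketch below scores each queue with a weighted sum of its characteristics plus a term for the cache state. The linear form, the weights, the normalisation of every characteristic to the range 0 to 1, and the `is_write` flag are all assumptions; the application does not disclose the form of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueFeatures:
    name: str
    data_volume: float  # amount of data moved by the queue, normalised to 0..1
    seek_time: float    # normalised to 0..1
    tape_wear: float    # maximum tape wear information, normalised to 0..1
    max_wait: float     # maximum execution wait time, normalised to 0..1
    is_write: bool      # hypothetical flag: True for write-data queues


# Hypothetical weights: reward volume and long waits, penalise seeks and wear.
WEIGHTS = {"data_volume": 0.3, "seek_time": -0.2, "tape_wear": -0.1, "max_wait": 0.4}
CACHE_WEIGHT = 0.5


def queuing_score(features, cache_occupancy):
    """Return a queuing score; a higher score means the queue is scheduled earlier."""
    base = (
        WEIGHTS["data_volume"] * features.data_volume
        + WEIGHTS["seek_time"] * features.seek_time
        + WEIGHTS["tape_wear"] * features.tape_wear
        + WEIGHTS["max_wait"] * features.max_wait
    )
    # Cache state: a fuller cache favours write queues, which drain it.
    cache_term = cache_occupancy if features.is_write else 1.0 - cache_occupancy
    return base + CACHE_WEIGHT * cache_term


def schedule(queues, cache_occupancy):
    """Order queue names by descending queuing score."""
    ranked = sorted(queues, key=lambda q: queuing_score(q, cache_occupancy), reverse=True)
    return [q.name for q in ranked]


queues = [
    QueueFeatures("q_write", 0.8, 0.2, 0.1, 0.5, is_write=True),
    QueueFeatures("q_read", 0.8, 0.2, 0.1, 0.5, is_write=False),
    QueueFeatures("q_slow", 0.1, 0.9, 0.8, 0.1, is_write=False),
]

order_when_cache_full = schedule(queues, cache_occupancy=0.9)
order_when_cache_empty = schedule(queues, cache_occupancy=0.1)
print(order_when_cache_full, order_when_cache_empty)
```

The same queues are ranked differently as the cache fills, which is the effect of feeding the cache state into the model alongside the queue characteristics.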
[0019] In conjunction with the task scheduling method provided in the first aspect, in a first optional scenario, the controller obtaining the scheduling order of the multiple request queues based on the queuing score of each request queue includes: when the execution deadline of at least one request queue is less than or equal to the reserved time, the controller obtains a first sub-order based on the queuing score of the at least one request queue. In the first aspect of this application, when the execution deadline is less than or equal to the reserved time, the controller sorts such a request queue ahead of the others so that it is scheduled first. This avoids conflicts between magneto-electric disk switching and task scheduling, and also helps guarantee the SLA corresponding to the request queue, thereby improving the user experience.
[0020] In conjunction with the task scheduling method provided in the first aspect, in the second optional scenario, the controller obtains the scheduling order of the multiple request queues based on the queuing score of each request queue. This includes: when the cache occupancy rate reaches a first threshold, the controller sorts the request queues with write data task type among the multiple request queues to obtain a second sub-order; this second sub-order follows the aforementioned first sub-order. In the first aspect of this application, when a large amount of data is stored in the cache, the controller prioritizes scheduling the request queues that execute write data tasks, thereby reducing the space occupied in the cache. This not only helps improve the utilization rate of cache storage resources but also helps improve the write data bandwidth in the storage system, thus improving the execution efficiency of tasks in the storage system.
[0021] In conjunction with the task scheduling method provided in the first aspect, in the third optional scenario, the controller obtains the scheduling order of the multiple request queues based on the queuing score of each request queue, including: when the cache occupancy rate is lower than a second threshold, sorting the request queues with read data task type among the multiple request queues to obtain a third sub-order; the second threshold is less than the aforementioned first threshold. In the first aspect of this application, when there is a large amount of available storage space in the cache, the controller prioritizes scheduling the request queues that execute read data tasks, thereby improving the storage resource utilization of the cache, and also helping to improve the read data bandwidth in the storage system, thus improving the execution efficiency of tasks in the storage system.
[0022] The first through third optional scenarios above can be used independently, combined in pairs, or all three combined as the specific implementation by which the controller determines the scheduling order of the aforementioned multiple request queues. This application does not limit the specific implementation.
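One way the three scenarios might be combined is sketched below. Interpreting the execution deadline as the time remaining until the deadline, assuming a precomputed queuing score, and all threshold values are assumptions for illustration; the application leaves the specific combination open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestQueue:
    name: str
    task_type: str           # "read" or "write"
    time_to_deadline: float  # seconds remaining until the execution deadline (assumed)
    score: float             # queuing score, assumed to be precomputed


def scheduling_order(queues, cache_occupancy, reserved_time,
                     first_threshold, second_threshold):
    """Build the scheduling order from the three sub-orders."""
    by_score = sorted(queues, key=lambda q: q.score, reverse=True)
    # First sub-order: queues whose deadline falls within the reserved time.
    urgent = [q for q in by_score if q.time_to_deadline <= reserved_time]
    rest = [q for q in by_score if q.time_to_deadline > reserved_time]

    preferred = []
    if cache_occupancy >= first_threshold:
        # Second sub-order: a nearly full cache is drained by write queues.
        preferred = [q for q in rest if q.task_type == "write"]
    elif cache_occupancy < second_threshold:
        # Third sub-order: a nearly empty cache can absorb read queues.
        preferred = [q for q in rest if q.task_type == "read"]

    others = [q for q in rest if q not in preferred]
    return [q.name for q in urgent + preferred + others]


queues = [
    RequestQueue("w1", "write", 500.0, 0.4),
    RequestQueue("r1", "read", 500.0, 0.9),
    RequestQueue("r2", "read", 20.0, 0.1),
    RequestQueue("w2", "write", 500.0, 0.6),
]

params = dict(reserved_time=30.0, first_threshold=0.8, second_threshold=0.3)
order_full = scheduling_order(queues, cache_occupancy=0.9, **params)
order_empty = scheduling_order(queues, cache_occupancy=0.1, **params)
order_mid = scheduling_order(queues, cache_occupancy=0.5, **params)
print(order_full, order_empty, order_mid)
```

The urgent queue "r2" always leads despite its low score; the cache occupancy then decides whether write queues or read queues follow it.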
[0023] Secondly, this application provides a task scheduling method. This task scheduling method is applied to a magneto-electric disk, which includes a processor, a cache, and a magnetic tape. The task scheduling method provided in this second aspect is executed by the processor and includes the following steps. The processor acquires multiple task requests, where the task indicated by a task request is any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task. The processor obtains multiple request queues based on the priority and characteristics of the multiple task requests, where the characteristics indicate one or a combination of the following: tape identifier, execution deadline, and access address. One of the multiple request queues includes at least one task request, among the multiple task requests, that has the same tape identifier and belongs to the same priority. Furthermore, the processor determines the scheduling order of the multiple request queues based on the cache state, and executes the tasks indicated by the multiple request queues according to the scheduling order.
[0024] In the second aspect of this application, the processor classifies and combines multiple task requests according to their priority and characteristics to obtain multiple request queues. Since task requests in the same request queue belong to the same priority, the task scheduling process ensures the priority of different task requests; that is, the magneto-electric disk can batch-process task requests with the same SLA or priority. Moreover, task requests in the same request queue have the same tape identifier. Thus, during the execution of all task requests in the request queue, the processor does not need to switch the target tape, reducing tape switching latency and ensuring the continuity of task execution in the magneto-electric disk. This helps improve the timeliness of task scheduling and increase the bandwidth of the magneto-electric disk.
[0025] In conjunction with the task scheduling method provided in the second aspect, in one optional implementation, the processor determines the scheduling order of multiple request queues based on the cache state. This includes: for one request queue among the multiple request queues, the processor takes the cache state and the queue characteristics of the request queue as input to a queuing model and outputs a queuing score for the request queue. The queue characteristics include one or more combinations of the following: the amount of data corresponding to a request queue, seek time, maximum tape wear information, and maximum execution wait time for executing a request queue; this queuing model is used to determine the queuing score of each request queue among the multiple request queues. Furthermore, the processor determines the scheduling order of the multiple request queues based on the queuing score of each request queue among the multiple request queues.
[0026] In conjunction with the task scheduling method provided in the second aspect, in the first optional scenario, the processor determines the scheduling order of the multiple request queues based on the queuing score of each request queue. This includes: if the deadline for execution of at least one request queue is less than or equal to the reserved time, the processor determines the first sub-order based on the queuing score of at least one request queue. For cases where the deadline is less than or equal to the reserved time, the processor prioritizes sorting this request queue to ensure its priority scheduling. This avoids conflicts between tape switching and task scheduling, helps guarantee the SLA corresponding to the request queue, and improves the user experience.
[0027] In conjunction with the task scheduling method provided in the second aspect, in the second optional scenario, when the cache occupancy rate reaches a first threshold, the processor sorts the request queues whose task type is write data among the multiple request queues to obtain a second sub-order. This second sub-order follows the aforementioned first sub-order. When a large amount of data is stored in the cache, the processor prioritizes scheduling the request queues that execute write data tasks, thereby reducing the space occupied in the cache. This helps improve both the utilization of cache storage resources and the write data bandwidth of the magneto-electric disk, thus increasing the execution efficiency of tasks in the magneto-electric disk.
[0028] In conjunction with the task scheduling method provided in the second aspect, in the third optional scenario, when the cache occupancy rate is lower than a second threshold, the processor sorts the request queues whose task type is read data among the multiple request queues to obtain a third sub-order. This second threshold is less than the aforementioned first threshold. When there is a large amount of available storage space in the cache, the processor prioritizes scheduling the request queues that execute read data tasks, thereby improving the utilization of cache storage resources and helping to increase the read data bandwidth of the magneto-electric disk, thus improving the execution efficiency of tasks in the magneto-electric disk.
[0029] The first through third optional scenarios above can be used independently, combined in pairs, or all three combined as the specific implementation by which the processor determines the scheduling order of the aforementioned multiple request queues. This application does not limit the specific implementation.
[0030] Thirdly, this application provides a magneto-electric disk. The magneto-electric disk includes a communication interface, a processor, a cache, and a magnetic tape. The magnetic tape is used to store data. The communication interface is used to acquire multiple task requests, where the task indicated by a task request is any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task. The processor is used to execute the task scheduling method provided in the second aspect or any optional implementation of the second aspect, based on the priority and characteristics of the multiple task requests and the state of the cache.
[0031] Fourthly, this application provides a storage system. The storage system includes a communication interface, a controller, a cache, and multiple magneto-electric disks. The communication interface is used to acquire multiple task requests, where the task indicated by a task request is any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task. The controller is used to obtain multiple request queues based on the priority and characteristics of the multiple task requests. The characteristics indicate one or a combination of the following: magneto-electric disk identifier, execution deadline, and access address. One of the request queues includes at least one task request, among the multiple task requests, that has the same magneto-electric disk identifier and belongs to the same priority. The cache is used to temporarily store the multiple request queues. The controller is also used to execute the task scheduling method provided in the first aspect or any optional implementation of the first aspect, based on the cache state and the multiple request queues.
[0032] Fifthly, this application provides a data access system. The data access system includes a client and a storage system provided in the fourth aspect. The client is used to send multiple task requests to the storage system, and the storage system is used to execute the task scheduling method provided in the first aspect or any optional implementation or scenario of the first aspect according to the multiple task requests.
[0033] Sixthly, this application provides a computer program product. When the computer program product runs on an electronic device, the electronic device executes the task scheduling method provided by any of the optional implementations or scenarios of the first or second aspect.
[0034] In a seventh aspect, this application provides a chip. The chip includes a communication interface and a processor. The communication interface is used to acquire multiple task requests, and the processor and the communication interface are used to collaboratively execute the task scheduling method provided by any of the optional implementations or scenarios in the first or second aspect.
[0035] For example, the chip is a controller in a storage system that can manage multiple magneto-electric disks and other possible memories within the storage system.
[0036] As another example, the chip is the processor in a magneto-electric disk, which is capable of managing the magnetic tape in the magneto-electric disk.
[0037] Eighthly, this application provides a computer-readable storage medium. The computer-readable storage medium includes computer instructions. When the computer instructions are executed in an electronic device, the electronic device implements the operational steps of the method provided in the first aspect or any optional implementation of the first aspect. For example, the electronic device may be the magneto-electric disk provided in the third aspect, the storage system provided in the fourth aspect, the data access system provided in the fifth aspect, or the chip provided in the seventh aspect.
[0038] For the beneficial effects of the third through eighth aspects, refer to the description of any optional implementation or scenario of the first or second aspect; they are not repeated here. The implementations provided in the above aspects can be further combined to provide more implementations.
Description of the Drawings
[0039] Figure 1 is a schematic structural diagram of a data access system provided in this application.
[0040] Figure 2 is a schematic structural diagram of a magneto-electric disk provided in this application.
[0041] Figure 3 is a schematic structural diagram of a reel 201 and a magnetic tape 210 provided in this application.
[0042] Figure 4 is a software architecture diagram for task scheduling provided in this application.
[0043] Figure 5 is a first flowchart of a task scheduling method provided in this application.
[0044] Figure 6 is a second flowchart of a task scheduling method provided in this application.
[0045] Figure 7 is a third flowchart of a task scheduling method provided in this application.
[0046] Figure 8 is a fourth flowchart of a task scheduling method provided in this application.
[0047] Figure 9 is a fifth flowchart of a task scheduling method provided in this application.
[0048] Figure 10 is a sixth flowchart of a task scheduling method provided in this application.
[0049] Figure 11 is a flowchart of a data read/write task provided in this application.
[0050] Figure 12 is a seventh flowchart of a task scheduling method provided in an embodiment of this application.
[0051] Figure 13 is a schematic structural diagram of a chip provided in this application.
Detailed Implementation
[0052] This application provides a task scheduling method in which a controller combines multiple task requests based on their priority and characteristics to obtain multiple request queues. Since task requests within the same request queue have the same priority, the task scheduling process ensures the priority of different task requests, meaning the storage system can batch-process task requests with the same SLA. Furthermore, task requests within the same request queue have the same disk identifier. Thus, during the execution of all task requests in the queue, the controller does not need to switch the target disk to be operated on, reducing disk switching latency and ensuring the continuity of task execution in the storage system. This improves the timeliness of task scheduling and increases the bandwidth of the storage system.
[0053] Specifically, this task scheduling method is applied to a storage system that includes a controller, a cache, and multiple disks. The following description uses execution of the task scheduling method by the controller as an example. The controller acquires multiple task requests, where the task indicated by a task request is any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task. The controller obtains multiple request queues based on the priority and characteristics of the multiple task requests, where the characteristics indicate one or a combination of the following: disk identifier, execution deadline, and access address. One of the request queues includes at least one task request that has the same disk identifier and belongs to the same priority. Furthermore, the controller determines the scheduling order of the multiple request queues based on the cache state and executes the tasks indicated by the multiple request queues according to the scheduling order.
[0054] The technical solutions involved in this application may be applied not only to current magnetic tape technology or storage devices, but also to future magnetic tape technology or storage devices, or to storage systems including magneto-electric disks (MEDs) or storage devices, such as magnetic tape systems or magnetic tape libraries. The terminology used in the embodiments section of this application is only for explaining specific embodiments of this application and is not intended to limit this application. A brief introduction to some concepts that may be involved in this application is provided below.
[0055] Storage medium: A storage material used to record sound, images, digital signals, or other signals. This storage material may include, but is not limited to, magnetic tape, such as a tape-shaped material with a magnetic layer used to record sound, images, digital signals, or other signals. Magnetic tape contains a magnetic medium, such as magnetic powder, for storing data. For example, changes in the magnetic field in this magnetic medium are typically achieved by coating a plastic film substrate (support or backing) with a layer of granular magnetic material or by evaporating and depositing a layer of magnetic oxide or alloy film. The substrate of magnetic tape may include, but is not limited to, paper, celluloid, or polyester film.
[0056] Magnetic head: A component that reads and writes data on magnetic tape using magnetic principles. It is divided into write heads and read heads. Write heads record data by magnetizing the magnetic medium (such as magnetic powder), while read heads read data from the magnetic medium by sensing its magnetic field.
[0057] In this document, the terms "first," "second," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Therefore, a feature defined with "first," "second," etc., may explicitly or implicitly include one or more of that feature. In the description of this application, unless otherwise stated, "a plurality of" means two or more.
[0058] Furthermore, in this application, directional terms such as "upper" and "lower" are defined relative to the orientation of the components shown in the accompanying drawings. It should be understood that these directional terms are relative concepts, used for relative description and clarification, and can change accordingly depending on the orientation of the components in the accompanying drawings.
[0059] The application scenarios of the embodiments of this application are described below with reference to the accompanying drawings. Figure 1 is a schematic diagram of a data access system provided in this application. The data access system includes a data access device 100 and a storage system 120. In the application scenario shown in Figure 1, users access data through applications. The computer running these applications can be referred to as a "computing device".
[0060] Data access device 100 can be a physical machine, a virtual machine, or a container. The physical machine can include, but is not limited to, one or both of a client and a smart NIC. For example, data access device 100 includes a client, such as a host, desktop computer, server, laptop, or mobile device. As another example, data access device 100 includes a smart network interface card (smart NIC). The smart NIC, also known as a smart network adapter, not only performs the network transmission functions of a standard NIC but also provides a built-in programmable and configurable hardware acceleration engine. This improves application performance and significantly reduces CPU consumption in the host connected to the smart NIC, leaving more CPU resources for applications. For example, in a highly virtualized environment, the host CPU needs to run open virtual switch (OVS) related tasks. At the same time, the host CPU also needs to handle storage, online or offline encryption / decryption of data packets, deep packet inspection, firewalls, complex routing, and other operations. These operations not only consume significant CPU resources but also, because different services compete for CPU resources, prevent the services from achieving optimal performance. As a hub connecting various services, the smart NIC accelerates these services.
[0061] In one possible example, data access device 100 accesses storage system 120 via a network to access data; for example, the network may include switch 110.
[0062] In another possible example, the data access device 100 may also communicate with the storage system 120 via a wired connection, such as a Universal Serial Bus (USB) or a Peripheral Component Interconnect Express (PCIe) bus.
[0063] The storage system 120 shown in Figure 1 can be a centralized storage system. A key feature of a centralized storage system is a unified entry point through which all data from external devices passes; this entry point is the engine 121 of the centralized storage system. The engine 121 has management functions, and many advanced functions of the storage system are implemented within it.
[0064] As shown in Figure 1, engine 121 may have one or more controllers 1. Figure 1 illustrates the case in which the engine 121 contains one controller 1. In one possible example, if the engine 121 has multiple controllers 1, any two controllers 1 can have a mirror channel, enabling any two controllers 1 to serve as backups for each other, thereby preventing hardware failures from causing the entire storage system 120 to become unavailable. It should be understood that if the engine 121 includes multiple controllers 1, then the engine 121 can also be referred to as the array controller of the storage system 120.
[0065] Engine 121 also includes a front-end interface 1211 and a back-end interface 1214. The front-end interface 1211 is used to communicate with the data access device 100 to provide data access services to the data access device 100. The back-end interface 1214 is used to communicate with hard drives to expand the capacity of the storage system 120. Through the back-end interface 1214, engine 121 can connect to more hard drives, thereby forming a very large storage resource pool.
[0066] In terms of hardware, as shown in Figure 1, controller 1 includes at least processor 1212 and memory 1213. Processor 1212 is a central processing unit (CPU) used to process data access requests from outside the storage system 120 (from a server or another storage system), and also to process requests generated internally by the storage system 120. For example, when processor 1212 receives write data requests (write requests) sent by data access device 100 through front-end interface 1211, it temporarily stores the data in these write requests in memory 1213. When the total amount of data in memory 1213 reaches a certain threshold, processor 1212 sends the data stored in memory 1213, through a back-end port, to at least one of the following hard drives for persistent storage: mechanical hard drive 1221, solid-state drive (SSD) 1222, magnetoelectric disk 200, or other hard drive 1224.
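As an illustrative, non-limiting sketch, the threshold-triggered write path described above could be modeled as follows. The class name `WriteBuffer`, the `threshold_bytes` parameter, and the `persist` callback are hypothetical names introduced only for this example; this application does not specify the threshold value or the back-end write interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WriteBuffer:
    """Stages write payloads in memory and flushes them once a threshold is reached."""
    threshold_bytes: int                      # hypothetical flush threshold
    persist: Callable[[List[bytes]], None]    # hypothetical back-end write hook
    _pending: List[bytes] = field(default_factory=list)
    _pending_bytes: int = 0

    def write(self, payload: bytes) -> bool:
        """Stage one write request; return True if this call triggered a flush."""
        self._pending.append(payload)
        self._pending_bytes += len(payload)
        if self._pending_bytes >= self.threshold_bytes:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Send all staged data to persistent storage and empty the buffer."""
        if self._pending:
            self.persist(self._pending)
            self._pending = []
            self._pending_bytes = 0


# Usage: an in-memory list stands in for the hard drives behind the back-end port.
persisted: List[List[bytes]] = []
buf = WriteBuffer(threshold_bytes=8, persist=persisted.append)
flushed_first = buf.write(b"abcd")    # 4 bytes staged, below the threshold
flushed_second = buf.write(b"efgh")   # 8 bytes staged, threshold reached
```

In this sketch the second write brings the staged total to the threshold, so both payloads are persisted together in one batch.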
[0067] Memory 1213 refers to internal memory that directly exchanges data with the processor. It can read and write data at any time and at high speed, serving as temporary data storage for the operating system or other running programs. Memory includes at least two types: random access memory (RAM) and read-only memory (ROM). For example, RAM can be dynamic random access memory (DRAM) or storage class memory (SCM). DRAM is a semiconductor memory and, like most RAM, is a volatile memory device. However, DRAM and SCM are merely illustrative examples in this embodiment; memory can also include other types of RAM, such as static random access memory (SRAM). Examples of read-only memory include programmable read-only memory (PROM) and erasable programmable read-only memory (EPROM). Additionally, memory 1213 can also be a dual in-line memory module (DIMM), i.e., a module composed of DRAM, or an SSD. In practical applications, controller 1 can be configured with multiple memories 1213, including memories 1213 of different types. This embodiment does not limit the number or type of memories 1213. Furthermore, memory 1213 can be configured to have a power-loss protection function, meaning that when the system loses power and then regains power, the data stored in memory 1213 is not lost. Memory with this function is called non-volatile memory. Memory 1213 stores software programs, and processor 1212 can run the software programs in memory 1213 to manage the hard disks. For example, the hard disks can be abstracted as a storage resource pool, and the storage resource pool can be provided to the server in the form of logical unit numbers (LUNs). Here, a LUN is what the server sees as a hard disk. Of course, some centralized storage systems are also file servers themselves, and can provide shared file services to the server.
[0068] As shown in Figure 1, in this system, engine 121 may not have a hard drive slot; the hard drives need to be placed in disk enclosure 122, and the back-end interface 1214 communicates with disk enclosure 122. The back-end interface 1214 exists in the form of an adapter card within engine 121, and two or more back-end interfaces 1214 can be used simultaneously on one engine 121 to connect multiple disk enclosures. Alternatively, the adapter card can be integrated onto the motherboard, in which case the adapter card can communicate with processor 1212 via the PCIe bus.
[0069] It should be noted that only one engine 121 is shown in Figure 1. However, in actual applications, the storage system may contain two or more engines 121, and redundancy or load balancing may be performed between multiple engines 121.
[0070] The disk enclosure 122 includes a control unit 1225 and several hard drives. The control unit 1225 can take various forms. In one case, the disk enclosure 122 is a smart disk enclosure. As shown in Figure 1, the control unit 1225 includes a CPU and memory. The CPU is used to perform operations such as address translation and reading / writing data. The memory is used to temporarily store data to be written to the hard disk, or data read from the hard disk that is to be sent to the controller 1. Alternatively, the control unit 1225 may be a programmable electronic component, such as a data processing unit (DPU). A DPU has the versatility and programmability of a CPU, but is more specialized, capable of efficiently operating on network packets, storage requests, or analysis requests. A DPU differs from a CPU in its high degree of parallelism (the ability to handle a large number of requests). Optionally, the DPU can be replaced by a graphics processing unit (GPU), an embedded neural network processing unit (NPU), or other processing chips. Typically, there may be one, two, or more control units 1225. The functions of the control unit 1225 can also be offloaded to the network interface card (NIC) 1226. In other words, in this case, the disk enclosure 122 does not contain a control unit 1225; instead, the NIC 1226 performs data reading / writing, address translation, and other computational functions. In this case, network interface card 1226 is a smart network interface card. It can include a CPU and memory. The CPU performs address translation and read / write operations, while the memory temporarily stores data to be written to the hard drive, or data read from the hard drive that is to be sent to controller 1. Network interface card 1226 can also include a programmable electronic component, such as a DPU. There is no ownership relationship between network interface card 1226 and the hard drives in disk enclosure 122; network interface card 1226 can access any hard drive in disk enclosure 122 (e.g., the mechanical hard drive 1221, solid-state drive 1222, magnetoelectric disk 200, and other hard drives 1224 shown in Figure 1), which makes it easier to add hard drives when storage space is insufficient.
[0071] In this embodiment, the magnetoelectric disk 200 refers to a memory including a magnetic tape medium. In hardware implementation, the magnetoelectric disk may include, but is not limited to, a magnetic head, a magnetic tape, and a magnetic tape driver. The magnetic tape driver can be used to drive the magnetic tape for winding, and the magnetic head can access the magnetic tape during winding, such as writing data to or reading data from the magnetic tape. For specific implementation details of the magnetoelectric disk, refer to the embodiments shown in Figures 2 to 5, which are not described in detail here.
[0072] Depending on the type of communication protocol between engine 121 and disk enclosure 122, disk enclosure 122 may be a serially attached small computer system interface (SAS) disk enclosure, an NVMe (Non-Volatile Memory Express) disk enclosure, or other types of disk enclosures. SAS disk enclosures use the SAS 3.0 protocol, and each enclosure supports 25 SAS hard drives. Engine 121 connects to disk enclosure 122 via an onboard SAS interface or a SAS interface module. NVMe disk enclosures function more like a complete computer system, with NVMe hard drives inserted into them. The NVMe disk enclosure then connects to engine 121 via an RDMA port. In some cases, engine 121 may also be referred to as a hard drive management device or storage controller.
[0073] In terms of hardware implementation, the disk enclosure 122 can be installed in the storage system, or the disk enclosure 122 can be encapsulated and set up independently. When the disk enclosure 122 exists independently, it can also be called a storage device or storage system, such as a magnetoelectric storage system, etc. This application does not limit it in this way.
[0074] In this embodiment, a driver is deployed in the controller 1 or control unit 1225. This driver manages the various hard drives connected to the controller 1 or control unit 1225. In data plane communication, the driver provides an I / O channel to manage the data streams of different hard drives. This I / O channel is a processor capable of handling I / O requests and executing channel programs to control I / O operations. At any given time, the driver can only mount one hard drive through this I / O channel. "Mounting" here means that, at the management and access level of the hard drive, the driver connected to the I / O channel uses a file path to access the storage space provided by the hard drive; correspondingly, the other end of the driver is connected to the operating system, which accesses the hard drive connected to the I / O channel through this file path. For example, the driver in controller 1 assigns a drive letter to the hard drive and operates on the storage space provided by the hard drive based on this drive letter.
[0075] In one alternative implementation, the storage system 120 is a centralized storage system integrating disk and controller. The storage system 120 does not have the aforementioned disk enclosure 122, and the engine 121 manages multiple hard drives connected via hard drive bays. The functionality of the hard drive bays can be implemented by the backend interface 1214.
[0076] In some alternative implementations, storage system 120 is a distributed storage system. The distributed storage system includes a cluster of compute nodes and a cluster of storage nodes. The compute node cluster includes one or more compute nodes that can communicate with each other. Compute nodes can be servers, desktop computers, or controllers of storage arrays, etc. Hardware-wise, compute nodes can include processors, memory, and network interface cards (NICs), etc. The processor is a CPU used to process data access requests from outside the compute node or requests generated internally within the compute node. For example, when the processor receives a write request from a user, it temporarily stores the data in the write request in memory. When the total amount of data in memory reaches a certain threshold, the processor sends the data stored in memory to the storage node for persistent storage. In addition, the processor is also used for data computation or processing, such as metadata management, deduplication, data compression, virtualization of storage space, and address translation. In the embodiments provided in this application, the storage node can be a magnetic disk or other types of hard disks, etc. It is understood that the storage system described in the embodiments of this application can be a distributed storage system integrating storage and computing, or a distributed storage system with separate storage and computing; this application does not limit this.
[0077] For example, a distributed storage system can be implemented using network attached storage (NAS) technology. NAS refers to a network storage architecture that provides storage resources through file-level data access and sharing over an Internet Protocol (IP) network. In a NAS scenario, the NAS is an external device for the server / host, used to provide file-level storage space for the server / host in the distributed storage system.
[0078] It is worth noting that the above examples are merely possible implementations of the data access system provided in this embodiment and should not be construed as limiting this application. For example, in the storage system 120 shown in Figure 1, data is stored as files on various hard drives. The files stored on each hard drive constitute the file storage system, which can be, for example, a distributed file system, such as a network file system (NFS). NFS is both a distributed file system and a network protocol used for accessing and sharing files between devices on the same local area network. For example, a NAS system can be implemented using the NFS protocol. A network file system is a low-cost network file-sharing option that allows users and applications to access, store, and update files on remote computers, just like using direct-attached storage. Network file systems use the Remote Procedure Call (RPC) protocol to route requests between clients and servers. While participating devices need to support network file systems, they do not need to know the details of the network. It is worth noting that RPC can be insecure, so network file systems should only be deployed on trusted networks behind firewalls. Although Windows supports this protocol, it is primarily used in Linux environments.
[0079] Regarding the aforementioned magnetoelectric disk 200, this application provides an optional example, as shown in Figure 2. Figure 2 is a schematic diagram of a magnetoelectric disk provided in this application. The magnetoelectric disk 200 can be used to realize the functions of the magnetoelectric disk 200 described above. In this document, the magnetoelectric disk may also be referred to as a magnetic tape media storage device, magnetic tape drive, magnetic tape all-in-one device, integrated magnetic tape disk, integrated magnetic tape drive, or magnetic tape drive equipment, etc., and this application does not limit this.
[0080] The magnetoelectric disk 200 is described below by way of example with reference to Figure 2. It includes: magnetic tape 210, magnetic tape drive 220, magnetic head 230, reel 201, roller 202, base 203, processor 240, and cache 241.
[0081] The reel 201 and the base 203 are rotatably connected, and the magnetic tape 210 is wound onto the reel 201.
[0082] The structural relationship between the reel 201 and the magnetic tape 210 is described below by way of example with reference to Figure 3, which is a schematic diagram of the structure of a reel 201 and a magnetic tape 210 provided in this application. Referring to Figure 3, the reel 201 includes a reel 2013, a first cover plate 2011, and a second cover plate 2012. The reel 2013 is rotatably connected to the base 203 shown in Figure 2. The magnetic tape 210 is located between the first cover plate 2011 and the second cover plate 2012. The first cover plate 2011 and the second cover plate 2012 can constrain the magnetic tape 210 and prevent the magnetic tape 210 from detaching from the reel 2013. During the rotation of the reel 2013, the first cover plate 2011 and the second cover plate 2012 rotate synchronously.
[0083] The first cover plate 2011 can be the circular plate-like structure shown in Figure 3, and the second cover plate 2012 can likewise be the circular plate-like structure shown in Figure 3.
[0084] The embodiments of this application do not limit the shape of the first cover plate 2011 and the second cover plate 2012. For example, the first cover plate 2011 can be a circular, square, elliptical, or irregularly shaped plate. Similarly, the second cover plate 2012 can be a circular, square, elliptical, or irregularly shaped plate. The shape of the first cover plate 2011 can be the same as or different from the shape of the second cover plate 2012.
[0085] For example, the connection between the first cover plate 2011 and the reel 2013 can be achieved by welding, snap-fitting, or bonding. Similarly, the connection between the second cover plate 2012 and the reel 2013 can be achieved by welding, snap-fitting, or bonding.
[0086] Referring again to Figure 2, the magnetoelectric disk 200 includes two reels 201. The first end of the magnetic tape 210 is wound on one reel 201, and the second end of the magnetic tape 210 is wound on the other reel 201.
[0087] While the magnetic tape 210 is being wound, the roller 202 in the magnetoelectric disk 200 can be used to support the tape body of the magnetic tape 210 in order to prevent the magnetic head 230 from tearing the tape. This reduces the friction between the magnetic tape 210 and the magnetic head 230 during winding, which helps extend the service life of the magnetic tape 210.
[0088] As can be seen from the embodiments shown in Figure 2 and Figure 3, the magnetic tape 210 is used to store data, and the magnetic tape drive 220 is used to drive the magnetic tape 210 to wind.
[0089] In terms of hardware implementation, magnetic tape 210 may include one or more data bands. A data band is a region of data tracks on magnetic tape 210, and different data bands are separated and positioned by servo bands. Multiple data bands are arranged side by side and run along the length of magnetic tape 210. Each data band contains multiple wraps. "Wrap" is a magnetic tape term that refers to one pass of the head over a data band from one end of magnetic tape 210 to the other. Each wrap includes one or more tracks, each of which is accessed by a read head or write head. The number and size of data bands in magnetic tape 210 depend on the generation and capacity of the tape.
[0090] The magnetic head 230 in the magnetoelectric disk 200 accesses the magnetic tape 210 while the tape is being wound. The processor 240 controls, according to the I / O stream, the speed at which the tape drive 220 drives the magnetic tape 210, and controls the magnetic head 230 to slide so as to access the target tape area in the magnetic tape 210.
[0091] In this embodiment, the processor 240 is a CPU used to process data access requests (such as I / O requests) or task requests from outside the magneto-electric disk 200 (servers or other storage systems), and also to process requests generated internally by the magneto-electric disk 200. For example, when the processor 240 receives write data requests (write requests) sent by a data access device or host through a front-end interface, it temporarily stores the data in these write data requests in a cache 241. When the total amount of data in the cache 241 reaches a certain threshold, the processor 240 stores the data stored in the cache 241 to the magnetic tape 210 for persistent storage through a back-end port.
[0092] In the magnetoelectric disk 200 shown in Figure 2, the processor 240 and cache 241 are configured independently. However, in some optional cases, the cache 241 is integrated into the processor 240 and connected via a PCIe bus, a unified bus (Ubus or UB), or a compute express link (CXL) bus, etc., which is not limited in this application.
[0093] Referring again to Figure 2, as an optional implementation, the tape drive 220 includes a tape reel motor and a voice coil motor (VCM).
[0094] The tape reel motor is used to drive the magnetic tape 210 to wind along its length. For example, the tape reel motor can be used to drive a drum, causing the magnetic tape wound on the drum to rewind in a first direction, rewind in a second direction, or stop rewinding. The first direction and the second direction are two opposite directions along the length of the magnetic tape.
[0095] The VCM is used to drive the magnetic tape 210 to move along the width of the tape 210, so that the magnetic head 230 can access different tracks in the tape 210. The VCM is a direct-drive motor whose working principle is as follows: a current-carrying coil placed in a magnetic field experiences a force whose magnitude is proportional to the current applied to the coil. Based on this principle, the movement of the VCM can be linear or circular.
[0096] Optionally, the tape drive 220 may also include a stepper motor for fine-tuning the winding position or speed of the tape 210 along its length. This stepper motor is a type of electric motor that converts electrical pulse signals into corresponding angular or linear displacements. For each input pulse signal, the rotor rotates by an angle or moves forward one step; the output angular or linear displacement is proportional to the number of input pulses, and the rotational speed is proportional to the pulse frequency. Therefore, a stepper motor is also called a pulse motor.
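As an illustrative sketch of the two proportionality relations stated above, the following helper functions compute the angular displacement from a pulse count and the rotational speed from a pulse frequency. The function names and the 1.8-degree step angle used in the usage lines are assumptions chosen for the example; this application does not specify the step angle of the stepper motor.

```python
def stepper_displacement_deg(pulses: int, step_angle_deg: float) -> float:
    """Angular displacement is proportional to the number of input pulses."""
    return pulses * step_angle_deg


def stepper_speed_rpm(pulse_hz: float, step_angle_deg: float) -> float:
    """Rotational speed is proportional to the pulse frequency."""
    degrees_per_second = pulse_hz * step_angle_deg
    return degrees_per_second / 360.0 * 60.0


# Usage with an assumed step angle of 1.8 degrees (200 steps per revolution).
one_turn_deg = stepper_displacement_deg(pulses=200, step_angle_deg=1.8)
speed_rpm = stepper_speed_rpm(pulse_hz=1000.0, step_angle_deg=1.8)
```

Under the assumed step angle, 200 pulses correspond to about one full revolution, and doubling the pulse frequency doubles the computed speed.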
[0097] It is worth noting that the implementations of the tape drive 220 described above are merely examples provided in the embodiments of this application and should not be construed as limiting the application. The tape drive 220 may also include devices such as linear motors, hydraulic cylinders, or pneumatic cylinders, which are not limited in this application.
[0098] As an optional implementation, the magnetic head 230 may include one or both of a write head and a read head. The write head records data by magnetizing and changing the magnetic field of the magnetic medium (such as magnetic powder), while the read head reads data from the magnetic medium by sensing its magnetic field.
[0099] In some alternative configurations, the magnetic head 230 may also include a servo head, which may be a write servo head or a read servo head. Taking the read servo head as an example, the read servo head can determine the position information of the tape 210 based on the address in the I / O request, and the tape drive 220 can wind the tape 210 from its current position to the target tape area indicated by the position information, so that the read head can read the data stored in the target tape area.
[0100] Optionally, the magneto-electric disk 200 may also deploy an application (APP) and a driver. The application can be used to obtain data access requests (such as read requests or write requests) or send access responses to the host, such as write responses or read responses. For example, after the application triggers a read or write operation, the IO data stream is sent to the firmware corresponding to the magnetic tape 210 through the driver. The firmware then issues instructions to control the motor to drive the tape body of the magnetic tape 210 to perform linear addressing. After reaching the desired tape position, the read / write operation is realized by the read / write head through the ADC / DAC channel for encoding and decoding.
[0101] The task scheduling framework provided in the embodiments of this application is described below by way of example with reference to Figure 4, which is a software architecture diagram for task scheduling provided in this application. Referring to Figure 4, the software architecture for task scheduling includes: an information acquisition module 410, an information processing module 420, a queuing score calculation module 430, a task scheduling module 440, and a task execution module 450. In some optional scenarios, the software architecture for task scheduling shown in Figure 4 is also called the in-disk system scheduling framework. The disk here can refer to the aforementioned disk enclosure 122, the magnetoelectric disk 200, or a storage system containing multiple magnetoelectric disks (such as the aforementioned storage system 120). This application does not limit this.
[0102] The functions of each module in the task scheduling software architecture are illustrated below with reference to Figure 4.
[0103] The information acquisition module 410 is used to acquire information about the storage system 120, including but not limited to one or both of the hardware status and the software status of the storage system 120. The hardware status includes, for example, the storage capacity of the magnetic tape in the magnetoelectric disk 200, the parameters of the motor, and the parameters of the processor 240 in the storage system 120. The software status includes the status of the storage software deployed in the storage system 120, such as read tasks, write tasks, or other states such as data reconstruction or garbage collection.
[0104] The information processing module 420 is used to process the information acquired by the information acquisition module 410 and apply the processed information to the queuing score calculation module 430 or the task scheduling module 440.
[0105] The queuing score calculation module 430 comprises two parts: one part classifies and combines the task requests issued by the storage system 120 to obtain multiple request queues; the other part uses a queuing model to evaluate the expected scheduling benefits of the multiple request queues and obtain the queuing scores of different request queues. In this embodiment, multiple task requests can be cached in a user request sequence, which is used to temporarily store the task requests to be processed.
[0106] For example, the classification and combination method includes: the queuing score calculation module 430 initially classifies multiple task requests according to their type (or task type) and priority (such as priority L0, priority L1, priority Ln), and then classifies and combines different task requests within the same priority to obtain request queues belonging to the same priority, such as request queues T00, T01...T0x belonging to priority L0, request queues T10...T1y belonging to priority L1, and request queues Tn0, Tn1...Tnz belonging to priority Ln.
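As an illustrative, non-limiting sketch, the classification and combination step could be expressed as grouping task requests by the pair (priority, disk identifier), so that every request queue holds requests of one priority that target one magnetoelectric disk. The names `TaskRequest` and `build_request_queues`, the field layout, and the use of integers for priority levels L0 to Ln are assumptions made for this example only.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TaskRequest:
    request_id: int
    task_type: str   # e.g. "read", "write", "gc" (hypothetical labels)
    priority: int    # index of the priority level, e.g. 0 for L0, 1 for L1
    disk_id: str     # identifier of the target magnetoelectric disk


def build_request_queues(
    requests: List[TaskRequest],
) -> Dict[Tuple[int, str], List[TaskRequest]]:
    """Group requests so that each queue shares one priority and one disk identifier."""
    queues: Dict[Tuple[int, str], List[TaskRequest]] = defaultdict(list)
    for req in requests:
        # Arrival order is preserved inside each queue.
        queues[(req.priority, req.disk_id)].append(req)
    return dict(queues)


requests = [
    TaskRequest(1, "read", 0, "MED1"),
    TaskRequest(2, "write", 1, "MED2"),
    TaskRequest(3, "read", 0, "MED1"),
    TaskRequest(4, "gc", 1, "MED1"),
]
queues = build_request_queues(requests)
```

Here requests 1 and 3 land in the same queue because they share priority L0 and disk MED1, while requests 2 and 4 form separate queues because their disk identifiers differ.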
[0107] The task scheduling module 440 is used to sort multiple request queues according to information such as cache status, priority of multiple request queues, and queuing score, to obtain the scheduling order of multiple request queues. For example, the task scheduling module combines queuing theory, queuing score, and priority rules to sort multiple request queues.
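As an illustrative sketch, one possible ordering rule sorts the request queues first by priority and then by queuing score. This sketch assumes that a lower priority index (L0) is scheduled first and that a higher queuing score indicates a larger expected scheduling benefit; it leaves out the cache state, because how the cache state enters the ordering is not specified at this point in the description. The names `RequestQueue` and `scheduling_order` and the score values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestQueue:
    name: str
    priority: int          # assumption: lower index (L0) is scheduled first
    disk_id: str
    queuing_score: float   # assumption: higher score means larger expected benefit


def scheduling_order(queues: List[RequestQueue]) -> List[RequestQueue]:
    """Sort by priority first, then by descending queuing score within a priority."""
    return sorted(queues, key=lambda q: (q.priority, -q.queuing_score))


candidates = [
    RequestQueue("Tn1", priority=2, disk_id="MED2", queuing_score=0.4),
    RequestQueue("T10", priority=1, disk_id="MED2", queuing_score=0.9),
    RequestQueue("Tn0", priority=2, disk_id="MED1", queuing_score=0.7),
    RequestQueue("T00", priority=0, disk_id="MED1", queuing_score=0.1),
]
ordered_names = [q.name for q in scheduling_order(candidates)]
```

A tuple sort key keeps the rule easy to extend: an additional criterion, such as a cache-derived term, could be added as another tuple element once its definition is fixed.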
[0108] The task execution module 450 is used to execute task requests in multiple request queues sequentially according to the scheduling order determined by the task scheduling module 440. For example, the task execution module 450 executes multiple request queues, including: the magnetoelectric disk 2 (MED2) executes request queue T10 first, and after T10 is executed, it executes request queue Tn1; the magnetoelectric disk 1 (MED1) executes request queue Tn0.
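As an illustrative sketch of the example above, a global scheduling order can be split into one sequential plan per magnetoelectric disk, so that each disk executes its own request queues in order. The function name `per_disk_plan` and the representation of a queue as a (queue name, disk identifier) pair are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def per_disk_plan(ordered_queues: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Split a global scheduling order into one sequential plan per disk."""
    plan: Dict[str, List[str]] = {}
    for queue_name, disk_id in ordered_queues:
        # Queues bound to the same disk keep their relative scheduling order.
        plan.setdefault(disk_id, []).append(queue_name)
    return plan


# Scheduling order from the example: T10 and Tn1 target MED2, Tn0 targets MED1.
order = [("T10", "MED2"), ("Tn0", "MED1"), ("Tn1", "MED2")]
plan = per_disk_plan(order)
```

In this sketch MED2 receives T10 followed by Tn1, and MED1 receives Tn0, which matches the execution sequence given in the example.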
[0109] It is worth noting that the functions of each module in the above software architecture are merely optional examples provided in the embodiments of this application and should not be construed as limiting this application. The task scheduling method provided in the embodiments of this application is described below by way of example, taking its application to the storage system 120 as an example.
[0110] Figure 5 is a flowchart of a task scheduling method provided in this application. This task scheduling method is executed by controller 1 or control unit 1225 in the storage system 120 shown in Figure 1. For details regarding the hardware implementation of storage system 120, controller 1, and control unit 1225, refer to the description of Figure 1, which is not repeated here.
[0111] In the embodiment shown in Figure 5, cache 123 is connected to the controller and to each magnetoelectric disk. For the hardware implementation of magnetoelectric disks 1 to k, refer to the description of magnetoelectric disk 200 above, which is not repeated here. In this embodiment, cache 123 temporarily stores various task requests. In terms of hardware implementation, cache 123 can be located within the controller or outside of it, such as in the aforementioned disk enclosure 122; this application does not limit this. The storage medium used by cache 123 can include, but is not limited to, one or more of the following: SSD, DRAM, SRAM, or other types of storage media.
[0112] The following describes S510 to S550 by taking the task scheduling method provided in the embodiments of this application as an example.
[0113] S510, the controller receives multiple task requests.
[0114] For example, the multiple task requests include task request 1 (circular pattern 1 shown in Figure 5), task request 2 (circular pattern 2 shown in Figure 5), task request 3 (circular pattern 3 shown in Figure 5), task request 4 (circular pattern 4 shown in Figure 5), task request 5 (circular pattern 5 shown in Figure 5), and so on.
[0115] The task indicated by the task request includes any of the following: data reading, data writing, scheduled data reading, data reconstruction, garbage collection, or user operation tasks.
[0116] Data writing refers to the controller writing data to a storage medium (such as a cache in a magnetic tape or magneto disk).
[0117] Data read refers to the controller reading data from a storage medium (such as a cache in a magnetic tape or magneto disk).
[0118] Scheduled data read refers to the controller reading data from the storage medium before a specified time.
[0119] Data reconstruction refers to the controller integrating data from different sources into a unified data model for data analysis and computation.
[0120] Garbage collection (GC) refers to the process where, when new data cannot be written to the storage area corresponding to garbage data in the storage medium, the controller needs to migrate the valid data from the storage block or page containing that garbage data to increase the remaining usable storage capacity in the storage medium. Garbage data refers to data that will not be read after GC, while valid data refers to data that will be read or used by processes or access devices (such as compute nodes, hosts or clients, and controllers) after GC.
[0121] User operation tasks refer to: operating the magnetic disks in the storage system 120 according to user needs. This includes adding magnetic disks, removing magnetic disks, retrieving magnetic tapes from magnetic disks, importing magnetic tapes into magnetic disks, or other user operations.
[0122] Optionally, the user operation task includes one or both of the following: tape import task or tape export task.
[0123] The tape import task includes importing a first tape into a first magneto-electric disk among multiple magneto-electric disks. For example, if the tape import task is named an "import" task, the first magneto-electric disk is magneto disk 1 shown in Figure 5, and the first tape is a tape newly added to magneto disk 1.
[0124] The tape export task includes exporting a second tape from a second magnetodisk among multiple magnetodisks. For example, if the tape export task is named an "export" task, the second magnetodisk is magneto disk 2 shown in Figure 5, and the second tape is a tape in magneto disk 2.
[0125] In this embodiment, the controller can schedule not only tasks related to data access services but also operational services such as tape import and tape export. The execution continuity of tasks in the storage system is thereby guaranteed while the SLA is met.
[0126] The tasks described above are merely optional examples of task requests provided in the embodiments of this application and should not be construed as limiting this application. In some optional implementations, the tasks described above may also refer to data migration, which means that the controller moves data from one storage area to another storage area. These two storage areas may be located on the same storage medium or on different storage media.
[0127] S520: The controller obtains multiple request queues based on the priority and characteristics of multiple task requests.
[0128] The priority of a task request is determined by the SLA (Service Level Agreement) of the user who generated it. In storage systems, different priorities may have different processing orders. For example, the controller may process higher-priority task requests first, followed by lower-priority ones. In some alternative methods, the task request priority refers to the priority of the task indicated by the request.
[0129] The task request features indicate one or more of the following combinations: magneto disk identification (MED ID), execution deadline, and access address.
[0130] A magneto disk identifier refers to a label used to mark a magneto disk. For example, this magneto disk identifier is used to indicate the magneto disk object to be operated on in a task request.
[0131] The deadline for execution refers to the latest time that the task requested by the task request can be completed. In other words, the controller must complete the execution of the task before the deadline expires.
[0132] The access address in the task request characteristics is either a physical storage address or a logical storage address.
[0133] In one alternative scenario, the access address is a physical storage address, such as a physical block address (PBA). The PBA indicates the actual location of the storage space represented by the access address within a solid-state storage cell or memory.
[0134] In another alternative scenario, the access address is a logical storage address, such as a logical block address (LBA). The LBA refers to the address used by the storage software in the host to manage different storage spaces within the storage pool.
[0135] The two feasible examples above are merely examples of access addresses provided in the embodiments of this application and should not be construed as limiting this application. In some alternative situations, the characteristics of the task request may also include other information, such as the estimated execution time of each task request, etc., which are not limited in this application.
[0136] It is worth noting that the above features are merely optional examples of task requests provided in the embodiments of this application and should not be construed as limiting this application. In some optional embodiments, the features of the task request may also include other content, such as the data volume of the task indicated by the task request, the service level agreement (SLA) adopted by the task request, and the task type of the task request, etc., which are not limited in this application.
[0137] In this embodiment of the application, one of the multiple request queues mentioned above includes at least one task request that has the same magneto disk identifier and belongs to the same priority among the multiple task requests.
[0138] For example, all task requests in the request queue T10 have a priority of L1 and are identified as magneto disk MED 2, i.e., the second magneto disk (magneto disk 2).
[0139] For example, all task requests in the request queue Tn0 have a priority of Ln and are identified as MED 1, i.e., the first magneto disk (magneto disk 1).
[0140] The two optional examples above are merely optional methods of request queues provided in the embodiments of this application and should not be construed as limiting this application. In the embodiments of this application, multiple request queues include: request queues T00 to T0x with priority L0, request queues T10 to T1y with priority L1, and request queues Tn0 to Tnz with priority Ln.
[0141] As an optional implementation, the process by which the controller obtains multiple request queues can be implemented hierarchically, as shown in Figure 6. Figure 6 is a second flowchart of a task scheduling method provided in this application. The hardware implementation of each component in Figure 6 is described in the foregoing embodiments and is not repeated here.
[0142] Referring to Figure 6, the above S520 includes the following S610 to S630.
[0143] S610, the controller acquires the characteristics of multiple task requests.
[0144] In an optional scenario, the task request field carries the characteristics of the task request, and the controller parses the task request to obtain these characteristics. Please refer to the description of the task request characteristics in S520 above; they will not be repeated here.
[0145] S620: The controller divides multiple task requests into multiple first candidate request queues based on the magneto disk identifier indicated by the feature.
[0146] The first candidate request queue includes one or more task requests with the same magneto disk identifier.
[0147] In Figure 6, the multiple first candidate request queues include: request queues E00 to E0x, request queues E10 to E1y, and request queues En0 to Enz. For example, the magneto disk identifier corresponding to request queue E00 is MED 1.
[0148] For example, after the controller obtains the magneto disk identifier of each task request, it initially groups the different task requests according to the magneto disk object indicated by the magneto disk identifier. This helps to ensure that task requests in the same request queue have the same magneto disk object, thereby reducing the magneto disk switching latency generated during task scheduling and task execution, and improving the efficiency of task scheduling and task execution.
[0149] S630: The controller classifies and combines multiple first candidate request queues according to the priority of multiple task requests to obtain multiple request queues.
[0150] As can be seen from the embodiments provided in S610 to S630, task requests belonging to the same request queue have the same magneto disk identifier. During the execution of all task requests in the request queue, the controller does not need to switch the target magneto disk to be operated, which reduces the magneto disk switching latency and ensures the execution continuity of tasks in the storage system. This is beneficial to improving the timeliness of task scheduling and increasing the bandwidth of the storage system.
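The hierarchical grouping of S610 to S630 can be illustrated with a minimal sketch. It assumes each task request already carries a parsed magneto disk identifier and priority; the `TaskRequest` class, its field names, and the `build_request_queues` helper are hypothetical names chosen for illustration and do not appear in this application.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRequest:
    # Field names are illustrative assumptions, not taken from the application.
    name: str
    med_id: int    # magneto disk identifier (MED ID)
    priority: int  # 0 stands for L0, the highest priority


def build_request_queues(requests):
    """Group task requests by MED ID first (S620), then by priority (S630)."""
    # S620: first candidate request queues, one per magneto disk identifier.
    first_candidates = defaultdict(list)
    for req in requests:
        first_candidates[req.med_id].append(req)

    # S630: split every first candidate queue by priority.
    queues = defaultdict(list)
    for med_id, candidate in first_candidates.items():
        for req in candidate:
            queues[(req.priority, med_id)].append(req)
    return dict(queues)


requests = [
    TaskRequest("t1", med_id=2, priority=1),
    TaskRequest("t2", med_id=1, priority=3),
    TaskRequest("t3", med_id=2, priority=1),
    TaskRequest("t4", med_id=2, priority=0),
]
queues = build_request_queues(requests)
for key in sorted(queues):
    print(key, [r.name for r in queues[key]])
```

Every resulting queue holds requests for a single magneto disk at a single priority, which is the property that lets the controller run a whole queue without switching the target magneto disk.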
[0151] As an optional implementation, the classification and combination by which the controller obtains multiple request queues in the above S630 can be implemented through clustering based on different features, as shown in Figure 7. Figure 7 is a third flowchart of a task scheduling method provided in this application. The hardware implementation of each component in Figure 7 is described in the foregoing embodiments and is not repeated here.
[0152] Referring to Figure 7, the above S630 includes the following S6301 to S6302.
[0153] S6301, The controller divides multiple first candidate request queues into multiple second candidate request queues according to the priority of multiple task requests.
[0154] The second candidate request queue includes M task requests that have the same magneto disk identifier and belong to the same priority.
[0155] In Figure 7, the multiple second candidate request queues include: request queues E00 to E0x, request queues E10 to E1y, and request queues En0 to Enz.
[0156] S6302: The controller uses the execution deadline and access address indicated by the feature as the cluster center to process the multiple second candidate request queues and obtain multiple request queues.
[0157] The specific implementation details regarding the execution deadline and access address can be found in the description of S520 above, and will not be repeated here.
[0158] The first request queue among the multiple request queues includes N task requests that belong to the same priority and have a first magneto disk identifier. For example, if the first request queue is T00, the magneto disk identifier corresponding to T00 is MED 1 and the priority is L0.
[0159] Taking the first request queue (T00) as an example, among the N task requests in the first request queue, the difference between the execution deadlines of any two task requests is less than or equal to a first threshold.
[0160] In one example, the first threshold is a preset value. For example, the first threshold could be 10 seconds, 20 seconds, 100 seconds, or another value.
[0161] In another example, the first threshold is a dynamic value. For instance, this first threshold could be derived from statistics on the execution deadlines of multiple task requests.
[0162] The above two examples are optional methods of the first request queue provided in the embodiments of this application, and should not be construed as limiting this application.
[0163] In this embodiment of the application, the aforementioned first magnetoelectric disk identifier indicates magnetoelectric disk 1 among a plurality of magnetoelectric disks, and the tape areas indicated by the access addresses of the N task requests are located at adjacent positions in the first magnetoelectric disk.
[0164] For example, the proximity of the tape regions indicated by the access addresses of the N task requests in the first magnetodisk means that the first magnetodisk includes the first magnetic tape, and the tape regions indicated by the access addresses of the aforementioned N task requests are distributed along a first direction of the first magnetic tape.
[0165] In one alternative example, the first direction is the length direction of the first magnetic tape.
[0166] In another alternative example, the first direction is the width direction of the first magnetic tape.
[0167] In this embodiment, the controller groups and aggregates task requests according to their priority and characteristics, so that task requests with the same priority and the same magneto disk identifier are executed in the same request queue, which helps to improve the execution continuity of tasks in the storage system.
[0168] Furthermore, since the tape regions corresponding to different task requests in the same request queue are adjacent, the tape reel distance in the magneto-electric disk decreases during the execution of tasks in this request queue, reducing the access latency in the magneto-electric disk and improving the execution efficiency of tasks in the storage system.
[0169] As an optional implementation, the process by which the controller obtains multiple request queues in S6302 above can be further optimized to reduce the total latency of multiple task requests, as shown in Figure 8. Figure 8 is a fourth flowchart of a task scheduling method provided in this application.
[0170] Referring to Figure 8, the above S6302 includes the following S801 to S802.
[0171] S801, the controller uses the deadline and access address indicated by the feature as the cluster center to cluster multiple second candidate request queues, and obtains multiple third candidate request queues.
[0172] The specific implementation details regarding the execution deadline and access address can be found in the description of S520 above, and will not be repeated here.
[0173] For example, cluster centroids refer to the central points determined from K randomly selected points during unsupervised learning. For instance, the controller uses K random task requests to determine cluster centroids. For different task requests in multiple second candidate request queues, it calculates the distance between each task request and the central point of the K random task requests, and clusters task requests whose distance to the central point of the K random task requests is less than or equal to a certain threshold into a third candidate request queue.
[0174] In some optional scenarios, the controller performs initial clustering of second candidate request queues in different time ranges based on the deadline execution time, and then performs secondary clustering of second candidate request queues in the same time range based on the tape area indicated by the access address, thereby obtaining multiple third candidate request queues.
[0175] In some alternative scenarios, the controller performs an initial clustering of second candidate request queues located in different tape regions based on the tape region indicated by the access address, and then performs a secondary clustering of second candidate request queues located in the same tape region based on the deadline execution time, thereby obtaining multiple third candidate request queues.
[0176] The examples and optional scenarios above are all optional methods provided by the embodiments of this application and should not be construed as limiting this application.
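The two-stage scenario of S801 (deadline first, tape region second) can be sketched as follows. This sketch swaps the clustering step for simple fixed-width bucketing, which is an assumption made to keep the example short; `cluster_queue`, `deadline_window`, and `region_size` are hypothetical names and values.

```python
from collections import defaultdict


def cluster_queue(tasks, deadline_window, region_size):
    """Split one second candidate queue into third candidate queues.

    tasks: list of (name, deadline_seconds, tape_position) tuples.
    Stage 1 buckets requests by deadline range; stage 2 buckets the
    requests of each range by tape region. Fixed-width buckets stand in
    for the clustering described in the text (an assumption).
    """
    clusters = defaultdict(list)
    for name, deadline, position in tasks:
        time_bucket = deadline // deadline_window
        region_bucket = position // region_size
        clusters[(time_bucket, region_bucket)].append(name)
    # Earlier deadline ranges first, then lower tape regions.
    return [clusters[key] for key in sorted(clusters)]


tasks = [
    ("t1", 12, 100),
    ("t2", 18, 140),
    ("t3", 15, 950),
    ("t4", 47, 120),
]
third_candidates = cluster_queue(tasks, deadline_window=20, region_size=500)
print(third_candidates)  # [['t1', 't2'], ['t3'], ['t4']]
```

With fixed-width buckets, two requests in the same third candidate queue have deadlines that differ by less than `deadline_window`, so choosing `deadline_window` no larger than the first threshold keeps the deadline condition of the first request queue satisfied.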
[0177] S802, the controller aggregates multiple third candidate request queues based on the multiple estimated execution times and total data volume corresponding to multiple task requests, and obtains multiple request queues.
[0178] In this application, one estimated execution time corresponds to one task request. This estimated execution time can be obtained by the controller invoking a scheduling algorithm to predict the task request. This scheduling algorithm can refer to a latency prediction model. The latency prediction model is determined based on the addressing parameters specifically provided for the motor in the magnetodisk during use. These addressing parameters include, but are not limited to, one or more combinations of the following: maximum motor speed, cruising speed, read / write speed, acceleration, deceleration, lane-changing latency, and U-turn latency. The above latency prediction model is merely an optional method provided in this embodiment and should not be construed as limiting this application. For example, the estimated execution time for one task request is 5 seconds.
[0179] In this embodiment of the application, the first request queue mentioned above includes at least two third candidate request queues among a plurality of third candidate request queues, and the estimated execution time of the first request queue is less than the sum of the estimated execution times of the at least two third candidate request queues.
[0180] The specific implementation process of S802 will be illustrated below with reference to Formula 1.
[0181]
[0182] Where time_i is the estimated execution time of the i-th task request, x is the number of task requests in the first request queue, f(x) is the estimated execution time of the first request queue, a is the number of task requests in the first of the two third candidate request queues, b is the number of task requests in the second of the two third candidate request queues, f(a) is the estimated execution time of the first of the two third candidate request queues, and f(b) is the estimated execution time of the second of the two third candidate request queues.
[0183] If the sum of the estimated execution times of the two third candidate request queues is greater than the estimated execution time of the first request queue, the two third candidate request queues are aggregated to obtain the first request queue.
[0184] Furthermore, if the sum of the estimated execution times of the two third candidate request queues is equal to the estimated execution time of the first request queue, the two third candidate request queues may or may not be aggregated, and this application does not limit this.
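The aggregation rule of S802 (merge two third candidate request queues only when the merged queue is estimated to finish sooner than the two queues run separately) can be sketched as follows. The cost function below is a hypothetical stand-in for the latency prediction model: it charges one positioning overhead per queue, a travel cost proportional to the tape span covered, and the per-request times. The constants and the names `estimate_queue_time` and `maybe_merge` are assumptions for illustration only.

```python
def estimate_queue_time(tasks, setup_seconds=5.0, seconds_per_unit=0.01):
    """Hypothetical f(x) for a non-empty queue.

    tasks: list of (tape_position, estimated_seconds) tuples.
    Cost = one positioning overhead + travel across the covered tape
    span + the sum of the per-request estimated times.
    """
    positions = [position for position, _ in tasks]
    travel = max(positions) - min(positions)
    work = sum(seconds for _, seconds in tasks)
    return setup_seconds + travel * seconds_per_unit + work


def maybe_merge(queue_a, queue_b):
    """Aggregate two candidate queues only if f(a + b) < f(a) + f(b)."""
    merged = queue_a + queue_b
    separate = estimate_queue_time(queue_a) + estimate_queue_time(queue_b)
    if estimate_queue_time(merged) < separate:
        return [merged]
    return [queue_a, queue_b]


near = [(100, 4.0), (120, 7.0)]
also_near = [(150, 3.0)]
far = [(5000, 3.0)]

print(len(maybe_merge(near, also_near)))  # 1: merging saves a positioning step
print(len(maybe_merge(near, far)))        # 2: the extra travel outweighs the saving
```

Under this model, queues that touch neighbouring tape regions are merged because they share one positioning overhead, while queues far apart on the tape stay separate.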
[0185] Optionally, the multiple request queues may also include a second request queue, such as T10, which is consistent with one of the multiple third candidate request queues (such as H10).
[0186] In this embodiment, the controller can aggregate two or more third candidate request queues based on the estimated execution time of different queues, thereby reducing the overall execution time of multiple task requests, improving the execution effect of task scheduling, and helping to improve the execution efficiency of tasks in the storage system.
[0187] S530: The controller determines the scheduling order of multiple request queues based on the cache status.
[0188] For example, if the cache occupancy rate is greater than or equal to a first threshold, the controller prioritizes the request queues with write data tasks among multiple request queues.
[0189] For example, if the cache occupancy rate is less than or equal to a second threshold, the controller prioritizes, among the multiple request queues, the request queues whose tasks are reading data. Here, the second threshold is less than the first threshold.
[0190] The above examples are merely optional methods for the controller to determine the scheduling order according to the embodiments of this application, and should not be construed as limiting this application. Optional specific examples of how the controller determines the scheduling order are provided below with reference to Figure 7 and Figure 8 and are not elaborated here.
[0191] S540: The controller executes tasks indicated by multiple request queues according to the scheduling order.
[0192] For example, the controller executes the tasks indicated by multiple request queues in sequence based on the scheduling order determined by S530.
[0193] In some alternative implementations, the scheduling order of multiple request queues includes: the request queue with higher priority is scheduled before the request queue with lower priority.
[0194] Referring to Figure 5, among request queues T00 to T0x, T10 to T1y, and Tn0 to Tnz, priority L0 is the highest and priority Ln is the lowest. The scheduling order of the multiple request queues is as follows: first execute the request queues corresponding to priority L0, then execute the request queues corresponding to priority L1, and finally execute the request queues corresponding to priority Ln.
[0195] Based on the content of S510 to S540, the controller combines multiple task requests according to their priority and characteristics, resulting in multiple request queues. Since task requests in the same request queue belong to the same priority, the task scheduling process ensures the priority of different task requests. That is, the storage system 120 can process task requests with the same SLA or priority in batches. Moreover, task requests in the same request queue have the same disk identifier. Thus, during the execution of all task requests in the request queue, the controller does not need to switch the target disk (such as disk 1) to be operated on, reducing disk switching latency and ensuring the continuity of task execution in the storage system 120. This is beneficial for improving the timeliness of task scheduling and increasing the bandwidth of the storage system.
[0196] The process by which the controller obtains the scheduling order of multiple request queues is described below, by way of example, with reference to Figure 9. Figure 9 is a fifth flowchart of a task scheduling method provided in this application. The specific implementation of each device in Figure 9 can be found in the description of the foregoing embodiments and is not repeated here.
[0197] Referring to Figure 9, the aforementioned S530 includes S531 to S532.
[0198] S531. For each request queue among the multiple request queues, the controller takes the cache state and the queue characteristics of that request queue as input to the queuing model, and the queuing model outputs the queuing score of that request queue.
[0199] The queue characteristics include one or a combination of the following: the amount of data corresponding to a request queue, seek time, maximum tape wear information, and maximum execution wait time for executing a request queue.
[0200] The data volume corresponding to a request queue refers to the total amount of data that needs to be read / written when executing all task requests in that request queue.
[0201] Seek time refers to the time required for the target tape area on the magneto disk to align with the read / write head during the execution of all task requests in the request queue.
[0202] The maximum wear information of a magnetic tape refers to the number of contacts between the magnetic tape and the magnetic head.
[0203] In this embodiment of the application, a queuing model is used to determine the queuing score of each request queue among multiple request queues. For example, the queuing score model will be described below in conjunction with queue characteristics.
[0204]
[0205] Where Score_total is the queuing score of the request queue; t_max is the maximum execution wait time (also known as the maximum execution wait duration); T_EXT is the time spent switching from the end position of the current task to the position of the pending task request, and includes the time for rewinding, wrap switching, motor reversal, and tape object switching; V_max is the maximum wear (that is, the maximum wear information of the magnetic tape); V_wear is a value obtained by proportionally converting information such as the invalid tape movement distance and the number of invalid rewinds during execution of the task requests in the request queue; V_write is the total amount of data written by the request queue; Cache_max is the maximum storage capacity of the cache; α, β, γ ≥ 0; and MINtime is the minimum execution time.
[0206] In this embodiment, when the cache occupancy rate is lower than the occupancy rate threshold, β = 0, where the occupancy rate threshold is a fixed value or a dynamic value. For example, the occupancy rate threshold is 50%, 60%, or other values.
[0207] In one alternative example, if MINtime > T1, then γ = 0, where T1 is the reserved time. For example, the value of T1 is at least twice ESTtime_i + t_max.
[0208] In another alternative example, if MINtime < ESTtime_i + t_max, then, if no task has a deadline, the queuing score for the i-th task request (a non-near-expiration task) is calculated. If no task has a score to calculate, an alarm is triggered, and the storage system stops executing any tasks other than near-expiration tasks until all near-expiration tasks have been completed.
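The two conditions stated for β and γ can be written as a small gating function. This is only a sketch of those conditions: it does not reproduce the queuing score formula itself, and the base weights, the default occupancy threshold of 50%, and the name `score_weights` are assumptions for illustration.

```python
def score_weights(cache_occupancy, min_time, reserved_time,
                  base=(1.0, 1.0, 1.0), occupancy_threshold=0.5):
    """Return (alpha, beta, gamma) after applying the stated conditions.

    beta is set to 0 when the cache occupancy rate is below the occupancy
    threshold; gamma is set to 0 when MINtime is greater than the reserved
    time T1. The base weights are hypothetical; the text only requires
    that all three weights are greater than or equal to 0.
    """
    alpha, beta, gamma = base
    if cache_occupancy < occupancy_threshold:
        beta = 0.0
    if min_time > reserved_time:
        gamma = 0.0
    return alpha, beta, gamma


# Lightly used cache and no deadline pressure: only alpha remains active.
print(score_weights(cache_occupancy=0.3, min_time=200, reserved_time=60))
# Full cache and a near deadline: all three weights contribute.
print(score_weights(cache_occupancy=0.9, min_time=30, reserved_time=60))
```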
[0209] S532. The controller obtains the scheduling order of multiple request queues based on the queuing score of each request queue in the multiple request queues.
[0210] For example, the controller sets the scheduling order of request queues with higher queuing scores to the front to ensure the SLA for different task requests, thereby improving the overall bandwidth of the storage system.
[0211] The specific implementation process of S532 will be illustrated below with examples for different situations.
[0212] The first case: When the deadline of at least one request queue among multiple request queues is less than or equal to the reserved time, the controller obtains a first sub-order according to the queuing scores of the at least one request queue.
[0213] Exemplarily, assume that there is currently a batch of task requests T = {t1, t2, t3, t4, …, tn}, the priorities of the corresponding tasks are L = {1, 1, 4, 4, …, ln}, the corresponding estimated execution times of the tasks are ESTtime = {4, 7, 10, 11, …, ESTtn}, the current time is 0, and the task reservation times (deadlines) corresponding to the task requests are {100, 100, 30, 30, …}. The task scheduling method provided by the embodiments of the present application sorts and schedules this batch of task requests as follows:
1. According to the tape object (magneto-electric disk identifier) of the task, the priority of the task, and time constraints (such as the deadline and the estimated execution time), obtain at least 2 groups of task requests, {t1, t2} and {t3, t4, …, tn}; continue to cluster these 2 groups of task requests according to the characteristics of the tasks to obtain 2 request queues, Tg0 (priority L1, reservation time 100 s, estimated processing time 11 s) and Tg1 (priority L4, reservation time 30 s, estimated processing time 21 s+).
2. Since the time of the read request satisfies the condition (MINtime < T1), the task requests {t3, t4, …, tn} of Tg1 are preferentially scheduled.
3. After the scheduling of Tg1 is completed, continue to schedule the task requests {t1, t2} of Tg0 according to the priority.
4. Finally, the controller outputs the scheduling execution sequence {t3, t4, …, tn} → {t1, t2}.
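The worked example above can be replayed with a short sketch. The text does not give a value for the reserved time T1, so `reserved_time=60` is an assumption chosen so that Tg1 (deadline 30) is near expiration and Tg0 (deadline 100) is not; the `schedule` helper and the dictionary keys are hypothetical.

```python
def schedule(queues, now, reserved_time):
    """Order request queues: near-deadline queues first, then by priority.

    queues: list of dicts with "name", "priority" (a lower number means a
    higher priority), and "deadline" in seconds.
    """
    urgent = [q for q in queues if q["deadline"] - now <= reserved_time]
    normal = [q for q in queues if q["deadline"] - now > reserved_time]
    urgent.sort(key=lambda q: q["deadline"])   # most pressing deadline first
    normal.sort(key=lambda q: q["priority"])   # then highest priority first
    return [q["name"] for q in urgent + normal]


queues = [
    {"name": "Tg0", "priority": 1, "deadline": 100, "est_seconds": 4 + 7},
    {"name": "Tg1", "priority": 4, "deadline": 30, "est_seconds": 10 + 11},
]
order = schedule(queues, now=0, reserved_time=60)
print(order)  # ['Tg1', 'Tg0']
```

Although Tg0 has the higher priority, Tg1 is scheduled first because its deadline falls within the reserved time, matching the sequence {t3, t4, …} → {t1, t2}.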
[0214] In the first case, for the case where the deadline is less than or equal to the reserved time, the controller preferentially sorts the request queue so that the controller preferentially schedules the request queue, thereby avoiding the problem of conflicts between the switching of the magneto-electric disk and task scheduling, and is also beneficial to ensuring the SLA corresponding to the request queue and improving the user experience.
[0215] The second case: When the occupancy rate of the cache reaches the first threshold, the controller sorts the request queues with the task type of writing data among the multiple request queues to obtain a second sub-order. This second sub-order is after the first sub-order in the first case. For example, the first threshold is 85%.
[0216] Exemplarily, "the occupancy rate of the cache reaches the first threshold" means that the occupancy rate of the cache is greater than the first threshold.
[0217] Alternatively, "the occupancy rate of the cache reaches the first threshold" means that the occupancy rate of the cache is greater than or equal to the first threshold.
[0218] In the second scenario, if the cache contains a large amount of data, the controller prioritizes scheduling the request queue for writing data tasks, thereby reducing the space occupied in the cache. This not only helps improve the utilization of cache storage resources but also helps improve the write data bandwidth in the storage system, thus improving the execution efficiency of tasks in the storage system.
[0219] The third scenario: When the cache occupancy rate is lower than the second threshold, the controller sorts the request queues with the task type of reading data in multiple request queues to obtain a third sub-order.
[0220] In this case, the second threshold is less than the first threshold in the second scenario. For example, the second threshold is 30%, and the first threshold is 85%.
[0221] In the third scenario, if there is a large amount of available storage space in the cache, the controller will prioritize scheduling the request queue for executing read data tasks, thereby improving the utilization of cache storage resources, increasing read data bandwidth in the storage system, and improving the execution efficiency of tasks in the storage system.
[0222] The first to third scenarios mentioned above can be implemented independently, or in pairs, or in a combination of all three scenarios; this application does not limit this.
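One way to combine the three scenarios is sketched below. The thresholds of 85% and 30% come from the examples above; placing the third sub-order after the first sub-order, sorting each sub-order by queuing score, and the name `order_by_cache_state` are assumptions made for illustration.

```python
def order_by_cache_state(queues, cache_occupancy, reserved_time,
                         first_threshold=0.85, second_threshold=0.30):
    """Combine the three sub-orders into one scheduling order.

    queues: list of (name, task_type, seconds_to_deadline, score) tuples,
    where task_type is "read" or "write".
    """
    def by_score(items):
        return sorted(items, key=lambda q: q[3], reverse=True)

    # First scenario: queues whose deadline is within the reserved time.
    first = by_score([q for q in queues if q[2] <= reserved_time])
    rest = [q for q in queues if q[2] > reserved_time]

    # Second and third scenarios: pick the task type favoured by the cache.
    if cache_occupancy >= first_threshold:
        preferred_type = "write"   # drain a nearly full cache
    elif cache_occupancy < second_threshold:
        preferred_type = "read"    # plenty of free cache space
    else:
        preferred_type = None

    preferred = by_score([q for q in rest if q[1] == preferred_type])
    others = by_score([q for q in rest if q[1] != preferred_type])
    return [q[0] for q in first + preferred + others]


queues = [
    ("Q1", "read", 500, 0.4),
    ("Q2", "write", 500, 0.7),
    ("Q3", "read", 20, 0.1),
    ("Q4", "write", 500, 0.9),
]
print(order_by_cache_state(queues, cache_occupancy=0.90, reserved_time=60))
print(order_by_cache_state(queues, cache_occupancy=0.20, reserved_time=60))
```

With a 90% full cache the write queues Q4 and Q2 follow the near-deadline queue Q3; with a 20% full cache the read queue Q1 moves ahead of them.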
[0223] Based on the above content of S531 and S532, it can be seen that the controller calls the queuing model to calculate a queuing score for each request queue. For example, the queuing model calculates the queuing scores of different request queues based on information such as data volume, seek time, maximum tape wear information, and maximum execution waiting time, so that different request queues have different scheduling weights in the task scheduling process. The scheduling of tasks under multi-dimensional characteristics thus has a quantifiable expected benefit (the queuing score), which is conducive to improving the task scheduling effect for different task requests in the storage system and reducing the execution latency of tasks in the storage system.
[0224] Regarding the specific implementations of S530, S531, and S532 above, the scheduling order of multiple request queues provided in this application embodiment is illustrated below based on a greedy strategy and queuing theory. A greedy strategy refers to the process of gradually obtaining the optimal solution; the optimal solution determined by the greedy strategy is generally a globally approximate optimal solution. Queuing theory, also known as the theory of stochastic service systems, refers to the method of obtaining statistical patterns of indicators such as the arrival of service objects and service time through statistical analysis, and then using these patterns to achieve target values for one or more indicators.
[0225] The following examples illustrate S530, S531, and S532 of this application embodiment, which determine priority strategies based on queuing theory and global sorting strategies based on greedy strategies.
[0226] I. Priority Strategy.
[0227] Strategy 1: If the task request is a read request and the deadline (MINTime) of the read request is less than or equal to the reserved time (T1), the controller will immediately schedule all read requests that need to be completed by the deadline to Strategy 6 for orchestration.
[0228] Strategy 2: When a task request is a read request and the cache occupancy rate reaches a first threshold (e.g., 85%), the controller immediately schedules write requests, and Strategy 6 is used for orchestration and scheduling. Additionally, the controller enables expansion modules (EXPs), increasing the number of EXPs based on write bandwidth calculations to increase write bandwidth.
[0229] Strategy 3: If the cache occupancy rate is below the second threshold, reduce the number of scheduled write requests, increase the scheduling priority of read requests, and prioritize the scheduling of read requests. The second threshold can be determined by the product of the number of EXPs and the lowest aggregation granularity.
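Strategies 1 to 3 can be read as an ordered set of checks. The following is a minimal sketch under stated assumptions: thresholds are expressed as occupancy fractions, Strategy 2 is keyed on cache occupancy alone, and the names `Request` and `select_by_priority_strategy` are hypothetical. It only selects which requests are handed on for ordering and does not model the EXP expansion.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Request:
    name: str
    kind: str        # "read" or "write"
    deadline: float  # MINTime: time remaining until the execution deadline


def select_by_priority_strategy(
    requests: List[Request],
    reserved_time: float,      # T1
    cache_occupancy: float,    # fraction of the cache in use (assumed unit)
    first_threshold: float,    # e.g. 0.85, as in the text
    second_threshold: float,   # assumed to be a fraction as well
) -> Tuple[Optional[int], List[Request]]:
    """Return (strategy number, selected requests), or (None, []) if none applies."""
    # Strategy 1: reads whose deadline falls within the reserved time go first.
    urgent = [r for r in requests if r.kind == "read" and r.deadline <= reserved_time]
    if urgent:
        return 1, urgent
    # Strategy 2: cache nearly full, so schedule writes to drain it.
    if cache_occupancy >= first_threshold:
        return 2, [r for r in requests if r.kind == "write"]
    # Strategy 3: cache mostly free, so favor reads.
    if cache_occupancy < second_threshold:
        return 3, [r for r in requests if r.kind == "read"]
    return None, []


reqs = [Request("r1", "read", 5), Request("w1", "write", 50), Request("r2", "read", 80)]
print(select_by_priority_strategy(reqs, 10, 0.5, 0.85, 0.2)[0])
```

When no priority strategy applies, the function returns `None` so that a caller can fall through to the global sorting strategy.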
[0230] II. Global sorting strategy.
[0231] Strategy 4: Select request queues with queuing scores higher than the scheduling threshold from multiple request queues for orchestration and scheduling. If the queuing score of no request queue reaches the scheduling threshold, then lower the scheduling threshold. For example, this scheduling threshold is 0.5×(1+α), where α is a set value, such as 10, 15, 20, or others.
[0232] Strategy 5: Arrange according to priority from high to low.
[0233] Strategy 6: Request queues of the same priority are arranged in descending order of queuing score.
[0234] In strategies 1 to 6 above, the greedy strategy selects a portion of the request queues for priority scheduling based on the dynamic changes in the scheduling scale, characteristics, and constraints of the task requests. The global sorting strategy schedules requests based on their priority and queuing score, thereby achieving two-level scheduling of EXP-magnetic disk and magneto disk-task while ensuring the SLA of different task requests.
[0235] For example, suppose the storage system receives multiple task requests T = {t1, t2, t3, t4, ..., tn}. The task priorities corresponding to the task requests are L = {1, 1, 4, 4, ..., ln}. The estimated execution times of the task requests are ESTtime = {4, 7, 10, 11, 10, 10, ..., ESTtn}. The current time is 0, and the scheduled execution times (deadlines) of the task requests are {30, 30, 100, 100, 200, 200}. The task scheduling method is used to sort and schedule these task requests. The sorting and scheduling process includes:
1. Based on the tape object, priority, and time constraints of each task, at least three groups of tasks are obtained: {t1, t2}, {t3, t4}, and {t5, ..., tn}. The three groups are then aggregated according to task characteristics to obtain three task groups: Tg0 (priority 1, deadline 30, estimated execution time 11), Tg1 (priority 4, deadline 100, estimated execution time 21), and Tg2 (priority 4, deadline 200, estimated execution time 21).
2. Since strategies 1 to 3 are not currently satisfied, the queuing score of each task group is calculated according to strategy 4; the scores are 300, 200, and 100 respectively.
3. According to strategy 5, the higher-priority Tg0 {t1, t2} is scheduled first.
4. According to strategy 5, Tg1 and Tg2 have the same priority, so strategy 6 is used to decide between them.
5. According to strategy 6, Tg1 has a score of 200 and Tg2 has a score of 100. The queuing scores are compared and Tg1 is scheduled first.
6. According to strategy 6, the final task scheduling output sequence is {t1, t2} → {t3, t4} → {t5, t6, ..., tn}.
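The ordering in this example can be reproduced with a two-key sort. The sketch below assumes that a smaller priority number means a higher priority, as the example implies (Tg0 with priority 1 is scheduled before the priority 4 groups), and it truncates the last group to {t5, t6}. The names `TaskGroup` and `schedule` are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskGroup:
    name: str
    tasks: List[str]
    priority: int         # smaller number = higher priority (inferred from the example)
    deadline: int
    queuing_score: float


groups = [
    TaskGroup("Tg2", ["t5", "t6"], priority=4, deadline=200, queuing_score=100),
    TaskGroup("Tg0", ["t1", "t2"], priority=1, deadline=30, queuing_score=300),
    TaskGroup("Tg1", ["t3", "t4"], priority=4, deadline=100, queuing_score=200),
]


def schedule(task_groups: List[TaskGroup]) -> List[TaskGroup]:
    """Strategy 5 then strategy 6: priority first, then descending queuing score."""
    return sorted(task_groups, key=lambda g: (g.priority, -g.queuing_score))


order = [g.name for g in schedule(groups)]
task_sequence = [t for g in schedule(groups) for t in g.tasks]
print(" -> ".join(order))  # Tg0 -> Tg1 -> Tg2
```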
[0236] The embodiment of Figure 9 above exemplarily illustrates the process of determining the scheduling order. For the specific implementation process of Figures 5 to 9 above, Figure 10 provides one possible example. Figure 10 is a sixth schematic flowchart of a task scheduling method provided in this application. The task scheduling method includes the following steps S1010 to S1060.
[0237] Phase 1: Task input phase or user request queue caching phase, including S1010 to S1012 below.
[0238] S1010, User task input.
[0239] For example, a user submits user tasks through an I / O device to the software architecture for task scheduling shown in Figure 4. This software architecture can be used by the controller.
[0240] S1011. The controller receives user tasks and determines whether the cache count is greater than or equal to the threshold. The cache count is the number of user task requests stored in the cache, and the threshold is a default value or a user-defined value.
[0241] If the number of cached items is greater than or equal to the threshold, then execute S1012; if the number of cached items is less than the threshold, then continue executing S1010.
[0242] S1012, The controller writes multiple data items associated with multiple task requests of the user task into the cache.
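The first phase amounts to accumulating requests until a count threshold is reached. A minimal sketch, assuming requests are represented by plain strings and that the batch returned is what S1012 then writes to the cache; the class name `RequestBuffer` and its method are hypothetical.

```python
from typing import List, Optional


class RequestBuffer:
    """Collects user task requests until the cache count reaches the threshold."""

    def __init__(self, threshold: int):
        self.threshold = threshold       # default or user-defined value
        self.pending: List[str] = []

    def submit(self, request: str) -> Optional[List[str]]:
        """S1010/S1011: accept one request; return a batch once the threshold is met."""
        self.pending.append(request)
        if len(self.pending) >= self.threshold:
            batch, self.pending = self.pending, []
            return batch                 # S1012: the caller writes this batch's data to the cache
        return None                      # below the threshold: keep accepting input (S1010)


buf = RequestBuffer(threshold=3)
results = [buf.submit(r) for r in ["a", "b", "c"]]
print(results)
```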
[0243] Phase 2: Request queue grouping phase or queue combination phase, including S1021 to S1026 below.
[0244] S1021, The controller extracts the characteristics of each task request.
[0245] For details on the specific implementation of the features, please refer to the description in S520 above, which will not be repeated here.
[0246] S1022, The controller performs read request copy distribution.
[0247] For example, when the storage system uses a multi-replica redundant storage method to store data, the controller can complete the read request task by reading any data replica. Therefore, after finding the target magnetic disk corresponding to the read request, the controller can determine the other magnetic disks associated with the read request and allocate the read request to one of the multiple request queues of the corresponding other magnetic disks.
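One way to picture the replica distribution is to place the read request on whichever replica-holding disk currently has the least queued work. The text does not state the selection rule, so the shortest-queue criterion, the single queue per disk, and the name `assign_read_to_replica` are assumptions of this sketch.

```python
from typing import Dict, List


def assign_read_to_replica(
    request_id: str,
    replica_disks: List[str],
    queues: Dict[str, List[str]],
) -> str:
    """Append the read request to the shortest queue among disks holding a replica."""
    if not replica_disks:
        raise ValueError("read request has no replica to read from")
    # Assumed rule: pick the disk with the fewest queued requests (first one wins ties).
    target = min(replica_disks, key=lambda d: len(queues.setdefault(d, [])))
    queues[target].append(request_id)
    return target


queues = {"disk1": ["a", "b"], "disk2": ["c"], "disk3": []}
chosen = assign_read_to_replica("r1", ["disk1", "disk2"], queues)
print(chosen, queues["disk2"])
```

Here `disk3` is never considered because it holds no replica of the requested data, even though its queue is empty.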
[0248] S1023, The controller clusters based on priority and features.
[0249] For details on the implementation of S1023, please refer to the description of Figure 6 or Figure 7 above, which will not be repeated here.
[0250] S1024. Based on the clustering results obtained in S1023, the controller determines the request queue group for each magneto-electric disk.
[0251] The specific implementation of S1024 can be found in the description of S620 above, and will not be repeated here.
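The grouping in S1023 and S1024 can be sketched as two passes: first bucket requests by magneto disk identifier and priority, then split each bucket so that deadlines within one queue stay close together. The greedy deadline window used here is a simplified stand-in for the clustering described with Figures 6 and 7, and `TaskRequest`, `build_request_queues`, and `deadline_window` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TaskRequest:
    name: str
    disk_id: str
    priority: int
    deadline: int


def build_request_queues(
    requests: List[TaskRequest], deadline_window: int
) -> List[List[TaskRequest]]:
    """Group by (disk_id, priority), then split groups by a deadline window."""
    buckets: Dict[Tuple[str, int], List[TaskRequest]] = defaultdict(list)
    for r in requests:
        buckets[(r.disk_id, r.priority)].append(r)

    queues: List[List[TaskRequest]] = []
    for key in sorted(buckets):
        current: List[TaskRequest] = []
        for r in sorted(buckets[key], key=lambda x: x.deadline):
            # Start a new queue when the spread from the earliest deadline exceeds the window.
            if current and r.deadline - current[0].deadline > deadline_window:
                queues.append(current)
                current = []
            current.append(r)
        queues.append(current)
    return queues


reqs = [
    TaskRequest("t1", "A", 1, 30), TaskRequest("t2", "A", 1, 35),
    TaskRequest("t3", "A", 1, 100), TaskRequest("t4", "B", 1, 30),
    TaskRequest("t5", "A", 4, 30),
]
names = [[r.name for r in q] for q in build_request_queues(reqs, deadline_window=10)]
print(names)
```

Every resulting queue contains requests for one magneto disk at one priority, so executing a queue never requires switching disks partway through.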
[0252] S1025. For the request queue groups corresponding to different magneto disks determined in S1024, the controller calculates the estimated execution time for each request queue group.
[0253] For details on the implementation of S1025, please refer to the description of Figure 8 above, which will not be repeated here.
[0254] S1026, Controller optimizes request queue.
[0255] The specific implementation of S1026 can be found in the description of Formula 1 above, and will not be repeated here.
[0256] Phase 3: Queuing score calculation phase, including S1031 to S1034 below.
[0257] S1031, The controller triggers the queuing score calculation.
[0258] For example, "trigger" means that the controller receives a score calculation request.
[0259] As another example, "trigger" means that the controller triggers the calculation based on time, such as the controller triggering the queuing score calculation phase according to a set period.
[0260] S1032. The controller obtains the coefficients of the queuing model.
[0261] The coefficients of this queuing model are the model parameters. These parameters can be default values or parameters retrieved by the controller from a database or other devices; this application does not limit their specific implementation. For details on the implementation of this queuing model, please refer to the description related to Formula 2 above; further details are omitted here.
[0262] S1033, The controller determines the input characteristics of the queuing model.
[0263] For example, the input features include, but are not limited to, one or a combination of the following: the amount of data corresponding to the request queue, the seek time, the maximum wear information of the tape, and the maximum execution wait time for executing a request queue. These input features are the queue features in the example of S531 above, and will not be repeated here.
[0264] S1034. The controller calculates the queuing score for multiple request queues.
[0265] For the specific implementation of S1034, please refer to the description related to Formula 2 above, which will not be repeated here.
[0266] Phase 4: Request queue sorting phase, including S1041 to S1050 below.
[0267] S1041, The controller executes the time boundary judgment process.
[0268] For example, the controller obtains the deadline for execution of different request queues and compares the deadline with the reserved time.
[0269] S1042. The controller determines whether the deadline (MINTime) for each request queue is less than the reserved time (T1).
[0270] If MINTime < T1, then execute S1045; if MINTime > T1, then execute S1043; if MINTime = T1, then execute either S1043 or S1045.
[0271] S1043. The controller determines whether the cache occupancy rate is greater than or equal to the first threshold.
[0272] For a description of cache occupancy and the first threshold, please refer to the description in S532 above, which will not be repeated here.
[0273] If the cache occupancy rate is greater than or equal to the first threshold, then execute S1046; if the cache occupancy rate is less than the first threshold, then execute S1044.
[0274] S1044. The controller determines whether the cache occupancy rate is less than or equal to the second threshold.
[0275] The description of cache occupancy and the second threshold can be found in the description of S532 above, and will not be repeated here.
[0276] If the cache occupancy rate is less than or equal to the second threshold, then execute S1047; if the cache occupancy rate is greater than the second threshold, then execute S1048.
[0277] S1045. The controller filters the request queues whose execution deadline is within T1. This is the first case or strategy 1 provided in the embodiment of S532 above.
[0278] S1046. The controller filters the request queue for task requests that are write requests. This is the second case or strategy 2 provided in the embodiment of S532 above.
[0279] S1047. The controller filters the request queue for tasks that are read requests. This is the third case or strategy 3 provided in the embodiment of S532 above.
[0280] S1048. The controller filters request queues whose queuing scores are greater than or equal to the scheduling threshold.
[0281] For a specific implementation of S1048, please refer to strategy 4 in the aforementioned embodiment of S532.
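The threshold filter of S1048, together with the fallback in strategy 4 of lowering the threshold when no queue qualifies, can be sketched as a loop. How far the threshold is lowered each time is not specified in this passage, so the fixed `step`, the `floor`, and the function name `filter_by_score` are assumptions.

```python
from typing import Dict, List, Tuple


def filter_by_score(
    scores: Dict[str, float],
    threshold: float,
    step: float = 10.0,    # assumed amount by which the threshold is lowered
    floor: float = 0.0,    # assumed lower bound that stops the loop
) -> Tuple[List[str], float]:
    """Return the queues whose score is >= threshold, lowering it if none qualify."""
    if not scores:
        return [], threshold
    while True:
        selected = [name for name, score in scores.items() if score >= threshold]
        if selected or threshold <= floor:
            return selected, threshold
        threshold = max(floor, threshold - step)


selected, used_threshold = filter_by_score({"q1": 30, "q2": 45}, threshold=60)
print(selected, used_threshold)
```

In this run no queue reaches 60 or 50, so the threshold drops to 40 and only `q2` is selected for orchestration.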
[0282] S1049. The controller sorts multiple request queues from highest to lowest priority.
[0283] For a specific implementation of S1049, please refer to Strategy 5 in the aforementioned embodiment of S532.
[0284] S1050: The controller sorts multiple request queues from highest to lowest according to their queuing scores.
[0285] For a specific implementation of S1050, please refer to strategy 6 in the aforementioned embodiment of S532.
[0286] S1051, The controller outputs the task sequence.
[0287] This task sequence is obtained by sorting multiple request queues according to the scheduling order determined after the fourth stage.
[0288] Combining the content of S1041 to S1051, it can be seen that a two-level task scheduling strategy is used: based on a greedy strategy and a queuing model, a priority (comparison) strategy and a global sorting strategy are set to perform two-layer EXP-tape task orchestration. Constraints such as time, buffer capacity, and priority are taken into account while the task scheduling time and task execution time are minimized. This can resolve the conflict between tape switching and task scheduling in the drives of the storage system, thereby achieving the maximum bandwidth while meeting the task objectives, and raising the physical bandwidth utilization of the tape system to more than 90%. It can also adapt to more versions of storage protocols.
[0289] Phase 5: Task execution phase, including S1060 below.
[0290] S1060, The controller executes the task sequence.
[0291] For example, the controller calls back the execution interface to execute each task in the task sequence.
[0292] Regarding the execution process of the task request provided by S540 or S1060 in the above embodiments, the following will describe it using writing data and reading data as examples.
[0293] When the task request is a write request, this application provides a method for executing a write task, as shown in Figure 11. Figure 11 is a schematic flowchart of data read / write tasks provided in this application. In Figure 11, the protocol processing module is used to decapsulate write task requests or to send data to the data access device. The distributed operating system is used to distribute task requests to different storage nodes. The storage pool is a storage space provided by multiple hard disks (e.g., hard disk 1 to hard disk 4) and is used to cache data to be written or data that has been retrieved. In this embodiment, the distributed operating system interacts with data across nodes to provide functions such as the archived data view, data redundancy management, data index query, and task scheduling. The task scheduling function is supported by a task scheduling device; for its specific structure, refer to the software framework for task scheduling shown in Figure 4, which is not described in detail here.
[0294] In addition, in Figure 11, the data query engine determines the data storage location based on the data index, and the data layout within a magneto disk can be configured by containers deployed on the magneto disk. The distributed operating system manages different magneto disks through magneto disk drivers and firmware. The magnetoelectric disks shown in Figure 11 include magnetoelectric disk 1 and magnetoelectric disk 2. For the hardware implementation of the magnetoelectric disks, refer to the description of Figures 1 to 3 above, which will not be repeated here.
[0295] Referring to Figure 11, the method for executing a data write task provided in this application embodiment includes the following S1111 to S1113.
[0296] S1111, Obtain the write task.
[0297] For example, for an archive data write request received at the front end of the storage system, the distributed operating system aggregates the data in the write request and caches it in the storage pool.
[0298] S1112. The distributed operating system submits a write task to the task scheduler and waits for scheduling.
[0299] S1113. The magneto disk driver and firmware execute the write task according to the scheduling order determined by the task scheduling device and write the data to magneto disk 1 or magneto disk 2.
[0300] For example, during the write task execution of each magneto disk, data is written to the magneto disk using a data redundancy multi-copy strategy, and the data index updates the metadata.
[0301] When the task request is a read request, this application provides a method for executing a read task. As shown in Figure 11, the execution process of the read task includes S1121 to S1125 below.
[0302] S1121. The user sends a read request to the distributed operating system, which then places the read request into the scheduled IO queue and reads the index information.
[0303] Alternatively, a read request may also be called a data retrieval request, retrieval request, or other names. This read request is used to instruct the reading of data stored in the magnetic disk. If the task indicated by the read request is a scheduled read task, the magnetic disk can execute the read task before the deadline for the scheduled read task.
[0304] S1122. The archived data view issues data reading tasks through the task scheduling device based on the information in the data redundancy management and waits for scheduling.
[0305] S1123, the magneto disk driver and firmware execute the scheduled read task according to the scheduling order determined by the task scheduling device.
[0306] S1124. Read data from the magneto disk and put the read data into the storage pool.
[0307] S1125, Read data from the front end.
[0308] In Figure 11, the execution order of the tasks can be: magneto disk A → magneto disk C → magneto disk B → magneto disk A. Tasks whose execution deadline (DL) is approaching are executed first, followed by tasks with higher priority (such as tasks with black triangles, where the task with the larger black triangle has a higher priority than the task with the smaller black triangle), and finally tasks with lower priority (such as tasks with patterned triangles, where the task with the larger patterned triangle has a higher priority than the task with the smaller patterned triangle).
[0309] The execution methods for the read and write tasks provided in Figure 11 above are merely optional implementations of the embodiments of this application and should not be construed as limiting the task requests provided in this application. The task scheduling method provided in the embodiments of this application can also be applied to other types of task requests.
[0310] It can be understood that the embodiments of Figures 5 to 11 above are all described using the example of a storage system executing the task scheduling method provided in the embodiments of this application. However, in some optional embodiments, the task scheduling method can also be applied to a magneto disk, such as the magneto disk 200 shown in Figure 2. As shown in Figure 12, Figure 12 is a seventh schematic flowchart of a task scheduling method provided in this application embodiment. For a description of each hardware component in Figure 12, refer to the description of Figure 1 or Figure 2 above, which will not be repeated here.
[0311] Referring to Figure 12, the task scheduling method provided in this application includes the following steps S1210 to S1240.
[0312] S1210. Processor 240 receives multiple task requests.
[0312] S1210 and processor 240 receive multiple task requests.
[0313] The task requested includes any of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or user operation task. For a detailed description of the implementation of S1210, please refer to the description of S510; it will not be repeated here.
[0314] S1220. Processor 240 obtains multiple request queues based on the priority and characteristics of the multiple task requests.
[0315] The feature indicates one or more combinations of the following: tape identifier, execution deadline, and access address. One of the multiple request queues includes at least one task request with the same magneto disk identifier and belonging to the same priority among multiple task requests. The difference between this feature and the feature in S520 is that in S1220, the tape identifier is used to mark different tapes in the magneto disk, such as tape 1 to tape 3. The magneto disk may also have more or fewer tapes, which is not limited in this application. For a detailed implementation of S1220, please refer to the relevant description of S520, which will not be repeated here.
[0316] S1230. Processor 240 obtains the scheduling order of the multiple request queues based on the status of cache 241.
[0317] For the specific implementation of S1230, please refer to the description of S530, which will not be repeated here.
[0318] S1240. Processor 240 executes the tasks indicated by the multiple request queues according to the scheduling order.
[0319] For details on the specific implementation of S1240, please refer to the description of S540, which will not be repeated here.
[0320] Based on the content of S1210 to S1240, it can be seen that the processor classifies and combines multiple task requests according to the priority and characteristics of different types of task requests to obtain multiple request queues. Since the task requests in the same request queue belong to the same priority, the task scheduling process ensures the priority of different task requests, that is, the magneto disk can process task requests with the same SLA or priority in batches.
[0321] Moreover, since task requests in the same request queue have the same tape identifier, the processor does not need to switch the target tape to be operated on during the execution of all task requests in the request queue, which reduces tape switching latency and ensures the continuity of task execution in the magneto disk. This is beneficial to improving the timeliness of task scheduling and increasing the bandwidth of the magneto disk.
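The benefit of keeping requests for the same tape together can be illustrated by counting how often the target tape changes between consecutive tasks. In this sketch, sorting by tape identifier stands in for the request queue grouping, and the request sequence and the name `count_switches` are illustrative assumptions.

```python
from typing import List


def count_switches(order: List[str]) -> int:
    """Count how many times consecutive tasks target a different tape."""
    return sum(1 for a, b in zip(order, order[1:]) if a != b)


# Arrival order of task requests, identified only by their tape identifier.
fifo = ["tape1", "tape2", "tape1", "tape3", "tape2", "tape1"]
# Grouped order: requests with the same tape identifier become adjacent.
grouped = sorted(fifo)

print(count_switches(fifo), count_switches(grouped))  # 5 2
```

For the same six requests, first-in-first-out execution changes tape five times, while the grouped order changes tape only twice.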
[0322] For further details on the task scheduling method applied to magneto disks, please refer to the description of Figures 5 to 11 above, which will not be repeated here.
[0323] It is understood that, in order to achieve the functions in the above embodiments, the magnetoelectric disk, processor, or controller includes hardware structures and / or software modules corresponding to the execution of each function. Those skilled in the art should readily recognize that, based on the units and method steps of the various examples described in conjunction with the embodiments disclosed in this application, this application can be implemented in hardware or a combination of hardware and computer software. Whether a function is executed in hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
[0324] The task scheduling method provided according to the embodiments of this application has been described in detail above with reference to Figures 1 to 12. The chip provided in the embodiments of this application is described below by way of example with reference to Figure 13.
[0325] Figure 13 is a schematic structural diagram of a chip provided in this application. The chip 1300 includes a memory 1310 and at least one processor 1320. The processor 1320 can implement the task scheduling method provided in the above embodiments, and the memory 1310 is used to store the software instructions corresponding to the task scheduling method. As an optional implementation, in hardware, the chip 1300 can be a chip or chip system that encapsulates one or more processors 1320. For example, when the chip 1300 is used to implement the method steps in the above embodiments, the processor 1320 included in the chip 1300 executes the steps and possible sub-steps of the above method. Optionally, the chip 1300 may also include a communication interface 1330, which can be used to send and receive data. For example, the communication interface 1330 is used to receive IO requests or send IO responses; the communication interface 1330 can be implemented by an interface circuit included in the chip 1300. Therefore, in some examples, the communication interface 1330 can also be referred to as the transceiver of the chip. In the embodiments of this application, the communication interface 1330, processor 1320, and memory 1310 can be connected via a bus 1340, which can be divided into an address bus, a data bus, a control bus, and so on. The bus 1340 can be a PCIe bus, an extended industry standard architecture (EISA) bus, a unified bus (Ubus or UB), a compute express link (CXL), a cache coherent interconnect for accelerators (CCIX), or another type of bus.
[0326] The chip 1300 provided in this embodiment may be the processor 240 or control unit 1225 mentioned above, or other devices with task scheduling function, and this application does not limit it in this regard. For example, when other processing devices in the magnetoelectric disk also have task scheduling function, the chip 1300 may refer to the aforementioned disk frame or other processing devices in the magnetoelectric disk.
[0327] This application also provides a storage system. The storage system includes a communication interface, a chip, and the magnetoelectric disk provided in any of the foregoing embodiments. The magnetoelectric disk is used to store data, the communication interface is used to receive task requests (such as I / O requests), and the chip is used to manage target magnetoelectric disks in the storage system according to task requests (such as read requests or I / O write requests). The storage system is, for example, a magnetic tape library, a magnetic tape system, or a computer / server that includes magnetoelectric disks as persistent storage media.
[0328] The chip, which can be a very-large-scale integrated circuit, includes one or more processors. The processor contains an operating system and other software programs, enabling it to access magnetic disks and various PCIe devices. The processor includes one or more processor cores. These cores can be, for example, CPUs or other ASICs. The processor can also be another general-purpose processor, a DSP, an ASIC, an FPGA, or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. In practical applications, the storage system may also include multiple chips.
[0329] Optionally, the storage system may also include other storage media, including but not limited to dynamic random access memory (DRAM) and static random access memory (SRAM), for caching data from the magnetic disk for the processor to process. The other storage media may also be read-only memory (ROM), for example programmable read-only memory (PROM) or erasable programmable read-only memory (EPROM). This embodiment does not limit the number or type of the other storage media. Furthermore, the other storage media can be configured with a power-failure protection function, which means that when the system loses power and is then powered on again, the data stored in the memory is not lost. Storage media with a power-failure protection function are called non-volatile memory.
[0330] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are performed entirely or partially. The computer can be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user equipment, or other programmable device. The computer program or instructions can be stored in a computer-readable storage medium or transferred from one computer-readable storage medium to another. For example, the computer program or instructions can be transferred from one website, computer, server, or data center to another website, computer, server, or data center via wired or wireless means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium, such as a floppy disk, hard disk, or magnetic tape; it can also be an optical medium, such as a digital video disc (DVD); or it can be a semiconductor medium, such as a solid-state drive (SSD).
[0331] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Various equivalent modifications or substitutions can be conceived within the technical scope disclosed in this application, and these modifications or substitutions should all be covered within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A task scheduling method, characterized in that the method is applied to a storage system, the storage system includes a controller, a cache, and multiple magneto-electric disks, the method is executed by the controller, and the method includes: obtaining multiple task requests, wherein the task indicated by a task request includes any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or user operation task; obtaining multiple request queues based on the priority and characteristics of the multiple task requests, wherein the characteristics are used to indicate one or more combinations of the following: magneto disk identifier, execution deadline, and access address, and one of the multiple request queues includes at least one task request among the multiple task requests that has the same magneto disk identifier and belongs to the same priority; obtaining the scheduling order of the multiple request queues based on the state of the cache; and executing the tasks indicated by the multiple request queues according to the scheduling order.
2. The method according to claim 1, characterized in that the user operation task includes one or both of the following: a tape import task or a tape export task; the tape import task includes importing a first tape into a first magneto-electric disk among the multiple magneto-electric disks; and the tape export task includes exporting a second tape from a second magneto-electric disk among the multiple magneto-electric disks.
3. The method according to claim 1 or 2, characterized in that the step of obtaining multiple request queues based on the priority and characteristics of the multiple task requests includes: obtaining the characteristics of the multiple task requests; dividing the multiple task requests into multiple first candidate request queues based on the magneto disk identifier indicated by the characteristics, wherein a first candidate request queue includes one or more task requests having the same magneto disk identifier; and classifying and combining the multiple first candidate request queues according to the priority of the multiple task requests to obtain the multiple request queues.
4. The method according to claim 3, characterized in that the step of classifying and combining the multiple first candidate request queues according to the priority of the multiple task requests to obtain the multiple request queues includes: dividing the multiple first candidate request queues into multiple second candidate request queues based on the priority of the multiple task requests, wherein a second candidate request queue includes M task requests that have the same magneto disk identifier and belong to the same priority; and processing the multiple second candidate request queues, using the execution deadline and access address indicated by the characteristics as cluster centers, to obtain the multiple request queues; wherein a first request queue among the multiple request queues includes N task requests that belong to the same priority and have a first magneto disk identifier; among the N task requests, the difference between the execution deadlines of any two task requests is less than or equal to a first threshold; the first magneto disk identifier indicates a first magneto disk among the multiple magneto disks; and the tape areas indicated by the access addresses of the N task requests are adjacent in position in the first magneto disk.
5. The method according to claim 4, characterized in that the step of processing the multiple second candidate request queues, using the execution deadline and access address indicated by the characteristics as cluster centers, to obtain the multiple request queues includes: clustering the multiple second candidate request queues, using the execution deadline and access address indicated by the characteristics as cluster centers, to obtain multiple third candidate request queues; and aggregating the multiple third candidate request queues based on multiple estimated execution times and a total data volume corresponding to the multiple task requests to obtain the multiple request queues; wherein each estimated execution time corresponds to one task request, the first request queue includes at least two third candidate request queues from among the plurality of third candidate request queues, and the estimated execution time of the first request queue is less than the sum of the estimated execution times of the at least two third candidate request queues.
6. The method according to any one of claims 1-5, characterized in that the step of determining the scheduling order of the multiple request queues based on the state of the cache includes: for one request queue among the multiple request queues, using the state of the cache and queue characteristics of the request queue as input to a queuing model, and outputting a queuing score of the request queue, wherein the queue characteristics include one or more of the following: the amount of data corresponding to the request queue, a seek time, maximum tape wear information, and a maximum execution waiting time of the request queue, and the queuing model is used to determine the queuing score of each request queue among the multiple request queues; and obtaining the scheduling order of the multiple request queues based on the queuing score of each request queue.
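As a non-limiting illustration of how a queuing score could be computed from the cache state and the queue characteristics listed in claim 6, the following Python sketch uses a weighted linear sum. The claim does not specify the form of the queuing model, so the linear form, the weights, the sign of each term, and the names `QueueFeatures`, `queuing_score`, and `scheduling_order` are all assumptions.

```python
from dataclasses import dataclass

GIB = 2 ** 30


@dataclass
class QueueFeatures:
    name: str
    data_bytes: int       # amount of data corresponding to the request queue
    seek_time_s: float    # seek time
    max_tape_wear: float  # maximum tape wear, assumed normalized to 0..1
    max_wait_s: float     # maximum execution waiting time
    is_write: bool        # assumption: a queue holds only reads or only writes


def queuing_score(q, cache_occupancy):
    """Hypothetical linear queuing model; a higher score is scheduled earlier."""
    # Assumption: a full cache favors writes (to drain it) and an empty cache
    # favors reads (to fill it).
    cache_term = cache_occupancy if q.is_write else 1.0 - cache_occupancy
    return (
        1.0 * q.max_wait_s           # long waits raise the score
        + 1.0 * q.data_bytes / GIB   # larger batches raise the score
        - 1.0 * q.seek_time_s        # long seeks lower the score
        - 10.0 * q.max_tape_wear     # worn tape areas lower the score
        + 5.0 * cache_term
    )


def scheduling_order(queues, cache_occupancy):
    """Sort request queues by descending queuing score."""
    return sorted(queues, key=lambda q: queuing_score(q, cache_occupancy), reverse=True)
```

With two otherwise identical queues, the write queue would be ordered first at a cache occupancy of 0.9 and the read queue first at 0.1, which is the behavior the cache term is meant to capture.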
7. The method according to claim 6, characterized in that the step of obtaining the scheduling order of the multiple request queues based on the queuing score of each request queue includes: if the execution deadline of at least one request queue among the multiple request queues is less than or equal to a reserved time, obtaining a first sub-order based on the queuing score of the at least one request queue; and/or, if the occupancy rate of the cache reaches a first threshold, sorting the request queues whose task type is data write among the plurality of request queues to obtain a second sub-order, wherein the second sub-order follows the first sub-order; and/or, if the occupancy rate of the cache is lower than a second threshold, sorting the request queues whose task type is data read among the plurality of request queues to obtain a third sub-order, wherein the second threshold is less than the first threshold.
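As a non-limiting illustration of the sub-orders recited in claim 7, the following Python sketch builds a schedule from precomputed queuing scores. Several points are assumptions rather than statements of the application: the deadline comparison is read as "time remaining until the execution deadline", the third sub-order is placed after the first, all queues not selected by any sub-order are appended by score, and the names `schedule`, `high_mark`, and `low_mark` are hypothetical.

```python
def schedule(queues, cache_occupancy, reserved_time, high_mark, low_mark):
    """Return queue names in scheduling order.

    Each queue is a dict with keys: name (unique), score,
    time_to_deadline, and kind ("read" or "write").
    """
    if not low_mark < high_mark:
        raise ValueError("the second threshold must be less than the first")

    def by_score(q):
        return -q["score"]  # higher score first

    # First sub-order: queues whose deadline falls within the reserved time.
    urgent = sorted(
        (q for q in queues if q["time_to_deadline"] <= reserved_time), key=by_score
    )
    rest = [q for q in queues if q["time_to_deadline"] > reserved_time]

    if cache_occupancy >= high_mark:
        # Second sub-order: write queues, to drain a nearly full cache.
        preferred = sorted((q for q in rest if q["kind"] == "write"), key=by_score)
    elif cache_occupancy < low_mark:
        # Third sub-order: read queues, while the cache has room.
        preferred = sorted((q for q in rest if q["kind"] == "read"), key=by_score)
    else:
        preferred = []

    chosen = {q["name"] for q in urgent + preferred}
    # Assumption: everything else follows, ordered by queuing score.
    remaining = sorted((q for q in queues if q["name"] not in chosen), key=by_score)
    return [q["name"] for q in urgent + preferred + remaining]
```

Keeping the two cache thresholds apart leaves a middle band in which neither reads nor writes are preferred, so the order there depends only on deadlines and scores.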
8. The method according to any one of claims 1-7, characterized in that the scheduling order of the multiple request queues includes: a request queue with a higher priority is scheduled before a request queue with a lower priority.
9. A task scheduling method, characterized in that the method is applied to a magnetoelectric disk including a processor, a cache, and a magnetic tape, the method is executed by the processor, and the method includes: obtaining multiple task requests, wherein the task indicated by each task request includes any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task; obtaining multiple request queues based on the priority and characteristics of the multiple task requests, wherein the characteristics are used to indicate one or more of the following: a tape identifier, an execution deadline, and an access address, and one request queue among the multiple request queues includes at least one task request, among the multiple task requests, that has the same magneto disk identifier and belongs to the same priority; obtaining a scheduling order of the multiple request queues based on the state of the cache; and executing the tasks indicated by the plurality of request queues according to the scheduling order.
10. The method according to claim 9, characterized in that the step of determining the scheduling order of the multiple request queues based on the state of the cache includes: for one request queue among the multiple request queues, using the state of the cache and queue characteristics of the request queue as input to a queuing model, and outputting a queuing score of the request queue, wherein the queue characteristics include one or more of the following: the amount of data corresponding to the request queue, a seek time, maximum tape wear information, and a maximum execution waiting time of the request queue, and the queuing model is used to determine the queuing score of each request queue among the multiple request queues; and obtaining the scheduling order of the multiple request queues based on the queuing score of each request queue.
11. The method according to claim 10, characterized in that the step of obtaining the scheduling order of the multiple request queues based on the queuing score of each request queue includes: if the execution deadline of at least one request queue among the multiple request queues is less than or equal to a reserved time, obtaining a first sub-order based on the queuing score of the at least one request queue; and/or, if the occupancy rate of the cache reaches a first threshold, sorting the request queues whose task type is data write among the plurality of request queues to obtain a second sub-order, wherein the second sub-order follows the first sub-order; and/or, if the occupancy rate of the cache is lower than a second threshold, sorting the request queues whose task type is data read among the plurality of request queues to obtain a third sub-order, wherein the second threshold is less than the first threshold.
12. A magnetoelectric disk, characterized in that it comprises: a communication interface, a processor, a cache, and a magnetic tape; wherein the magnetic tape is used to store data; the communication interface is used to obtain multiple task requests, wherein the task indicated by each task request includes any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task; and the processor is configured to execute the method of any one of claims 9-11 based on the priority and characteristics of the plurality of task requests and the state of the cache.
13. A storage system, characterized in that it comprises: a communication interface, a controller, a cache, and multiple magneto disks; wherein the communication interface is used to obtain multiple task requests, wherein the task indicated by each task request includes any one of the following: data read, data write, scheduled data read, data reconstruction, garbage collection, or a user operation task; the controller is configured to obtain multiple request queues based on the priority and characteristics of the multiple task requests, wherein the characteristics are used to indicate one or more of the following: a magneto disk identifier, an execution deadline, and an access address, and one request queue among the multiple request queues includes at least one task request, among the multiple task requests, that has the same magneto disk identifier and belongs to the same priority; the cache is used to temporarily store the multiple request queues; and the controller is further configured to execute the method of any one of claims 1-8 based on the state of the cache and the plurality of request queues.
14. A chip, characterized in that it comprises: a communication interface configured to receive multiple task requests; and a processor configured to perform, in cooperation with the communication interface and based on the plurality of task requests, the method of any one of claims 1-11.
15. A computer program product, characterized in that, when the computer program product is run on an electronic device, the electronic device performs the method according to any one of claims 1-11.