Storage systems and data protection methods

The storage system addresses log accumulation in queues by generating and retrieving logs based on queue capacity, ensuring high performance and reliability through adaptive memory protection methods, even during controller failures.

JP2026110040A · Pending · Publication Date: 2026-07-02 · HITACHI VANTARA LTD

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
HITACHI VANTARA LTD
Filing Date
2024-12-20
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Conventional storage systems face issues with logs accumulating in a queue during write-back operations due to processor cores being occupied by other processes, leading to a lack of available cores for writing logs to a drive, resulting in potential data loss.

Method used

A storage system with non-volatile storage devices and multiple controllers that employ a first memory protection method to generate logs, store them in a queue, and retrieve them for writing to a non-volatile medium, controlling the execution of processes based on queue capacity, and switch between memory protection methods depending on controller status.

Benefits of technology

Prevents log accumulation in the queue, avoids deadlocks, and maintains high performance and reliability by ensuring data integrity during failures, allowing safe operation and minimal performance degradation.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure 2026110040000001_ABST
Patent Text Reader

Abstract

[Problem] To prevent logs from accumulating in the queue that stores logs of cache memory updates during write-back in a storage system. [Solution] Each of the multiple storage controllers has a first memory protection method that generates logs related to the writing and updating of data in memory, stores them in a queue in memory, retrieves the logs from the queue, and writes them to a non-volatile storage medium. When a storage controller protects data in memory using the first memory protection method, it controls the execution of a first process that stores logs in the queue and a second process that retrieves logs from the queue and writes them to the storage medium, according to the capacity of the logs stored in the queue.

Description

Technical Field

[0001] The present invention relates to a storage system and a data protection method.

Background Art

[0002] A storage system is required to have high performance and high reliability. To improve the performance of a storage system, it is useful to respond to a host when writing write data from the host to a cache memory in a storage controller is completed, and then perform write-back to write the data to a drive.

[0003] Further, Patent Document 1 discloses the following technique for enhancing the reliability of a storage system in write-back. That is, while one of the duplicated storage controllers is blocked, the updated contents of the cache memory in the other storage controller are written to a drive as a log to be made non-volatile. Thereby, even if the other storage controller fails, data loss is prevented.

Prior Art Documents

Patent Documents

[0004]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0005] In the above-described conventional technology, the log of the updated contents of the cache memory is temporarily stored in a queue before being written to a drive. However, when the queue becomes full while all processor cores are executing other processes waiting for the storage of the log, a processor core to be dispatched cannot be secured, and the process of writing the log to the drive cannot be started. As a result, there is a problem that the log is not written to the drive and the log remains in the queue.

[0006] This invention has been made in view of the above-mentioned problems, and aims to prevent logs from accumulating in the queue that stores the log of cache memory updates during write-back in a storage system.

Means for Solving the Problem

[0007] To achieve the above objective, the present invention, in one aspect, provides a storage system comprising a non-volatile storage device for storing user data and a plurality of storage controllers for controlling reading and writing to the storage device, wherein each of the plurality of storage controllers has a processor and memory, and the storage controllers include a first memory protection method for protecting the data in the memory by generating logs relating to the writing and updating of data in the memory and storing them in a queue, and retrieving the logs from the queue and writing them to a non-volatile storage medium, and the storage controllers, when protecting the data in the memory using the first memory protection method, control the execution of a first process for storing logs in the queue and a second process for retrieving logs from the queue and writing them to the storage medium according to the capacity of the logs stored in the queue.

Effects of the Invention

[0008] According to the present invention, for example, in the write-back of a storage system, it is possible to prevent logs from accumulating in the queue that stores the logs of cache memory updates.

Brief Description of the Drawings

[0009]

[Figure 1] A diagram showing the overall system configuration, including the storage system according to the embodiment.
[Figure 2] A diagram illustrating the write operation when one controller is blocked in the embodiment.
[Figure 3] A diagram showing the memory configuration of the storage system according to the embodiment.
[Figure 4] A diagram illustrating the process allocation by the scheduler according to the embodiment.
[Figure 5] A diagram illustrating the log backup request queue according to the embodiment.
[Figure 6] A diagram illustrating the threshold values for the log backup request queue according to the embodiment.
[Figure 7] A flowchart illustrating the write processing according to the embodiment.
[Figure 8] A flowchart illustrating the cache data update process according to the embodiment.
[Figure 9] A flowchart illustrating the log creation process according to the embodiment.
[Figure 10] A flowchart illustrating the control information update process according to the embodiment.
[Figure 11] A flowchart illustrating the process scheduling process according to the embodiment.
[Figure 12] A flowchart illustrating the log backup process according to the embodiment.
[Figure 13] A flowchart illustrating the destaging speed adjustment process according to the embodiment.

Modes for Carrying Out the Invention

[0010] Embodiments of the present invention will be described below with reference to the drawings. The embodiments relate, for example, to a storage system comprising a plurality of storage controllers.

[0011] (Configuration of the entire system including the storage system 100 according to the embodiment) Figure 1 is a diagram showing the overall system configuration including the storage system 100 according to this embodiment. The storage system 100 of this embodiment includes a plurality of storage controllers 103 and drives 110, which are storage devices. The storage controller 103 is a device that provides a host computer (hereinafter referred to as the host) with volumes that the host reads from and writes to.

[0012] The drive 110 is, for example, an SSD (Solid State Drive) using a flash memory as a storage medium, an HDD (Hard Disk Drive) using a magnetic disk as a storage medium, or the like.

[0013] The storage controller 103 includes a CPU 106, a memory 105, a memory backup drive 107, a front-end interface (FE I/F) 104, and a back-end interface (BE I/F) 108.

[0014] The CPU 106 is an example of a processor and includes a plurality of CPU cores 106c that execute various dispatched processes described later.

[0015] The memory 105 is a semiconductor memory such as, for example, a DRAM (Dynamic Random Access Memory). The memory backup drive 107 is a drive such as, for example, an SSD and is used to store the content of the memory 105 in case of loss of external power supply or the like.

[0016] The front-end interface 104 is, for example, a Fibre Channel HBA (Host Bus Adapter) or a NIC (Network Interface Controller). The back-end interface 108 is, for example, a SAS HBA or a PCI Express (hereinafter, PCIe) adapter or a NIC. Each storage controller 103 and the drive 110 are connected, for example, by a switch (BE Switch) 109. Also, the CPUs 106 of a plurality of controllers are connected by an interconnect such as, for example, PCIe.

[0017] The CPUs 106 may be connected to each other via, for example, a PCIe switch. The storage system 100 is connected to a storage area network (SAN) 101, such as Fibre Channel or Ethernet®, and the hosts 102 are also connected to the SAN 101. The SAN 101 may include switches, etc. Multiple hosts 102 may also be connected to the SAN 101.

[0018] (Overview of write operation when one controller is blocked in the embodiment) Figure 2 shows an overview of the write operation when one controller is blocked in the embodiment.

[0019] In this embodiment, the CPU 106 of the storage controller 103 receives a write request from the host 102, receives data 201 from the host 102, and writes the data 201 to the memory 105 within its own storage controller 103. The CPU 106 also updates the control information (Metadata) 200 in the memory 105 within its own storage controller 103. Similarly, the CPU 106 writes the same data 201 to the memory 105 of another controller. The CPU 106 also updates the control information 200 in the memory 105 of the other controller. After that, the CPU 106 returns a write completion response to the host 102. The CPU 106 also writes the write data 201 written to the memory 105 to the drive 110 at a predetermined timing (destaging).

[0020] In this way, the storage system 100 prepares for the failure of the storage controller 103 by duplicating the write data and control information on the memory 105 between the storage controllers 103 during write operations. This memory protection method is an example of a second memory protection method in which the storage controller protects the data on the memory by replicating the data on the memory onto other memories of other storage controllers corresponding to that storage controller.

[0021] Here, as shown in Figure 2, consider a situation where only one storage controller 103 of the storage system 100 is functioning normally, while the other storage controller 103 is blocked due to a malfunction or other reason. This situation is called a single-controller blockage.

[0022] When one controller is blocked, the CPU 106 of the non-faulting storage controller 103 receives a write request from the host 102, receives data 201 from the host 102, and writes data 201 to the memory 105 within its own storage controller 103. The CPU 106 also updates the control information 200 in the memory 105.

[0023] Furthermore, the CPU 106 writes the updated data 201 as a log (cache data log 201L) to the drive 110, and also writes the updated control information 200 as a log (control information log 200L) to the drive 110 (log backup). Then the CPU 106 responds to the host 102 that the write is complete. This memory protection method, which protects data in memory when one controller is blocked, is called "log backup mode".

[0024] "Log backup mode" is an example of a first memory protection method in which the storage controller generates logs regarding the writing and updating of data in memory, stores them in a queue, and protects the data in memory by retrieving the logs from the queue and writing them to a non-volatile storage medium.

[0025] The storage controller 103 switches between using a first memory protection method (log backup method) and a second memory protection method (memory replication method) depending on the operating status of the other storage controller 103 in the redundant configuration, such as normal or failed. The storage controller 103 then protects the data on memory 105 using the switched first or second memory protection method.
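
The switching rule in [0025] can be illustrated with a minimal Python sketch. The enum and function names are hypothetical and are not taken from the patent; only the rule itself (replicate while the peer controller is normal, back up logs while it is blocked) comes from the text.

```python
from enum import Enum


class PeerStatus(Enum):
    """Operating status of the other controller (names are illustrative)."""
    NORMAL = "normal"
    BLOCKED = "blocked"


class ProtectionMethod(Enum):
    LOG_BACKUP = "first memory protection method (log backup)"
    MEMORY_REPLICATION = "second memory protection method (memory replication)"


def select_protection_method(peer_status: PeerStatus) -> ProtectionMethod:
    """Choose how to protect data in memory 105 from the peer's status."""
    if peer_status is PeerStatus.NORMAL:
        # Peer is healthy: duplicate data and control information in its memory.
        return ProtectionMethod.MEMORY_REPLICATION
    # Peer is blocked: make updates non-volatile by writing logs to a drive.
    return ProtectionMethod.LOG_BACKUP
```

As [0088] notes, an implementation may also choose log backup while both controllers are healthy; this sketch covers only the switching described here.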

[0026] In this embodiment, the control information log 200L and the cache data log 201L are described as being recorded on the user data storage drive 110. However, they may also be recorded on a separate log storage drive 110 or on a non-volatile medium such as the memory backup drive 107.

[0027] (Memory configuration of the storage system 100 according to the embodiment) Figure 3 shows the memory configuration of the storage system 100 according to the embodiment. The memory 105 includes a storage control program 1051, control information 200, cache data 1052, and a log backup request queue 1053.

[0028] The storage control program 1051 is a program that controls the storage system 100 and is executed by the CPU 106. The write operations and other processes described later are included in the operations performed by the write command process and the like, which are started by the execution of the storage control program 1051.

[0029] The control information 200 is data used by the storage control program 1051 to control the execution of the program. The control information 200 includes the control information log storage destination management table 200a and the cache data log storage destination management table 200b.

[0030] The control information log storage destination management table 200a manages the address of the storage destination for the control information log 200L. The cache data log storage destination management table 200b manages the address of the storage destination for the cache data log 201L. Control information 200 also includes cache control information, such as the correspondence between the address of cache data 1052 and the logical address (LBA) within the volume, and the status of the cache data (dirty / clean). Furthermore, control information 200 also includes configuration information, such as the drive type and capacity, the RAID group type and configuration, and the status of each controller (normal / blocked, etc.).

[0031] Cache data 1052 is data 201.

[0032] (Overview of log backup request queue 1053 according to the embodiment) Figure 4 shows an overview of process allocation by the scheduler according to this embodiment. Figure 5 shows an overview of the log backup request queue 1053 according to this embodiment.

[0033] As shown in Figure 4, each process 400 included in the group of processes waiting to be executed 500 is selected by a scheduler run by the CPU 106 according to the schedule and dispatched to the CPU core 106c of the CPU 106. Process 400 includes first processes such as the write command process 400-1, the destaging process 400-2, the deduplication process 400-3, and the snapshot process 400-4, and second processes such as the log backup process 400a.

[0034] The log backup process 400a stores log 400L, which is stored in the log backup request queue 1053, into drive 110.

[0035] The write command process 400-1 receives an I/O request from host 102 and writes data 201 and control information 200 to memory 105. When one controller is blocked, the write command process 400-1 additionally stores the cache data log 201L and control information log 200L in the log backup request queue 1053.

[0036] The destaging process 400-2 writes the data 201 stored in memory 105 to drive 110 at a predetermined timing. The deduplication process 400-3 provides a deduplication function. The snapshot process 400-4 provides a snapshot creation function. Other processes are not shown in the diagrams or described.

[0037] As shown in Figure 5, when one controller is blocked, the cache data log 201L and the control information log 200L are written to the drive 110 to make the updated contents of the memory 105 non-volatile. This prevents data loss even if the remaining storage controller 103, which is still operating normally, also fails.

[0038] Processes 400, such as the write command process 400-1 and the destaging process 400-2, execute predetermined processes and output log 400L (cache data log 201L and control information log 200L) when dispatched to CPU core 106c. The output log 400L is registered in the log backup request queue 1053.

[0039] When log backup process 400a is dispatched to CPU core 106c, it retrieves log 400L from log backup request queue 1053 and writes it to log backup drive 110.

[0040] In conventional technology, if the log backup request queue 1053 is full and all running processes 400 stall simultaneously, a deadlock occurs because there is no CPU core 106c to run the log backup process 400a. Also, if the log backup request queue 1053 is full and the log backup process 400a waits for a tag to become available to manage the multiplexing of drive 110, and the destaging process 400-2 updates memory 105 while still holding the tag, a deadlock may occur. This is because if the log backup process 400a cannot operate, no space becomes available in the log backup request queue 1053, causing the processing of the destaging process 400-2, which is holding the tag, to stop and the tag not to be released. However, according to this embodiment, the above-mentioned deadlocks can be avoided and the reliability of the storage system 100 can be improved.

[0041] (Threshold for log backup request queue 1053 according to the embodiment) Figure 6 is a diagram showing an overview of the thresholds of the log backup request queue 1053 according to the embodiment. Four thresholds, namely Th1 (first threshold), Th2 (second threshold), Th3 (third threshold), and Th4 (fourth threshold), are provided, counted from the OUT side of the log backup request queue 1053. The thresholds Th1, Th2, Th3, and Th4 are thresholds on the capacity of the log 400L stored in the log backup request queue 1053, and they satisfy Th1 < Th2 and Th3 < Th4. No ordering is required between Th2 and Th3.

[0042] When the capacity of the log 400L stored in the log backup request queue 1053 becomes equal to or greater than the threshold Th1, a log backup request is output. When a log backup request is output, the log backup process 400a dispatched to the CPU core 106c retrieves the log 400L stored in the log backup request queue 1053 and stores it in the drive 110.

[0043] When the capacity of the log 400L stored in the log backup request queue 1053 becomes equal to or greater than the threshold Th2, the number of destage requests is reduced from a predetermined number determined based on the status of cache data, such as the dirty rate of the memory 105. Because the destage process 400-2 then runs less frequently, the data 201 cached in the memory 105 is written to the drive 110 with what would otherwise have been several destage operations grouped together.

[0044] When the capacity of the log 400L stored in the log backup request queue 1053 becomes equal to or greater than the threshold Th3, a sleep time is provided for the execution of the write command process 400-1. By providing a sleep time for the execution of the write command process 400-1, the execution interval of the write process is increased, and the inflow rate of the log 400L into the log backup request queue 1053 is suppressed.

[0045] When the capacity of log 400L stored in log backup request queue 1053 becomes equal to or greater than the threshold Th4, the destaging request is canceled. The execution of destaging process 400-2 is canceled, and the execution of log backup process 400a takes priority. The threshold Th4 is determined by adding a predetermined margin to the amount of log 400L generated by a single execution of process 400.

[0046] As described above, when the storage controller 103 protects data in memory using the first memory protection method, it controls the execution of the first and second processes according to the capacity of the logs stored in the log backup request queue 1053.
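
The four threshold checks in [0041] to [0045] can be summarized in a short Python sketch. The class, field, and key names are hypothetical, the capacity unit (bytes) is an assumption, and treating every threshold as inclusive ("equal to or greater than") is also an assumption made for uniformity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    th1: int  # at or above: output a log backup request
    th2: int  # at or above: reduce the number of destage requests
    th3: int  # at or above: add sleep time to the write command process
    th4: int  # at or above: cancel destage requests

    def __post_init__(self):
        # The text requires Th1 < Th2 and Th3 < Th4; Th2 vs Th3 is unconstrained.
        if not (self.th1 < self.th2 and self.th3 < self.th4):
            raise ValueError("thresholds must satisfy Th1 < Th2 and Th3 < Th4")


def controls_for(queued_bytes: int, th: Thresholds) -> dict:
    """Map the log capacity held in the queue to the controls that apply."""
    return {
        "request_log_backup": queued_bytes >= th.th1,
        "reduce_destage_requests": queued_bytes >= th.th2,
        "sleep_write_commands": queued_bytes >= th.th3,
        "cancel_destage_requests": queued_bytes >= th.th4,
    }
```

For example, with thresholds 10, 20, 30, and 40, a queue holding 25 units triggers the log backup request and the destage reduction, but neither the write sleep nor the destage cancellation.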

[0047] (Write processing according to the embodiment) Figure 7 is a flowchart illustrating the write process according to the embodiment. The write process is executed each time a write request is received from the host 102.

[0048] First, in step S11, the write command process 400-1 determines whether the inflow of logs 400L into the log backup request queue 1053 is restricted, that is, whether the capacity of logs 400L stored in the log backup request queue 1053 is greater than or equal to the threshold Th3. If the inflow of logs 400L into the log backup request queue 1053 is restricted (step S11 YES), the write command process 400-1 moves to step S12, and if the inflow is not restricted (step S11 NO), it moves to step S13.

[0049] In step S12, the write command process 400-1 performs a sleep process to wait for a certain period of time before processing. While the write command process is sleeping, the CPU core 106c can execute other processes. Next, in step S13, the write command process 400-1 allocates a cache area of memory 105 for the write data related to the I/O request from host 102.

[0050] Next, in step S14, the write command process 400-1 executes a cache data update process. Details of the cache data update process will be described later with reference to Figure 8. Next, in step S15, the write command process 400-1 executes a control information update process. Details of the control information update process will be described later with reference to Figure 10.

[0051] Next, in step S16, the write command process 400-1 determines whether it is in log backup mode. If it is in log backup mode (step S16 YES), the write command process 400-1 moves to step S17, and if it is not in log backup mode (step S16 NO), it moves to step S19.

[0052] In step S17, the write command process 400-1 outputs a log backup request. In step S17, the write command process 400-1 changes the status of the log 400L to which the log backup request was sent to "Backup Requested", and does not send another backup request for the log 400L that is already "Backup Requested".

[0053] Next, in step S18, the write command process 400-1 waits for the log backup requested in step S17 to be completed. Before the host response in step S19, the contents of memory 105 updated in this write process must be made non-volatile, so the process waits until the latest log 400L generated in this write process has been backed up. While the write command process is waiting, the CPU core 106c can execute another process.

[0054] In step S19, the write command process 400-1 sends a response to the write request to the host 102.
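
The branching of steps S11 to S19 can be traced with a small Python sketch. It only records which steps run for a given combination of conditions. The function name and the step labels are illustrative, and the real steps (sleeping, cache allocation, waiting for the backup) are not modeled.

```python
def write_steps(inflow_restricted: bool, log_backup_mode: bool) -> list:
    """Return the ordered steps of the write process for the given conditions."""
    steps = []
    if inflow_restricted:                      # S11: queue capacity >= Th3?
        steps.append("S12 sleep")
    steps.append("S13 allocate cache area")
    steps.append("S14 update cache data")
    steps.append("S15 update control information")
    if log_backup_mode:                        # S16: log backup mode?
        steps.append("S17 output log backup request")
        steps.append("S18 wait for log backup to complete")
    steps.append("S19 respond to host")
    return steps
```

In this trace the host response (S19) always comes last, and in log backup mode it comes only after the wait in S18, which matches the requirement that updated memory contents be non-volatile before the host is answered.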

[0055] (Cache data update process according to the embodiment) Figure 8 is a flowchart showing the cache data update process according to an embodiment.

[0056] First, in step S14a, the write command process 400-1 updates the cache data (data 201) in memory 105. Next, in step S14b, the write command process 400-1 determines whether the cache data updated in step S14a needs to be made non-volatile. If non-volatilization is necessary (step S14b YES), the write command process 400-1 moves to step S14c. On the other hand, if non-volatilization is not necessary (step S14b NO), the write command process 400-1 terminates the cache data update process.

[0057] In step S14c, the write command process 400-1 executes the log creation process. Details of the log creation process will be described later with reference to Figure 9.

[0058] Next, in step S14d, the write command process 400-1 determines whether the update in step S14a is an overwrite of data. If the update in step S14a is an overwrite of data (step S14d YES), the write command process 400-1 moves to step S14e. On the other hand, if the update in step S14a is a new registration of data (step S14d NO), the write command process 400-1 moves to step S14f.

[0059] In step S14e, the write command process 400-1 invalidates the log in the log header table (cache data log storage destination management table 200b). Next, in step S14f, the write command process 400-1 updates the log header table.
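
As a rough illustration of steps S14a to S14f, the following Python sketch models the cache and the log header table as dictionaries keyed by logical address. That layout, the function name, and the use of a bare sequence number to stand in for the created log are all assumptions; the patent does not specify the table structure.

```python
def update_cache(cache: dict, log_header_table: dict, lba: int, data: bytes,
                 seq: int, needs_nonvolatile: bool = True):
    """Update cached data and return the sequence number of any invalidated log."""
    is_overwrite = lba in cache            # remembered for the S14d decision
    cache[lba] = data                      # S14a: update cache data
    if not needs_nonvolatile:              # S14b: no non-volatilization needed
        return None
    # S14c: log creation is represented here only by its sequence number.
    invalidated = None
    if is_overwrite:                       # S14d: overwrite of existing data?
        invalidated = log_header_table.pop(lba, None)  # S14e: invalidate old log
    log_header_table[lba] = seq            # S14f: register the new log
    return invalidated
```

A first write to an address registers its log; a second write to the same address invalidates the earlier log and registers the newer one.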

[0060] (Log creation process according to the embodiment) Figure 9 is a flowchart showing the log creation process according to this embodiment.

[0061] First, in step S14c1, the write command process 400-1 secures a sequence number. This sequence number indicates the creation order of each log and is stored in the log header created in subsequent steps. Next, in step S14c2, the write command process 400-1 secures an entry in the log backup request queue 1053. Then, in step S14c3, the write command process 400-1 creates the log header.

[0062] Next, in step S14c4, the write command process 400-1 stores the log with the log header created in step S14c3 attached to the entry in the log backup request queue 1053 that was secured in step S14c2. Next, in step S14c5, the write command process 400-1 activates the log stored in step S14c4.

[0063] Next, in step S14c6, the write command process 400-1 determines whether the capacity of the unsaved logs 400L stored in the log backup request queue 1053 is at or above a specified amount, i.e., at or above the threshold Th1. "Unsaved logs" are logs that have been enqueued in the log backup request queue 1053 but whose status has not yet changed to "Backup Requested".

[0064] The write command process 400-1 moves to step S14c7 if the capacity of the unsaved logs 400L stored in the log backup request queue 1053 is greater than or equal to the specified amount (step S14c6 YES).

[0065] On the other hand, the write command process 400-1 terminates the log creation process if the capacity of the unsaved logs 400L stored in the log backup request queue 1053 is less than the specified amount (step S14c6 NO).

[0066] In step S14c7, the write command process 400-1 outputs a log backup request for the unsaved logs 400L stored in the log backup request queue 1053.
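
The log creation steps S14c1 to S14c7 can be sketched in Python as follows. The class and field names are hypothetical, log capacity is measured in payload bytes as an assumption, and "activating" a log (S14c5) is modeled as setting a flag, which is one possible reading of the text.

```python
import itertools
from collections import deque
from dataclasses import dataclass


@dataclass
class LogEntry:
    seq: int                        # sequence number kept in the log header
    payload: bytes = b""
    active: bool = False
    backup_requested: bool = False  # True once status is "Backup Requested"


class LogBackupRequestQueue:
    def __init__(self, th1_bytes: int):
        self.th1_bytes = th1_bytes
        self.entries = deque()
        self.requests_issued = 0
        self._next_seq = itertools.count(1)

    def unsaved_bytes(self) -> int:
        """Capacity of active logs not yet marked "Backup Requested"."""
        return sum(len(e.payload) for e in self.entries
                   if e.active and not e.backup_requested)

    def create_log(self, payload: bytes) -> LogEntry:
        seq = next(self._next_seq)                    # S14c1: secure sequence number
        entry = LogEntry(seq=seq)                     # S14c3: header (seq only here)
        self.entries.append(entry)                    # S14c2: secure a queue entry
        entry.payload = payload                       # S14c4: store the log
        entry.active = True                           # S14c5: activate the log
        if self.unsaved_bytes() >= self.th1_bytes:    # S14c6: compare with Th1
            self._request_backup()                    # S14c7: output backup request
        return entry

    def _request_backup(self) -> None:
        for e in self.entries:
            if e.active and not e.backup_requested:
                e.backup_requested = True
        self.requests_issued += 1
```

Marking entries as requested keeps the same log from being requested twice, in line with the "Backup Requested" status described for step S17.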

[0067] (Control information update process according to the embodiment) Figure 10 is a flowchart showing the control information update process according to the embodiment.

[0068] First, in step S15a, the write command process 400-1 updates the control information 200 in memory 105. Next, in step S15b, the write command process 400-1 determines whether it is necessary to make the control information updated in step S15a non-volatile. If it is necessary to make the control information updated in step S15a non-volatile (step S15b YES), the write command process 400-1 moves to step S15c. On the other hand, if it is not necessary to make the control information updated in step S15a non-volatile (step S15b NO), the write command process 400-1 terminates the control information update process.

[0069] In step S15c, the write command process 400-1 executes the log creation process described with reference to Figure 9.

[0070] (Process scheduling process according to the embodiment) Figure 11 is a flowchart of the process scheduling process according to the embodiment. The process scheduling process is repeatedly executed by the CPU core 106c that executes the process scheduling process. The process scheduling process starts various processes in response to requests such as commands from the host 102. However, in Figure 11, the explanation is simplified by using the startup of the log backup process 400a (step S22), the startup of the destaging process 400-2 (step S24), and the startup of command processing (step S26) as examples.

[0071] First, in step S21, CPU core 106c determines whether there is a log backup request (step S14c7 (Figure 9)). If there is a log backup request (step S21 YES), CPU core 106c moves to step S22. On the other hand, if there is no log backup request (step S21 NO), CPU core 106c moves to step S23.

[0072] In step S22, CPU core 106c initiates the log backup process (Figure 12), which will be described later.

[0073] Next, in step S23, the CPU core 106c determines whether there is a destaging request. If there is a destaging request (step S23 YES), the CPU core 106c moves to step S24. On the other hand, if there is no destaging request (step S23 NO), the CPU core 106c moves to step S25.

[0074] In step S24, CPU core 106c dispatches destaging process 400-2 to CPU core 106c to initiate the destaging process.

[0075] Next, in step S25, the CPU core 106c determines whether it has received a command from the host 102. If the CPU core 106c has received a command (step S25 YES), it moves processing to step S26. On the other hand, if the CPU core 106c has not received a command (step S25 NO), it moves processing to step S27.

[0076] In step S26, CPU core 106c initiates processing according to the received command.

[0077] Next, in step S27, the CPU core 106c determines whether there are any waiting processes 400. If there are waiting processes 400 (step S27 YES), the CPU core 106c proceeds to step S28. On the other hand, if there are no processes 400 waiting to be executed (step S27 NO), the CPU core 106c terminates the process scheduling process.

[0078] In step S28, CPU core 106c starts the waiting process 400.
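
One pass of the scheduling flow in steps S21 to S28 can be sketched in Python as below. The function signature and the returned labels are illustrative. The sketch assumes that a received command leads to step S26 and that the absence of a command leads directly to step S27, and it records what would be started rather than dispatching anything.

```python
def schedule_once(log_backup_requested: bool, destage_requested: bool,
                  host_command, waiting: list) -> list:
    """Return, in order, what one pass of the scheduling process would start."""
    started = []
    if log_backup_requested:                         # S21: log backup request?
        started.append("log backup process")         # S22
    if destage_requested:                            # S23: destaging request?
        started.append("destaging process")          # S24
    if host_command is not None:                     # S25: command received?
        started.append(f"command: {host_command}")   # S26
    if waiting:                                      # S27: any waiting process?
        started.append(f"resume: {waiting[0]}")      # S28
    return started
```

Because the log backup check comes first in each pass, a pending log backup request is always picked up before destaging or command processing on that pass.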

[0079] (Log backup process according to this embodiment) Figure 12 is a flowchart illustrating the log backup process according to this embodiment. The log backup process is performed by the log backup process 400a.

[0080] First, in step S31, the log backup process 400a retrieves a predetermined amount of unsaved logs 400L stored in the log backup request queue 1053. Next, in step S32, the log backup process 400a writes the logs 400L retrieved in step S31 to the drive 110, at the storage destination managed by the control information log storage destination management table 200a or the cache data log storage destination management table 200b. Then, in step S33, the log backup process 400a deletes the logs 400L written to the drive 110 in step S32 from the log backup request queue 1053.
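
Steps S31 to S33 amount to a read, write, then delete sequence, sketched here in Python. The drive is modeled as a plain list and the "predetermined amount" as a count of entries; both are simplifying assumptions, as is the function name.

```python
from collections import deque


def backup_logs(queue: deque, drive: list, batch_size: int) -> int:
    """Write up to batch_size logs from the queue head to the drive."""
    count = min(batch_size, len(queue))
    batch = [queue[i] for i in range(count)]   # S31: retrieve without removing
    drive.extend(batch)                        # S32: write the logs to the drive
    for _ in range(count):                     # S33: delete only what was written
        queue.popleft()
    return count
```

Deleting entries only after the write means a log stays in the queue until it is on the drive, which follows the order of S32 and S33 in the text.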

[0081] (Destage speed adjustment process according to the embodiment) Figure 13 is a flowchart of the destage speed adjustment processing according to an embodiment. This processing is carried out by the destage speed adjustment process.

[0082] First, in step S41, the destaging speed adjustment process determines the number of destaging requests based on the dirty rate of the data 201 cached in memory 105. Here, the dirty rate = dirty cache amount / total cache capacity, and the higher the dirty rate, the more destaging requests are made. Note that the number of destaging requests may be a value calculated based on other indicators representing the status of the cache data, or it may be a constant value, not limited to the dirty rate.

[0083] Next, in step S42, the destage speed adjustment process determines whether log backup priority is in effect, i.e., whether the capacity of the log 400L stored in the log backup request queue 1053 is greater than or equal to the threshold Th2. If log backup priority is in effect (step S42 YES), the destage speed adjustment process moves to step S43. On the other hand, if log backup priority is not in effect (step S42 NO), the destage speed adjustment process moves to step S44.

[0084] In step S43, the destaging speed adjustment process reduces the number of destaging requests determined in step S41 by a predetermined number. In step S43, because the available space in the log backup request queue 1053 is below a certain value, the frequency of destaging is reduced in order to prioritize log backup.

[0085] Next, in step S44, the destage speed adjustment process determines whether the log inflow is restricted, i.e., whether the capacity of the log 400L stored in the log backup request queue 1053 is greater than or equal to the threshold Th4. If the log inflow is restricted (step S44 YES), the destage speed adjustment process terminates. In this case, the capacity of the log 400L stored in the log backup request queue 1053 has reached the threshold Th4. For this reason, the issuance of destaging requests (step S45), and with it the execution of the destaging process 400-2, a first process that writes the data cached in memory 105 to drive 110, is canceled.

[0086] On the other hand, if log inflow is not restricted (step S44 NO), the destage speed adjustment process moves to step S45.

[0087] In step S45, the destage speed adjustment process issues destaging requests for the data 201 cached in memory 105, up to the number of destaging requests finally determined through steps S41 to S43.
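The flow of steps S41 to S45 can be illustrated with the following minimal sketch. It is not the implementation of the embodiment. The linear mapping from dirty rate to request count, the names `DestageConfig` and `destage_request_count`, and all numeric values are assumptions introduced only for illustration. Only the dirty rate formula, the thresholds Th2 and Th4, and the order of the decisions are taken from the description.

```python
from dataclasses import dataclass


@dataclass
class DestageConfig:
    # All values are illustrative assumptions, not figures from the embodiment.
    max_requests: int = 64        # requests issued when the cache is fully dirty
    priority_reduction: int = 16  # the "predetermined number" subtracted in S43
    th2: int = 600                # log backup priority threshold (queued log capacity)
    th4: int = 900                # log inflow restriction threshold (queued log capacity)


def destage_request_count(dirty_amount: int, cache_capacity: int,
                          queued_log_capacity: int, cfg: DestageConfig) -> int:
    """Return how many destaging requests to issue in one pass (S41 to S45)."""
    # S41: dirty rate = dirty cache amount / total cache capacity.
    # The linear scaling below is an assumed mapping.
    dirty_rate = dirty_amount / cache_capacity
    requests = round(cfg.max_requests * dirty_rate)

    # S42 / S43: under log backup priority, reduce the request count.
    if queued_log_capacity >= cfg.th2:
        requests = max(0, requests - cfg.priority_reduction)

    # S44: under log inflow restriction, step S45 is skipped entirely.
    if queued_log_capacity >= cfg.th4:
        return 0

    # S45: the caller issues this many destaging requests.
    return requests
```

With the assumed defaults, a half-dirty cache yields 32 requests while the queue is nearly empty, 16 once the queued log capacity reaches Th2, and none once it reaches Th4.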

[0088] Furthermore, the data protection on the cache memory using the first memory protection method described above can be applied not only when one of the redundant storage controllers is blocked, but also during normal operation when both storage controllers are functioning correctly.

[0089] (Effects of the embodiment) In the above embodiment, when protecting data in memory using the first memory protection method, the execution of the first process that stores logs in a queue and the second process that retrieves logs from the queue and writes them to the storage medium is controlled according to the capacity of the logs stored in the queue. Therefore, during write-back of the storage system, it is possible to prevent logs from accumulating in the queue that stores logs of cache memory updates. Furthermore, deadlocks in the memory non-volatility function can be avoided, and the system can be operated safely.

[0090] Furthermore, in the above embodiment, the system switches between using the first memory protection method and the second memory protection method depending on the operating state of the other storage controller, and protects the data in memory using either the first or second memory protection method. Therefore, in the event of a failure in one storage controller, the first memory protection method can be used instead of the second memory protection method to make the memory non-volatile, thereby suppressing a decrease in reliability, while preventing log accumulation in the queue that stores the log of the cache memory update contents. In addition, high performance can be achieved by executing arbitrary jobs using all CPU cores under normal conditions, and high reliability can be achieved by the first memory protection method when one storage controller fails. In other words, a storage system can be realized that is high-performance under normal conditions and has minimal performance degradation in the event of a failure.
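The switching described in this paragraph can be sketched as follows. This is a hedged illustration only: the enum `ProtectionMethod`, the function `select_protection_method`, and the flag `force_log_method` are hypothetical names, and the sketch assumes the operating state of the other storage controller can be reduced to a single boolean.

```python
from enum import Enum


class ProtectionMethod(Enum):
    FIRST_LOG_BASED = 1     # log updates, queue them, write them to a non-volatile medium
    SECOND_REPLICATION = 2  # replicate memory contents to the other controller's memory


def select_protection_method(peer_is_operational: bool,
                             force_log_method: bool = False) -> ProtectionMethod:
    """Choose a memory protection method from the other controller's state.

    force_log_method models the remark that the first method may also be
    applied while both controllers operate normally (an assumed switch).
    """
    if force_log_method or not peer_is_operational:
        # Replication has no healthy target, or the log-based method was requested.
        return ProtectionMethod.FIRST_LOG_BASED
    return ProtectionMethod.SECOND_REPLICATION
```

Under this sketch, the queue-capacity controls of the embodiment apply only while `FIRST_LOG_BASED` is selected.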

[0091] Furthermore, in the above-described embodiment, when protecting data in memory using the first memory protection method, if the capacity of the logs stored in the queue is greater than or equal to the fourth threshold, the execution of the destaging process among the first processes is canceled. On the other hand, if the capacity of the logs stored in the queue is less than the fourth threshold, the destaging process is executed. Therefore, when the log capacity is greater than or equal to the fourth threshold, log accumulation in the queue can be quickly resolved by performing only log evacuation without executing destaging.

[0092] Furthermore, in the above-described embodiment, when protecting data in memory using the first memory protection method, the second process (log evacuation process 400a) is executed when the capacity of the logs stored in the queue is greater than or equal to the first threshold, which is less than the fourth threshold. On the other hand, when the capacity of the logs stored in the queue is less than the first threshold, the execution of the second process is canceled. Therefore, by prioritizing log evacuation when the log capacity is greater than or equal to the first threshold, log accumulation in the queue can be suppressed.

[0093] Furthermore, in the above-described embodiment, when protecting data in memory using the first memory protection method, the destaging process is executed at a predetermined frequency when the capacity of the logs stored in the queue is greater than the first threshold and less than the second threshold, which is less than the fourth threshold. On the other hand, when the capacity of the logs stored in the queue is greater than or equal to the second threshold, the destaging process is executed at a frequency lower than the predetermined frequency. Therefore, when the log capacity is greater than or equal to the second threshold, cached data is destaged in batches less often, which reduces the drive load and suppresses log accumulation in the queue.

[0094] Furthermore, in the above-described embodiment, when protecting data in memory using the first memory protection method, if the capacity of the logs stored in the queue is greater than or equal to the third threshold, the execution interval of the write process among the first processes is extended. On the other hand, if the capacity of the logs stored in the queue is less than the third threshold, the execution interval of the write process is not extended. Therefore, by reducing the rate at which logs flow into the queue (making write I/O wait), the log generation rate can be reduced, and log accumulation in the queue can be suppressed.
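The threshold-based controls of paragraphs [0091] to [0094] can be summarized in one decision function, sketched below under stated assumptions. The names `Thresholds`, `QueuePolicy`, and `policy_for` are hypothetical. The sketch enforces only the ordering stated in the description (first threshold < second threshold < fourth threshold) and leaves the position of the third threshold unconstrained. Treating capacities at or below the first threshold as "normal" destaging is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    th1: int  # at or above: log evacuation (second process) runs
    th2: int  # at or above: destaging frequency is lowered
    th3: int  # at or above: write process execution interval is extended
    th4: int  # at or above: destaging is canceled

    def __post_init__(self):
        # Only this ordering is stated in the description.
        if not (self.th1 < self.th2 < self.th4):
            raise ValueError("expected th1 < th2 < th4")


@dataclass(frozen=True)
class QueuePolicy:
    run_log_evacuation: bool
    destage: str  # "normal", "reduced", or "canceled"
    extend_write_interval: bool


def policy_for(queued_log_capacity: int, t: Thresholds) -> QueuePolicy:
    """Map the capacity of queued logs to the controls applied to each process."""
    if queued_log_capacity >= t.th4:
        destage = "canceled"
    elif queued_log_capacity >= t.th2:
        destage = "reduced"
    else:
        destage = "normal"
    return QueuePolicy(
        run_log_evacuation=queued_log_capacity >= t.th1,
        destage=destage,
        extend_write_interval=queued_log_capacity >= t.th3,
    )
```

Returning a single immutable policy object keeps the four decisions consistent for a given capacity reading, instead of letting each process sample the queue at a different moment.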

[0095] Furthermore, in the above embodiment, the storage medium for backing up the log is drive 110. Therefore, by using a portion of the user data drive for log backup, high performance can be obtained at low cost.

[0096] Although several embodiments have been described above, these are merely illustrative examples for explaining the present invention and are not intended to limit the scope of the present invention to these embodiments only. The present invention can also be implemented in various other forms, such as forms in which some of the components of the above embodiments are omitted, forms in which at least some of the components are replaced, forms in which components are added, or forms in which some or all of the embodiments are combined.

[Explanation of Symbols]

[0097] 100: Storage system, 102: Host, 103: Storage controller, 105: Memory, 106: CPU, 110: Drive, 200: Control information, 200L: Control information log, 201: Data, 201L: Cache data log, 400: Process, 400-1: Write command process, 400-2: Destage process, 400L: Log, 400a: Log evacuation process, 1051: Storage control program, 1053: Log evacuation request queue

Claims

1. A storage system comprising a non-volatile storage device for storing user data and a plurality of storage controllers for controlling reading from and writing to the storage device, wherein each of the storage controllers has a processor and a memory, the storage controller has a first memory protection method that protects the data in the memory by generating logs relating to the writing and updating of data in the memory, storing the logs in a queue, and retrieving the logs from the queue and writing them to a non-volatile storage medium, and the storage controller, when protecting the data in the memory using the first memory protection method, controls the execution of a first process that stores the logs in the queue and a second process that retrieves the logs from the queue and writes them to the storage medium, according to the capacity of the logs stored in the queue.

2. The storage system according to claim 1, wherein the storage controller further has a second memory protection method that protects the data in the memory by replicating the data in the memory to another memory of another storage controller corresponding to the storage controller, and the storage controller switches between the first memory protection method and the second memory protection method depending on the operating state of the other storage controller, and protects the data in the memory using the first or second memory protection method selected by the switching.

3. The storage system according to claim 1, wherein the storage controller, when protecting the data in the memory using the first memory protection method, cancels the execution of a destaging process, which is among the first processes and writes cached data in the memory to the storage device, when the capacity of the logs stored in the queue is greater than or equal to a fourth threshold, and executes the destaging process when the capacity of the logs stored in the queue is less than the fourth threshold.

4. The storage system according to claim 3, wherein the storage controller, when protecting the data in the memory using the first memory protection method, executes the second process when the capacity of the logs stored in the queue is greater than or equal to a first threshold that is less than the fourth threshold, and cancels the execution of the second process when the capacity of the logs stored in the queue is less than the first threshold.

5. The storage system according to claim 4, wherein the storage controller, when protecting the data in the memory using the first memory protection method, executes the destaging process at a predetermined frequency when the capacity of the logs stored in the queue is greater than the first threshold and less than a second threshold that is less than the fourth threshold, and executes the destaging process at a frequency lower than the predetermined frequency when the capacity of the logs stored in the queue exceeds the second threshold.

6. The storage system according to claim 4, wherein the storage controller, when protecting the data in the memory using the first memory protection method, extends the execution interval of a write process, which is among the first processes and writes data relating to a write request from a host to the storage device, when the capacity of the logs stored in the queue is greater than the first threshold, less than the fourth threshold, and greater than or equal to a third threshold, and does not extend the execution interval of the write process when the capacity of the logs stored in the queue is less than the third threshold.

7. The storage system according to claim 1, wherein the storage medium is the storage device.

8. A data protection method performed by a storage system comprising a non-volatile storage device for storing user data and a plurality of storage controllers for controlling reading from and writing to the storage device, wherein each of the storage controllers has a processor and a memory, the storage controller has a first memory protection method that protects the data in the memory by generating logs relating to the writing and updating of data in the memory, storing the logs in a queue, and retrieving the logs from the queue and writing them to a non-volatile storage medium, and the data protection method includes a processing step in which the storage controller, when protecting the data in the memory using the first memory protection method, controls the execution of a first process that stores the logs in the queue and a second process that retrieves the logs from the queue and writes them to the storage medium, according to the capacity of the logs stored in the queue.