A DMA management method, system, device and storage medium

By monitoring the DMA status of the NTB chip and resetting only the faulty DMA engine, the method avoids the interruption of data transfer caused by a DMA failure, improving data reliability and stability in a multi-controller storage system.

CN114816824B · Active · Publication Date: 2026-06-12 · INSPUR SUZHOU INTELLIGENT TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
INSPUR SUZHOU INTELLIGENT TECH CO LTD
Filing Date
2022-05-31
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In multi-controller storage systems, when a DMA failure occurs between two controllers, existing technology requires directly resetting the NTB chip, which prevents data from being transferred between controllers, affecting the reliability and data security of the entire system.

Method used

By monitoring the DMA in the NTB chip, it is determined whether only one DMA is in a working state. If not, the NTB chip is prevented from being reset, and only the abnormal DMA engine is controlled to be reset, so as to avoid affecting the data transfer of other controllers.

Benefits of technology

This ensures that data transfer between controllers can continue normally even in the event of a DMA malfunction, guaranteeing data reliability and stability and preventing the spread of system failures caused by NTB chip resets.


Abstract

The application discloses a DMA management method, system, module and storage medium, which is applied to a DMA management module, relates to the field of server management, and is used for controlling the reset of a DMA engine when the DMA is abnormal. Specifically, when it is detected that there is an abnormal DMA in the NTB chip corresponding to the DMA management module, it is judged whether only one DMA is in a working state in the NTB chip corresponding to the DMA management module. If not, in order to avoid that the controller corresponding to the DMA management module cannot perform data migration, the reset of the NTB chip is prohibited, and only the reset of the abnormal DMA engine is controlled, so that the data migration between the controllers can be normally performed. In the application, only the reset of the abnormal DMA engine is controlled instead of directly controlling the reset of the NTB chip, so that the data migration between the controllers can be performed, and the reliable caching of data is ensured.

Description

Technical Field

[0001] This invention relates to the field of server management, and in particular to a DMA management method, system, device, and storage medium.

Background Technology

[0002] Currently, regardless of whether it spans multiple chassis, unified multi-controller storage generally uses cache mirroring between controllers for data protection. In the event of a single or multiple controller failure, other normal controllers can continue to provide front-end services using the cached data, ensuring uninterrupted host services.

[0003] During cache mirroring between multiple controllers, regardless of whether they are in different chassis, each controller is equipped with a third-party chip, such as an NTB (Non-Transparent Bridge) chip. The NTB chip provides multiple DMA (Direct Memory Access) engines, and data can be transferred between any two controllers via DMA. However, in the existing technology, if the DMA between two controllers fails and data cannot be transferred normally, the NTB chips of these two controllers must be reset directly. As a result, these two controllers can no longer transfer data with each other or with any other controller, so safe caching of data cannot be guaranteed. Ultimately, the failure spreads to the entire cluster and reduces the reliability of the whole system.

Summary of the Invention

[0004] The purpose of this invention is to provide a DMA management method, system, device, and storage medium that, by controlling only the abnormal DMA engine reset rather than directly controlling the NTB chip reset, avoids the inability of data transfer between controllers and ensures reliable data caching.

[0005] To address the aforementioned technical problems, this invention provides a DMA management method applied to a DMA management module. The DMA management module monitors the NTB chip of its corresponding controller. The NTB chip includes multiple DMAs. The method includes:

[0006] Detect whether there is an abnormal DMA in the corresponding NTB chip;

[0007] If the abnormal DMA exists, determine whether only one DMA is in working state in the corresponding NTB chip;

[0008] If more than one DMA is in the working state, resetting the NTB chip is prohibited, and the abnormal DMA engine is controlled to reset.

[0009] Preferably, after determining whether only one DMA is active in the corresponding NTB chip, the method further includes:

[0010] If only one DMA of the NTB chip is in working state, then the NTB chip is controlled to be reset.

[0011] Preferably, the DMA management module is also used to monitor the CPU of its corresponding controller, and the embedded NTB chip of the CPU includes multiple DMAs;

[0012] Before determining whether only one DMA is active in the corresponding NTB chip, the process also includes:

[0013] The abnormal DMA is determined to be either the DMA of the NTB chip or the DMA of the embedded NTB chip of the CPU;

[0014] The process then proceeds to determine whether only one DMA is active in the corresponding NTB chip.

[0015] Preferably, the DMA management modules corresponding to each controller are connected via a wireless network.

[0016] Preferably, detecting whether there is abnormal DMA in the corresponding NTB chip includes:

[0017] Determine the data transfer efficiency of each DMA of the NTB chip;

[0018] Calculate the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA;

[0019] Determine whether the deviation ratio is greater than a preset threshold, and whether the holding time of the deviation ratio is greater than a preset time;

[0020] If so, then the abnormal DMA is determined to exist.

[0021] Preferably, before calculating the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA, the method further includes:

[0022] Obtain the real-time front-end service pressure;

[0023] Determine the theoretical transfer efficiency of the DMA based on the correspondence between the real-time front-end service pressure and the theoretical transfer efficiency.

[0024] Preferably, after resetting the DMA engine in case of an abnormality, the method further includes:

[0025] Determine whether the abnormal DMA has returned to normal operation;

[0026] If not, the abnormal DMA engine is repeatedly reset, and after the number of times the abnormal DMA engine is reset exceeds a preset number, the CPU-simulated DMA data transfer mode is entered.

[0027] To address the aforementioned technical problems, this application also provides a DMA management system applied to a DMA management module. The DMA management module monitors the NTB chip of its corresponding controller. The NTB chip includes multiple DMAs. The system includes:

[0028] The detection unit is used to detect whether there is an abnormal DMA in the corresponding NTB chip;

[0029] The first judgment unit is used to determine whether only one DMA is in working state in the NTB chip corresponding to itself when the abnormal DMA exists.

[0030] The first control unit is used to prevent the NTB chip from being reset when more than one DMA is in operation, and to control the reset of abnormal DMA engines.

[0031] To address the aforementioned technical problems, this application provides a DMA management module, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other via the communication bus;

[0032] The memory is used to store at least one executable instruction that causes the processor to perform operations as described above in the DMA management method.

[0033] To address the aforementioned technical problems, this application provides a computer-readable storage medium storing at least one executable instruction. When the executable instruction is executed on a DMA management device, it causes the DMA management device to perform operations as described above in the DMA management method.

[0034] This invention provides a DMA management method, system, module, and storage medium, applied to a DMA management module in the field of server management. It controls the DMA engine reset when a DMA anomaly occurs. Specifically, when the DMA management module detects an abnormal DMA in its corresponding NTB chip, it determines whether only one DMA is active within that NTB chip. If more than one DMA is active, to prevent the corresponding controller from being completely unable to perform data transfer, it prohibits resetting the NTB chip and only controls the abnormal DMA engine to reset, thus ensuring normal data transfer between controllers. This application avoids the inability to transfer data between controllers by only controlling the abnormal DMA engine reset, rather than directly resetting the NTB chip, thus ensuring reliable data caching.

Attached Figure Description

[0035] To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the prior art and embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0036] Figure 1 is a flowchart of a DMA management method provided by the present invention;

[0037] Figure 2 is a schematic diagram of a specific process of a DMA management method provided by the present invention;

[0038] Figure 3 is a schematic diagram of the structure of a DMA management system provided in this application;

[0039] Figure 4 is a schematic diagram of the structure of a DMA management module provided in this application.

Detailed Implementation

[0040] The core of this invention is to provide a DMA management method, system, device, and storage medium. By controlling only the abnormal DMA engine reset, rather than directly controlling the NTB chip reset, the data transfer between controllers is prevented from being interrupted, thus ensuring reliable data caching.

[0041] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0042] Please refer to Figure 1, which is a flowchart of a DMA management method provided by the present invention. The method is applied to a DMA management module. The DMA management module monitors the NTB chip of its corresponding controller. The NTB chip includes multiple DMAs. The method includes:

[0043] S11: Detect whether there is an abnormal DMA in its corresponding NTB chip;

[0044] In this embodiment, to ensure data integrity and security, cached data is usually mirrored across multiple controllers and then written to the hard disk through each controller. If the data were cached in only one controller, a failure of that controller before the data reached the hard disk would cause the cached data to be lost. With mirroring, if any one or more controllers fail, a normally functioning controller can still save the cached data to the hard disk, ensuring that the data is preserved completely and stably.

[0045] In addition, to mirror data between controllers, data transfer between any two controllers is performed via DMA. For example, one controller's DMA performs data transfer with another controller. In the prior art, if data transfer between any two controllers fails, the NTB chips of both controllers are reset directly. However, since all DMAs of a controller reside on a single NTB chip, directly resetting the controller's NTB chip prevents that controller from transferring data with any other controller, affecting the stability of the data mirroring cache and the security of the data.

[0046] To address the aforementioned technical issues, each controller in this application is equipped with a DMA management module. The DMA management module monitors and manages each DMA of its corresponding controller and detects whether there is an abnormal DMA. If an abnormal DMA is found, this application does not directly control the NTB chip to reset, thus preventing the controller from being unable to transfer data with any other controller, ensuring the stability of the data cache, and guaranteeing data security.

[0047] S12: If an abnormal DMA exists, determine whether only one DMA is in working state in the corresponding NTB chip;

[0048] If the DMA management module detects an abnormal DMA, it determines whether only one DMA is active in its corresponding NTB chip. If only one DMA is active in the NTB chip, the controller mirrors and caches data with only one other controller. In this case, as a preferred embodiment, the NTB chip can be reset directly, because the controller is not transferring data with multiple controllers and the reset affects no other link.

[0049] S13: If more than one DMA is in the working state, resetting the NTB chip is prohibited, and the abnormal DMA engine is controlled to reset.

[0050] If not only one DMA is active, the controller may transfer data with multiple controllers simultaneously via multiple DMAs. In this case, the NTB chip should not be reset directly. Instead, the abnormal DMA engine between the two controllers that cannot transfer data normally should be reset. These two controllers can still transfer data with other controllers. After the abnormal DMA engine is reset, if it can work normally again, the two controllers can resume normal data transfer.

[0051] For example, the first and second controllers are located in the first chassis, while the third and fourth controllers are located in the second chassis. Each of the four controllers has an NTB chip. Data transfer between every pair of controllers (first and second, first and third, first and fourth, second and third, second and fourth, third and fourth) is carried out via DMA. Each controller's NTB chip therefore has three DMAs in the working state and exposes three NT ports, which are the pins that connect the NTB chip to other NTB chips. Each controller corresponds to a DMA management module. Each DMA management module monitors the DMAs of its corresponding NTB chip, determines that three DMAs of the monitored NTB chip are in the working state, and judges whether any of the three DMAs is abnormal, i.e., unable to transfer data normally. A DMA that cannot transfer data normally is considered an abnormal DMA. For example, if the DMA between the first controller and the second controller cannot transfer data normally, the NTB chip cannot be reset directly, because each NTB chip has three DMAs in the working state. If the NTB chip were reset directly, not only would data transfer between the first and second controllers be interrupted, but the DMA links between the first controller and the third and fourth controllers would also be disconnected, affecting the security of the data cache. Therefore, the DMA management module of the first controller resets only the DMA engine on the first controller's NTB chip that is connected to the second controller, and the DMA management module of the second controller resets only the DMA engine on the second controller's NTB chip that is connected to the first controller.
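
The four-controller example can be checked with a short enumeration. The controller labels `C1` to `C4` are placeholders introduced here for illustration; the sketch only counts links in a full mesh and shows which links survive when the engine between the first and second controllers is reset.

```python
from itertools import combinations

controllers = ["C1", "C2", "C3", "C4"]

# Every unordered pair of controllers mirrors cache data over one DMA link.
links = list(combinations(controllers, 2))

# DMAs in the working state on each controller's NTB chip: one per peer.
dma_per_controller = {c: sum(c in link for link in links) for c in controllers}

# Suppose the C1<->C2 link fails; only that engine is reset on both sides,
# so each side keeps its remaining links.
faulty = ("C1", "C2")
surviving = {
    c: [link for link in links if c in link and link != faulty]
    for c in faulty
}

print(len(links))           # number of controller pairs
print(dma_per_controller)   # working DMAs per controller
print(surviving["C1"])      # links C1 keeps while the faulty engine is reset
```

Four controllers give six pairs and three working DMAs per NTB chip, which is why a chip-level reset would cut two healthy links on each affected controller.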

[0052] In summary, this application controls only the abnormal DMA engine reset, rather than directly controlling the NTB chip reset, to avoid the inability of data transfer between controllers and ensure reliable data caching.

[0053] Based on the above embodiments:

[0054] Please refer to Figure 2, which is a schematic diagram of a specific process of a DMA management method provided by the present invention.

[0055] In a preferred embodiment, the DMA management module is also used to monitor the CPU (central processing unit) of its corresponding controller, and the CPU's embedded NTB chip includes multiple DMAs.

[0056] Before determining whether only one DMA is active in its corresponding NTB chip, the following steps are also included:

[0057] Determine whether the abnormal DMA is the DMA of the NTB chip or the DMA of the CPU's embedded NTB chip.

[0058] The process then proceeds to determine whether only one DMA is active in the corresponding NTB chip.

[0059] In this embodiment, when data is transferred between controllers, it can be transferred not only through the DMA of the peripheral NTB chip, but also through the embedded NTB chip on the controller's CPU. The embedded NTB chip also has DMA, but its performance is worse than that of the peripheral NTB chip. Therefore, the peripheral NTB chip's DMA is usually preferred for data transfer. Only if the peripheral NTB chip cannot work properly will the DMA of the embedded NTB chip be selected.

[0060] Based on this, before determining whether only one DMA is working in the corresponding NTB chip, it is necessary to first determine whether the abnormal DMA is the DMA of the peripheral NTB chip or the DMA of the CPU embedded NTB chip. If the abnormal DMA is the DMA of the peripheral NTB chip, then it is necessary to determine whether only one DMA of the NTB chip is working, that is, whether the controller where the NTB chip is located is only transferring data with one controller, so as to determine whether to reset the DMA or the NTB chip.

[0061] Of course, if the abnormal DMA is the DMA of the NTB chip, and only one DMA in the NTB chip is in working state, the peripheral NTB chip can be reset directly. If the abnormal DMA is the DMA of the CPU's embedded NTB chip, and only one DMA in the NTB chip is in working state, the embedded NTB chip can also be reset directly.
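
As a rough sketch of paragraphs [0059] to [0061], the reset target can be chosen by first locating the chip that owns the abnormal DMA and then counting the working DMAs on that chip. The types `ChipKind` and `Dma` and the function `reset_target` are hypothetical, and counting only the DMAs on the owning chip is an assumption made here; the text does not spell out that detail.

```python
from dataclasses import dataclass
from enum import Enum


class ChipKind(Enum):
    PERIPHERAL = "peripheral NTB chip"
    EMBEDDED = "CPU-embedded NTB chip"


@dataclass(frozen=True)
class Dma:
    chip: ChipKind
    index: int


def reset_target(abnormal: Dma, working: list[Dma]) -> str:
    """Describe what to reset for one abnormal DMA."""
    # Step 1: identify which chip owns the abnormal DMA.
    chip = abnormal.chip
    # Step 2: count working DMAs on that same chip only (assumption).
    on_same_chip = [d for d in working if d.chip is chip]
    if len(on_same_chip) == 1:
        return f"reset {chip.value}"
    return f"reset DMA engine {abnormal.index} on {chip.value}"
```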

[0062] In a preferred embodiment, the DMA management modules corresponding to each controller are connected via a wireless network.

[0063] Since each controller corresponds to a DMA management module, that is, each DMA management module manages all DMAs on the NTB chip of a controller, the DMA management module can detect whether each DMA it monitors is working normally, or whether the DMA is working. In this embodiment, by connecting the DMA management modules corresponding to each controller through wireless network communication, each DMA management module can determine whether the DMAs managed by other DMA management modules are working normally. If one of the DMA management modules fails and cannot monitor and manage the DMA, it sends a request to intervene in management to other DMA management modules. The other DMA management modules take over the DMAs monitored by the failed DMA management module until the failed DMA management module recovers to normal.

[0064] It should be noted that the DMA management modules can be interconnected via, but not limited to, high-speed SerDes or other high-speed networks to keep management information synchronized between DMA management modules.
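
The takeover behaviour described in paragraph [0063] might be modelled as follows. This is a minimal sketch under stated assumptions: the class `ManagementModule` and the functions `request_takeover` and `recover` are hypothetical, the transport between modules is omitted, and handing the DMAs to the first healthy peer is an arbitrary policy chosen for illustration.

```python
class ManagementModule:
    def __init__(self, name, dmas):
        self.name = name
        self.healthy = True
        self.own = set(dmas)   # DMAs this module normally monitors
        self.adopted = {}      # failed peer name -> DMAs taken over

    def monitored(self):
        """All DMAs currently monitored, including adopted ones."""
        return self.own | set().union(*self.adopted.values())


def request_takeover(failed, peers):
    """Hand the failed module's DMAs to the first healthy peer."""
    failed.healthy = False
    for peer in peers:
        if peer is not failed and peer.healthy:
            peer.adopted[failed.name] = set(failed.own)
            return peer
    return None  # no healthy peer available


def recover(module, peers):
    """Return monitoring to a module that is healthy again."""
    module.healthy = True
    for peer in peers:
        peer.adopted.pop(module.name, None)
```

A peer monitors the adopted DMAs only until `recover` is called, matching the statement that takeover lasts until the failed module returns to normal.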

[0065] As a preferred embodiment, detecting whether there is abnormal DMA in the corresponding NTB chip includes:

[0066] Determine the data transfer efficiency of each DMA of the NTB chip;

[0067] Calculate the deviation ratio between the data transfer efficiency of DMA and the theoretical transfer efficiency of DMA;

[0068] Determine whether the deviation ratio is greater than a preset threshold, and whether the holding time for a deviation greater than the preset threshold is greater than a preset time;

[0069] If so, then an abnormal DMA is determined to exist.

[0070] In this embodiment, when the DMA management module detects whether there is an abnormal DMA in the NTB chip, it first determines the data transfer efficiency of each DMA of the NTB chip. Since each DMA has a theoretical transfer efficiency, the deviation ratio is calculated from the measured data transfer efficiency of each DMA and its corresponding theoretical transfer efficiency, and is then compared with a preset threshold. The data transfer efficiency of each DMA is acquired in real time, and the deviation ratio is also calculated in real time. A DMA whose deviation ratio is greater than the preset threshold for a holding time greater than the preset time is identified as an abnormal DMA. This allows abnormal DMAs to be handled promptly, so that data transfer between controllers returns to normal as quickly as possible.
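
One way to express the threshold-plus-holding-time check is a small stateful detector fed with the deviation ratio in real time. The class name `DeviationDetector`, its parameters, and the use of caller-supplied timestamps are assumptions for illustration only.

```python
class DeviationDetector:
    """Flags a DMA as abnormal when its deviation ratio stays above
    `threshold` for longer than `hold_seconds`."""

    def __init__(self, threshold: float, hold_seconds: float):
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._above_since = None  # time the ratio first exceeded the threshold

    def update(self, now: float, deviation_ratio: float) -> bool:
        """Feed one sample; return True if the DMA is considered abnormal."""
        if deviation_ratio <= self.threshold:
            self._above_since = None   # excursion ended, restart the timer
            return False
        if self._above_since is None:
            self._above_since = now    # excursion starts with this sample
        return (now - self._above_since) > self.hold_seconds
```

Requiring the excursion to persist filters out short dips in throughput, so a brief slowdown does not trigger a reset.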

[0071] As a preferred embodiment, before calculating the deviation ratio between the DMA data transfer efficiency and the DMA's theoretical transfer efficiency, the method further includes:

[0072] Obtain the real-time front-end service pressure;

[0073] Determine the theoretical transfer efficiency of the DMA based on the correspondence between the real-time front-end service pressure and the theoretical transfer efficiency.

[0074] In this embodiment, before calculating the deviation ratio between the data transfer efficiency of DMA and the theoretical transfer efficiency of DMA, it is necessary to first determine the theoretical transfer efficiency of DMA. Since there is a corresponding relationship between the theoretical transfer efficiency and the front-end service pressure, the real-time front-end service pressure can be obtained first. Based on the real-time front-end service pressure, the theoretical transfer efficiency of DMA can be determined through the correspondence between the front-end service pressure and the theoretical transfer efficiency, so as to calculate the deviation ratio, quickly determine whether there is abnormal DMA, and handle abnormal DMA in a timely manner to ensure that data transfer between controllers can be carried out normally as soon as possible.

[0075] It should be noted that the front-end service pressure in this application can include four scenarios: small block random write, large block random write, small block sequential write, and large block sequential write. The formula for calculating the deviation ratio K can be, but is not limited to, the following:

[0076] K = |(data transfer efficiency - theoretical transfer efficiency) / theoretical transfer efficiency|;

[0077] The preset threshold is a percentage set by the user based on the business scenario.
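
A worked instance of the deviation ratio K is given below. The lookup table `THEORETICAL_MBPS` and all of its values are invented for illustration; the source does not state any concrete efficiencies or units.

```python
# Hypothetical theoretical transfer efficiency per front-end scenario, in MB/s.
THEORETICAL_MBPS = {
    "small_random_write": 800.0,
    "large_random_write": 2000.0,
    "small_sequential_write": 1200.0,
    "large_sequential_write": 3000.0,
}


def deviation_ratio(measured_mbps: float, workload: str) -> float:
    """K = |(measured - theoretical) / theoretical| for the given scenario."""
    theoretical = THEORETICAL_MBPS[workload]
    return abs((measured_mbps - theoretical) / theoretical)


k = deviation_ratio(1500.0, "large_sequential_write")
print(f"K = {k:.0%}")  # K = 50%
```

With a user-set threshold of, say, 30%, this DMA would be a candidate abnormal DMA if the deviation persisted beyond the preset time.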

[0078] As a preferred embodiment, after controlling the abnormal DMA engine reset, the following steps are also included:

[0079] Determine if the abnormal DMA has returned to normal operation;

[0080] If not, the abnormal DMA engine is repeatedly reset, and after the number of times the abnormal DMA engine is reset exceeds the preset number, the CPU-simulated DMA data transfer mode is entered.

[0081] In this embodiment, if the abnormal DMA does not return to normal operation after the abnormal DMA engine or the NTB chip is reset, the abnormal DMA engine is reset repeatedly. If the DMA still does not return to normal after repeated resets, the CPU-simulated DMA data transfer mode is entered. In this mode, data is not transferred by the DMA in the CPU's embedded NTB chip; instead, the CPU emulates DMA to transfer the data. Although this method occupies the CPU's running memory and may cause data loss, it can still perform data mirroring and caching when the DMA in the NTB chip cannot work properly, thus ensuring a certain level of data security.
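
The retry-then-fallback behaviour can be sketched as a bounded loop. The function `recover_dma` and its callback parameters are hypothetical stand-ins for hardware operations, and falling back once exactly `max_resets` resets have been tried is an assumption about where the "exceeds a preset number" boundary lies.

```python
from typing import Callable


def recover_dma(reset_engine: Callable[[], None],
                is_healthy: Callable[[], bool],
                max_resets: int) -> str:
    """Reset the abnormal DMA engine until it recovers or the budget is spent.

    Returns "dma" if the engine recovered, or "cpu_simulated" if the caller
    should switch to CPU-simulated DMA data transfer.
    """
    for _ in range(max_resets):
        reset_engine()
        if is_healthy():
            return "dma"
    # Reset budget exhausted without recovery: fall back to the CPU path.
    return "cpu_simulated"
```

Bounding the number of resets keeps a permanently faulty engine from being reset indefinitely while mirroring is stalled.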

[0082] It should be further noted that, when working in conjunction with the DMA management module, this application also includes:

[0083] Cache module: This module is located on the board and includes the memory and the high-speed cache within the component. It is the module that caches data during mirrored caching in this application. DMA moves data between cache modules under the control of the DMA management module. The cache module sends the real-time data transfer efficiency of the DMA in the four scenarios (small block random write, large block random write, small block sequential write, and large block sequential write) to the DMA management module.

[0084] OS (operating system) module: The whole system module, which sends real-time front-end service pressure to the DMA management module.

[0085] CPU module: The main system processing unit, which switches to CPU-simulated DMA data transfer mode when an abnormal DMA occurs.

[0086] Indicator Module: This module is located on the board and is directly controlled by the serial port module. It indicates the real-time status of the current DMA management module.

[0087] Wireless module: It can convert the serial port module signal into a wireless signal such as Wi-Fi, so that external devices can communicate with the DMA management module without a physical serial cable.

[0088] Serial port module: The serial port module enables information exchange, parameter preset, and activation of related functions between the external environment and the DMA management module.

[0089] Please refer to Figure 3, which is a schematic diagram of the structure of a DMA management system provided in this application. The system is applied to a DMA management module, which monitors the NTB chip of its corresponding controller. The NTB chip includes multiple DMAs. The system includes:

[0090] The detection unit 310 is used to detect whether there is an abnormal DMA in its corresponding NTB chip;

[0091] The first judgment unit 320 is used to determine whether only one DMA is in working state in its corresponding NTB chip when there is an abnormal DMA.

[0092] The first control unit 330 is used to prohibit resetting the NTB chip when more than one DMA is in the working state, and to control the abnormal DMA engine to reset.

[0093] In one alternative embodiment, the DMA management system 300 also includes:

[0094] The second control unit is used to control the NTB chip to reset when only one DMA of the NTB chip is in working state.

[0095] In an alternative approach, the DMA management module is also used to monitor the CPU of its corresponding controller, and the CPU's embedded NTB chip includes multiple DMAs.

[0096] The DMA management system 300 also includes:

[0097] The first determining unit is used to determine whether the abnormal DMA is the DMA of the NTB chip or the DMA of the CPU's embedded NTB chip, and to trigger the judgment unit.

[0098] In one alternative approach, the DMA management modules corresponding to each controller are connected via a wireless network.

[0099] In one optional embodiment, the detection unit is specifically used to determine the data transfer efficiency of each DMA of the NTB chip; calculate the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA; determine whether the deviation ratio is greater than a preset threshold, and whether the time for which the deviation ratio remains greater than the preset threshold is greater than a preset time; if so, it is determined that there is an abnormal DMA.

[0100] In one alternative embodiment, the DMA management system 300 also includes:

[0101] The acquisition unit is used to acquire real-time front-end business pressure.

[0102] The second determining unit is used to determine the theoretical transfer efficiency of DMA based on the correspondence between the real-time front-end service pressure and the theoretical transfer efficiency.

[0103] In one alternative embodiment, the DMA management system 300 also includes:

[0104] The second judgment unit is used to determine whether the abnormal DMA has resumed normal operation;

[0105] The third control unit is used to repeatedly control the abnormal DMA engine to reset when the abnormal DMA fails to resume normal operation, and enters the CPU simulated DMA data transfer mode after the number of times the abnormal DMA engine is reset exceeds a preset number.

[0106] For an introduction to the DMA management system 300 provided by the present invention, please refer to the above method embodiments; the present invention will not be described in detail here.

[0107] Please refer to Figure 4, which is a schematic diagram of the structure of a DMA management module provided in this application. The specific implementation of the DMA management module is not limited by the specific embodiments of the present invention.

[0108] As shown in Figure 4, the DMA management module may include: a processor 402, a communications interface 404, a memory 406, and a communications bus 408.

[0109] The processor 402, communication interface 404, and memory 406 communicate with each other via communication bus 408. Communication interface 404 is used to communicate with other network elements such as clients or other servers. The processor 402 executes program 410, specifically performing the relevant steps described above in the DMA management method embodiment.

[0110] Specifically, program 410 may include program code, which includes computer-executable instructions.

[0111] Processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The DMA management module includes one or more processors, which may be processors of the same type, such as one or more CPUs; or processors of different types, such as one or more CPUs and one or more ASICs.

[0112] Memory 406 is used to store program 410. Memory 406 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk storage device.

[0113] Specifically, program 410 can be called by processor 402 to cause the DMA management module to perform the following operations:

[0114] Detect whether there is abnormal DMA in its corresponding NTB chip;

[0115] If an abnormal DMA is present, determine whether only one DMA is active in the corresponding NTB chip.

[0116] If more than one DMA is active, resetting the NTB chip is prohibited, and the abnormal DMA engine is controlled to reset.

[0117] In an alternative manner, the program 410 is invoked by the processor 402 to cause the DMA management module to perform the following operations:

[0118] After determining whether only one DMA is active in its corresponding NTB chip, the process also includes:

[0119] If only one DMA is active in the NTB chip, then the NTB chip will be reset.

[0120] In an alternative approach, the DMA management module is also used to monitor the CPU of its corresponding controller, and the CPU's embedded NTB chip includes multiple DMAs.

[0121] The program 410 is invoked by the processor 402 to cause the DMA management module to perform the following operations:

[0122] Before determining whether only one DMA is active in its corresponding NTB chip, the following steps are also included:

[0123] Determine whether the abnormal DMA is the DMA of the NTB chip or the DMA of the CPU's embedded NTB chip.

[0124] The process then proceeds to determine whether only one DMA is active in the corresponding NTB chip.
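The following minimal sketch illustrates one way the step in paragraphs [0122] to [0124] might be organized: first find which chip owns the abnormal DMA, then apply the "only one DMA working" check to that chip alone. The chip names, the DMA identifiers, and both function names are hypothetical and do not come from the source.

```python
def locate_abnormal_dma(abnormal_id, dma_ids_by_chip):
    """Return the name of the chip that owns abnormal_id, or None.

    dma_ids_by_chip maps a chip name (for example the NTB chip or the
    CPU's embedded NTB chip) to the set of DMA identifiers it contains.
    """
    for chip_name, dma_ids in dma_ids_by_chip.items():
        if abnormal_id in dma_ids:
            return chip_name
    return None


def only_one_working(chip_name, working_ids_by_chip):
    """True when exactly one DMA of the given chip is in a working state."""
    return len(working_ids_by_chip.get(chip_name, ())) == 1
```

For example, with `{"ntb_chip": {"n0", "n1"}, "cpu_embedded_ntb": {"c0", "c1"}}`, an abnormal DMA `"c1"` is attributed to `"cpu_embedded_ntb"`, and the working-state check is then made against that chip only.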

[0125] In one alternative approach, the DMA management modules corresponding to each controller are connected via a wireless network.

[0126] In an alternative manner, the program 410 is invoked by the processor 402 to cause the DMA management module to perform the following operations:

[0127] Detecting whether an abnormal DMA exists in the corresponding NTB chip includes:

[0128] Determine the data transfer efficiency of each DMA of the NTB chip;

[0129] Calculate the deviation ratio between the data transfer efficiency of each DMA and the theoretical transfer efficiency of that DMA;

[0130] Determine whether the deviation ratio is greater than a preset threshold, and whether the length of time for which the deviation ratio remains greater than the preset threshold is greater than a preset time;

[0131] If so, then an abnormal DMA is determined to exist.
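The detection steps in paragraphs [0128] to [0131] can be sketched as below. This is a sketch under stated assumptions: the source does not give a formula for the deviation ratio, so the relative difference |theoretical - measured| / theoretical is assumed here, and the class name `DeviationMonitor`, its parameters, and the caller-supplied timestamps are all hypothetical.

```python
class DeviationMonitor:
    """Flags one DMA as abnormal when its deviation ratio stays above a
    preset threshold for longer than a preset time (hypothetical helper)."""

    def __init__(self, threshold, hold_seconds):
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._exceeded_since = None  # time at which the ratio first exceeded

    @staticmethod
    def deviation_ratio(measured, theoretical):
        # Assumed definition; the source does not state the formula.
        if theoretical <= 0:
            raise ValueError("theoretical efficiency must be positive")
        return abs(theoretical - measured) / theoretical

    def update(self, measured, theoretical, now):
        """Feed one sample taken at time `now` (seconds); return True when
        the DMA should be treated as abnormal."""
        ratio = self.deviation_ratio(measured, theoretical)
        if ratio <= self.threshold:
            self._exceeded_since = None  # back within tolerance
            return False
        if self._exceeded_since is None:
            self._exceeded_since = now
        return (now - self._exceeded_since) > self.hold_seconds
```

Requiring the deviation to persist, rather than reacting to a single sample, keeps a brief dip in throughput from triggering a reset.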

[0132] In an alternative manner, the program 410 is invoked by the processor 402 to cause the DMA management module to perform the following operations:

[0133] Before calculating the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA, the following steps are also included:

[0134] Obtain the real-time front-end business load;

[0135] Determine the theoretical transfer efficiency of the DMA based on the correspondence between the real-time front-end business load and the theoretical transfer efficiency.
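One simple way to hold the correspondence mentioned in paragraphs [0134] and [0135] is a banded lookup table, sketched below. The table contents, units, and the name `theoretical_efficiency` are placeholders invented for illustration; the source does not disclose the actual correspondence, and the numbers imply nothing about how efficiency really varies with load.

```python
from bisect import bisect_left

# Hypothetical table: (upper bound of front-end load, theoretical DMA
# transfer efficiency for loads up to that bound). Values are placeholders.
LOAD_TO_EFFICIENCY = [
    (10_000, 8_000.0),
    (50_000, 6_000.0),
    (100_000, 4_000.0),
]


def theoretical_efficiency(load, table=LOAD_TO_EFFICIENCY):
    """Return the theoretical efficiency for the band containing `load`.

    Loads above the last upper bound fall into the last band.
    """
    bounds = [upper for upper, _ in table]
    index = min(bisect_left(bounds, load), len(table) - 1)
    return table[index][1]
```

With this placeholder table, a load of 10,000 maps to the first band and a load of 10,001 to the second.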

[0136] In an alternative manner, the program 410 is invoked by the processor 402 to cause the DMA management module to perform the following operations:

[0137] After controlling the abnormal DMA engine to reset, the following steps are also included:

[0138] Determine whether the abnormal DMA has returned to normal operation;

[0139] If not, control the abnormal DMA engine to reset repeatedly, and after the number of resets of the abnormal DMA engine exceeds a preset number, have the CPU enter a mode in which it simulates DMA data transfer.
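The retry-then-fall-back behavior described in paragraphs [0137] to [0139] might be sketched as below. The callables `reset_engine` and `is_healthy`, the function name `recover_dma`, and the returned status strings are hypothetical; the initial reset is counted together with the repeated ones, which is one possible reading of "exceeds the preset number".

```python
def recover_dma(reset_engine, is_healthy, max_resets):
    """Reset an abnormal DMA engine until it recovers or the reset count
    exceeds max_resets (hypothetical helper).

    reset_engine: callable that resets the abnormal DMA engine once.
    is_healthy:   callable returning True once the DMA works normally.
    Returns ("recovered", resets) or ("cpu_fallback", resets); in the
    fallback case the caller would switch the CPU to simulating DMA
    data transfer.
    """
    resets = 0
    while True:
        reset_engine()
        resets += 1
        if is_healthy():
            return "recovered", resets
        if resets > max_resets:
            return "cpu_fallback", resets
```

Bounding the number of resets ensures that a DMA engine that never recovers cannot stall data transfer indefinitely.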

[0140] For a description of the DMA management module provided by the present invention, please refer to the above method embodiments; the details are not repeated here.

[0141] This invention provides a computer-readable storage medium storing at least one executable instruction that, when executed on a DMA management system / module, causes the DMA management system / module to perform the DMA management method in any of the above method embodiments.

[0142] Specifically, the executable instructions can be used to cause the DMA management system / module to perform the following operations:

[0143] Detect whether an abnormal DMA exists in its corresponding NTB chip;

[0144] If an abnormal DMA exists, determine whether only one DMA is in a working state in the corresponding NTB chip;

[0145] If it is not the case that only one DMA is in a working state, prohibit the NTB chip from being reset and control only the abnormal DMA engine to reset.

[0146] In an alternative approach, the executable instructions cause the DMA management system / module to perform the following operations:

[0147] After determining whether only one DMA is active in its corresponding NTB chip, the process also includes:

[0148] If only one DMA is active in the NTB chip, then the NTB chip will be reset.

[0149] In an alternative approach, the DMA management module is also used to monitor the CPU of its corresponding controller, and the CPU's embedded NTB chip includes multiple DMAs.

[0150] The executable instructions cause the DMA management system / module to perform the following operations:

[0151] Before determining whether only one DMA is active in its corresponding NTB chip, the following steps are also included:

[0152] Determine whether the abnormal DMA is the DMA of the NTB chip or the DMA of the CPU's embedded NTB chip.

[0153] The process then proceeds to determine whether only one DMA is active in the corresponding NTB chip.

[0154] In one alternative approach, the DMA management modules corresponding to each controller are connected via a wireless network.

[0155] In an alternative approach, the executable instructions cause the DMA management system / module to perform the following operations:

[0156] Detecting whether an abnormal DMA exists in the corresponding NTB chip includes:

[0157] Determine the data transfer efficiency of each DMA of the NTB chip;

[0158] Calculate the deviation ratio between the data transfer efficiency of each DMA and the theoretical transfer efficiency of that DMA;

[0159] Determine whether the deviation ratio is greater than a preset threshold, and whether the length of time for which the deviation ratio remains greater than the preset threshold is greater than a preset time;

[0160] If so, then an abnormal DMA is determined to exist.

[0161] In an alternative approach, the executable instructions cause the DMA management system / module to perform the following operations:

[0162] Before calculating the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA, the following steps are also included:

[0163] Obtain the real-time front-end business load;

[0164] Determine the theoretical transfer efficiency of the DMA based on the correspondence between the real-time front-end business load and the theoretical transfer efficiency.

[0165] In an alternative approach, the executable instructions cause the DMA management system / module to perform the following operations:

[0166] After controlling the abnormal DMA engine to reset, the following steps are also included:

[0167] Determine whether the abnormal DMA has returned to normal operation;

[0168] If not, control the abnormal DMA engine to reset repeatedly, and after the number of resets of the abnormal DMA engine exceeds a preset number, have the CPU enter a mode in which it simulates DMA data transfer.

[0169] For a description of the computer-readable storage medium provided by the present invention, please refer to the above method embodiments; the details are not repeated here.

[0170] It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0171] The above description of the disclosed embodiments enables those skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A DMA management method, characterized in that the method is applied to a DMA management module that monitors the NTB chip of its corresponding controller, the NTB chip including multiple DMAs, and the method comprises: detecting whether an abnormal DMA exists in the corresponding NTB chip; if the abnormal DMA exists, determining whether only one DMA is in a working state in the corresponding NTB chip; if it is not the case that only one DMA is in a working state, prohibiting the NTB chip from being reset and controlling the abnormal DMA engine to reset; wherein detecting whether an abnormal DMA exists in the corresponding NTB chip includes: determining the data transfer efficiency of each DMA of the NTB chip; calculating the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA; determining whether the deviation ratio is greater than a preset threshold, and whether the length of time for which the deviation ratio remains greater than the preset threshold is greater than a preset time; and if so, determining that the abnormal DMA exists.

2. The DMA management method as described in claim 1, characterized in that, after determining whether only one DMA is in a working state in the corresponding NTB chip, the method further comprises: if only one DMA of the NTB chip is in a working state, controlling the NTB chip to reset.

3. The DMA management method as described in claim 1, characterized in that the DMA management module is also used to monitor the CPU of its corresponding controller, the embedded NTB chip of the CPU including multiple DMAs; and before determining whether only one DMA is in a working state in the corresponding NTB chip, the method further comprises: determining whether the abnormal DMA is a DMA of the NTB chip or a DMA of the embedded NTB chip of the CPU; and then proceeding to the step of determining whether only one DMA is in a working state in the corresponding NTB chip.

4. The DMA management method as described in claim 1, characterized in that the DMA management modules corresponding to the respective controllers communicate with one another over a wireless network.

5. The DMA management method as described in claim 1, characterized in that, before calculating the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA, the method further comprises: obtaining the real-time front-end business load; and determining the theoretical transfer efficiency of the DMA based on the correspondence between the real-time front-end business load and the theoretical transfer efficiency.

6. The DMA management method according to any one of claims 1-5, characterized in that, after controlling the abnormal DMA engine to reset, the method further comprises: determining whether the abnormal DMA has returned to normal operation; and if not, controlling the abnormal DMA engine to reset repeatedly, and after the number of resets of the abnormal DMA engine exceeds a preset number, having the CPU enter a mode in which it simulates DMA data transfer.

7. A DMA management system, characterized in that the system is applied to a DMA management module that monitors the NTB chip of its corresponding controller, the NTB chip including multiple DMAs, and the system comprises: a detection unit, used to detect whether an abnormal DMA exists in the corresponding NTB chip; a first judgment unit, used to determine, when the abnormal DMA exists, whether only one DMA is in a working state in the corresponding NTB chip; and a first control unit, used to prohibit the NTB chip from being reset and to control the abnormal DMA engine to reset when it is not the case that only one DMA is in a working state; wherein detecting whether an abnormal DMA exists in the corresponding NTB chip includes: determining the data transfer efficiency of each DMA of the NTB chip; calculating the deviation ratio between the data transfer efficiency of the DMA and the theoretical transfer efficiency of the DMA; determining whether the deviation ratio is greater than a preset threshold, and whether the length of time for which the deviation ratio remains greater than the preset threshold is greater than a preset time; and if so, determining that the abnormal DMA exists.

8. A DMA management module, characterized by comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other via the communication bus; and the memory is used to store at least one executable instruction that causes the processor to perform the operations of the DMA management method as described in any one of claims 1-6.

9. A computer-readable storage medium, characterized in that the storage medium stores at least one executable instruction which, when executed on a DMA management device, causes the DMA management device to perform the operations of the DMA management method as described in any one of claims 1-6.