A network card configuration method, system, device and storage medium

By centralizing global metadata configuration files to automate network interface card (NIC) configuration, the inflexibility and complexity of persistent SR-IOV configuration in existing technologies are resolved, achieving efficient, reliable, and highly available automated management of NIC configuration.

CN116389245B · Active · Publication Date: 2026-06-26 · JINAN INSPUR DATA TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: JINAN INSPUR DATA TECH CO LTD
Filing Date: 2023-04-07
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing technologies for persistent configuration of SR-IOV in cloud platforms suffer from problems such as operating system dependence, non-universal deployment, complex and labor-intensive configuration, and inability to efficiently handle changes in the number of network interface cards.

Method used

A centralized global metadata configuration file is adopted. By obtaining and comparing the network interface card (NIC) name and PF mode, the metadata configuration file of the node is generated, the NIC is automatically configured and the configuration status information is recorded, so as to realize the centralized management and automation of NIC configuration.

Benefits of technology

It enables automated and centralized management of network interface card (NIC) configuration, reduces maintenance workload, improves configuration flexibility and high availability, and allows for quick location and resolution of configuration problems.



Abstract

The application discloses a network card configuration method, comprising the steps of: obtaining the network card name and PF mode of each node; judging whether the network card name and PF mode of all nodes are the same; in response to the network card name and PF mode of all nodes being the same, configuring a global metadata configuration file, wherein the global metadata configuration file records an expected block, and the expected block comprises an expected VF number, a PF mode and a network card name; generating a metadata configuration file of each node based on the global metadata configuration file; scanning the network card of each node and storing the information of compatible network cards as state block information into the metadata configuration file of each node; and configuring each network card according to the expected block and the state block in the metadata configuration file of each node. The application further discloses a system, a computer device and a readable storage medium. The scheme provided by the application defines network card configuration information for each node through a global metadata configuration file.

Description

Technical Field

[0001] This invention relates to the field of network interface cards (NICs), and more specifically to a NIC configuration method, system, device, and storage medium.

Background Technology

[0002] SR-IOV is a hardware-based virtualization solution used in cloud platforms to virtualize network interface cards (NICs). Through relevant configuration commands, a physical NIC on a physical server can be virtualized into multiple Virtual Functions (VFs). Each VF is assigned to a virtual machine or container on that server, thereby providing the NIC's RDMA capabilities to applications on the virtual machine or container.

[0003] In cloud platform RDMA implementations, VF persistence is a major challenge: after a server failure or a shutdown or restart, the VFs must be reconfigured. On virtualization or container platforms, the following solutions are commonly used:

[0004] Operating system customization, which forces the loading of kernel modules during operating system deployment;

[0005] SR-IOV startup scripts, which configure the VF capacity, number of VFs, and other settings of SR-IOV network devices for discovery and initialization.

[0006] These solutions still have several problems. Operating system customization depends heavily on the operating system and lacks deployment universality. If problems occur while the script configures SR-IOV, troubleshooting is complex. When the number of smart network interface cards in the production environment changes, significant manpower and time are required to modify the configuration and restart the server so that the configuration takes effect, and virtual machines or Pods must also be migrated during the process to keep business operations running.

Summary of the Invention

[0007] In view of this, in order to overcome at least one aspect of the above problems, embodiments of the present invention propose a network interface card (NIC) configuration method, comprising the following steps:

[0008] Get the network interface name and PF mode of each node;

[0009] Determine if the network interface card names and PF modes are the same for all nodes;

[0010] In response to the fact that the network interface card (NIC) name and PF mode are the same on all nodes, a global metadata configuration file is configured, wherein the global metadata configuration file records the expected block, which includes the expected number of VFs, PF mode, and NIC name;

[0011] A metadata configuration file for each node is generated based on the global metadata configuration file;

[0012] Scan the network interface card (NIC) of each node and store the information of compatible NICs as status block information in the metadata configuration file of each node;

[0013] Configure each network interface card (NIC) according to the expectation block and status block in the metadata configuration file of each node.

[0014] In some embodiments, it also includes:

[0015] In response to the network interface card names and PF modes not being the same on all nodes, several tags are set for each node, and a global metadata configuration file corresponding to each tag is configured. The global metadata configuration file records the expected blocks and tags. The expected blocks include the expected number of VFs, PF mode, and network interface card name.

[0016] In some embodiments, it also includes:

[0017] Based on several tags of each node, obtain several corresponding global metadata configuration files, and classify different network interface card information according to PF mode to generate a metadata configuration file corresponding to the node from several global metadata configuration files.

[0018] In some embodiments, configuring each network interface card (NIC) according to the expectation block and status block in the metadata configuration file of each node further includes:

[0019] Compare the network interface card (NIC) name in each expectation block with the NIC information in the status block;

[0020] If the corresponding network interface name does not exist in the status block, the corresponding expected block is filtered out and an error flag is set.

[0021] In some embodiments, it also includes:

[0022] Determine whether the PF mode in the filtered expected blocks includes the rdma mode;

[0023] In response to the rdma mode being present, check whether the first kernel module is enabled;

[0024] In response to the first kernel module not being enabled, enable the kernel module;

[0025] Set a restart tag, evict the pods to another node, and restart the node after the eviction.

[0026] In some embodiments, it also includes:

[0027] Determine whether a VF already exists on the node's current network interface card and is in use by a pod;

[0028] If a VF exists and has already been invoked by a pod, set a restart tag, evict the pod to another node, and restart the node after eviction.

[0029] In response to the absence of VF, configure the network interface card according to the PF mode in the expected block.

[0030] In some embodiments, in response to the absence of a VF, configuring the network interface card according to the PF mode in the expected block further includes:

[0031] In response to the PF mode being set to rdma in the expected block, the number of expected VFs is modified in the corresponding network card's system file via IO stream;

[0032] In response to the PF mode being Switchdev in the expected block, the expected number of VFs for the network interface card is modified via the systemctl system task, and the network interface card is switched to Switchdev mode.

[0033] Based on the same inventive concept, according to another aspect of the present invention, embodiments of the present invention also provide a network interface card (NIC) configuration system, comprising:

[0034] The acquisition module is configured to acquire the network interface name and PF mode of each node.

[0035] The judgment module is configured to determine whether the network interface card name and PF mode are the same for all nodes;

[0036] The configuration module is configured to respond to the fact that the network interface card name and PF mode are the same on all nodes. It configures a global metadata configuration file, wherein the global metadata configuration file records the expectation block, which includes the expected number of VFs, PF mode and network interface card name.

[0037] The generation module is configured to generate a metadata configuration file for each node based on the global metadata configuration file;

[0038] The scanning module is configured to scan the network interface card (NIC) of each node and store the information of compatible NICs as status block information in the metadata configuration file of each node.

[0039] The execution module is configured to configure each network interface card (NIC) based on the expectation block and status block in the metadata configuration file of each node.

[0040] Based on the same inventive concept, according to another aspect of the present invention, embodiments of the present invention also provide a computer device, comprising:

[0041] At least one processor; and

[0042] The memory stores a computer program that can run on the processor, and when the processor executes the program, it performs the steps of any of the network interface card configuration methods described above.

[0043] Based on the same inventive concept, according to another aspect of the present invention, embodiments of the present invention also provide a computer-readable storage medium storing a computer program that, when executed by a processor, performs the steps of any of the network interface card configuration methods described above.

[0044] The present invention has one of the following beneficial technical effects: the proposed solution defines network card configuration information for each node through a centralized global metadata configuration file. When nodes or smart network cards are added or removed, or the number of VFs is changed, only the global configuration template needs to be modified and the configuration is applied automatically. Configuration status information (such as the configuration stage and the reason for a configuration error) is recorded during the configuration process, so operation and maintenance personnel can quickly query progress and locate and solve problems through the status information.

Attached Figure Description

[0045] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other embodiments can be obtained based on these drawings without creative effort.

[0046] Figure 1 is a schematic flowchart of the network interface card (NIC) configuration method provided in an embodiment of the present invention;

[0047] Figure 2 is a schematic diagram of the structure of the network interface card configuration system provided in an embodiment of the present invention;

[0048] Figure 3 is a schematic diagram of the structure of the computer device provided in an embodiment of the present invention;

[0049] Figure 4 is a schematic diagram of the structure of the computer-readable storage medium provided in an embodiment of the present invention.

Detailed Implementation

[0050] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention will be further described in detail below with reference to specific examples and the accompanying drawings.

[0051] It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two entities or parameters that share a name but are not the same. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention. Subsequent embodiments will not repeat this explanation.

[0052] According to one aspect of the present invention, embodiments of the present invention provide a method for configuring a network interface card (NIC). As shown in Figure 1, it may include the following steps:

[0053] S1, obtain the network interface name and PF mode of each node;

[0054] S2, determine whether the network interface card name and PF mode are the same for all nodes;

[0055] S3, in response to the fact that the network interface card name and PF mode are the same for all nodes, configure a global metadata configuration file, wherein the global metadata configuration file records the expected block, the expected block includes the expected number of VFs, PF mode and network interface card name;

[0056] S4, generate a metadata configuration file for each node based on the global metadata configuration file;

[0057] S5, scan the network interface card (NIC) of each node and store the information of compatible NICs as status block information in the metadata configuration file of each node;

[0058] S6, configure each network interface card (NIC) based on the expected block and status block in the metadata configuration file of each node.
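
Steps S2 and S3 can be illustrated with a minimal Python sketch. The patent does not give a data model, so the inventory shape (node name mapped to a list of NIC name and PF mode pairs) and the field names `nicName`, `pfMode`, and `expected` are assumptions made purely for illustration; only `numVfs` echoes a name used later in the description.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical inventory shape: node name -> list of (NIC name, PF mode) pairs.
Inventory = Dict[str, List[Tuple[str, str]]]


def nodes_are_uniform(inventory: Inventory) -> bool:
    """S2: True when every node reports the same set of (NIC name, PF mode) pairs."""
    signatures = {frozenset(pairs) for pairs in inventory.values()}
    return len(signatures) <= 1


def build_global_metadata(inventory: Inventory, num_vfs: int) -> Optional[dict]:
    """S3: build one global file with an expected block per NIC.

    Returns None when nodes differ; the description handles that case with
    per-tag global files instead.
    """
    if not inventory or not nodes_are_uniform(inventory):
        return None
    first_node = next(iter(inventory.values()))
    expected = [
        {"nicName": nic, "pfMode": mode, "numVfs": num_vfs}  # illustrative field names
        for nic, mode in sorted(set(first_node))
    ]
    return {"expected": expected}
```

Returning `None` for a non-uniform cluster keeps this sketch limited to the uniform branch of the method.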

[0059] The proposed solution uses a centralized global metadata configuration file to define network interface card (NIC) configuration information for each node. When adding or deleting nodes, adding or deleting smart NICs, or changing the number of VFs, only the global configuration template needs to be modified for automatic configuration. Configuration status information (such as configuration process stage and reasons for configuration errors) is added during the configuration process. Operation and maintenance personnel can quickly check the progress and locate and solve problems through the status information.

[0060] In some embodiments, it also includes:

[0061] In response to the network interface card names and PF modes not being the same on all nodes, several tags are set for each node, and a global metadata configuration file corresponding to each tag is configured. The global metadata configuration file records the expected blocks and tags. The expected blocks include the expected number of VFs, PF mode, and network interface card name.

[0062] Specifically, when all nodes have the same network interface card (NIC) name and PF mode, only one set of expected value information needs to be specified. A node label configuration device can be deployed using a DaemonSet. It periodically (every 30 seconds) checks for the `sriov_numvfs` file under the directory `/sys/class/net/[NIC name]/device` on each node and, if the file exists, marks the node with the tag `feature.node.kubernets.io/network-sriov.capable=true`, indicating that the node has an SR-IOV smart NIC. When the cluster's SR-IOV nodes have different NICs or different PF modes, in addition to adding `feature.node.kubernets.io/network-sriov.capable=true`, a unique label must also be specified for each node so that the NIC array information can be specified per node.
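
The file-existence check described above can be sketched as follows. This is only an illustration: the function names are hypothetical, the label key is copied from the text as written, and the `sysfs_root` parameter is an assumption added so the check can be exercised against a temporary directory rather than a real `/sys` tree. Applying the label to the node through the cluster API is deliberately left out.

```python
from pathlib import Path
from typing import Dict, Iterable

# Label key copied verbatim from the description.
SRIOV_LABEL = "feature.node.kubernets.io/network-sriov.capable"


def nic_supports_sriov(nic_name: str, sysfs_root: Path = Path("/sys/class/net")) -> bool:
    """True when <sysfs_root>/<nic>/device/sriov_numvfs exists."""
    return (sysfs_root / nic_name / "device" / "sriov_numvfs").is_file()


def sriov_label_for_node(
    nic_names: Iterable[str], sysfs_root: Path = Path("/sys/class/net")
) -> Dict[str, str]:
    """Return the label to apply, or an empty dict when no NIC exposes sriov_numvfs."""
    if any(nic_supports_sriov(name, sysfs_root) for name in nic_names):
        return {SRIOV_LABEL: "true"}
    return {}
```

A labelling daemon would call `sriov_label_for_node` on each scan interval and patch the node only when the result changes.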

[0063] When a node needs to restart, it is also marked with the tag `sriovnetwork.kubernets.io/state=reboot`, so that at any given time only one node in the cluster is evicting pods and restarting, thus ensuring high availability of the cluster.
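
The one-node-at-a-time rule can be sketched with an in-memory label map. The helper names and the dictionary standing in for cluster state are assumptions; a real controller would need an atomic check-and-set against the cluster API, which this sketch does not attempt.

```python
from typing import Dict

# Label key and value copied from the description.
REBOOT_KEY = "sriovnetwork.kubernets.io/state"
REBOOT_VALUE = "reboot"


def try_mark_for_reboot(cluster_labels: Dict[str, Dict[str, str]], node: str) -> bool:
    """Label `node` for reboot only if no other node currently carries the label."""
    for name, labels in cluster_labels.items():
        if name != node and labels.get(REBOOT_KEY) == REBOOT_VALUE:
            return False  # another node is already evicting pods or restarting
    cluster_labels.setdefault(node, {})[REBOOT_KEY] = REBOOT_VALUE
    return True


def clear_reboot_mark(cluster_labels: Dict[str, Dict[str, str]], node: str) -> None:
    """Remove the label once the node has restarted."""
    cluster_labels.get(node, {}).pop(REBOOT_KEY, None)
```

A node that fails to obtain the mark would simply retry on its next reconciliation pass.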

[0064] In some embodiments, it also includes:

[0065] Based on several tags of each node, obtain several corresponding global metadata configuration files, and classify different network interface card information according to PF mode to generate a metadata configuration file corresponding to the node from several global metadata configuration files.

[0066] Specifically, the global metadata configuration file mainly consists of expected value blocks. These blocks form an array that specifies the expected values for different NICs on different nodes. Each block primarily includes a node label selector (nodeSelector) and NIC array information: the expected number of Virtual Functions (numVfs), the Physical Function (PF) mode (rdma/switchdev), and the NIC name (nicSelector). The node selector is used to select and filter nodes, and based on numVfs (expected number of VFs), rdma/switchdev (PF mode), and nicSelector (NIC selector), differentiated NIC configurations are generated for different nodes and NICs.

[0067] Then, iterate through the global metadata template, aggregate the entries that share the same node label selector, classify the NIC information according to the rdma/switchdev flag, and populate the NIC metadata configuration file.
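
A minimal sketch of this generation step follows. The names `nodeSelector`, `nicSelector`, and `numVfs` come from the description; the `nics` and `pfMode` keys, the exact-match selector semantics, and the output layout are assumptions chosen for illustration.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], node_labels: Dict[str, str]) -> bool:
    """A block applies when every selector key/value appears in the node's labels."""
    return all(node_labels.get(key) == value for key, value in selector.items())


def generate_node_metadata(global_blocks: List[dict], node_labels: Dict[str, str]) -> dict:
    """Collect the blocks that select this node and group their NICs by PF mode."""
    expected: Dict[str, List[dict]] = {"rdma": [], "switchdev": []}
    for block in global_blocks:
        if not selector_matches(block.get("nodeSelector", {}), node_labels):
            continue
        for nic in block["nics"]:
            expected.setdefault(nic["pfMode"], []).append(
                {"nicSelector": nic["nicSelector"], "numVfs": nic["numVfs"]}
            )
    # The status block starts empty and is filled in by the NIC scan.
    return {"expected": expected, "status": []}
```

Grouping by PF mode up front lets the later configuration step pick the rdma or switchdev path without re-reading the global template.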

[0068] In some embodiments, when a change in the network interface card (NIC) configuration template of the local node is detected, all NICs on the local node are scanned, compatible NICs are filtered, and stored in the status block of the NIC configuration template. For example, a periodic scan (every 30 seconds) can be performed to filter and obtain basic NIC information, which will then be stored in the SR-IOV metadata configuration template status block.

[0069] In some embodiments, configuring each network interface card (NIC) according to the expectation block and status block in the metadata configuration file of each node further includes:

[0070] Compare the network interface card (NIC) name in each expectation block with the NIC information in the status block;

[0071] If the corresponding network interface name does not exist in the status block, the corresponding expected block is filtered out and an error flag is set.
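
The comparison and filtering can be sketched as below. The dictionary keys (`nicName`, `name`, `error`, `reason`) are hypothetical, since the patent does not define the block layout.

```python
from typing import List, Tuple


def filter_expected_blocks(
    expected: List[dict], status: List[dict]
) -> Tuple[List[dict], List[dict]]:
    """Split expected blocks into (usable, errors) by NIC name presence in the status block."""
    present = {nic["name"] for nic in status}
    usable: List[dict] = []
    errors: List[dict] = []
    for block in expected:
        if block["nicName"] in present:
            usable.append(block)
        else:
            # Filtered out: record an error flag and a reason for operators.
            errors.append(
                {
                    "nicName": block["nicName"],
                    "error": True,
                    "reason": "NIC not found in status block",
                }
            )
    return usable, errors
```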

[0072] In some embodiments, it also includes:

[0073] Determine whether the PF mode in the filtered expected blocks includes the rdma mode;

[0074] In response to the rdma mode being present, check whether the first kernel module is enabled;

[0075] In response to the first kernel module not being enabled, enable the kernel module;

[0076] Set a restart tag, evict the pods to another node, and restart the node after the eviction.

[0077] In some embodiments, it also includes:

[0078] Determine whether a VF already exists on the node's current network interface card and is in use by a pod;

[0079] If a VF exists and has already been invoked by a pod, set a restart tag, evict the pod to another node, and restart the node after eviction.

[0080] In response to the absence of VF, configure the network interface card according to the PF mode in the expected block.

[0081] In some embodiments, in response to the absence of a VF, configuring the network interface card according to the PF mode in the expected block further includes:

[0082] In response to the PF mode being set to rdma in the expected block, the number of expected VFs is modified in the corresponding network card's system file via IO stream;

[0083] In response to the PF mode being Switchdev in the expected block, the expected number of VFs for the network interface card is modified via the systemctl system task, and the network interface card is switched to Switchdev mode.
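
For the rdma branch, the write to the NIC's system file might look like the sketch below. It assumes the system file is the `sriov_numvfs` file mentioned elsewhere in the description; the function name and the `sysfs_root` parameter are hypothetical, the latter added so the logic can be tried against a temporary directory. The switchdev branch, which the text delegates to a systemctl task, is not shown.

```python
from pathlib import Path


def set_num_vfs(
    nic_name: str, num_vfs: int, sysfs_root: Path = Path("/sys/class/net")
) -> None:
    """Write the expected VF count to <sysfs_root>/<nic>/device/sriov_numvfs."""
    if num_vfs < 0:
        raise ValueError("num_vfs must be non-negative")
    target = sysfs_root / nic_name / "device" / "sriov_numvfs"
    current = int(target.read_text().strip() or "0")
    if current == num_vfs:
        return  # already at the expected value, nothing to do
    if current != 0 and num_vfs != 0:
        # Assumption: reset to 0 first, because the Linux PCI layer rejects
        # changing one non-zero VF count directly to another.
        target.write_text("0")
    target.write_text(str(num_vfs))
```

On a real host this requires root privileges, and the write can fail if the requested count exceeds what the device supports, so a caller would record any exception in the status block.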

[0084] Specifically, the network interface card (NIC) configuration template of this node is monitored via list/watch. When the expected value block of the NIC configuration template changes, the smart NIC is configured based on the expected block and the status block. The specific configuration process is as follows:

[0085] (1) First, compare the value of each expected block with the status block. If there is no network card with the same name in the status block, then filter out the value of this expected block.

[0086] (2) Determine whether the rdma flag exists in the filtered expected blocks. If it does, check whether the system has enabled the IOMMU and VFIO_PCI kernel modules. If not, call the Linux grubby script to enable them. If the IOMMU kernel module had to be enabled, jump to steps (4) and (5) to evict the pods and restart the node so that the kernel modules take effect.

[0087] (3) Determine whether the smart network interface card (NIC) already has VFs that are in use by a Pod. If so, skip to steps (4) and (5). Otherwise, configure the NIC according to the rdma or Switchdev flag.

[0088] If rdma is present in the expected block, the numVfs value is written to the corresponding network interface card's system file via an I/O stream.

[0089] If Switchdev is present in the expected block, a system task calls a script to modify the network card configuration and switch the network card to switchdev mode.

[0090] After successful configuration, set the success flag in the status block, then return to step (1) and continue monitoring for changes in the NIC configuration template.

[0091] (4) Mark the node with the `sriovnetwork.kubernets.io/state=reboot` flag and evict the pods on this node so that they are scheduled to other nodes, preventing service interruption. Only one node carries this flag at a time during the configuration process, which ensures the high availability of the cluster.

[0092] (5) After the pods are evicted, the device forces a restart of the node and removes the `sriovnetwork.kubernets.io/state=reboot` flag after the restart.

[0093] (6) Any error that occurs in steps 1, 2, 3, 4, and 5 will set an error flag in the status block and record the reason for the error in the status block. The operation and maintenance personnel can quickly check the progress and locate and solve the problem through the status information.
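
The branching in steps (1) through (5) can be summarized as a pure decision function. This is a sketch under stated assumptions: the action strings, parameter names, and block keys are invented for illustration, the function only plans actions rather than performing them, and the error recording of step (6) is not modeled.

```python
from typing import List


def plan_actions(
    expected: List[dict],
    status: List[dict],
    kernel_modules_enabled: bool,
    vf_in_use_by_pod: bool,
) -> List[str]:
    """Return the ordered actions one reconciliation pass would take."""
    # Step (1): keep only expected blocks whose NIC appears in the status block.
    present = {nic["name"] for nic in status}
    usable = [block for block in expected if block["nicName"] in present]

    actions: List[str] = []
    needs_reboot = False

    # Step (2): rdma requires the kernel modules; enabling them needs a restart.
    if any(b["pfMode"] == "rdma" for b in usable) and not kernel_modules_enabled:
        actions.append("enable-kernel-modules")
        needs_reboot = True

    # Step (3): VFs already used by a pod also force the evict-and-restart path.
    if vf_in_use_by_pod:
        needs_reboot = True

    if needs_reboot:
        # Steps (4) and (5): label, evict, restart, then remove the label.
        actions += ["label-reboot", "evict-pods", "restart-node", "remove-reboot-label"]
        return actions

    # Step (3), otherwise: configure each NIC according to its PF mode.
    for block in usable:
        actions.append(f"configure-{block['pfMode']}:{block['nicName']}")
    actions.append("set-success-flag")
    return actions
```

Keeping the decision separate from the side effects makes each branch easy to check in isolation before wiring it to real eviction and restart calls.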

[0094] The proposed solution uses a centralized global metadata configuration file to define network interface card (NIC) configuration information for each node. When adding or deleting nodes, adding or deleting smart NICs, or changing the number of VFs, only the global configuration template needs to be modified for automatic configuration. Configuration status information (such as configuration process stage and reasons for configuration errors) is added during the configuration process. Operation and maintenance personnel can quickly check the progress and locate and solve problems through the status information.

[0095] Based on the same inventive concept, according to another aspect of the present invention, embodiments of the present invention also provide a network interface card (NIC) configuration system 400. As shown in Figure 2, it includes:

[0096] The acquisition module 401 is configured to acquire the network interface name and PF mode of each node;

[0097] The judgment module 402 is configured to determine whether the network card names and PF modes of all nodes are the same;

[0098] Configuration module 403 is configured to respond to the fact that the network interface card name and PF mode are the same for all nodes, and to configure a global metadata configuration file, wherein the global metadata configuration file records the expected block, the expected block including the expected number of VFs, PF mode and network interface card name;

[0099] The generation module 404 is configured to generate a metadata configuration file for each node based on the global metadata configuration file;

[0100] The scanning module 405 is configured to scan the network interface card (NIC) of each node and store the information of compatible NICs as status block information in the metadata configuration file of each node.

[0101] Execution module 406 is configured to configure each network interface card (NIC) according to the expectation block and status block in the metadata configuration file of each node.

[0102] Based on the same inventive concept, according to another aspect of the present invention, as shown in Figure 3, an embodiment of the present invention also provides a computer device 501, comprising:

[0103] At least one processor 520; and

[0104] The memory 510 stores a computer program 511 that can run on the processor. When the processor 520 executes the program, it performs the steps of any of the network card configuration methods described above.

[0105] Based on the same inventive concept, according to another aspect of the present invention, as shown in Figure 4, embodiments of the present invention also provide a computer-readable storage medium 601, which stores a computer program 610. When the computer program 610 is executed by a processor, it performs the steps of any of the network card configuration methods described above.

[0106] Finally, it should be noted that those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods.

[0107] Furthermore, it should be understood that the computer-readable storage medium (e.g., memory) described herein may be volatile memory or non-volatile memory, or may include both volatile memory and non-volatile memory.

[0108] Those skilled in the art will also understand that the various exemplary logic blocks, modules, circuits, and algorithm steps described in conjunction with the disclosure herein can be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability between hardware and software, the functionality of various illustrative components, blocks, modules, circuits, and steps has been generally described. Whether this functionality is implemented as software or as hardware depends on the specific application and the design constraints imposed on the system as a whole. Those skilled in the art can implement the functionality in various ways for each specific application, but such implementation decisions should not be construed as departing from the scope of the embodiments disclosed herein.

[0109] The above are exemplary embodiments disclosed in this invention. However, it should be noted that various changes and modifications can be made without departing from the scope of the embodiments of this invention as defined by the claims. The functions, steps, and / or actions of the methods according to the disclosed embodiments described herein do not need to be performed in any particular order. Furthermore, although the elements disclosed in the embodiments of this invention may be described or claimed individually, they may be understood as multiple unless explicitly limited to a singular number.

[0110] It should be understood that, as used herein, the singular form “a” is intended to include the plural form as well, unless the context clearly supports an exception. It should also be understood that, as used herein, “and / or” refers to any and all possible combinations of one or more of the associated listed items.

[0111] The embodiment numbers disclosed in the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0112] Those skilled in the art will understand that all or part of the steps of the above embodiments can be implemented by hardware or by a program instructing related hardware. The program can be stored in a computer-readable storage medium, such as a read-only memory, a disk, or an optical disk.

[0113] Those skilled in the art should understand that the discussion of any of the above embodiments is merely exemplary and is not intended to imply that the scope of the invention (including the claims) is limited to these examples. Within the framework of the invention, technical features of the above embodiments or different embodiments can be combined, and many other variations of different aspects of the invention exist, which are not provided in the details for the sake of brevity. Therefore, any omissions, modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the invention should be included within the protection scope of the invention.

Claims

1. A method for configuring a network interface card (NIC), characterized in that it includes the following steps: obtaining the network interface name and PF mode of each node; determining whether the network interface card names and PF modes are the same for all nodes; in response to the network interface card name and PF mode being the same on all nodes, configuring a global metadata configuration file, wherein the global metadata configuration file records the expected block, which includes the expected number of VFs, PF mode, and NIC name; generating a metadata configuration file for each node based on the global metadata configuration file; scanning the network interface card of each node and storing the information of compatible NICs as status block information in the metadata configuration file of each node; and configuring each network interface card according to the expected block and status block in the metadata configuration file of each node.

2. The method as described in claim 1, characterized in that it also includes: in response to the network interface card names and PF modes not being the same on all nodes, setting several tags for each node and configuring a global metadata configuration file corresponding to each tag, wherein the global metadata configuration file records the expected blocks and tags, and the expected blocks include the expected number of VFs, PF mode, and network interface card name.

3. The method as described in claim 2, characterized in that it also includes: based on the several tags of each node, obtaining the several corresponding global metadata configuration files, and classifying the different network interface card information according to PF mode to generate, from the several global metadata configuration files, a metadata configuration file corresponding to the node.

4. The method as described in claim 1, characterized in that configuring each network interface card (NIC) according to the expected block and status block in the metadata configuration file of each node further includes: comparing the NIC name in each expected block with the NIC information in the status block; and, if the corresponding network interface name does not exist in the status block, filtering out the corresponding expected block and setting an error flag.

5. The method as described in claim 4, characterized in that it also includes: determining whether the PF mode in the filtered expected blocks includes the rdma mode; in response to the rdma mode being present, checking whether the first kernel module is enabled; in response to the first kernel module not being enabled, enabling the kernel module; and setting a restart tag, evicting the pods to another node, and restarting the node after the eviction.

6. The method as described in claim 4, characterized in that it also includes: determining whether a VF already exists on the node's current network interface card and is in use by a pod; if a VF exists and has already been invoked by a pod, setting a restart tag, evicting the pod to another node, and restarting the node after the eviction; and, in response to the absence of a VF, configuring the network interface card according to the PF mode in the expected block.

7. The method as described in claim 6, characterized in that, in response to the absence of a VF, configuring the network interface card according to the PF mode in the expected block further includes: in response to the PF mode being rdma in the expected block, modifying the expected number of VFs in the corresponding network card's system file via an IO stream; and, in response to the PF mode being Switchdev in the expected block, modifying the expected number of VFs for the network interface card via the systemctl system task and switching the network interface card to Switchdev mode.

8. A network interface card (NIC) configuration system, characterized in that it includes: an acquisition module configured to acquire the network interface name and PF mode of each node; a judgment module configured to determine whether the network interface card name and PF mode are the same for all nodes; a configuration module configured to, in response to the network interface card name and PF mode being the same on all nodes, configure a global metadata configuration file, wherein the global metadata configuration file records the expected block, which includes the expected number of VFs, PF mode, and network interface card name; a generation module configured to generate a metadata configuration file for each node based on the global metadata configuration file; a scanning module configured to scan the network interface card of each node and store the information of compatible NICs as status block information in the metadata configuration file of each node; and an execution module configured to configure each network interface card based on the expected block and status block in the metadata configuration file of each node.

9. A computer device, comprising: at least one processor; and a memory storing a computer program executable on the processor, characterized in that the processor, when executing the program, performs the steps of the method as described in any one of claims 1-7.

10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it performs the steps of the method as described in any one of claims 1-7.