Data transmission system, data transmission method and network device

By encapsulating and decapsulating virtual machine data in the RDMA protocol, the problem of low data transfer efficiency among multiple virtual machines is solved, achieving efficient data transfer and storage.

CN114765631B · Active · Publication Date: 2026-06-12 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: HUAWEI TECH CO LTD
Filing Date: 2021-01-14
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

The existing RDMA protocol is inefficient when transferring data from multiple virtual machines, as it cannot transfer data from multiple virtual machines to the memory of another system at once, thus affecting the transfer efficiency.

Method used

Data from multiple virtual machines is obtained through the first network device. The data and write address are encapsulated into a message according to the RDMA protocol and sent to the second network device. The second network device decapsulates and stores the data, reducing processing steps and memory usage.

Benefits of technology

It improves data transmission efficiency, reduces processing steps and memory usage, and ensures data reliability and feasibility.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a data transmission system, a data transmission method, and a network device for improving transmission efficiency. The system is applied to remote direct memory access (RDMA) data transmission. In the system, N pieces of data from N VMs running on a first host need to be transmitted to a second host through RDMA. The N VMs are VMs from which data can be extracted normally. A first network device on the first host can encapsulate the N pieces of data and the write addresses of the N pieces of data according to an RDMA protocol, and then send the message generated by encapsulation to a second network device on the second host. The second network device can decapsulate the message, extract the N pieces of data and their write addresses, and store the N pieces of data at the positions indicated by the write addresses.

Description

Technical Field

[0001] This application relates to the field of network communication technology, and in particular to a data transmission system, a data transmission method, and a network device.

Background Technology

[0002] Remote direct memory access (RDMA) is a protocol that allows data to be transferred directly from one system to the memory of another system over a network without operating system intervention. The RDMA protocol encapsulates the data to be transferred into one or more RDMA messages, and then sends these messages from the sender to the receiver.

[0003] RDMA transfers data from one system to another's memory using a single send queue (SQ), where the data in the SQ consists of only data from a virtual machine (VM).

[0004] When a service involves data from multiple virtual machines (VMs), it is impossible to transfer the data from multiple VMs to the memory of another system at once, which affects the transfer efficiency.

Summary of the Invention

[0005] This application provides a data transmission system, a data transmission method, and a network device for improving transmission efficiency.

[0006] The first aspect of this application provides a data transmission system, comprising: a first network device and a second network device, the first network device being mounted on a first host, the second network device being mounted on a second host, and N virtual machines (VMs) running on the first host; the first network device being configured to acquire N data points from the N VMs, encapsulate the N data points and their write addresses into a message according to a Remote Direct Memory Access (RDMA) protocol, and send the message to the second network device, wherein N is an integer greater than 1; the second network device being configured to receive the message, decapsulate the message to obtain the N data points and their write addresses, and store the N data points on the second host according to the write addresses.

[0007] In the first aspect described above, N pieces of data from N virtual machines (VMs) running on the first host need to be transmitted to the second host via RDMA. These N VMs are VMs capable of extracting data normally. The first network device on the first host can encapsulate the N pieces of data and their write addresses according to the RDMA protocol, and then send the encapsulated message to the second network device on the second host. The second network device can decapsulate the message, extract the N pieces of data and their write addresses, and store the N pieces of data at the location indicated by the write addresses. The first network device can directly send data from multiple virtual machines to the second network device, improving transmission efficiency.

[0008] In one possible implementation, a first network device is used to obtain the identifiers and memory addresses of N VMs; and to obtain N data based on the identifiers and memory addresses of the N VMs.

[0009] In the above possible implementations, the first network device can directly obtain N data based on the VM identifier and VM memory address of the N data to be RDMA transmitted to the second network device. The write address received by the second network device includes the VM identifier and VM memory address. The second network device can directly use the write address to store the N data without going through the chip logical address (CLA), which can reduce the processing steps.

[0010] In one possible implementation, an abnormal VM exists on the first host, and data cannot be obtained based on the identifier and memory address of the abnormal VM; the first network device is used to obtain M data based on the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some VMs among N VMs, where M is a positive integer less than N; and to encapsulate the M data into an abnormal message according to the RDMA protocol.

[0011] In the above possible implementations, if there is an abnormal VM among the N VMs, for example a VM that has malfunctioned, shut down, or restarted, the first network device cannot obtain data based on the identifier and memory address of the abnormal VM. That is, the abnormal message generated by encapsulation based on the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some of the N VMs cannot be sent to the second network device through queue pair (QP) messages.

[0012] In one possible implementation, the first network device is used to generate a message sequence, the message sequence including the abnormal message and at least one message.

[0013] In the above possible implementations, when there is an abnormal VM among the N VMs, the first network device needs to generate a message sequence from the abnormal message and at least one of the above messages, so that the first network device can send the abnormal message and the above messages in sequence according to the message sequence number of the message sequence, thereby improving the feasibility of the solution.

[0014] In one possible implementation, a first network device is configured to modify a message sequence, wherein modifying the message sequence includes deleting abnormal messages and adding padding messages to the message sequence.

[0015] In the above possible implementations, the presence of abnormal packets in the message sequence prevents the second network device from receiving them, causing the second network device to continuously request message retransmission. Multiple retransmission failures lead to a disconnect between the first and second network devices, resulting in RDMA transmission failure. This application can remove abnormal packets from the message sequence and then supplement the sequence with invalid padding packets to ensure the data length remains unchanged, thus avoiding retransmissions due to data length discrepancies.

[0016] In one possible implementation, the second network device is configured to receive the modified message sequence, determine the padding message in the modified message sequence, and delete the padding message.

[0017] In the above possible implementations, after the second network device receives the modified message sequence, it can determine the invalid messages in it through verification, that is, determine the padding messages in the modified message sequence, and then delete the padding messages to improve the reliability of message transmission.

[0018] A second aspect of this application provides a data transmission method, the method comprising: a first network device acquiring N data items, the first network device being configured on a first host, the first host running N virtual machines (VMs), the N data items coming from the N VMs, wherein N is an integer greater than 1; the first network device encapsulating the N data items and their write addresses into a message according to a Remote Direct Memory Access (RDMA) protocol; and the first network device sending the message to a second network device.

[0019] In one possible implementation, a first host runs an abnormal VM that cannot obtain data based on its identifier and memory address. The method further includes: a first network device obtaining M data items based on the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some of the N VMs, where M is a positive integer less than N; and the first network device encapsulating the M data items into an abnormal message according to the RDMA protocol.

[0020] In one possible implementation, the method further includes: a first network device generating a message sequence, the message sequence including an abnormal message and at least one message.

[0021] In one possible implementation, the method further includes: a first network device modifying a message sequence, wherein modifying the message sequence includes deleting abnormal messages and adding padding messages to the message sequence.

[0022] A third aspect of this application provides a data transmission method, comprising: a second network device receiving a message from a first network device, the first network device being configured on a first host, the second network device being configured on a second host, the first host running N virtual machines (VMs), the message being a message generated by encapsulating N data and N data write addresses according to the Remote Direct Memory Access (RDMA) protocol, the N data coming from the N VMs, where N is an integer greater than 1; the second network device decapsulating the message to obtain the N data and N data write addresses; and the second network device storing the N data on the second host according to the N data write addresses.

[0023] In one possible implementation, the method further includes: a second network device receiving a modified message sequence, the modified message sequence including padding messages; the second network device determining the padding messages in the modified message sequence and deleting the padding messages.

[0024] A fourth aspect of this application provides a network device, comprising: an acquisition unit for acquiring N data points, wherein the network device is configured on a first host, the first host runs N virtual machines (VMs), and the N data points come from the N VMs, wherein N is an integer greater than 1; an encapsulation unit for encapsulating the N data points and their write addresses into a message according to a Remote Direct Memory Access (RDMA) protocol; and a sending unit for sending the message to a second network device.

[0025] The network device is used to perform the method of the second aspect or any implementation thereof.

[0026] A fifth aspect of this application provides a network device, comprising: a receiving unit for receiving a message from a first network device, wherein the first network device is configured on a first host, and the network device is configured on a second host, wherein the first host runs N virtual machines (VMs), and the message is a message generated by encapsulating N data and N data write addresses according to the Remote Direct Memory Access (RDMA) protocol, wherein the N data come from the N VMs, and N is an integer greater than 1; a decapsulation unit for decapsulating the message to obtain the N data and N data write addresses; and a storage unit for storing the N data on the second host according to the N data write addresses.

[0027] The network device is used to perform the method of the third aspect or any implementation thereof.

[0028] A sixth aspect of this application provides a network device, including a processor, a memory, and a communication interface. The processor is configured to execute instructions stored in the memory, causing the network device to perform the method provided in the second aspect or any alternative method of the second aspect described above. The communication interface is configured to receive or transmit instructions. Specific details of the network device provided in the sixth aspect can be found in the second aspect or any alternative method of the second aspect described above, and will not be repeated here.

[0029] A seventh aspect of this application provides a network device, including a processor, a memory, and a communication interface. The processor is configured to execute instructions stored in the memory, causing the network device to perform the method provided in the third aspect or any alternative method of the third aspect described above. The communication interface is configured to receive or transmit instructions. Specific details of the network device provided in the seventh aspect can be found in the third aspect or any alternative method of the third aspect described above, and will not be repeated here.

[0030] The eighth aspect of this application provides a computer-readable storage medium storing a program that, when executed by a computer, performs the method provided in the second aspect or any alternative method of the second aspect.

[0031] The ninth aspect of this application provides a computer-readable storage medium storing a program that, when executed by a computer, performs the method provided in the third aspect or any of the alternative methods of the third aspect.

[0032] The tenth aspect of this application provides a computer program product that, when executed on a computer, performs the method provided in the second aspect or any alternative method of the second aspect.

[0033] The eleventh aspect of this application provides a computer program product that, when executed on a computer, performs the method provided in the third aspect or any of the optional methods of the third aspect.

Attached Figure Description

[0034] Figure 1 is a system framework diagram of the data transmission method provided in the embodiments of this application;

[0035] Figure 2 is a schematic diagram of the structure of the data transmission system provided in an embodiment of this application;

[0036] Figure 3 shows an embodiment of the data transmission method provided in this application;

[0037] Figure 4 is a schematic diagram of the CLA address provided in the embodiments of this application;

[0038] Figure 5 shows another embodiment of the data transmission method provided in the embodiments of this application;

[0039] Figure 6 is a schematic diagram of the combined strips provided in the embodiments of this application;

[0040] Figure 7 is a schematic diagram of a padding message provided in an embodiment of this application;

[0041] Figure 8 is another schematic diagram of the padding message provided in the embodiments of this application;

[0042] Figure 9 is a schematic diagram of the network device provided in the embodiments of this application;

[0043] Figure 10 is a schematic diagram of the network device provided in the embodiments of this application;

[0044] Figure 11 is a schematic diagram of the network device provided in the embodiments of this application;

[0045] Figure 12 is a schematic diagram of the network device provided in an embodiment of this application.

Detailed Implementation

[0046] This application provides a data transmission system, data transmission method, and network device to reduce processing steps and memory usage. The embodiments of this application are described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. Those skilled in the art will understand that with the development of technology and the emergence of new scenarios, the technical solutions provided by the embodiments of this application are also applicable to similar technical problems.

[0047] The terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments described herein can be implemented in a sequence other than that illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0048] The data transmission method in this application is mainly applicable to RDMA transmission scenarios. When an application initiates an RDMA read / write request, the system does not copy data, which reduces the number of context switches between kernel space and user space during network communication. Without involving kernel memory, the RDMA request is sent from the application running in user space to the local network card, and then transmitted over the network to the remote network card. Therefore, RDMA transmission does not require operating system involvement and does not increase system load. Figure 1 is a system framework diagram of the data transmission method according to an embodiment of this application. It shows an RDMA transmission scenario: a first application retrieves data from memory, generates an RDMA message, and sends the RDMA message to the local network card via a buffer. The message is then transmitted over the network to a remote network card. The remote network card buffers the received RDMA message, and a second application retrieves the data from the buffer and writes it into memory. The process in which the first application reads data from the second application's memory is similar to the writing process described above and will not be repeated here. The network card may be an RDMA network interface card (RNIC) or a host channel adapter (HCA).

[0049] RDMA transfer transmits data from one system to another's memory using a single send queue (SQ). The data in the SQ only includes the data of one virtual machine (VM). When a service involves data from multiple VMs, it is impossible to transfer the data of multiple VMs to the memory of another system at once, which affects the transmission efficiency.

[0050] To address the aforementioned problems, embodiments of this application provide a data transmission system, the structure of which is shown in Figure 2. The data transmission system includes a first network device 21 and a second network device 22. The first network device 21 is set on a first host 211, and the second network device 22 is set on a second host 221. The first host 211 runs N virtual machines (VMs) 2111.

[0051] The first network device 21 is used to acquire N data points from N VMs 2111. According to the Remote Direct Memory Access (RDMA) protocol, the N data points and their write addresses are encapsulated into a message and sent to the second network device 22, where N is an integer greater than 1.

[0052] The second network device 22 is used to receive messages, decapsulate the messages to obtain N data and the write address of N data, and store N data on the second host 221 according to the write address of N data.

[0053] Specifically, N pieces of data from N virtual machines (VMs) 2111 running on the first host 211 need to be transmitted to the second host 221 via RDMA. These N VMs 2111 are VMs capable of retrieving data normally. The first network device 21 on the first host 211 can encapsulate the aforementioned N pieces of data and their write addresses according to the RDMA protocol, and then send the encapsulated message to the second network device 22 on the second host 221. The second network device 22 can decapsulate the message, extract the N pieces of data and their write addresses, and store the N pieces of data at the location indicated by the write addresses. The first network device 21 can directly send data from multiple virtual machines to the second network device 22, improving transmission efficiency.

[0054] Optionally, the first network device 21 is used to obtain the identifiers and memory addresses of N VMs 2111; and to obtain N data based on the identifiers and memory addresses of the N VMs.

[0055] Specifically, the first network device 21 can directly obtain N data based on the VM identifier and VM memory address of the N data to be RDMA transmitted to the second network device 22. The write address received by the second network device 22 includes the VM identifier and VM memory address. The second network device 22 can directly use the write address to store the N data without going through the chip logical address (CLA), which can reduce the processing steps.

[0056] Optionally, the first host 211 has an abnormal VM running on it, and data cannot be obtained based on the identifier and memory address of the abnormal VM; the first network device 21 is used to obtain M data based on the identifier and memory address of the abnormal VM and the identifier and memory address of some VMs in N VMs 2111, where M is a positive integer less than N; and encapsulate the M data into an abnormal message according to the RDMA protocol.

[0057] Specifically, if there is an abnormal VM among the N VMs 2111, such as a VM malfunctioning, shutting down, or restarting, the first network device 21 cannot obtain data based on the identifier and memory address of the abnormal VM. That is, the abnormal message generated by encapsulating the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some of the N VMs cannot be sent to the second network device 22 through queue pair (QP) messages.

[0058] Optionally, the first network device 21 is used to generate a message sequence, the message sequence including an abnormal message and at least one message.

[0059] Specifically, when there is an abnormal VM among the N VMs, the first network device 21 needs to generate a message sequence from the abnormal message and at least one of the above messages, so that the first network device 21 can send the abnormal message and the above messages in sequence according to the message sequence number of the message sequence, thereby improving the feasibility of the solution.

[0060] Optionally, the first network device 21 is used to modify the message sequence, wherein modifying the message sequence includes deleting abnormal messages and adding padding messages to the message sequence.

[0061] Specifically, if there are abnormal packets in the message sequence, these abnormal packets cannot be sent to the second network device 22, causing the second network device 22 to continuously request message retransmission. Multiple retransmission failures will lead to a break in the link between the first network device 21 and the second network device 22, resulting in RDMA transmission failure. This application can delete the abnormal packets from the message sequence and then supplement the message sequence with invalid padding packets to ensure the data length remains unchanged, thus avoiding retransmissions due to data length discrepancies.

[0062] Optionally, the second network device 22 is used to receive the modified message sequence, determine the padding message in the modified message sequence, and delete the padding message.

[0063] Specifically, after receiving the modified message sequence, the second network device 22 can identify invalid messages through verification, that is, identify padding messages in the modified message sequence, and then delete the padding messages to improve the reliability of message transmission.
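The sequence modification described in paragraphs [0060] to [0063] can be sketched as follows. This is only an illustrative model under stated assumptions: the `Message` class, its `abnormal` and `padding` flags, and both function names are hypothetical, and the source does not specify how the second network device verifies that a message is padding (a simple flag stands in for that check here).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    psn: int                 # packet sequence number within the message sequence
    payload: bytes
    abnormal: bool = False   # hypothetical flag: message depends on an abnormal VM
    padding: bool = False    # hypothetical flag: invalid filler added by the sender


def replace_abnormal_with_padding(seq: List[Message]) -> List[Message]:
    """Sender side: delete abnormal messages and add padding of equal length."""
    modified = []
    for msg in seq:
        if msg.abnormal:
            # Keep the PSN and the payload length so the sequence length is unchanged.
            modified.append(Message(msg.psn, bytes(len(msg.payload)), padding=True))
        else:
            modified.append(msg)
    return modified


def drop_padding(seq: List[Message]) -> List[Message]:
    """Receiver side: identify padding messages and delete them."""
    return [msg for msg in seq if not msg.padding]
```

Keeping the PSN and payload length of each replaced message is what lets the modified sequence have the same total data length as the original one, which is the property the text relies on to avoid retransmissions.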

[0064] In the technical solution of this application embodiment, the first network device can directly send data from multiple VMs to the second network device, thereby improving transmission efficiency.

[0065] The VMs running on the first host of the first network device can all be VMs that can extract data normally, or they can include abnormal VMs that cannot extract data. These will be described separately below.

[0066] First, all VMs running on the first host are VMs that can extract data normally.

[0067] Referring to Figure 3, an embodiment of the data transmission method includes:

[0068] 301. The first network device acquires N data points.

[0069] In this embodiment, N (N is an integer greater than 1) data items are the data that the first network device needs to write to the second network device via RDMA. These N data items are stored in N virtual machines (VMs) of the first host of the first network device. The first network device can retrieve the N data items from these N VMs based on their identifiers and memory addresses. In this embodiment, all N VMs are VMs capable of normally retrieving data.

[0070] 302. The first network device encapsulates N data points and the write addresses of N data points into a single message according to the RDMA protocol.

[0071] In this embodiment, the encapsulated message can be an RDMA write message. RDMA write messages include three types: RDMA write First, RDMA write Middle, and RDMA write Last. The RDMA extended transport header (RETH) is a field in the RDMA message format that carries the destination address of the message data. The base transport header (BTH) is another field in the RDMA message format and includes the packet sequence number (PSN). Therefore, the RDMA write First message includes the PSN and the write address, while the RDMA write Middle and RDMA write Last messages contain the BTH field.

[0072] In this embodiment, the first network device can pre-create the RDMA QP, which can be created on the host's hypervisor or on other central processing units (CPUs) or devices. At the same time, the RDMA hardware device (RNIC, HCA, or other RDMA-capable hardware device) can directly access the VM's memory space. When the RDMA transmission hardware is limited to CLA, the first network device pre-creates a CLA for RDMA hardware access. To ensure no conflicts during CLA use, at least one CLA corresponding to an L_key (used for local device memory reads) / R_key (used for remote device memory reads; L_key and R_key can also use the same value) must be unique. Then, RDMA memory registration is performed, mapping the VM's memory to be read to the CLA. RDMA data operation commands can all be based on the CLA. The registry produced by registration includes L_key, R_key, CLA starting address, VM identifier, VM memory address, and length. There are multiple VM identifiers, and each VM identifier corresponds to a VM memory address and length. The first network device can query the registry based on the CLA and length of the target data to obtain the corresponding VM identifier, VM memory address, and length. Then, it extracts the target data from the locations corresponding to the VM identifier, VM memory address, and length for encapsulation. In this embodiment, the CLA address of the target data can be as shown in Figure 4: a CLA address includes the addresses of data for multiple VMs. For example, VM1 includes data sg11, sg12, and sg13; VM2 includes data sg21, sg22, and sg23; and VM3 includes data sg31, sg32, and sg33. The address of a CLA may include addresses adr11, adr21, and adr31, where address adr11 can indicate data sg11 in VM1, address adr21 can indicate data sg21 in VM2, and address adr31 can indicate data sg31 in VM3.
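The registry query described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the `RegistryEntry` class, the `resolve_cla` function, and the assumption that registered CLA regions do not overlap are all hypothetical. Only the registry fields (L_key, R_key, CLA starting address, VM identifier, VM memory address, length) come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RegistryEntry:
    l_key: int
    r_key: int
    cla_start: int   # starting address of this region in the CLA space
    vm_id: int
    vm_addr: int     # VM memory address that the CLA region maps to
    length: int


def resolve_cla(registry: List[RegistryEntry], cla: int,
                length: int) -> List[Tuple[int, int, int]]:
    """Translate the CLA range [cla, cla + length) into (vm_id, vm_addr, length) pieces."""
    end = cla + length
    pieces = []
    for entry in sorted(registry, key=lambda e: e.cla_start):
        lo = max(cla, entry.cla_start)
        hi = min(end, entry.cla_start + entry.length)
        if lo < hi:  # the requested range overlaps this registered region
            pieces.append((entry.vm_id, entry.vm_addr + (lo - entry.cla_start), hi - lo))
    if sum(piece[2] for piece in pieces) != length:
        raise ValueError("CLA range is not fully covered by registered memory")
    return pieces
```

A single CLA range can therefore resolve to memory in several VMs, which matches the Figure 4 description of one CLA address covering data such as sg11, sg21, and sg31.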

[0073] Optionally, the registry can also only include L_key, R_key, VM identifier, VM memory address, and length. The first network device can directly use the VM identifier, VM memory address, and length of the target data to extract the target data from the corresponding VM for encapsulation. In this embodiment, the VM identifier, VM memory address, and length need to be securely verified through the registry to prevent read / write operations on VMs in memory areas outside the registry when the read / write location indicated by these identifiers exceeds the memory area indicated by the VM identifier, VM memory address, and length in the registry.
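The security verification mentioned above amounts to a bounds check against the registered regions. The sketch below is one hypothetical way to express it. The function name `is_access_allowed` and the tuple layout of the registry are assumptions for illustration, and the L_key / R_key fields are omitted for brevity.

```python
from typing import Iterable, Tuple

# Hypothetical registry row: (vm_id, vm_addr, length) of a registered memory region.
Region = Tuple[int, int, int]


def is_access_allowed(registry: Iterable[Region], vm_id: int,
                      vm_addr: int, length: int) -> bool:
    """Return True only if [vm_addr, vm_addr + length) lies inside a registered region."""
    if length <= 0:
        return False
    for reg_vm, reg_addr, reg_len in registry:
        if (reg_vm == vm_id
                and reg_addr <= vm_addr
                and vm_addr + length <= reg_addr + reg_len):
            return True
    return False
```

For example, with a region `(1, 0x1000, 256)` registered, a 16-byte access at `0x10F8` in VM 1 would be rejected because it runs past the end of the registered area.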

[0074] In this embodiment, the VM identifier, based on a single-root I / O virtualization (SR-IOV) device, can be an identifier (ID) of a physical function (PF) / virtual function (VF). Based on a Scalable I / O virtualization (Scalable-IOV) device, it can be an assignable device interface (ADI), or other identifiers that can identify different VMs or address domains, such as a process address space identifier (PASID).

[0075] Optionally, the memory may be the memory space of an application in the second network device, and the target data may be data stored in the memory space of an application in the first network device.

[0076] The maximum transmission unit (MTU) is the maximum data packet size that can be transmitted in each RDMA transmission in the RDMA protocol. The number of messages mentioned above can be determined based on the size of the target data and the MTU.
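As a rough sketch of how the message count follows from the data size and the MTU, the number of messages is the data size divided by the MTU, rounded up. The function below is hypothetical and simplifies the wire format to tuples. The `WRITE_ONLY` label for data that fits in a single message is not mentioned in the source and is included as an assumption, as is the 24-bit PSN wrap-around.

```python
from typing import List, Tuple

PSN_MODULUS = 1 << 24  # assumption: PSNs are 24-bit values that wrap around


def segment_write(data: bytes, mtu: int, first_psn: int = 0) -> List[Tuple[str, int, bytes]]:
    """Split data into MTU-sized chunks labelled First / Middle / Last."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    count = (len(data) + mtu - 1) // mtu  # ceil(len(data) / mtu)
    packets = []
    for i in range(count):
        chunk = data[i * mtu:(i + 1) * mtu]
        if count == 1:
            opcode = "WRITE_ONLY"     # assumption: single-message case
        elif i == 0:
            opcode = "WRITE_FIRST"    # would carry the write address (RETH)
        elif i == count - 1:
            opcode = "WRITE_LAST"
        else:
            opcode = "WRITE_MIDDLE"
        packets.append((opcode, (first_psn + i) % PSN_MODULUS, chunk))
    return packets
```

With 10 bytes of data and an MTU of 4, this yields three messages of 4, 4, and 2 bytes.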

[0077] Optionally, in this embodiment, the RDMA transmission of data from the first network device to the second network device can be either an RDMA write, in which the first network device writes the data directly to the second network device, or an RDMA read, in which the second network device reads the data from the first network device. In the RDMA read case, before step 301, the second network device also needs to send a data read request to the first network device to trigger step 301.

[0078] 303. The first network device sends the message to the second network device.

[0079] In this embodiment, when the message encapsulation is successful, the first network device can directly send the message to the second network device according to the PSN sequence of the message. Correspondingly, the second network device receives the message.

[0080] 304. The second network device decapsulates the message to obtain N data items and the write address for N data items.

[0081] After receiving the above message, the second network device can directly decapsulate the message, such as disassembling the protocol packet, processing the information in the packet header, and extracting N data items and the write address of the N data items in the payload.

[0082] 305. The second network device stores N data on the second host according to the write address of N data.

[0083] After obtaining the above write address, the second network device can store N data in the VM indicated by the write address. For example, if the write address is the CLA address and length, the second network device can query the registry to match the corresponding VM identifier, VM memory address and length for storage based on the CLA address and length. Alternatively, if the write address is the VM identifier, VM memory address and length, the second network device can directly store the N data based on the VM identifier, VM memory address and length.
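The two forms of write address described above can be sketched as follows. This is an illustrative model only: the types `ClaAddress` and `VmAddress`, the dictionary-based registry, and the `bytearray` standing in for VM memory are all assumptions made for the example, not structures defined by this application.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class ClaAddress:
    """Write address given as a CLA address and length (hypothetical type)."""
    cla_addr: int
    length: int


@dataclass(frozen=True)
class VmAddress:
    """Write address given as VM identifier, VM memory address and length."""
    vm_id: int
    vm_addr: int
    length: int


WriteAddress = Union[ClaAddress, VmAddress]


def resolve(write_addr: WriteAddress,
            registry: Dict[ClaAddress, VmAddress]) -> VmAddress:
    """Map a write address to a VM target, querying the registry if needed."""
    if isinstance(write_addr, VmAddress):
        return write_addr          # already names the VM directly
    return registry[write_addr]    # CLA form: look up the matching VM entry


def store(data: bytes, write_addr: WriteAddress,
          registry: Dict[ClaAddress, VmAddress],
          vm_memory: Dict[int, bytearray]) -> None:
    """Copy data into the memory of the VM indicated by the write address."""
    target = resolve(write_addr, registry)
    mem = vm_memory[target.vm_id]
    end = target.vm_addr + len(data)
    if len(data) > target.length or end > len(mem):
        raise ValueError("data does not fit the registered region")
    mem[target.vm_addr:end] = data
```

In this sketch the registry lookup is the only difference between the two cases; once a `VmAddress` is available, storage proceeds identically.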

[0084] In this embodiment, the first network device encapsulates N data points from N VMs running on the first host into a single message and sends the message to the second network device. The first network device can directly send data from multiple VMs to the second network device without performing multiple RDMA transfers, thus improving transmission efficiency.

[0085] Second, the VMs running on the first host include abnormal VMs from which data cannot be extracted.

[0086] Referring to Figure 5, another embodiment of the data transmission method provided in this application includes:

[0087] 501. The first network device acquires N data points.

[0088] 502. The first network device encapsulates N data points and the write addresses of N data points into a single message according to the RDMA protocol.

[0089] For steps 501 and 502, refer to the descriptions of steps 301 and 302 in the data transmission method shown in Figure 3; details are not repeated here.

[0090] 503. The first network device obtains M data points based on the identifier and memory address of the abnormal VM, as well as the identifiers and memory addresses of some VMs among the N VMs.

[0091] In this embodiment, the abnormal VM is a VM from which data cannot be extracted normally, for example because the VM has failed, shut down, or restarted. When the data that the first network device is to transmit via RDMA involves data in the abnormal VM, the operation initiated for the RDMA command, such as a direct memory access (DMA), fails. No data can be obtained based on the identifier and memory address of the abnormal VM, and only M data can be obtained based on the identifiers and memory addresses of some of the N VMs.

[0092] 504. The first network device encapsulates M data items into an abnormal message according to the RDMA protocol.

[0093] When the first network device encapsulates the data of an abnormal VM and some VMs among N VMs according to the RDMA protocol, the data of the abnormal VM cannot be obtained, so the encapsulated message is an abnormal message. The first network device can directly return a command completion and configure indication information for the abnormal message. This indication information can indicate that the abnormal message is an error message. For example, the indication information can be configured in a completion queue element (CQE) of the completion queue (CQ).

[0094] In this embodiment, the order of steps 501 to 502 and steps 503 to 504 is not limited.

[0095] 505. The first network device generates a message sequence, which includes an abnormal message and at least one message.

[0096] When the first network device transmits data to the second network device via RDMA, it transmits the data sequentially according to the PSNs of the transmitted messages. In this embodiment, the first network device sorts the normally encapsulated messages and the abnormal messages to form a message sequence and configures a PSN for each of them.

[0097] 506. The first network device modifies the message sequence, wherein modifying the message sequence includes deleting abnormal messages and adding padding messages to the message sequence.

[0098] In this embodiment, the first network device can handle factors in the message sequence that would affect reception at the second network device. For example, the first network device sends the message sequence to the second network device over a QP. The first network device can modify the PSNs in the message sequence, skipping the error message and sending the next message. This prevents the second network device from detecting a discontinuous PSN, assuming that an intermediate message was lost, repeatedly attempting retransmission, and, after several unsuccessful attempts, concluding that the QP has failed or been disconnected. For example, Figure 6 shows a combination of stripes across VM0, VM1, VM2, VM3, VM4 (not shown in the figure), and VM5 (not shown in the figure). VM0 includes data D000, D001, and D020, represented by thin solid lines; VM1 includes data D110 and D111; VM2 includes data D210, D220, and D221; VM3 includes data D300, D301, and D310; and the data in VM4 and VM5 are represented by solid boxes. Combining the data from VM0 to VM5 into stripes yields the messages MSG0 to MSG5 shown in Figure 6, which can be sent from the first network device to the second network device in the order shown in the send queue (SQ).

[0099] When VM2 fails, its data becomes corrupted, so the data for MSG2 and MSG3 cannot be obtained from VM2. MSG2 and MSG3 then fail to send and, after multiple retransmissions, the connection is broken, which prevents the other, normal messages from being sent. In this embodiment, the data of the subsequent MSG4 and MSG5 can be moved forward to replace the data of MSG2 and MSG3. As shown in Figure 7, the original MSG4 data is renumbered MSG2' and the original MSG5 data is renumbered MSG3'. Padding messages are then added as the new MSG4 and MSG5, yielding a new SQ', so that the data length is correct and the RDMA messages can be sent correctly to the second network device. The padding data in the padding messages can be data from other VMs or blank data.
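The reordering variant described above can be sketched in a few lines. The sketch assumes the send queue is a list of messages in consecutive PSN order and ignores PSN wrap-around; the `Message` type, its `abnormal` and `padding` flags, and the function `compact_and_pad` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    psn: int
    payload: bytes
    abnormal: bool = False  # data could not be obtained from the VM
    padding: bool = False   # filler added by the first network device


def compact_and_pad(sq: List[Message], pad_payload: bytes = b"") -> List[Message]:
    """Delete abnormal messages, move later messages forward, pad the tail.

    Assumes `sq` is ordered by PSN with consecutive values.
    """
    if not sq:
        return []
    first_psn = sq[0].psn
    normal = [m for m in sq if not m.abnormal]
    # Renumber the surviving messages so their PSNs are contiguous.
    new_sq = [Message(first_psn + i, m.payload, padding=m.padding)
              for i, m in enumerate(normal)]
    # Append padding messages so the sequence keeps its original length.
    for i in range(len(normal), len(sq)):
        new_sq.append(Message(first_psn + i, pad_payload, padding=True))
    return new_sq
```

Applied to a six-message queue in which MSG2 and MSG3 are abnormal, the original MSG4 and MSG5 payloads take the third and fourth positions and two padding messages fill the last two, matching the MSG2', MSG3' renumbering described for Figure 7.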

[0100] The padding method above requires reordering the PSNs of the messages. Optionally, the embodiments of this application can also pad the message sequence without reordering the PSNs. As shown in Figure 8, the first network device directly replaces the data of the target VM with data from other VMs or blank data to obtain a new SQ'.
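For comparison, a minimal sketch of the variant that keeps the PSNs unchanged is given below. As before, the `Message` type and the function `pad_in_place` are hypothetical names used only for illustration, and blank data is modeled as an empty payload by default.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Message:
    psn: int
    payload: bytes
    abnormal: bool = False  # data could not be obtained from the VM
    padding: bool = False   # payload was replaced by filler


def pad_in_place(sq: List[Message], pad_payload: bytes = b"") -> List[Message]:
    """Replace the payload of each abnormal message; every PSN stays as it was."""
    return [
        replace(m, payload=pad_payload, abnormal=False, padding=True)
        if m.abnormal else m
        for m in sq
    ]
```

Because no message changes position, the receiver still sees a continuous PSN sequence, and the normal messages keep their original numbers.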

[0101] 507. The first network device sends the modified message sequence to the second network device.

[0102] The first network device sends the messages and padding messages to the second network device in sequence according to the PSNs of the modified message sequence. Correspondingly, the second network device receives the messages and padding messages in the modified message sequence in PSN order.

[0103] 508. The second network device identifies the padding messages in the modified message sequence and deletes the padding messages.

[0104] After receiving the modified message sequence, the second network device can verify it through the data integrity field (DIF). For example, if the data in a padding message comes from other VMs that do not belong to the N VMs, the message can be identified as an invalid message that does not conform to the VM type. Alternatively, if a padding message contains blank data, it can be identified directly as an invalid message. The upper-layer data processing program removes the invalid messages and saves only the aforementioned messages in the modified message sequence.
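The receiver-side filtering can be sketched as below. This is a simplified stand-in for the DIF-based verification, not an implementation of it: each received message is modeled as a (PSN, VM identifier, payload) tuple, and a message is treated as padding when its VM identifier is outside the expected set or its payload is blank. The names `is_padding` and `drop_padding` are hypothetical.

```python
from typing import List, Set, Tuple

# (psn, vm_id, payload): a simplified model of a received message.
ReceivedMessage = Tuple[int, int, bytes]


def is_padding(msg: ReceivedMessage, expected_vm_ids: Set[int]) -> bool:
    """Return True if the message looks like padding under the sketch's rules."""
    _, vm_id, payload = msg
    blank = not any(payload)  # empty payload, or every byte is zero
    return blank or vm_id not in expected_vm_ids


def drop_padding(messages: List[ReceivedMessage],
                 expected_vm_ids: Set[int]) -> List[ReceivedMessage]:
    """Keep only the messages that carry data from the expected VMs."""
    return [m for m in messages if not is_padding(m, expected_vm_ids)]
```

Note that this simplified rule would also discard legitimate data consisting entirely of zero bytes, so a real check would need a stronger marker than payload content alone.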

[0105] 509. The second network device decapsulates the message sequence after deleting the padding message to obtain N data and the write address of N data.

[0106] In this embodiment, the message sequence after deleting the padding message only includes the message in step 502. The message can be decapsulated, such as disassembling the protocol packet, processing the information in the packet header, and extracting N data and the write address of N data in the payload.

[0107] 510. The second network device stores N data on the second host according to the write address of N data.

[0108] After obtaining the aforementioned write address, the second network device can store N data in the VM indicated by its corresponding write address. For example, it can query the registry based on the CLA address and length to match the corresponding VM identifier, VM memory address, and length for storage, or it can directly store the data based on the VM identifier, VM memory address, and length.

[0109] In this embodiment, the first network device encapsulates N data points from N VMs running on the first host into a single message and sends the message to the second network device. The first network device can directly send data from multiple VMs to the second network device without performing multiple RDMA transfers, thus improving transmission efficiency.

[0110] Furthermore, the first network device modifies the message sequence including the abnormal message, deletes the abnormal message and adds a padding message, and sends the modified message sequence to the second network device. This prevents the second network device from discovering that the sequence number is not continuous and assuming that the middle message is lost, which would lead to repeated retransmission attempts. After several unsuccessful attempts, it would assume that the QP is faulty or the QP is broken. This can improve transmission efficiency.

[0111] The data transmission method has been described above. The network device of the present application embodiment is described below with reference to the accompanying drawings.

[0112] Figure 9 is a schematic diagram of one embodiment of the network device 90 in this application.

[0113] As shown in Figure 9, this application embodiment provides a network device, which includes:

[0114] Acquisition unit 901 is used to acquire N data points. The network device is set on the first host, and the first host runs N virtual machines (VMs). The N data points come from the N VMs, where N is an integer greater than 1.

[0115] Encapsulation unit 902 is used to encapsulate N data and the write address of N data into a message according to the Remote Direct Memory Access (RDMA) protocol;

[0116] The sending unit 903 is used to send messages to the second network device.

[0117] Optionally, the acquisition unit 901 is specifically used to: acquire the identifiers and memory addresses of N VMs; and acquire N data based on the identifiers and memory addresses of the N VMs.

[0118] Optionally, the acquisition unit 901 is further configured to: acquire M data based on the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some of the N VMs, wherein the abnormal VM runs on the first host and data cannot be acquired based on the identifier and memory address of the abnormal VM, where M is a positive integer less than N;

[0119] The encapsulation unit 902 is also used to encapsulate M data into an abnormal message according to the RDMA protocol.

[0120] Optionally, the network device 90 further includes a generation unit 904, which is specifically used to generate a message sequence, the message sequence including an abnormal message and at least one message.

[0121] Optionally, the network device 90 further includes a modification unit 905, which is specifically used to modify the message sequence, wherein modifying the message sequence includes deleting abnormal messages and adding padding messages to the message sequence.

[0122] In this embodiment, the network device can perform the operations performed by the first network device in the embodiments shown in Figure 3 and Figure 5; details are not repeated here.

[0123] Figure 10 is a schematic diagram of another embodiment of the network device 100 in this application.

[0124] As shown in Figure 10, this application embodiment provides a network device, which includes:

[0125] The receiving unit 1001 is used to receive a message from a first network device. The first network device is set on a first host, and the second network device is set on a second host. The first host runs N virtual machines (VMs). The message is generated by encapsulating N data and the write address of N data according to the Remote Direct Memory Access (RDMA) protocol. The N data come from N VMs, where N is an integer greater than 1.

[0126] The decapsulation unit 1002 is used to decapsulate the message to obtain N data items and the write address of N data items;

[0127] Storage unit 1003 is used to store N data on the second host according to the write address of N data.

[0128] Optionally, the receiving unit 1001 is further configured to: receive a modified message sequence, wherein the modified message sequence includes a padding message;

[0129] The network device 100 also includes a deletion unit 1004, which is specifically used to: determine the padding messages in the modified message sequence and delete the padding messages.

[0130] In this embodiment, the network device can perform the operations performed by the second network device in the embodiments shown in Figure 3 and Figure 5; details are not repeated here.

[0131] Figure 11 shows a possible logical structure of a network device 110 provided in an embodiment of this application. The network device 110 includes a processor 1101, a communication interface 1102, a storage system 1103, and a bus 1104. The processor 1101, communication interface 1102, and storage system 1103 are interconnected via the bus 1104. In an embodiment of this application, the processor 1101 is used to control and manage the operation of the network device 110; for example, the processor 1101 is used to execute the steps performed by the first network device in the method embodiments of Figure 3 and Figure 5. The communication interface 1102 is used to support communication by the network device 110. The storage system 1103 is used to store the program code and data of the network device 110.

[0132] The processor 1101 can be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute various exemplary logic blocks, modules, and circuits described in conjunction with the disclosure of this application. The processor 1101 can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, etc. The bus 1104 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The bus can be divided into address bus, data bus, control bus, etc. For ease of representation, the bus is represented by a single thick line in Figure 11, but this does not mean that there is only one bus or one type of bus.

[0133] The sending unit 903 in network device 90 is equivalent to the communication interface 1102 in network device 110, and the acquisition unit 901, encapsulation unit 902, generation unit 904 and modification unit 905 in network device 90 are equivalent to the processor 1101 in network device 110.

[0134] The network device 110 in this embodiment can correspond to the first network device in the method embodiments of Figure 3 and Figure 5, and the communication interface 1102 of the network device 110 can implement the functions of the first network device and/or the various steps implemented in those method embodiments. For the sake of brevity, details are not repeated here.

[0135] Figure 12 shows a possible logical structure of a network device 120 provided in an embodiment of this application. The network device 120 includes a processor 1201, a communication interface 1202, a storage system 1203, and a bus 1204. The processor 1201, communication interface 1202, and storage system 1203 are interconnected via the bus 1204. In an embodiment of this application, the processor 1201 is used to control and manage the operation of the network device 120; for example, the processor 1201 is used to execute the steps performed by the second network device in the method embodiments of Figure 3 and Figure 5. The communication interface 1202 is used to support communication by the network device 120. The storage system 1203 is used to store the program code and data of the network device 120.

[0136] The processor 1201 can be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute various exemplary logic blocks, modules, and circuits described in conjunction with the disclosure of this application. The processor 1201 can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, etc. The bus 1204 can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The bus can be divided into address bus, data bus, control bus, etc. For ease of representation, the bus is represented by a single thick line in Figure 12, but this does not mean that there is only one bus or one type of bus.

[0137] The receiving unit 1001 in network device 100 is equivalent to the communication interface 1202 in network device 120, and the decapsulation unit 1002, storage unit 1003 and deletion unit 1004 in network device 100 can be equivalent to the processor 1201.

[0138] The network device 120 in this embodiment can correspond to the second network device in the method embodiments of Figure 3 and Figure 5, and the processor 1201 and communication interface 1202 in the network device 120 can implement the functions of the second network device and/or the various steps implemented in those method embodiments. For the sake of brevity, details are not repeated here.

[0139] In another embodiment of this application, a computer-readable storage medium is also provided, which stores computer-executable instructions. When the processor of the device executes the computer-executable instructions, the device performs the steps of the data transmission method performed by the first network device in Figure 3 and Figure 5.

[0140] In another embodiment of this application, a computer-readable storage medium is also provided, which stores computer-executable instructions. When the processor of the device executes the computer-executable instructions, the device performs the steps of the data transmission method performed by the second network device in Figure 3 and Figure 5.

[0141] In another embodiment of this application, a computer program product is also provided, which includes computer-executable instructions stored in a computer-readable storage medium; when the processor of the device executes the computer-executable instructions, the device performs the steps of the data transmission method performed by the first network device in Figure 3 and Figure 5.

[0142] In another embodiment of this application, a computer program product is also provided, which includes computer-executable instructions stored in a computer-readable storage medium; when the processor of the device executes the computer-executable instructions, the device performs the steps of the data transmission method performed by the second network device in Figure 3 and Figure 5.

[0143] Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the systems, devices, and units described above correspond to the processes in the foregoing method embodiments and are not repeated here.

[0144] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between apparatuses or units through some interfaces, and may be electrical, mechanical, or other forms.

[0145] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0146] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0147] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

Claims

1. A data transmission system, characterized in that, The data transmission system includes a first network device and a second network device. The first network device is installed on a first host, and the second network device is installed on a second host. N virtual machines (VMs) are running on the first host. The first network device is used to acquire N data points, which are from the N VMs, and encapsulate the N data points and their write addresses into a message according to the Remote Direct Memory Access (RDMA) protocol, and send the message to the second network device, where N is an integer greater than 1; The second network device is configured to receive the message, decapsulate the message to obtain the N data items and the write address of the N data items, and store the N data items on the second host according to the write address of the N data items.

2. The data transmission system according to claim 1, characterized in that, The first network device is configured to obtain the identifiers and memory addresses of the N VMs; and to obtain the N data based on the identifiers and memory addresses of the N VMs.

3. The data transmission system according to claim 2, characterized in that, The first host is running an abnormal VM, and data cannot be obtained based on the identifier and memory address of the abnormal VM; The first network device is configured to obtain M data points based on the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some of the N VMs, where M is a positive integer less than N; and to encapsulate the M data points into an abnormal message according to the RDMA protocol.

4. The data transmission system according to claim 3, characterized in that, The first network device is configured to generate a message sequence, the message sequence including the abnormal message and at least one of the messages.

5. The data transmission system according to claim 4, characterized in that, The first network device is configured to modify the message sequence, wherein modifying the message sequence includes deleting the abnormal message and adding a padding message to the message sequence.

6. The data transmission system according to claim 5, characterized in that, The second network device is configured to receive the modified message sequence, determine the padding message in the modified message sequence, and delete the padding message.

7. A data transmission method, characterized in that, include: A first network device acquires N data points. The first network device is set on a first host, and a second network device is set on a second host. The first host runs N virtual machines (VMs), and the N data points come from the N VMs, where N is an integer greater than 1. The first network device encapsulates the N data and the write address of the N data into a message according to the Remote Direct Memory Access (RDMA) protocol; The first network device sends the message to the second network device.

8. The data transmission method according to claim 7, characterized in that, The first network device acquires N data points, including: The first network device obtains the identifiers and memory addresses of the N VMs; The first network device obtains the N data based on the identifiers and memory addresses of the N VMs.

9. The data transmission method according to claim 7, characterized in that, The first host is running an abnormal VM, and data cannot be obtained based on the identifier and memory address of the abnormal VM. The method further includes: The first network device obtains M data points based on the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some of the N VMs, where M is a positive integer less than N; The first network device encapsulates the M data points into an abnormal message according to the RDMA protocol.

10. The data transmission method according to claim 9, characterized in that, The method further includes: The first network device generates a message sequence, the message sequence including the abnormal message and at least one of the messages.

11. The data transmission method according to claim 10, characterized in that, The method further includes: The first network device modifies the message sequence, wherein modifying the message sequence includes deleting the abnormal message and adding a padding message to the message sequence.

12. A data transmission method, characterized in that, include: The second network device receives a message from the first network device, which is located on the first host and the second network device is located on the second host. The first host runs N virtual machines (VMs). The message is generated by encapsulating N data and the write address of the N data according to the Remote Direct Memory Access (RDMA) protocol. The N data comes from the N VMs, where N is an integer greater than 1. The second network device decapsulates the message to obtain the N data items and their write addresses; The second network device stores the N data on the second host according to the write addresses of the N data.

13. The data transmission method according to claim 12, characterized in that, The method further includes: The second network device receives the modified message sequence, the modified message sequence including a padding message; The second network device determines the padding message in the modified message sequence and deletes the padding message.

14. A network device, characterized in that, include: An acquisition unit is used to acquire N data points. The network device is set on a first host, and the second network device is set on a second host. The first host runs N virtual machines (VMs), and the N data points come from the N VMs, where N is an integer greater than 1. An encapsulation unit is used to encapsulate the N data and the write address of the N data into a message according to the Remote Direct Memory Access (RDMA) protocol; A sending unit is used to send the message to the second network device.

15. The network device according to claim 14, characterized in that, The acquisition unit is specifically used for: Obtain the identifiers and memory addresses of the N VMs; Based on the identifiers and memory addresses of the N VMs, obtain the N data.

16. The network device according to claim 14, characterized in that, The acquisition unit is also used for: Based on the identifier and memory address of the abnormal VM and the identifiers and memory addresses of some of the N VMs, obtain M data points. The abnormal VM runs on the first host. Data cannot be obtained based on the identifier and memory address of the abnormal VM. Here, M is a positive integer less than N. The encapsulation unit is also used for: Encapsulate the M data points into an abnormal message according to the RDMA protocol.

17. The network device according to claim 16, characterized in that, The network device further includes a generation unit, which is specifically used for: Generate a message sequence, the message sequence including the abnormal message and at least one of the messages.

18. The network device according to claim 17, characterized in that, The network device further includes a modification unit, which is specifically used for: Modify the message sequence, wherein modifying the message sequence includes deleting the abnormal message and adding a padding message to the message sequence.

19. A network device, characterized in that it includes: A receiving unit is configured to receive a message from a first network device, wherein the first network device is set on a first host and the network device is set on a second host. The first host runs N virtual machines (VMs). The message is generated by encapsulating N data and the write address of the N data according to the Remote Direct Memory Access (RDMA) protocol. The N data comes from the N VMs, where N is an integer greater than 1. A decapsulation unit is used to decapsulate the message to obtain the N data items and the write address of the N data items; A storage unit is used to store the N data on the second host according to the write addresses of the N data.

20. The network device according to claim 19, characterized in that, The receiving unit is also used for: Receive a modified message sequence, wherein the modified message sequence includes a padding message; The network device further includes a deletion unit, which is specifically used for: Identify the padding message in the modified message sequence and delete the padding message.

21. A network device, characterized in that it includes: A processor and a memory, The processor is configured to execute instructions stored in the memory, causing the network device to perform the method of any one of claims 7 to 11.

22. A network device, characterized in that it includes: A processor and a memory, The processor is configured to execute instructions stored in the memory, causing the network device to perform the method of any one of claims 12 to 13.