A communication system and network card
By adjusting the transmission rate based on the data flow's transmission distance and round-trip delay in the communication system, and using rate limiting information for bandwidth control, the problem of uneven bandwidth usage between long-distance and short-distance traffic is solved, thereby improving data transmission efficiency and network performance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUAWEI TECH CO LTD
- Filing Date
- 2024-12-18
- Publication Date
- 2026-06-19
AI Technical Summary
When long-distance and short-distance traffic are transmitted simultaneously, the uneven bandwidth usage leads to network congestion and reduced data processing efficiency.
By employing different transmission rate adjustment strategies for data streams with different transmission distances in the communication system, adjusting the transmission rate based on round-trip delay using the network card, and controlling bandwidth through rate limiting information, it is ensured that data streams with different transmission distances occupy bandwidth more evenly.
This achieves a more balanced bandwidth usage for data streams at different transmission distances, reduces packet loss in long-distance traffic, and improves the data transmission performance and data processing efficiency of the RDMA network.
Figure CN122247927A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of communication technology, and in particular to a communication system and a network interface card (NIC).
Background Technology
[0002] Congestion control is a flow control technology that addresses network congestion during communication and can be used for traffic scheduling in data center network congestion scenarios.
[0003] In business scenarios such as artificial intelligence (AI) training, inference, and cloud storage, due to the massive computing scale of large AI models and the characteristics of multi-replica local and geographically synchronized storage, it is often necessary to transmit data across multiple data centers in different regions over long distances. Data transmitted across regions over long distances can be called long-distance traffic, while data transmitted between communication devices within a single data center can be called short-distance traffic.
[0004] When long-distance and short-distance traffic are transmitted simultaneously in the network, there will be a problem of bandwidth contention between long-distance and short-distance traffic, resulting in an imbalance in bandwidth usage.
Summary of the Invention
[0005] This application provides a communication system and a network interface card (NIC) that can more evenly distribute the bandwidth used by traffic at different transmission distances.
[0006] In a first aspect, this application provides a communication system that may include a first network interface card (NIC), a second NIC, and a third NIC. The transmission distance between the first NIC and the second NIC is greater than or equal to a first distance threshold, and the transmission distance between the third NIC and the second NIC is less than or equal to a second distance threshold, wherein the first distance threshold is greater than or equal to the second distance threshold.
[0007] The first network interface card (NIC) can be used to increase the transmission rate of the first data stream during transmission to the second NIC, based on the round-trip time (RTT) of the packets contained in the first data stream. The third NIC can be used to increase the transmission rate of the second data stream during transmission to the second NIC, based on the RTT of the packets contained in the second data stream; the RTT of the packets contained in the first data stream is greater than the RTT of the packets contained in the second data stream.
[0008] The second network interface card (NIC) can be used to send first rate limiting information to the first NIC when the transmission rate of the first data stream is greater than or equal to a first rate threshold. The first NIC can also be used to reduce the transmission rate of the first data stream based on the first rate limiting information. The second NIC can also be used to send second rate limiting information to the third NIC when the transmission rate of the second data stream is greater than or equal to a second rate threshold. The third NIC can also be used to reduce the transmission rate of the second data stream based on the second rate limiting information.
[0009] The network interface cards (NICs) in the communication system provided in this application can employ different transmission rate adjustment strategies for data streams at different transmission distances. For example, the first NIC can increase the rate of the first data stream based on its larger round-trip delay, the third NIC can increase the rate of the second data stream based on its smaller round-trip delay, and the second NIC can limit the rate of the first data stream according to a first rate threshold and the rate of the second data stream according to a second rate threshold. Employing different transmission rate adjustment strategies for data streams at different transmission distances prevents some data streams from consuming too much bandwidth, which helps to achieve more balanced bandwidth usage among data streams at different transmission distances.
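The division of roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the rate increase depends on the round-trip delay, so the sketch assumes an additive step that scales linearly with the RTT, and a multiplicative decrease on receipt of rate limiting information. The names `SenderFlow`, `receiver_should_limit`, `step_gbps_per_us` and `factor` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SenderFlow:
    """Sender-side state for one data stream (hypothetical structure)."""
    rate_gbps: float
    rtt_us: float  # measured round-trip delay of the stream's packets

    def increase_rate(self, step_gbps_per_us: float = 0.001) -> None:
        # Assumption: the increase step scales linearly with the RTT, so a
        # long-distance stream ramps up in larger steps per adjustment.
        self.rate_gbps += step_gbps_per_us * self.rtt_us

    def on_rate_limit(self, factor: float = 0.5) -> None:
        # Assumption: multiplicative decrease when rate limiting
        # information arrives from the destination NIC.
        self.rate_gbps *= factor


def receiver_should_limit(measured_rate_gbps: float,
                          rate_threshold_gbps: float) -> bool:
    """Destination-side check: limit once the rate reaches its threshold."""
    return measured_rate_gbps >= rate_threshold_gbps


# First data stream (long distance) and second data stream (short distance).
long_flow = SenderFlow(rate_gbps=10.0, rtt_us=2000.0)
short_flow = SenderFlow(rate_gbps=10.0, rtt_us=20.0)
long_flow.increase_rate()
short_flow.increase_rate()
```

With these illustrative numbers the long-distance stream gains roughly 2 Gbit/s per adjustment and the short-distance stream roughly 0.02 Gbit/s, while the destination NIC applies a separate threshold to each stream.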
[0010] In one optional implementation, the first network interface card (NIC) is used to increase the transmission rate of the first data stream according to the round-trip time delay of the packets contained in the first data stream if it does not obtain the rate reduction information of the first data stream within a set time period. The rate reduction information of the first data stream may include at least one of the following:
[0011] Information that the round-trip time of the packets contained in the first data stream is greater than or equal to a second delay threshold within the set time period;
[0012] The first rate limiting information sent by the second NIC;
[0013] Congestion information for the first data stream sent by a forwarding device of the first data stream, where the forwarding device is located between the first network interface card (NIC) and the second NIC.
[0014] If the first network interface card (NIC) obtains rate reduction information for the first data stream, it can reduce the transmission rate of that data stream. The rate reduction information may include congestion information sent by the forwarding device. Compared with waiting for the second NIC to report the congestion, the first NIC learns of congestion on the transmission path earlier, so it can respond and adjust faster and prevent further packet loss caused by that congestion.
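The condition "no rate reduction information within a set time period" can be sketched as a simple window check. This is a minimal illustration, not the claimed implementation; the names `SlowdownSource`, `SlowdownEvent`, `should_increase` and the microsecond timestamps are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class SlowdownSource(Enum):
    RTT_ABOVE_THRESHOLD = auto()        # RTT >= second delay threshold
    RATE_LIMIT_FROM_RECEIVER = auto()   # first rate limiting information
    CONGESTION_FROM_FORWARDER = auto()  # congestion info from a forwarding device


@dataclass
class SlowdownEvent:
    source: SlowdownSource
    timestamp_us: float


def should_increase(events: List[SlowdownEvent],
                    now_us: float,
                    window_us: float) -> bool:
    """True when no rate reduction information arrived within the window."""
    return not any(0 <= now_us - e.timestamp_us <= window_us for e in events)


old = SlowdownEvent(SlowdownSource.CONGESTION_FROM_FORWARDER, timestamp_us=100.0)
recent = SlowdownEvent(SlowdownSource.RATE_LIMIT_FROM_RECEIVER, timestamp_us=950.0)
```

Any one of the three sources is enough to block the increase, which matches the "at least one of the following" wording.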
[0015] In one alternative implementation, the first rate limiting information is a rate limiting feedback message, or the first rate limiting information is a rate limiting identifier carried in a first feedback message, and the first feedback message is a feedback message for the messages contained in the first data stream.
[0016] In the above implementation, the first rate limiting information can be in the form of a message or a rate limiting identifier carried in the first feedback message. Transmitting the rate limiting information through the rate limiting identifier in the feedback message helps to save signaling and bandwidth.
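The two forms of rate limiting information can be illustrated as follows. The message layouts are assumptions made for this sketch (the text does not define any fields); `RateLimitFeedback`, `Feedback`, `rate_limit_flag` and `carries_rate_limit` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RateLimitFeedback:
    """Standalone rate limiting feedback message (hypothetical layout)."""
    flow_id: int


@dataclass
class Feedback:
    """Ordinary feedback message for a received packet (hypothetical layout)."""
    flow_id: int
    acked_seq: int
    rate_limit_flag: bool = False  # rate limiting identifier piggybacked here


def carries_rate_limit(msg: Union[RateLimitFeedback, Feedback]) -> bool:
    """Return True if the message tells the source NIC to reduce its rate."""
    if isinstance(msg, RateLimitFeedback):
        return True
    return msg.rate_limit_flag
```

Piggybacking the identifier on a feedback message that is sent anyway avoids an extra message per rate limiting event, which is the signaling saving the paragraph refers to.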
[0017] In one alternative implementation, the first network interface card (NIC) is located in the first data center, while the second and third NICs are located in the second data center.
[0018] In one alternative implementation, the communication system may further include a forwarding device located between the first network interface card (NIC) and the second NIC, and also between the third NIC and the second NIC. The first data stream and the second data stream are transmitted through the same buffer queue in the forwarding device.
[0019] The technical solution of this application can be used to alleviate the problem of different data streams with different transmission distances competing for bandwidth in the same buffer queue.
[0020] In one alternative implementation, the messages contained in the first data stream conform to the Remote Direct Memory Access (RDMA) protocol.
[0021] The technical solution of this application can be applied to traffic transmitted using the RDMA protocol. The RDMA protocol is a packet loss sensitive protocol. By making the data streams of different transmission distances occupy bandwidth more evenly, packet loss of long-distance traffic can be reduced and the data transmission performance of the RDMA network can be improved.
[0022] In a second aspect, this application provides a network interface card (NIC) including a processor and a communication interface. The communication interface is used to transmit a first data stream and a third data stream under the control of the processor. The processor is used to increase the transmission rate of the first data stream according to the round-trip time (RTT) of the messages contained in the first data stream, and to increase the transmission rate of the third data stream according to the RTT of the messages contained in the third data stream; the RTT of the messages contained in the first data stream is greater than the RTT of the messages contained in the third data stream.
[0023] In one optional implementation, the transmission distance between the aforementioned network card and the destination network card of the first data stream is greater than or equal to a first distance threshold, and the transmission distance between the aforementioned network card and the destination network card of the third data stream is less than or equal to a second distance threshold, wherein the first distance threshold is greater than or equal to the second distance threshold.
[0024] In one alternative implementation, the round-trip time of the first message in the first data stream refers to the duration between the time the first message is sent and the time the feedback message of the first message is received.
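The round-trip time definition above amounts to subtracting the send timestamp from the timestamp at which the feedback message is received. A minimal sketch, assuming packets are identified by a sequence number and timestamps are in microseconds (`RttTracker` and its methods are hypothetical names):

```python
from typing import Dict, Optional


class RttTracker:
    """Measures per-packet RTT as feedback-receive time minus send time."""

    def __init__(self) -> None:
        self._send_time_us: Dict[int, float] = {}

    def on_send(self, seq: int, now_us: float) -> None:
        # Remember when the packet with this sequence number was sent.
        self._send_time_us[seq] = now_us

    def on_feedback(self, seq: int, now_us: float) -> Optional[float]:
        # Return the RTT, or None if no matching send was recorded.
        sent_us = self._send_time_us.pop(seq, None)
        if sent_us is None:
            return None
        return now_us - sent_us
```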
[0025] In one alternative implementation, the round-trip time of the messages contained in the first data stream is less than or equal to a second delay threshold.
[0026] In one alternative implementation, the network interface card further includes a register; the second latency threshold is the historical round-trip latency of the packets contained in the first data stream stored in the register.
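A sketch of using a stored historical round-trip latency as the second delay threshold is shown below. The text does not say how or when the register is updated, so the `store` method is an assumption, as are the names `RttRegister` and `allows_increase`.

```python
class RttRegister:
    """Holds a historical RTT used as the second delay threshold (hypothetical)."""

    def __init__(self, historical_rtt_us: float) -> None:
        self.historical_rtt_us = historical_rtt_us

    def store(self, rtt_us: float) -> None:
        # Update policy is an assumption; the text does not specify one.
        self.historical_rtt_us = rtt_us

    def allows_increase(self, current_rtt_us: float) -> bool:
        # A rate increase is considered only while the current RTT is
        # less than or equal to the stored threshold.
        return current_rtt_us <= self.historical_rtt_us
```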
[0027] In one alternative implementation, the processor can specifically be used to: increase the transmission rate of the first data stream according to the round-trip time delay of the messages contained in the first data stream if the rate reduction information of the first data stream is not obtained within a set time period.
[0028] In one optional implementation, the packets contained in the first data stream carry a flow type identifier; the flow type identifier is used to indicate that the transmission distance between the network interface card and the destination network interface card of the first data stream is greater than or equal to the first distance threshold.
[0029] In a third aspect, this application provides a network interface card (NIC) including a processor and a communication interface; wherein the communication interface is used to receive messages contained in a first data stream and to receive messages contained in a second data stream; the processor can be used to send first rate limiting information to the source NIC of the first data stream through the communication interface when the transmission rate of the first data stream is greater than or equal to a first rate threshold; the first rate limiting information is used to instruct the source NIC of the first data stream to reduce the transmission rate of the first data stream; the transmission distance between the NIC and the source NIC of the first data stream is greater than or equal to a first distance threshold.
[0030] The processor can also be used to send second rate limiting information to the source network card of the second data stream through the communication interface when the transmission rate of the second data stream is greater than or equal to the second rate threshold; the second rate limiting information is used to instruct the source network card of the second data stream to reduce the transmission rate of the second data stream; the transmission distance between the network card and the source network card of the second data stream is less than or equal to the second distance threshold, and the first distance threshold is greater than or equal to the second distance threshold.
[0031] In one optional implementation, the first data stream belongs to the first traffic group. When all the data streams transmitted by the aforementioned network cards belong to the first traffic group, the first rate threshold adopts a first value. When the aforementioned network cards transmit data streams that do not belong to the first traffic group, the first rate threshold adopts a second value. The first value is greater than or equal to the second value. The transmission distance between the source network card of the data stream belonging to the first traffic group and the second network card is greater than or equal to the first distance threshold.
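The selection of the first rate threshold can be sketched as below. The group labels and numeric values are illustrative assumptions; `first_rate_threshold` and the `"long_distance"` label are hypothetical names.

```python
from typing import List


def first_rate_threshold(flow_groups: List[str],
                         first_value: float,
                         second_value: float,
                         first_group: str = "long_distance") -> float:
    """Pick the first rate threshold from the groups of all current flows."""
    if first_value < second_value:
        raise ValueError("first_value must be >= second_value")
    # All flows belong to the first traffic group: use the larger value.
    if all(group == first_group for group in flow_groups):
        return first_value
    # Some flow does not belong to the first traffic group: use the
    # smaller value so long-distance flows leave room for the others.
    return second_value
```

For example, with `first_value=100.0` and `second_value=40.0`, a NIC carrying only long-distance flows would use 100.0, and the threshold would drop to 40.0 once a short-distance flow is present.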
[0032] In one alternative implementation, the first rate limiting information is a rate limiting feedback message, or the first rate limiting information is a rate limiting identifier carried in a first feedback message, and the first feedback message is a feedback message for the messages contained in the first data stream.
[0033] In a fourth aspect, this application provides a flow control method applied to a communication system, which may include a first network interface card (NIC), a second NIC, and a third NIC. The transmission distance between the first NIC and the second NIC is greater than or equal to a first distance threshold, and the transmission distance between the third NIC and the second NIC is less than or equal to a second distance threshold. The first distance threshold is greater than or equal to the second distance threshold. The flow control method includes:
[0034] During the process of transmitting the first data stream to the second network card, the first network card increases the transmission rate of the first data stream according to the round-trip time delay of the packets contained in the first data stream.
[0035] During the transmission of the second data stream from the third network card to the second network card, the transmission rate of the second data stream is increased according to the round-trip time of the packets contained in the second data stream; the round-trip time of the packets contained in the first data stream is greater than the round-trip time of the packets contained in the second data stream.
[0036] When the transmission rate of the first data stream is greater than or equal to the first rate threshold, the second network card sends the first rate limiting information to the first network card, and the first network card reduces the transmission rate of the first data stream based on the first rate limiting information.
[0037] When the transmission rate of the second data stream is greater than or equal to the second rate threshold, the second network card sends a second rate limiting message to the third network card. Based on the second rate limiting message, the third network card reduces the transmission rate of the second data stream.
[0038] In one alternative implementation, if the first network card does not obtain the rate reduction information of the first data stream within a set time period, it increases the transmission rate of the first data stream according to the round-trip time of the packets contained in the first data stream.
[0039] The rate reduction information of the first data stream may include at least one of the following:
[0040] Information that the round-trip time of the packets contained in the first data stream is greater than or equal to a second delay threshold within the set time period;
[0041] The first rate limiting information sent by the second NIC;
[0042] Congestion information for the first data stream sent by a forwarding device of the first data stream, where the forwarding device is located between the first network interface card (NIC) and the second NIC.
[0043] In one alternative implementation, the first rate limiting information is a rate limiting feedback message, or the first rate limiting information is a rate limiting identifier carried in a first feedback message, and the first feedback message is a feedback message for the messages contained in the first data stream.
[0044] In one alternative implementation, the messages contained in the first data stream conform to the Remote Direct Memory Access (RDMA) protocol.
[0045] In a fifth aspect, this application provides a flow control method applied to a network interface card (NIC) in a communication system, the flow control method comprising:
[0046] During the transmission of the first data stream, the transmission rate of the first data stream is increased according to the round-trip time delay of the messages contained in the first data stream.
[0047] During the transmission of the third data stream, the transmission rate of the third data stream is increased according to the round-trip time of the messages contained in the third data stream; the round-trip time of the messages contained in the first data stream is greater than the round-trip time of the messages contained in the third data stream.
[0048] In one optional implementation, the transmission distance between the aforementioned network card and the destination network card of the first data stream is greater than or equal to a first distance threshold, and the transmission distance between the aforementioned network card and the destination network card of the third data stream is less than or equal to a second distance threshold, wherein the first distance threshold is greater than or equal to the second distance threshold.
[0049] In one alternative implementation, the round-trip time of the first message in the first data stream refers to the duration between the time the first message is sent and the time the feedback message of the first message is received.
[0050] In one alternative implementation, the round-trip time of the messages contained in the first data stream is less than or equal to a second delay threshold.
[0051] In one alternative implementation, the second latency threshold is the historical round-trip latency of the packets contained in the first data stream stored in the aforementioned network interface card.
[0052] In one alternative implementation, if the network card does not obtain the rate reduction information of the first data stream within a set time period, it increases the transmission rate of the first data stream according to the round-trip time of the packets contained in the first data stream.
[0053] In one optional implementation, the packets contained in the first data stream carry a flow type identifier; the flow type identifier is used to indicate that the transmission distance between the network interface card and the destination network interface card of the first data stream is greater than or equal to the first distance threshold.
[0054] In a sixth aspect, this application provides a flow control method applied to a network interface card (NIC) in a communication system, the flow control method comprising:
[0055] When the transmission rate of the first data stream is greater than or equal to a first rate threshold, a first rate limiting information is sent to the source network card of the first data stream; the first rate limiting information is used to instruct the source network card of the first data stream to reduce the transmission rate of the first data stream; the transmission distance between the network card and the source network card of the first data stream is greater than or equal to a first distance threshold.
[0056] When the transmission rate of the second data stream is greater than or equal to the second rate threshold, second rate limiting information is sent to the source network card of the second data stream; the second rate limiting information is used to instruct the source network card of the second data stream to reduce the transmission rate of the second data stream; the transmission distance between the aforementioned network card and the source network card of the second data stream is less than or equal to the second distance threshold, and the first distance threshold is greater than or equal to the second distance threshold.
[0057] In one optional implementation, the first data stream belongs to the first traffic group. When all the data streams transmitted by the aforementioned network cards belong to the first traffic group, the first rate threshold adopts a first value. When the aforementioned network cards transmit data streams that do not belong to the first traffic group, the first rate threshold adopts a second value. The first value is greater than or equal to the second value. The transmission distance between the source network card of the data stream belonging to the first traffic group and the second network card is greater than or equal to the first distance threshold.
[0058] In one alternative implementation, the first rate limiting information is a rate limiting feedback message, or the first rate limiting information is a rate limiting identifier carried in a first feedback message, and the first feedback message is a feedback message for the messages contained in the first data stream.
[0059] In a seventh aspect, this application provides a communication device, which may include a network interface card provided in the second or third aspect, and may also include a computing unit or a storage unit.
[0060] In an eighth aspect, embodiments of this application provide a computer-readable storage medium storing computer-executable instructions for causing a computer to perform any of the methods provided in the fifth aspect above.
[0061] In a ninth aspect, embodiments of this application provide a computer-readable storage medium storing computer-executable instructions for causing a computer to perform any of the methods provided in the sixth aspect above.
[0062] In a tenth aspect, embodiments of this application provide a computer program product comprising computer-executable instructions for causing a computer to perform any of the methods provided in the fifth aspect above.
[0063] In an eleventh aspect, embodiments of this application provide a computer program product comprising computer-executable instructions for causing a computer to perform any of the methods provided in the sixth aspect above.
[0064] For the technical effects that can be achieved by any of the second to eleventh aspects, refer to the description of the beneficial effects of the first aspect above; they are not repeated here.
Attached Figure Description
[0065] Figure 1 is a schematic diagram of a communication system according to an embodiment of this application;
[0066] Figure 2 is a schematic diagram of a buffer queue of a forwarding device provided in an embodiment of this application;
[0067] Figure 3 is an interaction diagram between network interface cards (NICs) during the flow control process provided in an embodiment of this application;
[0068] Figure 4 is a schematic diagram of the transmission paths of various data streams in a communication system provided in an embodiment of this application;
[0069] Figure 5 is a schematic diagram illustrating congestion control performed by a forwarding device according to an embodiment of this application;
[0070] Figure 6 is a flowchart of a flow control method provided in an embodiment of this application;
[0071] Figure 7 is a schematic diagram illustrating the transmission of long-distance and short-distance traffic in a communication system provided in an embodiment of this application;
[0072] Figure 8 is a schematic diagram of the internal structure of a network interface card (NIC) provided in an embodiment of this application;
[0073] Figure 9 is a flowchart illustrating another flow control method provided in an embodiment of this application;
[0074] Figure 10 is a schematic diagram illustrating the transmission of long-distance and short-distance traffic in another communication system provided in an embodiment of this application;
[0075] Figure 11 is a schematic diagram of the structure of a network interface card (NIC) provided in an embodiment of this application.
Detailed Implementation
[0076] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the embodiments of this application will be described in detail below with reference to the accompanying drawings. The terminology used in the implementation section of this application is only for explaining specific embodiments of this application and is not intended to limit this application.
[0077] Before introducing the specific solutions provided in the embodiments of this application, some terms used in this application are explained to facilitate understanding by those skilled in the art; these explanations do not limit the terms used in this application.
[0078] (1) Data Stream: In a communication network, when a communication device needs to transmit data to another communication device during the execution of a service, it can transmit the data to the other communication device through multiple messages. These messages are dependent on each other and can be numbered according to the transmission order. The multiple messages are transmitted in the same direction, which can be called a data stream. Communication devices can transmit data streams through network interface cards (NICs). Multiple messages in the same data stream have the same source NIC and the same destination NIC.
[0079] (2) Long-distance traffic: When the transmission distance between the source network card and the destination network card of a data flow is greater than or equal to the first distance threshold, the data flow is considered long-distance traffic. For example, when the source network card and the destination network card of a data flow are located in data centers in different regions, the transmission distance between the source network card and the destination network card of the data flow is greater than the first distance threshold, and the data flow is considered long-distance traffic.
[0080] (3) Short-distance traffic: When the transmission distance between the source network card and the destination network card of a data flow is less than or equal to the second distance threshold, the data flow is considered short-distance traffic. For example, when the source network card and the destination network card of a data flow are located in the same data center, the transmission distance between the source network card and the destination network card of the data flow is less than the second distance threshold, and the data flow is considered short-distance traffic.
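The two definitions above can be sketched as a threshold comparison. The threshold values and the kilometre unit are illustrative assumptions (the text gives no numbers), and `classify_flow` is a hypothetical name. A distance strictly between the two thresholds falls into neither class in this sketch.

```python
from typing import Optional


def classify_flow(distance_km: float,
                  first_threshold_km: float,
                  second_threshold_km: float) -> Optional[str]:
    """Classify a data flow by source-to-destination NIC distance."""
    if first_threshold_km < second_threshold_km:
        raise ValueError("first threshold must be >= second threshold")
    # Checked first, so a distance equal to both thresholds (when they
    # coincide) is treated as long-distance in this sketch.
    if distance_km >= first_threshold_km:
        return "long-distance"
    if distance_km <= second_threshold_km:
        return "short-distance"
    return None
```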
[0081] In this application embodiment, "multiple" refers to two or more. Therefore, in this application embodiment, "multiple" can also be understood as "at least two". "At least one" can be understood as one or more, such as one, two, or more. For example, "including at least one" means including one, two, or more, and it does not limit which ones are included. For example, including at least one of A, B, and C, then it could include A, B, C, A and B, A and C, B and C, or A and B and C. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character " / ", unless otherwise specified, generally indicates that the preceding and following related objects have an "or" relationship.
[0082] Unless otherwise stated, the ordinal numbers such as "first" and "second" mentioned in the embodiments of this application are used to distinguish multiple objects, and are not used to limit the order, sequence, priority or importance of multiple objects.
[0083] The flow control method provided in this application can be applied to communication networks that include remote communication. This communication network can be a wired network or a wireless network; for example, it can be a 5th generation (5G) mobile communication network or a future mobile communication network, or it can be an Ethernet communication network. The communication network can include multiple local area networks (LANs), each LAN containing multiple communication devices. Communication devices within the same LAN can transmit data, and communication devices between different LANs can also transmit data.
[0084] Figure 1 An exemplary schematic diagram of a communication system to which an embodiment of this application applies is shown. Figure 1 The communication system shown can contain multiple data centers, such as data center 1, data center 2, etc. Figure 1 A data center can be understood as a local area network (LAN), while different data centers can be understood as different LANs. Data centers can communicate with each other via fiber optic cables. In some application scenarios, a data center can be an AI computing system, a cloud computing system, or a cloud storage system, belonging to the same cloud networking domain, the same availability zone (AZ), or the same point of delivery (POD). Different data centers can belong to different cloud networking domains, different AZs, or different PODs. A data center is located within a region or geographical area. Different data centers can be located in the same region or different AZs within the same geographical area, or in different regions or different AZs within different geographical areas.
[0085] Each data center may include one or more forwarding devices, as well as multiple communication devices, for example, Figure 1Data center 1, as shown, may include communication devices K1, K2, and K3, and forwarding device M1, etc. Data center 2 may include communication devices N1, N2, and N3, and forwarding device M3, etc. The communication devices in data centers 1 and 2 can be computing devices or storage devices. The forwarding devices in data centers 1 and 2 are devices with data forwarding capabilities; forwarding devices can be routers or switches, such as top-of-rack (TOP) switches or aggregation switches. For example, communication devices K1, K2, and K3 in data center 1 can be computing devices. A computing device is an electronic device with computing and data processing capabilities. A computing device can be a physical device, such as a desktop computer, computing server, host, computing chip, etc., or a virtual device, such as a module or unit with computing and data processing capabilities deployed in a cloud computing system, such as a virtual machine or control unit. Communication devices K1, K2, and K3 in data center 1 can also be storage devices, such as storage servers used to store data and files. The communication devices N1, N2, and N3 in data center 2 can be either computing devices or storage devices.
[0086] Communication devices K1, K2, and K3 in Data Center 1, and communication devices N1, N2, and N3 in Data Center 2, are all equipped with network interface cards (NICs). These NICs can communicate with other communication devices on the network. Therefore, it can also be said that... Figure 1 The communication system shown includes multiple network interface cards (NICs), which are deployed in different data centers. The NICs can be smart NICs or data processing unit (DPU) NICs, etc.
[0087] In some application scenarios, a forwarding device M2 can also be set up between data center 1 and data center 2. There can be one or more forwarding devices M2; Figure 1 shows only one as an example. Forwarding device M2 can be a router or a switch, such as a core switch, used to forward data sent by any communication device in data center 1 to any communication device in data center 2, and to forward data sent by any communication device in data center 2 to any communication device in data center 1.
[0088] It should be noted that in actual application scenarios, the communication system can contain more data centers, and the number of communication devices can differ between data centers. The number of communication devices in any data center can be more than 3 or fewer than 3, and this application does not limit this.
[0089] In AI training and inference scenarios, the massive computational scale of large AI models typically requires the participation of multiple data centers, which necessitates long-distance data transmission across AZs or regions. Similarly, cloud storage scenarios, characterized by multi-replica local and geographically synchronized storage, also require multiple data centers and therefore long-distance data transmission across AZs or regions. Data transmitted across AZs or regions between multiple data centers can be termed long-distance traffic. Data transmitted between communication devices within a single data center can be termed short-distance traffic. For example, communication device K1 in data center 1 can transmit data for a first service in the form of packets to communication device N3 in data center 2 via forwarding devices M1, M2, and M3. The multiple packets of the first service transmitted from communication device K1 to communication device N3 constitute a data stream. Communication device K1 can be considered the source device of this data stream, and communication device N3 can be considered the destination device. The process of data transmission from a source device to a destination device can be called end-to-end data transmission. Since communication devices K1 and N3 belong to different data centers, and data center 1 (where K1 is located) and data center 2 (where N3 is located) can be in different AZs or different regions, the data flow between K1 and N3 is long-distance traffic. Communication device N1 in data center 2 can transmit the data of a second service to communication device N3 in data center 2 in the form of packets via forwarding device M3. The multiple packets of the second service transmitted from communication device N1 to communication device N3 constitute a data flow. 
For this data flow, communication device N1 can be called the source device, and communication device N3 can be called the destination device. Since communication devices N1 and N3 belong to the same data center, the data flow between them is short-distance traffic.
[0090] When transmitting data over long distances across geographical regions, either TCP or remote direct memory access (RDMA) can be used. RDMA is a network communication protocol that supports long-distance data transmission across geographical regions and features kernel bypass, low latency, and high bandwidth. RDMA can include RDMA over Converged Ethernet version 2 (RoCEv2). Compared to TCP, RDMA offers faster transmission rates and lower processor overhead, and better meets the ever-increasing data volume demands of services. Because of its low latency and high throughput, RDMA places more stringent requirements on network congestion control. The transmission latency of data packets is positively correlated with the transmission distance; when the transmission medium is fiber optic, the transmission latency increases by 0.5 µs for every 1 km increase in distance. RDMA can be used for both long-distance and short-distance traffic transmission. When long-distance and short-distance traffic are transmitted simultaneously in the network, the large difference in transmission delay between them can lead to an imbalance in bandwidth usage during network congestion control.
[0091] For example, long-distance traffic and short-distance traffic each increase their transmission rate according to their own transmission delay. Since the transmission delay of long-distance traffic is much greater than that of short-distance traffic, after a period of time the transmission rate of short-distance traffic may exceed that of long-distance traffic. Short-distance traffic then consumes more bandwidth, leading to an imbalance in bandwidth usage between long-distance and short-distance traffic. In AI training and inference scenarios, suppose the destination device needs to perform a multiplication using data from both long-distance and short-distance traffic. If short-distance traffic occupies more bandwidth, it delivers more data in a given amount of time, while long-distance traffic, occupying less bandwidth, delivers less. After receiving the data from the short-distance traffic, the destination device still has to wait for the corresponding data from the long-distance traffic to arrive before performing the multiplication, which reduces the data processing efficiency of the destination device.
[0092] In some application scenarios, for network congestion control, forwarding devices in the network can create multiple buffer queues for communication ports when forwarding data. Based on service priorities, the same buffer queue can be used to transmit data streams of multiple services with the same priority, or in other words, to transmit multiple data streams with the same priority. Different buffer queues are used to transmit data streams of services with different priorities. The priority of each buffer queue can be determined based on the priority of the service associated with the transmitted data stream. The service priority can be determined based on the service's class of service (COS). Different service types have different priorities; therefore, a buffer queue can also be called a COS priority queue. The forwarding device can set up an independent buffer space for each buffer queue to store the packets in the corresponding buffer queue.
[0093] For example, Figure 2 shows a schematic diagram of the buffer queues in forwarding device M3. As shown in Figure 2, forwarding device M3 has eight buffer queues for the communication port, designated as buffer queues Q0 to Q7. Each buffer queue has its own independent buffer space: buffer space G0 corresponds to buffer queue Q0, G1 to Q1, G2 to Q2, G3 to Q3, G4 to Q4, G5 to Q5, G6 to Q6, and G7 to Q7. Because each buffer queue has its own independent buffer space, the queues do not interfere with each other, and congestion in one buffer queue will not affect the data flows transmitted in other buffer queues. However, when long-distance and short-distance traffic have the same priority and are transmitted through the same buffer queue in forwarding device M3, bandwidth contention can still occur, leading to an imbalance in bandwidth usage.
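The queue arrangement described above can be illustrated with a minimal Python sketch. It is not the forwarding device's actual implementation: the class name `ForwardingPort`, the per-queue `capacity` limit, and the drop-on-full behavior are assumptions introduced only for illustration. It shows that a full queue does not affect other queues, while flows sharing one queue still compete for the same buffer space.

```python
from collections import deque


class ForwardingPort:
    """Hypothetical model of one communication port with per-priority buffer queues."""

    def __init__(self, num_queues=8, capacity=4):
        # One independent buffer space (G0..G7) per buffer queue (Q0..Q7).
        self.queues = [deque() for _ in range(num_queues)]
        self.capacity = capacity  # assumed packet limit per buffer space

    def enqueue(self, priority, packet):
        """Store a packet in the queue for its priority; return False if that queue is full."""
        queue = self.queues[priority]
        if len(queue) >= self.capacity:
            return False  # assumption: a full buffer space rejects the packet
        queue.append(packet)
        return True

    def dequeue(self, priority):
        """Read the oldest packet of the given queue (first in, first out)."""
        queue = self.queues[priority]
        return queue.popleft() if queue else None
```

In this sketch, long-distance and short-distance packets with the same priority land in the same queue, so whichever flow sends faster fills the shared buffer space first.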
[0094] Based on this, embodiments of this application provide a communication system, which may include a first network interface card (NIC), a second NIC, and a third NIC. The first NIC, the second NIC, and the third NIC may be NICs in the communication devices shown in Figure 1. The transmission distance between the first NIC and the second NIC is greater than or equal to a first distance threshold, and the transmission distance between the second NIC and the third NIC is less than or equal to a second distance threshold. The first distance threshold is greater than or equal to the second distance threshold. The first NIC can be used to increase the transmission rate of the first data stream to the second NIC according to the round-trip time (RTT) of the packets contained in the first data stream. The third NIC can be used to increase the transmission rate of the second data stream to the second NIC according to the RTT of the packets contained in the second data stream. The RTT of the packets contained in the first data stream is greater than the RTT of the packets contained in the second data stream. The second NIC can be used to send first rate-limiting information to the first NIC when the transmission rate of the first data stream is greater than or equal to a first rate threshold. The first NIC can also be used to reduce the transmission rate of the first data stream based on the first rate-limiting information. The second NIC can also be used to send second rate-limiting information to the third NIC when the transmission rate of the second data stream is greater than or equal to a second rate threshold. The third NIC can also be used to reduce the transmission rate of the second data stream based on the second rate-limiting information. 
The communication system provided in this application adopts different transmission rate adjustment strategies for different data streams at different transmission distances, which can prevent some data streams from occupying too much bandwidth and help to make the bandwidth usage of data streams at different transmission distances more balanced.
[0095] In a communication system, each communication device can transmit data through a network interface card (NIC). Figure 3 shows an example of the interaction between NICs during data transmission. As shown in Figure 3, the data transmission process may include the following steps:
[0096] S301, the first network card sends packets of the first data stream to the second network card.
[0097] The first network interface card (NIC) sends the first data stream packets to the second NIC at the transmission rate of the first data stream. The transmission distance between the first and second NICs is greater than or equal to a first distance threshold, which can be 1 km, 10 km, or 50 km. For example, the first and second NICs can be located in different local area networks (LANs), different data centers, or data centers in different Availability Zones (AZs). The transmission distance between different AZs can be between 65 km and 150 km. If communication is via fiber optic cable, the transmission latency between two NICs located in different AZs is approximately 1 ms.
[0098] For the first data stream, the source NIC is the first NIC, and the destination NIC is the second NIC. Both the first and second NICs can be NICs that support the RDMA protocol. For example, the first NIC can be the NIC of communication device K1 in data center 1 shown in Figure 1, and the second NIC can be the NIC of communication device N3 in data center 2 shown in Figure 1. When communication device K1 is performing the first service and has data that needs to be transmitted to communication device N3, the data can be transmitted through the first NIC in communication device K1 and the second NIC in communication device N3. The first NIC can generate packets based on the data to be transmitted. These packets can conform to the RDMA protocol. The first NIC transmits the data to the second NIC in these packets, forming the first data stream.
[0099] During the execution of the first service by communication device K1, when the first network card (NIC) first sends a message to the second NIC, it can send the first data stream message to the second NIC according to a pre-set transmission rate of the first data stream. Then, the transmission rate of the first data stream can be adjusted according to the data transmission status of the communication network. The transmission rate of the first data stream can be used to limit the size of the data transmission window for the first NIC to send the first data stream messages. A larger data transmission window allows for more first data stream messages to be sent per unit time, while a smaller data transmission window allows for fewer first data stream messages to be sent per unit time.
[0100] As shown in Figure 4, the packets of the first data stream sent by the first NIC (the NIC in communication device K1 of data center 1) can be transmitted to the second NIC (the NIC in communication device N3 of data center 2) through forwarding devices M1, M2, and M3. Forwarding devices M1, M2, and M3 can be switches. Taking forwarding device M3 as an example, assume that it forwards the packets of the first data stream through buffer queue Q1 of the communication port and has a buffer space corresponding to buffer queue Q1. Forwarding device M3 saves the received packets of the first data stream to that buffer space, which can also hold packets of other data streams. Forwarding device M3 can read packets from the buffer space corresponding to buffer queue Q1 in the order in which they were stored and send each packet to its next-hop device. For a packet of the first data stream, forwarding device M1 can send the packet received from the first NIC to forwarding device M2. Forwarding device M2 sends the packet to forwarding device M3 in data center 2, and forwarding device M3 then sends the packet to the destination NIC, that is, the second NIC of communication device N3 in data center 2.
[0101] S302, the first network card increases the transmission rate of the first data stream according to the round-trip delay of the packets contained in the first data stream.
[0102] During the continuous transmission of the first data stream, the first network interface card (NIC) can adjust the transmission rate of the first data stream according to network transmission conditions. For example, the first NIC can adjust the transmission rate of the first data stream once at a first predetermined time interval. The first predetermined time interval can be the round-trip time delay of the packets contained in the first data stream, or it can be proportional to the round-trip time delay of the packets contained in the first data stream. For instance, the first NIC can adjust the transmission rate of the first data stream once each time it receives a feedback message for a packet in the first data stream.
[0103] If the first network card does not obtain rate reduction information for the first data stream within the current first set time period, the first network card can increase the transmission rate of the first data stream according to the round-trip time of the packets contained in the first data stream.
[0104] In some embodiments, the round-trip time (RTT) of the packets included in the first data stream can be the RTT of any packet in the first data stream. The RTT of a packet refers to the duration between the sending time of the packet and the receiving time of its acknowledgment (ACK) packet, also referred to here as its feedback packet. For example, each packet sent by the first NIC carries a flow ID and a packet sequence number. The flow ID identifies the data stream to which the packet belongs. In one embodiment, a first packet in the first data stream carries the flow ID of the first data stream and a packet sequence number. When sending the first packet, the first NIC can save the correspondence between the packet sequence number and the sending time, that is, save the sending time of the first packet. When the second NIC receives the first packet, it generates a feedback packet based on the packet sequence number carried in the first packet and sends the feedback packet to the first NIC. Upon receiving the feedback packet, the first NIC determines the reception time of the feedback packet, retrieves the sending time of the first packet based on the packet sequence number carried in the feedback packet, and calculates the RTT of the first packet as the difference between the reception time and the sending time. In another embodiment, the first packet may carry both the packet sequence number and the sending time. When the second NIC receives the first packet, it generates a feedback packet based on the packet sequence number and sending time carried in the first packet, and sends the feedback packet to the first NIC. Upon receiving the feedback packet, the first NIC determines the reception time of the feedback packet, obtains the sending time of the first packet from the feedback packet, and calculates the RTT of the first packet as the difference between the reception time of the feedback packet and the sending time of the first packet.
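The two RTT measurement embodiments above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `RttTracker`, the function `rtt_from_echoed_send_time`, and the use of integer microsecond timestamps are hypothetical and do not come from the application.

```python
class RttTracker:
    """First embodiment: the sending NIC stores the sending time per sequence number."""

    def __init__(self):
        self._send_times = {}  # (flow_id, seq) -> sending time in microseconds

    def on_send(self, flow_id, seq, now_us):
        # Save the correspondence between the packet sequence number and the sending time.
        self._send_times[(flow_id, seq)] = now_us

    def on_feedback(self, flow_id, seq, now_us):
        """Return the RTT of the acknowledged packet, or None if it is unknown."""
        sent_us = self._send_times.pop((flow_id, seq), None)
        if sent_us is None:
            return None
        return now_us - sent_us


def rtt_from_echoed_send_time(echoed_send_time_us, now_us):
    """Second embodiment: the feedback packet carries the original sending time back."""
    return now_us - echoed_send_time_us
```

The first variant keeps state on the sending NIC, while the second carries the timestamp in the packet and its feedback packet, so the sender needs no lookup table.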
[0105] In other embodiments, the round-trip time (RTT) of the packets contained in the first data stream can be the average of the RTTs of multiple packets contained in the first data stream. For example, the first network interface card (NIC) can determine the RTT of the packets corresponding to multiple feedback packets received within a current first preset time period, thereby obtaining the RTT of multiple packets. The RTT of the packets contained in the first data stream can be the average of the obtained RTTs of the multiple packets.
[0106] In other embodiments, the round-trip time (RTT) of the packets contained in the first data stream can be the historical RTT of the packets contained in the first data stream. For example, the first network interface card (NIC) is provided with a register that stores the historical RTT of the packets in the first data stream; the historical RTT can also be called the historical minimum RTT. For a second packet in the first data stream, when a feedback packet of the second packet is received, the RTT of the second packet can be obtained based on the time difference between the reception time of the feedback packet and the transmission time of the second packet. The second packet can be any packet in the first data stream. If the RTT of the second packet is less than the currently stored historical minimum RTT, the first NIC can update the historical minimum RTT using the RTT of the second packet; that is, the RTT of the second packet is the new historical minimum RTT.
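The averaged RTT and the historical minimum RTT described in the two preceding paragraphs can be kept side by side, as in the sketch below. The class name `RttStats` and its methods are hypothetical, and a Python attribute stands in for the register mentioned in the text.

```python
class RttStats:
    """Hypothetical per-flow RTT bookkeeping on the sending NIC."""

    def __init__(self):
        self.period_samples = []    # RTTs observed in the current preset time period
        self.historical_min = None  # stands in for the register holding the minimum RTT

    def add_sample(self, rtt_us):
        self.period_samples.append(rtt_us)
        # Update the historical minimum only when a smaller RTT is observed.
        if self.historical_min is None or rtt_us < self.historical_min:
            self.historical_min = rtt_us

    def average_and_reset(self):
        """Return the mean RTT of the current period and start a new period."""
        if not self.period_samples:
            return None
        average = sum(self.period_samples) / len(self.period_samples)
        self.period_samples = []
        return average
```

The average reflects current queuing along the path, whereas the historical minimum approximates the delay of the uncongested path.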
[0107] The first network interface card (NIC) can increase the transmission rate of the first data stream according to the round-trip time (RTT) of the packets contained in the first data stream, where the RTT of the packets contained in the first data stream can be less than or equal to a first delay threshold. In some embodiments, when the RTT of the packets contained in the first data stream is the RTT of any packet in the first data stream, or the average of the RTTs of multiple packets contained in the first data stream, the first delay threshold can be determined based on the transmission distance and information transmission speed between the first NIC and the second NIC. The first delay threshold can be less than the ratio of the transmission distance to the information transmission speed, wherein the information transmission speed is related to the characteristics of the transmission medium between the NICs. In other embodiments, when the RTT of the packets contained in the first data stream is the minimum RTT of the packets contained in the first data stream, the first delay threshold can be the historical minimum RTT stored in the first NIC.
[0108] For example, the round-trip time (RTT) of the packets contained in the first data stream can be denoted as the minimum RTT of the first data stream. The first network interface card (NIC) stores a base value and a base RTT value, both preconfigured by the user. The first NIC can determine the growth rate of the first data stream based on the minimum RTT, the base value, and the base RTT value. The formula for determining the growth rate of the first data stream can be expressed as:
[0109] The growth rate of the first data stream = base * (minimum RTT of the first data stream / base RTT)
[0110] After determining the growth rate of the first data stream, the first network interface card (NIC) can increase the transmission rate of the first data stream according to the growth rate. For example, the first NIC can enlarge the data transmission window of the first data stream according to the growth rate. Assume the current data transmission window size of the first data stream is L, meaning that L packets can be transmitted at a time. If the next packet to be transmitted is numbered N+1, then sending at the current transmission rate (window size L) allows packets numbered N+1 to N+L to be sent. Now assume the growth rate of the first data stream is L1. If the data transmission window is increased by L1, the window size becomes L+L1, and with the next packet numbered N+1, packets numbered N+1 to N+L+L1 can be sent. Increasing the transmission rate of the first data stream therefore enlarges its data transmission window and increases the number of packets that can be sent each time.
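The growth-rate formula and the window arithmetic above can be worked through in a short Python sketch. The function names are hypothetical, and truncating the growth rate to a whole number of packets is an assumption that the text does not specify.

```python
def growth_rate(base, min_rtt, base_rtt):
    """Growth rate = base * (minimum RTT of the data stream / base RTT)."""
    if base_rtt <= 0:
        raise ValueError("base_rtt must be positive")
    return base * (min_rtt / base_rtt)


def increase_window(window, growth):
    """Enlarge the data transmission window by the growth rate (whole packets, assumed)."""
    return window + int(growth)


def sendable_range(last_sent_seq, window):
    """With the last sent packet numbered N, packets N+1 to N+window can be sent."""
    return last_sent_seq + 1, last_sent_seq + window
```

For example, with N = 100 and L = 8, packets 101 to 108 can be sent. With base = 2, base RTT = 10 and minimum RTT = 20 (same time unit), the growth rate L1 is 4, the window becomes 12, and packets 101 to 112 can be sent. Because the growth rate scales with the minimum RTT, a flow with a longer RTT takes a larger step per adjustment under this formula.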
[0111] In some embodiments, the first network interface card (NIC) can add a first flow type identifier to the packets of the first data stream to be sent, based on the round-trip time (RTT) of the packets of the first data stream that have already been sent. The first flow type identifier indicates that the transmission distance between the source NIC and the destination NIC of the packet is greater than or equal to a first distance threshold. For example, the flow type identifier can occupy 1 bit. Because the transmission distance between the first NIC (the source NIC of the first data stream) and the second NIC (the destination NIC of the first data stream) is greater than or equal to the first distance threshold, the RTT of the packets of the first data stream is relatively long and exceeds a set threshold. Since the RTT of the already sent packets of the first data stream exceeds the set threshold, the first NIC can add the flow type identifier "1" to the packets of the first data stream to be sent. The flow type identifier "1" indicates that the transmission distance between the source NIC and the destination NIC of the packet is greater than or equal to the first distance threshold.
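A 1-bit flow type identifier of this kind could be set as in the following sketch. The bit position, the flags byte, and the choice to clear the bit for flows whose RTT does not exceed the threshold are all assumptions made for illustration; the text only states that the identifier can occupy 1 bit and that "1" marks long-distance flows.

```python
LONG_DISTANCE_FLAG = 0x01  # assumed position of the 1-bit flow type identifier


def tag_flow_type(flags, measured_rtt_us, rtt_threshold_us):
    """Return the flags byte with the flow type bit set if the RTT exceeds the threshold."""
    if measured_rtt_us > rtt_threshold_us:
        return flags | LONG_DISTANCE_FLAG
    # Assumption: flows at or below the threshold carry the bit cleared.
    return flags & ~LONG_DISTANCE_FLAG & 0xFF


def is_long_distance(flags):
    """Check the flow type bit, as a receiving NIC might when grouping traffic."""
    return bool(flags & LONG_DISTANCE_FLAG)
```

A receiving NIC can then classify each data stream by reading this bit instead of measuring distance itself.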
[0112] If, within the current first set time period, the first network interface card (NIC) receives information indicating a rate reduction in the first data stream, the first NIC can reduce the transmission rate of the first data stream according to a set rate reduction margin. The rate reduction information for the first data stream may include one or more of the following:
[0113] 1. Information that the round-trip time of a message in the first data stream is greater than or equal to a second delay threshold within a first set time period.
[0114] If, within the current first set time period, the first network interface card (NIC) determines that the round-trip time (RTT) of a packet in the first data stream is greater than or equal to a second delay threshold, then the first NIC considers it to have acquired the rate-reduction information for the first data stream and can reduce the transmission rate of the first data stream according to the set rate-reduction margin. The second delay threshold can be determined based on the transmission distance and information transmission speed between the source and destination NICs of the first data stream, and the second delay threshold can be greater than the ratio of transmission distance to information transmission speed. If the RTT of a packet in the first data stream is greater than or equal to the second delay threshold, it indicates that the transmission path of the first data stream may be congested, causing an increase in the transmission time of packets or their feedback packets. In this case, the first NIC can reduce the transmission rate of the first data stream.
[0115] 2. Congestion information of the first data stream sent by the forwarding device of the first data stream.
[0116] In this configuration, the forwarding device for the first data stream is located between the first network interface card (NIC) and the second NIC. For example, the first NIC in data center 1 transmits the packets of the first data stream to the second NIC in data center 2 via forwarding devices M1, M2, and M3. Forwarding devices M1, M2, and M3 are the forwarding devices for the first data stream. Taking forwarding device M1 as an example, if congestion occurs at forwarding device M1, it can directly send congestion information of the first data stream to the first NIC, without the second NIC needing to detect congestion in the transmission path before sending congestion information to the first NIC. By having the forwarding devices send the congestion information of the first data stream to the first NIC, the first NIC can receive the transmission path congestion notification earlier, respond and adjust more quickly, and avoid losing more packets due to transmission path congestion.
[0117] When the first network card receives congestion information for the first data stream from any forwarding device of the first data stream, it can reduce the transmission rate of the first data stream according to the set rate reduction margin.
[0118] 3. Rate limiting information sent by the destination network card of the first data stream.
[0119] The destination network interface card (NIC) of the first data stream is the second NIC. The second NIC can send rate limiting information to the first NIC, and the first NIC can respond to the rate limiting information by reducing the transmission rate of the first data stream. This rate limiting information is the first rate limiting information mentioned below, which will be explained in detail in step S303.
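Taken together, the increase rule and the three kinds of rate reduction information amount to a per-period decision on the sending NIC, which can be sketched as follows. The function name, its parameters, and the lower bound `min_window` are hypothetical; the application does not prescribe this structure.

```python
def adjust_window(window, growth, reduction_margin, *, rtt_samples_us,
                  second_delay_threshold_us,
                  congestion_from_forwarder=False,
                  rate_limit_from_destination=False,
                  min_window=1):
    """Return the new data transmission window for one set time period."""
    # 1. An RTT in this period reached the second delay threshold.
    rtt_exceeded = any(rtt >= second_delay_threshold_us for rtt in rtt_samples_us)
    # 2. A forwarding device reported congestion. 3. The destination NIC sent rate limiting information.
    if rtt_exceeded or congestion_from_forwarder or rate_limit_from_destination:
        # Assumption: the window never drops below min_window.
        return max(min_window, window - reduction_margin)
    # No rate reduction information was obtained: increase the rate.
    return window + growth
```

Any single signal is enough to trigger a reduction, which matches the statement that the rate reduction information may include one or more of the listed items.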
[0120] S303, the second network card sends the first rate limiting information to the first network card.
[0121] During the transmission phase of the first data stream, the second network interface card (NIC) receives the first data stream transmitted by the first NIC and monitors the transmission rate of the first data stream. The transmission rate of the first data stream can be the number of packets of the first data stream received by the second NIC per unit time.
[0122] In some embodiments, when the second network interface card (NIC) detects that the transmission rate of the first data stream is greater than or equal to a first rate threshold, it can send first rate limiting information to the first NIC, where the first NIC is the source NIC of the first data stream. Since the first and second NICs are located in different data centers, the first data stream is long-distance traffic. The packets in the first data stream carry a flow type identifier "1," which indicates that the transmission distance between the source and destination NICs of the first data stream is greater than or equal to a first distance threshold. Based on the flow type identifier carried in the packets of the first data stream, the second NIC can determine that the first data stream belongs to a first traffic group. The first traffic group can be a long-distance traffic group, meaning that the transmission distance between the source and destination NICs of each data stream in the first traffic group is greater than or equal to the first distance threshold. The second NIC can obtain the pre-saved rate threshold corresponding to the first traffic group, i.e., the first rate threshold. In one embodiment, when all data streams transmitted by the second network interface card (NIC) belong to the first traffic group, the first rate threshold can be a first value; when the second NIC transmits data streams that do not belong to the first traffic group, i.e., the data streams transmitted by the second NIC also include data streams that do not belong to the first traffic group, the first rate threshold can be a second value; the first value is greater than or equal to the second value. If the transmission rate of the first data stream exceeds the first rate threshold, the second NIC can send first rate limiting information to the first NIC.
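The threshold selection and rate check performed by the second NIC can be sketched as below. The group labels, function names, and the packets-per-second unit are assumptions for illustration; the text only requires that the first value be greater than or equal to the second value.

```python
LONG = "long"    # first traffic group (long-distance)
SHORT = "short"  # any other traffic group


def select_first_rate_threshold(groups_present, first_value, second_value):
    """Pick the rate threshold applied to data streams in the first traffic group."""
    if first_value < second_value:
        raise ValueError("first_value must be >= second_value")
    only_long_distance = all(group == LONG for group in groups_present)
    return first_value if only_long_distance else second_value


def needs_rate_limit(received_packets, interval_s, threshold_pps):
    """Compare the measured receive rate (packets per second) with the threshold."""
    rate_pps = received_packets / interval_s
    return rate_pps >= threshold_pps
```

With a first value of 1000 and a second value of 600 packets per second, a long-distance flow arriving at 700 packets per second is left alone when only long-distance flows are present, but triggers rate limiting information once short-distance flows share the NIC.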
[0123] In other embodiments, when the second network interface card (NIC) detects congestion on the transmission path of the first data stream, it can send first rate-limiting information to the first NIC. For example, the first data stream originates from the first NIC, passes through forwarding devices M1, M2, and M3, and is transmitted to the second NIC. The transmission path of the first data stream can include forwarding devices M1, M2, and M3. If congestion occurs at forwarding device M3, forwarding device M3 can send a congestion notification to the second NIC. The congestion notification can be an explicit congestion notification (ECN) message. The second NIC can generate first rate-limiting information based on the received congestion notification and send the first rate-limiting information to the first NIC. The first rate-limiting information sent by the second NIC in data center 2 can be transmitted to the first NIC in data center 1 through forwarding devices M3, M2, and M1.
[0124] In some embodiments, the first rate limiting information may be a rate limiting feedback message, such as a congestion notification packet (CNP). In other embodiments, the first rate limiting information may be a rate limiting identifier, such as an ECN identifier, carried in the first feedback message, and the first feedback message may be a feedback message to be sent for packets in the first data stream. For example, when the second network interface card (NIC) determines that the transmission rate of the first data stream exceeds a first rate threshold, it may add an ECN identifier to the currently to-be-sent feedback message for packets in the first data stream and send the feedback message to the first NIC.
[0125] S304, the first network card reduces the transmission rate of the first data stream based on the received first rate limiting information.
[0126] When the first network interface card (NIC) receives the first rate-limiting information sent by the second NIC, it can reduce the transmission rate of the first data stream according to a set rate reduction margin. For example, the first NIC can shrink the data transmission window of the first data stream according to the set rate reduction margin. Assume the current data transmission window size of the first data stream is L, meaning that L packets can be transmitted at a time. If the next packet to be transmitted is numbered M+1, then sending at the current transmission rate (window size L) allows packets numbered M+1 to M+L to be sent. Now assume the set rate reduction margin is L2. If the data transmission window is reduced by L2, the window size becomes L-L2, and with the next packet numbered M+1, packets numbered M+1 to M+L-L2 can be sent. Reducing the transmission rate of the first data stream therefore shrinks its data transmission window and reduces the number of packets that can be sent each time.
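The two forms of rate limiting information (a dedicated CNP, or an ECN identifier carried in a feedback packet) and the resulting window reduction can be sketched as follows. The `Incoming` type, its field names, and the `min_window` floor are hypothetical and serve only to illustrate the handling on the first NIC.

```python
from dataclasses import dataclass


@dataclass
class Incoming:
    kind: str                 # "cnp" or "feedback" (assumed labels)
    ecn_marked: bool = False  # ECN identifier carried in a feedback packet


def carries_rate_limit(packet):
    """Rate limiting information is either a CNP or an ECN-marked feedback packet."""
    return packet.kind == "cnp" or (packet.kind == "feedback" and packet.ecn_marked)


def on_receive(window, packet, reduction_margin, min_window=1):
    """Shrink the data transmission window by the set margin when rate limited."""
    if carries_rate_limit(packet):
        return max(min_window, window - reduction_margin)  # floor is an assumption
    return window
```

With L = 16 and L2 = 4, either a CNP or an ECN-marked feedback packet reduces the window to 12, so that after packet M only packets M+1 to M+12 can be sent.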
[0127] S305, the third network card sends the second data stream packets to the second network card according to the transmission rate of the second data stream.
[0128] In this scenario, the transmission distance between the third network interface card (NIC) and the second NIC is less than or equal to a second distance threshold. This second distance threshold can be less than or equal to the first distance threshold mentioned above; for example, it could be 1 kilometer, 800 meters, or 500 meters. Exemplarily, the third and second NICs can be located in the same local area network (LAN) or the same data center. Transmission latency between two NICs in the same data center is within tens of microseconds, which can easily lead to short-distance traffic occupying more than 90% of the bandwidth at the destination NIC when long-distance and short-distance traffic are transmitted simultaneously.
[0129] For the second data stream, the source network interface card (NIC) is the third NIC, and the destination NIC is the second NIC. Both the third and second NICs can be NICs that support the RDMA protocol. For example, the third NIC can be the NIC of communication device N1 in data center 2 shown in Figure 1, and the second NIC can be the NIC of communication device N3 in data center 2 shown in Figure 1. When communication device N1 is performing the second service, if there is data that needs to be transmitted to communication device N3, data transmission can be performed through the third NIC in communication device N1 and the second NIC in communication device N3. The second service is related to the first service mentioned above; for example, the first and second services can be computational services executed in parallel during AI training. The third NIC can generate a message based on the data to be transmitted. This message can be a message conforming to the RDMA protocol. The third NIC can transmit data to the second NIC through the message, forming a second data stream.
[0130] During the execution of the second service by communication device N1, when the third network card sends a message to the second network card for the first time, it can send the second data stream message to the second network card according to the preset transmission rate of the second data stream. Then, the transmission rate of the second data stream can be adjusted according to the data transmission situation of the communication network. The transmission rate of the second data stream can be used to limit the size of the data transmission window for the third network card to send the second data stream message. The larger the data transmission window, the more second data stream messages are sent per unit time; the smaller the data transmission window, the fewer second data stream messages are sent per unit time.
[0131] As shown in Figure 4, packets of the second data stream sent by the third network interface card (NIC) (the NIC in communication device N1 of data center 2) can be transmitted to the second NIC (the NIC in communication device N3 of data center 2) via forwarding device M3. Assume that forwarding device M3 forwards packets of the second data stream through buffer queue Q1 in the communication port. Forwarding device M3 has a buffer space corresponding to buffer queue Q1, and forwarding device M3 saves the received packets of the second data stream to the buffer space corresponding to buffer queue Q1. In addition to saving packets of the second data stream, the buffer space corresponding to buffer queue Q1 can also save packets of the first data stream mentioned above. Forwarding device M3 can read packets from the buffer space corresponding to buffer queue Q1 in the order they were stored in the buffer space, and send them to the next-hop device corresponding to the packet. For packets of the second data stream, forwarding device M3 can send the packets to the destination NIC of the second data stream, i.e., the second NIC of communication device N3 in data center 2.
[0132] S306, the third network card increases the transmission rate of the second data stream according to the round-trip delay of the packets contained in the second data stream.
[0133] During the continuous transmission of the second data stream, the third network interface card (NIC) can adjust the transmission rate of the second data stream according to network transmission conditions. For example, the third NIC can adjust the transmission rate of the second data stream every second set time interval. The second set time interval can be the round-trip time delay of the packets contained in the second data stream, or it can be proportional to the round-trip time delay of the packets contained in the second data stream.
[0134] If the third network card does not obtain the rate reduction information of the second data stream within the current second set time period, the third network card can increase the transmission rate of the second data stream according to the round-trip time of the packets contained in the second data stream.
[0135] In some embodiments, the round-trip time (RTT) of the packets included in the second data stream can be the RTT of any packet in the second data stream. The RTT of any packet in the second data stream refers to the duration between the sending time of the packet and the receiving time of the ACK message of the packet. The process by which the third network interface card (NIC) determines the RTT of any packet in the second data stream can be performed with reference to the process by which the first NIC determines the RTT of any packet in the first data stream in step S302, and will not be described again here.
[0136] In other embodiments, the round-trip time (RTT) of the packets included in the second data stream can be the average of the RTTs of multiple packets included in the second data stream. For example, the third network interface card (NIC) can determine the RTT of the packets corresponding to the multiple feedback packets received within the current second preset time period, thereby obtaining the RTT of the multiple packets. The RTT of the packets included in the second data stream can be the average of the obtained RTTs of the multiple packets.
[0137] In other embodiments, the round-trip time of the messages contained in the second data stream may be the historical minimum round-trip time of the messages contained in the second data stream.
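The three ways of obtaining the round-trip time described above (the RTT of any single packet, the average over packets acknowledged in the current period, and the historical minimum) can be sketched as follows. The class name `RttTracker`, its method names, and the microsecond unit are hypothetical and chosen only for illustration.

```python
from statistics import mean
from typing import List, Optional


class RttTracker:
    """Tracks RTT samples of one data stream (hypothetical helper)."""

    def __init__(self) -> None:
        self.samples_us: List[float] = []           # samples in the current period
        self.historical_min_us: Optional[float] = None

    def on_feedback(self, send_time_us: float, feedback_time_us: float) -> float:
        """RTT is the time between sending a packet and receiving its feedback."""
        rtt = feedback_time_us - send_time_us
        self.samples_us.append(rtt)
        if self.historical_min_us is None or rtt < self.historical_min_us:
            self.historical_min_us = rtt
        return rtt

    def any_packet(self) -> float:
        """RTT of a single packet; the most recent sample is used here."""
        return self.samples_us[-1]

    def average(self) -> float:
        """Average RTT over the feedback received in the current period."""
        return mean(self.samples_us)

    def end_period(self) -> None:
        """Start a new set time period; the historical minimum is kept."""
        self.samples_us.clear()


tracker = RttTracker()
for sent, acked in [(0, 20), (5, 35), (10, 20)]:
    tracker.on_feedback(sent, acked)
print(tracker.any_packet(), tracker.average(), tracker.historical_min_us)
```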
[0138] The third network interface card (NIC) can increase the transmission rate of the second data stream according to the round-trip time (RTT) of the packets contained in the second data stream, where the RTT of the packets contained in the second data stream can be less than or equal to a third delay threshold. In some embodiments, when the RTT of the packets contained in the second data stream is the RTT of any packet in the second data stream, or the average of the RTTs of multiple packets contained in the second data stream, the third delay threshold can be determined based on the transmission distance and information transmission speed between the third NIC and the second NIC. The third delay threshold can be less than the ratio of the transmission distance to the information transmission speed, wherein the information transmission speed is related to the characteristics of the transmission medium between the NICs. In other embodiments, when the RTT of the packets contained in the second data stream is the minimum RTT of the packets contained in the second data stream, the third delay threshold can be the historical minimum RTT stored in the third NIC.
[0139] For example, the round-trip time (RTT) of the packets contained in the second data stream can be denoted as the minimum RTT of the second data stream. The third network interface card (NIC) stores the user-preconfigured base value and base RTT value. The third NIC can determine the growth rate of the second data stream based on the minimum RTT, base, and base RTT values. The formula for determining the growth rate of the second data stream can be expressed as:
[0140] The growth rate of the second data stream = base * (minimum RTT of the second data stream / base RTT)
[0141] After determining the growth rate of the second data stream, the third network interface card (NIC) can increase the transmission rate of the second data stream according to the growth rate. For example, the third NIC can increase the data transmission window of the second data stream according to the growth rate. This step can be performed with reference to the step of the first NIC increasing the data transmission window of the first data stream, and will not be described again here.
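The growth rate formula in paragraph [0140] and its application to the data transmission window can be sketched as below. Expressing the window in packets, the parameter names, and the example values are assumptions; the application only states that base and base RTT are preconfigured by the user.

```python
def growth_rate(base: float, min_rtt_us: float, base_rtt_us: float) -> float:
    """growth rate = base * (minimum RTT / base RTT)."""
    if base_rtt_us <= 0:
        raise ValueError("base RTT must be positive")
    return base * (min_rtt_us / base_rtt_us)


def increase_window(window: float, base: float,
                    min_rtt_us: float, base_rtt_us: float) -> float:
    """Enlarge the data transmission window by the growth rate."""
    return window + growth_rate(base, min_rtt_us, base_rtt_us)


# Hypothetical values: base = 2, base RTT = 10 us, minimum RTT = 20 us.
print(growth_rate(2, 20, 10))          # 4.0
print(increase_window(16, 2, 20, 10))  # 20.0
```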
[0142] In this embodiment, the first network interface card (NIC) increases the transmission rate of the first data stream according to the round-trip time (RTT) of the packets contained in the first data stream. Since the RTT of the first data stream is relatively large, increases to the first data stream occur relatively infrequently, but the magnitude of each increase is relatively large. Similarly, the third NIC increases the transmission rate of the second data stream according to the RTT of the packets contained in the second data stream. Since the RTT of the second data stream is relatively small, increases to the second data stream occur relatively frequently, but the magnitude of each increase is relatively small. This facilitates a more balanced bandwidth utilization between the first and second data streams, preventing the second data stream from consuming excessive bandwidth. In the buffer queue Q1 of the forwarding device M3, the bandwidth occupied by the first and second data streams is also relatively balanced.
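The balancing effect described above can be checked with a small calculation. This is a sketch under stated assumptions: both data streams use the growth formula base * (minimum RTT / base RTT), each stream is adjusted once per round-trip time, and no rate reduction occurs during the observed period. All numeric values are hypothetical.

```python
from fractions import Fraction


def total_increase(duration_us: int, rtt_us: int, base: int, base_rtt_us: int):
    """Return (number of increases, size of each increase, total increase).

    Assumes one adjustment per RTT and no rate reduction in the period.
    """
    steps = duration_us // rtt_us
    per_step = Fraction(base * rtt_us, base_rtt_us)
    return steps, per_step, steps * per_step


# Observed for one second with base = 1 and base RTT = 10 us.
long_flow = total_increase(1_000_000, 10_000, 1, 10)  # 10 ms RTT
short_flow = total_increase(1_000_000, 20, 1, 10)     # 20 us RTT
print(long_flow)
print(short_flow)
```

With these values the long-distance stream makes 100 increases of 1000 each, and the short-distance stream makes 50000 increases of 2 each, so both accumulate the same total increase of 100000 over the period.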
[0143] In some embodiments, the third network interface card (NIC) can add a second stream type identifier to the packets of the second data stream to be sent, based on the round-trip time (RTT) of the already sent second data stream packets. The second stream type identifier indicates that the transmission distance between the source NIC and the destination NIC of the packet is less than or equal to a second distance threshold. For example, the stream type identifier can occupy 1 bit. Since the transmission distance between the third NIC (the source NIC of the second data stream) and the second NIC (the destination NIC of the second data stream) is less than or equal to the second distance threshold, the RTT of the second data stream packets is short and does not exceed the set threshold. Because the RTT of the already sent second data stream packets does not exceed the set threshold, the third NIC can add a stream type identifier "0" to the packets of the second data stream to be sent. The stream type identifier "0" indicates that the transmission distance between the source NIC and the destination NIC of the packet is less than or equal to the second distance threshold.
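A minimal sketch of how a 1-bit stream type identifier might be derived from the measured round-trip time and written into a packet is given below. The value "1" for long-distance streams, the position of the bit in a flags byte, and all function names are assumptions; the paragraph above only specifies that "0" marks a stream whose round-trip time does not exceed the set threshold.

```python
SHORT_DISTANCE = 0  # identifier "0" as described above
LONG_DISTANCE = 1   # assumed complementary value


def stream_type_identifier(rtt_us: float, rtt_threshold_us: float) -> int:
    """Classify a stream by comparing its measured RTT with the set threshold."""
    return LONG_DISTANCE if rtt_us > rtt_threshold_us else SHORT_DISTANCE


def tag_packet(flags: int, identifier: int, bit: int = 0) -> int:
    """Write the 1-bit identifier into a hypothetical header flags byte."""
    return (flags & ~(1 << bit)) | (identifier << bit)


ident = stream_type_identifier(rtt_us=15, rtt_threshold_us=100)
print(ident)                               # 0
print(bin(tag_packet(0b10100001, ident)))  # 0b10100000
```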
[0144] If, within the current second set time period, the third network interface card (NIC) receives information indicating a rate reduction in the second data stream, the third NIC can reduce the transmission rate of the second data stream according to a set rate reduction margin. The rate reduction information for the second data stream may include one or more of the following:
[0145] 1. Information that the round-trip delay of a message in the second data stream is greater than or equal to the fourth delay threshold within the second set time period.
[0146] The fourth delay threshold can be determined based on the transmission distance and information transmission speed between the source and destination network interface cards (NICs) of the second data stream. This fourth delay threshold can be greater than the ratio of transmission distance to information transmission speed. If the round-trip time of a packet in the second data stream is greater than or equal to the fourth delay threshold, it indicates that congestion may occur on the transmission path of the second data stream, increasing the transmission time of packets or their feedback packets. In this case, the third NIC can reduce the transmission rate of the second data stream.
[0147] 2. Congestion information of the second data stream sent by the forwarding device of the second data stream.
[0148] In this configuration, the forwarding device for the second data stream is located between the third network interface card (NIC) and the second NIC. For example, the third NIC in data center 2 transmits the packets of the second data stream to the second NIC in data center 2 over a transmission path that passes through forwarding device M3. If congestion occurs at forwarding device M3, forwarding device M3 can directly send congestion information of the second data stream to the third NIC, without the second NIC needing to detect congestion on the transmission path before sending congestion information to the third NIC. For example, as shown in Figure 4, assuming that at a certain moment both the first and third NICs are sending packets at maximum throughput, forwarding device M3 needs to forward twice the maximum throughput. The instantaneous traffic arriving at forwarding device M3 therefore often exceeds its forwarding capacity, making traffic congestion and queuing inevitable. If congestion occurs at forwarding device M3, it can directly send congestion information to the first and third NICs.
[0149] Taking the second data stream as an example, as shown in Figure 5, while the third network card sends packets of the second data stream to the second network card through the forwarding device M3, the forwarding device M3 forwards the packets of the second data stream through the buffer queue Q1. The forwarding device M3 has a buffer space G1 corresponding to the buffer queue Q1, and saves the packets of the second data stream sent by the third network card to the buffer space G1. In addition to the packets of the second data stream, the buffer space G1 can also save the packets of the first data stream mentioned above. If the forwarding device M3 detects that the number of packets saved in the buffer space G1 reaches a set number threshold, which can be called the ECN threshold, the forwarding device M3 determines that the buffer queue Q1 is congested. The forwarding device M3 can then send congestion information of the second data stream to the third network card. This congestion information can be an ECN congestion packet. Because the congestion information of the second data stream is sent directly from the forwarding device to the third network card, the third network card can receive the transmission path congestion notification earlier, respond and adjust more quickly, and avoid losing more packets due to transmission path congestion.
[0150] When the third network card receives congestion information about the second data stream sent by the forwarding device, it can reduce the transmission rate of the second data stream according to the set reduction rate.
[0151] 3. Rate limiting information sent by the destination network card of the second data stream.
[0152] The destination network interface card (NIC) of the second data stream is the second NIC. The second NIC can send rate limiting information to the third NIC, and the third NIC can respond to the rate limiting information by reducing the transmission rate of the second data stream. This rate limiting information is the second rate limiting information mentioned below, which will be explained in detail in step S307.
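The congestion detection at the forwarding device described in paragraph [0149] can be sketched as follows: the buffer queue compares the number of buffered packets with the ECN threshold and, once it is reached, reports which streams should be notified. The class name `BufferQueue`, the stream labels, and the choice to notify every stream currently held in the queue are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class BufferQueue:
    """Hypothetical model of buffer queue Q1 and its buffer space."""

    def __init__(self, ecn_threshold: int) -> None:
        self.ecn_threshold = ecn_threshold
        self.packets: Deque[Tuple[str, int]] = deque()

    def enqueue(self, stream: str, seq: int) -> List[str]:
        """Store a packet; return the streams to notify if the queue is congested."""
        self.packets.append((stream, seq))
        if len(self.packets) >= self.ecn_threshold:
            # Congested: notify the source NIC of every stream in the queue.
            return sorted({s for s, _ in self.packets})
        return []

    def dequeue(self) -> Optional[Tuple[str, int]]:
        """Forward packets in the order in which they were stored."""
        return self.packets.popleft() if self.packets else None


q1 = BufferQueue(ecn_threshold=3)
print(q1.enqueue("first", 1))   # []
print(q1.enqueue("second", 1))  # []
print(q1.enqueue("second", 2))  # ['first', 'second']
```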
[0153] S307, the second network card sends the second rate limiting information to the third network card.
[0154] During the transmission phase of the second data stream, the second network interface card (NIC) receives the second data stream transmitted by the third NIC and monitors the transmission rate of the second data stream. The transmission rate of the second data stream can be the number of packets in the second data stream received by the second NIC per unit time.
[0155] In some embodiments, when the second network interface card (NIC) detects that the transmission rate of the second data stream is greater than or equal to a second rate threshold, it can send second rate limiting information to the third NIC, which is the source NIC of the second data stream. Since the third NIC and the second NIC are located in the same data center, the second data stream is short-range traffic. The packets in the second data stream carry a flow type identifier "0," which indicates that the transmission distance between the source NIC and the destination NIC of the second data stream is less than or equal to the second distance threshold. Based on the flow type identifier carried in the packets of the second data stream, the second NIC can determine that the second data stream belongs to a second traffic group. The second traffic group can be a short-range traffic group, meaning that the transmission distance between the source NIC and the destination NIC of each data stream in the second traffic group is less than or equal to the second distance threshold. The second NIC can obtain the pre-saved rate threshold corresponding to the second traffic group, i.e., the second rate threshold. In one embodiment, when all data streams transmitted by the second network interface card (NIC) belong to the second traffic group, the second rate threshold can be a third value; when the second NIC transmits data streams that do not belong to the second traffic group, i.e., the data streams transmitted by the second NIC also include data streams that do not belong to the second traffic group, the second rate threshold can be a fourth value; the third value is greater than or equal to the fourth value. If the transmission rate of the second data stream exceeds the second rate threshold, the second NIC can send a second rate limiting message to the third NIC.
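The selection of the second rate threshold and the rate check at the destination NIC can be sketched as below. The function names and numeric values are hypothetical, and the comparison uses "greater than or equal to", following the first sentence of the paragraph above.

```python
from typing import Iterable

SHORT_DISTANCE = 0
LONG_DISTANCE = 1


def second_rate_threshold(active_stream_types: Iterable[int],
                          third_value: float, fourth_value: float) -> float:
    """Use the third value if every stream is short-distance, else the fourth."""
    if third_value < fourth_value:
        raise ValueError("third value must be >= fourth value")
    only_short = all(t == SHORT_DISTANCE for t in active_stream_types)
    return third_value if only_short else fourth_value


def needs_rate_limit(measured_rate: float, threshold: float) -> bool:
    """True when rate limiting information should be sent to the source NIC."""
    return measured_rate >= threshold


# Hypothetical rates in packets per unit time.
mixed = second_rate_threshold([SHORT_DISTANCE, LONG_DISTANCE], 100, 60)
short_only = second_rate_threshold([SHORT_DISTANCE, SHORT_DISTANCE], 100, 60)
print(mixed, needs_rate_limit(80, mixed))            # 60 True
print(short_only, needs_rate_limit(80, short_only))  # 100 False
```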
[0156] In other embodiments, when the second network interface card (NIC) detects congestion on the transmission path of the second data stream, it can send second rate-limiting information to the third NIC. For example, the second data stream originates from the third NIC and is transmitted to the second NIC via forwarding device M3. If congestion occurs at forwarding device M3, it can send a congestion notification to the second NIC. The second NIC can generate second rate-limiting information based on the received congestion notification and send it to the third NIC. The second rate-limiting information sent by the second NIC in data center 2 can be transmitted to the third NIC in data center 2 via forwarding device M3.
[0157] In some embodiments, the first rate threshold corresponding to the first traffic group may be less than or equal to the second rate threshold corresponding to the second traffic group. When the transmission rates of the first data stream and the second data stream are similar and both are close to the boundary of the rate threshold, the second network card can send the first rate limiting information to the first network card earlier to offset the longer transmission time required for the first rate limiting information to be transmitted to the first network card. The first network card reduces the transmission rate of the first data stream based on the first rate limiting information sent by the second network card, and the third network card reduces the transmission rate of the second data stream based on the second rate limiting information sent by the second network card, which is beneficial to the synchronous reduction of the transmission rates of the first data stream and the second data stream.
[0158] In some embodiments, the second rate limiting information may be a rate limiting feedback message. In other embodiments, the second rate limiting information may be a rate limiting identifier, such as an ECN identifier, carried in a second feedback message, which may be a feedback message to be sent for a packet in the second data stream. For example, when a second network interface card (NIC) determines that the transmission rate of the second data stream exceeds a second rate threshold, it may add an ECN identifier to the currently to-be-sent feedback message for a packet in the second data stream and send the feedback message to the third NIC.
[0159] S308, the third network card reduces the transmission rate of the second data stream based on the received second rate limiting information.
[0160] When the third network card receives the second rate-limiting information sent by the second network card, it can reduce the transmission rate of the second data stream according to the set rate reduction range.
[0161] In the above embodiments, the execution order of each step can be interchanged. For example, steps S303 and S304 can be executed before step S302, steps S307 and S308 can be executed before step S306, and step S305 can be executed before step S301 or synchronously with step S301. The embodiments of this application do not limit the execution order of each step.
[0162] To make it easier to understand, the flow control methods performed by the source network interface card (NIC) and the destination NIC are described below.
[0163] Figure 6 is a flowchart of a flow control method performed by a source-side network interface card (NIC). The source-side NIC can be any communication device in a communication system or a NIC configured within a communication device. Figure 6 is described using the NIC in communication device K1 in Figure 1 as the source NIC. As shown in Figure 6, the flow control method executed by the source NIC may include the following steps:
[0164] S601, the source network card sends the first data stream message to the destination network card of the first data stream according to the transmission rate of the first data stream.
[0165] For the first data stream, the source network interface card (NIC) can be the NIC in communication device K1 in data center 1, and the destination NIC of the first data stream can be the NIC in communication device N3 in data center 2 shown in Figure 1. For example, as shown in Figure 7, when communication device K1 performs the first service, if there is data that needs to be transmitted to communication device N3, the NIC in communication device K1 can generate a message based on the data to be transmitted and transmit the data to the NIC in communication device N3 through the message, forming the first data stream. The first data stream belongs to long-distance traffic. Taking the case where the NIC in communication device K1, i.e., the source NIC, is an RDMA NIC as an example, as shown in Figure 8, the RDMA NIC includes an RDMA protocol processing unit, which comprises a transport layer processing module and a congestion control module. The transport layer processing module generates packets for the first data stream based on the data to be transmitted, and maintains the transport layer state of the RDMA protocol, including updating packet sequence numbers, transport layer flow control, and processing transport layer feedback packets. The congestion control module controls the sending of the first data stream packets according to the transmission rate of the first data stream.
[0166] During the execution of the first service, when the source network card sends a message to the destination network card of the first data stream for the first time, the congestion control module can control the transport layer processing module to send the message of the first data stream according to the preset transmission rate of the first data stream. Then, it can adjust the transmission rate of the first data stream according to the data transmission situation of the communication network.
[0167] The source network card is the network interface card (NIC) of communication device K1 in data center 1 shown in Figure 1, and the destination NIC of the first data stream is the NIC of communication device N3 in data center 2 shown in Figure 1. The transmission distance between the source NIC and the destination NIC of the first data stream is greater than or equal to a first distance threshold. The packets of the first data stream sent by the source NIC can be transmitted to the destination NIC in data center 2 through forwarding devices M1, M2, and M3. Taking forwarding device M1 as an example, it is assumed that forwarding device M1 forwards the packets of the first data stream through the first buffer queue in the communication port.
[0168] S602, when the source network card does not obtain the speed reduction information of the first data stream, the transmission rate of the first data stream is increased according to the round-trip time of the packets contained in the first data stream.
[0169] During the continuous transmission of the first data stream, the congestion control module in the source network interface card (NIC) can adjust the transmission rate of the first data stream according to the network transmission conditions. For example, the congestion control module in the source NIC can adjust the transmission rate of the first data stream once every first set time interval.
[0170] If the source network card does not obtain the rate reduction information of the first data stream within the current first set time period, the congestion control module in the source network card can increase the transmission rate of the first data stream according to the round-trip time of the packets contained in the first data stream.
[0171] The rate reduction information for the first data stream may include one or more of the following:
[0172] 1. Information that the round-trip time of a message in the first data stream is greater than or equal to a second delay threshold within a first set time period;
[0173] 2. Congestion information of the first data stream sent by the forwarding device of the first data stream;
[0174] 3. Rate limiting information sent by the destination network card of the first data stream.
[0175] In some embodiments, as shown in Figure 8, the RDMA network card also includes a flow classification module. The flow classification module can add a first flow type identifier to the packets of the first data stream to be sent, based on the round-trip delay of the first data stream packets that have already been sent. The first flow type identifier is used to indicate that the transmission distance between the source network card and the destination network card of the packet is greater than or equal to a first distance threshold.
[0176] S603, when the source network card obtains the rate reduction information of the first data stream, it reduces the transmission rate of the first data stream.
[0177] If, within the current first set time period, the source network interface card (NIC) receives rate reduction information of the first data stream, the congestion control module in the source NIC can reduce the transmission rate of the first data stream according to the set rate reduction margin. The congestion control module in the RDMA protocol processing unit of the RDMA NIC also has a congestion coordination processing function. The congestion control module can be used to process the congestion information of the first data stream received by the source NIC from the forwarding device of the first data stream. For example, the congestion control module can reduce the transmission rate of the first data stream based on the congestion information of the first data stream.
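Steps S602 and S603 can be summarized in one per-period decision, sketched below. The sketch assumes that the increase for the first data stream takes the same form, base * (minimum RTT / base RTT), as the formula given for the second data stream, that the rate is represented by a window in packets, and that the window never drops below one packet. The type `StreamState` and the function `adjust` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StreamState:
    window: float            # current data transmission window
    min_rtt_us: float        # minimum RTT of the stream
    base: float              # user-preconfigured base value
    base_rtt_us: float       # user-preconfigured base RTT
    reduction_margin: float  # set rate reduction margin
    min_window: float = 1.0  # assumed lower bound


def adjust(stream: StreamState, rtt_over_threshold: bool,
           congestion_from_forwarder: bool, rate_limit_from_dest: bool) -> float:
    """Run once per set time period; any rate reduction information wins."""
    if rtt_over_threshold or congestion_from_forwarder or rate_limit_from_dest:
        stream.window = max(stream.min_window,
                            stream.window - stream.reduction_margin)
    else:
        stream.window += stream.base * (stream.min_rtt_us / stream.base_rtt_us)
    return stream.window


s = StreamState(window=16, min_rtt_us=20, base=2, base_rtt_us=10, reduction_margin=4)
print(adjust(s, False, False, False))  # 20.0 (no rate reduction information)
print(adjust(s, False, True, False))   # 16.0 (congestion information received)
```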
[0178] S604, the source network card sends the third data stream message to the destination network card of the third data stream according to the transmission rate of the third data stream.
[0179] For the third data stream, the source network interface card (NIC) can be the NIC in communication device K1 in data center 1, and the destination NIC of the third data stream can be the NIC in communication device K3 in data center 1 shown in Figure 1. For example, as shown in Figure 7, when communication device K1 is performing a third service, if there is data that needs to be transmitted to communication device K3, the NIC in communication device K1 can generate a message based on the data to be transmitted, and then transmit the data to the NIC in communication device K3 through the message, forming a third data stream. This third data stream is short-distance traffic. For example, as shown in Figure 8, the RDMA protocol processing unit within the RDMA network card includes a transport layer processing module and a congestion control module. The transport layer processing module can generate packets for the third data stream based on the data to be transmitted, and the congestion control module can control the sending of the third data stream packets according to the transmission rate of the third data stream.
[0180] During the execution of the third service, when the source network card sends a message to the destination network card of the third data stream for the first time, the congestion control module can control the transport layer processing module to send the message of the third data stream according to the preset transmission rate of the third data stream. Then, the transmission rate of the third data stream can be adjusted according to the data transmission situation of the communication network.
[0181] The source network card is the network interface card (NIC) of communication device K1 in data center 1 shown in Figure 1, and the destination NIC of the third data stream is the NIC of communication device K3 in data center 1 shown in Figure 1. The transmission distance between the source NIC and the destination NIC of the third data stream is less than or equal to a second distance threshold. The packets of the third data stream sent by the source NIC can be transmitted to the destination NIC in data center 1 through forwarding device M1. Assume that forwarding device M1 also forwards the packets of the third data stream through the first buffer queue in the communication port.
[0182] S605: When the source network card does not obtain the speed reduction information of the third data stream, the transmission rate of the third data stream is increased according to the round-trip time of the packets contained in the third data stream.
[0183] During the continuous transmission of third data stream packets, the congestion control module in the source network interface card (NIC) can adjust the transmission rate of the third data stream according to network transmission conditions. For example, the source NIC can adjust the transmission rate of the third data stream every second set time interval.
[0184] If the source network interface card (NIC) does not receive the rate reduction information for the third data stream within the current second set time period, the congestion control module in the source NIC can increase the transmission rate of the third data stream according to the round-trip time of the packets contained in the third data stream. The rate reduction information for the third data stream may include one or more of the following:
[0185] 1. Information that the round-trip time of a packet in the third data stream within the second set time period is greater than or equal to the fourth delay threshold;
[0186] 2. Congestion information of the third data stream sent by a forwarding device of the third data stream, such as congestion information of the third data stream sent by forwarding device M1;
[0187] 3. Rate limiting information sent by the destination NIC of the third data stream.
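As a non-limiting illustration, the check for rate reduction information within one set time period could be sketched as follows. The `PeriodObservations` structure, its field names, and the microsecond unit are hypothetical and are introduced only for this example; the three conditions correspond to the three items listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeriodObservations:
    """Signals collected for one data stream during one set time period.

    Hypothetical structure: the field names and units are assumptions.
    """
    max_rtt_us: Optional[float] = None         # largest packet RTT seen in the period, if any
    congestion_from_forwarder: bool = False    # congestion information from a forwarding device
    rate_limit_from_destination: bool = False  # rate limiting information from the destination NIC


def has_rate_reduction_info(obs: PeriodObservations, delay_threshold_us: float) -> bool:
    """Return True if any of the three kinds of rate reduction information is present."""
    # Item 1: some packet's RTT reached the delay threshold. Checking the largest
    # RTT of the period is equivalent to checking every packet individually.
    rtt_exceeded = obs.max_rtt_us is not None and obs.max_rtt_us >= delay_threshold_us
    # Items 2 and 3: explicit signals from the forwarding device or the destination NIC.
    return rtt_exceeded or obs.congestion_from_forwarder or obs.rate_limit_from_destination
```

When the function returns False for the current period, the source NIC would proceed to the rate increase of step S605; otherwise it would reduce the rate.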
[0188] In some embodiments, the round-trip time (RTT) of the packets included in the third data stream can be the RTT of any packet in the third data stream. The RTT of any packet in the third data stream refers to the duration between the sending time of the packet and the receiving time of the feedback packet. The process by which the source network interface card (NIC) determines the RTT of any packet in the third data stream can be performed with reference to the process by which the first NIC determines the RTT of any packet in the first data stream, and will not be described again here.
[0189] In other embodiments, the round-trip time (RTT) of the packets contained in the third data stream can be the average of the RTTs of multiple packets contained in the third data stream. For example, based on the feedback packets of the third data stream received within the current second set time period, the source NIC can determine the RTT of the packet corresponding to each feedback packet, thus obtaining the RTTs of multiple packets. The RTT of the packets contained in the third data stream can then be the average of the obtained RTTs.
[0190] In other embodiments, the round-trip time of the messages contained in the third data stream can be the historical minimum round-trip time of the messages contained in the third data stream.
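The three alternatives described above (the RTT of any single packet, the average over a period, and the historical minimum) could be sketched as follows. This is a minimal sketch: the function names, the `mode` strings, and the choice of the most recent sample for the "any packet" case are assumptions made for illustration.

```python
from statistics import mean
from typing import Sequence


def rtt_of_packet(send_time_us: float, feedback_time_us: float) -> float:
    """RTT of one packet: time from sending it to receiving its feedback packet."""
    return feedback_time_us - send_time_us


def representative_rtt(samples_us: Sequence[float], mode: str = "average") -> float:
    """Pick the RTT value used for rate adjustment from a list of RTT samples.

    For "any" and "average", pass the samples of the current set time period;
    for "historical_min", pass all samples recorded so far for the data stream.
    """
    if not samples_us:
        raise ValueError("no RTT samples available")
    if mode == "any":
        # Any packet's RTT is acceptable; using the latest one is an arbitrary choice.
        return samples_us[-1]
    if mode == "average":
        return mean(samples_us)
    if mode == "historical_min":
        return min(samples_us)
    raise ValueError(f"unknown mode: {mode}")
```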
[0191] The congestion control module in the source NIC can increase the transmission rate of the third data stream based on the round-trip time (RTT) of the packets contained in the third data stream. For example, the RTT of the packets contained in the third data stream can be denoted as the minimum RTT of the third data stream. The source NIC stores a user-preconfigured base value (base) and a user-preconfigured base RTT value (base RTT). The congestion control module in the source NIC can determine the growth rate of the third data stream based on the minimum RTT, the base value, and the base RTT value. The formula for determining the growth rate of the third data stream can be expressed as:
[0192] The growth rate of the third data stream = base * (minimum RTT of the third data stream / base RTT)
[0193] After determining the growth rate of the third data stream, the congestion control module in the source network interface card (NIC) can increase the transmission rate of the third data stream according to its growth rate. For example, the congestion control module in the source NIC can increase the data transmission window of the third data stream according to its growth rate. This step can be performed with reference to the step of increasing the data transmission window of the first data stream in the first NIC, and will not be described again here.
[0194] In this embodiment, the source NIC increases the transmission rate of the first data stream according to the round-trip time (RTT) of the packets contained in the first data stream. Since the RTT of the first data stream is relatively large, the transmission rate of the first data stream is increased relatively infrequently, but the magnitude of each increase is relatively large. Conversely, the source NIC increases the transmission rate of the third data stream according to the RTT of the packets contained in the third data stream. Since the RTT of the third data stream is relatively small, the transmission rate of the third data stream is increased relatively frequently, but the magnitude of each increase is relatively small. This promotes more balanced bandwidth utilization between the first and third data streams. In the first buffer queue of forwarding device M1, the bandwidth occupied by the first and third data streams is also relatively balanced.
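The growth rate formula given above can be worked through with a short sketch. The numeric values of the base value, the base RTT, and the two RTTs are hypothetical. The second function rests on an assumption that is not stated in the text, namely that each data stream applies one increase per RTT; under that assumption the growth per unit time is the same for both streams, which is one way to read the balancing effect described above.

```python
def growth_rate(base: float, min_rtt: float, base_rtt: float) -> float:
    """Growth rate = base * (minimum RTT / base RTT)."""
    if base_rtt <= 0:
        raise ValueError("base_rtt must be positive")
    return base * (min_rtt / base_rtt)


def growth_per_unit_time(base: float, min_rtt: float, base_rtt: float) -> float:
    """Growth accumulated per unit time.

    ASSUMPTION (not stated in the text): one increase is applied per RTT.
    """
    return growth_rate(base, min_rtt, base_rtt) / min_rtt


BASE = 1.0       # hypothetical user-preconfigured base value
BASE_RTT = 10.0  # hypothetical user-preconfigured base RTT, in microseconds

# Short-distance stream with RTT = 10 us, long-distance stream with RTT = 1000 us.
short_growth = growth_rate(BASE, 10.0, BASE_RTT)
long_growth = growth_rate(BASE, 1000.0, BASE_RTT)
```

With these values the long-distance stream receives a step 100 times larger than the short-distance stream, while `growth_per_unit_time` returns the same value for both.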
[0195] S606: When the source network card obtains information about the rate reduction of the third data stream, it reduces the transmission rate of the third data stream.
[0196] If the source network card receives information about the rate reduction of the third data stream within the current second set time period, the congestion control module in the source network card can reduce the transmission rate of the third data stream according to the set rate reduction amount.
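Steps S605 and S606 together amount to one adjustment decision per set time period, which could be sketched as follows. The text specifies only that the data transmission window is increased according to the growth rate and that the rate is reduced by a set amount; the additive increase, the subtractive decrease, and the lower bound `min_window` used here are assumptions for illustration.

```python
def adjust_window(
    window: float,
    reduction_info_received: bool,
    growth: float,
    reduction_amount: float,
    min_window: float = 1.0,
) -> float:
    """Return the data transmission window to use for the next set time period."""
    if reduction_info_received:
        # S606: rate reduction information was obtained, so reduce by the set amount.
        # The floor at min_window is an assumption that keeps the stream from stalling.
        return max(min_window, window - reduction_amount)
    # S605: no rate reduction information, so grow by the RTT-dependent growth rate.
    return window + growth
```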
[0197] In the above embodiments, the execution order of each step can be interchanged. For example, step S503 can be executed before step S502, step S506 can be executed before step S505, and step S504 can be executed before step S501 or synchronously with step S501. The embodiments of this application do not limit the execution order of each step.
[0198] Figure 9 is a flowchart of a flow control method performed by a destination network interface card (NIC). The destination NIC can be any communication device in the communication system or a NIC configured within a communication device. Figure 9 is described using the NIC in communication device N3 shown in Figure 1 as the destination NIC. As shown in Figure 9, the flow control method performed by the destination NIC may include the following steps:
[0199] S901: The destination NIC receives packets of the first data stream sent by the source NIC of the first data stream.
[0200] For example, as shown in Figure 10, the source NIC of the first data stream can be the NIC in communication device K1 in data center 1, and the destination NIC can be the NIC in communication device N3 in data center 2 shown in Figure 1. When communication device K1 performs a first service, if there is data that needs to be transmitted to communication device N3, the NIC in communication device K1 can generate packets based on the data to be transmitted and transmit the data to the NIC in communication device N3 through those packets, forming a first data stream. The first data stream is long-distance traffic. In some embodiments, in addition to sending the first data stream to the destination NIC, the source NIC of the first data stream can also send other data streams, such as a fourth data stream. The fourth data stream and the first data stream can be data streams of different services.
[0201] S902: When the destination NIC detects that the transmission rate of the first data stream is greater than or equal to the first rate threshold, it sends first rate limiting information to the source NIC of the first data stream.
[0202] Since each NIC can function as both a source NIC and a destination NIC, the structure of the destination NIC in this embodiment can be the same as that of the RDMA NIC shown in Figure 8. The destination NIC may include a receive (RX) queue scheduling module for receiving packets from multiple data streams and scheduling among the data streams during packet reception. The RDMA protocol processing unit of the destination NIC may include a rate limiting module for monitoring and limiting the transmission rate of each data stream.
[0203] The rate-limiting module in the destination network interface card (NIC) can monitor the transmission rate of the first data stream in real time. Based on the flow type identifier "1" carried in the packets of the first data stream, it can determine that the first data stream belongs to the first traffic group. The transmission distance between the source NIC and the destination NIC for each data stream in the first traffic group is greater than or equal to a first distance threshold. The destination NIC stores a first rate threshold corresponding to the first traffic group. In some embodiments, the first rate threshold can have two values: a first value and a second value, where the first value is greater than or equal to the second value. If all the data streams currently transmitted by the destination NIC belong to the first traffic group, for example, if the destination NIC is not currently transmitting short-range traffic, the rate-limiting module in the destination NIC can use the first value as the first rate threshold. If some of the data streams currently transmitted by the destination NIC do not belong to the first traffic group, for example, if the destination NIC is currently transmitting both long-range and short-range traffic, the rate-limiting module in the destination NIC can use the second value as the first rate threshold. The rate limiting module in the destination network card can compare the transmission rate of the first data stream with the first rate threshold. If the transmission rate of the first data stream is greater than or equal to the first rate threshold, the first rate limiting information can be sent to the source network card of the first data stream.
[0204] S903: The destination NIC receives packets of the second data stream sent by the source NIC of the second data stream.
[0205] For example, as shown in Figure 10, the source NIC of the second data stream can be the NIC in communication device N1 in data center 2, and the destination NIC can be the NIC in communication device N3 in data center 2 shown in Figure 1. When communication device N1 performs a second service, if there is data that needs to be transmitted to communication device N3, the NIC in communication device N1 can generate packets based on the data to be transmitted and transmit the data to the NIC in communication device N3 through those packets, forming a second data stream. The second data stream is short-distance traffic. In some embodiments, in addition to sending the second data stream to the destination NIC, the source NIC of the second data stream can also send other data streams, such as a fifth data stream. The fifth data stream and the second data stream can be data streams of different services.
[0206] S904: When the destination NIC detects that the transmission rate of the second data stream is greater than or equal to the second rate threshold, it sends second rate limiting information to the source NIC of the second data stream.
[0207] The rate limiting module in the destination NIC can monitor the transmission rate of the second data stream in real time. Based on the flow type identifier "0" carried in the packets of the second data stream, it can determine that the second data stream belongs to the second traffic group. The transmission distance between the source NIC and the destination NIC for each data stream in the second traffic group is less than or equal to a second distance threshold. The destination NIC stores the second rate threshold corresponding to the second traffic group. In some embodiments, the second rate threshold can have two values: a third value and a fourth value, where the third value is greater than or equal to the fourth value. If all the data streams currently transmitted by the destination NIC belong to the second traffic group, for example, when the destination NIC is not currently transmitting long-distance traffic, the rate limiting module in the destination NIC can use the third value as the second rate threshold. If some of the data streams currently transmitted by the destination NIC do not belong to the second traffic group, for example, when the destination NIC is currently transmitting both short-distance and long-distance traffic, the rate limiting module in the destination NIC can use the fourth value as the second rate threshold. The rate limiting module in the destination NIC can compare the transmission rate of the second data stream with the second rate threshold. If the transmission rate of the second data stream is greater than or equal to the second rate threshold, it can send the second rate limiting information to the source NIC of the second data stream.
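The threshold selection and comparison performed by the rate limiting module in steps S902 and S904 could be sketched as follows. The flow type identifiers "1" and "0" are taken from the text; the `GroupThresholds` structure, its field names, and the representation of the currently transmitted data streams as a list of flow type identifiers are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence

LONG_DISTANCE = 1   # flow type identifier "1": first traffic group
SHORT_DISTANCE = 0  # flow type identifier "0": second traffic group


@dataclass(frozen=True)
class GroupThresholds:
    """The two candidate values of one traffic group's rate threshold (hypothetical)."""
    exclusive: float  # first / third value: all current data streams belong to this group
    shared: float     # second / fourth value: data streams of another group are also present

    def __post_init__(self) -> None:
        # The text requires the first (third) value >= the second (fourth) value.
        if self.exclusive < self.shared:
            raise ValueError("exclusive value must be >= shared value")


def select_threshold(
    flow_type: int,
    active_flow_types: Sequence[int],
    thresholds: Dict[int, GroupThresholds],
) -> float:
    """Choose the rate threshold for a data stream based on the current traffic mix."""
    group = thresholds[flow_type]
    only_this_group = all(ft == flow_type for ft in active_flow_types)
    return group.exclusive if only_this_group else group.shared


def should_send_rate_limit(
    rate: float,
    flow_type: int,
    active_flow_types: Sequence[int],
    thresholds: Dict[int, GroupThresholds],
) -> bool:
    """True if the destination NIC should send rate limiting information to the source NIC."""
    return rate >= select_threshold(flow_type, active_flow_types, thresholds)
```

For instance, with hypothetical thresholds of 80 and 40 for the first traffic group, a long-distance data stream arriving at a rate of 50 would trigger rate limiting information only while short-distance traffic is also being transmitted.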
[0208] In the above embodiments, the execution order of each step can be interchanged. For example, step S903 can be executed before step S901 or simultaneously with step S901. This application embodiment does not limit the execution order of each step.
[0209] This application's embodiments can be applied to multi-AZ large-cluster training scenarios in AI applications and multi-AZ multi-replica modes in high-reliability storage scenarios. It provides congestion control on the end-side network interface cards (NICs) to ensure balanced bandwidth performance. The end-side NICs include source NICs and destination NICs. For example, in high-reliability storage scenarios, to ensure the high reliability of customer data, a multi-replica mode can be adopted for disaster recovery backup. Customer high-reliability data is stored in multiple physically isolated data centers to ensure data recovery in disaster scenarios such as fires and earthquakes. For example, a dual-replica mode can be used, where one set of data is stored twice, in two different data centers; or a triple-replica mode can be used, where one set of data is stored three times, in three different data centers. The business traffic in storage scenarios mainly consists of customer data storage and data retrieval, i.e., data writes and data reads. Storage devices in a storage domain AZ can store data from hundreds or thousands of remote servers, thus resulting in overlapping bidirectional read and write traffic on the network. The bidirectional read / write traffic includes long-distance traffic and short-distance traffic. By adopting the traffic control method provided in the embodiments of this application, different transmission rate adjustment strategies are used for different data streams with different transmission distances. This can prevent some data streams from occupying too much bandwidth and help to make the bandwidth usage of data streams with different transmission distances more balanced.
[0210] This application also provides a network interface card (NIC) that can be used to implement the functions of any of the NICs described in the above embodiments. Figure 11 is an exemplary schematic diagram of a NIC. As shown in Figure 11, the NIC 1100 may include a processor 1101 and a communication interface 1102. The processor 1101 may be any one or more processors, including but not limited to a data processing unit (DPU), a microprocessor (MP), or a digital signal processor (DSP). The communication interface 1102 can be used to send and receive data to enable communication between the NIC 1100 and other devices or other NICs in a communication network.
[0211] The NIC 1100 can be the NIC in any of the communication devices shown in Figure 1. The NIC 1100 can be connected to the central processing unit (CPU) in the communication device via a bus, which can be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
[0212] In some embodiments, the network interface card 1100 may further include a memory storing executable program code, which the processor 1101 executes to implement the functions of the network interface card in the above embodiments, thereby realizing the flow control method.
[0213] For example, when the network interface card 1100 is used as a source network interface card, the processor 1101 can execute the program code stored in the memory to implement the following steps: during the transmission of the first data stream through the communication interface 1102, the transmission rate of the first data stream is increased according to the round-trip time of the packets contained in the first data stream; during the transmission of the third data stream through the communication interface 1102, the transmission rate of the third data stream is increased according to the round-trip time of the packets contained in the third data stream; the round-trip time of the packets contained in the first data stream is greater than the round-trip time of the packets contained in the third data stream.
[0214] When the network interface card 1100 is used as the destination network interface card, the processor 1101 can execute the program code stored in the memory to implement the following steps: when the transmission rate of the first data stream is greater than or equal to a first rate threshold, first rate limiting information is sent to the source network interface card of the first data stream through the communication interface 1102; the first rate limiting information is used to instruct the source network interface card of the first data stream to reduce the transmission rate of the first data stream; the transmission distance between the network interface card 1100 and the source network interface card of the first data stream is greater than or equal to a first distance threshold; when the transmission rate of the second data stream is greater than or equal to a second rate threshold, second rate limiting information is sent to the source network interface card of the second data stream through the communication interface 1102; the second rate limiting information is used to instruct the source network interface card of the second data stream to reduce the transmission rate of the second data stream; the transmission distance between the network interface card 1100 and the source network interface card of the second data stream is less than or equal to a second distance threshold, and the first distance threshold is greater than or equal to the second distance threshold.
[0215] It is understood that the structures illustrated in the embodiments of this application do not constitute a specific limitation on the network interface card (NIC). In other embodiments of this application, the NIC may include more or fewer components than illustrated, or combine some components, or split some components, or have different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
[0216] An embodiment of this application also provides a communication device, which can be a computing device or a storage device. The communication device is provided with the network interface card (NIC) shown in Figure 11 and can communicate with other devices in the communication system. For example, when the communication device is a computing device, it may include a NIC and a computing unit; when the communication device is a storage device, it may include a NIC and a storage unit.
[0217] This application also provides a computer program product containing instructions. The computer program product may be a software or program product containing instructions, capable of running on a computing device or stored on any usable medium. When the computer program product is run on at least one computing device, it causes the at least one computing device to execute a device cluster operation method.
[0218] This application also provides a computer-readable storage medium. The computer-readable storage medium can be any available medium that a computing device can store, or a data storage device such as a data center that includes one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid-state drive). The computer-readable storage medium includes instructions that instruct the computing device to perform a service operation method.
[0219] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the protection scope of the technical solutions of the embodiments of the present invention.
Claims
1. A communication system, characterized in that, It includes a first network interface card (NIC), a second network interface card (NIC), and a third network interface card (NIC); the transmission distance between the first NIC and the second NIC is greater than or equal to a first distance threshold, and the transmission distance between the third NIC and the second NIC is less than or equal to a second distance threshold, wherein the first distance threshold is greater than or equal to the second distance threshold; The first network interface card (NIC) is used to increase the transmission rate of the first data stream according to the round-trip time of the packets contained in the first data stream during the process of transmitting the first data stream to the second NIC. The third network interface card (NIC) is used to increase the transmission rate of the second data stream according to the round-trip time (RTT) of the packets contained in the second data stream during the transmission of the second data stream to the second NIC; the RTT of the packets contained in the first data stream is greater than the RTT of the packets contained in the second data stream. The second network interface card (NIC) is used to send first rate limiting information to the first NIC when the transmission rate of the first data stream is greater than or equal to the first rate threshold; the first NIC is also used to reduce the transmission rate of the first data stream based on the first rate limiting information. The second network interface card (NIC) is further configured to send second rate limiting information to the third NIC when the transmission rate of the second data stream is greater than or equal to the second rate threshold; the third NIC is further configured to reduce the transmission rate of the second data stream based on the second rate limiting information.
2. The communication system according to claim 1, characterized in that, The first network interface card is used to increase the transmission rate of the first data stream according to the round-trip time of the packets contained in the first data stream if it does not obtain the rate reduction information of the first data stream within a set time period.
3. The communication system according to claim 2, characterized in that, The rate reduction information of the first data stream includes at least one of the following: Information that the round-trip time of the messages contained in the first data stream is greater than or equal to a first delay threshold within the set time period; The second network card sends the first rate limiting information; The congestion information of the first data stream is sent by the forwarding device of the first data stream; the forwarding device of the first data stream is located between the first network interface card and the second network interface card.
4. The communication system according to any one of claims 1 to 3, characterized in that, The first rate limiting information is a rate limiting feedback message, or the first rate limiting information is a rate limiting identifier carried in a first feedback message, and the first feedback message is a feedback message for the messages contained in the first data stream.
5. The communication system according to any one of claims 1 to 4, characterized in that, The first network interface card (NIC) is located in the first data center, and the second and third NICs are located in the second data center.
6. The communication system according to any one of claims 1 to 5, characterized in that, The communication system further includes a forwarding device, which is located between the first network interface card (NIC) and the second NIC, and the forwarding device is also located between the third NIC and the second NIC. The first data stream and the second data stream are transmitted through the same buffer queue in the forwarding device.
7. The communication system according to any one of claims 1 to 6, characterized in that, The messages contained in the first data stream conform to the Remote Direct Memory Access (RDMA) protocol.
8. A network interface card (NIC), characterized in that, Includes processor and communication interface; The communication interface is used to transmit a first data stream and a third data stream under the control of the processor; The processor is configured to increase the transmission rate of the first data stream according to the round-trip time of the messages contained in the first data stream, and to increase the transmission rate of the third data stream according to the round-trip time of the messages contained in the third data stream; wherein the round-trip time of the messages contained in the first data stream is greater than the round-trip time of the messages contained in the third data stream.
9. The network interface card according to claim 8, characterized in that, The transmission distance between the network card and the destination network card of the first data stream is greater than or equal to a first distance threshold, and the transmission distance between the network card and the destination network card of the third data stream is less than or equal to a second distance threshold, wherein the first distance threshold is greater than or equal to the second distance threshold.
10. The network interface card according to claim 8 or 9, characterized in that, The round-trip time of the first message in the first data stream refers to the duration between the time the first message is sent and the time the feedback message of the first message is received.
11. The network interface card according to any one of claims 8 to 10, characterized in that, The round-trip time of the messages contained in the first data stream is less than or equal to the second delay threshold.
12. The network interface card according to claim 11, characterized in that, The network interface card also includes a register; the second delay threshold is the historical round-trip time of the packets contained in the first data stream stored in the register.
13. The network interface card according to any one of claims 8 to 12, characterized in that, The processor is specifically used to: if no rate reduction information of the first data stream is obtained within a set time period, increase the transmission rate of the first data stream according to the round-trip time of the messages contained in the first data stream.
14. The network interface card according to any one of claims 8 to 13, characterized in that, The first data stream contains a stream type identifier in its packets; the stream type identifier is used to indicate that the transmission distance between the network card and the destination network card of the first data stream is greater than or equal to the first distance threshold.
15. A network interface card (NIC), characterized in that, Includes processor and communication interface; The communication interface is used to receive messages contained in the first data stream and messages contained in the second data stream. The processor is used to send first rate limiting information to the source network card of the first data stream through the communication interface when the transmission rate of the first data stream is greater than or equal to a first rate threshold. The first rate limiting information is used to instruct the source network card of the first data stream to reduce the transmission rate of the first data stream; the transmission distance between the network card and the source network card of the first data stream is greater than or equal to a first distance threshold. The processor is further configured to send second rate limiting information to the source network card of the second data stream through the communication interface when the transmission rate of the second data stream is greater than or equal to the second rate threshold; the second rate limiting information is used to instruct the source network card of the second data stream to reduce the transmission rate of the second data stream; the transmission distance between the network card and the source network card of the second data stream is less than or equal to the second distance threshold, and the first distance threshold is greater than or equal to the second distance threshold.
16. The network interface card according to claim 15, characterized in that, The first data stream belongs to the first traffic group. When all data streams transmitted by the network card belong to the first traffic group, the first rate threshold adopts a first value. When the network card transmits data streams that do not belong to the first traffic group, the first rate threshold adopts a second value. The first value is greater than or equal to the second value. The transmission distance between the source network card of the data stream belonging to the first traffic group and the network card is greater than or equal to the first distance threshold.
17. The network interface card according to claim 15 or 16, characterized in that, The first rate limiting information is a rate limiting feedback message, or the first rate limiting information is a rate limiting identifier carried in a first feedback message, and the first feedback message is a feedback message for the messages contained in the first data stream.
18. A communication device, characterized in that, The communication device includes the network interface card (NIC) according to any one of claims 8 to 14, and the communication device further includes a computing unit or a storage unit.
19. A communication device, characterized in that, The communication device includes a network interface card (NIC) according to any one of claims 15 to 17, and the communication device further includes a computing unit or a storage unit.