Data stream processing method and apparatus, communication chip and communication device
By integrating a hardware inference resource pool into the communication chip and using the hardware inference unit to build an inference analyzer, data streams can be processed directly within the chip, solving the problem of long processing times in existing technologies and achieving efficient data stream processing.
Patent Information
- Authority / Receiving Office: WO
- Patent Type: Applications
- Current Assignee / Owner: HUAWEI TECH CO LTD
- Filing Date: 2025-12-24
- Publication Date: 2026-07-02
AI Technical Summary
When using AI functions to process the data stream received by the forwarding chip, the existing technology takes a long time from sending a processing request from the forwarding chip to the AI analyzer to receiving the processing result, resulting in low efficiency.
By integrating a hardware inference resource pool into the communication chip, the inference model is determined based on the feature information of the data stream, and the inference analyzer is built using the hardware inference unit. The data stream is processed directly within the chip, avoiding interaction with external AI analyzers.
Processing the data stream inside the chip improves the efficiency of AI functions and shortens processing time, and the dedicated hardware inference units further increase the speed of data stream processing.
Description
Data stream processing methods and devices, communication chips, and communication equipment
[0001] This application claims priority to the Chinese patent application filed on December 28, 2024, with application number 202411987992.1 and entitled "Data Stream Processing Method and Apparatus, Communication Chip, Communication Equipment", the entire contents of which are incorporated herein by reference.
Technical Field
[0002] This application relates to the field of communication technology, and in particular to a data stream processing method and apparatus, a communication chip, and a communication device.
Background Technology
[0003] In network devices, forwarding chips are primarily used for forwarding, processing, and managing data streams. In practical applications, artificial intelligence (AI) functions are often required to process the data streams received by the forwarding chip. For example, AI functions can be used to classify and statistically analyze packets received by the forwarding chip based on their characteristics, and to perform quality assessment and intelligent management of the data streams received by the forwarding chip.
[0004] Currently, in scenarios where AI functionality is used to process the data stream received by a forwarding chip, the forwarding chip sends a processing request, carrying the data stream received by the forwarding chip, to an AI analyzer that is communicatively connected to the network device where the forwarding chip resides. The AI analyzer processes the data stream received by the forwarding chip according to the processing request and then sends the processing result back to the forwarding chip.
[0005] However, the process from when the forwarding chip sends a processing request to the AI analyzer to when the forwarding chip receives the processing result from the AI analyzer takes a long time, resulting in low efficiency in using AI functions to process the data stream received by the forwarding chip.
Summary of the Invention
[0006] This application provides a data stream processing method and apparatus, a communication chip, and a communication device. The technical solution of this application is as follows.
[0007] In a first aspect, a data stream processing method is provided, applied to a communication chip. The method includes: receiving a first message, the first message belonging to a first data stream; determining a first inference model based on feature information of the first message, the first inference model being used to instruct the processing of the first data stream using a first inference analyzer, the first inference analyzer including at least one hardware inference unit in a hardware inference resource pool, the communication chip including the hardware inference resource pool; and processing the first data stream using the first inference analyzer.
[0008] The first inference analyzer and the first inference model have a mapping relationship. The first inference analyzer is the first inference model mapped to the hardware inference resource pool, or in other words, the first inference analyzer is the embodiment of the first inference model in the hardware inference resource pool.
[0009] Each hardware inference unit in the hardware inference resource pool is used to implement a certain inference function. Any two hardware inference units in the hardware inference resource pool may have different inference functions, or some hardware inference units in the hardware inference resource pool may have the same inference function.
[0010] The inference model is also called the AI model or machine learning model, the inference analyzer is also called the AI analyzer or machine learning analyzer, the inference function is also called the AI function or machine learning function, and the hardware inference unit is also called the hardware AI unit or hardware machine learning unit.
[0011] The technical solution provided in this application involves a communication chip determining a first inference model based on the feature information of a first message belonging to a first data stream, and then using a first inference analyzer indicated by the first inference model to process the first data stream. Therefore, using AI functionality to process the first data stream is highly efficient. Specifically, on one hand, this application uses AI functionality within the communication chip to process the first data stream. Compared to using an external AI analyzer, this eliminates the need for interaction between the communication chip and the AI analyzer, resulting in shorter processing time and higher efficiency. On the other hand, the first inference analyzer in this application includes at least one hardware inference unit from a hardware inference resource pool. This hardware inference unit is a dedicated hardware inference unit, which has a faster processing speed. Therefore, using the first inference analyzer to process the first data stream is highly efficient.
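The three steps of the first aspect (receive a message, determine an inference model from its feature information, process the data stream with the corresponding inference analyzer) can be sketched in software as follows. This is only a minimal illustration under stated assumptions: the application does not specify which feature information is used or how analyzers are represented, so the `Message` fields, the port-based lookup table, and the analyzer callables are all hypothetical stand-ins for on-chip hardware.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Message:
    flow_id: str          # identifies the data stream the message belongs to
    dst_port: int         # hypothetical feature used to select a model
    payload: bytes = b""


# Hypothetical table: feature value -> inference model name.
MODEL_BY_DST_PORT: Dict[int, str] = {443: "model_a", 53: "model_b"}

# Hypothetical analyzers, one per model; stand-ins for analyzers built
# from hardware inference units in the on-chip resource pool.
ANALYZERS: Dict[str, Callable[[Message], str]] = {
    "model_a": lambda msg: f"{msg.flow_id}: handled by analyzer A",
    "model_b": lambda msg: f"{msg.flow_id}: handled by analyzer B",
}


def determine_model(msg: Message) -> Optional[str]:
    """Determine the inference model from the message's feature information."""
    return MODEL_BY_DST_PORT.get(msg.dst_port)


def process_message(msg: Message) -> Optional[str]:
    """Process the message's data stream with the analyzer for its model."""
    model = determine_model(msg)
    if model is None:
        return None  # no model configured for this feature value
    return ANALYZERS[model](msg)
```

In this sketch no request leaves the process, which mirrors the point that the communication chip does not need to interact with an external AI analyzer.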
[0012] Optionally, the method further includes: constructing a first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model. That is, mapping the first inference model to the hardware inference resource pool according to the configuration information of the first inference model.
[0013] In the technical solution provided in this application, a first inference analyzer is constructed based on the hardware inference resource pool according to the configuration information of the first inference model, which facilitates the processing of the first data stream by the first inference analyzer indicated by the first inference model.
[0014] Optionally, the first inference model includes at least one inference component, and the configuration information of the first inference model includes the mapping relationship between the at least one inference component and the at least one hardware inference unit in the hardware inference resource pool; constructing a first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model includes: determining the at least one hardware inference unit according to the mapping relationship between the at least one inference component and the at least one hardware inference unit; and constructing the first inference analyzer based on the at least one hardware inference unit.
[0015] The mapping relationship between the at least one inference component and the at least one hardware inference unit can be a one-to-one mapping relationship.
[0016] For example, the configuration information of the first inference model includes the identifier of the at least one inference component and the mapping relationship between the at least one inference component and the at least one hardware inference unit. The communication chip determines the at least one inference component based on the identifier of the at least one inference component, and then determines the at least one hardware inference unit based on the mapping relationship between the at least one inference component and the at least one hardware inference unit.
[0017] The technical solution provided in this application allows the communication chip to determine the at least one hardware inference unit based on the mapping relationship between at least one inference component in the first inference model and at least one hardware inference unit in the hardware inference resource pool, thereby enabling the mapping of the at least one inference component to the at least one hardware inference unit. Furthermore, the communication chip constructs a first inference analyzer based on the at least one hardware inference unit, enabling the mapping of the first inference model to the hardware inference resource pool.
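As a rough sketch of this lookup, the configuration information can be pictured as a list of component identifiers plus a one-to-one component-to-unit mapping. The dictionary layout, the key names `component_ids` and `component_to_unit`, and the unit identifiers below are assumptions made for illustration; the application does not define a concrete format.

```python
from typing import Dict, List, Set


def resolve_hardware_units(config: Dict, pool: Set[str]) -> List[str]:
    """Return the hardware inference units mapped to the model's components.

    `config` is a hypothetical configuration record; `pool` is the set of
    unit identifiers available in the hardware inference resource pool.
    """
    units = []
    for component_id in config["component_ids"]:
        unit_id = config["component_to_unit"][component_id]
        if unit_id not in pool:
            raise ValueError(f"unit {unit_id!r} is not in the resource pool")
        units.append(unit_id)
    return units
```

For example, with components `c1` and `c2` mapped to units `u3` and `u1`, the function returns the units in component order, which is the set of units from which the first inference analyzer would then be constructed.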
[0018] Optionally, the at least one inference component is multiple inference components, and the at least one hardware inference unit is multiple hardware inference units. The configuration information of the first inference model further includes connectivity information of the multiple inference components, which indicates the connectivity relationships of the multiple inference components. Constructing a first inference analyzer based on the at least one hardware inference unit includes: configuring the multiple hardware inference units to be connected according to the connectivity relationships indicated by the connectivity information of the multiple inference components. That is, the multiple hardware inference units are configured to be connected in accordance with the connectivity relationships of the multiple inference components, so that the connectivity relationships of the multiple hardware inference units are the same as those of the multiple inference components.
[0019] In the technical solution provided in this application, the communication chip configures the multiple hardware inference units corresponding to the multiple inference components in the first inference model to be connected according to the connectivity relationships of the multiple inference components, which facilitates mapping the first inference model to the hardware inference resource pool.
[0020] Optionally, the hardware inference units in the hardware inference resource pool are connected via gates. Configuring the multiple hardware inference units to be connected according to the connectivity relationships indicated by the connectivity information of the multiple inference components includes: controlling the gating state of each gate used to connect the multiple hardware inference units, so that the multiple hardware inference units are connected according to the connectivity relationships indicated by the connectivity information of the multiple inference components. A gate may include a multiplexer, a demultiplexer, or the like.
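One way to picture this gate control is to treat each gate as an on/off link between two hardware inference units and to enable exactly the links that mirror the edges between inference components. The sketch below is a simplified software model under that assumption; real multiplexers and demultiplexers select among several inputs or outputs, and the function name and data layout are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]  # (source unit, destination unit)


def configure_gates(
    gates: List[Link],
    component_edges: Iterable[Tuple[str, str]],
    component_to_unit: Dict[str, str],
) -> Dict[Link, bool]:
    """Compute the gating state of every gate in the pool.

    A gate is enabled only if it connects the units that correspond to two
    connected inference components; all other gates are disabled.
    """
    wanted = {
        (component_to_unit[src], component_to_unit[dst])
        for src, dst in component_edges
    }
    missing = wanted - set(gates)
    if missing:
        raise ValueError(f"no gate available for links: {sorted(missing)}")
    return {link: (link in wanted) for link in gates}
```

With gates `u1→u2`, `u2→u3`, and `u1→u3` and a component chain `c1→c2→c3` mapped to `u1`, `u2`, `u3`, the first two gates are enabled and the third is disabled, so the units end up connected in the same way as the components.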
[0021] Optionally, the first inference model includes a first expert sub-model and a second expert sub-model. The first expert sub-model includes at least one first inference component, and the second expert sub-model includes at least one second inference component. The at least one inference component includes both the at least one first inference component and the at least one second inference component. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes at least one first hardware inference unit, and the second sub-inference analyzer includes at least one second hardware inference unit. The at least one hardware inference unit includes both the at least one first hardware inference unit and the at least one second hardware inference unit. Constructing the first inference analyzer based on the at least one hardware inference unit includes: constructing the first sub-inference analyzer based on the at least one first hardware inference unit; and constructing the second sub-inference analyzer based on the at least one second hardware inference unit.
[0022] The first sub-inference analyzer is the first expert sub-model mapped to the hardware inference resource pool; in other words, the first sub-inference analyzer is the embodiment of the first expert sub-model in the hardware inference resource pool. Similarly, the second sub-inference analyzer is the second expert sub-model mapped to the hardware inference resource pool.
[0023] The technical solution provided in this application maps the first expert sub-model and the second expert sub-model to the hardware inference resource pool, which facilitates the mapping of the first inference model to the hardware inference resource pool (that is, the construction of the first inference analyzer based on the hardware inference resource pool).
[0024] Optionally, the at least one first inference component is a plurality of first inference components, the at least one second inference component is a plurality of second inference components, the at least one first hardware inference unit is a plurality of first hardware inference units, and the at least one second hardware inference unit is a plurality of second hardware inference units. The configuration information of the first inference model further includes connectivity information of the plurality of first inference components and connectivity information of the plurality of second inference components. The connectivity information of the plurality of first inference components is used to indicate the connectivity of the plurality of first inference components, and the connectivity information of the plurality of second inference components is used to indicate the connectivity of the plurality of second inference components. Constructing a first sub-inference analyzer based on the at least one first hardware inference unit includes: configuring the plurality of first hardware inference units to be connected according to the connectivity information of the plurality of first inference components. Constructing a second sub-inference analyzer based on the at least one second hardware inference unit includes: configuring the plurality of second hardware inference units to be connected according to the connectivity information of the plurality of second inference components. 
That is, the plurality of first hardware inference units are configured to be connected in accordance with the connectivity relationship of the plurality of first inference components, and the connectivity relationship of the plurality of first hardware inference units is the same as that of the plurality of first inference components; the plurality of second hardware inference units are configured to be connected in accordance with the connectivity relationship of the plurality of second inference components, and the connectivity relationship of the plurality of second hardware inference units is the same as that of the plurality of second inference components.
[0025] In the technical solution provided in this application, the communication chip configures the multiple first hardware inference units corresponding to the multiple first inference components in the first expert sub-model to be connected according to the connectivity relationship of the multiple first inference components, facilitating the mapping of the first expert sub-model to the hardware inference resource pool. Similarly, the communication chip configures the multiple second hardware inference units corresponding to the multiple second inference components in the second expert sub-model to be connected according to the connectivity relationship of the multiple second inference components, facilitating the mapping of the second expert sub-model to the hardware inference resource pool.
[0026] Optionally, the hardware inference units in the hardware inference resource pool are connected via gates. Configuring the plurality of first hardware inference units to be connected according to the connectivity relationship indicated by the connectivity information of the plurality of first inference components includes: controlling the gating state of each gate used to connect the plurality of first hardware inference units, so that the plurality of first hardware inference units are connected according to the connectivity relationship indicated by the connectivity information of the plurality of first inference components. Configuring the plurality of second hardware inference units to be connected according to the connectivity relationship indicated by the connectivity information of the plurality of second inference components includes: controlling the gating state of each gate used to connect the plurality of second hardware inference units, so that the plurality of second hardware inference units are connected according to the connectivity relationship indicated by the connectivity information of the plurality of second inference components.
[0027] Optionally, the configuration information of the first inference model also includes serial-parallel connection information of the first expert sub-model and the second expert sub-model, which indicates the serial-parallel connection relationship between the first expert sub-model and the second expert sub-model. Constructing the first inference analyzer based on the at least one hardware inference unit further includes: constructing the serial-parallel connection relationship between the first sub-inference analyzer and the second sub-inference analyzer according to the serial-parallel connection relationship indicated by the serial-parallel connection information. Thus, the serial-parallel connection relationship between the first expert sub-model and the second expert sub-model can be mapped to the hardware inference resource pool.
[0028] Optionally, the hardware inference units in the hardware inference resource pool are connected via gates, and the serial-parallel connection information indicates that the first expert sub-model and the second expert sub-model are connected in series. Constructing the serial-parallel connection relationship between the first sub-inference analyzer and the second sub-inference analyzer according to the serial-parallel connection relationship indicated by the serial-parallel connection information includes: controlling the gating state of the gate used to connect the first sub-inference analyzer and the second sub-inference analyzer, so that the first sub-inference analyzer and the second sub-inference analyzer are connected in series.
[0029] For example, by controlling the gating state of the gate used to connect the first target hardware inference unit and the second target hardware inference unit, the first target hardware inference unit and the second target hardware inference unit are connected, thereby connecting the first sub-inference analyzer and the second sub-inference analyzer in series. The first target hardware inference unit is the output inference unit in the first sub-inference analyzer, and corresponds to the output inference component in the first expert sub-model. The second target hardware inference unit is the input inference unit in the second sub-inference analyzer, and corresponds to the input inference component in the second expert sub-model. Specifically, the output inference unit in the first sub-inference analyzer is the hardware inference unit used to output the processing result (e.g., inference analysis result) of the first sub-inference analyzer. The input inference unit in the second sub-inference analyzer is the hardware inference unit used to receive input information. The output inference component in the first expert sub-model is the inference component used to output the processing result (e.g., inference analysis result) of the first expert sub-model. The input inference component in the second expert sub-model is the inference component used to receive input information.
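The series connection described in this example can be sketched as enabling the single gate between the output unit of the first sub-inference analyzer and the input unit of the second. The sketch assumes, purely for illustration, that each sub-analyzer lists its units in processing order so that the first entry is its input unit and the last entry is its output unit; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SubAnalyzer:
    units: List[str]  # unit ids in processing order (assumed layout)

    @property
    def input_unit(self) -> str:
        return self.units[0]

    @property
    def output_unit(self) -> str:
        return self.units[-1]


def connect_in_series(
    first: SubAnalyzer,
    second: SubAnalyzer,
    gate_state: Dict[Tuple[str, str], bool],
) -> Tuple[str, str]:
    """Enable the gate from first's output unit to second's input unit."""
    link = (first.output_unit, second.input_unit)
    if link not in gate_state:
        raise ValueError(f"no gate between {link[0]} and {link[1]}")
    gate_state[link] = True
    return link
```

Here `first.output_unit` plays the role of the first target hardware inference unit and `second.input_unit` the role of the second target hardware inference unit.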
[0030] Optionally, the configuration information of the first inference model may also include the parameter configuration information of at least one inference component; constructing the first inference analyzer based on the at least one hardware inference unit includes: configuring the inference parameters of the hardware inference unit corresponding to each inference component according to the parameter configuration information of each inference component in the at least one inference component.
[0031] The technical solution provided in this application configures the inference parameters of the hardware inference unit corresponding to each inference component in the hardware inference resource pool according to the parameter configuration information of each inference component in the first inference model. This enables the mapping of each inference component to the hardware inference unit, thereby facilitating the mapping of the first inference model to the hardware inference resource pool.
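A minimal sketch of this parameter configuration step is given below. It assumes the parameter configuration information is a per-component dictionary and that each hardware inference unit exposes a writable parameter store; both are hypothetical representations, as are the parameter names used in the example.

```python
from typing import Any, Dict


def configure_parameters(
    param_config: Dict[str, Dict[str, Any]],
    component_to_unit: Dict[str, str],
    unit_params: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Write each component's parameters into its corresponding unit.

    `unit_params` models the per-unit parameter storage of the resource
    pool; units that are not mapped to any component are left untouched.
    """
    for component_id, params in param_config.items():
        unit_id = component_to_unit[component_id]
        unit_params[unit_id] = dict(params)  # copy so later edits do not alias
    return unit_params
```

For instance, a component `c1` carrying `{"threshold": 0.5}` and mapped to unit `u3` results in `u3` holding that threshold, while the parameters of unmapped units remain as they were.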
[0032] For example, the first inference model includes a first expert sub-model and a second expert sub-model. The first expert sub-model includes at least one first inference component, and the second expert sub-model includes at least one second inference component. The at least one inference component includes both the at least one first inference component and the at least one second inference component. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes at least one first hardware inference unit, and the second sub-inference analyzer includes at least one second hardware inference unit. The at least one hardware inference unit includes both the at least one first hardware inference unit and the at least one second hardware inference unit. Constructing the first sub-inference analyzer based on the at least one first hardware inference unit includes configuring inference parameters for the first hardware inference unit corresponding to each first inference component according to the parameter configuration information of each first inference component. Constructing the second sub-inference analyzer based on the at least one second hardware inference unit includes configuring inference parameters for the second hardware inference unit corresponding to each second inference component according to the parameter configuration information of each second inference component.
[0033] The technical solution provided in this application configures the inference parameters of the first hardware inference unit corresponding to each first inference component in the hardware inference resource pool based on the parameter configuration information of each first inference component in the first expert sub-model. This allows each first inference component to be mapped to a first hardware inference unit, thereby facilitating the mapping of the first expert sub-model to the hardware inference resource pool. Similarly, by configuring the inference parameters of the second hardware inference unit corresponding to each second inference component in the hardware inference resource pool based on the parameter configuration information of each second inference component in the second expert sub-model, this allows each second inference component to be mapped to a second hardware inference unit, thereby facilitating the mapping of the second expert sub-model to the hardware inference resource pool.
[0034] Optionally, processing the first data stream using a first inference analyzer includes: performing inference analysis on the first data stream using the first inference analyzer to obtain the inference analysis result of the first data stream; and performing processing operations related to the inference analysis result of the first data stream. For example, determining the processing operations related to the inference analysis result of the first data stream based on the inference analysis result of the first data stream and a first operation instruction table, wherein the first operation instruction table corresponds to a first inference model; and then performing the processing operations related to the inference analysis result of the first data stream.
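The lookup in the operation instruction table can be sketched as a mapping from inference analysis results to lists of processing operations. The table contents, result labels, and operation names below are invented for illustration and are not taken from the application.

```python
from typing import Callable, Dict, List

# Hypothetical operation instruction table for one inference model:
# inference analysis result -> processing operations to perform.
OPERATION_TABLE_MODEL_A: Dict[str, List[str]] = {
    "congested": ["modify_traffic_policy", "notify_remote"],
    "normal": [],
}


def apply_operations(
    result: str,
    table: Dict[str, List[str]],
    handlers: Dict[str, Callable[[], None]],
) -> List[str]:
    """Perform the operations the table associates with the result.

    Returns the names of the operations performed, in order. A result
    with no table entry triggers no operation.
    """
    performed = []
    for operation in table.get(result, []):
        handlers[operation]()
        performed.append(operation)
    return performed
```

Keeping one table per inference model matches the statement that the first operation instruction table corresponds to the first inference model, so the same analysis result may lead to different operations under different models.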
[0035] Optionally, the processing operation includes at least one of the following: editing the packets of the first data stream; modifying the traffic management policy of the first data stream; modifying the inference analysis policy of the first data stream; and announcing the inference analysis results of the first data stream to a remote device.
[0036] Optionally, the first inference analyzer includes at least one hardware inference unit as described above, which constitutes a first inference path. The first inference analyzer is used to process the first data stream according to the first inference path.
[0037] Optionally, processing the first data stream using a first inference analyzer includes: processing the first data stream using the first inference analyzer based on relevant status information of the first data stream; wherein the relevant status information of the first data stream includes at least one of the following: status information of the first data stream; status information of data streams related to the first data stream; and status information of resources related to the first data stream.
[0038] Optionally, for any data stream among the first data stream and the data streams related to the first data stream, the status information of that data stream includes at least one of the following: the message information of the data stream; the forwarding information of the data stream; the traffic statistics of the data stream; and the historical inference analysis results of the data stream obtained from the remote device. The status information of resources related to the first data stream includes at least one of the following: statistics of cache resources used to cache the first data stream; statistics of queue resources used to cache the first data stream; statistics of bandwidth resources used to forward the first data stream; and statistics of processing resources used to process the messages of the first data stream.
[0039] Optionally, the first data stream and the data streams related to the first data stream satisfy at least one of the following: the first data stream and the data streams related to the first data stream compete for cache resources; the first data stream and the data streams related to the first data stream compete for queue resources; the first data stream and the data streams related to the first data stream compete for bandwidth resources; the first data stream and the data streams related to the first data stream compete for processing resources.
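To illustrate how related data streams might be identified, the sketch below treats two streams as competing for a resource when they are assigned the same resource instance (for example, the same queue). This equality test is an assumption made for the example; the application does not define how competition is detected, and the record layout is hypothetical.

```python
from typing import Dict, List


def related_flows(
    flow_id: str, usage: Dict[str, Dict[str, str]]
) -> Dict[str, List[str]]:
    """Find flows that share at least one resource with `flow_id`.

    `usage` maps flow id -> {resource kind: resource instance id}.
    Returns {related flow id: sorted list of shared resource kinds}.
    """
    mine = usage[flow_id]
    related = {}
    for other_id, other in usage.items():
        if other_id == flow_id:
            continue
        shared = sorted(
            kind for kind, inst in mine.items() if other.get(kind) == inst
        )
        if shared:
            related[other_id] = shared
    return related
```

The status information of the flows returned here, together with that of the first data stream itself, is the kind of input the first inference analyzer could take into account when processing the first data stream.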
[0040] In a second aspect, a data stream processing apparatus is provided for use in a communication chip. The data stream processing apparatus includes at least one functional module for performing the methods provided by the first aspect or any alternative method thereof. The at least one functional module can be implemented based on software, hardware, or a combination of both, and can be arbitrarily combined or divided based on a specific implementation.
[0041] In a third aspect, a communication chip is provided, including the data stream processing apparatus provided in the second aspect.
[0042] Optionally, the communication chip may include, but is not limited to, a forwarding chip, a network access card, or a data processing unit (DPU) chip. The forwarding chip can be a network processor (NP) chip. A network access card is also called a network interface card (NIC).
[0043] In a fourth aspect, a communication device is provided, including the communication chip provided in the third aspect.
[0044] Optionally, the communication device includes, but is not limited to, network devices, wireless access devices, wireless communication devices, personal computer hosts, or laptops. Network devices may include switches or routers. Wireless access devices may be, for example, wireless local area network (WLAN) devices. Wireless communication devices include mobile terminals such as mobile phones and tablets.
[0045] In a fifth aspect, a computer-readable storage medium is provided that stores a computer program which, when executed, implements the method provided by the first aspect or any optional implementation of the first aspect.
[0046] In a sixth aspect, a computer program product is provided, comprising a program or code that, when executed, implements the method provided by the first aspect or any optional implementation of the first aspect.
[0047] For the technical effects of the third to sixth aspects, refer to the technical effects of the first aspect and its optional implementations; they are not repeated here.
Attached Figure Description
[0048] Figure 1 is a schematic diagram of a communication device provided in an embodiment of this application;
[0049] Figure 2 is a schematic diagram of a hardware inference resource pool provided in an embodiment of this application;
[0050] Figure 3 is a schematic diagram of another hardware inference resource pool provided in an embodiment of this application;
[0051] Figure 4 is a schematic diagram of a communication chip provided in an embodiment of this application;
[0052] Figure 5 is a schematic diagram of another communication chip provided in an embodiment of this application;
[0053] Figure 6 is a schematic diagram of an inference model A provided in an embodiment of this application;
[0054] Figure 7 is a schematic diagram of processing a data stream using an inference analyzer A according to an embodiment of this application;
[0055] Figure 8 is a schematic diagram of an inference model B provided in an embodiment of this application;
[0056] Figure 9 is a schematic diagram of a data stream processing method using inference analyzer B according to an embodiment of this application;
[0057] Figure 10 is a schematic diagram of an inference model C provided in an embodiment of this application;
[0058] Figure 11 is a schematic diagram of a data stream processing method using an inference analyzer C according to an embodiment of this application;
[0059] Figure 12 is a schematic diagram of using different inference analyzers to process different data streams according to an embodiment of this application;
[0060] Figure 13 is a flowchart of a data stream processing method provided in an embodiment of this application;
[0061] Figure 14 is a schematic diagram of a first inference model provided in an embodiment of this application;
[0062] Figure 15 is a schematic diagram of the first inference analyzer corresponding to the first inference model shown in Figure 14;
[0063] Figure 16 is a schematic diagram of another first inference model provided in an embodiment of this application;
[0064] Figure 17 is a schematic diagram of the first inference analyzer corresponding to the first inference model shown in Figure 16;
[0065] Figure 18 is a schematic diagram of another first inference model provided in an embodiment of this application;
[0066] Figure 19 is a schematic diagram of the first inference analyzer corresponding to the first inference model shown in Figure 18;
[0067] Figure 20 is a schematic diagram of yet another first inference model provided in an embodiment of this application;
[0068] Figure 21 is a schematic diagram of the first inference analyzer corresponding to the first inference model shown in Figure 20;
[0069] Figure 22 is a schematic diagram of yet another first inference model provided in an embodiment of this application;
[0070] Figure 23 is a schematic diagram of the first inference analyzer corresponding to the first inference model shown in Figure 22;
[0071] Figure 24 is a schematic diagram of yet another first inference model provided in an embodiment of this application;
[0072] Figure 25 is a schematic diagram of the first inference analyzer corresponding to the first inference model shown in Figure 24;
[0073] Figure 26 is a schematic diagram of yet another first inference model provided in an embodiment of this application;
[0074] Figure 27 is a schematic diagram of a data stream processing device provided in an embodiment of this application;
[0075] Figure 28 is a schematic diagram of another communication device provided in an embodiment of this application.
Detailed Implementation
[0076] The embodiments of this application will now be described in further detail with reference to the accompanying drawings.
[0077] In network devices, forwarding chips are primarily used for forwarding, processing, and managing data streams. In practical applications, artificial intelligence (AI) functions are often required to process the data streams received by the forwarding chip.
[0078] Currently, in scenarios where AI functionality is used to process data streams received by a forwarding chip, the forwarding chip sends a processing request, carrying the data stream it received, to an AI analyzer connected to the network device where it resides. The AI analyzer processes the data stream according to the request and then sends the processing result back to the forwarding chip. The AI analyzer can include a central processing unit (CPU), a graphics processing unit (GPU), or a dedicated AI chip. AI analyzers can also be called machine learning (ML) analyzers.
[0079] However, the process from when the forwarding chip sends a processing request to the AI analyzer to when the forwarding chip receives the processing result from the AI analyzer generally takes several seconds or even several minutes, which is time-consuming and results in low efficiency of using AI functions to process the data stream received by the forwarding chip.
[0080] This application provides a data stream processing method and apparatus, a communication chip, and a communication device. The communication device includes a communication chip, the communication chip includes a hardware inference resource pool, and the hardware inference resource pool includes at least one hardware inference unit. After receiving a first message belonging to a first data stream, the communication chip determines a first inference model based on the feature information of the first message. The first inference model is used to instruct the use of a first inference analyzer to process the first data stream. The communication chip then uses the first inference analyzer to process the first data stream, which makes using AI functions to process the first data stream more efficient for two reasons. First, this application uses AI functions inside the communication chip to process the first data stream. Compared with using an external AI analyzer, no interaction between the communication chip and an AI analyzer is required, so processing the first data stream with AI functions takes less time. Second, the first inference analyzer includes at least one hardware inference unit in the hardware inference resource pool. The hardware inference unit is a dedicated hardware inference unit, for example, implemented using an application-specific integrated circuit (ASIC). Such a unit has a fast processing speed and high efficiency, so the first inference analyzer can process the first data stream quickly and efficiently.
[0081] The communication chip can include, but is not limited to, a forwarding chip, a network access card, or a data processing unit (DPU) chip. A forwarding chip is also called a network chip or network forwarding chip; for example, a forwarding chip can be a network processor (NP) chip. A network access card is also called a network interface card (NIC). The communication device can include, but is not limited to, a network device, a wireless access device, a wireless communication device, a personal computer host, or a laptop. Network devices can include switches or routers. Wireless access devices are, for example, wireless local area network (WLAN) devices. Wireless communication devices include mobile terminals such as mobile phones and tablets.
[0082] The structure of the communication device and the communication chip provided in the embodiments of this application are described below.
[0083] Please refer to Figure 1, which shows a schematic diagram of a communication device provided in an embodiment of this application. The communication device includes a communication chip and multiple ports, with the communication chip communicatively connected to the multiple ports. The multiple ports are used for sending and receiving data streams. The communication chip is used for forwarding and managing the data streams. As shown in Figure 1, the communication chip includes a packet processor (PP), also called a message processor, and a traffic manager (TM). The message processor is used for forwarding the data streams, and the traffic manager is used for managing the data streams. The forwarding of the data streams includes, but is not limited to, packet parsing and packet routing, while the management of the data streams includes, but is not limited to, packet buffering, packet queuing, packet management, and packet scheduling. For example, for any message received by the communication device through any port, the message processor is used to: parse the message to obtain the destination address of the message, determine the outgoing port of the message based on that destination address (i.e., packet routing), and assign the message to the outgoing port; the traffic manager is used to: buffer the message in the message queue of the outgoing port, schedule the message according to its queuing order in the message queue, and send the message through the outgoing port when the message is scheduled.
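As a purely illustrative sketch, and not part of the embodiment itself, the division of work between the packet processor and the traffic manager described above could be modeled in Python as follows. The routing table, the `Packet` fields, and the first-in-first-out scheduling are assumptions introduced only for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    dst: str        # destination address carried in the header
    payload: bytes


class PacketProcessor:
    """Parses a packet and selects its outgoing port (packet routing)."""

    def __init__(self, routes):
        self.routes = routes  # hypothetical table: destination address -> port

    def route(self, pkt):
        return self.routes[pkt.dst]


class TrafficManager:
    """Buffers packets per outgoing port and schedules them in queue order."""

    def __init__(self, ports):
        self.queues = {port: deque() for port in ports}

    def enqueue(self, port, pkt):
        self.queues[port].append(pkt)

    def schedule(self, port):
        queue = self.queues[port]
        return queue.popleft() if queue else None


pp = PacketProcessor({"10.0.0.1": 1, "10.0.0.2": 2})
tm = TrafficManager(ports=[1, 2])
for pkt in (Packet("10.0.0.1", b"a"), Packet("10.0.0.2", b"b"), Packet("10.0.0.1", b"c")):
    tm.enqueue(pp.route(pkt), pkt)
first = tm.schedule(1)  # the earliest packet queued on port 1
```

In this sketch the two packets addressed to 10.0.0.1 share the queue of port 1 and leave it in arrival order, which mirrors the buffering and scheduling steps described in the paragraph.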
[0084] In this embodiment, the communication chip is further configured to use AI functionality to process the data stream received by the communication chip. For example, the communication chip is further configured to: use AI functionality to perform inference analysis on the data stream received by the communication chip, obtain the inference analysis result of the data stream, and perform processing operations related to the inference analysis result of the data stream. As shown in Figure 1, the communication chip further includes an AI module, which is connected to the message processor and the traffic manager respectively. The AI module is configured to perform inference analysis on the data stream received by the communication chip to obtain the inference analysis result of the data stream; the message processor and / or traffic manager are configured to perform processing operations related to the inference analysis result of the data stream.
[0085] In this embodiment, the communication chip includes a hardware inference resource pool, which includes at least one hardware inference unit, each of which implements an inference function. When the hardware inference resource pool includes multiple hardware inference units, any two of these units may have different inference functions, or some of them may have the same inference function (e.g., some of the units are identical). Furthermore, when the hardware inference resource pool includes multiple hardware inference units, these units can be connected via gates, with each unit connected to at least one other unit via a gate. The gate connecting any two units determines whether the two units are connected or disconnected; that is, the connection or disconnection of the two units is controlled by controlling the gating state of that gate. The communication chip is used to: construct an inference analyzer based on the hardware inference resource pool according to the configuration information of the inference model, and then use the inference analyzer to process the data stream. For example, the communication chip is used to: based on the configuration information of the inference model, control the gating state of the gates used to connect the hardware inference units in the hardware inference resource pool, thereby constructing an inference analyzer based on the hardware inference resource pool.
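The gate-controlled construction described above can be sketched in software as follows. This is only a minimal model under stated assumptions: the class name `ResourcePool`, the representation of a gate as an ordered pair of unit names, and the linear chain of units are all hypothetical and are not taken from the embodiment.

```python
class ResourcePool:
    """Hardware inference units joined by gates that are connected or disconnected."""

    def __init__(self, units, gates):
        self.units = set(units)
        # Each gate joins an ordered pair of units; False means disconnected.
        self.gate_state = {gate: False for gate in gates}

    def configure(self, connections):
        """Connect exactly the gates listed in the model configuration."""
        unknown = set(connections) - set(self.gate_state)
        if unknown:
            raise ValueError(f"no gate for {sorted(unknown)}")
        for gate in self.gate_state:
            self.gate_state[gate] = gate in connections

    def analyzer_path(self, start):
        """Follow connected gates from `start` to list the units of the analyzer."""
        path = [start]
        while True:
            nxt = [b for (a, b), on in self.gate_state.items()
                   if on and a == path[-1] and b not in path]
            if not nxt:
                return path
            path.append(nxt[0])


pool = ResourcePool(
    units=["CNN", "RNN", "FCL", "SVM"],
    gates=[("CNN", "RNN"), ("RNN", "FCL"), ("CNN", "FCL"), ("FCL", "SVM")],
)
pool.configure({("CNN", "RNN"), ("RNN", "FCL")})
path = pool.analyzer_path("CNN")
```

Reconfiguring the same pool with a different set of connected gates yields a different analyzer without changing the units themselves, which is the flexibility the paragraph attributes to the resource pool.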
[0086] Please refer to Figure 2, which shows a schematic diagram of a hardware inference resource pool provided in an embodiment of this application. Figure 2 illustrates an example of a hardware inference resource pool comprising m×n hardware inference units, where m and n are both positive integers. The m×n hardware inference units include: hardware inference unit 11, hardware inference unit 12, hardware inference unit 13, …, hardware inference unit 1n; hardware inference unit 21, hardware inference unit 22, hardware inference unit 23, …, hardware inference unit 2n; …; hardware inference unit m1, hardware inference unit m2, hardware inference unit m3, …, hardware inference unit mn. Each of the m×n hardware inference units implements a specific inference function. Any two hardware inference units may have different inference functions, or some hardware inference units may have the same inference function (e.g., some hardware inference units may be identical). The m×n hardware inference units can be connected via gates. The communication chip is used to: construct an inference analyzer based on at least one of the m×n hardware inference units according to the configuration information of the inference model, and use the inference analyzer to process the data stream.
[0087] In optional embodiments, the hardware inference units in the hardware inference resource pool include at least one of the following: support vector machine (SVM) inference units, decision tree (DT) inference units, recurrent neural network (RNN) inference units, convolutional neural network (CNN) inference units, fully connected layer (FCL) inference units, recurrent layer inference units, naive Bayes (NB) inference units, random forest (RF) inference units, multilayer perceptron (MLP) inference units, k-nearest neighbors (KNN) inference units, and clustering inference units. For example, the hardware inference units in the hardware inference resource pool include SVM inference units, DT inference units, RNN inference units, CNN inference units, FCL inference units, recurrent layer inference units, NB inference units, RF inference units, MLP inference units, KNN inference units, and clustering inference units, as shown in Figure 3.
[0088] It should be noted that, for the sake of brevity, Figures 2 and 3 do not show the gates used to connect the hardware inference units. The gates in this embodiment can be implemented in software or hardware, and may include multiplexers, demultiplexers, etc. Figure 3 illustrates an example where the hardware inference units in the hardware inference resource pool are different; in other embodiments, the hardware inference resource pool includes multiple identical hardware inference units. Furthermore, besides the hardware inference units shown in Figure 3, the hardware inference resource pool may also include any other possible hardware inference units. This embodiment sets up a hardware inference resource pool in the communication chip, enabling the communication chip to flexibly construct various inference analyzers based on the configuration information of various possible inference models to perform inference analysis on the data stream. This hardware inference resource pool provides the communication chip with a larger and more flexible selection space when constructing inference analyzers, improving the effectiveness of using inference analyzers to perform inference analysis on the data stream.
[0089] In an optional embodiment, as shown in Figure 1, the communication chip includes an AI module. The AI module includes the aforementioned hardware inference resource pool (not shown in Figure 1). The communication chip processes the data stream using an inference analyzer, including: the AI module using the inference analyzer to perform inference analysis on the data stream to obtain inference analysis results; and the message processor and / or traffic manager performing processing operations related to the inference analysis results. For example, the AI module constructs an inference analyzer based on the hardware inference resource pool according to the configuration information of the inference model, and then uses the inference analyzer to perform inference analysis on the data stream.
[0090] In this embodiment, the hardware inference unit is also referred to as a hardware AI unit or a hardware machine learning unit; the inference function is also referred to as an AI function or a machine learning function; the inference model is also referred to as an AI model or a machine learning model; and the inference analyzer is also referred to as an AI analyzer or a machine learning analyzer. In this embodiment, the inference analyzer and the inference model have a mapping relationship. The inference analyzer is the mapping of the inference model onto the hardware inference resource pool, or in other words, the inference analyzer is the embodiment of the inference model in the hardware inference resource pool. The inference analyzer includes at least one hardware inference unit in the hardware inference resource pool. The communication chip uses the inference analyzer to process the data stream, which is equivalent to using the AI function to process the data stream. The hardware inference unit in this embodiment is a dedicated hardware inference unit and can be implemented using dedicated hardware circuits, which gives it a fast processing speed and high processing efficiency. Therefore, using an inference analyzer that includes a hardware inference unit to process the data stream is fast and efficient.
[0091] For ease of description, the following explanation will use the example of a communication chip using AI functionality to process the first data stream.
[0092] The communication chip is used to: receive a first message belonging to a first data stream; determine a first inference model based on the feature information of the first message, the first inference model being used to instruct the use of a first inference analyzer to process the first data stream, the first inference analyzer including at least one hardware inference unit in a hardware inference resource pool; and process the first data stream using the first inference analyzer. The first inference analyzer and the first inference model have a mapping relationship; the first inference analyzer is the mapping of the first inference model onto the hardware inference resource pool, or in other words, the first inference analyzer is the embodiment of the first inference model in the hardware inference resource pool. Therefore, the communication chip using the first inference analyzer to process the first data stream can be understood as the communication chip using the first inference model, as mapped onto the hardware inference resource pool, to process the first data stream, that is, using AI functionality to process the first data stream.
[0093] In an optional embodiment, the communication chip is further configured to: after determining the first inference model based on the feature information of the first message, construct a first inference analyzer based on the configuration information of the first inference model and a hardware inference resource pool to map the first inference model to the hardware inference resource pool; then, use the first inference analyzer to process the first data stream. In a specific embodiment, the first inference model includes at least one inference component, and the configuration information of the first inference model includes a mapping relationship between the at least one inference component and the at least one hardware inference unit included in the first inference analyzer. For example, the mapping relationship between the at least one inference component and the at least one hardware inference unit is a one-to-one mapping relationship. The communication chip is configured to: determine the at least one hardware inference unit based on the mapping relationship between the at least one inference component and the at least one hardware inference unit, and construct the first inference analyzer based on the at least one hardware inference unit. In one embodiment, the at least one inference component is a single inference component, and the at least one hardware inference unit is a single hardware inference unit. The configuration information of the first inference model also includes parameter configuration information of the single inference component. The communication chip is configured to: configure the inference parameters of the single hardware inference unit based on the parameter configuration information of the single inference component; thereby realizing the construction of the first inference analyzer based on the single hardware inference unit. 
In another embodiment, the at least one inference component is a plurality of inference components, and the at least one hardware inference unit is a plurality of hardware inference units. The configuration information of the first inference model further includes connectivity information of the plurality of inference components and parameter configuration information of the plurality of inference components. The connectivity information of the plurality of inference components is used to indicate how the plurality of inference components are connected. The communication chip is used to: configure the plurality of hardware inference units to be connected in the manner indicated by the connectivity information of the plurality of inference components, and configure the inference parameters of the hardware inference unit corresponding to each inference component according to the parameter configuration information of that inference component; thereby realizing the construction of a first inference analyzer based on the plurality of hardware inference units. For example, the hardware inference units in the hardware inference resource pool are connected through gates (also called selectors); the communication chip controls the gating state of each gate used to connect the plurality of hardware inference units, so that the plurality of hardware inference units are connected in the manner indicated by the connectivity information of the plurality of inference components.
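To make the role of the configuration information more concrete, the following minimal sketch shows one way the three parts named above (the component-to-unit mapping, the connectivity information, and the parameter configuration information) could be combined. The `ModelConfig` structure, the unit names such as `CNN-1`, and the parameter keys are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    mapping: dict       # inference component -> hardware inference unit
    params: dict        # inference component -> parameter configuration
    connectivity: list = field(default_factory=list)  # (component, component) pairs


def build_analyzer(config, pool_units):
    """Return the configured units and the unit-level connections of the analyzer."""
    units = {}
    for component, unit in config.mapping.items():
        if unit not in pool_units:
            raise ValueError(f"unit {unit!r} is not in the resource pool")
        units[unit] = config.params.get(component, {})
    # Two components mapped to one unit would violate the one-to-one mapping.
    if len(units) != len(config.mapping):
        raise ValueError("mapping is not one-to-one")
    # Translate component-level connectivity into unit-level connections.
    links = [(config.mapping[a], config.mapping[b]) for a, b in config.connectivity]
    return {"units": units, "links": links}


config = ModelConfig(
    mapping={"conv": "CNN-1", "dense": "FCL-1"},
    params={"conv": {"kernel": 3}, "dense": {"outputs": 2}},
    connectivity=[("conv", "dense")],
)
analyzer = build_analyzer(config, pool_units={"CNN-1", "FCL-1", "SVM-1"})
```

The single-component case described earlier in this paragraph corresponds to a mapping with one entry and an empty connectivity list.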
[0094] In an optional embodiment, the communication chip processes the first data stream using a first inference analyzer, including: the communication chip processes the first data stream using the first inference analyzer based on relevant state information of the first data stream. For example, the communication chip is configured to: perform inference analysis on the first data stream using the first inference analyzer to obtain the inference analysis result of the first data stream, and perform processing operations related to the inference analysis result of the first data stream. In a specific embodiment, the communication chip is configured to: perform inference analysis on the first data stream using the first inference analyzer based on relevant state information of the first data stream to obtain the inference analysis result of the first data stream, and then perform processing operations related to the inference analysis result of the first data stream.
[0095] The relevant status information of the first data stream includes at least one of the following: status information of the first data stream; status information of data streams related to the first data stream; and status information of resources related to the first data stream. The first data stream and its related data streams satisfy at least one of the following: the first data stream and its related data streams compete for cache resources; the first data stream and its related data streams compete for queue resources; the first data stream and its related data streams compete for bandwidth resources; or the first data stream and its related data streams compete for processing resources. For example, if packets from the first data stream and packets from its related data streams need to be cached in the same cache space, then the first data stream and its related data streams compete for cache resources. If packets from the first data stream and packets from its related data streams need to be cached in the same packet queue, then the first data stream and its related data streams compete for queue resources. If packets from the first data stream and packets from its related data streams need to be forwarded through the same port, then the first data stream and its related data streams compete for the bandwidth resources of that port. If packets from the first data stream and packets from its related data streams need to be processed by the same processing resource, then the first data stream and its related data streams compete for processing resources. The above conditions are merely examples. The first data stream and its related data streams may also compete for other resources, such as table entry resources, or satisfy other conditions; this embodiment does not limit these conditions.
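The notion of a related data stream can be illustrated with the following sketch, which treats two streams as related when they use the same cache space, packet queue, port, or processing resource. The resource identifiers and the dictionary layout are assumptions made for this example only.

```python
RESOURCE_KINDS = ("buffer", "queue", "port", "processor")


def related_streams(first_id, usage):
    """Map each related stream to the resource kinds it shares with the first stream."""
    mine = usage[first_id]
    related = {}
    for stream_id, resources in usage.items():
        if stream_id == first_id:
            continue
        shared = [kind for kind in RESOURCE_KINDS
                  if resources.get(kind) is not None
                  and resources.get(kind) == mine.get(kind)]
        if shared:
            related[stream_id] = shared
    return related


# Hypothetical resource assignment for three data streams.
usage = {
    "flow-1": {"buffer": "buf0", "queue": "q3", "port": 7, "processor": "core0"},
    "flow-2": {"buffer": "buf0", "queue": "q4", "port": 7, "processor": "core1"},
    "flow-3": {"buffer": "buf1", "queue": "q5", "port": 2, "processor": "core2"},
}
competitors = related_streams("flow-1", usage)
```

Here flow-2 competes with flow-1 for cache and bandwidth resources, while flow-3 shares nothing with flow-1 and is therefore not a related data stream in this sketch.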
[0096] For any data stream among the first data stream and its related data streams, the status information of that data stream includes at least one of the following: the message information of that data stream; the forwarding information of that data stream; the traffic statistics information of that data stream; and the historical inference analysis results of that data stream obtained from a remote device. The message information of any data stream refers to the information of the message itself, such as the information carried in the message. This message information is also called the original message information, which includes, but is not limited to, message size, message header information, and payload information. The message header information includes the destination address and protocol number. The forwarding information for any data stream includes various possible intermediate process information generated and / or used by the communication chip during the pipelined processing of the packets in that data stream. This intermediate process information includes, but is not limited to, tunnel information, information about the outgoing port of the packet determined based on the destination address of the packet, the processing priority of the packet within the communication chip determined based on the original message information, and information about the packet queue used to buffer the packets of that data stream. The traffic statistics information for any data stream is statistical information obtained by performing traffic statistics on that data stream. This includes, but is not limited to, the number of packets of that data stream passing through the communication chip within a specified duration (or time period), the rate of that data stream, the packet loss rate of that data stream, the latency of that data stream, and other possible statistical information. 
The traffic statistics information for any data stream includes historical traffic statistics and real-time traffic statistics. The historical inference analysis result of any data stream is the inference analysis result obtained by performing inference analysis on that data stream at a historical moment (or within a historical time period). The historical inference analysis result of any data stream includes, but is not limited to, the maximum latency of that data stream within the historical time period, the rate of that data stream within the historical time period, and the packet loss rate of that data stream within the historical time period. In this embodiment, after the communication chip performs inference analysis on any data stream, it can send the inference analysis result of that data stream to a remote device, so that the remote device can store the inference analysis result. In this way, during subsequent inference analysis of that data stream and / or related data streams, the communication chip can obtain the historical inference analysis result of that data stream from the remote device and use it as an additional input to the inference analysis, thereby improving the accuracy of the inference analysis. The remote device can be a controller or a server, etc.
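As an illustration of the traffic statistics information mentioned above, the following sketch accumulates per-stream counters and derives a rate and a packet loss rate from them. The field names and the way drops are reported are assumptions; the embodiment does not prescribe how these statistics are computed.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    packets: int = 0      # packets forwarded for this data stream
    byte_count: int = 0   # bytes forwarded for this data stream
    dropped: int = 0      # packets dropped for this data stream

    def record(self, size, dropped=False):
        if dropped:
            self.dropped += 1
        else:
            self.packets += 1
            self.byte_count += size

    def rate_bps(self, window_s):
        """Average rate in bits per second over a window of `window_s` seconds."""
        if window_s <= 0:
            raise ValueError("window must be positive")
        return self.byte_count * 8 / window_s

    def loss_rate(self):
        total = self.packets + self.dropped
        return self.dropped / total if total else 0.0


stats = FlowStats()
for _ in range(3):
    stats.record(size=1000)
stats.record(size=1000, dropped=True)
```

With three forwarded packets of 1000 bytes and one dropped packet, the rate over a one-second window is 24000 bits per second and the packet loss rate is 0.25.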
[0097] The status information of resources related to the first data stream includes at least one of the following: statistical information on cache resources used to cache the first data stream; statistical information on queue resources used to cache the first data stream; statistical information on bandwidth resources used to forward the first data stream; and statistical information on processing resources used to process packets of the first data stream. The statistical information on cache resources used to cache the first data stream includes the utilization rate of the cache resources. The statistical information on queue resources used to cache the first data stream includes the utilization rate of the packet queue used to cache the first data stream and the latency of the packet queue used to cache the first data stream. The statistical information on bandwidth resources used to forward the first data stream includes the bandwidth utilization rate of the port used to forward the first data stream. The statistical information on processing resources used to process packets of the first data stream includes the utilization rate of the processing resources used to process packets of the first data stream. In optional embodiments, the status information of resources related to the first data stream may also include any possible information such as the latency of the port used to forward the first data stream; this embodiment of the application does not limit this.
[0098] In an optional embodiment, the communication chip is configured to: determine a first inference mode based on the feature information of a first message; determine a first inference model based on the first inference mode; construct a first inference analyzer based on a hardware inference resource pool according to the configuration information of the first inference model; perform inference analysis on a first data stream using the first inference model to obtain the inference analysis result of the first data stream; determine processing operations related to the inference analysis result of the first data stream; and execute processing operations related to the inference analysis result of the first data stream. For example, as shown in Figure 1, the communication chip includes an AI module, a message processor, and a traffic manager. The AI module includes a hardware inference resource pool. The message processor is configured to: determine a first inference mode based on the feature information of a first message; the AI module is configured to: determine a first inference model based on the first inference mode; construct a first inference analyzer based on the configuration information of the first inference model using the hardware inference resource pool; perform inference analysis on the first data stream using the first inference model according to the relevant status information of the first data stream to obtain the inference analysis result of the first data stream; and determine processing operations related to the inference analysis result of the first data stream. The message processor and / or traffic manager are configured to: execute processing operations related to the inference analysis result of the first data stream.
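The sequence of steps in this embodiment (feature information, inference mode, inference model, inference analyzer, inference analysis result, processing operation) can be sketched end to end as follows. Everything concrete here is an assumption made for illustration: the lookup tables, the use of protocol and destination port as feature information, the port number 5000, and the threshold-based function that stands in for a hardware inference analyzer.

```python
def congestion_analyzer(state):
    """Stand-in for a hardware analyzer: a single threshold on queue utilization."""
    return "congested" if state["queue_utilization"] > 0.8 else "normal"


# Hypothetical tables: feature information -> inference mode -> inference model.
MODE_TABLE = {("udp", 5000): "congestion"}
MODEL_TABLE = {
    "congestion": {
        "build": lambda: congestion_analyzer,  # stands in for configuring the pool
        "operations": {
            "congested": "modify traffic management policy",
            "normal": "forward normally",
        },
    },
}


def process(feature_info, state_info):
    """Return (inference analysis result, processing operation) for one message."""
    mode = MODE_TABLE.get((feature_info["protocol"], feature_info["dst_port"]))
    model = MODEL_TABLE.get(mode)
    if model is None:
        return None, "forward normally"   # no inference configured for this stream
    analyzer = model["build"]()           # construct the analyzer from the model
    result = analyzer(state_info)         # inference analysis on the status information
    return result, model["operations"][result]


outcome = process({"protocol": "udp", "dst_port": 5000}, {"queue_utilization": 0.9})
```

In the chip described here, the first step would be carried out by the message processor, the middle steps by the AI module, and the resulting operation by the message processor and / or traffic manager.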
[0099] Please refer to Figure 4, which shows a schematic diagram of a communication chip provided in an embodiment of this application. The communication chip includes an AI module, a message processor, and a traffic manager. The message processor includes a message parsing unit, an inference mode determination unit, and a forwarding and editing unit. The AI module includes an inference module, a flow statistics unit, and a resource statistics unit.
[0100] The message parsing unit is used to parse the messages received by the communication chip to obtain the message information of each message. For example, the message parsing unit is used to parse the first message belonging to the first data stream received by the communication chip to obtain the message information of the first message. The message information includes the feature information of the message, as well as message header information and the like. The message information is also called packet-level status information.
[0101] The flow statistics unit is used to acquire and store the flow statistics information of each data stream received by the communication chip. For example, the flow statistics unit is used to: perform flow statistics on each data stream based on the parsing results of the message parsing unit for each data stream received by the communication chip, and obtain the flow statistics information of each data stream; and / or, the flow statistics unit is used to: acquire the flow statistics information of each data stream from a remote device. The flow statistics information is also called flow-level status information.
[0102] The resource statistics unit interacts with the traffic manager to collect statistics on the resources related to each data stream received by the communication chip, including cache resources, queue resources, bandwidth resources, and packet processing resources, to obtain statistical information about the resources related to each data stream. In other words, the resource statistics unit interacts with the traffic manager to obtain the status information of the resources related to each data stream received by the communication chip. This status information is also called resource status information.
[0103] The inference mode determination unit is used to determine the first inference mode (also known as the inference mode of the first data stream) based on the feature information of the first message obtained by the message parsing unit from parsing the first message belonging to the first data stream.
[0104] The inference module is used to: determine a first inference model (the first inference model is an inference model used for inference analysis of the first data stream) based on the first inference mode determined by the inference mode determination unit; construct a first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model; obtain relevant status information of the first data stream according to the configuration information of the first inference model; perform inference analysis on the first data stream using the first inference analyzer according to the relevant status information of the first data stream, and obtain the inference analysis result of the first data stream; and determine the processing operations related to the inference analysis result of the first data stream. Take as an example the case where the relevant status information of the first data stream includes the traffic statistics information of the first data stream (i.e., flow-level status information), the status information of resources related to the first data stream (i.e., resource status information), and the message information of the first data stream (i.e., packet-level status information). In this case, the inference module is used to: according to the configuration information of the first inference model, obtain the traffic statistics information of the first data stream from the flow statistics unit, obtain the status information of resources related to the first data stream from the resource statistics unit, and obtain the message information of the first data stream from the message parsing unit; and use the first inference analyzer to perform inference analysis on the first data stream based on these three kinds of information, to obtain the inference analysis result of the first data stream.
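The way the inference module collects only the status information that the model configuration asks for can be sketched as below. The three source callables stand in for the flow statistics unit, the resource statistics unit, and the message parsing unit; their names, return values, and the `required` list are hypothetical.

```python
def gather_state(stream_id, required, sources):
    """Collect only the status categories named in the model configuration."""
    missing = [kind for kind in required if kind not in sources]
    if missing:
        raise KeyError(f"no source for {missing}")
    return {kind: sources[kind](stream_id) for kind in required}


# Hypothetical stand-ins for the three units that hold status information.
sources = {
    "flow": lambda sid: {"rate_bps": 24000.0},            # flow statistics unit
    "resource": lambda sid: {"queue_utilization": 0.9},   # resource statistics unit
    "packet": lambda sid: {"size": 1000},                 # message parsing unit
}
state = gather_state("flow-1", required=["flow", "resource"], sources=sources)
```

A model whose configuration lists only flow-level and resource status information, as in this call, never queries the message parsing unit.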
[0105] The processing operations related to the inference analysis results of the first data stream include at least one of the following: editing the packets of the first data stream; modifying the traffic management policy of the first data stream; modifying the inference analysis policy of the first data stream; and announcing the inference analysis results of the first data stream to a remote device. The inference module and / or the forwarding editing unit and / or the traffic manager are used to perform the processing operations related to the inference analysis results of the first data stream. For example, the processing operations related to the inference analysis results of the first data stream include editing the packets of the first data stream; the forwarding editing unit is used to edit the packets of the first data stream (e.g., the first packet); and the traffic manager is used to manage the packets processed by the forwarding editing unit, including but not limited to packet buffering, packet queuing, packet scheduling, and sending packets through the egress port. For example, processing operations related to the inference analysis results of the first data stream include modifying the traffic management policy of the first data stream; the traffic manager is used to modify the traffic management policy of the first data stream (in this case, the inference module can transmit information about the processing operations related to the inference analysis results of the first data stream to the traffic manager through the forwarding and editing unit, or the inference module can directly transmit information about the processing operations related to the inference analysis results of the first data stream to the traffic manager. 
If the inference module directly transmits information about the processing operations related to the inference analysis results of the first data stream to the traffic manager, the inference module is also connected to the traffic manager). For another example, processing operations related to the inference analysis results of the first data stream include modifying the inference analysis policy of the first data stream; the inference module is also used to modify the inference analysis policy of the first data stream. For yet another example, processing operations related to the inference analysis results of the first data stream include announcing the inference analysis results of the first data stream to a remote device; the inference module is also used to announce the inference analysis results of the first data stream to a remote device (in this case, the inference module is also connected to the remote device). When the processing operations related to the inference analysis results of the first data stream do not involve editing the packets of the first data stream, the forwarding and editing unit is used to perform routine forwarding and editing operations on the packets of the first data stream, and the traffic manager is used to manage the packets processed by the forwarding and editing unit.
[0106] In an optional embodiment, the forwarding and editing unit is used to perform pipelined processing on the packets. During the pipelined processing of the packets of the first data stream, the forwarding and editing unit may generate and / or use various possible intermediate process information (i.e., forwarding information of the first data stream). The relevant status information of the first data stream also includes the forwarding information of the first data stream. In a specific embodiment, the relevant status information of the first data stream includes the traffic statistics information of the first data stream, the status information of the resources related to the first data stream, the packet information of the first data stream, and the forwarding information of the first data stream. The inference module is used to: perform inference analysis on the first data stream using a first inference analyzer based on the traffic statistics information of the first data stream, the status information of the resources related to the first data stream, the packet information of the first data stream, and the forwarding information of the first data stream.
[0107] In an optional embodiment, the relevant status information of the first data stream further includes historical inference analysis results of the first data stream obtained from a remote device. The inference module is used to perform inference analysis on the first data stream by combining the historical inference analysis results of the first data stream. In a specific embodiment, the relevant status information of the first data stream includes traffic statistics information of the first data stream, status information of resources related to the first data stream, packet information of the first data stream, forwarding information of the first data stream, and historical inference analysis results of the first data stream obtained from a remote device. The inference module is used to: perform inference analysis on the first data stream using a first inference analyzer based on the traffic statistics information of the first data stream, the status information of resources related to the first data stream, packet information of the first data stream, forwarding information of the first data stream, and historical inference analysis results of the first data stream.
[0108] In an optional embodiment, Figure 5 is a schematic diagram of another communication chip provided in an embodiment of this application. As shown in Figure 5, the inference module includes a selection unit, a hardware inference resource pool, a configuration unit, and a lookup unit (also referred to below as the search unit). The description will continue using the processing of the first data stream as an example.
[0109] The inference mode determination unit is used to determine a first inference mode based on the feature information of the first message obtained by the message parsing unit from the first message belonging to the first data stream. The configuration unit is used to: determine a first inference model based on the first inference mode determined by the inference mode determination unit; and construct a first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model.
[0110] The selection unit is used to: determine a first inference model based on the first inference mode determined by the inference mode determination unit; obtain relevant status information of the first data stream based on the configuration information of the first inference model; input the relevant status information of the first data stream into the first inference analyzer, so that the first inference analyzer performs inference analysis on the first data stream based on the relevant status information of the first data stream to obtain the inference analysis result of the first data stream, and outputs the inference analysis result of the first data stream to the lookup unit. Taking the relevant status information of the first data stream, including the traffic statistics information (i.e., flow-level status information), the status information of resources related to the first data stream (i.e., resource status information), and the packet information (i.e., packet-level status information) of the first data stream, as an example, the selection unit is used to: obtain the traffic statistics information of the first data stream from the flow statistics unit, obtain the status information of resources related to the first data stream from the resource statistics unit, and obtain the packet information of the first data stream from the packet parsing unit according to the configuration information of the first inference model; input the traffic statistics information of the first data stream, the status information of resources related to the first data stream, and the packet information of the first data stream into the first inference analyzer, so that the first inference analyzer performs inference analysis on the first data stream based on the traffic statistics information of the first data stream, the status information of resources related to the first data stream, and the packet information of the first data stream to obtain the inference analysis 
result of the first data stream, and output the inference analysis result of the first data stream to the lookup unit.
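As a non-limiting illustration, the gathering step performed by the selection unit can be sketched in software as follows. The sketch assumes that the configuration information of an inference model lists, in order, which field to take from which unit; the unit names, field names, values, and the `select_status_info` helper are hypothetical and do not come from this application.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot of what each on-chip unit can provide for one data stream.
SOURCES: Dict[str, Dict[str, float]] = {
    "flow_statistics": {"packet_count": 120.0, "byte_rate": 3.5e6},
    "resource_statistics": {"queue_depth": 42.0},
    "packet_parsing": {"packet_length": 512.0, "tcp_syn": 1.0},
}

# Hypothetical configuration information: ordered (unit, field) pairs per model.
MODEL_INPUTS: Dict[str, List[Tuple[str, str]]] = {
    "first_inference_model": [
        ("flow_statistics", "byte_rate"),          # flow-level status information
        ("resource_statistics", "queue_depth"),    # resource status information
        ("packet_parsing", "packet_length"),       # packet-level status information
    ],
}


def select_status_info(model_id: str,
                       sources: Dict[str, Dict[str, float]]) -> List[float]:
    """Return the status values the model asks for, in the configured order."""
    return [sources[unit][name] for unit, name in MODEL_INPUTS[model_id]]


# The resulting list is what would be fed to the first inference analyzer.
selected = select_status_info("first_inference_model", SOURCES)
```

Keeping the input list in the model's configuration information, rather than in the selection logic, is what allows a different inference model to draw a different subset of the same candidate information.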
[0111] The lookup unit is used to determine the processing operations related to the inference analysis results of the first data stream. For example, the lookup unit determines these processing operations based on the inference analysis results of the first data stream and a first operation instruction table. The first operation instruction table corresponds to the first inference model and includes the inference analysis results of the first data stream and the operation instruction information corresponding to those results; the operation instruction information indicates the processing operations related to the inference analysis results of the first data stream. In an optional embodiment, the configuration unit is further used to: after determining the first inference model according to the first inference mode determined by the inference mode determination unit, send indication information of the first inference model to the lookup unit. The lookup unit is used to: determine the first operation instruction table corresponding to the first inference model based on the indication information of the first inference model, and determine the processing operations related to the inference analysis results of the first data stream based on the inference analysis results of the first data stream and the first operation instruction table.
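As a non-limiting illustration, the two-step lookup described above (first select the operation instruction table by inference model, then map the inference analysis result to a processing operation) might be sketched as follows. The table contents, the result labels, and the choice to return `None` when a result has no entry are assumptions made for this sketch; the operation names merely echo the four kinds of processing operations listed earlier.

```python
from typing import Dict, Optional

# Hypothetical operation instruction tables, one per inference model.
# Keys are inference analysis results; values name a processing operation.
OPERATION_TABLES: Dict[str, Dict[str, str]] = {
    "first_inference_model": {
        "result_1": "edit_packets",
        "result_2": "modify_traffic_management_policy",
        "result_3": "modify_inference_analysis_policy",
        "result_4": "announce_to_remote_device",
    },
}


def lookup_operation(model_id: str, result: str) -> Optional[str]:
    """Pick the table for the model, then look up the operation for the result.

    Returning None for an unknown result is an assumption of this sketch.
    """
    table = OPERATION_TABLES[model_id]   # selected via the model's indication information
    return table.get(result)


operation = lookup_operation("first_inference_model", "result_2")
```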
[0112] As an example, the first data stream is data stream 1, the first inference mode is inference mode A, and the first inference model is inference model A. Figure 6 is a schematic diagram of inference model A. As shown in Figure 6, inference model A includes an FCL inference component and a loop layer inference component. The FCL inference component and the loop layer inference component are connected sequentially. The FCL inference component is the input inference component of inference model A, used to input information into inference model A. The loop layer inference component is the output inference component of inference model A, used to output the inference analysis results obtained by inference model A. Please refer to Figure 7, which shows a schematic diagram of using inference analyzer A to process data stream 1 according to an embodiment of this application. Inference analyzer A and inference model A have a mapping relationship; inference analyzer A is the embodiment of inference model A in the hardware inference resource pool. Referring to Figure 7 and in conjunction with Figure 5, the inference mode determination unit determines inference mode A based on the feature information of message 1 obtained by the message parsing unit from message 1 belonging to data stream 1 (e.g., the first message belonging to the first data stream).
[0113] The configuration unit determines inference model A based on inference mode A. The configuration unit then constructs inference analyzer A based on the hardware inference resource pool, using the configuration information of inference model A. In a specific embodiment, the configuration information of inference model A includes: the mapping relationship between FCL inference components and FCL inference units, the mapping relationship between loop layer inference components and loop layer inference units, the connectivity information between the FCL inference components and the loop layer inference components, the parameter configuration information of the FCL inference components, and the parameter configuration information of the loop layer inference components. The configuration unit determines the FCL inference unit in the hardware inference resource pool based on the mapping relationship between the FCL inference component and the FCL inference unit; the configuration unit determines the loop layer inference unit in the hardware inference resource pool based on the mapping relationship between the loop layer inference component and the loop layer inference unit; the configuration unit configures the FCL inference unit and the loop layer inference unit to be connected according to the connection relationship indicated by the connection relationship information between the FCL inference component and the loop layer inference component; the configuration unit configures the inference parameters of the FCL inference unit according to the parameter configuration information of the FCL inference component, so that the parameter values of each inference parameter in the FCL inference unit are the same as the parameter values of the corresponding inference parameters in the FCL inference component; the configuration unit configures the inference parameters of the loop layer inference unit according to the parameter configuration information of the loop layer 
inference component, so that the parameter values of each inference parameter in the loop layer inference unit are the same as the parameter values of the corresponding inference parameters in the loop layer inference component. Since the FCL inference component is the input inference component of inference model A, and the loop layer inference component is the output inference component of inference model A, the FCL inference unit is the input inference unit of inference analyzer A, and the loop layer inference unit is the output inference unit of inference analyzer A. The FCL inference unit is used to input information to inference analyzer A, and the loop layer inference unit is used to output the inference analysis results obtained by inference analyzer A. The configuration unit also configures the FCL inference unit to be connected to the selection unit and the loop layer inference unit to be connected to the search unit.
[0114] The selection unit determines the inference model A based on inference mode A. Based on the configuration information of inference model A, the selection unit obtains the relevant status information X4, X8, and X9 of data flow 1 from candidate information X1, X2, X3… (referring to Figure 5, for example, candidate information X1, X2, X3… is jointly provided by the flow statistics unit, resource statistics unit, and message parsing unit). The selection unit inputs the relevant status information X4, X8, and X9 of data flow 1 into the inference analyzer A (specifically, into the FCL inference unit in this embodiment). The FCL inference unit and the loop layer inference unit in inference analyzer A sequentially perform inference analysis based on the relevant status information X4, X8, and X9 of data flow 1 to obtain the inference analysis result A of data flow 1. The loop layer inference unit in inference analyzer A outputs the inference analysis result A to the lookup unit. The lookup unit determines the processing operation A related to the inference analysis result A. For example, after the configuration unit determines the inference model A based on the inference mode A, it sends the instruction information of the inference model A to the search unit. The search unit determines the operation instruction table corresponding to the inference model A based on the instruction information of the inference model A. The search unit determines the processing operation A related to the inference analysis result A based on the inference analysis result A and the operation instruction table corresponding to the inference model A.
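As a non-limiting illustration, constructing inference analyzer A from the hardware inference resource pool and running it on X4, X8, and X9 can be modelled in software as follows. This is a sketch under stated assumptions: the `HardwareUnit` class, the unit identifiers, the shape of the configuration dictionary, and all parameter values are hypothetical, and the loop layer is assumed to behave like a minimal recurrent step, which this application does not specify.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HardwareUnit:
    """Software stand-in for one hardware inference unit in the resource pool."""
    kind: str                                   # "fcl" or "loop"
    params: Dict[str, list] = field(default_factory=dict)
    state: float = 0.0                          # used only by the loop layer unit

    def run(self, inputs: List[float]) -> List[float]:
        if self.kind == "fcl":
            # Fully connected layer: one output per weight row.
            return [sum(w * x for w, x in zip(row, inputs)) + b
                    for row, b in zip(self.params["weights"], self.params["bias"])]
        if self.kind == "loop":
            # Assumed recurrent step: combine the inputs with the stored state.
            total = sum(w * x for w, x in zip(self.params["w_in"], inputs))
            self.state = math.tanh(total + self.params["w_state"][0] * self.state)
            return [self.state]
        raise ValueError(f"unknown unit kind: {self.kind}")


def build_analyzer(pool: Dict[str, HardwareUnit], config: dict) -> List[HardwareUnit]:
    """Map components to units, copy parameters, and order units by connection."""
    mapping = config["mapping"]
    for component, unit_id in mapping.items():
        pool[unit_id].params = config["params"][component]
    next_of = dict(config["connections"])       # component -> following component
    # The input component is the one that no connection points to.
    current = next(c for c in mapping if c not in next_of.values())
    chain = []
    while current is not None:
        chain.append(pool[mapping[current]])
        current = next_of.get(current)
    return chain


def run_analyzer(chain: List[HardwareUnit], inputs: List[float]) -> List[float]:
    """Feed the inputs through the sequentially connected units."""
    data = inputs
    for unit in chain:
        data = unit.run(data)
    return data


pool = {"fcl_unit_0": HardwareUnit("fcl"), "loop_unit_0": HardwareUnit("loop")}
config_a = {
    "mapping": {"fcl_component": "fcl_unit_0", "loop_component": "loop_unit_0"},
    "connections": [("fcl_component", "loop_component")],
    "params": {
        "fcl_component": {"weights": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
                          "bias": [0.0, 0.0]},
        "loop_component": {"w_in": [0.5, 0.5], "w_state": [0.0]},
    },
}
analyzer_a = build_analyzer(pool, config_a)
result_a = run_analyzer(analyzer_a, [1.0, 2.0, 3.0])   # stand-ins for X4, X8, X9
```

In this sketch the three pieces of configuration information play the same roles as in the description: the mapping selects units from the pool, the connections fix their order, and the parameters make each unit behave like its component.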
[0115] As another example, the first data stream is data stream 2, the first inference mode is inference mode B, and the first inference model is inference model B. Figure 8 is a schematic diagram of inference model B. As shown in Figure 8, inference model B includes a decision tree inference component. This decision tree inference component is the input inference component of inference model B, and it is also the output inference component of inference model B. This decision tree inference component is used to input information into inference model B, and it is used to output the inference analysis results obtained by inference model B through inference analysis. Please refer to Figure 9, which shows a schematic diagram of using inference analyzer B to process data stream 2 according to an embodiment of this application. Inference analyzer B and inference model B have a mapping relationship, and inference analyzer B is the embodiment of inference model B in the hardware inference resource pool. Referring to Figure 9 and in conjunction with Figure 5, the inference mode determination unit determines the inference mode B based on the feature information of message 2 obtained by the message parsing unit from message 2 belonging to data stream 2 (e.g., the first message belonging to the first data stream).
[0116] The configuration unit determines inference model B based on inference mode B. The configuration unit constructs inference analyzer B based on the hardware inference resource pool according to the configuration information of inference model B. In a specific embodiment, the configuration information of inference model B includes: the mapping relationship between the decision tree inference component and the decision tree inference unit, and the parameter configuration information of the decision tree inference component. The configuration unit determines the decision tree inference unit in the hardware inference resource pool according to the mapping relationship between the decision tree inference component and the decision tree inference unit; the configuration unit configures the inference parameters of the decision tree inference unit according to the parameter configuration information of the decision tree inference component, so that the parameter values of each inference parameter in the decision tree inference unit are the same as the parameter values of the corresponding inference parameters in the decision tree inference component. Since the decision tree inference component is both the input and output inference component of inference model B, the decision tree inference unit is both the input and output inference unit of inference analyzer B. The decision tree inference unit is used to input information into inference analyzer B and output the inference analysis results obtained by inference analyzer B. The configuration unit further configures the decision tree inference unit to be connected to the selection unit, and the configuration unit further configures the decision tree inference unit to be connected to the search unit.
[0117] The selection unit determines the inference model B based on the inference mode B. Based on the configuration information of the inference model B, the selection unit obtains the relevant status information X1, X6, and X7 of data flow 2 from the candidate information X1, X2, X3… (referring to Figure 5, for example, the candidate information X1, X2, X3… is jointly provided by the flow statistics unit, resource statistics unit, and message parsing unit). The selection unit inputs the relevant status information X1, X6, and X7 of data flow 2 into the inference analyzer B (specifically, into the decision tree inference unit in this embodiment). The decision tree inference unit in the inference analyzer B performs inference analysis based on the relevant status information X1, X6, and X7 of data flow 2 to obtain the inference analysis result B of data flow 2. This decision tree inference unit outputs the inference analysis result B to the search unit. The search unit determines the processing operation B related to the inference analysis result B. For example, after the configuration unit determines the inference model B based on the inference mode B, it sends the instruction information of the inference model B to the search unit. The search unit determines the operation instruction table corresponding to the inference model B based on the instruction information of the inference model B. The search unit determines the processing operation B related to the inference analysis result B based on the inference analysis result B and the operation instruction table corresponding to the inference model B.
[0118] In an optional embodiment, data stream 2 is a Transmission Control Protocol (TCP) stream, and message 2 is a synchronization (SYN) message. The feature information of message 2 includes the transport layer protocol type of message 2, which is TCP. Information X1 is the TCP SYN flag carried by message 2, information X6 is the number of SYN messages of data stream 2 passing through the communication chip within a specified time period, and information X7 is the rate of SYN messages of data stream 2 passing through the communication chip within a specified time period. Inference analyzer B performs inference analysis based on the relevant status information X1, X6, and X7 of data stream 2 to obtain the inference analysis result B of data stream 2. The inference analysis result B indicates whether data stream 2 is a distributed denial-of-service (DDoS) attack stream. The processing operation B is either a drop operation or a send operation. For example, if the inference analysis result B indicates that data stream 2 is a DDoS attack stream, the processing operation B is a drop operation, and the message processor or the traffic manager drops the messages of data stream 2. If the inference analysis result B indicates that data stream 2 is not a DDoS attack stream, the processing operation B is a send operation: the message processor forwards the messages of data stream 2, and the traffic manager manages the messages of data stream 2 normally.
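As a non-limiting illustration, the decision tree analysis of X1, X6, and X7 and the resulting drop or send operation can be sketched as follows. The tree structure, the thresholds, and the result labels are hypothetical values chosen only to make the example concrete; this application does not disclose the parameters of the decision tree.

```python
from typing import Dict, Union

Tree = Union[str, dict]

# Hypothetical decision tree: each inner node tests one feature against a threshold.
TREE_B: Tree = {
    "feature": "X1", "threshold": 0.5,           # X1: TCP SYN flag (0 or 1)
    "low": "not_ddos",
    "high": {
        "feature": "X6", "threshold": 1000.0,    # X6: SYN count in the time window
        "low": "not_ddos",
        "high": {
            "feature": "X7", "threshold": 500.0, # X7: SYN rate in the time window
            "low": "not_ddos",
            "high": "ddos",
        },
    },
}

# Operation instruction table for inference model B, as described in the text.
OPERATIONS_B: Dict[str, str] = {"ddos": "drop", "not_ddos": "send"}


def classify(features: Dict[str, float], tree: Tree) -> str:
    """Walk from the root to a leaf, taking 'high' when the value exceeds the threshold."""
    node = tree
    while isinstance(node, dict):
        branch = "high" if features[node["feature"]] > node["threshold"] else "low"
        node = node[branch]
    return node


result_b = classify({"X1": 1.0, "X6": 5000.0, "X7": 2000.0}, TREE_B)
operation_b = OPERATIONS_B[result_b]
```

A stream with few SYN messages in the window, for example `{"X1": 1.0, "X6": 3.0, "X7": 0.1}`, reaches a `not_ddos` leaf and is therefore sent rather than dropped.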
[0119] As another example, the first data stream is data stream 3, the first inference mode is inference mode C, and the first inference model is inference model C. Figure 10 is a schematic diagram of inference model C. As shown in Figure 10, inference model C includes a CNN inference component. This CNN inference component is the input inference component of inference model C, and it is also the output inference component of inference model C. This CNN inference component is used to input information into inference model C, and it is used to output the inference analysis results obtained by inference model C. Please refer to Figure 11, which shows a schematic diagram of using an inference analyzer C to process data stream 3 according to an embodiment of this application. The inference analyzer C and the inference model C have a mapping relationship, and the inference analyzer C is the embodiment of inference model C in the hardware inference resource pool. Referring to Figure 11 and in conjunction with Figure 5, the inference mode determination unit determines the inference mode C based on the feature information of message 3 obtained by the message parsing unit from message 3 belonging to data stream 3 (e.g., the first message belonging to the first data stream).
[0120] The configuration unit determines the inference model C based on the inference mode C. The configuration unit constructs an inference analyzer C based on the configuration information of the inference model C and the hardware inference resource pool. In a specific embodiment, the configuration information of the inference model C includes: the mapping relationship between CNN inference components and CNN inference units, and the parameter configuration information of the CNN inference components. The configuration unit determines the CNN inference unit in the hardware inference resource pool based on the mapping relationship between the CNN inference components and the CNN inference units; the configuration unit configures the inference parameters of the CNN inference unit according to the parameter configuration information of the CNN inference component, so that the parameter values of each inference parameter in the CNN inference unit are the same as the parameter values of the corresponding inference parameters in the CNN inference component. Since the CNN inference component is both the input and output inference component of the inference model C, the CNN inference unit is also the input and output inference unit of the inference analyzer C. The CNN inference unit is used to input information into the inference analyzer C and output the inference analysis results obtained by the inference analyzer C. The configuration unit also configures the CNN inference unit to be connected to the selection unit, and the configuration unit also configures the CNN inference unit to be connected to the search unit.
[0121] The selection unit determines the inference model C based on the inference mode C. Based on the configuration information of the inference model C, the selection unit obtains the relevant status information X2, X3, and X5 of data flow 3 from the candidate information X1, X2, X3… (referring to Figure 5, for example, the candidate information X1, X2, X3… is jointly provided by the flow statistics unit, resource statistics unit, and message parsing unit). The selection unit inputs the relevant status information X2, X3, and X5 of data flow 3 into the inference analyzer C (specifically, into the CNN inference unit in this embodiment). The CNN inference unit in the inference analyzer C performs inference analysis based on the relevant status information X2, X3, and X5 of data flow 3 to obtain the inference analysis result C of data flow 3. This CNN inference unit outputs the inference analysis result C to the search unit. The search unit determines the processing operation C related to the inference analysis result C. For example, after the configuration unit determines the inference model C based on the inference mode C, it sends the indication information of the inference model C to the search unit. The search unit determines the operation instruction table corresponding to the inference model C based on the indication information of the inference model C. The search unit determines the processing operation C related to the inference analysis result C based on the inference analysis result C and the operation instruction table corresponding to the inference model C.
[0122] In an optional embodiment, data stream 3 is a User Datagram Protocol (UDP) stream. The feature information of message 3 includes the transport layer protocol type of message 3, which is UDP. Information X2 is the message length of message 3, information X3 is the rate of data stream 3, and information X5 is the message transmission interval of data stream 3. Inference analyzer C performs inference analysis based on the relevant status information X2, X3, and X5 of data stream 3 to obtain the inference analysis result C of data stream 3. The inference analysis result C indicates whether data stream 3 is a conference stream. The processing operation C is an operation that modifies the traffic management policy of data stream 3. For example, if the inference analysis result C indicates that data stream 3 is a conference stream, the processing operation C is to set the discard priority of the messages of data stream 3 to green, and the traffic manager sets the discard priority of the messages of data stream 3 to green. If the inference analysis result C indicates that data stream 3 is not a conference stream, the processing operation C is to set the discard priority of the messages of data stream 3 to yellow, and the traffic manager sets the discard priority of the messages of data stream 3 to yellow. Green indicates that messages should be discarded as little as possible. Yellow denotes a higher discard priority than green; that is, yellow messages are discarded before green messages.
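As a non-limiting illustration, the effect of the green and yellow discard priorities can be sketched with a simple two-threshold queue check. This application does not specify how the traffic manager applies the priorities; the queue capacity, the yellow limit, and the function names below are assumptions chosen only to show that yellow messages start being discarded before green ones.

```python
def color_for(is_conference_stream: bool) -> str:
    """Processing operation C: conference streams get green, others get yellow."""
    return "green" if is_conference_stream else "yellow"


def should_discard(color: str, queue_depth: int,
                   capacity: int = 100, yellow_limit: int = 60) -> bool:
    """Hypothetical congestion check with two thresholds.

    Yellow messages are discarded once the queue reaches yellow_limit;
    green messages are discarded only when the queue is full.
    """
    if color == "yellow":
        return queue_depth >= yellow_limit
    return queue_depth >= capacity


# With the queue at depth 80, only the non-conference (yellow) message is discarded.
conference_discarded = should_discard(color_for(True), queue_depth=80)
other_discarded = should_discard(color_for(False), queue_depth=80)
```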
[0123] The aforementioned data streams 1, 2, and 3 are three distinct data streams. Inference analyzers A, B, and C can operate simultaneously online. As shown in Figure 12, the communication device receives data streams 1 and 2 through port 1 and data stream 3 through port 2. The communication chip processes packets from data streams 1, 2, and 3 within the same packet processing pipeline. During the pipelined processing of packets from data streams 1, 2, and 3, the communication chip uses inference analyzer A to perform inference analysis on data stream 1, inference analyzer B to perform inference analysis on data stream 2, and inference analyzer C to perform inference analysis on data stream 3. The processes of inference analysis of data stream 1 using inference analyzer A, data stream 2 using inference analyzer B, and data stream 3 using inference analyzer C can be performed in parallel; this embodiment does not limit this process.
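As a non-limiting illustration, the idea that each data stream is handled by its own inference analyzer, and that the analyses may proceed in parallel, can be sketched in software with a thread pool. Threads here are only a software stand-in for independent hardware inference units, and the three analyzer functions are placeholders that do not implement the models described above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


# Placeholder analyzers; real analyzers would be built from the resource pool.
def analyzer_a(info: List[float]) -> float:
    return sum(info)


def analyzer_b(info: List[float]) -> float:
    return max(info)


def analyzer_c(info: List[float]) -> float:
    return float(len(info))


# Each data stream is bound to the analyzer chosen for its inference mode.
ANALYZERS: Dict[str, Callable[[List[float]], float]] = {
    "data_stream_1": analyzer_a,
    "data_stream_2": analyzer_b,
    "data_stream_3": analyzer_c,
}


def analyze_all(status_by_stream: Dict[str, List[float]]) -> Dict[str, float]:
    """Submit one analysis per stream and collect the results by stream name."""
    with ThreadPoolExecutor(max_workers=len(ANALYZERS)) as executor:
        futures = {name: executor.submit(ANALYZERS[name], info)
                   for name, info in status_by_stream.items()}
        return {name: future.result() for name, future in futures.items()}


results = analyze_all({
    "data_stream_1": [1.0, 2.0, 3.0],
    "data_stream_2": [4.0, 9.0],
    "data_stream_3": [5.0, 6.0],
})
```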
[0124] As described above, the embodiments of this application can achieve linkage between the AI module and the message processor, traffic manager, remote device, etc., which can improve the processing effect of the data stream. For example, the traffic manager can manage the data stream based on the inference and analysis results of the data stream by the AI module, which can improve the traffic manager's management effect on the data stream.
[0125] It should be noted that, in the embodiments of this application, the term "connected" means connected and conductive, or connected and capable of information transmission. "Sequentially connected" means sequentially connected and conductive. Sequential connection of Y and Z means that Y is connected before Z, and Z is connected after Y; or, the output of Y is connected to the input of Z. "Y" and "Z" can be inference components or hardware inference units. For example, sequentially connected FCL inference component and loop layer inference component means that the FCL inference component and the loop layer inference component are sequentially connected and conductive, and the output of the FCL inference component is connected to the input of the loop layer inference component. Sequentially connected FCL inference unit and loop layer inference unit means that the FCL inference unit and the loop layer inference unit are sequentially connected and conductive, and the output of the FCL inference unit is connected to the input of the loop layer inference unit.
[0126] The following describes the method embodiments of this application.
[0127] Please refer to Figure 13, which shows a flowchart of a data stream processing method provided in an embodiment of this application. This data stream processing method is applied to a communication chip. For example, the communication chip is shown in Figure 1, Figure 4, or Figure 5. The following description assumes that the data stream processing method is executed by the communication chip. Referring to Figure 13, the data stream processing method includes the following steps S301 to S303.
[0128] S301. The communication chip receives the first message, which belongs to the first data stream.
[0129] The first message is any message in the first data stream. The communication device containing the communication chip receives the first message through any port connected to the communication chip. After receiving the first message, the port transmits the first message to the communication chip, and the communication chip receives the first message.
[0130] S302. The communication chip determines a first inference model based on the feature information of the first message. The first inference model is used to instruct the use of a first inference analyzer to process the first data stream. The first inference analyzer includes at least one hardware inference unit in a hardware inference resource pool.
[0131] The communication chip parses the first message to obtain its feature information, and then determines the first inference model based on this feature information. The feature information of the first message includes its transport layer protocol type and application type. The transport layer protocol type of the first message is, for example, TCP, UDP, or Real-Time Transport Protocol (RTP). The application type of the first message is the type of application to which the first message belongs, for example, video, voice, email, file download, or conferencing.
[0132] In an optional embodiment, the communication chip includes a mapping relationship between the feature information of the first message and the first inference model, and the communication chip determines the first inference model based on the feature information of the first message and the mapping relationship.
[0133] In a specific embodiment, the communication chip includes a mapping relationship between the feature information of the first message and a first inference mode, wherein the first inference mode is used to indicate a first inference model; the mapping relationship between the feature information of the first message and the first inference mode, as well as the indication of the first inference mode, realizes the mapping relationship between the feature information of the first message and the first inference model. The communication chip determines the first inference mode based on the feature information of the first message and the mapping relationship between the feature information of the first message and the first inference mode, and the communication chip determines the first inference model based on the first inference mode.
[0134] In a specific embodiment, the communication chip includes a feature-pattern mapping table. This table comprises multiple sets of feature-pattern mapping relationships, each mapping between message feature information and an inference mode. Each inference mode indicates an inference model. The communication chip searches the feature-pattern mapping table based on the feature information of the first message to determine message feature information that matches (e.g., is the same as) the feature information of the first message. The communication chip then determines the inference mode in the feature-pattern mapping table that has a mapping relationship with the message feature information that matches (e.g., is the same as) the feature information of the first message as the first inference mode.
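The table lookup described above can be illustrated with a minimal software sketch. It assumes the feature information is the pair (transport layer protocol type, application type) and that a match means equality; the table entries, the mode names, and the names `FeatureInfo` and `lookup_inference_mode` are hypothetical and are not taken from this application.

```python
from typing import NamedTuple, Optional


class FeatureInfo(NamedTuple):
    """Feature information of a message (assumed to be these two fields)."""
    transport_protocol: str  # e.g. "TCP", "UDP", "RTP"
    application_type: str    # e.g. "video", "voice", "email"


# Hypothetical feature-pattern mapping table:
# message feature information -> inference mode.
FEATURE_PATTERN_TABLE = {
    FeatureInfo("UDP", "video"): "mode_1",
    FeatureInfo("TCP", "file download"): "mode_2",
    FeatureInfo("RTP", "voice"): "mode_3",
}


def lookup_inference_mode(features: FeatureInfo) -> Optional[str]:
    """Return the inference mode whose table entry equals `features`, or None."""
    return FEATURE_PATTERN_TABLE.get(features)


first_message_features = FeatureInfo("UDP", "video")
first_inference_mode = lookup_inference_mode(first_message_features)
```

In this sketch a message whose feature information has no entry in the table yields `None`; how a chip would handle such a message is outside what the sketch assumes.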
[0135] In an optional embodiment, the communication chip includes a first model configuration template, which is a configuration template for a first inference model and is used to record configuration information of the first inference model. A first inference mode is used to indicate the first model configuration template, thereby indicating the first inference model (i.e., the first inference mode indicates the first inference model by indicating the first model configuration template). The communication chip determines the first model configuration template based on the first inference mode, and determines the first inference model based on the first model configuration template.
[0136] In a specific embodiment, the communication chip includes multiple model configuration templates. The multiple model configuration templates correspond one-to-one with multiple inference models and one-to-one with multiple inference modes. Each model configuration template is the configuration template for its corresponding inference model and records the configuration information of that inference model. Each inference mode indicates its corresponding model configuration template, thereby indicating the inference model corresponding to that template. The first inference mode is one of the multiple inference modes, and the first model configuration template is one of the multiple model configuration templates.
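The one-to-one chain from inference mode to model configuration template to inference model can be sketched as two lookups. The sketch below is an assumption-laden illustration: the template contents reuse the component lists of inference models A, B, and C from the examples in this description, while the mode names, template names, and the `ModelConfigTemplate` and `resolve_template` identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ModelConfigTemplate:
    """Hypothetical template recording configuration information of one model."""
    model_name: str
    components: Tuple[str, ...]  # ordered inference component kinds


# One template per inference model (names are illustrative only).
TEMPLATES: Dict[str, ModelConfigTemplate] = {
    "template_A": ModelConfigTemplate("inference_model_A", ("FCL", "loop_layer")),
    "template_B": ModelConfigTemplate("inference_model_B", ("decision_tree",)),
    "template_C": ModelConfigTemplate("inference_model_C", ("CNN",)),
}

# One inference mode per template.
MODE_TO_TEMPLATE: Dict[str, str] = {
    "mode_1": "template_A",
    "mode_2": "template_B",
    "mode_3": "template_C",
}


def resolve_template(mode: str) -> ModelConfigTemplate:
    """Inference mode -> model configuration template (which names the model)."""
    return TEMPLATES[MODE_TO_TEMPLATE[mode]]


first_template = resolve_template("mode_1")
```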
[0137] In this embodiment, the first inference model is used to instruct the processing of the first data stream using the first inference analyzer. The first inference analyzer includes at least one hardware inference unit in the hardware inference resource pool. The first inference analyzer and the first inference model have a mapping relationship: the first inference analyzer is the first inference model mapped onto the hardware inference resource pool, or in other words, the first inference analyzer is the embodiment of the first inference model in the hardware inference resource pool. In a specific embodiment, the first inference model includes at least one inference component, and the at least one inference component and the at least one hardware inference unit have a mapping relationship, for example, a one-to-one mapping relationship. Each hardware inference unit included in the first inference analyzer has the same inference parameters as the corresponding inference component included in the first inference model, and the parameter values of those inference parameters are the same in the hardware inference unit and in the corresponding inference component.
[0138] In one example, referring to Figures 6 and 7, the first inference model is inference model A as shown in Figure 6, and the first inference analyzer is inference analyzer A as shown in Figure 7. Inference model A includes an FCL inference component and a loop-layer inference component, and inference analyzer A includes an FCL inference unit and a loop-layer inference unit. The FCL inference unit and the FCL inference component have a mapping relationship, and the loop-layer inference unit and the loop-layer inference component also have a mapping relationship. The inference parameters of the FCL inference unit correspond to the same inference parameters of the FCL inference component, and the parameter values of the corresponding inference parameters in the FCL inference unit and the FCL inference component are the same; for example, both the FCL inference unit and the FCL inference component include inference parameter A1 and inference parameter A2. The parameter value of inference parameter A1 of the FCL inference unit is the same as the parameter value of inference parameter A1 of the FCL inference component, and the parameter value of inference parameter A2 of the FCL inference unit is the same as the parameter value of inference parameter A2 of the FCL inference component. 
The inference parameters of the loop layer inference unit correspond to the same inference parameters of the loop layer inference component, and the parameter values of the corresponding inference parameters in the loop layer inference unit and the loop layer inference component are the same; for example, both the loop layer inference unit and the loop layer inference component include inference parameter A3 and inference parameter A4, the parameter value of inference parameter A3 of the loop layer inference unit is the same as the parameter value of inference parameter A3 of the loop layer inference component, and the parameter value of inference parameter A4 of the loop layer inference unit is the same as the parameter value of inference parameter A4 of the loop layer inference component.
[0139] In another example, referring to Figures 8 and 9, the first inference model is inference model B as shown in Figure 8, and the first inference analyzer is inference analyzer B as shown in Figure 9. Inference model B includes a decision tree inference component, and inference analyzer B includes a decision tree inference unit, which has a mapping relationship with the decision tree inference component. The inference parameters of the decision tree inference unit correspond to the same inference parameters of the decision tree inference component, and the parameter values of the corresponding inference parameters in both the decision tree inference unit and the decision tree inference component are the same. For example, both the decision tree inference unit and the decision tree inference component include inference parameter B1 and inference parameter B2. The parameter value of inference parameter B1 of the decision tree inference unit is the same as the parameter value of inference parameter B1 of the decision tree inference component, and the parameter value of inference parameter B2 of the decision tree inference unit is the same as the parameter value of inference parameter B2 of the decision tree inference component.
[0140] In another example, referring to Figures 10 and 11, the first inference model is inference model C as shown in Figure 10, and the first inference analyzer is inference analyzer C as shown in Figure 11. Inference model C includes a CNN inference component, and inference analyzer C includes a CNN inference unit. The CNN inference unit and the CNN inference component have a mapping relationship. The inference parameters of the CNN inference unit correspond to the same inference parameters of the CNN inference component, and the parameter values of the corresponding inference parameters in the CNN inference unit and the CNN inference component are the same. For example, both the CNN inference unit and the CNN inference component include inference parameters C1 and C2. The parameter value of inference parameter C1 of the CNN inference unit is the same as the parameter value of inference parameter C1 of the CNN inference component, and the parameter value of inference parameter C2 of the CNN inference unit is the same as the parameter value of inference parameter C2 of the CNN inference component.
[0141] In an optional embodiment, the first inference model includes multiple inference components, and the first inference analyzer includes multiple hardware inference units in the hardware inference resource pool. The multiple inference components and the multiple hardware inference units have a mapping relationship, for example, a one-to-one mapping relationship. The connectivity relationships of the multiple hardware inference units correspond to the connectivity relationships of the multiple inference components; that is, the multiple hardware inference units correspond to the multiple inference components, and the multiple hardware inference units are connected according to the connectivity relationships of the multiple inference components. For example, the multiple inference components constitute an inference path (e.g., called an inference component path), the multiple hardware inference units constitute an inference path (e.g., called an inference unit path), and the inference unit path corresponds to the inference component path. That is, the hardware inference units on the inference unit path correspond to the inference components on the inference component path, and the connectivity relationships of the hardware inference units on the inference unit path correspond to the connectivity relationships of the inference components on the inference component path. Referring again to Figures 6 and 7, inference model A includes an FCL inference component and a loop layer inference component. The FCL inference component and the loop layer inference component are connected in sequence to form an inference component path. Inference analyzer A includes an FCL inference unit and a loop layer inference unit. The FCL inference unit and the loop layer inference unit are connected in sequence to form an inference unit path. The inference unit path corresponds to the inference component path.
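As an illustration of how an inference unit path may be derived from an inference component path, the following sketch models the hardware inference resource pool as a list of units in software. It is a sketch under stated assumptions, not the chip's actual mechanism: the selection policy (first free unit of the matching kind), the parameter values, and the names `InferenceComponent`, `HardwareInferenceUnit`, and `build_analyzer` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InferenceComponent:
    kind: str                 # e.g. "FCL", "loop_layer"
    params: Dict[str, float]  # inference parameters and their values


@dataclass
class HardwareInferenceUnit:
    kind: str
    params: Dict[str, float] = field(default_factory=dict)
    in_use: bool = False


def build_analyzer(component_path: List[InferenceComponent],
                   pool: List[HardwareInferenceUnit]) -> List[HardwareInferenceUnit]:
    """Map each component, in path order, to a free unit of the same kind.

    The returned list order is the inference unit path, so it mirrors the
    connectivity of the inference component path.
    """
    unit_path: List[HardwareInferenceUnit] = []
    for component in component_path:
        unit = next((u for u in pool
                     if u.kind == component.kind and not u.in_use), None)
        if unit is None:
            # Release the units claimed so far before reporting the failure.
            for claimed in unit_path:
                claimed.in_use = False
            raise LookupError(f"no free {component.kind} unit in the pool")
        unit.in_use = True
        unit.params = dict(component.params)  # same parameters, same values
        unit_path.append(unit)
    return unit_path


# Inference model A: FCL component followed by loop layer component.
model_a = [
    InferenceComponent("FCL", {"A1": 0.5, "A2": 2.0}),
    InferenceComponent("loop_layer", {"A3": 1.0, "A4": 4.0}),
]
pool = [
    HardwareInferenceUnit("loop_layer"),
    HardwareInferenceUnit("decision_tree"),
    HardwareInferenceUnit("FCL"),
]
analyzer_a = build_analyzer(model_a, pool)
```

The unit path follows the order of the component path rather than the order of the pool, which is the property the paragraph above describes as corresponding connectivity relationships.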
[0142] In an optional embodiment, the first inference model includes a first expert sub-model and a second expert sub-model, and the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer has a mapping relationship with the first expert sub-model: the first sub-inference analyzer is the first expert sub-model mapped onto the hardware inference resource pool, or in other words, the first sub-inference analyzer is the embodiment of the first expert sub-model in the hardware inference resource pool. The second sub-inference analyzer has a mapping relationship with the second expert sub-model: the second sub-inference analyzer is the second expert sub-model mapped onto the hardware inference resource pool, or in other words, the second sub-inference analyzer is the embodiment of the second expert sub-model in the hardware inference resource pool.
[0143] The first expert sub-model and the second expert sub-model are connected in parallel or in series; correspondingly, the first sub-inference analyzer and the second sub-inference analyzer are connected in parallel or in series. When the first and second expert sub-models are connected in parallel, the first and second sub-inference analyzers are also connected in parallel. When the first and second expert sub-models are connected in series, the first and second sub-inference analyzers are also connected in series. It should be noted that when the first and second expert sub-models are connected in parallel, they are not dependent on each other; however, when they are connected in series, they are dependent on each other; for example, if the second expert sub-model is connected after the first expert sub-model, at least a portion of the inputs to the second expert sub-model depends on the output of the first expert sub-model. Correspondingly, when the first and second sub-inference analyzers are connected in parallel, they are not dependent on each other; when they are connected in series, they are dependent on each other. For example, if the second sub-inference analyzer is connected in series after the first sub-inference analyzer, at least a portion of the input of the second sub-inference analyzer depends on the output of the first sub-inference analyzer. That is, in the embodiments of this application, the parallel / serial connection of the two expert sub-models depends on whether they are dependent on each other; if they are not dependent, they are connected in parallel; if they are dependent, for example, if the input of one expert sub-model depends on the output of the other, they are connected in series. 
The parallel / serial connection of two sub-inference analyzers depends on whether the two sub-inference analyzers have a dependency relationship; if the two sub-inference analyzers do not have a dependency relationship, the two sub-inference analyzers are connected in parallel; if the two sub-inference analyzers have a dependency relationship, for example, if the input of one of the two sub-inference analyzers depends on the output of the other sub-inference analyzer, the two sub-inference analyzers are connected in series.
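The dependency rule above (no dependency means parallel, dependency means series) can be written as a small decision function. The sketch below assumes a dependency can be detected by checking whether any output of one sub-model appears among the inputs of the other; the data item names and the `ExpertSubModel` and `connection_type` identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ExpertSubModel:
    name: str
    inputs: FrozenSet[str]   # names of the data items the sub-model consumes
    outputs: FrozenSet[str]  # names of the data items the sub-model produces


def connection_type(a: ExpertSubModel, b: ExpertSubModel) -> str:
    """Return "series" if either sub-model consumes the other's output."""
    if (a.outputs & b.inputs) or (b.outputs & a.inputs):
        return "series"
    return "parallel"


first = ExpertSubModel("first", frozenset({"message_features"}),
                       frozenset({"score_1"}))

# No dependency: the second sub-model reads only the message features.
second_independent = ExpertSubModel("second", frozenset({"message_features"}),
                                    frozenset({"score_2"}))

# Partial dependency: some inputs of the second sub-model come from the first.
second_dependent = ExpertSubModel("second",
                                  frozenset({"message_features", "score_1"}),
                                  frozenset({"score_2"}))

parallel_case = connection_type(first, second_independent)
series_case = connection_type(first, second_dependent)
```

Because each sub-inference analyzer mirrors its expert sub-model, the same function applied to the sub-models also gives the connection of the corresponding sub-inference analyzers.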
[0144] In one example, Figure 14 is a schematic diagram of a first inference model provided in an embodiment of this application, and Figure 15 is a schematic diagram of a first inference analyzer corresponding to the first inference model shown in Figure 14. As shown in Figure 14, the first inference model includes a first expert sub-model and a second expert sub-model, which are connected in parallel and have no dependency relationship. Correspondingly, as shown in Figure 15, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer, which are connected in parallel and have no dependency relationship.
[0145] In another example, Figure 16 is a schematic diagram of another first inference model provided in an embodiment of this application, and Figure 17 is a schematic diagram of a first inference analyzer corresponding to the first inference model shown in Figure 16. As shown in Figure 16, the first inference model includes a first expert sub-model and a second expert sub-model, which are connected in series. The first expert sub-model and the second expert sub-model have a dependency relationship, and some inputs of the second expert sub-model depend on the output of the first expert sub-model. Correspondingly, as shown in Figure 17, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer, which are connected in series. The first sub-inference analyzer and the second sub-inference analyzer have a dependency relationship, and some inputs of the second sub-inference analyzer depend on the output of the first sub-inference analyzer.
[0146] In another example, Figure 18 is a schematic diagram of another first inference model provided in an embodiment of this application, and Figure 19 is a schematic diagram of a first inference analyzer corresponding to the first inference model shown in Figure 18. As shown in Figure 18, the first inference model includes a first expert sub-model and a second expert sub-model, which are connected in series. The first expert sub-model and the second expert sub-model have a dependency relationship, and all inputs of the second expert sub-model depend on the output of the first expert sub-model. Correspondingly, as shown in Figure 19, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer, which are connected in series. The first sub-inference analyzer and the second sub-inference analyzer have a dependency relationship, and all inputs of the second sub-inference analyzer depend on the output of the first sub-inference analyzer.
[0147] In this embodiment, the first expert sub-model includes at least one first inference component, and the second expert sub-model includes at least one second inference component. The at least one inference component included in the first inference model includes both the at least one first inference component and the at least one second inference component. The first sub-inference analyzer includes at least one first hardware inference unit, and the second sub-inference analyzer includes at least one second hardware inference unit. The at least one hardware inference unit included in the first inference analyzer includes both the at least one first hardware inference unit and the at least one second hardware inference unit. The mapping relationship between the at least one inference component and the at least one hardware inference unit includes the mapping relationship between the at least one first inference component and the at least one first hardware inference unit, and the mapping relationship between the at least one second inference component and the at least one second hardware inference unit. Each first hardware inference unit included in the first sub-inference analyzer has the same inference parameters as the corresponding first inference component included in the first expert sub-model, and the parameter values of the corresponding inference parameters are the same in both. Each second hardware inference unit included in the second sub-inference analyzer has the same inference parameters as the corresponding second inference component included in the second expert sub-model, and the parameter values of the corresponding inference parameters are the same in both.
[0148] In optional embodiments, the at least one first inference component is a plurality of first inference components (i.e., the first expert sub-model includes a plurality of first inference components), the at least one second inference component is a plurality of second inference components (i.e., the second expert sub-model includes a plurality of second inference components), the at least one first hardware inference unit is a plurality of first hardware inference units (i.e., the first sub-inference analyzer includes a plurality of first hardware inference units), and the at least one second hardware inference unit is a plurality of second hardware inference units (i.e., the second sub-inference analyzer includes a plurality of second hardware inference units). The connectivity relationships of the plurality of first hardware inference units correspond to the connectivity relationships of the plurality of first inference components, and the connectivity relationships of the plurality of second hardware inference units correspond to the connectivity relationships of the plurality of second inference components. For example, the at least one inference component included in the first inference model is a plurality of inference components, and the at least one hardware inference unit included in the first inference analyzer is a plurality of hardware inference units. The plurality of inference components constitute an inference component path, which includes a first inference component sub-path composed of the plurality of first inference components and a second inference component sub-path composed of the plurality of second inference components. The plurality of hardware inference units constitute an inference unit path, which includes a first inference unit sub-path composed of the plurality of first hardware inference units and a second inference unit sub-path composed of the plurality of second hardware inference units. 
The first inference unit sub-path corresponds to the first inference component sub-path, and the second inference unit sub-path corresponds to the second inference component sub-path.
[0149] In this embodiment, the first inference component sub-path and the second inference component sub-path can be two parallel sub-paths within the inference component path, or two sequential sub-paths within the inference component path. Correspondingly, the first inference unit sub-path and the second inference unit sub-path can be two parallel sub-paths within the inference unit path, or two sequential sub-paths within the inference unit path. In one embodiment, the first expert sub-model and the second expert sub-model are connected in parallel, the first sub-inference analyzer and the second sub-inference analyzer are connected in parallel, the first inference component sub-path and the second inference component sub-path are two parallel sub-paths within the inference component path, and the first inference unit sub-path and the second inference unit sub-path are two parallel sub-paths within the inference unit path. In another embodiment, the first expert sub-model and the second expert sub-model are connected in series, the first sub-inference analyzer and the second sub-inference analyzer are connected in series, the first inference component sub-path and the second inference component sub-path are two sequential sub-paths within the inference component path, and the first inference unit sub-path and the second inference unit sub-path are two sequential sub-paths within the inference unit path.
[0150] In one example, Figure 20 is a schematic diagram of another first inference model provided in an embodiment of this application, and Figure 21 is a schematic diagram of a first inference analyzer corresponding to the first inference model shown in Figure 20. As shown in Figure 20, the first inference model includes a first expert sub-model and a second expert sub-model, which are connected in parallel. The first expert sub-model includes a decision tree inference component, an SVM inference component, and a clustering inference component (the decision tree inference component, the SVM inference component, and the clustering inference component are three first inference components included in the first expert sub-model), which are connected sequentially. The second expert sub-model includes a loop layer inference component, an FCL inference component, and an NB inference component (the loop layer inference component, the FCL inference component, and the NB inference component are three second inference components included in the second expert sub-model), which are connected sequentially. As shown in Figure 21, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer, which are connected in parallel. The first sub-inference analyzer includes a decision tree inference unit, an SVM inference unit, and a clustering inference unit (these three are the first hardware inference units included in the first sub-inference analyzer), which are sequentially connected. The second sub-inference analyzer includes a loop layer inference unit, an FCL inference unit, and an NB inference unit (these three are the second hardware inference units included in the second sub-inference analyzer), which are sequentially connected. 
The decision tree inference unit and the decision tree inference component have a mapping relationship; the decision tree inference unit has the same inference parameters as the decision tree inference component, and the parameter values of the corresponding inference parameters are the same in both. The SVM inference unit and the SVM inference component have a mapping relationship; the SVM inference unit has the same inference parameters as the SVM inference component, and the parameter values of the corresponding inference parameters are the same in both. The clustering inference unit and the clustering inference component have a mapping relationship; the clustering inference unit has the same inference parameters as the clustering inference component, and the parameter values of the corresponding inference parameters are the same in both. The loop layer inference unit and the loop layer inference component have a mapping relationship; the loop layer inference unit has the same inference parameters as the loop layer inference component, and the parameter values of the corresponding inference parameters are the same in both. The FCL inference unit and the FCL inference component have a mapping relationship; the FCL inference unit has the same inference parameters as the FCL inference component, and the parameter values of the corresponding inference parameters are the same in both. The NB inference unit and the NB inference component have a mapping relationship; the NB inference unit has the same inference parameters as the NB inference component, and the parameter values of the corresponding inference parameters are the same in both. The connectivity relationships of the decision tree inference unit, the SVM inference unit, and the clustering inference unit correspond to the connectivity relationships of the decision tree inference component, the SVM inference component, and the clustering inference component; the connectivity relationships of the loop layer inference unit, the FCL inference unit, and the NB inference unit correspond to the connectivity relationships of the loop layer inference component, the FCL inference component, and the NB inference component. For example, the decision tree inference component, the SVM inference component, and the clustering inference component are sequentially connected to form a first inference component sub-path; the loop layer inference component, the FCL inference component, and the NB inference component are sequentially connected to form a second inference component sub-path; the decision tree inference unit, the SVM inference unit, and the clustering inference unit are sequentially connected to form a first inference unit sub-path; the loop layer inference unit, the FCL inference unit, and the NB inference unit are sequentially connected to form a second inference unit sub-path. The first inference component sub-path and the second inference component sub-path are two parallel sub-paths; the first inference unit sub-path and the second inference unit sub-path are two parallel sub-paths; the first inference unit sub-path corresponds to the first inference component sub-path; and the second inference unit sub-path corresponds to the second inference component sub-path.
[0151] In another example, Figure 22 is a schematic diagram of another first inference model provided in an embodiment of this application, and Figure 23 is a schematic diagram of a first inference analyzer corresponding to the first inference model shown in Figure 22. As shown in Figure 22, the first inference model includes a first expert sub-model and a second expert sub-model, which are connected in series. The first expert sub-model includes a decision tree inference component and a loop layer inference component (the decision tree inference component and the loop layer inference component are two first inference components included in the first expert sub-model), which are connected in sequence. The second expert sub-model includes an FCL inference component, an SVM inference component, a clustering inference component, and an NB inference component (the FCL inference component, the SVM inference component, the clustering inference component, and the NB inference component are four second inference components included in the second expert sub-model), which are connected in sequence. As shown in Figure 23, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer, which are connected in series. The first sub-inference analyzer includes a decision tree inference unit and a loop layer inference unit (the decision tree inference unit and the loop layer inference unit are two first hardware inference units included in the first sub-inference analyzer), and the decision tree inference unit and the loop layer inference unit are connected sequentially. 
The second sub-inference analyzer includes an FCL inference unit, an SVM inference unit, a clustering inference unit, and an NB inference unit (these are four second hardware inference units included in the second sub-inference analyzer), which are connected sequentially. Each hardware inference unit has a mapping relationship with the inference component of the same type: the decision tree inference unit maps to the decision tree inference component, the loop layer inference unit to the loop layer inference component, the FCL inference unit to the FCL inference component, the SVM inference unit to the SVM inference component, the clustering inference unit to the clustering inference component, and the NB inference unit to the NB inference component. For each mapped pair, the inference parameters of the hardware inference unit are the same as the inference parameters of the inference component, and the parameter values of the corresponding inference parameters are the same in both. 
The connectivity relationship between the decision tree inference unit and the loop layer inference unit corresponds to the connectivity relationship between the decision tree inference component and the loop layer inference component. The connectivity relationship between the FCL inference unit, the SVM inference unit, the clustering inference unit, and the NB inference unit corresponds to the connectivity relationship between the FCL inference component, the SVM inference component, the clustering inference component, and the NB inference component. 
For example, the decision tree inference component and the loop layer inference component are sequentially connected to form a first inference component sub-path; the FCL inference component, the SVM inference component, the clustering inference component, and the NB inference component are sequentially connected to form a second inference component sub-path; the decision tree inference unit and the loop layer inference unit are sequentially connected to form a first inference unit sub-path; and the FCL inference unit, the SVM inference unit, the clustering inference unit, and the NB inference unit are sequentially connected to form a second inference unit sub-path. The first inference component sub-path and the second inference component sub-path are two sequential sub-paths in the inference component path formed by the sequential connection of the decision tree inference component, the loop layer inference component, the FCL inference component, the SVM inference component, the clustering inference component, and the NB inference component. The first inference unit sub-path and the second inference unit sub-path are two sequential sub-paths in the inference unit path formed by the sequential connection of the decision tree inference unit, the loop layer inference unit, the FCL inference unit, the SVM inference unit, the clustering inference unit, and the NB inference unit. The first inference unit sub-path corresponds to the first inference component sub-path, and the second inference unit sub-path corresponds to the second inference component sub-path. Furthermore, the connectivity between the first sub-inference analyzer and the second sub-inference analyzer corresponds to the connectivity between the first expert sub-model and the second expert sub-model. 
Specifically, as shown in Figure 22, the loop layer inference component in the first expert sub-model is connected to the FCL inference component in the second expert sub-model. Correspondingly, as shown in Figure 23, the loop layer inference unit in the first sub-inference analyzer is connected to the FCL inference unit in the second sub-inference analyzer. Here, the loop layer inference component is the output inference component of the first expert sub-model, the FCL inference component is the input inference component of the second expert sub-model, the loop layer inference unit is the output inference unit of the first sub-inference analyzer, and the FCL inference unit is the input inference unit of the second sub-inference analyzer. The output inference component in the first expert sub-model is the inference component used to output the processing results (e.g., inference analysis results) of the first expert sub-model. The input inference component in the second expert sub-model is the inference component used to receive input information. The output inference unit in the first sub-inference analyzer is the hardware inference unit used to output the processing results (e.g., inference analysis results) of the first sub-inference analyzer. The input inference unit in the second sub-inference analyzer is the hardware inference unit used to receive input information.
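The mapping between the first inference model of Figure 22 and the first inference analyzer of Figure 23 can be illustrated with a minimal sketch. The class names, the `build_sub_analyzer` helper, the pool contents, and the parameter names and values (`depth`, `hidden`) are assumptions introduced only for illustration; they are not defined in this application. The sketch assumes that a component is mapped to a free hardware inference unit of the same type and that the parameters are copied unchanged.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InferenceComponent:
    kind: str            # e.g. "decision_tree", "loop_layer", "fcl"
    params: tuple = ()   # hypothetical (name, value) pairs


@dataclass
class HardwareInferenceUnit:
    kind: str
    params: dict = field(default_factory=dict)
    in_use: bool = False


def build_sub_analyzer(sub_model, pool):
    """Map each component of one expert sub-model to a free unit of the same kind."""
    units = []
    for component in sub_model:
        unit = next((u for u in pool
                     if u.kind == component.kind and not u.in_use), None)
        if unit is None:
            raise ValueError(f"no free {component.kind} unit in the pool")
        unit.in_use = True
        unit.params = dict(component.params)  # same parameters, same values
        units.append(unit)
    return units  # list order stands for the sequential connection


def build_analyzer(model, pool):
    """One sub-inference analyzer per expert sub-model, kept in series order."""
    return [build_sub_analyzer(sub_model, pool) for sub_model in model]


# Model of Figure 22: two expert sub-models connected in series.
first_expert = [InferenceComponent("decision_tree", (("depth", 4),)),
                InferenceComponent("loop_layer", (("hidden", 8),))]
second_expert = [InferenceComponent(k) for k in ("fcl", "svm", "clustering", "nb")]
model = [first_expert, second_expert]

# Hypothetical pool; the "cnn" unit is simply left unused.
pool = [HardwareInferenceUnit(k) for k in
        ("fcl", "loop_layer", "decision_tree", "svm", "clustering", "nb", "cnn")]
analyzer = build_analyzer(model, pool)
print([[u.kind for u in sub] for sub in analyzer])
```

In this sketch the order of units inside each list mirrors the order of components in the corresponding expert sub-model, which is one simple way to keep the inference unit sub-paths aligned with the inference component sub-paths.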
[0152] It can be understood that the decision tree inference unit, loop layer inference unit, FCL inference unit, SVM inference unit, clustering inference unit and NB inference unit in the first inference analyzer shown in Figures 21 and 23 can be hardware inference units in the hardware inference resource pool shown in Figure 3. The first inference analyzer shown in Figures 21 and 23 can be an inference analyzer built based on the hardware inference resource pool shown in Figure 3.
[0153] The above description uses the example of a first inference model including a first expert sub-model and a second expert sub-model, and a first inference analyzer including a first sub-inference analyzer and a second sub-inference analyzer. In some embodiments, the first inference model includes only one expert sub-model, which is the first inference model; correspondingly, the first inference analyzer includes only one sub-inference analyzer, which is the first inference analyzer. For example, inference model A shown in Figure 6, inference model B shown in Figure 8, and inference model C shown in Figure 10 all include only one expert sub-model, and the corresponding inference analyzers A, B, and C also include only one sub-inference analyzer. In other embodiments, the first inference model includes two or more expert sub-models, and correspondingly, the first inference analyzer includes two or more sub-inference analyzers. For example, Figure 24 is a schematic diagram of another first inference model provided in an embodiment of this application, and Figure 25 is a schematic diagram of the first inference analyzer corresponding to the first inference model shown in Figure 24. As shown in Figure 24, the first inference model includes a first expert sub-model, a second expert sub-model, and a third expert sub-model. The first and second expert sub-models are connected in series, as are the first and third expert sub-models, and the second and third expert sub-models are connected in parallel. Correspondingly, as shown in Figure 25, the first inference analyzer includes a first sub-inference analyzer, a second sub-inference analyzer, and a third sub-inference analyzer. The first and second sub-inference analyzers are connected in series, as are the first and third sub-inference analyzers, and the second and third sub-inference analyzers are connected in parallel.
[0154] In the first inference model shown in Figure 24, the first expert sub-model includes at least one first inference component, the second expert sub-model includes at least one second inference component, and the third expert sub-model includes at least one third inference component. The at least one first inference component constitutes a first inference component sub-path, the at least one second inference component constitutes a second inference component sub-path, and the at least one third inference component constitutes a third inference component sub-path. The at least one first inference component, the at least one second inference component, and the at least one third inference component constitute an inference component path. The first and second inference component sub-paths are two sequential sub-paths within the inference component path, the first and third inference component sub-paths are two sequential sub-paths within the inference component path, and the second and third inference component sub-paths are two parallel sub-paths within the inference component path. Correspondingly, in the first inference analyzer shown in Figure 25, the first sub-inference analyzer includes at least one first hardware inference unit from the hardware inference resource pool, the second sub-inference analyzer includes at least one second hardware inference unit from the hardware inference resource pool, and the third sub-inference analyzer includes at least one third hardware inference unit from the hardware inference resource pool. The at least one first hardware inference unit is mapped to the at least one first inference component included in the first expert sub-model; the at least one second hardware inference unit is mapped to the at least one second inference component included in the second expert sub-model; and the at least one third hardware inference unit is mapped to the at least one third inference component included in the third expert sub-model. 
The inference parameters of each first hardware inference unit in the first sub-inference analyzer are the same as the inference parameters of the corresponding first inference component in the first expert sub-model, and the parameter values of the corresponding inference parameters are the same in both. Similarly, the inference parameters of each second hardware inference unit in the second sub-inference analyzer are the same as the inference parameters of the corresponding second inference component in the second expert sub-model, and the parameter values of the corresponding inference parameters are the same in both. Likewise, the inference parameters of each third hardware inference unit in the third sub-inference analyzer are the same as the inference parameters of the corresponding third inference component in the third expert sub-model, and the parameter values of the corresponding inference parameters are the same in both. The at least one first hardware inference unit constitutes a first inference unit sub-path, the at least one second hardware inference unit constitutes a second inference unit sub-path, and the at least one third hardware inference unit constitutes a third inference unit sub-path. 
The first inference unit sub-path corresponds to the first inference component sub-path, the second inference unit sub-path corresponds to the second inference component sub-path, and the third inference unit sub-path corresponds to the third inference component sub-path. The at least one first hardware inference unit, the at least one second hardware inference unit, and the at least one third hardware inference unit constitute an inference unit path. The first inference unit sub-path and the second inference unit sub-path are two sequential sub-paths within the inference unit path, and the first inference unit sub-path and the third inference unit sub-path are two sequential sub-paths within the inference unit path. The second inference unit sub-path and the third inference unit sub-path are two parallel sub-paths within the inference unit path.
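The series and parallel relationships of Figures 24 and 25 can be expressed as a small directed graph. The following is a sketch under stated assumptions: the node names, the edge representation, and the helpers `reachable`, `relation`, and `same_topology` are hypothetical and only illustrate how an analyzer topology could be checked against the model topology it is mapped from.

```python
def reachable(edges, src, dst):
    """Depth-first search: is there a directed path from src to dst?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(b for a, b in edges if a == node)
    return False


def relation(edges, a, b):
    """Two sub-paths are in series if one feeds the other, otherwise parallel."""
    if reachable(edges, a, b) or reachable(edges, b, a):
        return "series"
    return "parallel"


def same_topology(model_edges, analyzer_edges, mapping):
    """Check that every model connection has exactly one mapped analyzer connection."""
    return {(mapping[a], mapping[b]) for a, b in model_edges} == set(analyzer_edges)


# Figure 24: expert 1 feeds expert 2 and expert 3; experts 2 and 3 are parallel.
model_edges = {("expert1", "expert2"), ("expert1", "expert3")}
# Figure 25: the sub-inference analyzers mirror that connectivity.
analyzer_edges = {("sub1", "sub2"), ("sub1", "sub3")}
mapping = {"expert1": "sub1", "expert2": "sub2", "expert3": "sub3"}

print(relation(analyzer_edges, "sub1", "sub2"))   # series
print(relation(analyzer_edges, "sub2", "sub3"))   # parallel
print(same_topology(model_edges, analyzer_edges, mapping))
```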
[0155] In an optional embodiment, the first inference model is a fully connected neural network model comprising multiple neurons that are fully connected. The first inference model shown in Figure 24 is used as an example. Figure 26 is a schematic diagram of the first inference model shown in Figure 24. As shown in Figure 26, each of the first, second, and third expert sub-models comprises multiple neurons, and all of these neurons are fully connected. Since the second and third expert sub-models are connected in parallel, the neurons connecting them are invalid neurons (corresponding to the invalid connections in Figure 26). These invalid neurons are used to transmit invalid information: after receiving valid information transmitted by another neuron, an invalid neuron replaces the valid information with invalid information and transmits the invalid information to the neurons connected to that invalid neuron. The invalid information can be default information or pre-agreed information. In the first inference analyzer corresponding to the first inference model shown in Figure 26, the second sub-inference analyzer and the third sub-inference analyzer are connected in parallel. The communication device used to connect the second sub-inference analyzer and the third sub-inference analyzer is either disconnected or transmits invalid information between the second sub-inference analyzer and the third sub-inference analyzer.
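The effect of invalid connections in a fully connected model can be sketched as follows. This is only an illustration: it assumes that the invalid information is the default value 0.0, so that an invalid connection contributes nothing to the weighted sum of the receiving neuron. The names `INVALID`, `forward`, and the neuron labels are hypothetical.

```python
INVALID = 0.0  # assumed default "invalid information"


def forward(values, weights, invalid_links):
    """Weighted sum per destination neuron.

    values:        dict mapping a source neuron to its output value
    weights:       dict mapping (source, destination) to a weight
    invalid_links: set of (source, destination) pairs that carry
                   invalid information instead of the real value
    """
    out = {}
    for (src, dst), weight in weights.items():
        # An invalid link replaces the valid value with INVALID.
        signal = INVALID if (src, dst) in invalid_links else values[src]
        out[dst] = out.get(dst, 0.0) + weight * signal
    return out


# Expert 1 legitimately feeds expert 3; the link from expert 2 to expert 3
# exists in the fully connected layout but is invalid (parallel sub-models).
values = {"e1_out": 0.5, "e2_out": 0.8}
weights = {("e1_out", "e3_in"): 2.0, ("e2_out", "e3_in"): 3.0}
invalid_links = {("e2_out", "e3_in")}

print(forward(values, weights, invalid_links))  # {'e3_in': 1.0}
```

With this choice of invalid information, the second expert sub-model has no influence on the third, which matches the parallel relationship even though the layout is fully connected.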
[0156] S303. The communication chip uses a first inference analyzer to process the first data stream.
[0157] In an optional embodiment, the first inference analyzer includes at least one hardware inference unit that constitutes a first inference path (i.e., an inference unit path), and the first inference analyzer is used to process the first data stream according to the first inference path. For example, the first inference analyzer includes multiple hardware inference units that are sequentially connected to form the first inference path, and the multiple hardware inference units process the first data stream sequentially.
[0158] In an optional embodiment, the communication chip processes the first data stream using a first inference analyzer, including: the communication chip performing inference analysis on the first data stream using the first inference analyzer to obtain the inference analysis result of the first data stream; and the communication chip performing processing operations related to the inference analysis result of the first data stream. The first inference analyzer is used to perform inference analysis on the first data stream according to a first inference path. The processing operations related to the inference analysis result of the first data stream include at least one of the following: editing the packets of the first data stream; modifying the traffic management policy of the first data stream; modifying the inference analysis policy of the first data stream; and notifying a remote device of the inference analysis result of the first data stream. The operation of editing the packets of the first data stream can be an operation of editing the packets of the first data stream in a packet processing pipeline, for example, modifying the priority carried by the packets of the first data stream. The operations for modifying the traffic management policy of the first data stream include: modifying the management parameters of the buffer space used to cache the packets of the first data stream; modifying the management parameters of the packet queue used to cache the packets of the first data stream; modifying the drop priority of the packets of the first data stream; and modifying the scheduling priority of the packets of the first data stream. 
For example, if the inference analysis policy of the first data stream is a rate analysis policy, which is used to infer and analyze the rate of the first data stream, the operation for modifying the inference analysis policy of the first data stream could be: changing the inference analysis policy of the first data stream to a packet loss inference analysis policy, or changing the inference analysis policy of the first data stream to a rate inference analysis policy and a packet loss inference analysis policy, whereby the packet loss inference analysis policy is used to infer and analyze the packet loss situation of the first data stream. It should be noted that the description of the processing operations here is only an example, and the processing operations related to the inference analysis results of the first data stream can also be any other possible operations. For example, the processing operation related to the inference analysis results of the first data stream could also be the operation of obtaining a state snapshot of the communication chip.
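The processing operations related to an inference analysis result can be sketched as a simple dispatch. The rules below (which result triggers which operation), the field names, and the priority value 7 are purely hypothetical and are not specified by this application; the sketch only shows that one result may trigger several of the listed operation types.

```python
from dataclasses import dataclass, field


@dataclass
class FlowState:
    packet_priority: int = 0                 # edited in the packet pipeline
    drop_priority: int = 0                   # traffic management policy
    analysis_policies: list = field(default_factory=lambda: ["rate"])
    notifications: list = field(default_factory=list)  # sent to a remote device


def apply_result(flow, result):
    """Apply hypothetical processing operations for one inference result."""
    if result == "burst warning":
        # Modify the traffic management policy of the flow.
        flow.drop_priority += 1
        # Modify the inference analysis policy: add packet loss analysis.
        if "packet_loss" not in flow.analysis_policies:
            flow.analysis_policies.append("packet_loss")
        # Notify a remote device of the result.
        flow.notifications.append(result)
    elif result == "video conferencing":
        # Edit the packets: raise the priority they carry.
        flow.packet_priority = 7
    return flow


example_flow = apply_result(FlowState(), "burst warning")
print(example_flow)
```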
[0159] In an optional embodiment, the communication chip processes the first data stream using a first inference analyzer based on the relevant status information of the first data stream. For example, the communication chip uses the first inference analyzer to perform inference analysis on the first data stream based on the relevant status information of the first data stream to obtain the inference analysis result of the first data stream. Then, the communication chip performs processing operations related to the inference analysis result of the first data stream. In a specific embodiment, the communication chip inputs the relevant status information of the first data stream into the first inference analyzer, causing the first inference analyzer to perform inference analysis on the first data stream based on the relevant status information of the first data stream and output the inference analysis result of the first data stream; the communication chip obtains the inference analysis result of the first data stream output by the first inference analyzer. The first inference analyzer is used to perform inference analysis on the first data stream according to a first inference path based on the relevant status information of the first data stream. The relevant status information of the first data stream includes at least one of the following: the status information of the first data stream; the status information of data streams related to the first data stream; and the status information of resources related to the first data stream. The first data stream and its related data streams satisfy at least one of the following: the first data stream and its related data streams compete for cache resources; the first data stream and its related data streams compete for queue resources; the first data stream and its related data streams compete for bandwidth resources; the first data stream and its related data streams compete for processing resources. 
For any data stream among the first data stream and the data streams related to the first data stream, the status information of that data stream includes at least one of the following: the message information of that data stream; the forwarding information of that data stream; the traffic statistics information of that data stream; and the historical inference analysis results of that data stream obtained from a remote device. That is, the status information of the first data stream includes at least one of the following: the message information of the first data stream; the forwarding information of the first data stream; the traffic statistics information of the first data stream; and the historical inference analysis results of the first data stream obtained from a remote device. The status information of any data stream related to the first data stream includes at least one of the following: the message information of that data stream; the forwarding information of that data stream; the traffic statistics information of that data stream; and the historical inference analysis results of that data stream obtained from a remote device. The status information of the resources related to the first data stream includes at least one of the following: statistics of cache resources used to cache the first data stream; statistics of queue resources used to cache the first data stream; statistics of bandwidth resources used to forward the first data stream; and statistics of processing resources used to process the messages of the first data stream. For a description of the relevant status information of the first data stream and of the data streams related to the first data stream, refer to the foregoing embodiments; details are not repeated here.
[0160] The following example illustrates how a communication chip uses a first inference analyzer to perform inference analysis on a first data stream.
[0161] In one embodiment, the first inference analyzer is inference analyzer A as shown in Figure 7. The communication chip inputs the relevant state information X4, X8, and X9 of the first data stream into the first inference analyzer. The first inference analyzer performs inference analysis on the first data stream based on the relevant state information X4, X8, and X9 to obtain the inference analysis result A of the first data stream, and outputs the inference analysis result A. As shown in Figure 7, the first inference analyzer includes an FCL inference unit and a loop layer inference unit. The FCL inference unit and the loop layer inference unit are sequentially connected to form a first inference path. The first inference analyzer performs inference analysis on the first data stream according to the relevant state information X4, X8, and X9 of the first data stream, and obtains the inference analysis result A. That is, the FCL inference unit and the loop layer inference unit sequentially perform inference analysis on the first data stream to obtain the inference analysis result A, and the loop layer inference unit outputs the inference analysis result A.
[0162] In another embodiment, the first inference analyzer is inference analyzer B as shown in Figure 9. The communication chip inputs the relevant state information X1, X6, and X7 of the first data stream into the first inference analyzer. The first inference analyzer performs inference analysis on the first data stream based on the relevant state information X1, X6, and X7 to obtain the inference analysis result B of the first data stream, and outputs the inference analysis result B. As shown in Figure 9, the first inference analyzer includes a decision tree inference unit. The decision tree inference unit performs inference analysis on the first data stream based on the relevant state information X1, X6, and X7 to obtain the inference analysis result B, and outputs the inference analysis result B.
[0163] In another embodiment, the first inference analyzer is the inference analyzer C shown in Figure 11. The communication chip inputs the relevant state information X2, X3, and X5 of the first data stream into the first inference analyzer. The first inference analyzer performs inference analysis on the first data stream based on the relevant state information X2, X3, and X5 to obtain the inference analysis result C of the first data stream, and outputs the inference analysis result C. As shown in Figure 11, the first inference analyzer includes a CNN inference unit. This CNN inference unit performs inference analysis on the first data stream based on the relevant state information X2, X3, and X5 to obtain the inference analysis result C, and outputs the inference analysis result C.
[0164] In another embodiment, the first inference analyzer is as shown in Figure 15. The communication chip inputs relevant state information F1 and F2 of the first data stream into the first inference analyzer. The first inference analyzer performs inference analysis on the first data stream based on F1 and F2, obtaining inference analysis results U1 and U2, and then outputs inference analysis results U1 and U2. As shown in Figure 15, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer connected in parallel. The communication chip inputs the relevant state information F1 of the first data stream into the first sub-inference analyzer and inputs the relevant state information F2 of the first data stream into the second sub-inference analyzer. The first sub-inference analyzer performs inference analysis on the first data stream based on F1, obtaining inference analysis result U1, and then outputs inference analysis result U1. The second sub-inference analyzer performs inference analysis on the first data stream based on the relevant state information F2 of the first data stream to obtain the inference analysis result U2, and outputs the inference analysis result U2. In a specific embodiment, the first inference analyzer is as shown in Figure 21. The first sub-inference analyzer includes a decision tree inference unit, an SVM inference unit, and a clustering inference unit, which are sequentially connected to form a first inference unit sub-path. The second sub-inference analyzer includes a loop layer inference unit, an FCL inference unit, and an NB inference unit, which are sequentially connected to form a second inference unit sub-path. The decision tree inference unit, SVM inference unit, clustering inference unit, loop layer inference unit, FCL inference unit, and NB inference unit constitute a first inference path. The first and second inference unit sub-paths are two parallel sub-paths within the first inference path. 
The first sub-inference analyzer performs inference analysis on the first data stream based on the relevant state information F1 of the first data stream, following the first inference unit sub-path, to obtain the inference analysis result U1. That is, the decision tree inference unit, SVM inference unit, and clustering inference unit in the first sub-inference analyzer sequentially perform inference analysis on the first data stream to obtain inference analysis result U1, and the clustering inference unit outputs inference analysis result U1. The second sub-inference analyzer performs inference analysis on the first data stream based on the relevant state information F2 of the first data stream, following the second inference unit sub-path, to obtain inference analysis result U2. That is, the loop layer inference unit, FCL inference unit, and NB inference unit in the second sub-inference analyzer sequentially perform inference analysis on the first data stream to obtain inference analysis result U2, and the NB inference unit outputs inference analysis result U2. For example, the first sub-inference analyzer is used to classify the application of the first data stream, and the second sub-inference analyzer is used to predict the flow behavior of the first data stream. Inference analysis result U1 is the application classification result of the first data stream, which is "file download," indicating that the first data stream belongs to a file download application. Inference analysis result U2 is the flow behavior prediction result of the first data stream, which is "burst warning," indicating that a burst may occur in the first data stream. It should be noted that the relevant status information F1 of the first data stream may include at least one of the status information of the first data stream, the status information of data streams related to the first data stream, and the status information of resources related to the first data stream. 
The relevant status information F2 of the first data stream may include at least one of the status information of the first data stream, the status information of data streams related to the first data stream, and the status information of resources related to the first data stream. Furthermore, the relevant status information F1 and the relevant status information F2 may include the same information, or they may be different. This embodiment of the application does not limit this.
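The parallel arrangement of Figures 15 and 21 can be sketched as two independent chains, each fed with its own state information. Every unit below is a stub: the field names (`avg_packet_bytes`, `duration_s`, `rate_samples`) and all thresholds are assumptions made only so that the example runs, and they say nothing about how the actual hardware inference units compute their results.

```python
def run_chain(units, state):
    """Run sequentially connected units; each unit consumes the previous output."""
    value = state
    for unit in units:
        value = unit(value)
    return value


def run_parallel(sub_analyzers, inputs):
    """Each sub-inference analyzer processes its own input independently."""
    return [run_chain(units, x) for units, x in zip(sub_analyzers, inputs)]


# First sub-inference analyzer: decision tree -> SVM -> clustering (stubs).
def decision_tree(s):
    return {**s, "bulk": s["avg_packet_bytes"] > 1000}

def svm(s):
    return {**s, "long_lived": s["duration_s"] > 60}

def clustering(s):
    return "file download" if s["bulk"] and s["long_lived"] else "other"

# Second sub-inference analyzer: loop layer -> FCL -> NB (stubs).
def loop_layer(s):
    return {**s, "trend": s["rate_samples"][-1] - s["rate_samples"][0]}

def fcl(s):
    return {**s, "score": s["trend"] / max(s["rate_samples"])}

def nb(s):
    return "burst warning" if s["score"] > 0.5 else "normal"


F1 = {"avg_packet_bytes": 1400, "duration_s": 120}   # hypothetical state info
F2 = {"rate_samples": [10, 12, 40]}                  # hypothetical state info

U1, U2 = run_parallel([[decision_tree, svm, clustering],
                       [loop_layer, fcl, nb]], [F1, F2])
print(U1, "|", U2)  # file download | burst warning
```

Because neither chain reads the output of the other, the two sub-paths could be evaluated concurrently; the sketch evaluates them one after the other only for simplicity.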
[0165] In another embodiment, the first inference analyzer is as shown in Figure 17. The communication chip inputs relevant state information G1 and G2 of the first data stream into the first inference analyzer. The first inference analyzer performs inference analysis on the first data stream based on G1 and G2, obtaining an inference analysis result V2, and then outputs the inference analysis result V2. As shown in Figure 17, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer, which are connected in series. The communication chip inputs the relevant state information G1 of the first data stream into the first sub-inference analyzer and inputs the relevant state information G2 of the first data stream into the second sub-inference analyzer. The first sub-inference analyzer performs inference analysis on the first data stream based on the relevant state information G1, obtaining an inference analysis result V1, and then outputs the inference analysis result V1 to the second sub-inference analyzer. The second sub-inference analyzer performs inference analysis on the first data stream based on the relevant state information G2 and the inference analysis result V1 to obtain the inference analysis result V2, and outputs the inference analysis result V2. In a specific embodiment, the first inference analyzer is as shown in Figure 23. The first sub-inference analyzer includes a decision tree inference unit and a loop layer inference unit, which are sequentially connected to form a first inference unit sub-path. The second sub-inference analyzer includes an FCL inference unit, an SVM inference unit, a clustering inference unit, and an NB inference unit, which are sequentially connected to form a second inference unit sub-path. 
The decision tree inference unit, the loop layer inference unit, the FCL inference unit, the SVM inference unit, the clustering inference unit, and the NB inference unit constitute a first inference path, and the first inference unit sub-path and the second inference unit sub-path are two sequential sub-paths within the first inference path. The first sub-inference analyzer performs inference analysis on the first data stream based on the relevant state information G1 of the first data stream, following the first inference unit sub-path, to obtain inference analysis result V1. That is, the decision tree inference unit and the loop layer inference unit in the first sub-inference analyzer sequentially perform inference analysis on the first data stream to obtain inference analysis result V1, and the loop layer inference unit outputs inference analysis result V1 to the second sub-inference analyzer. The second sub-inference analyzer performs inference analysis on the first data stream based on the relevant state information G2 of the first data stream and the inference analysis result V1, following the second inference unit sub-path, to obtain inference analysis result V2. That is, the FCL inference unit, SVM inference unit, clustering inference unit, and NB inference unit in the second sub-inference analyzer sequentially perform inference analysis on the first data stream to obtain inference analysis result V2, and the NB inference unit outputs inference analysis result V2. For example, the first sub-inference analyzer is used to classify the application of the first data stream, and the second sub-inference analyzer is used to determine the quality of the application corresponding to the first data stream. Inference analysis result V1 is the application classification result of the first data stream, which is "video conferencing." This application classification result indicates that the first data stream belongs to a video conferencing application. 
Inference analysis result V2 is the quality of the video conferencing application, which is "poor." It should be noted that the relevant status information G1 of the first data stream may include at least one of the status information of the first data stream, the status information of data streams related to the first data stream, and the status information of resources related to the first data stream. The relevant status information G2 of the first data stream may include at least one of the status information of the first data stream, the status information of data streams related to the first data stream, and the status information of resources related to the first data stream. Furthermore, the relevant status information G1 and the relevant status information G2 may include the same information, or they may be different. This embodiment of the application does not limit this.
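The series arrangement of Figures 17 and 23 differs from the parallel case in that the second sub-inference analyzer consumes both its own state information G2 and the result V1 of the first sub-inference analyzer. The following is a minimal sketch of that data flow. The two functions stand in for whole sub-inference analyzers, and the field names (`dst_port`, `loss_rate`, `jitter_ms`), the port value, and the thresholds are assumptions chosen only to make the example runnable.

```python
def first_sub_analyzer(g1):
    """Stub for decision tree -> loop layer: classify the application."""
    # Hypothetical rule: treat traffic to port 3478 as video conferencing.
    return "video conferencing" if g1["dst_port"] == 3478 else "other"


def second_sub_analyzer(g2, v1):
    """Stub for FCL -> SVM -> clustering -> NB: rate the application quality.

    The result V1 of the first sub-inference analyzer selects which
    (hypothetical) quality thresholds apply to the state information G2.
    """
    if v1 == "video conferencing":
        degraded = g2["loss_rate"] > 0.05 or g2["jitter_ms"] > 30
    else:
        degraded = g2["loss_rate"] > 0.20
    return "poor" if degraded else "good"


def run_series(g1, g2):
    v1 = first_sub_analyzer(g1)          # V1 is passed on, not output externally
    v2 = second_sub_analyzer(g2, v1)     # V2 is the analyzer's output
    return v1, v2


G1 = {"dst_port": 3478}                      # hypothetical state info
G2 = {"loss_rate": 0.08, "jitter_ms": 45}    # hypothetical state info

V1, V2 = run_series(G1, G2)
print(V1, "->", V2)  # video conferencing -> poor
```

The same loss rate leads to a different quality verdict when V1 is not "video conferencing", which shows why the second sub-path cannot start before the first has produced its result.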
[0166] In another embodiment, as shown in Figure 19, the first inference analyzer is configured such that the communication chip inputs relevant status information of the first data stream into the first inference analyzer. The first inference analyzer performs inference analysis on the first data stream based on this status information, obtaining an inference analysis result E2. The first inference analyzer then outputs the inference analysis result E2. As shown in Figure 19, the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer, which are connected in series. The communication chip inputs relevant status information of the first data stream into the first sub-inference analyzer. The first sub-inference analyzer performs inference analysis on the first data stream based on this status information, obtaining an inference analysis result E1. The first sub-inference analyzer then outputs the inference analysis result E1 to the second sub-inference analyzer. The second sub-inference analyzer performs inference analysis on the first data stream based on the inference analysis result E1, obtaining an inference analysis result E2. The second sub-inference analyzer then outputs the inference analysis result E2. It should be noted that in the first inference analyzer shown in Figure 19, the first sub-inference analyzer and the second sub-inference analyzer each include at least one hardware inference unit from the hardware inference resource pool. For example, a first sub-inference analyzer includes at least one first hardware inference unit from a hardware inference resource pool, and a second sub-inference analyzer includes at least one second hardware inference unit from the same hardware inference resource pool. The at least one first hardware inference unit constitutes a first inference unit sub-path, and the at least one second hardware inference unit constitutes a second inference unit sub-path. 
The at least one first hardware inference unit and the at least one second hardware inference unit constitute a first inference path, where the first and second inference unit sub-paths are two sequential sub-paths within the first inference path. The first sub-inference analyzer performs inference analysis on the first data stream according to the relevant state information of the first data stream and the first inference unit sub-path to obtain inference analysis result E1. The second sub-inference analyzer performs inference analysis on the first data stream according to the second inference unit sub-path based on inference analysis result E1 to obtain inference analysis result E2.
[0167] In another embodiment, as shown in Figure 25, the communication chip inputs relevant status information of the first data stream into the first inference analyzer. The first inference analyzer performs inference analysis on the first data stream based on this information, obtaining inference analysis results R2 and R3, and then outputs inference analysis results R2 and R3. As shown in Figure 25, the first inference analyzer includes a first sub-inference analyzer, a second sub-inference analyzer, and a third sub-inference analyzer. The first and second sub-inference analyzers are connected in series, as are the first and third sub-inference analyzers; the second and third sub-inference analyzers are connected in parallel. The communication chip inputs relevant status information of the first data stream into the first sub-inference analyzer. The first sub-inference analyzer performs inference analysis on the first data stream based on this information, obtaining inference analysis result R1. The first sub-inference analyzer then outputs inference analysis result R1 to the second and third sub-inference analyzers, respectively. The second sub-inference analyzer performs inference analysis on the first data stream based on the inference analysis result R1 to obtain inference analysis result R2, and outputs inference analysis result R2. The third sub-inference analyzer performs inference analysis on the first data stream based on the inference analysis result R1 to obtain inference analysis result R3, and outputs inference analysis result R3. 
It should be noted that in the first inference analyzer shown in Figure 25, the first sub-inference analyzer includes at least one first hardware inference unit from the hardware inference resource pool, the second sub-inference analyzer includes at least one second hardware inference unit from the hardware inference resource pool, and the third sub-inference analyzer includes at least one third hardware inference unit from the hardware inference resource pool. The at least one first hardware inference unit constitutes a first inference unit sub-path, the at least one second hardware inference unit constitutes a second inference unit sub-path, and the at least one third hardware inference unit constitutes a third inference unit sub-path. The at least one first hardware inference unit, the at least one second hardware inference unit, and the at least one third hardware inference unit together constitute a first inference path (i.e., an inference unit path). The first and second inference unit sub-paths are two sequential sub-paths within the first inference path, and the first and third inference unit sub-paths are two sequential sub-paths within the first inference path. The second and third inference unit sub-paths are two parallel sub-paths within the first inference path. The first sub-inference analyzer performs inference analysis on the first data stream along the first inference unit sub-path based on the relevant state information of the first data stream to obtain inference analysis result R1. The second sub-inference analyzer performs inference analysis on the first data stream along the second inference unit sub-path based on inference analysis result R1 to obtain inference analysis result R2. The third sub-inference analyzer performs inference analysis on the first data stream along the third inference unit sub-path based on inference analysis result R1 to obtain inference analysis result R3.
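The topology of Figure 25 (one upstream sub-inference analyzer feeding two parallel downstream sub-inference analyzers) can be illustrated with the following sketch. It is a software analogy only: the three analyzer functions, their outputs, and the input field name are hypothetical, and a thread pool stands in for hardware sub-paths that operate in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def first_sub_analyzer(state):
    # R1: toy application classification from the status information.
    return "video conferencing" if state["packets_per_second"] > 50 else "other"


def second_sub_analyzer(r1):
    # R2: toy priority decision derived only from R1.
    return "high priority" if r1 == "video conferencing" else "normal priority"


def third_sub_analyzer(r1):
    # R3: toy bandwidth estimate in Mbit/s derived only from R1.
    return 4 if r1 == "video conferencing" else 1


def first_inference_analyzer(state):
    r1 = first_sub_analyzer(state)  # series stage shared by both branches
    with ThreadPoolExecutor(max_workers=2) as pool:
        f2 = pool.submit(second_sub_analyzer, r1)  # parallel branch producing R2
        f3 = pool.submit(third_sub_analyzer, r1)   # parallel branch producing R3
        return f2.result(), f3.result()


R2, R3 = first_inference_analyzer({"packets_per_second": 120})
print(R2, R3)
```

Both downstream branches consume the same R1, so neither depends on the other's output, which is what allows the second and third sub-paths to run in parallel.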
[0168] In an optional embodiment, after the communication chip uses the first inference analyzer to perform inference analysis on the first data stream and obtains the inference analysis result of the first data stream, the communication chip determines the processing operation related to that inference analysis result based on the inference analysis result and a first operation instruction table. Then, the communication chip executes the processing operation related to the inference analysis result of the first data stream. The first operation instruction table corresponds to the first inference model (or, in other words, to the first inference analyzer), and includes the inference analysis result of the first data stream and indication information of the processing operation related to that inference analysis result. For example, the first operation instruction table includes the inference analysis result of the first data stream and the operation instruction information corresponding to that result, where the operation instruction information indicates the processing operation related to the inference analysis result of the first data stream. In a specific embodiment, the communication chip includes multiple operation instruction tables, which correspond one-to-one with multiple inference models. After the communication chip uses the first inference analyzer to perform inference analysis on the first data stream and obtains the inference analysis result of the first data stream, the communication chip determines, from the multiple operation instruction tables, the first operation instruction table corresponding to the first inference model. 
The communication chip searches the first operation instruction table based on the inference analysis result of the first data stream to determine the processing operation related to the inference analysis result of the first data stream. Then, the communication chip executes the processing operation related to the inference analysis result of the first data stream.
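The two-step lookup described above (select the table for the inference model, then look up the operation for the inference analysis result) can be sketched as nested dictionaries. The model identifiers, result values, and the three operations below are hypothetical placeholders, not operations defined by this application.

```python
from typing import Callable, Dict


def raise_priority(flow_id: str) -> str:
    return f"raise priority of {flow_id}"


def keep_forwarding(flow_id: str) -> str:
    return f"keep forwarding {flow_id}"


def drop_flow(flow_id: str) -> str:
    return f"drop {flow_id}"


# One operation instruction table per inference model (one-to-one).
# Each table maps an inference analysis result to a processing operation.
OPERATION_INSTRUCTION_TABLES: Dict[str, Dict[str, Callable[[str], str]]] = {
    "quality model": {"poor": raise_priority, "good": keep_forwarding},
    "security model": {"attack": drop_flow, "benign": keep_forwarding},
}


def process_result(model_id: str, result: str, flow_id: str) -> str:
    table = OPERATION_INSTRUCTION_TABLES[model_id]  # table for this inference model
    operation = table[result]                       # operation for this result
    return operation(flow_id)                       # execute the operation


print(process_result("quality model", "poor", "flow-1"))
```

Keeping one table per model means the same result value can map to different operations depending on which inference model produced it.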
[0169] Based on the above description, it is easy to understand that when the first inference model includes multiple expert sub-models, the first inference analyzer includes multiple sub-inference analyzers, and the communication chip uses these multiple sub-inference analyzers to perform inference analysis on the first data stream. Therefore, the embodiments of this application can achieve simultaneous inference analysis of the same data stream using multiple sub-inference analyzers, thereby realizing relatively complex inference analysis.
[0170] It should be noted that the communication chip can execute the embodiment shown in Figure 3 to process the first data stream when the first data stream is the data stream to be analyzed and the first data stream meets the inference analysis conditions. Furthermore, the communication chip can execute the embodiment shown in Figure 3 based on each message of the first data stream, or it can periodically execute the embodiment shown in Figure 3 based on the messages of the first data stream. For example, the communication chip executes the embodiment shown in Figure 3 based on each message of the first data stream, where the first message is any message of the first data stream. Another example is that the communication chip periodically executes the embodiment shown in Figure 3 based on the messages of the first data stream; the communication chip periodically acquires messages of the first data stream and executes the embodiment shown in Figure 3 based on the acquired messages, where the first message is a message belonging to the first data stream acquired periodically by the communication chip. The inference analysis conditions may include, but are not limited to, the duration of the first data stream exceeding a preset duration, the number of messages belonging to the first data stream among the messages passing through the communication chip reaching a preset number, and the duration since the last use of the AI function to process the first data stream reaching a target duration. The preset duration, the preset number, and the target duration can all be set according to actual conditions. For example, the preset duration is 1 second, the preset quantity is 100, and the target duration is 100 milliseconds.
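The inference analysis conditions listed above can be expressed as simple threshold checks. The sketch below uses the example values from this paragraph (1 second, 100 messages, 100 milliseconds); the `FlowStats` record, its field names, and the choice to treat the conditions as alternatives by default are assumptions, since the text does not state how the conditions are combined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

PRESET_DURATION_S = 1.0   # preset duration: 1 second
PRESET_COUNT = 100        # preset number of messages
TARGET_DURATION_S = 0.1   # target duration: 100 milliseconds


@dataclass
class FlowStats:
    first_seen_s: float      # time the first message of the data stream was seen
    message_count: int       # messages of this data stream seen so far
    last_inference_s: float  # time the AI function last processed this stream


def satisfied_conditions(stats: FlowStats, now_s: float) -> Dict[str, bool]:
    return {
        "duration": now_s - stats.first_seen_s > PRESET_DURATION_S,
        "count": stats.message_count >= PRESET_COUNT,
        "since_last": now_s - stats.last_inference_s >= TARGET_DURATION_S,
    }


def meets_inference_conditions(
    stats: FlowStats,
    now_s: float,
    combine: Callable[[Iterable[bool]], bool] = any,  # assumption: any one suffices
) -> bool:
    return combine(satisfied_conditions(stats, now_s).values())


young_flow = FlowStats(first_seen_s=0.0, message_count=40, last_inference_s=0.45)
busy_flow = FlowStats(first_seen_s=0.0, message_count=150, last_inference_s=0.45)
print(meets_inference_conditions(young_flow, now_s=0.5))
print(meets_inference_conditions(busy_flow, now_s=0.5))
```

Passing `combine=all` instead would require every condition to hold at once; which policy applies is a deployment choice outside the scope of this sketch.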
[0171] The embodiment shown in Figure 3 illustrates the use of AI functionality in a communication chip to process a first data stream. The communication chip can use AI functionality to process any data stream it receives. The implementation process of using AI functionality to process any data stream received by the communication chip can refer to the implementation process of using AI functionality to process the first data stream; therefore, it will not be repeated here.
[0172] In summary, the technical solution provided in this application involves a communication chip receiving a first message belonging to a first data stream. The chip then determines a first inference model based on the characteristic information of the first message. This first inference model instructs the use of a first inference analyzer to process the first data stream. The communication chip then uses the first inference analyzer to process the first data stream. On one hand, this application uses AI functionality to process the first data stream within the communication chip. Compared to using an external AI analyzer, this application eliminates the need for interaction between the communication chip and the AI analyzer, resulting in shorter processing time and higher efficiency when using AI functionality to process the first data stream. On the other hand, this application uses a first inference analyzer to process the first data stream. The first inference analyzer includes at least one hardware inference unit from a hardware inference resource pool. This hardware inference unit is a dedicated hardware inference unit, which has a faster processing speed and higher efficiency. Therefore, using the first inference analyzer to process the first data stream is both faster and more efficient.
[0173] Furthermore, in the technical solution provided in this application embodiment, the communication chip processes the first data stream using a first inference analyzer based on the relevant status information of the first data stream. The relevant status information of the first data stream includes the status information of the first data stream, the status information of data streams related to the first data stream, and the status information of resources related to the first data stream. That is, this application embodiment combines the status information of the first data stream, the status information of data streams related to the first data stream, and the status information of resources related to the first data stream, etc., to process the first data stream using the first inference analyzer, thereby improving the accuracy of processing the first data stream. For example, this application embodiment combines the status information of the first data stream, the status information of data streams related to the first data stream, and the status information of resources related to the first data stream, etc., to perform inference analysis on the first data stream using the first inference analyzer, thereby improving the accuracy of the inference analysis results of the first data stream.
[0174] In an optional embodiment, after the communication chip determines the first inference model based on the feature information of the first message, the communication chip constructs a first inference analyzer based on the configuration information of the first inference model and a hardware inference resource pool. Then, the communication chip uses the first inference analyzer to process the first data stream. The implementation process of the communication chip constructing the first inference analyzer based on the hardware inference resource pool is described below.
[0175] In an optional embodiment, the first inference model includes at least one inference component, the first inference analyzer includes at least one hardware inference unit in a hardware inference resource pool, and the configuration information of the first inference model includes a mapping relationship between the at least one inference component and the at least one hardware inference unit. For example, the mapping relationship is a one-to-one mapping relationship, where the at least one inference component corresponds one-to-one with the at least one hardware inference unit. The communication chip determines the at least one hardware inference unit in the hardware inference resource pool based on the mapping relationship between the at least one inference component and the at least one hardware inference unit; the communication chip constructs the first inference analyzer based on the at least one hardware inference unit.
[0176] In a specific embodiment, the configuration information of the first inference model further includes parameter configuration information of the at least one inference component. The communication chip constructs a first inference analyzer based on the at least one hardware inference unit, including: the communication chip configures the inference parameters of the hardware inference unit corresponding to each inference component according to the parameter configuration information of each inference component, so that the inference parameters of the hardware inference unit corresponding to each inference component are the same as the inference parameters of each inference component, and the parameter values of the corresponding inference parameters in the hardware inference unit and each inference component are the same. When the at least one inference component is multiple inference components, the configuration information of the first inference model further includes connectivity information of the multiple inference components, which is used to indicate the connectivity of the multiple inference components. The construction of the first inference analyzer based on the at least one hardware inference unit by the communication chip further includes: the communication chip configuring the multiple hardware inference units to be connected according to the connectivity indicated by the connectivity information of the multiple inference components. For example, hardware inference units in a hardware inference resource pool are connected via gates. A gate connecting any two hardware inference units controls whether those two units are connected or disconnected. 
The communication chip controls the gate state of each gate used to connect the multiple hardware inference units, ensuring that the multiple hardware inference units are connected according to the connectivity relationship indicated by the connectivity information of the multiple inference components. The connectivity relationship of the multiple hardware inference units thus corresponds to the connectivity relationship of the multiple inference components.
[0177] In one embodiment, the at least one inference component included in the first inference model is a single inference component, and the at least one hardware inference unit included in the first inference analyzer is a single hardware inference unit. The configuration information of the first inference model includes the mapping relationship between the inference component and the hardware inference unit, and the parameter configuration information of the inference component. The communication chip constructing the first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model includes: the communication chip determines the hardware inference unit in the hardware inference resource pool according to the mapping relationship between the inference component and the hardware inference unit; and the communication chip configures the inference parameters of the hardware inference unit according to the parameter configuration information of the inference component, so that the hardware inference unit has the same inference parameters as the inference component and each corresponding inference parameter has the same parameter value in both. In one example, the first inference model is inference model B as shown in Figure 8, and the first inference analyzer is inference analyzer B as shown in Figure 9. The first inference model includes a decision tree inference component, and the first inference analyzer includes a decision tree inference unit. The configuration information of the first inference model includes the mapping relationship between the decision tree inference component and the decision tree inference unit, as well as the parameter configuration information of the decision tree inference component. 
The communication chip determines the decision tree inference unit in the hardware inference resource pool based on the mapping relationship between the decision tree inference component and the decision tree inference unit. The communication chip configures the inference parameters of the decision tree inference unit according to the parameter configuration information of the decision tree inference component, ensuring that the inference parameters of the decision tree inference unit are the same as those of the decision tree inference component, and that the parameter values of the corresponding inference parameters in both the decision tree inference unit and the decision tree inference component are identical. For example, both the decision tree inference unit and the decision tree inference component include inference parameter B1 and inference parameter B2, and the parameter configuration information of the decision tree inference component includes the parameter values of inference parameter B1 and inference parameter B2. The communication chip configures the inference parameter B1 of the decision tree inference unit according to the parameter value of the inference parameter B1 of the decision tree inference component, so that the parameter value of the inference parameter B1 of the decision tree inference unit is the same as the parameter value of the inference parameter B1 of the decision tree inference component. The communication chip also configures the inference parameter B2 of the decision tree inference unit according to the parameter value of the inference parameter B2 of the decision tree inference component, so that the parameter value of the inference parameter B2 of the decision tree inference unit is the same as the parameter value of the inference parameter B2 of the decision tree inference component. 
In another example, the first inference model is inference model C as shown in Figure 10, and the first inference analyzer is inference analyzer C as shown in Figure 11. The first inference model includes a CNN inference component, and the first inference analyzer includes a CNN inference unit. The configuration information of the first inference model includes the mapping relationship between the CNN inference component and the CNN inference unit, as well as the parameter configuration information of the CNN inference component. The communication chip determines the CNN inference unit in the hardware inference resource pool based on the mapping relationship between the CNN inference component and the CNN inference unit. The communication chip configures the inference parameters of the CNN inference unit according to the parameter configuration information of the CNN inference component, ensuring that the inference parameters of the CNN inference unit are the same as those of the CNN inference component, and that the corresponding inference parameter values are identical in both the CNN inference unit and the CNN inference component. For example, both the CNN inference unit and the CNN inference component include inference parameter C1 and inference parameter C2. The parameter configuration information of the CNN inference component includes the parameter values of inference parameter C1 and inference parameter C2. The communication chip configures the inference parameter C1 of the CNN inference unit according to the parameter value of inference parameter C1 of the CNN inference component, ensuring that the parameter value of inference parameter C1 of the CNN inference unit is the same as that of inference parameter C1 of the CNN inference component. 
The communication chip configures the inference parameter C2 of the CNN inference unit according to the parameter value of the inference parameter C2 of the CNN inference component, so that the parameter value of the inference parameter C2 of the CNN inference unit is the same as the parameter value of the inference parameter C2 of the CNN inference component.
[0178] In another embodiment, the at least one inference component included in the first inference model is a plurality of inference components, and the at least one hardware inference unit included in the first inference analyzer is a plurality of hardware inference units. The configuration information of the first inference model includes the mapping relationship between the plurality of inference components and the plurality of hardware inference units, the parameter configuration information of the plurality of inference components, and the connectivity information of the plurality of inference components. The communication chip constructing the first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model includes: the communication chip determines the plurality of hardware inference units in the hardware inference resource pool according to the mapping relationship between the plurality of inference components and the plurality of hardware inference units; the communication chip configures the inference parameters of the hardware inference unit corresponding to each inference component according to the parameter configuration information of that inference component, such that the hardware inference unit has the same inference parameters as its corresponding inference component and each corresponding inference parameter has the same parameter value in both; and the communication chip configures the plurality of hardware inference units to be connected according to the connectivity relationship indicated by the connectivity information of the plurality of inference components. 
In one example, the first inference model is inference model A as shown in Figure 6, and the first inference analyzer is inference analyzer A as shown in Figure 7. The first inference model includes an FCL inference component and a loop-layer inference component, which are sequentially connected. The first inference analyzer includes an FCL inference unit and a loop-layer inference unit. The configuration information of the first inference model includes the mapping relationship between the FCL inference component and the FCL inference unit, the mapping relationship between the loop-layer inference component and the loop-layer inference unit, the parameter configuration information of the FCL inference component, the parameter configuration information of the loop-layer inference component, and the connectivity information between the FCL inference component and the loop-layer inference component, which indicates that the FCL inference component and the loop-layer inference component are sequentially connected. The communication chip determines the FCL inference unit in the hardware inference resource pool based on the mapping relationship between the FCL inference component and the FCL inference unit. The communication chip determines the loop-layer inference unit in the hardware inference resource pool based on the mapping relationship between the loop-layer inference component and the loop-layer inference unit. The communication chip configures the inference parameters of the FCL inference unit according to the parameter configuration information of the FCL inference component, ensuring that the inference parameters of the FCL inference unit are the same as those of the FCL inference component, and that the corresponding inference parameter values are identical in both the FCL inference unit and the FCL inference component. 
Similarly, the communication chip configures the inference parameters of the loop layer inference unit according to the parameter configuration information of the loop layer inference component, ensuring that the inference parameters of the loop layer inference unit are the same as those of the loop layer inference component, and that the corresponding inference parameter values are identical in both the loop layer inference unit and the loop layer inference component. For example, both the FCL inference unit and the FCL inference component include inference parameters A1 and A2, and both the loop layer inference unit and the loop layer inference component include inference parameters A3 and A4. The parameter configuration information of the FCL inference component includes the parameter values of inference parameters A1 and A2, and the parameter configuration information of the loop layer inference component includes the parameter values of inference parameters A3 and A4. The communication chip configures the inference parameter A1 of the FCL inference unit according to the parameter value of inference parameter A1 of the FCL inference component, so that the parameter value of inference parameter A1 of the FCL inference unit is the same as the parameter value of inference parameter A1 of the FCL inference component. The communication chip configures the inference parameter A2 of the FCL inference unit according to the parameter value of inference parameter A2 of the FCL inference component, so that the parameter value of inference parameter A2 of the FCL inference unit is the same as the parameter value of inference parameter A2 of the FCL inference component. 
The communication chip configures the inference parameter A3 of the loop-layer inference unit according to the parameter value of the inference parameter A3 of the loop-layer inference component, so that the parameter value of the inference parameter A3 of the loop-layer inference unit is the same as the parameter value of the inference parameter A3 of the loop-layer inference component. The inference parameter A4 of the loop-layer inference unit is configured in the same way according to the parameter value of the inference parameter A4 of the loop-layer inference component. Furthermore, the communication chip configures the FCL inference unit and the loop-layer inference unit to be connected according to the connectivity relationship indicated by the connectivity information of the FCL inference component and the loop-layer inference component. That is, the communication chip configures the FCL inference unit and the loop-layer inference unit to be connected sequentially. For example, the communication chip controls the selection state of the selector used to connect the FCL inference unit and the loop-layer inference unit to ensure that the FCL inference unit and the loop-layer inference unit are connected sequentially.
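The construction steps for inference model A (select the mapped units, copy the parameter values, then connect the units) can be sketched as follows. This is a minimal software model under stated assumptions: the class names, the dictionary layout of the configuration information, the per-pair gate table, the pool contents, and all parameter values for A1 to A4 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class HardwareInferenceUnit:
    name: str
    params: Dict[str, float] = field(default_factory=dict)


class HardwareInferenceResourcePool:
    def __init__(self, names: List[str]):
        self.units = {n: HardwareInferenceUnit(n) for n in names}
        # Hypothetical gate per ordered unit pair; False means disconnected.
        self.gates: Dict[Tuple[str, str], bool] = {
            (a, b): False for a in names for b in names if a != b
        }


def build_analyzer(pool: HardwareInferenceResourcePool, config: dict) -> List[HardwareInferenceUnit]:
    mapping = config["mapping"]  # inference component -> hardware unit name
    for component, unit_name in mapping.items():
        # Copy each parameter value from the component to its mapped unit.
        pool.units[unit_name].params = dict(config["params"][component])
    for src, dst in config["connections"]:
        # Connect the units mapped from each pair of connected components.
        pool.gates[(mapping[src], mapping[dst])] = True
    return [pool.units[name] for name in mapping.values()]


# Assumed configuration information for inference model A.
MODEL_A = {
    "mapping": {"FCL component": "FCL unit", "loop layer component": "loop layer unit"},
    "params": {
        "FCL component": {"A1": 0.5, "A2": 8},
        "loop layer component": {"A3": 3, "A4": 0.1},
    },
    "connections": [("FCL component", "loop layer component")],
}

pool = HardwareInferenceResourcePool(
    ["decision tree unit", "FCL unit", "loop layer unit", "CNN unit"]
)
analyzer_a = build_analyzer(pool, MODEL_A)
print([unit.name for unit in analyzer_a])
```

Units that the model does not map (here the decision tree unit and the CNN unit) stay untouched in the pool and remain available for other inference analyzers.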
[0179] In an optional embodiment, the first inference model includes a first expert sub-model and a second expert sub-model, and the first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The configuration information of the first inference model includes the configuration information of the first expert sub-model and the second expert sub-model. The communication chip constructs the first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model, including: the communication chip constructs the first sub-inference analyzer based on the hardware inference resource pool according to the configuration information of the first expert sub-model; the communication chip constructs the second sub-inference analyzer based on the hardware inference resource pool according to the configuration information of the second expert sub-model.
[0180] In an optional embodiment, the first expert sub-model includes at least one first inference component, and the second expert sub-model includes at least one second inference component. The at least one inference component included in the first inference model includes both the at least one first inference component and the at least one second inference component. The first sub-inference analyzer includes at least one first hardware inference unit, and the second sub-inference analyzer includes at least one second hardware inference unit. The at least one hardware inference unit included in the first inference analyzer includes both the at least one first hardware inference unit and the at least one second hardware inference unit. The mapping relationship between the at least one inference component and the at least one hardware inference unit includes the mapping relationship between the at least one first inference component and the at least one first hardware inference unit, and the mapping relationship between the at least one second inference component and the at least one second hardware inference unit. For example, the configuration information of the first expert sub-model includes the mapping relationship between the at least one first inference component and the at least one first hardware inference unit, and the configuration information of the second expert sub-model includes the mapping relationship between the at least one second inference component and the at least one second hardware inference unit. The communication chip determines the at least one first hardware inference unit in the hardware inference resource pool according to the mapping relationship between the at least one first inference component and the at least one first hardware inference unit, and the communication chip constructs the first sub-inference analyzer based on the at least one first hardware inference unit. 
The communication chip determines the at least one second hardware inference unit in the hardware inference resource pool based on the mapping relationship between the at least one second inference component and the at least one second hardware inference unit, and the communication chip constructs a second sub-inference analyzer based on the at least one second hardware inference unit.
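The selection step described in [0180] can be sketched as follows. This is a minimal illustration only, assuming the mapping relationship is held as a dictionary from inference component name to hardware inference unit identifier. The class `HardwareUnit`, the function `select_units`, the identifiers such as `hw0`, and the check that a unit is not already allocated are all hypothetical and are not taken from this application.

```python
from dataclasses import dataclass


@dataclass
class HardwareUnit:
    """One hardware inference unit in the resource pool (hypothetical model)."""
    unit_id: str
    kind: str
    in_use: bool = False


def select_units(pool, mapping):
    """Pick the units named in a component-to-unit mapping and mark them in use."""
    selected = {}
    for component, unit_id in mapping.items():
        unit = pool[unit_id]  # KeyError if the mapping names an unknown unit
        if unit.in_use:
            # Assumption: a unit serves at most one component at a time.
            raise ValueError(f"unit {unit_id} is already allocated")
        unit.in_use = True
        selected[component] = unit
    return selected


# Hypothetical hardware inference resource pool with four units.
pool = {
    u.unit_id: u
    for u in [
        HardwareUnit("hw0", "decision_tree"),
        HardwareUnit("hw1", "svm"),
        HardwareUnit("hw2", "fcl"),
        HardwareUnit("hw3", "nb"),
    ]
}

# Mapping relationships carried in the configuration information
# of the first and second expert sub-models (illustrative values).
first_mapping = {"dt_component": "hw0", "svm_component": "hw1"}
second_mapping = {"fcl_component": "hw2"}

first_sub_analyzer = select_units(pool, first_mapping)
second_sub_analyzer = select_units(pool, second_mapping)
```

In this sketch each sub-inference analyzer is simply the set of units selected for one expert sub-model; units that no mapping names (here `hw3`) remain free in the pool.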
[0181] In a specific embodiment, the configuration information of the first inference model further includes parameter configuration information of the at least one first inference component and parameter configuration information of the at least one second inference component. For example, the configuration information of the first expert sub-model further includes parameter configuration information of the at least one first inference component, and the configuration information of the second expert sub-model further includes parameter configuration information of the at least one second inference component. The communication chip constructs a first sub-inference analyzer based on the at least one first hardware inference unit, including: the communication chip configuring inference parameters for the first hardware inference unit corresponding to each first inference component based on the parameter configuration information of each of the at least one first inference components, such that the inference parameters of the first hardware inference unit corresponding to each first inference component are the same as the inference parameters of each first inference component, and the parameter values of the corresponding inference parameters in the first hardware inference unit corresponding to each first inference component and in each first inference component are the same. 
The communication chip constructs a second sub-inference analyzer based on the at least one second hardware inference unit, including: the communication chip configuring inference parameters for the second hardware inference unit corresponding to each second inference component based on the parameter configuration information of each second inference component, such that the inference parameters of the second hardware inference unit corresponding to each second inference component are the same as the inference parameters of each second inference component, and the parameter values of the corresponding inference parameters in the second hardware inference unit and each second inference component are the same. When the at least one first inference component is multiple first inference components, the configuration information of the first inference model also includes connectivity information of the multiple first inference components, which is used to indicate the connectivity relationship of the multiple first inference components. For example, the configuration information of the first expert sub-model also includes the connectivity information of the multiple first inference components. The communication chip constructing the first sub-inference analyzer based on the at least one first hardware inference unit further includes: the communication chip configuring the multiple first hardware inference units to be connected according to the connectivity relationship indicated by the connectivity information of the multiple first inference components. For example, the hardware inference units in the hardware inference resource pool are connected via gates. The communication chip controls the gate state of each gate used to connect the plurality of first hardware inference units, so that the plurality of first hardware inference units are connected according to the connectivity relationship indicated by the connectivity information of the plurality of first inference components. 
The connectivity relationship of the plurality of first hardware inference units corresponds to the connectivity relationship of the plurality of first inference components. When the at least one second inference component is a plurality of second inference components, the configuration information of the first inference model also includes connectivity information of the plurality of second inference components, which is used to indicate the connectivity relationship of the plurality of second inference components. For example, the configuration information of the second expert sub-model also includes the connectivity information of the plurality of second inference components. The communication chip constructing the second sub-inference analyzer based on the at least one second hardware inference unit further includes: the communication chip configuring the plurality of second hardware inference units to be connected according to the connectivity relationship indicated by the connectivity information of the plurality of second inference components. For example, the hardware inference units in the hardware inference resource pool are connected via gates, and the communication chip controls the gate state of each gate used to connect the plurality of second hardware inference units, so that the plurality of second hardware inference units are connected according to the connectivity relationship indicated by the connectivity information of the plurality of second inference components, and the connectivity relationship of the plurality of second hardware inference units corresponds to the connectivity relationship of the plurality of second inference components.
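The two configuration steps in [0181], copying trained parameter values into the mapped hardware inference units and setting gate states according to the connectivity information, can be sketched as below. This is an illustration under stated assumptions: each gate is modeled as a boolean keyed by a (source unit, destination unit) pair, with `True` meaning the gate connects the two units; the function names, unit identifiers, and parameter names are hypothetical.

```python
def configure_parameters(unit_params, mapping, param_config):
    """Copy each component's parameter values into its mapped hardware unit."""
    for component, params in param_config.items():
        unit_params[mapping[component]] = dict(params)


def close_gates(gates, mapping, connectivity):
    """Set the gate between the units mapped to each connected component pair."""
    for src, dst in connectivity:
        key = (mapping[src], mapping[dst])
        if key not in gates:
            raise KeyError(f"no gate between {key[0]} and {key[1]}")
        gates[key] = True


# Hypothetical pool state: per-unit parameter storage and gates (all open).
unit_params = {"hw0": {}, "hw1": {}, "hw2": {}}
gates = {("hw0", "hw1"): False, ("hw1", "hw2"): False, ("hw0", "hw2"): False}

# Configuration information of one expert sub-model (illustrative values).
mapping = {"dt": "hw0", "svm": "hw1", "cluster": "hw2"}
param_config = {
    "dt": {"threshold": 0.5},
    "svm": {"coefficients": [0.2, -0.7]},
    "cluster": {"centroids": [[0.0, 1.0], [1.0, 0.0]]},
}
connectivity = [("dt", "svm"), ("svm", "cluster")]

configure_parameters(unit_params, mapping, param_config)
close_gates(gates, mapping, connectivity)
```

After both calls, the connectivity of the units mirrors that of the components: the gates for `hw0` to `hw1` and `hw1` to `hw2` are set, while the unused gate from `hw0` to `hw2` stays open.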
[0182] In an optional embodiment, the first inference model includes a first expert sub-model and a second expert sub-model. The configuration information of the first inference model further includes serial-parallel connection information between the first and second expert sub-models, which indicates whether the first and second expert sub-models are connected in series or in parallel. The communication chip constructing the first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model further includes: the communication chip constructing a serial-parallel connection between the first and second sub-inference analyzers according to the serial-parallel connection relationship indicated by the serial-parallel connection information between the first and second expert sub-models. The serial-parallel connection between the first and second sub-inference analyzers is the same as the serial-parallel connection between the first and second expert sub-models. In one embodiment, the hardware inference units in the hardware inference resource pool are connected via gates, and the serial-parallel connection information indicates that the first expert sub-model and the second expert sub-model are connected in series. The communication chip controls the gate state of the gate used to connect the first sub-inference analyzer and the second sub-inference analyzer, thereby connecting the first sub-inference analyzer and the second sub-inference analyzer in series. For example, the communication chip controls the gate state of the gate used to connect the target first hardware inference unit and the target second hardware inference unit, so that the target first hardware inference unit is connected to the target second hardware inference unit and the first sub-inference analyzer and the second sub-inference analyzer are thereby connected in series. 
The target first hardware inference unit is the output inference unit in the first sub-inference analyzer, and the target first hardware inference unit corresponds to the output inference component in the first expert sub-model. The target second hardware inference unit is the input inference unit in the second sub-inference analyzer, and the target second hardware inference unit corresponds to the input inference component in the second expert sub-model. The output inference unit in the first sub-inference analyzer is the hardware inference unit in the first sub-inference analyzer used to output the processing result (e.g., inference analysis result) of the first sub-inference analyzer. The input inference unit in the second sub-inference analyzer is the hardware inference unit in the second sub-inference analyzer used to receive input information. The output inference component in the first expert sub-model is the inference component used to output the processing result (e.g., inference analysis result) of the first expert sub-model. The input inference component in the second expert sub-model is the inference component used to receive input information.
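The series connection in [0182] can be sketched as follows: locate the output inference component of the first expert sub-model and the input inference component of the second, then set the gate between the hardware inference units mapped to them. This sketch assumes each sub-model has exactly one input component and one output component, and that gates are booleans keyed by unit pairs; the dictionary layout and all names are hypothetical.

```python
def endpoints(components, connectivity):
    """Return (input_component, output_component) of one expert sub-model."""
    has_incoming = {dst for _, dst in connectivity}
    has_outgoing = {src for src, _ in connectivity}
    inputs = [c for c in components if c not in has_incoming]
    outputs = [c for c in components if c not in has_outgoing]
    if len(inputs) != 1 or len(outputs) != 1:
        raise ValueError("expected exactly one input and one output component")
    return inputs[0], outputs[0]


def connect_in_series(gates, first, second):
    """Set the gate from the first analyzer's output unit to the second's input unit."""
    _, first_out = endpoints(first["components"], first["connectivity"])
    second_in, _ = endpoints(second["components"], second["connectivity"])
    key = (first["mapping"][first_out], second["mapping"][second_in])
    if key not in gates:
        raise KeyError(f"no gate between {key[0]} and {key[1]}")
    gates[key] = True
    return key


# Illustrative configuration of two expert sub-models.
first = {
    "components": ["dt", "recurrent"],
    "connectivity": [("dt", "recurrent")],
    "mapping": {"dt": "hw0", "recurrent": "hw1"},
}
second = {
    "components": ["fcl", "nb"],
    "connectivity": [("fcl", "nb")],
    "mapping": {"fcl": "hw2", "nb": "hw3"},
}
gates = {("hw1", "hw2"): False}

linked = connect_in_series(gates, first, second)
```

Here `hw1` plays the role of the target first hardware inference unit and `hw2` the target second hardware inference unit.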
[0183] In one embodiment, the first inference model includes a first expert sub-model and a second expert sub-model. The first expert sub-model includes at least one first inference component, specifically multiple first inference components. The second expert sub-model includes at least one second inference component, specifically multiple second inference components. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes at least one first hardware inference unit, specifically multiple first hardware inference units. The second sub-inference analyzer includes at least one second hardware inference unit, specifically multiple second hardware inference units. The configuration information of the first inference model includes the configuration information of the first expert sub-model, the configuration information of the second expert sub-model, and the serial-parallel connection information between the first and second expert sub-models. The configuration information of the first expert sub-model includes the mapping relationship between the multiple first inference components and the multiple first hardware inference units, the parameter configuration information of the multiple first inference components, and the connectivity information of the multiple first inference components. The configuration information of the second expert sub-model includes the mapping relationship between the multiple second inference components and the multiple second hardware inference units, the parameter configuration information of the multiple second inference components, and the connectivity information of the multiple second inference components. The communication chip constructs a first sub-inference analyzer based on the configuration information of the first expert sub-model and the hardware inference resource pool. 
This includes: the communication chip determining the plurality of first hardware inference units in the hardware inference resource pool based on the mapping relationship between the plurality of first inference components and the plurality of first hardware inference units; the communication chip configuring inference parameters for the first hardware inference unit corresponding to each of the plurality of first inference components based on the parameter configuration information of each of the plurality of first inference components, such that the inference parameters of the first hardware inference unit corresponding to each of the first inference components are the same as the inference parameters of each of the first inference components, and the parameter values of the corresponding inference parameters in the first hardware inference unit corresponding to each of the first inference components and in each of the first inference components are the same; and the communication chip configuring the plurality of first hardware inference units to be connected according to the connectivity relationship indicated by the connectivity information of the plurality of first inference components. The communication chip constructs a second sub-inference analyzer based on the configuration information of the second expert sub-model and the hardware inference resource pool. 
This includes: the communication chip determining the plurality of second hardware inference units in the hardware inference resource pool according to the mapping relationship between the plurality of second inference components and the plurality of second hardware inference units; the communication chip configuring the inference parameters of the second hardware inference unit corresponding to each of the plurality of second inference components based on the parameter configuration information of each second inference component, ensuring that the inference parameters of the second hardware inference unit corresponding to each second inference component are the same as the inference parameters of each second inference component, and that the parameter values of the corresponding inference parameters in the second hardware inference unit and each second inference component are the same; and the communication chip configuring the plurality of second hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of second inference components. The communication chip constructs the serial-parallel connection relationship between the first sub-inference analyzer and the second sub-inference analyzer according to the serial-parallel connection relationship indicated by the serial-parallel connection relationship information between the first expert sub-model and the second expert sub-model. For example, hardware inference units in the hardware inference resource pool are connected via gates, and this serial-parallel connection information is used to indicate that the first expert sub-model and the second expert sub-model are connected in series. 
The communication chip controls the gate state of each gate used to connect the plurality of first hardware inference units, so that the plurality of first hardware inference units are connected according to the connectivity relationship indicated by the connectivity information of the plurality of first inference components. The communication chip controls the gate state of each gate used to connect the plurality of second hardware inference units, so that the plurality of second hardware inference units are connected according to the connectivity relationship indicated by the connectivity information of the plurality of second inference components. The communication chip controls the gate state of each gate used to connect the first sub-inference analyzer and the second sub-inference analyzer, so that the first sub-inference analyzer and the second sub-inference analyzer are connected in series.
[0184] In one example, the first inference model is shown in Figure 20, and the first inference analyzer is shown in Figure 21. The first inference model includes a first expert sub-model and a second expert sub-model, which are connected in parallel. The first expert sub-model includes a decision tree inference component, an SVM inference component, and a clustering inference component, which are sequentially connected. The second expert sub-model includes a recurrent layer inference component, an FCL inference component, and an NB inference component, which are sequentially connected. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes a decision tree inference unit, an SVM inference unit, and a clustering inference unit, while the second sub-inference analyzer includes a recurrent layer inference unit, an FCL inference unit, and an NB inference unit. The configuration information of the first inference model includes the configuration information of the first expert sub-model, the configuration information of the second expert sub-model, and the serial / parallel connection information between the first and second expert sub-models, which indicates the parallel connection of the first and second expert sub-models. 
The configuration information of the first expert sub-model includes: the mapping relationship between the decision tree inference component and the decision tree inference unit, the mapping relationship between the SVM inference component and the SVM inference unit, the mapping relationship between the clustering inference component and the clustering inference unit, the parameter configuration information of the decision tree inference component, the parameter configuration information of the SVM inference component, the parameter configuration information of the clustering inference component, and the connectivity information of the decision tree inference component, the SVM inference component, and the clustering inference component, wherein the connectivity information indicates that the decision tree inference component, the SVM inference component, and the clustering inference component are connected sequentially. The configuration information of the second expert sub-model includes: the mapping relationship between the recurrent layer inference component and the recurrent layer inference unit, the mapping relationship between the FCL inference component and the FCL inference unit, the mapping relationship between the NB inference component and the NB inference unit, the parameter configuration information of the recurrent layer inference component, the parameter configuration information of the FCL inference component, the parameter configuration information of the NB inference component, and the connectivity information of the recurrent layer inference component, the FCL inference component, and the NB inference component, wherein the connectivity information indicates that the recurrent layer inference component, the FCL inference component, and the NB inference component are connected sequentially. 
Based on the configuration information of the first expert sub-model, the communication chip constructs the first sub-inference analyzer based on the hardware inference resource pool. Specifically, this includes: the communication chip determining the decision tree inference unit in the hardware inference resource pool based on the mapping relationship between the decision tree inference component and the decision tree inference unit; the communication chip determining the SVM inference unit in the hardware inference resource pool based on the mapping relationship between the SVM inference component and the SVM inference unit; the communication chip determining the clustering inference unit in the hardware inference resource pool based on the mapping relationship between the clustering inference component and the clustering inference unit; the communication chip configuring the inference parameters of the decision tree inference unit according to the parameter configuration information of the decision tree inference component, so that the inference parameters of the decision tree inference unit are the same as the inference parameters of the decision tree inference component, and the parameter values of the corresponding inference parameters in the decision tree inference unit and the decision tree inference component are the same; the communication chip configuring the inference parameters of the SVM inference unit according to the parameter configuration information of the SVM inference component, so that the inference parameters of the SVM inference unit are the same as those of the SVM inference component, and the corresponding inference parameter values are identical in both the SVM inference unit and the SVM inference component; the communication chip configuring the inference parameters of the clustering inference unit according to the parameter configuration information of the clustering inference component, so that the inference parameters of the clustering inference unit are the same as those of the clustering inference component, and the corresponding inference parameter values are identical in both the clustering inference unit and the clustering inference component; and the communication chip configuring the decision tree inference unit, the SVM inference unit, and the clustering inference unit to be sequentially connected according to the connectivity information of the decision tree inference component, the SVM inference component, and the clustering inference component. 
Based on the configuration information of the second expert sub-model, the communication chip constructs the second sub-inference analyzer based on the hardware inference resource pool. Specifically, this includes: the communication chip determining the recurrent layer inference unit in the hardware inference resource pool based on the mapping relationship between the recurrent layer inference component and the recurrent layer inference unit; the communication chip determining the FCL inference unit in the hardware inference resource pool based on the mapping relationship between the FCL inference component and the FCL inference unit; the communication chip determining the NB inference unit in the hardware inference resource pool based on the mapping relationship between the NB inference component and the NB inference unit; the communication chip configuring the inference parameters of the recurrent layer inference unit according to the parameter configuration information of the recurrent layer inference component, so that the inference parameters of the recurrent layer inference unit are the same as the inference parameters of the recurrent layer inference component, and the parameter values of the corresponding inference parameters in the recurrent layer inference unit and the recurrent layer inference component are the same; the communication chip configuring the inference parameters of the FCL inference unit according to the parameter configuration information of the FCL inference component, so that the inference parameters of the FCL inference unit are the same as those of the FCL inference component, and the corresponding inference parameter values are the same in both the FCL inference unit and the FCL inference component; the communication chip configuring the inference parameters of the NB inference unit according to the parameter configuration information of the NB inference component, so that the inference parameters of the NB inference unit are the same as those of the NB inference component, and the corresponding inference parameter values are the same in both the NB inference unit and the NB inference component; and the communication chip configuring the recurrent layer inference unit, the FCL inference unit, and the NB inference unit to be connected sequentially according to the connectivity information of the recurrent layer inference component, the FCL inference component, and the NB inference component.
[0185] In another example, the first inference model is shown in Figure 22, and the first inference analyzer is shown in Figure 23. The first inference model includes a first expert sub-model and a second expert sub-model, which are connected in series. The first expert sub-model includes a decision tree inference component and a recurrent layer inference component, which are sequentially connected. The second expert sub-model includes an FCL inference component, an SVM inference component, a clustering inference component, and an NB inference component, which are sequentially connected. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes a decision tree inference unit and a recurrent layer inference unit, and the second sub-inference analyzer includes an FCL inference unit, an SVM inference unit, a clustering inference unit, and an NB inference unit. The configuration information of the first inference model includes the configuration information of the first expert sub-model, the configuration information of the second expert sub-model, and the series / parallel connection information between the first and second expert sub-models, which indicates the series connection of the first and second expert sub-models. 
The configuration information of the first expert sub-model includes: the mapping relationship between the decision tree inference component and the decision tree inference unit, the mapping relationship between the recurrent layer inference component and the recurrent layer inference unit, the parameter configuration information of the decision tree inference component, the parameter configuration information of the recurrent layer inference component, and the connectivity information between the decision tree inference component and the recurrent layer inference component, wherein the connectivity information indicates that the decision tree inference component and the recurrent layer inference component are connected sequentially. The configuration information of the second expert sub-model includes: the mapping relationship between the FCL inference component and the FCL inference unit, the mapping relationship between the SVM inference component and the SVM inference unit, the mapping relationship between the clustering inference component and the clustering inference unit, the mapping relationship between the NB inference component and the NB inference unit, the parameter configuration information of the FCL inference component, the parameter configuration information of the SVM inference component, the parameter configuration information of the clustering inference component, the parameter configuration information of the NB inference component, and the connectivity information of the FCL inference component, the SVM inference component, the clustering inference component, and the NB inference component. This connectivity information indicates that the FCL inference component, the SVM inference component, the clustering inference component, and the NB inference component are sequentially connected. 
Based on the configuration information of the first expert sub-model, the communication chip constructs a first sub-inference analyzer based on the hardware inference resource pool. Specifically, this includes: the communication chip determining the decision tree inference unit in the hardware inference resource pool based on the mapping relationship between the decision tree inference component and the decision tree inference unit; the communication chip determining the recurrent layer inference unit in the hardware inference resource pool based on the mapping relationship between the recurrent layer inference component and the recurrent layer inference unit; the communication chip configuring the inference parameters of the decision tree inference unit according to the parameter configuration information of the decision tree inference component, so that the inference parameters of the decision tree inference unit are the same as the inference parameters of the decision tree inference component, and the parameter values of the corresponding inference parameters in the decision tree inference unit and the decision tree inference component are the same; the communication chip configuring the inference parameters of the recurrent layer inference unit according to the parameter configuration information of the recurrent layer inference component, so that the inference parameters of the recurrent layer inference unit are the same as the inference parameters of the recurrent layer inference component, and the parameter values of the corresponding inference parameters in the recurrent layer inference unit and the recurrent layer inference component are the same; and the communication chip configuring the decision tree inference unit and the recurrent layer inference unit to be sequentially connected according to the connectivity information of the decision tree inference component and the recurrent layer inference component. 
The communication chip constructs a second sub-inference analyzer based on the configuration information of the second expert sub-model and the hardware inference resource pool. Specifically, this includes: the communication chip determining the FCL inference unit in the hardware inference resource pool based on the mapping relationship between the FCL inference component and the FCL inference unit; the communication chip determining the SVM inference unit in the hardware inference resource pool based on the mapping relationship between the SVM inference component and the SVM inference unit; the communication chip determining the clustering inference unit in the hardware inference resource pool based on the mapping relationship between the clustering inference component and the clustering inference unit; the communication chip determining the NB inference unit in the hardware inference resource pool based on the mapping relationship between the NB inference component and the NB inference unit; the communication chip configuring the inference parameters of the FCL inference unit according to the parameter configuration information of the FCL inference component, so that the inference parameters of the FCL inference unit are the same as the inference parameters of the FCL inference component, and the parameter values of the corresponding inference parameters in the FCL inference unit and the FCL inference component are the same; and the communication chip configuring the inference parameters of the SVM inference unit according to the parameter configuration information of the SVM inference component, so that the inference parameters of the SVM inference unit are the same as those of the SVM inference component, and the corresponding inference parameter values are the same in both the SVM inference unit and the SVM inference component. 
The communication chip configures the inference parameters of the clustering inference unit according to the parameter configuration information of the clustering inference component, so that the inference parameters of the clustering inference unit are the same as those of the clustering inference component, and the corresponding inference parameter values are the same in both the clustering inference unit and the clustering inference component. The communication chip also configures the inference parameters of the NB inference unit according to the parameter configuration information of the NB inference component, so that the inference parameters of the NB inference unit are the same as those of the NB inference component, and the corresponding inference parameter values are the same in both the NB inference unit and the NB inference component. The communication chip then configures the FCL inference unit, the SVM inference unit, the clustering inference unit, and the NB inference unit to be sequentially connected according to the connectivity information of the FCL inference component, the SVM inference component, the clustering inference component, and the NB inference component. Furthermore, based on the serial-parallel connection information between the first and second expert sub-models, the communication chip configures the first sub-inference analyzer and the second sub-inference analyzer to be connected in series. For example, the communication chip configures the recurrent layer inference unit to be connected to the FCL inference unit, thus connecting the first and second sub-inference analyzers in series.
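The structure of the example in [0185] can be written down as data and turned into a wiring plan, as sketched below. This is only an illustration: the dictionary layout, the function names, and unit identifiers such as `rl_unit` are hypothetical, and each sub-model is assumed to be a simple chain of at least two components. For the parallel case the sketch records no link between the two chains, since the text above does not spell out how parallel sub-inference analyzers are wired.

```python
def chain_order(connectivity):
    """Order the components of a sequentially connected chain, input first."""
    next_of = dict(connectivity)
    targets = set(next_of.values())
    current = next(c for c in next_of if c not in targets)
    order = [current]
    while current in next_of:
        current = next_of[current]
        order.append(current)
    return order


def build_plan(config):
    """Return the unit chain of each sub-analyzer and the links between chains."""
    chains = [
        [sub["mapping"][c] for c in chain_order(sub["connectivity"])]
        for sub in (config["first"], config["second"])
    ]
    links = []
    if config["connection"] == "series":
        # Output unit of the first chain feeds the input unit of the second.
        links.append((chains[0][-1], chains[1][0]))
    return {"chains": chains, "links": links}


# Illustrative configuration mirroring the series example.
model_config = {
    "connection": "series",
    "first": {
        "mapping": {"decision_tree": "dt_unit", "recurrent_layer": "rl_unit"},
        "connectivity": [("decision_tree", "recurrent_layer")],
    },
    "second": {
        "mapping": {
            "fcl": "fcl_unit",
            "svm": "svm_unit",
            "clustering": "cluster_unit",
            "nb": "nb_unit",
        },
        "connectivity": [("fcl", "svm"), ("svm", "clustering"), ("clustering", "nb")],
    },
}

plan = build_plan(model_config)
```

The single link in `plan["links"]` corresponds to connecting the last unit of the first sub-inference analyzer to the FCL inference unit at the head of the second.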
[0186] It should be noted that, in the embodiments of this application, the configuration information of the first inference model includes the parameter configuration information of each inference component in the first inference model, and the parameter configuration information includes the parameter values of the inference parameters. The parameter values of the inference parameters of each inference component in the first inference model are determined in advance by training the first inference model. For example, the inference parameters of the decision tree component include branch and comparison parameters, the inference parameters of the SVM inference component include variable coefficient parameters, the inference parameters of the NB inference component include conditional probability parameters, the inference parameters of the clustering inference component include type object coordinate parameters, and the inference parameters of the neural network inference component include weight and bias parameters. Furthermore, when the first inference model includes multiple expert sub-models, each of the multiple expert sub-models is trained independently in advance to determine the parameter values of the inference parameters of the inference components included in that expert sub-model.
[0187] In this embodiment, the communication chip includes a first model configuration template, which can be generated based on the configuration information of a first inference model. The first model configuration template is used to record the configuration information of the first inference model. For example, the first inference model is pre-trained offline to obtain the parameter values of the inference parameters of each inference component in the first inference model, and the first model configuration template is generated based on the first inference model and the parameter values of the inference parameters of each inference component in the first inference model.
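As a rough illustration of what such a template might record, the sketch below bundles the component-to-unit mapping, the trained parameter values, and the connectivity information into one structure and checks that it survives serialization. The field names, the JSON encoding, and the example values are assumptions for illustration; the application does not specify the template format.

```python
import json


def make_template(model_name, trained_params, mapping, connections):
    """Assemble a hypothetical model configuration template."""
    return {
        "model": model_name,
        "components": [
            {"component": c, "unit": mapping[c], "params": trained_params[c]}
            for c in trained_params
        ],
        # Each connection is an (upstream component, downstream component) pair.
        "connections": [list(edge) for edge in connections],
    }


# Illustrative values only; real parameter values come from offline training.
template = make_template(
    "first_inference_model",
    {"decision_tree": {"threshold": 3}, "SVM": {"coefficients": [0.4, 0.6]}},
    {"decision_tree": "dt_unit_0", "SVM": "svm_unit_0"},
    [("decision_tree", "SVM")],
)
restored = json.loads(json.dumps(template))
print(restored["components"][1]["unit"])
```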
[0188] This application uses the example of a communication chip building a first inference analyzer based on a hardware inference resource pool to illustrate the concept. The communication chip can build any possible inference analyzer based on this hardware inference resource pool. The implementation process of the communication chip building other inference analyzers based on this hardware inference resource pool can refer to the implementation process of the communication chip building the first inference analyzer based on this hardware inference resource pool, and will not be described in detail here.
[0189] The above is a description of the method embodiments of this application. The following describes the apparatus embodiments of this application, which are used to execute the method of this application. For details not disclosed in the apparatus embodiments of this application, please refer to the method embodiments.
[0190] Please refer to Figure 27, which shows a schematic diagram of a data stream processing device 700 provided in an embodiment of this application. The data stream processing device 700 is applied to a communication chip. For example, the data stream processing device 700 is a communication chip or a functional component within a communication chip. The communication chip is located in a communication device. As shown in Figure 27, the data stream processing device 700 includes a receiving module 710, a determining module 720, and a processing module 730.
[0191] The receiving module 710 is used to receive a first message, which belongs to a first data stream.
[0192] The determining module 720 is used to determine a first inference model based on the feature information of the first message. The first inference model is used to indicate the use of a first inference analyzer to process the first data stream. The first inference analyzer includes at least one hardware inference unit in a hardware inference resource pool. The communication chip includes the hardware inference resource pool.
[0193] The processing module 730 is used to process the first data stream using the first inference analyzer.
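The division of labor among the three modules can be sketched as a small pipeline. This is a minimal software analogy under stated assumptions: the flow key fields, the use of the destination port as the feature information, and the analyzer callables are all hypothetical and stand in for the hardware described in this application.

```python
from typing import Callable, Dict, Tuple

FlowKey = Tuple[str, str, int]  # hypothetical flow key: (src, dst, dst_port)


class DataStreamProcessor:
    def __init__(self, model_rules: Dict[int, str],
                 analyzers: Dict[str, Callable[[dict], str]]):
        self.model_rules = model_rules  # feature (dst_port) -> model name
        self.analyzers = analyzers      # model name -> analyzer callable
        self.flow_model: Dict[FlowKey, str] = {}

    def receive(self, message: dict) -> FlowKey:
        """Receiving module: identify the data stream the message belongs to."""
        return (message["src"], message["dst"], message["dst_port"])

    def determine(self, flow: FlowKey, message: dict) -> str:
        """Determining module: pick a model from the message's features."""
        if flow not in self.flow_model:
            self.flow_model[flow] = self.model_rules.get(
                message["dst_port"], "default")
        return self.flow_model[flow]

    def process(self, message: dict) -> str:
        """Processing module: run the analyzer indicated by the model."""
        flow = self.receive(message)
        model = self.determine(flow, message)
        return self.analyzers[model](message)


analyzers = {"video": lambda m: "low-latency", "default": lambda m: "best-effort"}
processor = DataStreamProcessor({554: "video"}, analyzers)
result = processor.process({"src": "10.0.0.1", "dst": "10.0.0.2", "dst_port": 554})
print(result)
```

Caching the chosen model per flow reflects the idea that the model is determined from the first message and then applied to the data stream as a whole.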
[0194] In an optional embodiment, referring to Figure 27, the data stream processing device 700 further includes a construction module 740. The construction module 740 is used to construct the first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model.
[0195] In an optional embodiment, the first inference model includes at least one inference component, and the configuration information of the first inference model includes a mapping relationship between the at least one inference component and the at least one hardware inference unit; the construction module 740 is configured to: determine the at least one hardware inference unit according to the mapping relationship between the at least one inference component and the at least one hardware inference unit; and construct a first inference analyzer based on the at least one hardware inference unit.
[0196] In an optional embodiment, the at least one inference component is a plurality of inference components, the at least one hardware inference unit is a plurality of hardware inference units, and the configuration information of the first inference model further includes connectivity information of the plurality of inference components, the connectivity information being used to indicate the connectivity of the plurality of inference components; the construction module 740 is configured to: configure the plurality of hardware inference units to be connected according to the connectivity indicated by the connectivity information of the plurality of inference components.
[0197] In an optional embodiment, the hardware inference units in the hardware inference resource pool are connected via gates; the construction module 740 is configured to: control the gate state of each gate used to connect the plurality of hardware inference units, so that the plurality of hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of inference components.
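One way to picture the gate control is as a table of gates, each joining an upstream unit to a downstream unit, where the construction step opens exactly the gates that the connectivity information calls for. The sketch below assumes such a table; the dictionary representation and the unit and component names are illustrative, not taken from this application.

```python
def configure_gates(gates, mapping, connections):
    """Return new gate states: open only gates on the required connections.

    gates:       {(upstream_unit, downstream_unit): bool} for every physical gate
    mapping:     {component: hardware unit}
    connections: [(upstream_component, downstream_component), ...]
    """
    wanted = {(mapping[a], mapping[b]) for a, b in connections}
    missing = wanted - set(gates)
    if missing:
        # The pool has no gate between these units, so the model cannot be built.
        raise ValueError(f"no gate for connections: {sorted(missing)}")
    return {edge: edge in wanted for edge in gates}


gates = {("u1", "u2"): False, ("u2", "u3"): False, ("u1", "u3"): True}
mapping = {"A": "u1", "B": "u2", "C": "u3"}
state = configure_gates(gates, mapping, [("A", "B"), ("B", "C")])
print(state)
```

Closing every gate that is not required, rather than only opening the required ones, keeps a previously built analyzer's connections from leaking into the new one.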
[0198] In an optional embodiment, the first inference model includes a first expert sub-model and a second expert sub-model. The first expert sub-model includes at least one first inference component, and the second expert sub-model includes at least one second inference component. The at least one inference component includes the at least one first inference component and the at least one second inference component. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes at least one first hardware inference unit, and the second sub-inference analyzer includes at least one second hardware inference unit. The at least one hardware inference unit includes the at least one first hardware inference unit and the at least one second hardware inference unit. The construction module 740 is configured to: construct the first sub-inference analyzer based on the at least one first hardware inference unit; and construct the second sub-inference analyzer based on the at least one second hardware inference unit.
[0199] In an optional embodiment, the at least one first inference component is a plurality of first inference components, the at least one second inference component is a plurality of second inference components, the at least one first hardware inference unit is a plurality of first hardware inference units, and the at least one second hardware inference unit is a plurality of second hardware inference units. The configuration information of the first inference model further includes connectivity information of the plurality of first inference components and connectivity information of the plurality of second inference components. The connectivity information of the plurality of first inference components is used to indicate the connectivity of the plurality of first inference components, and the connectivity information of the plurality of second inference components is used to indicate the connectivity of the plurality of second inference components. The construction module 740 is configured to: configure the plurality of first hardware inference units to be connected according to the connectivity information of the plurality of first inference components; and configure the plurality of second hardware inference units to be connected according to the connectivity information of the plurality of second inference components.
[0200] In an optional embodiment, the hardware inference units in the hardware inference resource pool are connected via gates; the construction module 740 is configured to: control the gate state of each gate used to connect the plurality of first hardware inference units, so that the plurality of first hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of first inference components; and control the gate state of each gate used to connect the plurality of second hardware inference units, so that the plurality of second hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of second inference components.
[0201] In an optional embodiment, the configuration information of the first inference model further includes serial-parallel relationship information between the first expert sub-model and the second expert sub-model. This serial-parallel relationship information is used to indicate the serial-parallel relationship between the first expert sub-model and the second expert sub-model. The construction module 740 is used to: construct the serial-parallel relationship between the first sub-inference analyzer and the second sub-inference analyzer according to the serial-parallel relationship indicated by the serial-parallel relationship information between the first expert sub-model and the second expert sub-model.
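The difference between the two relationships can be illustrated with plain functions standing in for the two sub-inference analyzers. In this sketch, a serial relationship feeds the output of the first into the second, while a parallel relationship applies both to the same input. How parallel outputs are merged is not described in this passage, so returning both results as a pair is an assumption.

```python
def combine(first, second, relation):
    """Compose two sub-analyzers according to the serial-parallel relationship."""
    if relation == "serial":
        return lambda x: second(first(x))       # output of first feeds second
    if relation == "parallel":
        return lambda x: (first(x), second(x))  # both see the same input
    raise ValueError(f"unknown relation: {relation}")


# Toy stand-ins for the first and second sub-inference analyzers.
def double(x):
    return x * 2


def increment(x):
    return x + 1


serial_result = combine(double, increment, "serial")(3)
parallel_result = combine(double, increment, "parallel")(3)
print(serial_result, parallel_result)
```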
[0202] In an optional embodiment, the hardware inference units in the hardware inference resource pool are connected via gates, and the serial-parallel relationship information indicates that the first expert sub-model and the second expert sub-model are connected in series. The construction module 740 is used to: connect the first sub-inference analyzer and the second sub-inference analyzer in series by controlling the gate state of the gate used to connect the first sub-inference analyzer and the second sub-inference analyzer.
[0203] In an optional embodiment, the configuration information of the first inference model further includes parameter configuration information of the at least one inference component; the construction module 740 is configured to: configure the inference parameters of the hardware inference unit corresponding to each inference component according to the parameter configuration information of each inference component in the at least one inference component.
[0204] In an optional embodiment, the processing module 730 is configured to: perform inference analysis on the first data stream using a first inference analyzer to obtain the inference analysis result of the first data stream; and perform processing operations related to the inference analysis result of the first data stream.
[0205] In an optional embodiment, the processing operation includes at least one of the following: editing the packets of the first data stream; modifying the traffic management policy of the first data stream; modifying the inference analysis policy of the first data stream; and announcing the inference analysis results of the first data stream to a remote device.
[0206] In an optional embodiment, the at least one hardware inference unit constitutes a first inference path, and the first inference analyzer is used to process the first data stream according to the first inference path.
[0207] In an optional embodiment, the processing module 730 is configured to: process the first data stream using a first inference analyzer based on the relevant status information of the first data stream; wherein the relevant status information of the first data stream includes at least one of the following: status information of the first data stream; status information of data streams related to the first data stream; and status information of resources related to the first data stream.
[0208] In an optional embodiment, for any data stream among the first data stream and the data streams related to the first data stream, the status information of that data stream includes at least one of the following: the message information of the data stream; the forwarding information of the data stream; the traffic statistics of the data stream; and the historical inference analysis results of the data stream obtained from a remote device. The status information of the resources related to the first data stream includes at least one of the following: statistics of cache resources used to cache the first data stream; statistics of queue resources used to cache the first data stream; statistics of bandwidth resources used to forward the first data stream; and statistics of processing resources used to process the messages of the first data stream.
[0209] In an optional embodiment, the first data stream and the data streams related to the first data stream satisfy at least one of the following: the first data stream and the data streams related to the first data stream compete for cache resources; the first data stream and the data streams related to the first data stream compete for queue resources; the first data stream and the data streams related to the first data stream compete for bandwidth resources; the first data stream and the data streams related to the first data stream compete for processing resources.
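A simple way to identify the related data streams under this definition is to record which resources each data stream uses and treat two streams as related when they share at least one. The sketch below assumes resources are identified by (kind, id) pairs; that representation and the example flows are hypothetical.

```python
def related_streams(flow, usage):
    """Return the flows that compete with `flow` for at least one resource.

    usage: {flow name: set of (resource kind, resource id) pairs}
    """
    mine = usage[flow]
    return sorted(
        other for other, resources in usage.items()
        if other != flow and mine & resources  # non-empty intersection
    )


usage = {
    "f1": {("queue", 3), ("port", 1)},
    "f2": {("queue", 3)},
    "f3": {("port", 2)},
    "f4": {("port", 1), ("cache", 0)},
}
print(related_streams("f1", usage))
```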
[0210] The implementation of the receiving module 710 is described in S301, the implementation of the determining module 720 is described in S302, and the implementation of the processing module 730 is described in S303. The implementation of the construction module 740 is described in the above description of constructing the first inference analyzer. In the data stream processing device provided in this embodiment, the receiving module 710, determining module 720, processing module 730, and construction module 740 can each be implemented in software, hardware, or a combination of software and hardware, and can be combined or divided in any way depending on the specific implementation. For example, referring to Figures 1, 4, and 5, the receiving module 710 can be combined with a message parsing unit; the determining module 720 can be combined with an inference mode determining unit, a configuration unit, a selection unit, etc.; the processing module 730 can be combined with a selection unit, a hardware inference resource pool, a lookup unit, a forwarding and editing unit, a traffic manager, etc.; and the construction module 740 can be combined with a configuration unit. As a more specific example, the receiving module 710 is integrated into the message parsing unit; the determining module 720 includes the inference mode determining unit, and some functions of the determining module 720 are integrated into the configuration unit and the selection unit; the processing module 730 includes the hardware inference resource pool, the lookup unit, the forwarding and editing unit, and the traffic manager, and some functions of the processing module 730 are integrated into the selection unit; and the construction module 740 is integrated into the configuration unit. The data stream processing device provided in this embodiment can be implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
[0211] In summary, the technical solution provided in this application involves a communication chip receiving a first message belonging to a first data stream. The chip then determines a first inference model based on the characteristic information of the first message. This first inference model instructs the use of a first inference analyzer to process the first data stream. The communication chip then uses the first inference analyzer to process the first data stream. On one hand, this application uses AI functionality to process the first data stream within the communication chip. Compared to using an external AI analyzer, this application eliminates the need for interaction between the communication chip and the AI analyzer, resulting in shorter processing time and higher efficiency when using AI functionality to process the first data stream. On the other hand, this application uses a first inference analyzer to process the first data stream. The first inference analyzer includes at least one hardware inference unit from a hardware inference resource pool. This hardware inference unit is a dedicated hardware inference unit, which has a faster processing speed and higher efficiency. Therefore, using the first inference analyzer to process the first data stream is both faster and more efficient.
[0212] Based on the same inventive concept, this application provides a communication chip, including the data stream processing device shown in Figure 27. The communication chip may be a forwarding chip, a network access card, a DPU chip, or the like. The forwarding chip can be an NP chip. The network access card is also called a network interface card (NIC).
[0213] The communication chip may include programmable logic circuits and/or program instructions, which, when running, are used to implement at least some of the steps in the data stream processing method provided in the method embodiment shown in Figure 13.
[0214] Based on the same inventive concept, embodiments of this application provide a communication device including the aforementioned communication chip. For example, the communication device is as shown in Figure 1. The communication device may be a network device, a wireless access device, a wireless communication device, a personal computer host, a portable computer, or the like. Network devices include switches and routers. Wireless access devices are, for example, WLAN devices. Wireless communication devices include mobile terminals such as mobile phones and tablet computers.
[0215] Please refer to Figure 28, which shows a schematic diagram of a communication device 800 provided in an embodiment of this application. The communication device 800 includes a processor 802, a memory 804, a communication chip 806, a communication interface 808, and a bus 810. The processor 802, memory 804, communication chip 806, and communication interface 808 are communicatively connected via the bus 810. The connection method between the processor 802, memory 804, communication chip 806, and communication interface 808 shown in Figure 28 is only an example; the processor 802, memory 804, communication chip 806, and communication interface 808 can be connected using connection methods other than the bus 810.
[0216] The memory 804 stores the computer program 8042, which may include instructions and data. The memory 804 can be various types of storage media, such as random access memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), flash memory, optical storage, and registers.
[0217] The processor 802 can be a general-purpose processor, which performs specific steps and / or operations by reading and executing a computer program (e.g., computer program 8042) stored in memory (e.g., memory 804). The general-purpose processor may use data stored in memory (e.g., memory 804) during the execution of these steps and / or operations. The general-purpose processor can be a central processing unit (CPU). Alternatively, the processor 802 can be a special-purpose processor, which is specifically designed to perform specific steps and / or operations. A special-purpose processor can be a digital signal processor (DSP), an ASIC, or an FPGA, etc. The processor 802 can be a combination of multiple processors, such as a multi-core processor.
[0218] The communication chip 806 is used for forwarding and managing data streams, as well as for processing the data streams it receives using AI functions. For example, the communication chip 806 includes the data stream processing device 700 shown in Figure 27 to process the data streams received by the communication chip 806 using AI functions.
[0219] The communication interface 808 may include input/output (I/O) interfaces, physical interfaces, and logical interfaces for interconnecting devices within the communication device 800, as well as interfaces for interconnecting the communication device 800 with other devices. Physical interfaces may be packet over SONET/SDH (POS) interfaces, gigabit Ethernet (GE) interfaces, asynchronous transfer mode (ATM) interfaces, etc., used for interconnecting the communication device 800 with other devices. Logical interfaces are internal interfaces of the communication device 800, used for interconnecting devices within the communication device 800. The communication interface 808 can be used for communication between the communication device 800 and other devices; for example, the communication interface 808 is used for sending and receiving messages between the communication device 800 and other devices.
[0220] Bus 810 can be of any type, used to interconnect processor 802, memory 804, communication chip 806, and communication interface 808. For example, bus 810 is a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus. Bus 810 can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used in Figure 28, but this does not mean that there is only one bus or one type of bus.
[0221] The aforementioned devices in the communication device 800 can be disposed on separate chips, or at least partially or entirely on the same chip. Whether the devices are disposed independently on different chips or integrated on one or more chips often depends on the needs of the product design. This application does not limit the specific implementation of the aforementioned devices.
[0222] The communication device 800 shown in Figure 28 is merely an example. In the implementation process, the communication device 800 may also include other components, which will not be listed one by one in this article.
[0223] Based on the same inventive concept, embodiments of this application provide a computer-readable storage medium storing a computer program that, when executed (e.g., by a communication chip, a data stream processing device, etc.), implements at least some steps of the data stream processing method provided in the method embodiment shown in Figure 13.
[0224] Based on the same inventive concept, this application provides a computer program product, which includes a program or code. When the program or code is executed (e.g., by a communication chip, a data stream processing device, etc.), it implements at least some of the steps in the data stream processing method provided in the method embodiment shown in Figure 13.
[0225] The above embodiments can be implemented entirely or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they can be implemented entirely or partially as a computer program product, which includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that a computer can access, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium, or a semiconductor medium (e.g., solid-state drive), etc.
[0226] It should be understood that the term "at least one" in this application refers to one or more, and "multiple" refers to two or more. In this application, unless otherwise stated, the symbol " / " generally means "or," for example, A / B can mean A or B. The term "and / or" in this application is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Furthermore, for clarity, this application uses terms such as "first," "second," and "third" to distinguish identical or similar items with substantially the same function and effect. Those skilled in the art will understand that the terms "first," "second," and "third" do not limit the quantity or order of execution.
[0227] The different types of embodiments provided in this application, such as the method embodiments and device embodiments, can be referenced to each other. The order of operations in the method embodiments can be adjusted appropriately, and operations can be added or removed as the situation requires. Any variations that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the protection scope of this application, and therefore will not be described in detail.
[0228] In the corresponding embodiments provided in this application, it should be understood that the disclosed devices, etc., can be implemented through other configurations. For example, the device embodiments described above are merely illustrative. For instance, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed between devices or modules may be through some interfaces, or indirect coupling or communication connection between devices or modules, which may be electrical or other forms. Modules described as separate components may or may not be physically separate, and components described as modules may or may not be physical modules; they may be located in one place or distributed across multiple network nodes. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.
[0229] The above description is merely an exemplary embodiment of this application, but the scope of protection of this application is not limited thereto. Any equivalent modifications or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A data stream processing method, applied to a communication chip, characterized in that the method includes: receiving a first message, the first message belonging to a first data stream; determining a first inference model based on feature information of the first message, wherein the first inference model is used to indicate the use of a first inference analyzer to process the first data stream, the first inference analyzer includes at least one hardware inference unit in a hardware inference resource pool, and the communication chip includes the hardware inference resource pool; and processing the first data stream using the first inference analyzer.
2. The method according to claim 1, characterized in that the method further includes: constructing the first inference analyzer based on the hardware inference resource pool according to configuration information of the first inference model.
3. The method according to claim 2, characterized in that the first inference model includes at least one inference component, and the configuration information of the first inference model includes a mapping relationship between the at least one inference component and the at least one hardware inference unit; and the constructing of the first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model includes: determining the at least one hardware inference unit according to the mapping relationship between the at least one inference component and the at least one hardware inference unit; and constructing the first inference analyzer based on the at least one hardware inference unit.
4. The method according to claim 3, characterized in that the at least one inference component is a plurality of inference components, the at least one hardware inference unit is a plurality of hardware inference units, and the configuration information of the first inference model further includes connectivity relationship information of the plurality of inference components, the connectivity relationship information being used to indicate the connectivity relationship of the plurality of inference components; and the constructing of the first inference analyzer based on the at least one hardware inference unit includes: configuring the plurality of hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of inference components.
5. The method according to claim 4, characterized in that the hardware inference units in the hardware inference resource pool are connected via gates; and the configuring of the plurality of hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of inference components includes: controlling the gate state of each gate used to connect the plurality of hardware inference units, so that the plurality of hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of inference components.
6. The method according to claim 3, characterized in that, The first inference model includes a first expert sub-model and a second expert sub-model. The first expert sub-model includes at least one first inference component, and the second expert sub-model includes at least one second inference component. The at least one inference component includes the at least one first inference component and the at least one second inference component. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes at least one first hardware inference unit, and the second sub-inference analyzer includes at least one second hardware inference unit. The at least one hardware inference unit includes the at least one first hardware inference unit and the at least one second hardware inference unit. The construction of the first inference analyzer based on the at least one hardware inference unit includes: The first sub-inference analyzer is constructed based on the at least one first hardware inference unit; The second sub-inference analyzer is constructed based on the at least one second hardware inference unit.
7. The method according to claim 6, characterized in that, The at least one first inference component is a plurality of first inference components, the at least one second inference component is a plurality of second inference components, the at least one first hardware inference unit is a plurality of first hardware inference units, the at least one second hardware inference unit is a plurality of second hardware inference units, and the configuration information of the first inference model further includes connectivity relationship information of the plurality of first inference components and connectivity relationship information of the plurality of second inference components. The connectivity relationship information of the plurality of first inference components is used to indicate the connectivity relationship of the plurality of first inference components, and the connectivity relationship information of the plurality of second inference components is used to indicate the connectivity relationship of the plurality of second inference components. The step of constructing the first sub-inference analyzer based on the at least one first hardware inference unit includes: configuring the plurality of first hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of first inference components; The step of constructing the second sub-inference analyzer based on the at least one second hardware inference unit includes: configuring the plurality of second hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of second inference components.
8. The method according to claim 7, characterized in that, The hardware inference units in the hardware inference resource pool are connected via selectors; The step of configuring the plurality of first hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of first inference components includes: controlling the selection state of each selector used to connect the plurality of first hardware inference units, so that the plurality of first hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of first inference components; The step of configuring the plurality of second hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of second inference components includes: controlling the selection state of each selector used to connect the plurality of second hardware inference units, so that the plurality of second hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of second inference components.
9. The method according to any one of claims 6 to 8, characterized in that, The configuration information of the first inference model further includes serial-parallel relationship information of the first expert sub-model and the second expert sub-model, and the serial-parallel relationship information is used to indicate the serial-parallel relationship between the first expert sub-model and the second expert sub-model; The step of constructing the first inference analyzer based on the at least one hardware inference unit further includes: constructing the serial-parallel relationship between the first sub-inference analyzer and the second sub-inference analyzer according to the serial-parallel relationship indicated by the serial-parallel relationship information of the first expert sub-model and the second expert sub-model.
10. The method according to claim 9, characterized in that, The hardware inference units in the hardware inference resource pool are connected via selectors, and the serial-parallel relationship information is used to indicate that the first expert sub-model and the second expert sub-model are connected in series; The step of constructing the serial-parallel relationship between the first sub-inference analyzer and the second sub-inference analyzer according to the serial-parallel relationship indicated by the serial-parallel relationship information of the first expert sub-model and the second expert sub-model includes: controlling the selection state of the selector used to connect the first sub-inference analyzer and the second sub-inference analyzer, so that the first sub-inference analyzer and the second sub-inference analyzer are connected in series.
11. The method according to any one of claims 3 to 10, characterized in that, The configuration information of the first inference model also includes parameter configuration information of the at least one inference component; The step of constructing the first inference analyzer based on the at least one hardware inference unit includes: configuring inference parameters for the hardware inference unit corresponding to each inference component according to the parameter configuration information of each inference component in the at least one inference component.
12. The method according to any one of claims 1 to 11, characterized in that, The step of processing the first data stream using the first inference analyzer includes: performing inference analysis on the first data stream using the first inference analyzer to obtain an inference analysis result of the first data stream; and performing a processing operation related to the inference analysis result of the first data stream.
13. The method according to claim 12, characterized in that, The processing operation includes at least one of the following: an operation of editing the messages in the first data stream; an operation of modifying the traffic management policy of the first data stream; an operation of modifying the inference analysis strategy of the first data stream; an operation of announcing the inference analysis result of the first data stream to a remote device.
14. The method according to any one of claims 1 to 13, characterized in that, The at least one hardware inference unit constitutes a first inference path, and the first inference analyzer is used to process the first data stream according to the first inference path.
15. The method according to any one of claims 1 to 14, characterized in that, The step of processing the first data stream using the first inference analyzer includes: processing the first data stream using the first inference analyzer based on the relevant status information of the first data stream; The relevant status information of the first data stream includes at least one of the following: status information of the first data stream; status information of a data stream associated with the first data stream; status information of resources associated with the first data stream.
16. The method according to claim 15, characterized in that, For any data stream among the first data stream and the data streams associated with the first data stream, the status information of that data stream includes at least one of the following: message information of the data stream; forwarding information of the data stream; traffic statistics information of the data stream; and historical inference analysis results of the data stream obtained from a remote device. The status information of the resources associated with the first data stream includes at least one of the following: statistical information of cache resources used to cache the first data stream; statistical information of queue resources used to cache the first data stream; statistical information of bandwidth resources used to forward the first data stream; and statistical information of processing resources used to process messages of the first data stream.
17. The method according to claim 15 or 16, characterized in that, The first data stream and the data streams associated with the first data stream satisfy at least one of the following: the first data stream competes for cache resources with the data streams associated with the first data stream; the first data stream competes for queue resources with the data streams associated with the first data stream; the first data stream competes for bandwidth resources with the data streams associated with the first data stream; the first data stream competes for processing resources with the data streams associated with the first data stream.
18. A data stream processing apparatus, characterized in that, The data stream processing apparatus is applied to a communication chip and includes: a receiving module, used to receive a first message, wherein the first message belongs to a first data stream; a determination module, used to determine a first inference model based on the feature information of the first message, wherein the first inference model is used to indicate the use of a first inference analyzer to process the first data stream, the first inference analyzer includes at least one hardware inference unit in a hardware inference resource pool, and the communication chip includes the hardware inference resource pool; and a processing module, used to process the first data stream using the first inference analyzer.
19. The data stream processing apparatus according to claim 18, characterized in that, The data stream processing apparatus further includes: a construction module, used to construct the first inference analyzer based on the hardware inference resource pool according to the configuration information of the first inference model.
20. The data stream processing apparatus according to claim 19, characterized in that, The first inference model includes at least one inference component, and the configuration information of the first inference model includes the mapping relationship between the at least one inference component and the at least one hardware inference unit; The construction module is configured to: determine the at least one hardware inference unit based on the mapping relationship between the at least one inference component and the at least one hardware inference unit; and construct the first inference analyzer based on the at least one hardware inference unit.
21. The data stream processing apparatus according to claim 20, characterized in that, The at least one inference component is a plurality of inference components, the at least one hardware inference unit is a plurality of hardware inference units, and the configuration information of the first inference model further includes connectivity relationship information of the plurality of inference components, wherein the connectivity relationship information is used to indicate the connectivity relationship of the plurality of inference components; The construction module is configured to: configure the plurality of hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of inference components.
22. The data stream processing apparatus according to claim 21, characterized in that, The hardware inference units in the hardware inference resource pool are connected via selectors; The construction module is configured to: control the selection state of each selector used to connect the plurality of hardware inference units, so that the plurality of hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of inference components.
23. The data stream processing apparatus according to claim 20, characterized in that, The first inference model includes a first expert sub-model and a second expert sub-model. The first expert sub-model includes at least one first inference component, and the second expert sub-model includes at least one second inference component. The at least one inference component includes the at least one first inference component and the at least one second inference component. The first inference analyzer includes a first sub-inference analyzer and a second sub-inference analyzer. The first sub-inference analyzer includes at least one first hardware inference unit, and the second sub-inference analyzer includes at least one second hardware inference unit. The at least one hardware inference unit includes the at least one first hardware inference unit and the at least one second hardware inference unit. The construction module is configured to: construct the first sub-inference analyzer based on the at least one first hardware inference unit; and construct the second sub-inference analyzer based on the at least one second hardware inference unit.
24. The data stream processing apparatus according to claim 23, characterized in that, The at least one first inference component is a plurality of first inference components, the at least one second inference component is a plurality of second inference components, the at least one first hardware inference unit is a plurality of first hardware inference units, the at least one second hardware inference unit is a plurality of second hardware inference units, and the configuration information of the first inference model further includes connectivity relationship information of the plurality of first inference components and connectivity relationship information of the plurality of second inference components. The connectivity relationship information of the plurality of first inference components is used to indicate the connectivity relationship of the plurality of first inference components, and the connectivity relationship information of the plurality of second inference components is used to indicate the connectivity relationship of the plurality of second inference components. The construction module is configured to: configure the plurality of first hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of first inference components; and configure the plurality of second hardware inference units to be connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of second inference components.
25. The data stream processing apparatus according to claim 24, characterized in that, The hardware inference units in the hardware inference resource pool are connected via selectors; The construction module is configured to: control the selection state of each selector used to connect the plurality of first hardware inference units, so that the plurality of first hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of first inference components; and control the selection state of each selector used to connect the plurality of second hardware inference units, so that the plurality of second hardware inference units are connected according to the connectivity relationship indicated by the connectivity relationship information of the plurality of second inference components.
26. The data stream processing apparatus according to any one of claims 23 to 25, characterized in that, The configuration information of the first inference model further includes serial-parallel relationship information of the first expert sub-model and the second expert sub-model, and the serial-parallel relationship information is used to indicate the serial-parallel relationship between the first expert sub-model and the second expert sub-model; The construction module is configured to: construct the serial-parallel relationship between the first sub-inference analyzer and the second sub-inference analyzer according to the serial-parallel relationship indicated by the serial-parallel relationship information of the first expert sub-model and the second expert sub-model.
27. The data stream processing apparatus according to claim 26, characterized in that, The hardware inference units in the hardware inference resource pool are connected via selectors, and the serial-parallel relationship information is used to indicate that the first expert sub-model and the second expert sub-model are connected in series; The construction module is configured to: control the selection state of the selector used to connect the first sub-inference analyzer and the second sub-inference analyzer, so that the first sub-inference analyzer and the second sub-inference analyzer are connected in series.
28. The data stream processing apparatus according to any one of claims 20 to 27, characterized in that, The configuration information of the first inference model also includes parameter configuration information of the at least one inference component; The construction module is configured to: configure inference parameters for the hardware inference unit corresponding to each inference component based on the parameter configuration information of each inference component in the at least one inference component.
29. The data stream processing apparatus according to any one of claims 18 to 28, characterized in that, The processing module is configured to: perform inference analysis on the first data stream using the first inference analyzer to obtain an inference analysis result of the first data stream; and perform a processing operation related to the inference analysis result of the first data stream.
30. The data stream processing apparatus according to claim 29, characterized in that, The processing operation includes at least one of the following: an operation of editing the messages in the first data stream; an operation of modifying the traffic management policy of the first data stream; an operation of modifying the inference analysis strategy of the first data stream; an operation of announcing the inference analysis result of the first data stream to a remote device.
31. The data stream processing apparatus according to any one of claims 18 to 30, characterized in that, The at least one hardware inference unit constitutes a first inference path, and the first inference analyzer is used to process the first data stream according to the first inference path.
32. The data stream processing apparatus according to any one of claims 18 to 31, characterized in that, The processing module is configured to: process the first data stream using the first inference analyzer based on the relevant status information of the first data stream; The relevant status information of the first data stream includes at least one of the following: The status information of the first data stream; Status information of the data stream associated with the first data stream; Status information of resources associated with the first data stream.
33. The data stream processing apparatus according to claim 32, characterized in that, For any data stream among the first data stream and the data streams associated with the first data stream, the status information of that data stream includes at least one of the following: message information of the data stream; forwarding information of the data stream; traffic statistics information of the data stream; and historical inference analysis results of the data stream obtained from a remote device. The status information of the resources associated with the first data stream includes at least one of the following: statistical information of cache resources used to cache the first data stream; statistical information of queue resources used to cache the first data stream; statistical information of bandwidth resources used to forward the first data stream; and statistical information of processing resources used to process messages of the first data stream.
34. The data stream processing apparatus according to claim 32 or 33, characterized in that, The first data stream and the data streams associated with the first data stream satisfy at least one of the following: the first data stream competes for cache resources with the data streams associated with the first data stream; the first data stream competes for queue resources with the data streams associated with the first data stream; the first data stream competes for bandwidth resources with the data streams associated with the first data stream; the first data stream competes for processing resources with the data streams associated with the first data stream.
35. A communication chip, characterized in that, The communication chip includes the data stream processing apparatus as described in any one of claims 18 to 34.
36. A communication device, characterized in that, The communication device includes the communication chip as described in claim 35.
37. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program that, when executed, implements at least some of the steps in the method as described in any one of claims 1 to 17.
38. A computer program product, characterized in that, The computer program product includes a program or code that, when executed, implements at least some of the steps in the method as described in any one of claims 1 to 17.
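By way of illustration only, the construction steps recited in claims 3 to 5 (and in the corresponding apparatus claims 20 to 22) might be modelled in software as follows. This is a minimal sketch, not the claimed hardware implementation. The names `InferenceModelConfig`, `ResourcePool`, `set_selector` and `build_analyzer`, the component names, and the unit numbers are all hypothetical. The sketch also assumes that one selector exists for each ordered pair of hardware inference units.

```python
from dataclasses import dataclass, field


@dataclass
class InferenceModelConfig:
    # Mapping relationship: inference component name -> hardware inference unit id.
    mapping: dict
    # Connectivity relationship information: (upstream, downstream) component pairs.
    connectivity: list


@dataclass
class ResourcePool:
    num_units: int
    # selectors[(src, dst)] is True when the selector passes unit src's output to unit dst.
    selectors: dict = field(default_factory=dict)

    def set_selector(self, src, dst, enabled):
        for unit in (src, dst):
            if not 0 <= unit < self.num_units:
                raise ValueError(f"unit {unit} is not in the pool")
        self.selectors[(src, dst)] = enabled


def build_analyzer(pool, config):
    """Configure selectors so the mapped units mirror the component connectivity."""
    units = set(config.mapping.values())
    # Start from a known state: disconnect every pair of units used by this model.
    for src in units:
        for dst in units:
            if src != dst:
                pool.set_selector(src, dst, False)
    # Enable one selector per connection between inference components.
    for upstream, downstream in config.connectivity:
        pool.set_selector(config.mapping[upstream], config.mapping[downstream], True)
    # Return the enabled links among this model's units.
    return sorted(
        link for link, enabled in pool.selectors.items()
        if enabled and link[0] in units and link[1] in units
    )


# Hypothetical model with three inference components chained in series.
config = InferenceModelConfig(
    mapping={"extract": 2, "score": 5, "classify": 7},
    connectivity=[("extract", "score"), ("score", "classify")],
)
pool = ResourcePool(num_units=8)
links = build_analyzer(pool, config)
print(links)  # [(2, 5), (5, 7)]
```

In this sketch the enabled links form a single chain through units 2, 5 and 7, which loosely corresponds to the first inference path of claim 14. The series connection of two sub-inference analyzers in claims 9 and 10 could be modelled in the same way, by enabling the selector between the last unit of the first sub-inference analyzer and the first unit of the second.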