A stream data processing method, device, equipment and storage medium
By pre-setting graphics computing instructions in the CPU, including the spatial address of streaming data packets and the kernel execution order, the problem of low efficiency in streaming data processing caused by the high number of CPU-GPU interactions is solved, and efficient streaming data processing is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING BAIDU NETCOM SCI & TECH CO LTD
- Filing Date
- 2022-06-22
- Publication Date
- 2026-06-26
AI Technical Summary
In existing technologies, repeated interactions between the CPU and GPU make streaming data processing inefficient, especially when processing large amounts of streaming data, where the number of interactions multiplies with the number of packets and reduces processing efficiency further.
By pre-setting graphics computing instructions in the CPU, including the spatial address of streaming data packets and the kernel execution order, inference of streaming data packets can be completed with only one interaction, reducing the number of interactions between the CPU and GPU.
It improves the efficiency of streaming data processing, reduces the number of interactions between the CPU and GPU, saves space allocation and computational consumption, and enhances overall processing performance.
Figure CN115129488B_ABST
Description
Technical Field
[0001] This disclosure relates to the field of computer technology, and in particular to the fields of voice technology, video technology, and other related technologies.
Background Technology
[0002] With the development of computer technology, the application of streaming data such as voice and video is becoming increasingly widespread, which poses a challenge to the processing efficiency of streaming data.
Summary of the Invention
[0003] This disclosure provides a method, apparatus, device, and storage medium for streaming data processing.
[0004] According to a first aspect of this disclosure, a streaming data processing method is provided, applied to a central processing unit (CPU), comprising:
[0005] Receive target streaming data packets;
[0006] From the preset graphics computing instructions, a first graphics computing instruction corresponding to the position of the target streaming data packet in the data stream is determined. The graphics computing instruction includes the space address required by the streaming data packet and the execution order of the kernels that process the streaming data packet.
[0007] The first graphics computing instruction is sent to the graphics processing unit (GPU) so that the GPU can infer the target streaming data packet based on the spatial address and kernel execution order included in the first graphics computing instruction, and obtain the first processing result of the target streaming data packet.
[0008] In some embodiments, before receiving the target streaming data packet, the method further includes:
[0009] Determine the space required to process streaming data packets at different locations in the data stream;
[0010] Allocate, according to the largest of the determined space sizes, the first space required to process the streaming data packets included in the data stream, wherein the streaming data packet at each position corresponds to a region of the first space whose size matches the space size determined for that position;
[0011] Based on the first execution order of the kernel processing streaming data packets at different locations, and the address of the first space, preset graphics calculation instructions corresponding to different locations are generated.
[0012] In some embodiments, the position of the streaming data packet in the data stream includes a fixed-length first packet and intermediate packets, and the preset graphics calculation instructions include graphics calculation instructions corresponding to the first packet and graphics calculation instructions corresponding to the intermediate packets; the method further includes:
[0013] Determine whether the target streaming data packet is the first packet or an intermediate packet in the data stream, and obtain the determination result;
[0014] In response to the judgment result indicating that the target streaming data packet is the first packet or an intermediate packet in the data stream, the step of determining the first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream from the preset graphics calculation instructions is executed.
[0015] In some embodiments, the position of the streaming data packet in the data stream further includes a tail packet of variable length; the method further includes:
[0016] In response to the determination result indicating that the target streaming data packet is the tail packet in the data stream, the second space required to process the target streaming data packet is obtained, and the second execution order of the kernel processing the target streaming data packet is read;
[0017] According to the second execution order, the kernel execution command is sent to the graphics processor so that the graphics processor, based on the kernel execution command, uses the second space to infer the target streaming data packet and obtain the second processing result of the target streaming data packet.
[0018] In some embodiments, the position of the streaming data packet in the data stream further includes a tail packet of variable length; the method further includes:
[0019] In response to the determination result indicating that the target streaming data packet is the tail packet in the data stream, the second space required to process the target streaming data packet is obtained, and the second execution order of the kernel processing the target streaming data packet is read;
[0020] Based on the second space and the second execution order, a second graphics calculation instruction is generated;
[0021] The second graphics calculation instruction is sent to the graphics processor so that the graphics processor can perform inference on the target streaming data packet based on the second graphics calculation instruction to obtain the third processing result of the target streaming data packet.
[0022] In some embodiments, the step of obtaining the second space required for processing the target streaming data packet includes:
[0023] From the space indicated by the space address included in the preset graphics calculation instructions, determine the second space required to process the target streaming data packet; or
[0024] A second space is allocated outside the space indicated by the space address included in the preset graphics calculation instructions to process the target streaming data packet.
[0025] According to a second aspect of this disclosure, an apparatus for streaming data processing is provided, comprising:
[0026] The receiving module is used to receive target streaming data packets;
[0027] The first determining module is used to determine, from the preset graphics computing instructions, a first graphics computing instruction corresponding to the position of the target streaming data packet in the data stream, wherein the graphics computing instruction includes the space address required by the streaming data packet and the execution order of the kernels processing the streaming data packet;
[0028] The first inference module is used to send the first graphics calculation instruction to the graphics processor, so that the graphics processor can infer the target streaming data packet based on the spatial address and kernel execution order included in the first graphics calculation instruction, and obtain the first processing result of the target streaming data packet.
[0029] In some embodiments, the apparatus further includes:
[0030] The second determining module is used to determine the space required to process streaming data packets at different locations in the data stream;
[0031] The first allocation module is used to allocate, according to the largest of the determined space sizes, the first space required to process the streaming data packets included in the data stream, wherein the streaming data packet at each position corresponds to a region of the first space whose size matches the space size determined for that position;
[0032] The first generation module is used to generate preset graphics calculation instructions corresponding to different locations based on the first execution order of the kernels processing streaming data packets at the different locations and the address of the first space.
[0033] In some embodiments, the position of the streaming data packet in the data stream includes a fixed-length first packet and intermediate packets, and the preset graphics calculation instructions include graphics calculation instructions corresponding to the first packet and graphics calculation instructions corresponding to the intermediate packets; the apparatus further includes:
[0034] The judgment module is used to determine whether the target streaming data packet is the first packet or an intermediate packet in the data stream, and obtain the judgment result.
[0035] The first determining module is specifically used to, in response to the judgment result indicating that the target streaming data packet is the first packet or an intermediate packet in the data stream, determine a first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream from a preset graphics calculation instruction.
[0036] In some embodiments, the position of the streaming data packet in the data stream further includes a tail packet of variable length; the apparatus further includes:
[0037] The second allocation module is used to, in response to the judgment result indicating that the target streaming data packet is the tail packet in the data stream, obtain the second space required to process the target streaming data packet, and read the second execution order of the kernel processing the target streaming data packet;
[0038] The second inference module is used to send the kernel execution command to the graphics processor according to the second execution order, so that the graphics processor, based on the kernel execution command, uses the second space to infer the target streaming data packet and obtain the second processing result of the target streaming data packet.
[0039] In some embodiments, the position of the streaming data packet in the data stream further includes a tail packet of variable length; the apparatus further includes:
[0040] The second allocation module is used to, in response to the judgment result indicating that the target streaming data packet is the tail packet in the data stream, obtain the second space required to process the target streaming data packet, and read the second execution order of the kernel processing the target streaming data packet;
[0041] The second generation module is used to generate a second graphics calculation instruction based on the second space and the second execution order;
[0042] The third inference module is used to send the second graphics calculation instruction to the graphics processor, so that the graphics processor can perform inference on the target streaming data packet based on the second graphics calculation instruction to obtain the third processing result of the target streaming data packet.
[0043] In some embodiments, the second allocation module is specifically used for:
[0044] From the space indicated by the space address included in the preset graphics calculation instructions, determine the second space required to process the target streaming data packet; or
[0045] A second space is allocated outside the space indicated by the space address included in the preset graphics calculation instructions to process the target streaming data packet.
[0046] According to a third aspect of this disclosure, an electronic device is provided, comprising:
[0047] At least one processor; and
[0048] A memory that is communicatively connected to the at least one processor;
[0049] The memory stores instructions that can be executed by the at least one processor to enable the at least one processor to perform the method described in any one of the first aspects.
[0050] According to a fourth aspect of this disclosure, a non-transitory computer-readable storage medium is provided storing computer instructions, wherein the computer instructions are used to cause the computer to perform the method described in any one of the first aspects.
[0051] According to a fifth aspect of this disclosure, a computer program product is provided, comprising a computer program that, when executed by a processor, implements the method described in any one of the first aspects.
[0052] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.
Attached Figure Description
[0053] The accompanying drawings are provided to better understand this solution and do not constitute a limitation of this disclosure. Wherein:
[0054] Figure 1 is a schematic diagram of the first packet, intermediate packet, and last packet provided in an embodiment of this disclosure;
[0055] Figure 2 is a schematic diagram of an inference model provided in an embodiment of this disclosure;
[0056] Figure 3 is a schematic diagram of space allocation provided in an embodiment of this disclosure;
[0057] Figure 4 is a first schematic diagram of the streaming data processing method provided in an embodiment of this disclosure;
[0058] Figure 5 is a schematic diagram of a method for constructing graphics computing instructions provided in an embodiment of this disclosure;
[0059] Figure 6 is a second schematic diagram of the streaming data processing method provided in an embodiment of this disclosure;
[0060] Figure 7 is a third schematic diagram of the streaming data processing method provided in an embodiment of this disclosure;
[0061] Figure 8 is a fourth schematic diagram of the streaming data processing method provided in an embodiment of this disclosure;
[0062] Figure 9 is a schematic diagram of a streaming data processing apparatus provided in an embodiment of this disclosure;
[0063] Figure 10 is a first block diagram of an electronic device used to implement the streaming data processing method of the embodiments of this disclosure;
[0064] Figure 11 is a second block diagram of an electronic device used to implement the streaming data processing method of the embodiments of this disclosure.
Detailed Implementation
[0065] The exemplary embodiments of this disclosure are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding, and should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this disclosure. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.
[0066] With the development of computer technology, the application of streaming data such as voice and video is becoming increasingly widespread, which poses a challenge to the processing efficiency of streaming data.
[0067] Streaming data processing includes streaming data recognition and synthesis. The streaming data packets in a data stream are categorized by position into a first packet, middle packets, and a last packet, as illustrated in Figure 1. The first packet is the initial portion of the data stream, and there is only one first packet. The last packet is the final portion of the data stream, and there is only one last packet. The middle packets are the data between the first and last packets, and there can be one or more middle packets. A data stream consists of multiple data frames; for example, the first packet may contain 50 data frames, and a middle packet may contain 10 data frames.
[0068] The CPU is preconfigured with an inference model for streaming data. This model specifies the computation mode for streaming data packets, including the execution order of the kernels that process the packets and the processing operations. The computation modes for streaming data packets at different positions can be the same or different. When the computation modes differ, the inference model also contains branch statements. As shown in Figure 2, computation mode 1 and computation mode 2 are executed under different branch conditions through branch statements to complete the inference task for the streaming data. Figure 2 illustrates only two computation modes as an example and is not intended to be limiting.
[0069] In related technologies, when a CPU processes streaming data, it allocates the space required to process the current data packet based on the packet's position and length. This space can include intermediate input/output (I/O) space, the maximum temporary space required for each kernel operation, the state space required for each kernel, and so on. Figure 3 illustrates the space allocation: the space allocated for processing the current data packet includes I/O space (IO1-IO3, etc.), the maximum temporary space required for kernel operations (kernel 1-3, etc.), and state space (state 1-3, etc.). Based on the preset inference model, the CPU reads the corresponding kernel execution order and sends kernel execution instructions to the GPU. Based on those instructions, the GPU uses the allocated space to perform the kernel operations and complete the inference on the streaming data.
[0070] Processing a streaming data packet requires multiple kernels to execute sequentially to complete the inference process. This necessitates multiple interactions between the CPU and GPU, i.e., sending kernel execution instructions to the GPU multiple times, resulting in low processing efficiency for streaming data. Furthermore, as the number of streaming data packets increases, the number of CPU-GPU interactions multiplies, further reducing processing efficiency.
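The interaction count described above can be illustrated with a minimal sketch. It is not the disclosed implementation: `CountingGpu`, `launch_kernel`, and `process_stream_per_kernel` are hypothetical names, and the "GPU" is a plain Python object that only counts how many times the CPU sends it a kernel execution instruction.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Sequence, Tuple


@dataclass
class CountingGpu:
    """Hypothetical GPU stand-in that only records CPU-to-GPU submissions."""
    submissions: int = 0
    log: List[Tuple[int, str]] = field(default_factory=list)

    def launch_kernel(self, packet_id: int, kernel: str) -> None:
        # Each kernel execution instruction is one CPU-GPU interaction.
        self.submissions += 1
        self.log.append((packet_id, kernel))


def process_stream_per_kernel(gpu: CountingGpu,
                              packet_ids: Iterable[int],
                              kernel_order: Sequence[str]) -> int:
    """Related-technology flow: the CPU sends every kernel separately."""
    for packet_id in packet_ids:
        for kernel in kernel_order:
            gpu.launch_kernel(packet_id, kernel)
    return gpu.submissions


baseline_gpu = CountingGpu()
baseline_total = process_stream_per_kernel(
    baseline_gpu, range(4), ["kernel 1", "kernel 2", "kernel 3"]
)
print(baseline_total)  # 4 packets x 3 kernels = 12 submissions
```

With P packets and K kernels per packet, this flow needs P x K submissions, which is the cost the disclosure aims to reduce.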
[0071] To improve the processing efficiency of streaming data, embodiments of this disclosure provide a streaming data processing method applied to a CPU, as shown in Figure 4. In this embodiment of the disclosure, the CPU can be a CPU on a device with graphics processing capabilities, such as a server or mobile terminal, and is not limited thereto. The streaming data processing method includes the following steps:
[0072] Step S41: Receive the target streaming data packet.
[0073] In this embodiment of the disclosure, the target streaming data packet is any data packet in the data stream of the streaming data, and the position of the data packet in the data stream can be the first packet, the middle packet, or the last packet.
[0074] When streaming data processing is required, the user can input a requested streaming data packet (i.e., the target streaming data packet) into the CPU, and the CPU will receive the requested streaming data packet.
[0075] Step S42: Determine the first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream from the preset graphics calculation instructions. The graphics calculation instruction includes the space address required by the streaming data packet and the execution order of the kernels that process the streaming data packet.
[0076] The CPU has pre-defined graphics calculation instructions, which can include graphics calculation instructions corresponding to different positions, such as graphics calculation instructions for the first packet, graphics calculation instructions for the middle packet, and graphics calculation instructions for the last packet.
[0077] After receiving the target streaming data packet, the CPU can determine the position of the target streaming data packet in the data stream based on the information carried in the header of the target streaming data packet, and then determine the graphics calculation instruction corresponding to the determined position from the preset graphics calculation instructions. The determined graphics calculation instruction is the first graphics calculation instruction.
[0078] For example, if the CPU determines that the received streaming data packet is the first packet in the data stream, it will determine the graphics calculation instruction corresponding to the first packet from the preset graphics calculation instructions; if the CPU determines that the received streaming data packet is the middle packet in the data stream, it will determine the graphics calculation instruction corresponding to the middle packet from the preset graphics calculation instructions.
[0079] Step S43: Send the first graphics calculation instruction to the graphics processor so that the graphics processor can infer the target streaming data packet based on the spatial address and kernel execution order included in the first graphics calculation instruction to obtain the first processing result of the target streaming data packet.
[0080] After determining the first graphics computation instruction, the CPU sends it to the GPU. The GPU performs kernel operations according to the kernel execution order included in the first graphics computation instruction, using the space indicated by the space address, to complete the inference on the target streaming data packet and obtain the first processing result of the target streaming data packet.
[0081] In the technical solution provided by this disclosure, the CPU pre-sets graphics computing instructions, which include the space address required by the streaming data packet and the execution order of the kernels processing the streaming data packet. Therefore, when processing a streaming data packet, the CPU only needs to obtain the corresponding graphics computing instructions and send them to the GPU. Subsequently, the GPU can complete the inference of the streaming data packet based on the graphics computing instructions. It can be seen that in this disclosure, the CPU only needs to interact with the GPU once to complete the inference of the streaming data packet, reducing the number of interactions between the CPU and the GPU and improving the processing efficiency of streaming data.
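Steps S41-S43 can be sketched as a table lookup followed by a single submission. This is an illustrative sketch only: `GraphicsInstruction`, `SimulatedGpu`, `PRESET_INSTRUCTIONS`, and the numeric addresses are assumptions introduced here, and the simulated GPU merely records the order in which it would run the kernels. The kernel orders reuse the examples given for the first and intermediate packets in this description.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class GraphicsInstruction:
    """Hypothetical record: a space address plus a kernel execution order."""
    space_address: Tuple[int, int]   # (start, end) of pre-allocated space
    kernel_order: Tuple[str, ...]


class SimulatedGpu:
    """Stand-in GPU that counts interactions and records the kernels it runs."""

    def __init__(self) -> None:
        self.interactions = 0

    def submit(self, instruction: GraphicsInstruction, packet: bytes) -> List[str]:
        self.interactions += 1  # one CPU-GPU interaction per instruction
        # All kernels run on the GPU side without further CPU involvement.
        return list(instruction.kernel_order)


# Preset before any packet arrives (addresses are made-up placeholders).
PRESET_INSTRUCTIONS: Dict[str, GraphicsInstruction] = {
    "first": GraphicsInstruction((0, 50), ("kernel 1", "kernel 2", "kernel 3")),
    "intermediate": GraphicsInstruction((0, 100), ("kernel 2", "kernel 1", "kernel 4")),
}


def process_packet(gpu: SimulatedGpu, position: str, packet: bytes) -> List[str]:
    instruction = PRESET_INSTRUCTIONS[position]   # step S42: look up by position
    return gpu.submit(instruction, packet)        # step S43: single send


sim_gpu = SimulatedGpu()
first_trace = process_packet(sim_gpu, "first", b"frames...")
print(sim_gpu.interactions, first_trace)
```

In this sketch each packet costs one interaction regardless of how many kernels its instruction lists.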
[0082] Before receiving and processing target streaming data packets, the CPU can pre-construct preset graphics computing instructions corresponding to different locations based on the length of the streaming data packets at different locations and the kernel execution order for processing the corresponding streaming data packets.
[0083] In some embodiments, the CPU can determine the space required to process streaming data packets at different locations based on the length of the streaming data packets at different locations; allocate the required space for the streaming data packets at different locations, read the kernel execution order of processing the streaming data packets at that location in the inference model; and generate the graphics computing instructions corresponding to that location based on the address of the allocated space and the read kernel execution order, as the preset graphics computing instructions corresponding to that location.
[0084] For example, based on the length of the first packet, the CPU determines the space required to process the first packet as k1, and allocates space 1 of size k1 to the first packet, with the address range of space 1 being d1-d2; based on the length of the intermediate packet, it determines the space required to process the intermediate packet as k2, and allocates space 2 of size k2 to the intermediate packet, with the address range of space 2 being d3-d4; based on the length of the last packet, it determines the space required to process the last packet as k3, and allocates space 3 of size k3 to the last packet, with the address range of space 3 being d5-d6.
[0085] The CPU reads the preset inference model and obtains the calculation mode 1 of the first packet. For example, the kernel execution order for processing the first packet is kernel 1 → kernel 2 → kernel 3. Based on the address range of space 1 allocated to the first packet, which is d1-d2, and the calculation mode 1, the CPU generates the graphics calculation instruction 1 corresponding to the first packet.
[0086] Similarly, the CPU reads the preset inference model and obtains the calculation mode 2 of the intermediate packet. For example, the kernel execution order for processing the intermediate packet is kernel 2 → kernel 1 → kernel 4. Based on the address range of space 2 allocated to the intermediate packet, which is d3-d4, and the calculation mode 2, the CPU generates the graphics calculation instruction 2 corresponding to the intermediate packet.
[0087] The CPU reads the preset inference model and obtains the calculation mode 3 of the tail packet. For example, the kernel execution order for processing the tail packet is kernel 1 → kernel 4 → kernel 3. Based on the address range of space 3 allocated to the tail packet, which is d5-d6, and the calculation mode 3, the CPU generates the graphics calculation instruction 3 corresponding to the tail packet.
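The per-position construction in this example can be sketched as follows. The function name `build_per_position_instructions`, the dictionary layout, and the concrete sizes standing in for k1, k2, and k3 are assumptions for illustration; a simple running offset plays the role of the allocator that hands out the ranges d1-d2, d3-d4, and d5-d6.

```python
from typing import Dict, Tuple

# Kernel execution orders taken from the example in the text.
KERNEL_ORDERS: Dict[str, Tuple[str, ...]] = {
    "first": ("kernel 1", "kernel 2", "kernel 3"),
    "intermediate": ("kernel 2", "kernel 1", "kernel 4"),
    "tail": ("kernel 1", "kernel 4", "kernel 3"),
}

# Placeholder values for k1, k2, k3 (arbitrary units, assumed here).
SPACE_SIZES: Dict[str, int] = {"first": 50, "intermediate": 100, "tail": 100}


def build_per_position_instructions(space_sizes, kernel_orders):
    """Allocate a separate space per position and pair it with its kernel order."""
    instructions = {}
    next_free = 0  # running offset acting as a trivial allocator
    for position, size in space_sizes.items():
        start, end = next_free, next_free + size
        instructions[position] = {
            "address_range": (start, end),
            "kernel_order": kernel_orders[position],
        }
        next_free = end
    return instructions, next_free  # next_free equals the total space allocated


per_position, total_allocated = build_per_position_instructions(SPACE_SIZES, KERNEL_ORDERS)
print(per_position["tail"]["address_range"], total_allocated)
```

Under these assumed sizes the three spaces occupy k1 + k2 + k3 units in total, since no space is shared between positions.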
[0088] In other embodiments, to save the space occupied by streaming data processing and improve the processing efficiency of streaming data, this disclosure also provides a method for constructing graphics computing instructions which, as shown in Figure 5, may include steps S51-S53.
[0089] Step S51: Determine the space required to process streaming data packets at different locations in the data stream.
[0090] Streaming data is processed by the front-end device before being input into the CPU. The front-end device processes the streaming data, dividing it into a first packet, a middle packet, and a last packet. The lengths of the first packet and the middle packet are fixed; for example, the length of the first packet is 50 data frames, and the length of the middle packet is 100 data frames. The length of the last packet is variable, but it is shorter than the length of the middle packet; for example, if the length of the middle packet is 100 data frames, then the length of the last packet is less than 100 data frames.
[0091] In this embodiment of the disclosure, the CPU can determine the length of the first packet and the length of the intermediate packet based on the processing flow of the front-end device, and then determine the space size that matches the length of the first packet, i.e. the space size required to process the first packet, and determine the space size that matches the length of the intermediate packet, i.e. the space size required to process the intermediate packet.
[0092] For the tail packet, the CPU can take the maximum tail packet length, i.e. the length of the intermediate packet, as the basis for determining the space required to process the tail packet. In other words, the space size that matches the length of the intermediate packet is used as the space size required to process the tail packet. Because the tail packet is shorter than the intermediate packet, sizing the tail packet's space to match the intermediate packet length avoids the problem of insufficient space allocation later.
[0093] In this embodiment of the disclosure, the CPU may also determine the space required to process streaming data packets at different locations in other ways, without limitation.
[0094] Step S52: Allocate, according to the largest of the determined space sizes, the first space required to process the streaming data packets included in the data stream, with the streaming data packet at each position corresponding to a region of the first space whose size matches the space size determined for that position.
[0095] In this embodiment of the disclosure, the space required for processing streaming data packets includes intermediate I / O space, the maximum temporary space required for each kernel operation, and the state space required for each kernel. Each type of space is processed in the same way; in this embodiment of the disclosure, all the spaces required for processing streaming data packets are collectively referred to as the space required for processing streaming data packets.
[0096] For example, the CPU determines the space required to process the first packet to be k1, the space required to process the intermediate packet to be k2, and the space required to process the last packet to be k3. Here, k2 is greater than both k1 and k3; therefore, k2 is the maximum determined space size. The CPU allocates space X1 for processing the streaming data packets. Space X1 has a size of k2, and its address range is dx1-dx2. The space dx1-dx3 within space X1 corresponds to the first packet, and space X1 itself corresponds to both the intermediate and last packets.
[0097] Step S53: Based on the first execution order of the kernels processing streaming data packets at different locations and the address of the first space, generate preset graphics calculation instructions corresponding to different locations.
[0098] In this embodiment of the present disclosure, the CPU can read the inference model to obtain the kernel execution order of streaming data packets at different locations, i.e. the first execution order of the kernel. Based on the first execution order of the kernel corresponding to different locations and the address of the first space, the CPU generates graphics computing instructions corresponding to different locations to obtain preset graphics computing instructions.
[0099] Subsequently, based on the obtained preset graphics calculation instructions, the CPU executes the above steps S41-S43 to complete the processing of streaming data.
[0100] In the technical solution provided by this disclosure, the CPU determines the maximum required space size based on the space required to process streaming data packets from different locations, allocates space according to this maximum space size, and generates graphics computing instructions. This reduces multiple space allocations to a single space allocation, allowing streaming data packets from different locations to reuse the allocated space, thus saving space occupied by streaming data processing. Furthermore, by allocating space according to the maximum required space size, the CPU avoids the problem of insufficient space allocation when processing streaming data packets from different locations.
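Steps S51-S53 can be sketched as a single allocation sized by the maximum requirement, with every position mapped to a region that starts at the same base address. The function name `build_shared_instructions`, the dictionary layout, and the concrete sizes are assumptions for illustration; the tail packet's size is set equal to the intermediate packet's size, following the sizing rule described for step S51.

```python
from typing import Dict, Tuple

# Kernel execution orders reused from the earlier example in the text.
SHARED_KERNEL_ORDERS: Dict[str, Tuple[str, ...]] = {
    "first": ("kernel 1", "kernel 2", "kernel 3"),
    "intermediate": ("kernel 2", "kernel 1", "kernel 4"),
    "tail": ("kernel 1", "kernel 4", "kernel 3"),
}

# Step S51: space sizes per position (placeholder units, assumed here).
REQUIRED_SIZES: Dict[str, int] = {"first": 50, "intermediate": 100, "tail": 100}


def build_shared_instructions(space_sizes, kernel_orders, base_address=0):
    """Allocate once at the maximum size and let every position reuse it."""
    # Step S52: one allocation, sized by the largest requirement.
    first_space_size = max(space_sizes.values())
    instructions = {}
    for position, size in space_sizes.items():
        # Each position uses a prefix of the shared first space.
        instructions[position] = {
            "address_range": (base_address, base_address + size),
            "kernel_order": kernel_orders[position],  # step S53
        }
    return instructions, first_space_size


shared, shared_size = build_shared_instructions(REQUIRED_SIZES, SHARED_KERNEL_ORDERS)
print(shared["first"]["address_range"], shared_size)
```

With these assumed sizes, one 100-unit allocation serves all three positions, compared with 250 units if each position received its own space.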
[0101] In addition, the CPU has pre-built graphics computing instructions. When processing streaming data packets, the CPU only needs to send the corresponding graphics computing instructions to the GPU once, which enables the GPU to complete the inference of streaming data packets, thus improving the processing efficiency of streaming data.
[0102] In this embodiment, the lengths of the first packet and the intermediate packets are fixed, while the length of the last packet is variable. To further improve the processing efficiency of streaming data, the CPU can use the method shown in Figure 5 to construct the preset graphics calculation instructions for the fixed-length first and intermediate packets. For the variable-length tail packet, the CPU can use other methods to complete the tail packet processing.
[0103] In this case, the preset graphics calculation instructions include graphics calculation instructions corresponding to the first packet and graphics calculation instructions corresponding to the intermediate packets. Based on this, embodiments of this disclosure provide a streaming data processing method which, as shown in Figure 6, may include steps S61-S64.
[0104] Step S61: Receive the target streaming data packet. See the relevant description in step S41 for details.
[0105] Step S62: Determine whether the target streaming data packet is the first packet or an intermediate packet in the data stream.
[0106] If the judgment result obtained in step S62 indicates that the target streaming data packet is the first packet or an intermediate packet, the CPU executes step S63.
[0107] Step S63: From the preset graphics calculation instructions, determine the first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream. The graphics calculation instruction includes the space address required by the streaming data packet and the execution order of the kernels processing the streaming data packet. See the relevant description in step S42 for details.
[0108] Step S64: The first graphics calculation instruction is sent to the graphics processor, so that the graphics processor can perform inference on the target streaming data packet based on the spatial address and kernel execution order included in the first graphics calculation instruction, and obtain the first processing result of the target streaming data packet. See the relevant description in step S43 for details.
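Steps S61-S64 can be sketched as a lookup followed by a single submission. The sketch below is an assumption-laden illustration, not the disclosed implementation: `GraphicsInstruction`, `PRESET`, `FakeGPU`, the kernel names, and the address are all hypothetical, and the GPU is simulated by a class that only counts submissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphicsInstruction:
    space_address: int   # address of the pre-allocated space
    kernel_order: tuple  # kernels to run, in order


# Hypothetical preset instructions for the fixed-length positions only.
PRESET = {
    "first": GraphicsInstruction(0x1000, ("k_embed", "k_encode")),
    "intermediate": GraphicsInstruction(0x1000, ("k_encode",)),
}


class FakeGPU:
    """Stand-in for the graphics processor; counts CPU-GPU interactions."""

    def __init__(self):
        self.interactions = 0

    def submit(self, instruction, packet):
        self.interactions += 1  # one interaction covers the whole kernel sequence
        return [f"{k}({len(packet)}B@{instruction.space_address:#x})"
                for k in instruction.kernel_order]


def process_packet(gpu, packet, position):
    if position not in PRESET:          # S62: not a first or intermediate packet
        return None                     # tail packets take a different path
    instruction = PRESET[position]      # S63: pick the preset instruction
    return gpu.submit(instruction, packet)  # S64: send it once


gpu = FakeGPU()                                   # S61: a packet arrives
result = process_packet(gpu, b"abcd", "first")
skipped = process_packet(gpu, b"xy", "tail")
```

In this sketch the first packet runs two kernels yet costs a single submission, while the tail packet is not dispatched through the preset instructions at all.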
[0109] When processing streaming data packets based on preset graphics computation instructions, the CPU needs to allocate the corresponding space based on the size of the space specified by the graphics computation instructions. However, the length of the tail packet is variable. Allocating space for the tail packet according to the maximum length will inevitably result in some tail packets being shorter than the space specified by the graphics computation instructions. In this case, the CPU needs to perform additional operations such as adding a mask to reach the space size specified by the graphics computation instructions, which will introduce additional computational overhead.
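The extra work mentioned above can be illustrated with a small padding routine. This is only a sketch of one common way to pad and mask a short input; the function name `pad_with_mask`, the preset length of 8, and the sample values are hypothetical, and the disclosure does not specify how the mask is built.

```python
def pad_with_mask(samples, preset_length, pad_value=0.0):
    """Pad a short tail packet to the preset length and mark which entries are real."""
    if len(samples) > preset_length:
        raise ValueError("tail packet is longer than the preset space")
    padding = preset_length - len(samples)
    padded = list(samples) + [pad_value] * padding
    mask = [1] * len(samples) + [0] * padding  # 1 = real data, 0 = padding
    return padded, mask


padded, mask = pad_with_mask([0.5, 0.25, 0.125], preset_length=8)
```

Here five of the eight slots carry no data, so both the padding step and any kernel work spent on those slots are overhead that a fixed-size instruction would impose on short tail packets.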
[0110] In the technical solution provided by this disclosure, the CPU pre-allocates space for the first packet and intermediate packets of fixed length, constructs graphics computing instructions, and processes the first packet and intermediate packets according to the pre-constructed graphics computing instructions. The CPU does not use the pre-constructed graphics computing instructions to process the tail packet, which reduces the introduction of additional computing overhead and improves the processing efficiency of streaming data.
[0111] In some embodiments, a streaming data processing method is provided which, as shown in Figure 7, may include steps S71-S76. Steps S71-S74 are the same as steps S61-S64 described above. If the determination result obtained in step S72 indicates that the target streaming data packet is a tail packet, then step S75 is executed.
[0112] Step S75: Obtain the second space required for processing the target streaming data packet, and read the second execution order of the kernel for processing the target streaming data packet.
[0113] When the CPU receives the tail packet, it can obtain the length of the tail packet, and then obtain the space required to process the tail packet, that is, the second space required to process the target streaming data packet.
[0114] In this embodiment of the disclosure, the second space can be space reallocated by the CPU. That is, step S75 above can be: allocating a second space required for processing the target streaming data packet from outside the space indicated by the space address included in the preset graphics computing instructions.
[0115] The second space can also be a part or all of the first space already allocated. That is, step S75 can be: determining the second space required for processing the target streaming data packet from the space indicated by the space address included in the preset graphics calculation instruction. This reduces the space occupied and the number of space allocations, further improving the processing efficiency of streaming data.
[0116] Step S76: In accordance with the second execution order, the kernel execution command is sent to the graphics processor so that the graphics processor can infer the target streaming data packet based on the kernel execution command and the address of the second space to obtain the second processing result of the target streaming data packet.
[0117] In this embodiment, the CPU can read the inference model to obtain the kernel execution order for processing tail packets, i.e., the second kernel execution order, and send kernel execution commands to the GPU according to the second kernel execution order. Based on the kernel execution commands, the GPU uses the second space to perform kernel operations, i.e., inference on the target streaming data packet, and obtain the second processing result of the target streaming data packet.
[0118] In the technical solution provided by this disclosure, the CPU processes the tail packet in the manner used in the related art: it sends kernel execution commands to the GPU one by one in the second execution order, interacting with the GPU multiple times to complete the processing of the tail packet. Because the second space is determined from the actual length of the tail packet, no padding or mask operation is needed, which avoids that additional computational overhead and improves the processing efficiency of streaming data.
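The tail-packet path of steps S75-S76 can be sketched as a loop that issues one command per kernel. The names `CountingGPU`, `launch_kernel`, and `process_tail_packet`, the kernel names, and the address are hypothetical; the GPU is simulated and merely records each command.

```python
class CountingGPU:
    """Stand-in for the graphics processor; records every kernel execution command."""

    def __init__(self):
        self.interactions = 0
        self.log = []

    def launch_kernel(self, name, space_address, length):
        self.interactions += 1
        self.log.append((name, space_address, length))


def process_tail_packet(gpu, packet, second_order, second_space_address):
    # S76: send kernel execution commands one by one, in the second execution order.
    for kernel in second_order:
        gpu.launch_kernel(kernel, second_space_address, len(packet))
    return gpu.interactions


gpu = CountingGPU()
count = process_tail_packet(
    gpu, b"abc", ["k_encode", "k_decode", "k_output"], second_space_address=0x2000
)
```

The number of interactions equals the number of kernels, which is the cost this embodiment accepts for the tail packet in exchange for working on its actual length.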
[0119] In some embodiments, a streaming data processing method is provided which, as shown in Figure 8, may include steps S81-S87. Steps S81-S84 are the same as steps S61-S64 described above. If the determination result obtained in step S82 indicates that the target streaming data packet is a tail packet, then step S85 is executed.
[0120] Step S85: Obtain the second space required for processing the target streaming data packet, and read the second execution order of the kernel for processing the target streaming data packet.
[0121] When the CPU receives the tail packet, it can determine its length and thus the space required to process it—the second space needed to process the target streaming data packet. This second space can be space reallocated by the CPU. Alternatively, it can be part or all of the previously allocated first space, thereby reducing the space occupied and the number of space allocations, further improving the efficiency of streaming data processing.
[0122] In addition, the CPU can read the inference model to obtain the kernel execution order for processing the tail packet, that is, the second execution order of the kernel.
[0123] Step S86: Based on the second space and the second execution order, generate the second graphics calculation instruction.
[0124] In this embodiment of the disclosure, the CPU generates the graphics calculation instruction corresponding to the target streaming data packet (i.e., the tail packet), i.e., the second graphics calculation instruction, based on the second space and the second execution order.
[0125] Step S87: A second graphics calculation instruction is sent to the graphics processor, so that the graphics processor performs inference on the target streaming data packet based on the second graphics calculation instruction to obtain a third processing result of the target streaming data packet. This step is similar to step S43; see the relevant description of step S43 for details.
[0126] In the technical solution provided in this embodiment, when the CPU processes the tail packet, it temporarily generates a second graphics calculation instruction corresponding to the tail packet. The space size indicated by the second graphics calculation instruction matches the length of the tail packet. Therefore, the CPU processes the tail packet based on the second graphics calculation instruction without introducing additional computational overhead, thus improving the processing efficiency of streaming data.
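Steps S85-S86 can be sketched as building an instruction whose space size follows the tail packet's length. Everything named here is an assumption for illustration: `GraphicsInstruction`, `build_tail_instruction`, `BYTES_PER_SAMPLE`, the addresses, and the rule of reusing the first space whenever the tail packet fits.

```python
from dataclasses import dataclass

BYTES_PER_SAMPLE = 4  # assumption: each sample occupies 4 bytes


@dataclass(frozen=True)
class GraphicsInstruction:
    space_address: int
    space_size: int
    kernel_order: tuple


def build_tail_instruction(tail_length, second_order, first_space, fallback_address):
    """Generate the second instruction, sized to the tail packet's actual length."""
    needed = tail_length * BYTES_PER_SAMPLE           # S85: second space size
    first_address, first_size = first_space
    # Reuse the already allocated first space when it is large enough;
    # otherwise use a separately allocated space supplied by the caller.
    address = first_address if needed <= first_size else fallback_address
    return GraphicsInstruction(address, needed, tuple(second_order))  # S86


reused = build_tail_instruction(300, ["k_encode", "k_output"], (0x1000, 4096), 0x9000)
separate = build_tail_instruction(2000, ["k_encode", "k_output"], (0x1000, 4096), 0x9000)
```

Sending the resulting instruction in step S87 would then cost one interaction, as with the preset instructions, while the space size it names matches the tail packet and needs no padding.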
[0127] In this embodiment of the disclosure, after the CPU obtains the processing results from the GPU (such as the first, second, and third processing results mentioned above), it can perform subsequent processing as needed, such as feeding back the processing results to the user, or determining the inference effect of the inference model based on the processing results, and updating the inference strategy based on the inference effect, etc. No limitations are imposed on this.
[0128] Corresponding to the above-described streaming data processing method, this disclosure also provides a streaming data processing apparatus which, as shown in Figure 9, includes:
[0129] The receiving module 91 is used to receive the target streaming data packet;
[0130] The first determining module 92 is used to determine, from the preset graphics calculation instructions, a first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream. The graphics calculation instruction includes the space address required by the streaming data packet and the execution order of the kernel that processes the streaming data packet.
[0131] The first inference module 93 is used to send the first graphics calculation instruction to the graphics processor, so that the graphics processor can infer the target streaming data packet based on the spatial address and kernel execution order included in the first graphics calculation instruction, and obtain the first processing result of the target streaming data packet.
[0132] In some embodiments, the above-described streaming data processing apparatus may further include:
[0133] The second determining module is used to determine the space required to process streaming data packets at different locations in the data stream;
[0134] The first allocation module is used to allocate the first space required for processing the streaming data packets included in the data stream according to the determined maximum space size, and the streaming data packets at different positions correspond to the corresponding space size in the first space.
[0135] The first generation module is used to generate preset graphics computing instructions corresponding to different locations based on the first execution order of the kernel that processes streaming data packets at different locations and the address of the first space.
[0136] In some embodiments, the position of the streaming data packet in the data stream includes a fixed-length first packet and intermediate packets, and the preset graphics calculation instructions include graphics calculation instructions corresponding to the first packet and graphics calculation instructions corresponding to the intermediate packets; the above-mentioned streaming data processing device may further include:
[0137] The judgment module is used to determine whether the target streaming data packet is the first packet or an intermediate packet in the data stream, and obtain the judgment result.
[0138] The first determining module is specifically used to determine, in response to the judgment result indicating that the position of the target streaming data packet in the data stream is the first packet or an intermediate packet, a first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream from a preset graphics calculation instruction.
[0139] In some embodiments, the position of the streaming data packet in the data stream further includes a tail packet of variable length; the above-described streaming data processing apparatus may further include:
[0140] The second allocation module is used to obtain the second space required to process the target streaming data packet in response to the judgment result indicating that the target streaming data packet is the tail packet in the data stream, and to read the second execution order of the kernel that processes the target streaming data packet;
[0141] The second inference module is used to send kernel execution commands to the graphics processor according to the second execution order, so that the graphics processor can use the second space to infer the target streaming data packet based on the kernel execution commands and obtain the second processing result of the target streaming data packet.
[0142] In some embodiments, the position of the streaming data packet in the data stream further includes a tail packet of variable length; the above-described streaming data processing apparatus may further include:
[0143] The second allocation module is used to obtain the second space required to process the target streaming data packet in response to the judgment result indicating that the target streaming data packet is the tail packet in the data stream, and to read the second execution order of the kernel that processes the target streaming data packet;
[0144] The second generation module is used to generate second graphics calculation instructions based on the second space and the second execution order;
[0145] The third inference module is used to send the second graphics calculation instruction to the graphics processor, so that the graphics processor can perform inference on the target streaming data packet based on the second graphics calculation instruction to obtain the third processing result of the target streaming data packet.
[0146] In the technical solution provided by this disclosure, the CPU pre-sets graphics computing instructions, which include the space address required by the streaming data packet and the execution order of the kernels processing the streaming data packet. Therefore, when processing a streaming data packet, the CPU only needs to obtain the corresponding graphics computing instructions and send them to the GPU. Subsequently, the GPU can complete the inference of the streaming data packet based on the graphics computing instructions. It can be seen that in this disclosure, the CPU only needs to interact with the GPU once to complete the inference of the streaming data packet, reducing the number of interactions between the CPU and the GPU and improving the processing efficiency of streaming data.
[0147] The collection, storage, use, processing, transmission, provision, and disclosure of user personal information involved in the technical solutions of this disclosure comply with the provisions of relevant laws and regulations and do not violate public order and good morals.
[0148] According to embodiments of this disclosure, this disclosure also provides an electronic device, a readable storage medium, and a computer program product.
[0149] Figure 10 shows a schematic block diagram of an electronic device 100 that can be used to implement the streaming data processing method of embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the present disclosure described and / or claimed herein.
[0150] As shown in Figure 10, device 100 includes a computing unit 101, which can perform various appropriate actions and processes according to a computer program stored in read-only memory (ROM) 102 or a computer program loaded from storage unit 108 into random access memory (RAM) 103. The RAM 103 may also store various programs and data required for the operation of device 100. The computing unit 101, ROM 102, and RAM 103 are interconnected via bus 104. Input / output (I / O) interface 105 is also connected to bus 104.
[0151] Multiple components in device 100 are connected to I / O interface 105, including: input unit 106, such as keyboard, mouse, etc.; output unit 107, such as various types of monitors, speakers, etc.; storage unit 108, such as disk, optical disk, etc.; and communication unit 109, such as network card, modem, wireless transceiver, etc. Communication unit 109 allows device 100 to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.
[0152] The computing unit 101 can be a variety of general-purpose and / or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 101 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 101 performs the various methods and processes described above, such as streaming data processing methods. For example, in some embodiments, the streaming data processing method may be implemented as a computer software program tangibly contained in a machine-readable medium, such as storage unit 108. In some embodiments, part or all of the computer program may be loaded and / or installed on device 100 via ROM 102 and / or communication unit 109. When the computer program is loaded into RAM 103 and executed by the computing unit 101, one or more steps of the streaming data processing method described above may be performed. Alternatively, in other embodiments, the computing unit 101 may be configured to perform streaming data processing methods by any other suitable means (e.g., by means of firmware).
[0153] Various embodiments of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and / or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and / or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.
[0154] This disclosure also provides an electronic device which, as shown in Figure 11, includes:
[0155] At least one processor 111; and
[0156] The memory 112 is communicatively connected to the at least one processor 111; wherein,
[0157] The memory 112 stores instructions that can be executed by the at least one processor 111, which, when executed by the at least one processor 111, enable the at least one processor 111 to perform any of the streaming data processing methods described above.
[0158] This disclosure also provides a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to perform the streaming data processing method according to any of the above descriptions.
[0159] This disclosure also provides a computer program product, including a computer program that, when executed by a processor, implements the streaming data processing method according to any of the above descriptions.
[0160] The program code used to implement the methods of this disclosure may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that when executed by the processor or controller, the program code causes the functions / operations specified in the flowcharts and / or block diagrams to be implemented. The program code may be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.
[0161] In the context of this disclosure, a machine-readable medium can be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
[0162] To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device for displaying information to the user (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the computer. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).
[0163] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as a data server), or computing systems that include middleware components (e.g., an application server), or computing systems that include frontend components (e.g., a user computer with a graphical user interface or web browser through which a user can interact with embodiments of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.
[0164] Computer systems can include clients and servers. Clients and servers are generally located far apart and typically interact via communication networks. Client-server relationships are created by computer programs running on the respective computers and having a client-server relationship with each other. Servers can be cloud servers, servers in distributed systems, or servers incorporating blockchain technology.
[0165] It should be understood that steps may be reordered, added, or deleted using the various forms of processes shown above. For example, the steps described in this disclosure can be executed in parallel, sequentially, or in different orders, as long as the desired result of the technical solution disclosed in this disclosure can be achieved, and this is not limited herein.
[0166] The specific embodiments described above do not constitute a limitation on the scope of protection of this disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this disclosure should be included within the scope of protection of this disclosure.
Claims
1. A streaming data processing method applied to a central processing unit (CPU), wherein the CPU has pre-set graphics computation instructions corresponding to streaming data packets at different locations, and the execution order of the kernels for the streaming data packets at different locations may be the same or different; the method comprising: receiving a target streaming data packet; determining, from the preset graphics computing instructions, a first graphics computing instruction corresponding to the position of the target streaming data packet in the data stream, wherein the graphics computing instruction includes the space address required by the streaming data packet and the execution order of the kernel processing the streaming data packet, and the space corresponding to the space address required by the streaming data packet is the space required to process the streaming data packet; and sending the first graphics calculation instruction to the graphics processor, so that the graphics processor performs inference on the target streaming data packet based on the space address and kernel execution order included in the first graphics calculation instruction, and obtains the first processing result of the target streaming data packet.
2. The method according to claim 1, further comprising, before receiving the target streaming data packet: Determine the space required to process streaming data packets at different locations in the data stream; According to the determined maximum space size, allocate the first space required to process the streaming data packets included in the data stream, and the streaming data packets at different positions correspond to the space size of the corresponding space in the first space; Based on the first execution order of the kernel processing streaming data packets at different locations, and the address of the first space, preset graphics calculation instructions corresponding to different locations are generated.
3. The method according to claim 1, wherein the position of the streaming data packet in the data stream includes a fixed-length first packet and intermediate packets; the preset graphics calculation instructions include graphics calculation instructions corresponding to the first packet and graphics calculation instructions corresponding to the intermediate packets; the method further includes: determining whether the target streaming data packet is the first packet or an intermediate packet in the data stream, and obtaining the judgment result; and in response to the judgment result indicating that the target streaming data packet is the first packet or an intermediate packet in the data stream, executing the step of determining the first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream from the preset graphics calculation instructions.
4. The method according to claim 3, wherein, The position of the streaming data packet in the data stream also includes a tail packet of variable length; the method further includes: In response to the determination result indicating that the target streaming data packet is the tail packet in the data stream, the second space required to process the target streaming data packet is obtained, and the second execution order of the kernel processing the target streaming data packet is read; According to the second execution order, the kernel execution command is sent to the graphics processor so that the graphics processor, based on the kernel execution command, uses the second space to infer the target streaming data packet and obtain the second processing result of the target streaming data packet.
5. The method according to claim 3, wherein, The position of the streaming data packet in the data stream also includes a tail packet of variable length; the method further includes: In response to the determination result indicating that the target streaming data packet is the tail packet in the data stream, the second space required to process the target streaming data packet is obtained, and the second execution order of the kernel processing the target streaming data packet is read; Based on the second space and the second execution order, a second graphics calculation instruction is generated; The second graphics calculation instruction is sent to the graphics processor so that the graphics processor can perform inference on the target streaming data packet based on the second graphics calculation instruction to obtain the third processing result of the target streaming data packet.
6. The method according to claim 4 or 5, wherein, The step of acquiring the second space required for processing the target streaming data packet includes: From the space indicated by the space address included in the preset graphics calculation instructions, determine the second space required to process the target streaming data packet; or A second space is allocated outside the space indicated by the space address included in the preset graphics calculation instructions to process the target streaming data packet.
7. A streaming data processing apparatus, applied to a central processing unit (CPU), wherein the CPU has pre-set graphics computation instructions corresponding to streaming data packets at different locations, and the execution order of the kernels for the streaming data packets at different locations may be the same or different; comprising: The receiving module is used to receive target streaming data packets; The first determining module is used to determine, from the preset graphics calculation instructions, a first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream. The graphics calculation instruction includes the space address required by the streaming data packet and the execution order of the kernel processing the streaming data packet. The space corresponding to the space address required by the streaming data packet is the space required to process the streaming data packet. The first inference module is used to send the first graphics calculation instruction to the graphics processor, so that the graphics processor can infer the target streaming data packet based on the spatial address and kernel execution order included in the first graphics calculation instruction, and obtain the first processing result of the target streaming data packet.
8. The apparatus according to claim 7, further comprising: The second determining module is used to determine the space required to process streaming data packets at different locations in the data stream; The first allocation module is used to allocate the first space required for processing the streaming data packets included in the data stream according to the determined maximum space size, wherein the streaming data packets at different positions correspond to the corresponding space size in the first space; The first generation module is used to generate preset graphics calculation instructions corresponding to different locations based on the first execution order of the kernels processing streaming data packets at the different locations and the address of the first space.
9. The apparatus according to claim 7, wherein the position of the streaming data packet in the data stream includes a fixed-length first packet and intermediate packets; the preset graphics calculation instructions include graphics calculation instructions corresponding to the first packet and graphics calculation instructions corresponding to the intermediate packets; the apparatus further includes: the judgment module, used to determine whether the target streaming data packet is the first packet or an intermediate packet in the data stream, and obtain the judgment result; wherein the first determining module is specifically used to, in response to the judgment result indicating that the target streaming data packet is the first packet or an intermediate packet in the data stream, determine a first graphics calculation instruction corresponding to the position of the target streaming data packet in the data stream from the preset graphics calculation instructions.
10. The apparatus according to claim 9, wherein the positions of streaming data packets in the data stream further include a tail packet of variable length, and the apparatus further comprises: a second allocation module, configured to, in response to the judgment result indicating that the target streaming data packet is the tail packet in the data stream, obtain a second space required to process the target streaming data packet, and read a second execution order of the kernels processing the target streaming data packet; and a second inference module, configured to send kernel execution commands to the graphics processor according to the second execution order, so that the graphics processor, based on the kernel execution commands, uses the second space to perform inference on the target streaming data packet and obtains a second processing result of the target streaming data packet.
11. The apparatus according to claim 9, wherein the positions of streaming data packets in the data stream further include a tail packet of variable length, and the apparatus further comprises: a second allocation module, configured to, in response to the judgment result indicating that the target streaming data packet is the tail packet in the data stream, obtain a second space required to process the target streaming data packet, and read a second execution order of the kernels processing the target streaming data packet; a second generation module, configured to generate a second graphics calculation instruction based on the second space and the second execution order; and a third inference module, configured to send the second graphics calculation instruction to the graphics processor, so that the graphics processor performs inference on the target streaming data packet based on the second graphics calculation instruction and obtains a third processing result of the target streaming data packet.
12. The apparatus according to claim 10 or 11, wherein the second allocation module is specifically configured to: determine, from the space indicated by the space address included in the preset graphics calculation instructions, the second space required to process the target streaming data packet; or allocate, outside the space indicated by the space address included in the preset graphics calculation instructions, the second space required to process the target streaming data packet.
13. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1-6.
15. A computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1-6.
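As an informal illustration only, and not part of the claims, the CPU-side flow described in claims 7 to 10 might be sketched as the small Python simulation below. Every name in it (`Position`, `GraphInstruction`, `MockGPU`, `build_presets`, `process_packet`), the kernel names, and the byte sizes are hypothetical and do not come from the disclosure. The mock graphics processor performs no computation; it only counts CPU-to-GPU interactions, so the sketch shows the dispatch logic and the interaction count rather than any real GPU API.

```python
from dataclasses import dataclass
from enum import Enum


class Position(Enum):
    """Position of a streaming data packet in the data stream (claims 9 and 10)."""
    FIRST = "first"
    INTERMEDIATE = "intermediate"
    TAIL = "tail"


@dataclass(frozen=True)
class GraphInstruction:
    """Hypothetical stand-in for a preset graphics calculation instruction."""
    space_address: int   # address of the workspace inside the first space
    space_size: int      # part of the first space this position uses
    kernel_order: tuple  # kernel names, in execution order


class MockGPU:
    """Counts CPU-to-GPU interactions; it does not run real kernels."""

    def __init__(self):
        self.interactions = 0

    def run_instruction(self, instr, packet):
        # One submission carries the whole kernel sequence.
        self.interactions += 1
        return f"{'>'.join(instr.kernel_order)}({len(packet)}B)"

    def run_kernel(self, name, packet):
        # One submission per kernel execution command.
        self.interactions += 1
        return name


def build_presets(space_needed, kernel_orders, base_address=0):
    """Claim 8 (assumed reading): size the first space by the largest
    requirement, then generate one preset instruction per position."""
    first_space_size = max(space_needed.values())
    presets = {
        pos: GraphInstruction(base_address, space_needed[pos],
                              tuple(kernel_orders[pos]))
        for pos in space_needed
    }
    return first_space_size, presets


def process_packet(gpu, presets, position, packet, tail_kernel_order=()):
    """Claims 7, 9 and 10: use a preset instruction when one exists for the
    position; otherwise send the tail packet's kernels one command at a time."""
    if position in presets:
        return gpu.run_instruction(presets[position], packet)
    return [gpu.run_kernel(name, packet) for name in tail_kernel_order]


# Usage with made-up sizes and kernel names.
space = {Position.FIRST: 4096, Position.INTERMEDIATE: 2048}
orders = {Position.FIRST: ["conv", "lstm"], Position.INTERMEDIATE: ["lstm"]}
first_space_size, presets = build_presets(space, orders)
gpu = MockGPU()
result = process_packet(gpu, presets, Position.FIRST, b"x" * 10)
```

In this sketch a first or intermediate packet costs one interaction regardless of how many kernels its instruction lists, while a tail packet handled as in claim 10 costs one interaction per kernel. The alternative in claim 11 would instead build a second instruction for the tail packet and submit it once.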