Kernel function deployment method and device, electronic equipment and storage medium
By setting up an object kernel function cache group in memory and searching for or generating the target kernel function in memory or a kernel function library, the problem of kernel function generation delay is solved, and the computing performance of computing devices is improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI BIREN TECH CO LTD
- Filing Date
- 2022-10-28
- Publication Date
- 2026-06-23
AI Technical Summary
Existing technologies suffer from significant delays in generating kernel functions, leading to a decrease in the computing performance of computing devices, especially when parameter information changes dynamically, making it impossible to efficiently find and generate kernel functions.
By setting up an object kernel function cache group in memory, the system queries and provides target kernel functions to perform computational tasks, and searches for or generates target kernel functions in memory or a kernel function library, thereby reducing latency and improving the efficiency of kernel function usage.
This significantly reduces the latency of kernel function lookup, improves the efficiency of kernel function usage, and thus enhances the computing performance of computing devices.
Figure CN115905635B_ABST
Abstract
Description
Technical Field
[0001] Embodiments of this disclosure relate to a kernel function deployment method and apparatus, electronic device, and storage medium.
Background Technology
[0002] The development of fields such as Artificial Intelligence (AI) and Neural Networks (NN) has placed higher demands on the computing performance of computing devices. Current computing devices widely utilize processors such as Central Processing Units (CPUs), Graphics Processing Units (GPUs), and General-purpose Computing on Graphics Processing Units (GPGPUs). By processing different data on different processors, and by performing data migration, transfer, and storage between different processors, the computing performance of computing devices has been improved.
Summary of the Invention
[0003] At least one embodiment of this disclosure provides a kernel function deployment method, which includes: obtaining parameter information of a computational task to be processed; querying one or more object kernel function cache groups to see if a target kernel function corresponding to the computational task and the parameter information is cached, based on the parameter information; and, in response to the one or more object kernel function cache groups caching the target kernel function, providing the target kernel function to a device to execute the computational task; wherein each object kernel function cache group in the one or more object kernel function cache groups includes at least one kernel function with the same computational task but different parameter information.
[0004] For example, at least one embodiment of the present disclosure provides a kernel function deployment method that further includes: in response to the one or more object kernel function cache groups not caching the target kernel function, querying the kernel function library to see if the target kernel function corresponding to the computing task and the parameter information is stored; and in response to the kernel function library storing the target kernel function, providing the target kernel function to the device to execute the computing task.
[0005] For example, at least one embodiment of the kernel function deployment method provided in this disclosure further includes: in response to the fact that the target kernel function is not stored in the kernel function library, generating the target kernel function corresponding to the computing task and the parameter information.
[0006] For example, at least one embodiment of the kernel function deployment method provided in this disclosure further includes: storing the generated target kernel function in the kernel function library.
[0007] For example, at least one embodiment of the kernel function deployment method provided in this disclosure further includes: selecting a target kernel function cache group corresponding to the computing task in the kernel function library, or creating the target kernel function cache group in the kernel function library in response to the absence of a target kernel function cache group corresponding to the computing task in the kernel function library; and storing the generated target kernel function in the target kernel function cache group.
[0008] For example, at least one embodiment of the kernel function deployment method provided in this disclosure further includes: loading the target kernel function cache group from the kernel function library into memory.
[0009] For example, in the kernel function deployment method provided in at least one embodiment of this disclosure, the kernel function library includes multiple kernel function cache groups, and each kernel function cache group includes at least one kernel function with the same computational task but different parameter information.
[0010] For example, in the kernel function deployment method provided in at least one embodiment of this disclosure, the one or more object kernel function cache groups are one or more kernel function cache groups read from the kernel function library into memory.
[0011] For example, in at least one embodiment of the kernel function deployment method provided in this disclosure, a kernel function index table is created corresponding to the kernel function library. The kernel function index table includes a plurality of first kernel function indexes, each of which corresponds to a plurality of kernel functions stored in the kernel function library. The step of querying the kernel function library to see if the target kernel function corresponding to the computation task and the parameter information is stored includes: calculating the target kernel function index corresponding to the target kernel function based on the computation task and the parameter information; searching the kernel function index table based on the target kernel function index; and, in response to the existence of a first kernel function index in the kernel function index table that is equivalent to the target kernel function index, obtaining the target kernel function in the kernel function library using the equivalent first kernel function index.
[0012] For example, in the kernel function deployment method provided in at least one embodiment of this disclosure, the kernel function library is stored on a hard disk.
[0013] For example, in a kernel function deployment method provided in at least one embodiment of this disclosure, the step of querying whether a target kernel function corresponding to the computation task and the parameter information is cached in one or more object kernel function cache groups includes: calculating the target kernel function index corresponding to the target kernel function based on the computation task and the parameter information; and querying whether the target kernel function is cached in the one or more object kernel function cache groups based on the target kernel function index; wherein, the one or more object kernel function cache groups are stored in memory.
[0014] For example, in at least one embodiment of the kernel function deployment method provided in this disclosure, a kernel function index table is created corresponding to the kernel function library. The kernel function index table includes a plurality of second kernel function indexes, each of which corresponds to a plurality of kernel functions cached in the one or more object kernel function cache groups. The step of querying whether the target kernel function is cached in the one or more object kernel function cache groups based on the target kernel function index includes: searching the kernel function index table based on the target kernel function index; and in response to the existence of a second kernel function index in the kernel function index table that is equivalent to the target kernel function index, obtaining the target kernel function in the one or more object kernel function cache groups using the equivalent second kernel function index.
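As an illustration of the index computation described in [0011] to [0014], the target kernel function index can be pictured as a deterministic key derived from the computation task and the parameter information. The following is a minimal sketch only: the helper name `kernel_index`, the use of SHA-256, the JSON serialization, and the kernel binary name are all assumptions made for illustration, since this disclosure does not specify how the index is calculated.

```python
import hashlib
import json


def kernel_index(task: str, params: dict) -> str:
    """Derive a deterministic index from a task name and its parameter information.

    Hypothetical helper: the hashing scheme is an assumption, not part of the disclosure.
    """
    # sort_keys makes the serialization independent of dict insertion order.
    payload = json.dumps({"task": task, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Index table: maps each stored kernel function index to the kernel it identifies.
index_table = {}
stored = kernel_index("conv2d", {"dtype": "float16", "input_shape": [1, 3, 224, 224]})
index_table[stored] = "conv2d_f16_1x3x224x224.bin"  # hypothetical kernel binary name

# The same task and parameters (in a different key order) give an equivalent index.
target = kernel_index("conv2d", {"input_shape": [1, 3, 224, 224], "dtype": "float16"})
hit = index_table.get(target)

# Different parameter information gives a different index, so the lookup misses.
other = kernel_index("conv2d", {"dtype": "float32", "input_shape": [1, 3, 224, 224]})
miss = index_table.get(other)
```

Because the key depends only on the task and its parameters, the same index table lookup can serve both the in-memory cache groups and the kernel function library.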
[0015] For example, at least one embodiment of the kernel function deployment method provided in this disclosure further includes, before querying in one or more object kernel function cache groups whether a target kernel function corresponding to the computing task and the parameter information is cached, querying in the one or more object kernel function cache groups whether a target kernel function cache group corresponding to the computing task is cached according to the computing task.
[0016] For example, at least one embodiment of the kernel function deployment method provided in this disclosure further includes, before querying whether a target kernel function corresponding to the computation task and the parameter information is cached in one or more object kernel function cache groups: in response to the fact that the target kernel function cache group is not cached in the one or more object kernel function cache groups, searching in the kernel function library whether the target kernel function cache group corresponding to the computation task is stored according to the computation task; in response to the fact that the target kernel function cache group is stored in the kernel function library, reading the target kernel function cache group into memory as one of the one or more object kernel function cache groups.
[0017] For example, in a kernel function deployment method provided in at least one embodiment of this disclosure, the step of querying whether a target kernel function corresponding to the computation task and the parameter information is cached in one or more object kernel function cache groups includes: querying whether the target kernel function corresponding to the computation task and the parameter information is cached in the target kernel function cache group.
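The two-stage lookup in [0015] to [0017] can be sketched as follows: first locate the cache group by computation task, reading it from the library into memory if needed, and only then search inside that group by parameter information. The function name `find_kernel`, the dictionary layout, and the string stand-ins for kernels are assumptions for illustration.

```python
def find_kernel(task, params, memory_groups, library_groups):
    """Return the kernel for (task, params), or None if it cannot be found.

    memory_groups / library_groups: dict mapping task -> {params: kernel}.
    This layout is a hypothetical simplification of the cache groups.
    """
    # Stage 1: is the cache group for this task already in memory?
    group = memory_groups.get(task)
    if group is None:
        # Not cached: look for the group in the kernel function library.
        group = library_groups.get(task)
        if group is None:
            return None
        # Read the group into memory as one of the object cache groups.
        memory_groups[task] = group
    # Stage 2: search only inside the target cache group.
    return group.get(params)


library_groups = {"matmul": {("float32", (2, 2)): "matmul_f32_2x2"}}
memory_groups = {}

found = find_kernel("matmul", ("float32", (2, 2)), memory_groups, library_groups)
missing_params = find_kernel("matmul", ("float16", (2, 2)), memory_groups, library_groups)
missing_task = find_kernel("conv2d", ("float32", (2, 2)), memory_groups, library_groups)
```

Narrowing the search to one group keeps the second stage small, since a group holds only kernels for a single computation task.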
[0018] For example, in a kernel function deployment method provided in at least one embodiment of this disclosure, the memory includes multiple object kernel function cache groups, each of the multiple object kernel function cache groups has a corresponding first priority value, the multiple object kernel function cache groups are sorted according to their corresponding multiple first priority values, and the step of providing the target kernel function to the device to execute the computing task in response to the one or more object kernel function cache groups caching the target kernel function includes: in response to the multiple object kernel function cache groups including a target kernel function cache group having the target kernel function, modifying the first priority value corresponding to the target kernel function cache group to increase the first priority of the target kernel function cache group.
[0019] For example, in a kernel function deployment method provided in at least one embodiment of this disclosure, the memory includes multiple object kernel function cache groups, each of the multiple object kernel function cache groups has a corresponding first priority value, and the multiple object kernel function cache groups are sorted according to their corresponding multiple first priority values. The step of providing the target kernel function to the device to execute the computation task in response to the kernel function library storing the target kernel function includes: in response to the kernel function library including a target kernel function cache group having the target kernel function, loading the target kernel function cache group into the memory as one of the multiple object kernel function cache groups, and setting the first priority value corresponding to the target kernel function cache group to increase the first priority of the target kernel function cache group.
[0020] For example, in a kernel function deployment method provided in at least one embodiment of this disclosure, the step of providing the target kernel function to the device to execute the computation task in response to the kernel function library storing the target kernel function further includes: in response to loading the target kernel function cache group into the memory as one of the plurality of object kernel function cache groups, swapping at least one of the plurality of object kernel function cache groups included in the memory out to the hard disk.
[0021] For example, in the kernel function deployment method provided in at least one embodiment of this disclosure, swapping at least one of the plurality of object kernel function cache groups included in the memory out to the hard disk includes: swapping the object kernel function cache group with the smallest first priority value among the plurality of object kernel function cache groups out to the hard disk.
[0022] For example, in a kernel function deployment method provided in at least one embodiment of this disclosure, the hard disk includes multiple kernel function cache groups, each of which has a corresponding second priority value, and the multiple kernel function cache groups are sorted according to their corresponding second priority values. Swapping at least one of the multiple object kernel function cache groups included in the memory out to the hard disk includes: in response to the object kernel function cache group with the smallest first priority value among the multiple object kernel function cache groups being swapped out to the hard disk, modifying the second priority value of the kernel function cache group on the hard disk that is equivalent to that object kernel function cache group, so as to increase the second priority of the equivalent kernel function cache group.
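The priority mechanism in [0018] to [0022] can be illustrated with a small two-level store. This is a sketch under several assumptions that are not stated in this disclosure: priorities are integer counters, a hit raises the first priority by one, a newly loaded group receives a priority above every group already in memory, the memory holds a fixed number of groups, and the hard disk keeps a copy of every group. The class name `GroupStore` is hypothetical.

```python
class GroupStore:
    """Hypothetical two-level store of kernel function cache groups."""

    def __init__(self, capacity, disk_groups):
        self.capacity = capacity  # maximum number of groups held in memory (assumed fixed)
        self.memory = {}          # task -> [first_priority, group]
        # task -> [second_priority, group]; the disk keeps a copy of every group
        self.disk = {task: [0, group] for task, group in disk_groups.items()}

    def _load(self, task):
        if len(self.memory) >= self.capacity:
            # Swap out the in-memory group with the smallest first priority value.
            victim = min(self.memory, key=lambda t: self.memory[t][0])
            del self.memory[victim]
            # Raise the second priority of the equivalent group on disk.
            self.disk[victim][0] += 1
        # Give the newly loaded group a first priority above all groups in memory.
        top = max((entry[0] for entry in self.memory.values()), default=0)
        self.memory[task] = [top + 1, self.disk[task][1]]

    def get(self, task, params):
        if task in self.memory:
            self.memory[task][0] += 1  # hit in memory: raise the first priority
        elif task in self.disk:
            self._load(task)           # miss in memory: load the group from disk
        else:
            return None
        return self.memory[task][1].get(params)


store = GroupStore(2, {
    "conv2d": {"p": "conv2d_kernel"},
    "matmul": {"p": "matmul_kernel"},
    "relu": {"p": "relu_kernel"},
})
for task in ["conv2d", "matmul", "conv2d", "conv2d", "relu"]:
    store.get(task, "p")
```

In this run the `matmul` group has the smallest first priority when `relu` is loaded, so it is the one swapped out, and its second priority on disk is raised.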
[0023] At least one embodiment of this disclosure also provides a kernel function deployment apparatus, which includes: an acquisition module configured to acquire parameter information of a computational task to be processed; a query module configured to query whether a target kernel function corresponding to the computational task and the parameter information is cached in one or more object kernel function cache groups; and a providing module configured to provide the target kernel function to a device to execute the computational task in response to the one or more object kernel function cache groups caching the target kernel function; wherein each object kernel function cache group in the one or more object kernel function cache groups includes at least one kernel function with the same computational task but different parameter information.
[0024] For example, in the kernel function deployment apparatus provided in at least one embodiment of this disclosure, the query module is further configured to, in response to the one or more object kernel function cache groups not caching the target kernel function, query in the kernel function library whether the target kernel function corresponding to the computing task and the parameter information is stored. The providing module is further configured to, in response to the kernel function library storing the target kernel function, provide the target kernel function to the device to execute the computing task.
[0025] At least one embodiment of this disclosure also provides an electronic device. The electronic device includes: a processor; and a memory including one or more computer program modules; wherein the one or more computer program modules are stored in the memory and configured to be executed by the processor, and the one or more computer program modules are used to implement the kernel function deployment method provided in any embodiment of this disclosure.
[0026] At least one embodiment of this disclosure also provides a storage medium storing non-transitory computer-readable instructions that, when executed by a computer, implement the kernel function deployment method provided in any embodiment of this disclosure.
Attached Figure Description
[0027] To more clearly illustrate the technical solutions of the embodiments of this disclosure, the accompanying drawings of the embodiments will be briefly described below. Obviously, the drawings described below only relate to some embodiments of this disclosure and are not intended to limit this disclosure.
[0028] Figure 1 is an exemplary flowchart of a kernel function deployment method provided in at least one embodiment of this disclosure;
[0029] Figure 2 is a schematic diagram of an example of a kernel function deployment method provided in at least one embodiment of this disclosure;
[0030] Figure 3 is another exemplary flowchart of a kernel function deployment method provided in at least one embodiment of this disclosure;
[0031] Figure 4 is a schematic diagram of another example of a kernel function deployment method provided in at least one embodiment of this disclosure;
[0032] Figure 5 is an exemplary flowchart of one example of step S150 in Figure 1;
[0033] Figure 6 is an exemplary flowchart of one example of step S120 in Figure 1;
[0034] Figure 7 is a schematic diagram of an example of querying a target kernel function provided in at least one embodiment of this disclosure;
[0035] Figure 8 is an exemplary flowchart of another example of step S150 in Figure 1;
[0036] Figure 9 is a schematic diagram of an example of a priority swapping mechanism for multiple object kernel function cache groups in memory provided in at least one embodiment of this disclosure;
[0037] Figure 10 is a schematic diagram of another example of a priority swapping mechanism for multiple object kernel function cache groups in memory provided in at least one embodiment of this disclosure;
[0038] Figure 11 is a schematic block diagram of a kernel function deployment apparatus provided in at least one embodiment of this disclosure;
[0039] Figure 12 is a schematic block diagram of an electronic device provided in at least one embodiment of this disclosure;
[0040] Figure 13 is a schematic block diagram of another electronic device provided in at least one embodiment of this disclosure; and
[0041] Figure 14 is a schematic diagram of a storage medium provided in at least one embodiment of this disclosure.
Detailed Implementation
[0042] To make the objectives, technical solutions, and advantages of the embodiments of this disclosure clearer, the technical solutions of the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this disclosure. All other embodiments obtained by those skilled in the art based on the described embodiments of this disclosure without creative effort are within the scope of protection of this disclosure.
[0043] Unless otherwise defined, the technical or scientific terms used in this disclosure shall have the ordinary meaning understood by one of ordinary skill in the art to which this disclosure pertains. The terms “first,” “second,” and similar terms used in this disclosure do not indicate any order, quantity, or importance, but are merely used to distinguish different components. Similarly, the terms “an,” “a,” or “the,” and similar terms do not indicate a quantity limitation, but rather indicate the presence of at least one. The terms “including,” “comprising,” or “containing,” and similar terms mean that the element or object preceding the word encompasses the elements or objects listed following the word and their equivalents, without excluding other elements or objects. The terms “connected,” “linked,” or similar terms are not limited to physical or mechanical connections, but can include electrical connections, whether direct or indirect. The terms “upper,” “lower,” “left,” and “right,” etc., are used only to indicate relative positional relationships, and these relative positional relationships may change accordingly when the absolute position of the described objects changes.
[0044] The present disclosure will now be described through several specific embodiments. To keep the following description of the embodiments of the present disclosure clear and concise, detailed descriptions of known functions and components may be omitted. When any component of the embodiments of the present disclosure appears in more than one drawing, the component is represented by the same or similar reference numerals in each drawing.
[0045] Artificial Neural Networks (ANNs), also simply called neural networks, are mathematical models that mimic the behavioral characteristics of animal neural networks to perform distributed parallel information processing. These networks rely on the complexity of the system, adjusting the connections between a large number of internal nodes to achieve information processing. Neural networks can include Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, Deep Belief Networks (DBNs), and Convolutional Neural Networks (CNNs). Regardless of the type, artificial neural networks share common characteristics such as large-scale parallel processing, distributed storage, flexible topology, high redundancy, and nonlinear operations, possessing capabilities in areas such as processing speed, associative ability, adaptability, fault tolerance, and self-organization. These characteristics and capabilities constitute the technological foundation for artificial neural networks to simulate intelligent activities and have found important applications in various technological fields. For example, artificial neural networks can be used for data compression, image processing, video coding, and signal processing. For instance, graphics processing units (GPUs), tensor processing units (TPUs), and data processing units (DPUs) are widely used in computations related to artificial intelligence. The following explanation uses GPUs as an example, but this disclosure is not limited to GPUs.
[0046] The program architecture of a GPU (the following description of GPU also applies to other types of processors such as GPGPU) can be divided into two parts: 1) a host program running on the CPU, and 2) a device program running on the GPU. The functions running on the GPU are also called kernel functions. When performing a computational task using the GPU, the CPU calls the kernel function and provides the kernel function and input data to the device. The GPU then executes the kernel function to process the input data; the CPU then controls the GPU to copy the processing result back to the host, thus completing the computational task.
[0047] For example, on a GPU, to generate a high-performance kernel function, it's necessary to obtain parameter information such as the size and type of the data to be processed before generating the kernel function. This allows for the generation of the best-performing kernel function based on that set of parameters. However, during GPU runtime, the parameter information of the input data may differ each time. Therefore, during GPU runtime, new kernel functions need to be continuously generated based on different parameter information, resulting in significant latency (overhead).
[0048] For example, in some cases, such as when processing convolutional neural networks, for a set of data to be processed with fixed parameters such as size and type, all kernel functions corresponding to that set of data can be generated in advance based on the parameters of each data point, and these pre-generated kernel functions can be saved on the hard drive for later retrieval. However, each time data to be processed is retrieved, the kernel function corresponding to that data's parameter information must be searched on the hard drive, and this process of repeatedly searching for kernel functions on the hard drive will also introduce significant latency.
[0049] For example, in other cases, parameters such as the size and type of the data to be processed may change dynamically (i.e., are not fixed), and the range of values for these dynamically changing parameters cannot be obtained in advance. Because the parameter information changes dynamically, it is impossible to generate a kernel function corresponding to the data to be processed in advance based on the parameter information; the definite parameter information can only be obtained when the GPU program runs, and then the corresponding kernel function is generated based on the parameter information. Therefore, it is necessary to continuously generate kernel functions during GPU program execution; this continuous generation of kernel functions also introduces significant latency and reduces the running performance of the GPU program.
[0050] At least one embodiment of this disclosure provides a kernel function deployment method, which includes: obtaining parameter information of a computation task to be processed; querying one or more object kernel function cache groups to see if a target kernel function corresponding to the computation task and parameter information is cached, based on the parameter information; and providing the target kernel function to the device to execute the computation task in response to the one or more object kernel function cache groups caching the target kernel function.
[0051] At least one embodiment of this disclosure also provides a kernel function deployment apparatus, an electronic device, and a storage medium.
[0052] By looking up kernel functions in the object kernel function cache groups, the method, apparatus, device, and storage medium provided in at least one embodiment of this disclosure greatly reduce the latency of kernel function lookup, improve the efficiency of kernel function usage, improve the running performance of GPU programs, and thus improve the computing performance of computing devices.
[0053] At least one embodiment of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that the same reference numerals will be used to refer to the same elements described in different drawings.
[0054] Figure 1 is an exemplary flowchart of a kernel function deployment method provided in at least one embodiment of this disclosure; Figure 2 is a schematic diagram of an example of a kernel function deployment method provided in at least one embodiment of this disclosure. For example, Figure 2 shows a specific example of the kernel function deployment method of Figure 1.
[0055] For example, as shown in Figure 1, at least one embodiment of this disclosure provides a kernel function deployment method, which includes the following steps S110 to S130.
[0056] Step S110: Obtain parameter information for the computational task to be processed;
[0057] Step S120: Based on the parameter information, query one or more object kernel function cache groups to see if there is a cached target kernel function corresponding to the computation task and parameter information;
[0058] Step S130: In response to one or more object kernel function cache groups caching the target kernel function, the target kernel function is provided to the device to perform the computation task.
[0059] For example, in step S110, the computation task corresponds to the data to be processed and the algorithm information it includes; for the computation task to be processed, it is also necessary to obtain parameter information such as the size and type of the data to be processed. For example, the parameter information may include the number of inputs, the input shapes, the number of outputs, the output shapes, and other parameters of the data to be processed. The embodiments of this disclosure are not limited in this respect.
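To make the notion of parameter information concrete, the fields listed in [0059] can be gathered into one immutable record that is usable as a lookup key. The class name `ParamInfo`, its field names, and the kernel name are hypothetical; this disclosure lists the kinds of parameters but does not define a data structure.

```python
from dataclasses import dataclass
from typing import Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class ParamInfo:
    """Hypothetical container for the parameter information of one computation task."""
    dtype: str
    input_shapes: Tuple[Shape, ...]
    output_shapes: Tuple[Shape, ...]

    @property
    def num_inputs(self) -> int:
        return len(self.input_shapes)

    @property
    def num_outputs(self) -> int:
        return len(self.output_shapes)


a = ParamInfo("float32", ((2, 3), (3, 4)), ((2, 4),))
b = ParamInfo("float32", ((2, 3), (3, 4)), ((2, 4),))
c = ParamInfo("float32", ((8, 3), (3, 4)), ((8, 4),))

# frozen=True with the default eq=True makes instances hashable,
# so equal parameter information maps to the same cache entry.
cache = {a: "matmul_2x3x4"}  # hypothetical kernel name
```

Here `b` carries the same parameter information as `a` and therefore finds the cached entry, while `c` differs in shape and does not.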
[0060] For example, in some examples, parameter information can be obtained at the same time as the computation task is acquired; in other examples, the parameter information may be dynamic (i.e., not fixed) when the computation task is acquired, in which case the parameter information can be set as a placeholder, and the definite parameter information is obtained once the program starts running.
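One possible reading of the placeholder in [0060] is a shape template whose dynamic dimensions are left open until run time. The sketch below assumes that `None` marks a placeholder dimension and that the helper is called `resolve_shape`; neither detail comes from this disclosure.

```python
def resolve_shape(template, actual):
    """Fill placeholder dimensions (None) in a shape template with run-time values.

    Hypothetical helper: raises ValueError if a fixed dimension does not match.
    """
    if len(template) != len(actual):
        raise ValueError("rank mismatch")
    for want, got in zip(template, actual):
        # A None entry is a placeholder and accepts any run-time value.
        if want is not None and want != got:
            raise ValueError(f"expected dimension {want}, got {got}")
    return tuple(actual)


# The batch dimension is unknown when the task is acquired.
template = (None, 3, 224, 224)
resolved = resolve_shape(template, (8, 3, 224, 224))

try:
    resolve_shape(template, (8, 1, 224, 224))
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Only the resolved shape can serve as definite parameter information for the cache lookup in step S120.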
[0061] For example, in step S120, the target kernel function is the kernel function corresponding to the computation task and parameter information in step S110. For example, as shown in Figure 2, to obtain the target kernel function, one can first query one or more object kernel function cache groups (KernelCache) to see if the target kernel function is cached. For example, each of the one or more object kernel function cache groups includes at least one kernel function with the same computational task but different parameter information.
[0062] For example, multiple kernel functions with the same algorithm information but different parameter information are cached in the same kernel function cache group (KernelCache). That is, kernel functions are grouped and stored by kernel function cache group. For example, the one or more object kernel function cache groups can be one or more kernel function cache groups read into memory from, for example, a hard disk or storage server (not shown in the figure). That is, for example, the kernel functions in the kernel function library stored on the hard disk or storage server that are currently most likely to be used, along with the kernel function cache groups they belong to, are copied into memory. For example, if feasible, the entire kernel function library stored on the hard disk or storage server can also be copied into memory. The following description uses a hard disk as an example.
[0063] For example, the number of object kernel function cache groups held in memory is determined by specific processor parameters. Processors with larger memory can store more object kernel function cache groups in memory, and can store the remaining kernel function cache groups in binary form on the hard disk.
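As a rough illustration of [0062] and [0063], the choice of which cache groups to copy into memory can be viewed as filling a memory budget with the groups most likely to be used. The sketch below uses a past use count as a stand-in for "most likely to be used" and a greedy selection; both are assumptions, as are the function name and the sizes shown.

```python
def choose_groups_to_preload(groups, budget_bytes):
    """Pick cache groups to copy into memory without exceeding the budget.

    groups: list of (task, size_bytes, use_count) tuples. use_count is a
    hypothetical proxy for how likely a group is to be used.
    """
    chosen, used = [], 0
    # Consider the most frequently used groups first.
    for task, size, _ in sorted(groups, key=lambda g: g[2], reverse=True):
        if used + size <= budget_bytes:
            chosen.append(task)
            used += size
    return chosen


library = [
    ("conv2d", 400, 90),
    ("matmul", 300, 120),
    ("softmax", 200, 10),
    ("relu", 100, 50),
]
small = choose_groups_to_preload(library, 500)    # a processor with less memory
large = choose_groups_to_preload(library, 1000)   # the whole library fits
```

With the larger budget every group fits, which corresponds to copying the entire kernel function library into memory.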
[0064] For example, in step S130, as shown in Figure 2, in response to one or more object kernel function cache groups caching the target kernel function (that is, the target kernel function cache group caching the target kernel function can be found in one or more object kernel function cache groups), the target kernel function can be provided to the device to execute the corresponding computation task.
[0065] For example, as shown in Figure 1, in some examples, the kernel function deployment method may further include steps S140 to S150:
[0066] Step S140: In response to one or more object kernel function cache groups not caching the target kernel function, query the kernel function library to see if the target kernel function corresponding to the computation task and parameter information is stored;
[0067] Step S150: In response to the presence of a target kernel function in the kernel function library, the target kernel function is provided to the device to perform the computation task.
[0068] For example, a kernel function library may include multiple kernel function cache groups, each of which contains at least one kernel function with the same computational task but different parameter information. For example, one or more object kernel function cache groups may be one or more kernel function cache groups read from the kernel function library into memory. For example, in some examples, the kernel function library can be queried in memory or on the hard disk to see whether the target kernel function is stored.
[0069] For example, in step S140, as shown in Figure 2, in response to one or more object kernel function cache groups not caching the target kernel function (that is, no target kernel function cache group caching the target kernel function is found in one or more object kernel function cache groups), the system further queries the kernel function library (KernelDataBase) to see if the target kernel function corresponding to the computation task and parameter information is stored.
[0070] For example, in some examples, a kernel function index table can be created corresponding to the kernel function library. This kernel function index table can include multiple first kernel function indexes corresponding to memory and multiple second kernel function indexes corresponding to disk. A target kernel function index can be generated based on the target kernel function (e.g., query keywords, etc.). The kernel function index table is then queried to see if there is a first kernel function index or a second kernel function index that is equivalent to the target kernel function index. If it exists, the target kernel function can be found in memory based on the equivalent first kernel function index, or on disk based on the equivalent second kernel function index.
[0071] For example, in step S150, as shown in Figure 2, in response to the target kernel function being stored in the kernel function library (that is, the target kernel function and its corresponding target kernel function cache group can be found in the kernel function library in memory or on the hard disk), the target kernel function can be provided to the device to execute the corresponding computation task.
[0072] For example, as shown in Figure 2, in response to the target kernel function not being stored in the kernel function library, a target kernel function corresponding to the computation task and parameter information is generated. The generated target kernel function is then saved in the kernel function library and provided to the device to execute the corresponding computation task.
[0073] For example, in some cases, if a target kernel function cache group corresponding to the computation task exists in the kernel function library (that is, the target kernel function cache group caches other kernel functions with the same algorithm information as the target kernel function but different parameter information), then the target kernel function cache group corresponding to the computation task can be selected in the kernel function library; in other cases, in response to the fact that a target kernel function cache group corresponding to the computation task does not yet exist in the kernel function library, the target kernel function cache group is created in the kernel function library.
[0074] For example, based on the target kernel function cache group selected or created in the kernel function library, the generated target kernel functions are stored in that target kernel function cache group. For example, if the target kernel function cache group is not in memory but is stored on disk, the target kernel function cache group can be loaded from the kernel function library into memory.
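The lookup order described in paragraphs [0064] to [0074] can be illustrated with a minimal Python sketch. It is not the implementation of this disclosure: kernel functions are stand-in strings, cache groups are plain dictionaries keyed by computation task, and the names `deploy_kernel`, `memory_groups`, `library` and `generate` are hypothetical.

```python
from typing import Callable, Dict, Tuple

Params = Tuple[int, ...]
# Assumption: computation task name -> {parameter tuple -> kernel function}.
CacheGroups = Dict[str, Dict[Params, str]]


def deploy_kernel(
    task: str,
    params: Params,
    memory_groups: CacheGroups,
    library: CacheGroups,
    generate: Callable[[str, Params], str],
) -> Tuple[str, str]:
    """Return (kernel, source), where source says where the kernel came from."""
    # S120/S130: query the object kernel function cache groups held in memory.
    group = memory_groups.get(task)
    if group is not None and params in group:
        return group[params], "memory"

    # S140/S150: fall back to the kernel function library.
    lib_group = library.get(task)
    if lib_group is not None and params in lib_group:
        memory_groups[task] = lib_group  # load the group into memory
        return lib_group[params], "library"

    # Not stored anywhere: generate, then select or create the cache group
    # in the library, store the kernel there, and load the group into memory.
    kernel = generate(task, params)
    lib_group = library.setdefault(task, {})
    lib_group[params] = kernel
    memory_groups[task] = lib_group
    return kernel, "generated"
```

In this sketch a second request for the same computation task and parameter information is served from memory, which is the case the method aims to make common.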
[0075] The kernel function deployment method provided in at least one embodiment of this disclosure saves the time of searching for the target kernel function cache group in the kernel function library by setting one or more object kernel function cache groups in memory, greatly reduces the latency caused by searching for kernel functions, improves the efficiency of kernel function usage, improves the running performance of device-side programs (such as GPU programs), and thus improves the computing power of computing devices.
[0076] For example, in some cases, the parameter information obtained when acquiring the computation task in step S110 may change dynamically (i.e., it is not fixed), and the value range of this dynamically changing parameter information cannot be obtained in advance. Because the parameter information changes dynamically, it is not possible to directly search for the target kernel function in the object kernel function cache group or kernel function library, nor is it possible to generate a target kernel function based on the parameter information.
[0077] For example, when parameter information changes dynamically, it can be set as a placeholder until the program starts running and the fixed parameter information is available. Alternatively, before the fixed parameter information is obtained, it can be checked whether a target kernel function cache group corresponding to the computation task is stored.
[0078] Figure 3 is another exemplary flowchart of a kernel function deployment method provided in at least one embodiment of this disclosure; Figure 4 is a schematic diagram of another example of a kernel function deployment method provided in at least one embodiment of this disclosure. For example, Figure 4 is a specific example of the kernel function deployment method shown in Figure 1 and Figure 3.
[0079] For example, as shown in Figure 3, before querying whether a target kernel function corresponding to the computation task and parameter information is cached in one or more object kernel function cache groups, the kernel function deployment method provided in at least one embodiment of this disclosure may further include the following steps S1101 to S1103.
[0080] Step S1101: Based on the computation task, query one or more object kernel function cache groups to see if there is a cache group of target kernel functions corresponding to the computation task.
[0081] Step S1102: In response to the fact that the target kernel function cache group is not cached in one or more object kernel function cache groups, search in the kernel function library to see if the target kernel function cache group corresponding to the computation task is stored, according to the computation task.
[0082] Step S1103: In response to the presence of a target kernel function cache group stored in the kernel function library, the target kernel function cache group is read into memory as one of one or more object kernel function cache groups.
[0083] For example, in step S1101, because the parameter information changes dynamically, as shown in Figure 4, the one or more object kernel function cache groups can first be queried, based on the computation task, to see if a target kernel function cache group corresponding to the computation task is cached. For example, if the target kernel function cache group is found among the one or more object kernel function cache groups, then after the fixed parameter information is obtained, the target kernel function cache group can be queried directly to see if the target kernel function is cached.
[0084] For example, in step S1102, as shown in Figure 4, if the target kernel function cache group is not cached in the one or more object kernel function cache groups, the target kernel function cache group can be searched for in the kernel function library according to the computation task. Then, in step S1103, in response to the target kernel function cache group being stored in the kernel function library, the target kernel function cache group is read into memory as one of the one or more object kernel function cache groups. For example, after the fixed parameter information is obtained, the target kernel function cache group is queried to see if the target kernel function corresponding to the computation task and parameter information is cached.
[0085] For example, as shown in Figure 4, if the target kernel function is cached in the target kernel function cache group, it can be directly provided to the device to execute the computation task. Therefore, in the kernel function deployment method provided in this embodiment, when the parameter information obtained in step S110 changes dynamically, the delay is only the time from when the parameter information is not fixed to when it is fixed, which saves the time of searching for the target kernel function in the kernel function library after the parameter information is fixed; in particular, if the target kernel function cache group is found directly in the one or more object kernel function cache groups in step S1101, the time of searching for the target kernel function cache group in the kernel function library is also saved, thereby greatly improving computational efficiency.
[0086] It should be noted that, as shown in Figure 4, if the query in step S1102 finds that the target kernel function cache group is not stored in the kernel function library, or if the target kernel function cache group is found in the object kernel function cache groups or the kernel function library in step S1101 or S1102 but does not cache the target kernel function, then after the fixed parameter information is obtained, steps S140 to S150 are executed to query the target kernel function in the kernel function library; furthermore, if the target kernel function is not stored in the kernel function library, the target kernel function is generated. For example, as shown in Figure 4, the detailed process of querying or generating the target kernel function after the fixed parameter information is obtained can be found in the description of Figure 2 and will not be repeated here.
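Steps S1101 to S1103 can be sketched as a prefetch that runs while the parameter information is still a placeholder. The sketch below is an illustration under assumptions, not the disclosed implementation: cache groups are dictionaries keyed by computation task, and `prefetch_group` and `lookup_after_fix` are hypothetical names.

```python
from typing import Dict, Optional, Tuple

Params = Tuple[int, ...]
CacheGroups = Dict[str, Dict[Params, str]]


def prefetch_group(task: str, memory_groups: CacheGroups, library: CacheGroups) -> bool:
    """Make the task's cache group resident before the parameters are fixed."""
    # S1101: is the target cache group already one of the groups in memory?
    if task in memory_groups:
        return True
    # S1102: otherwise search the kernel function library by computation task.
    if task in library:
        # S1103: read the group into memory as an object cache group.
        memory_groups[task] = library[task]
        return True
    return False  # no group yet; S140 to S150 or generation follows later


def lookup_after_fix(task: str, params: Params, memory_groups: CacheGroups) -> Optional[str]:
    """Once the parameters are fixed, only the resident group is searched."""
    return memory_groups.get(task, {}).get(params)
```

When `lookup_after_fix` returns `None`, the flow falls back to querying the kernel function library or generating the kernel function, as described for Figure 2.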
[0087] Figure 5 is an exemplary flowchart of one example of step S150 in Figure 1.
[0088] For example, a kernel function index table is created corresponding to the kernel function library. This table includes multiple first kernel function indices, each corresponding to a kernel function stored in the kernel function library. For example, as shown in Figure 5, the step S150 in Figure 1 of querying the target kernel function in the kernel function library may include the following steps S1501 to S1503.
[0089] Step S1501: Based on the computation task and parameter information, calculate the target kernel function index corresponding to the target kernel function;
[0090] Step S1502: Look up the kernel function index table based on the target kernel function index;
[0091] Step S1503: In response to the existence of a first kernel function index in the kernel function index table that is equivalent to the target kernel function index, the target kernel function is obtained from the kernel function library using the equivalent first kernel function index.
[0092] For example, in step S1501, the target kernel function index corresponding to the target kernel function can be calculated based on the calculation task and parameter information of the target kernel function; in step S1502, the calculated target kernel function index is compared with multiple first kernel function indices in the kernel function index table; then in step S1503, if a first kernel function index that is equivalent to the target kernel function index can be found in the kernel function index table after comparison, the target kernel function can be obtained from the kernel function library using the equivalent first kernel function index.
[0093] For example, in some examples, the kernel function library is stored on the hard disk, and multiple first kernel function indices correspond one-to-one with multiple kernel functions stored on the hard disk. For example, steps S1501 to S1503 can be used to query whether the target kernel function is stored on the hard disk. For example, if the target kernel function is found on the hard disk, the target kernel function cache group containing the target kernel function can be loaded from the hard disk into memory as a newly added object kernel function cache group.
[0094] For example, the target kernel function index and the first kernel function index can be query keywords, or they can be other types of index information used to query kernel functions. The specific selection can be made according to actual needs, and the embodiments of this disclosure do not limit this.
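Steps S1501 to S1503 can be illustrated as follows. The disclosure leaves the form of the index open (query keywords or other index information), so the SHA-256 digest used here is only one assumed choice, and `kernel_index`, `find_kernel`, and the layout of `index_table` and `library` are hypothetical.

```python
import hashlib
from typing import Dict, Optional, Tuple

Location = Tuple[str, str]  # assumed: (cache group id, kernel function name)


def kernel_index(task: str, params: Tuple[int, ...]) -> str:
    """S1501: derive an index from the computation task and parameter information."""
    text = task + "|" + ",".join(str(p) for p in params)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_kernel(
    task: str,
    params: Tuple[int, ...],
    index_table: Dict[str, Location],
    library: Dict[str, Dict[str, str]],
) -> Optional[str]:
    target = kernel_index(task, params)
    # S1502: look the target index up in the kernel function index table.
    location = index_table.get(target)
    if location is None:
        return None
    # S1503: an equivalent index exists, so fetch the kernel it points to.
    group_id, name = location
    return library[group_id][name]
```

A dictionary makes the comparison against all stored indices a single hash lookup; a linear comparison, as the text literally describes, would give the same result.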
[0095] Figure 6 is an exemplary flowchart of one example of step S120 in Figure 1.
[0096] For example, as shown in Figure 6, the step S120 in Figure 1 of querying the target kernel function in one or more object kernel function cache groups may include the following steps S1201 to S1202.
[0097] Step S1201: Based on the computation task and parameter information, calculate the target kernel function index corresponding to the target kernel function;
[0098] Step S1202: Based on the target kernel function index, query one or more object kernel function cache groups to see if the target kernel function is cached.
[0099] For example, in step S1201, similar to step S1501, the target kernel function index can be calculated based on the calculation task and parameter information of the target kernel function; in step S1202, the calculated target kernel function index is used to query whether the target kernel function is cached in one or more object kernel function cache groups in memory. For example, one or more object kernel function cache groups are stored in memory.
[0100] For example, in some examples, a kernel function index table is created corresponding to the kernel function library. The kernel function index table includes multiple second kernel function indices, which correspond to the multiple kernel functions cached in the one or more object kernel function cache groups in memory.
[0101] For example, step S1202 may further include: searching the kernel function index table based on the target kernel function index; in response to the existence of a second kernel function index in the kernel function index table that is equivalent to the target kernel function index, retrieving the target kernel function from one or more object kernel function cache groups using the equivalent second kernel function index.
[0102] For example, multiple second kernel function indices correspond one-to-one with multiple kernel functions cached in one or more object kernel function cache groups in memory. For example, steps S1201 to S1202 can be used to query whether the target kernel function is stored in one or more object kernel function cache groups in memory.
[0103] For example, the target kernel function index and the second kernel function index can be query keywords or other types of index information used to query kernel functions. The specific selection can be made according to actual needs, and the embodiments of this disclosure do not limit this.
[0104] Figure 7 is a schematic diagram of an example of querying a target kernel function provided in at least one embodiment of this disclosure. For example, Figure 7 is a specific example of Figure 5 and/or Figure 6.
[0105] For example, as shown in Figure 7, the memory may include, for example, m+1 object kernel function cache groups (object kernel function cache group 0, ..., object kernel function cache group m, where m is an integer greater than or equal to 0), and each object kernel function cache group caches multiple kernel functions with the same algorithm information but different parameter information (for example, object kernel function cache group 0 caches kernel functions -00, -01, ... with the same algorithm information but different parameter information); in addition, the hard disk may include, for example, n kernel function cache groups (kernel function cache group 1, ..., kernel function cache group n, where n is a positive integer), and each kernel function cache group caches multiple kernel functions with the same algorithm information but different parameter information (for example, kernel function cache group 1 caches kernel functions -10, -11, ... with the same algorithm information but different parameter information).
[0106] For example, as shown in Figure 7, a kernel function index table is created corresponding to the kernel function library. The kernel function index table includes multiple first kernel function indexes and multiple second kernel function indexes. The multiple first kernel function indexes correspond one-to-one with multiple kernel functions stored on the hard disk, and the multiple second kernel function indexes correspond one-to-one with multiple kernel functions cached in one or more object kernel function cache groups in memory.
[0107] For example, as shown in Figure 7, after the parameter information of the computation task to be processed is obtained (corresponding to step S110 in Figure 1), the target kernel function index corresponding to the target kernel function is calculated based on the obtained computation task and parameter information (corresponding to step S1501 or step S1201); the calculated target kernel function index is then used to search the kernel function index table in the kernel function library. For example, the target kernel function index is compared with the multiple first kernel function indices and the multiple second kernel function indices in the kernel function index table; if the comparison finds a first kernel function index or a second kernel function index that is equivalent to the target kernel function index, the target kernel function can be obtained from the hard disk or memory using the equivalent first kernel function index or second kernel function index.
[0108] For example, in some examples, as shown in Figure 7, if the comparison finds a first kernel function index 1 in the kernel function index table that is equivalent to the target kernel function index, the target kernel function can be retrieved from memory using the first kernel function index 1. For example, the first kernel function index 1 corresponds to kernel function -00 in memory. Kernel function -00 is cached in object kernel function cache group 0. Kernel function -00 is the target kernel function, and object kernel function cache group 0 is the target kernel function cache group.
[0109] For example, in other examples, as shown in Figure 7, if the comparison finds a second kernel function index 1 in the kernel function index table that is equivalent to the target kernel function index, the target kernel function can be retrieved from the hard disk using the second kernel function index 1. For example, the second kernel function index 1 corresponds to kernel function -10 on the hard disk. Kernel function -10 is cached in kernel function cache group 1, and kernel function -10 is the target kernel function. Kernel function cache group 1 is the target kernel function cache group. For example, after kernel function -10 is found on the hard disk as the target kernel function, kernel function cache group 1 containing kernel function -10 can be loaded from the hard disk into memory as a newly added object kernel function cache group.
[0110] For example, the memory includes multiple object kernel function cache groups, each of which has a corresponding first priority value, and the multiple object kernel function cache groups are sorted according to their corresponding first priority values. For example, step S130 may further include: in response to the multiple object kernel function cache groups including a target kernel function cache group with a target kernel function, modifying the first priority value corresponding to the target kernel function cache group to increase the first priority of the target kernel function cache group.
[0111] For example, in some examples, each object kernel function cache group in memory includes a priority counter, the value of which is the first priority value, which indicates the frequency of use of the object kernel function cache group and the kernel functions within it.
[0112] For example, as shown in Figure 7, each object kernel function cache group in memory has a corresponding first priority value, and the multiple object kernel function cache groups are sorted according to their corresponding first priority values. For example, the first priority value corresponding to object kernel function cache group 0 is 100, the first priority value corresponding to object kernel function cache group 1 is 99, ..., and the first priority value corresponding to object kernel function cache group m is 20.
[0113] For example, as shown in Figure 7, if the comparison finds a first kernel function index 2 in the kernel function index table that is equivalent to the target kernel function index, and the first kernel function index 2 corresponds to kernel function -m0 in memory, then the object kernel function cache group m where kernel function -m0 is located is the target kernel function cache group. For example, the first priority of object kernel function cache group m can be increased by adding 2 to the priority counter in object kernel function cache group m (other increments can also be used according to actual needs, and the embodiments of this disclosure do not limit this), that is, by modifying the first priority value corresponding to object kernel function cache group m from 20 to 22.
[0114] For example, a hard disk includes multiple kernel function cache groups, each of which has a corresponding second priority value, and the multiple kernel function cache groups are sorted according to their corresponding second priority values.
[0115] For example, in some cases, each kernel function cache group in the hard disk includes a priority counter, and the value on the priority counter is the second priority value, which indicates the frequency of use of the kernel function cache group and the kernel functions within it.
[0116] For example, as shown in Figure 7, each kernel function cache group on the hard disk has a corresponding second priority value, and the multiple kernel function cache groups are sorted according to their corresponding second priority values. For example, the second priority value corresponding to kernel function cache group 1 is 20, the second priority value corresponding to kernel function cache group 2 is 19, ..., and the second priority value corresponding to kernel function cache group n is 1.
[0117] For example, as shown in Figure 7, if the comparison finds a second kernel function index 1 in the kernel function index table that is equivalent to the target kernel function index, and this second kernel function index 1 corresponds to kernel function -10 on the hard disk, then kernel function cache group 1, where kernel function -10 is located, is the target kernel function cache group. For example, the second priority of kernel function cache group 1 can be increased by adding 2 to the priority counter in kernel function cache group 1 (other increments can also be used as needed, and the embodiments of this disclosure do not limit this), that is, by modifying the second priority value corresponding to kernel function cache group 1 from 20 to 22.
[0118] It should be noted that the above-described modification of the first priority value or the second priority value by a counter is only an example. Other methods can be selected to modify the first priority value or the second priority value according to actual needs. The embodiments disclosed herein do not limit this.
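The counter-based update in the Figure 7 example can be sketched in a few lines of Python. The increment of 2 follows the example in the text; the function name `record_hit` and the dictionary representation of the priority counters are assumptions made for illustration.

```python
from typing import Dict, List


def record_hit(priorities: Dict[str, int], group_id: str, step: int = 2) -> List[str]:
    """Raise the priority counter of the group that served the target kernel.

    Returns the group ids ordered from highest to lowest priority value,
    mirroring the statement that the groups are kept sorted by priority.
    """
    priorities[group_id] += step
    return sorted(priorities, key=priorities.get, reverse=True)


# Values taken from the Figure 7 example: groups 0, 1 and m.
first_priorities = {"group_0": 100, "group_1": 99, "group_m": 20}
order = record_hit(first_priorities, "group_m")  # group_m goes from 20 to 22
```

The same function applies unchanged to the second priority values of kernel function cache groups on the hard disk.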
[0119] For example, object kernel function cache groups in memory are used more frequently than kernel function cache groups on the hard disk. When the target kernel function cache group is found on the hard disk, it can be loaded into memory as a new object kernel function cache group with an increased priority, and an object kernel function cache group with lower priority in memory can be swapped out to the hard disk; that is, when priorities change, kernel function cache groups are swapped between memory and the hard disk.
[0120] Figure 8 is an exemplary flowchart of another example of step S150 in Figure 1.
[0121] For example, in some cases, the kernel function library is stored on the hard disk. Step S150 in Figure 1 may further include the following steps S1511 to S1512.
[0122] Step S1511: In response to the kernel function library including a target kernel function cache group with the target kernel function, load the target kernel function cache group into memory as one of multiple object kernel function cache groups, and set the first priority value corresponding to the target kernel function cache group to increase the first priority of the target kernel function cache group;
[0123] Step S1512: In response to the target kernel function cache group being loaded into memory as one of the multiple object kernel function cache groups, at least one of the multiple object kernel function cache groups included in memory is swapped out to the hard disk.
[0124] For example, in step S1511, if the kernel function library (e.g., stored on the hard disk) includes a target kernel function cache group, the target kernel function cache group is loaded into memory as a newly added object kernel function cache group; the second priority value of the target kernel function cache group on the hard disk is converted into the first priority value in memory, and the first priority value of the target kernel function cache group in memory is reset, thereby increasing the first priority of the target kernel function cache group.
[0125] For example, in step S1512, since the number of object kernel function cache groups in memory is fixed, after the target kernel function cache group is loaded into memory, at least one of the multiple object kernel function cache groups included in memory can be swapped out to the hard disk. For example, the object kernel function cache group with the smallest first priority value among the multiple object kernel function cache groups can be swapped out to the hard disk.
[0126] For example, when the object kernel function cache group with the smallest first priority value among the multiple object kernel function cache groups is swapped out to the hard disk, it becomes a kernel function cache group on the hard disk and its first priority value is converted into a second priority value on the hard disk; this second priority value can be modified to increase the second priority of that kernel function cache group.
[0127] Figure 9 is a schematic diagram of an example of a priority swapping mechanism for multiple object kernel function cache groups in memory provided in at least one embodiment of this disclosure.
[0128] For example, as shown in Figure 9, each object kernel function cache group in memory has a corresponding first priority value, and the multiple object kernel function cache groups are sorted according to their corresponding first priority values. For example, the first priority value corresponding to object kernel function cache group 0 is 100, the first priority value corresponding to object kernel function cache group 1 is 99, ..., and the first priority value corresponding to object kernel function cache group m is 20, where m is a positive integer.
[0129] For example, as shown in Figure 9, if the memory includes a target kernel function cache group containing the target kernel function, that is, if object kernel function cache group k is the target kernel function cache group (here k is a positive integer less than or equal to m), the first priority value corresponding to object kernel function cache group k can be modified to 101, thereby increasing the first priority of object kernel function cache group k. For example, in the example of Figure 9, object kernel function cache group k, which is the target kernel function cache group, is given the highest first priority value and therefore the highest first priority. That is, the first priorities of the multiple object kernel function cache groups in memory are reordered.
[0130] Figure 10 is a schematic diagram of another example of a priority swapping mechanism for multiple object kernel function cache groups in memory provided in at least one embodiment of this disclosure.
[0131] For example, as shown in Figure 10, each object kernel function cache group in memory has a corresponding first priority value, and the multiple object kernel function cache groups are sorted according to their corresponding first priority values. For example, the first priority value corresponding to object kernel function cache group 0 is 100, the first priority value corresponding to object kernel function cache group 1 is 99, ..., and the first priority value corresponding to object kernel function cache group m is 20, where m is a positive integer.
[0132] For example, as shown in Figure 10, if the target kernel function cache group is not included in memory but is included in the kernel function library (e.g., on the hard disk), the target kernel function cache group is loaded into memory as a new object kernel function cache group; the second priority value of the target kernel function cache group on the hard disk is converted to a first priority value in memory, and the first priority value of the target kernel function cache group in memory is reset. For example, in the example shown in Figure 10, the first priority value of the target kernel function cache group in memory is set to 101, the highest first priority value, so that the target kernel function cache group has the highest first priority. In other words, the first priorities of the multiple object kernel function cache groups in memory are reordered.
[0133] For example, as shown in Figure 10, since the number of object kernel function cache groups in memory is fixed (m+1), after the target kernel function cache group is loaded into memory, at least one of the m+1 object kernel function cache groups included in memory can be swapped out to the hard disk. For example, as shown in Figure 10, object kernel function cache group m, which has the smallest first priority value, can be swapped out to the hard disk.
[0134] For example, as shown in Figure 10, when object kernel function cache group m is swapped out to the hard disk, it becomes a kernel function cache group on the hard disk, and its first priority value is converted to a second priority value on the hard disk. For example, as shown in Figure 10, the second priority value of object kernel function cache group m on the hard disk can be modified to 21 (other values can also be used as needed, and the embodiments of this disclosure do not limit this) to increase its second priority on the hard disk; that is, the second priorities of the multiple kernel function cache groups on the hard disk are reordered.
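The swap described for steps S1511 to S1512 and Figure 10 can be sketched as below. This is only an illustration under assumptions: each store is reduced to a dictionary from cache group id to priority value, the loaded group receives the current maximum plus one (which yields 101 for the Figure 10 values), and the evicted group's value is raised by one on conversion (20 becomes 21, as in the example). The name `swap_in` and the `capacity` parameter are hypothetical.

```python
from typing import Dict, Optional


def swap_in(
    memory: Dict[str, int],
    disk: Dict[str, int],
    target: str,
    capacity: int,
) -> Optional[str]:
    """Load `target` from disk into memory and evict one group if memory is full.

    Returns the id of the evicted group, or None if nothing was evicted.
    """
    # S1511: move the target group into memory and give it the highest
    # first priority value (assumption: current maximum + 1).
    disk.pop(target)
    memory[target] = max(memory.values(), default=0) + 1

    # S1512: memory holds a fixed number of groups, so swap out the group
    # with the smallest first priority value.
    evicted = None
    if len(memory) > capacity:
        evicted = min(memory, key=memory.get)
        # The first priority value becomes a second priority value on disk
        # (assumption: raised by 1 on conversion, matching 20 -> 21).
        disk[evicted] = memory.pop(evicted) + 1
    return evicted


memory = {"group_0": 100, "group_1": 99, "group_m": 20}
disk = {"target": 20, "other": 19}
evicted = swap_in(memory, disk, "target", capacity=3)
```

After the call, `memory` holds `group_0`, `group_1` and `target`, while `group_m` sits on disk with a second priority value of 21.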
[0135] The kernel function deployment method provided in at least one embodiment of this disclosure sets priorities in multiple object kernel function cache groups and kernel function cache groups in the kernel function library, loads the kernel function cache group with higher priority into memory, saves the time of searching for the target kernel function cache group in the kernel function library, greatly reduces the latency caused by searching for kernel functions, improves the efficiency of kernel function usage, improves the running performance of device-side programs (GPU programs), and thus improves the computing power of computing devices.
[0136] It should be noted that the setting of the first priority value or the second priority value of the target kernel function cache group in Figure 7, Figure 9 or Figure 10 is merely exemplary. The specific setting method can be selected according to actual needs, and the embodiments disclosed herein do not limit this.
[0137] For example, in some cases, when the program on the device ends, all object kernel function cache groups in memory are saved to disk for use in the next program run, and the kernel function index information in the kernel function library is updated. After all object kernel function cache groups in memory are saved to disk as kernel function cache groups, all kernel function cache groups are reordered on disk according to their priority values, with higher priority values placed first and lower priority values placed last.
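The end-of-program step can be sketched as follows. The sketch only records the resulting order of cache groups in a JSON manifest; the on-disk layout of the kernel function library is not specified in the text, so the file format, the file name and the function `save_on_exit` are assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import Dict, List


def save_on_exit(memory: Dict[str, int], disk: Dict[str, int], path: str) -> List[str]:
    """Merge in-memory groups into the on-disk set and order by priority value."""
    merged = {**disk, **memory}  # every group is now a group on disk
    ordered = sorted(merged.items(), key=lambda item: item[1], reverse=True)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump([{"group": g, "priority": p} for g, p in ordered], fh)
    return [g for g, _ in ordered]


# Example run in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    manifest = os.path.join(tmp, "kernel_groups.json")
    order = save_on_exit(
        {"group_0": 101, "group_1": 99},
        {"group_7": 21, "group_8": 19},
        manifest,
    )
    with open(manifest, encoding="utf-8") as fh:
        saved = json.load(fh)
```

Placing higher priority values first means the next program run can read the most frequently used groups into memory before the others.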
[0138] Figure 11 is a schematic block diagram of a kernel function deployment apparatus provided by at least one embodiment of this disclosure.
[0139] For example, as shown in Figure 11, the kernel function deployment apparatus 200 includes an acquisition module 210, a query module 220, and a providing module 230.
[0140] For example, the acquisition module 210 is configured to acquire parameter information of the computing task to be processed. That is, the acquisition module 210 can be configured to perform, for example, step S110 shown in Figure 1.
[0141] For example, the query module 220 is configured to query whether a target kernel function corresponding to the computing task and the parameter information is cached in one or more object kernel function cache groups. For example, each of the one or more object kernel function cache groups includes at least one kernel function with the same computing task but different parameter information. For example, the query module 220 is further configured to, in response to the one or more object kernel function cache groups not caching the target kernel function, query whether the target kernel function corresponding to the computing task and the parameter information is stored in the kernel function library. That is, the query module 220 can be configured to perform, for example, steps S120 and S140 shown in Figure 1.
[0142] For example, the providing module 230 is configured to provide the target kernel function to the device end to execute the computing task in response to the one or more object kernel function cache groups caching the target kernel function. The providing module 230 is also configured to provide the target kernel function to the device end to execute the computing task in response to the target kernel function being stored in the kernel function library. That is, the providing module 230 can be configured to perform, for example, steps S130 and S150 shown in Figure 1.
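The division of work among the three modules can be sketched in Python as follows. This is an illustrative assumption rather than the structure of apparatus 200: the `KernelDeployer` class, its method names, the use of nested dictionaries for the cache groups and the kernel function library, and the derivation of parameter information from a data shape are all hypothetical.

```python
class KernelDeployer:
    """Hypothetical stand-in for the kernel function deployment apparatus 200."""

    def __init__(self, memory_groups, library):
        # Both map: computing task -> {parameter information -> kernel function}.
        self.memory_groups = memory_groups  # object kernel function cache groups
        self.library = library              # stands in for the on-disk library

    def acquire(self, data):
        """Acquisition module 210 (step S110): derive parameter information."""
        # Assumption: the parameter information is the shape of the data.
        return tuple(data["shape"])

    def query(self, task, params):
        """Query module 220 (steps S120, S140): memory first, then the library."""
        kernel = self.memory_groups.get(task, {}).get(params)
        if kernel is not None:
            return kernel, "memory"
        kernel = self.library.get(task, {}).get(params)
        if kernel is not None:
            return kernel, "library"
        return None, "miss"

    def provide(self, task, data):
        """Providing module 230 (steps S130, S150): return the kernel to run."""
        params = self.acquire(data)
        # On a miss the kernel is None; generating a new kernel function is
        # outside the scope of this sketch.
        return self.query(task, params)
```

For instance, a deployer built with `{"matmul": {(64, 64): "k_mem"}}` in memory and `{"matmul": {(128, 128): "k_lib"}}` in the library returns the in-memory kernel for a 64x64 input and falls back to the library for a 128x128 input.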
[0143] Since the operation of the kernel function deployment apparatus 200 has already been described in detail in the above description of, for example, the kernel function deployment method shown in Figure 1, it is not repeated here for brevity. For relevant details, refer to the above description of Figures 1-10.
[0144] It should be noted that the above modules in the kernel function deployment apparatus 200 shown in Figure 11 can each be configured as software, hardware, firmware, or any combination thereof to perform specific functions. For example, these modules may correspond to application-specific integrated circuits, to pure software code, or to modules combining software and hardware. As an example, the apparatus described with reference to Figure 11 may be a PC, a tablet device, a personal digital assistant, a smartphone, a web application, or another device capable of executing program instructions, but is not limited thereto.
[0145] Furthermore, although the kernel function deployment apparatus 200 described above is divided into modules for executing corresponding processes, it will be clear to those skilled in the art that the processes executed by each module can also be executed without any specific module division in the apparatus or without clear boundaries between the modules. In addition, the kernel function deployment apparatus 200 described above with reference to Figure 11 is not limited to the modules described above, but may also include other modules (e.g., writing modules, control modules, etc.) as needed, or the above modules may be combined.
[0146] At least one embodiment of this disclosure also provides an electronic device including a processor and a memory; the memory includes one or more computer program modules; the one or more computer program modules are stored in the memory and configured to be executed by the processor, and the one or more computer program modules are used to implement the kernel function deployment method provided by the embodiments of this disclosure described above.
[0147] Figure 12 is a schematic block diagram of an electronic device provided by at least one embodiment of this disclosure.
[0148] For example, as shown in Figure 12, the electronic device 300 includes a processor 310 and a memory 320. For example, the memory 320 is used to store non-transitory computer-readable instructions (e.g., one or more computer program modules). The processor 310 is used to execute the non-transitory computer-readable instructions, which, when executed by the processor 310, can perform one or more steps of the kernel function deployment method described above. The memory 320 and the processor 310 can be interconnected via a bus system and/or other forms of connection mechanism (not shown).
[0149] For example, processor 310 may be a central processing unit (CPU), a digital signal processor (DSP), or other processing unit with data processing and / or program execution capabilities, such as a field-programmable gate array (FPGA); for example, the central processing unit (CPU) may be based on x86, RISC-V, or ARM architectures. Processor 310 may be a general-purpose processor or a special-purpose processor, capable of controlling other components in electronic device 300 to perform desired functions.
[0150] For example, memory 320 may include any combination of one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and / or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and / or cache memory. Non-volatile memory may include, for example, read-only memory (ROM), hard disk, erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, flash memory, etc. One or more computer program modules may be stored on the computer-readable storage medium, and processor 310 may run one or more computer program modules to implement various functions of electronic device 300. Various application programs and various data, as well as various data used and / or generated by the application programs, may also be stored in the computer-readable storage medium.
[0151] It should be noted that, in the embodiments of this disclosure, for the specific functions and technical effects of the electronic device 300, refer to the description above of the kernel function deployment method provided in at least one embodiment of this disclosure; they are not repeated here.
[0152] Figure 13 is a schematic block diagram of another electronic device provided by at least one embodiment of this disclosure.
[0153] For example, as shown in Figure 13, the electronic device 400 is, for example, suitable for implementing the kernel function deployment method provided in the embodiments of this disclosure. It should be noted that the electronic device 400 shown in Figure 13 is merely an example and does not impose any limitation on the functionality and scope of use of the embodiments of this disclosure.
[0154] For example, as shown in Figure 13, the electronic device 400 may include a processing unit (e.g., a central processing unit, a graphics processing unit, etc.) 41, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 42 or a program loaded from storage device 48 into random access memory (RAM) 43. RAM 43 also stores various programs and data required for the operation of the electronic device 400. Processing unit 41, ROM 42, and RAM 43 are interconnected via bus 44. An input/output (I/O) interface 45 is also connected to bus 44. Typically, the following devices can be connected to I/O interface 45: input devices 46 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 47 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 48 including, for example, magnetic tape, hard disks, etc.; and communication devices 49. Communication device 49 allows the electronic device 400 to communicate with other electronic devices, wirelessly or by wire, to exchange data.
[0155] Although Figure 13 shows an electronic device 400 with various devices, it should be understood that the electronic device 400 is not required to implement or have all of the devices shown, and may alternatively implement or have more or fewer devices.
[0156] For a detailed description and the technical effects of the electronic device 400, refer to the description of the kernel function deployment method above; they are not repeated here.
[0157] Figure 14 is a schematic diagram of a storage medium provided by at least one embodiment of this disclosure.
[0158] For example, as shown in Figure 14, the storage medium 500 stores non-transitory computer-readable instructions 510. For example, when the non-transitory computer-readable instructions 510 are executed by a computer, one or more steps of the kernel function deployment method described above are performed.
[0159] For example, the storage medium 500 can be applied to the electronic device 300 shown in Figure 12. For example, the storage medium 500 can be the memory 320 in the electronic device 300. For a description of the storage medium 500, refer to the corresponding description of the memory 320 in the electronic device 300 shown in Figure 12; it is not repeated here.
[0160] The following points need to be clarified regarding this disclosure:
[0161] (1) The accompanying drawings of the embodiments of this disclosure relate only to the structures involved in the embodiments of this disclosure; for other structures, reference may be made to common designs.
[0162] (2) Where there is no conflict, features of the same embodiment and different embodiments of this disclosure can be combined with each other.
[0163] The above are merely specific embodiments of this disclosure, but the scope of protection of this disclosure is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this disclosure should be included within the scope of protection of this disclosure. Therefore, the scope of protection of this disclosure should be determined by the scope of the claims.
Claims
1. A method for deploying kernel functions, comprising: obtaining parameter information of a computing task to be processed, wherein the parameter information is associated with data to be processed; querying, according to the parameter information, whether a target kernel function corresponding to the computing task and the parameter information is cached in one or more object kernel function cache groups; and in response to the one or more object kernel function cache groups caching the target kernel function, providing the target kernel function to a device end to execute the computing task; wherein each of the one or more object kernel function cache groups comprises at least one kernel function with the same computing task but different parameter information, and the target kernel function is a function running on a processor of the device end for executing the computing task; wherein a plurality of object kernel function cache groups are included in a memory, each of the plurality of object kernel function cache groups has a corresponding first priority value, and the plurality of object kernel function cache groups are sorted according to the corresponding plurality of first priority values; and wherein the providing of the target kernel function to the device end to execute the computing task, in response to the one or more object kernel function cache groups caching the target kernel function, comprises: in response to the plurality of object kernel function cache groups including a target kernel function cache group having the target kernel function, modifying the first priority value corresponding to the target kernel function cache group to raise the first priority of the target kernel function cache group.
2. The method of claim 1, further comprising: in response to the one or more object kernel function cache groups not caching the target kernel function, querying whether the target kernel function corresponding to the computing task and the parameter information is stored in a kernel function library; in response to the kernel function library storing the target kernel function, providing the target kernel function to the device end to execute the computing task.
3. The method of claim 2, further comprising: in response to the kernel function library not storing the target kernel function, generating the target kernel function corresponding to the computing task and the parameter information.
4. The method of claim 3, further comprising: saving the generated target kernel function in the kernel function library.
5. The method of claim 4, further comprising: selecting a target kernel function cache group corresponding to the computing task in the kernel function library, or in response to the kernel function library not yet having a target kernel function cache group corresponding to the computing task, creating the target kernel function cache group in the kernel function library; saving the generated target kernel function in the target kernel function cache group.
6. The method of claim 5, further comprising: loading the target kernel function cache group from the kernel function library to the memory.
7. The kernel function deployment method according to claim 2, wherein the kernel function library comprises a plurality of kernel function cache groups, and each of the plurality of kernel function cache groups comprises at least one kernel function with the same computing task but different parameter information.
8. The kernel function deployment method according to claim 7, wherein the one or more object kernel function cache groups are one or more kernel function cache groups read from the kernel function library into the memory.
9. The kernel function deployment method according to claim 2, wherein a kernel function index table is created corresponding to the kernel function library, the kernel function index table includes multiple first kernel function indices, and the multiple first kernel function indices correspond one-to-one with the kernel functions stored in the kernel function library; and the querying of whether the target kernel function corresponding to the computing task and the parameter information is stored in the kernel function library comprises: calculating, based on the computing task and the parameter information, a target kernel function index corresponding to the target kernel function; searching the kernel function index table based on the target kernel function index; and in response to the kernel function index table containing a first kernel function index that corresponds to the target kernel function index, obtaining the target kernel function from the kernel function library using the corresponding first kernel function index.
10. The kernel function deployment method according to claim 9, wherein the kernel function library is stored on a hard disk.
11. The kernel function deployment method according to claim 1, wherein the querying of whether a target kernel function corresponding to the computing task and the parameter information is cached in one or more object kernel function cache groups comprises: calculating, based on the computing task and the parameter information, a target kernel function index corresponding to the target kernel function; and querying, based on the target kernel function index, whether the target kernel function is cached in the one or more object kernel function cache groups; wherein the one or more object kernel function cache groups are stored in the memory.
12. The kernel function deployment method according to claim 11, wherein a kernel function index table is created corresponding to the kernel function library, the kernel function index table includes multiple second kernel function indices, and the multiple second kernel function indices correspond one-to-one with the multiple kernel functions cached in the one or more object kernel function cache groups; and the querying, based on the target kernel function index, of whether the target kernel function is cached in the one or more object kernel function cache groups comprises: searching the kernel function index table based on the target kernel function index; and in response to the kernel function index table containing a second kernel function index that corresponds to the target kernel function index, obtaining the target kernel function from the one or more object kernel function cache groups using the corresponding second kernel function index.
13. The kernel function deployment method according to claim 1, further comprising, before querying whether a target kernel function corresponding to the computing task and the parameter information is cached in the one or more object kernel function cache groups: querying, based on the computing task, whether a target kernel function cache group corresponding to the computing task is cached in the one or more object kernel function cache groups.
14. The kernel function deployment method according to claim 13, further comprising, before querying whether a target kernel function corresponding to the computing task and the parameter information is cached in the one or more object kernel function cache groups: in response to the target kernel function cache group not being cached in the one or more object kernel function cache groups, querying whether the target kernel function cache group corresponding to the computing task is stored in the kernel function library; and in response to the target kernel function cache group being stored in the kernel function library, reading the target kernel function cache group into the memory as one of the one or more object kernel function cache groups.
15. The kernel function deployment method according to claim 13, wherein the querying of whether a target kernel function corresponding to the computing task and the parameter information is cached in the one or more object kernel function cache groups comprises: querying, in the target kernel function cache group, whether the target kernel function corresponding to the computing task and the parameter information is cached.
16. The kernel function deployment method according to claim 2, wherein the memory includes multiple object kernel function cache groups, each of the multiple object kernel function cache groups has a corresponding first priority value, and the multiple object kernel function cache groups are sorted according to their corresponding first priority values; and the providing of the target kernel function to the device end to execute the computing task, in response to the kernel function library storing the target kernel function, comprises: in response to the kernel function library including a target kernel function cache group having the target kernel function, loading the target kernel function cache group into the memory as one of the multiple object kernel function cache groups, and setting the first priority value corresponding to the target kernel function cache group to raise the first priority of the target kernel function cache group.
17. The kernel function deployment method according to claim 16, wherein the providing of the target kernel function to the device end to execute the computing task, in response to the kernel function library storing the target kernel function, further comprises: in response to loading the target kernel function cache group into the memory as one of the multiple object kernel function cache groups, replacing at least one of the multiple object kernel function cache groups included in the memory to the hard disk.
18. The kernel function deployment method according to claim 17, wherein the replacing of at least one of the multiple object kernel function cache groups included in the memory to the hard disk comprises: replacing the object kernel function cache group with the smallest first priority value among the multiple object kernel function cache groups to the hard disk.
19. The kernel function deployment method according to claim 17, wherein the hard disk includes multiple kernel function cache groups, each of the multiple kernel function cache groups has a corresponding second priority value, and the multiple kernel function cache groups are sorted according to their corresponding second priority values; and the replacing of at least one of the multiple object kernel function cache groups included in the memory to the hard disk comprises: in response to the object kernel function cache group with the smallest first priority value among the multiple object kernel function cache groups being replaced to the hard disk, modifying the second priority value of the kernel function cache group on the hard disk that corresponds to the object kernel function cache group with the smallest first priority value, to raise the second priority of the corresponding kernel function cache group.
20. A kernel function deployment apparatus, comprising: an acquisition module configured to acquire parameter information of a computing task to be processed, wherein the parameter information is associated with data to be processed; a query module configured to query whether a target kernel function corresponding to the computing task and the parameter information is cached in one or more object kernel function cache groups; and a providing module configured to provide the target kernel function to a device end to execute the computing task in response to the one or more object kernel function cache groups caching the target kernel function; wherein each of the one or more object kernel function cache groups includes at least one kernel function with the same computing task but different parameter information, and the target kernel function is a function that runs on a processor of the device end to execute the computing task; wherein a memory includes multiple object kernel function cache groups, each of the multiple object kernel function cache groups has a corresponding first priority value, and the multiple object kernel function cache groups are sorted according to their corresponding first priority values; and wherein the providing module is further configured to: in response to the multiple object kernel function cache groups including a target kernel function cache group having the target kernel function, modify the first priority value corresponding to the target kernel function cache group to raise the first priority of the target kernel function cache group.
21. The kernel function deployment apparatus according to claim 20, wherein the query module is further configured to, in response to the one or more object kernel function cache groups not caching the target kernel function, query whether the target kernel function corresponding to the computing task and the parameter information is stored in a kernel function library; and the providing module is further configured to provide the target kernel function to the device end to execute the computing task in response to the kernel function library storing the target kernel function.
22. An electronic device, comprising: a processor; and a memory including one or more computer program modules; wherein the one or more computer program modules are stored in the memory and configured to be executed by the processor, and the one or more computer program modules are used to implement the kernel function deployment method according to any one of claims 1-19.
23. A storage medium storing non-transitory computer-readable instructions that, when executed by a computer, implement the kernel function deployment method according to any one of claims 1-19.