Virtual machine optimization method, device and equipment based on tensor data computation inference
By acquiring the design structure information of the virtual machine and setting the target intermediate transformation layer, and optimizing the virtual machine using an automatic parameter tuning strategy, the problem of mismatch between the virtual machine and the input model is solved, and the efficiency of tensor data inference is improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- OPENBAYES (TIANJIN) IT CO LTD
- Filing Date
- 2022-08-19
- Publication Date
- 2026-06-16
AI Technical Summary
Existing technologies cannot optimize the fit between the virtual machine and the specified input model in a timely manner, resulting in low efficiency of tensor data inference.
The method obtains the design structure information of the virtual machine to be optimized, sets the target intermediate transformation layer and its execution operation, identifies the object to be optimized, and tunes it with the target automatic parameter tuning strategy until the virtual machine fits the specified input model.
This improves the efficiency of tensor data inference, makes the virtual machine better suited to operating on the specified input model, and enhances computational performance.
Abstract figure: CN115423102B_ABST
Description
Technical Field
[0001] This invention relates to the field of deep learning technology, and in particular to a virtual machine optimization method, apparatus, and device based on tensor data computation and inference.
Background Technology
[0002] With the continuous development of artificial intelligence, deep learning technology has been widely applied in various fields. The application of deep learning is inseparable from inference frameworks such as TensorFlow, PyTorch, and TNN. However, different frameworks serve different functions. For example, TensorFlow and PyTorch are platform-level frameworks that can be used for both training and inference, while the TNN framework can only be used for inference. Regardless of its function, each framework is adapted to certain acceleration devices. To facilitate the inference of tensor data, the concept of a virtual machine was introduced: the virtual machine performs a series of operations on the model developed by the developer and quickly deploys that model to different acceleration devices. However, the virtual machine operates differently for different models. If the virtual machine is not optimized in time, it will fit the specified input model poorly, resulting in low efficiency of tensor data inference.
[0003] The above content is only intended to aid understanding of the technical solution of the present invention and does not constitute an admission that it is prior art.
Summary of the Invention
[0004] The main objective of this invention is to provide a virtual machine optimization method, apparatus, and device based on tensor data computation and inference, aiming to solve the technical problem that the existing technology cannot obtain a virtual machine that fits the specified input model, resulting in low efficiency of tensor data inference.
[0005] To achieve the above objective, the present invention provides a virtual machine optimization method based on tensor data computation and inference, comprising the following steps:
[0006] Obtain the design structure information of the virtual machine to be optimized, and obtain the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information;
[0007] Set the target parameter tuning learning parameters according to the target intermediate transformation layer;
[0008] Determine the object to be optimized based on the target intermediate transformation layer execution operation;
[0009] Perform parameter tuning optimization on the object to be optimized based on the target parameter tuning learning parameters, using a target automatic parameter tuning strategy.
[0010] Optionally, obtaining the design structure information of the virtual machine to be optimized, and obtaining the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information, includes:
[0011] Obtain the design structure information of the virtual machine to be optimized, and obtain the corresponding layer set based on the design structure information;
[0012] Select the target intermediate conversion layer from the layer set according to the layer characteristics;
[0013] Obtain, according to the target intermediate conversion layer, the process of generating the target self-decoding from the target unified format intermediate representation;
[0014] Obtain the target intermediate conversion layer execution operation according to the process of generating the target self-decoding from the target unified format intermediate representation.
[0015] Optionally, the target parameter tuning learning parameters include the parameter tuning learning type, the target learning object, and the target number of parameter tunings;
[0016] The step of setting target parameter tuning learning parameters according to the target intermediate transformation layer includes:
[0017] Obtain the number of data exchanges and data decomposition performance for generating the target self-decoding;
[0018] The parameter tuning learning type and target learning object are set according to the number of data exchanges and the data decomposition performance.
[0019] The target range of parameter settings is obtained based on the target intermediate conversion layer, and the maximum and minimum values of parameter settings are obtained based on the target range.
[0020] The target number of parameter tuning attempts is set based on the maximum value and minimum value of the parameters.
[0021] Optionally, determining the object to be optimized based on the operation performed by the target intermediate transformation layer includes:
[0022] Obtain the execution nodes of the target intermediate conversion layer according to the target intermediate conversion layer execution operation;
[0023] Obtain the data optimization strategies set in each layer of the target intermediate transformation layer;
[0024] The execution operations for each layer are obtained based on the execution node and the data optimization strategy for each layer;
[0025] The objects to be optimized are determined based on the operations performed at each layer.
[0026] Optionally, determining the object to be optimized based on the operations performed at each layer includes:
[0027] Obtain the computation graph of the specified input model based on the execution operations of each layer;
[0028] The memory loading data for each operation is obtained based on the computation graph of the specified input model;
[0029] Multi-level operation interaction data is obtained based on the memory loading data;
[0030] The multi-level operations of the multi-level operation interaction data are taken as the objects to be optimized.
[0031] Optionally, determining the object to be optimized based on the operations performed at each layer includes:
[0032] Obtain the intermediate representation of the unified format to be processed based on the execution operations of each layer;
[0033] Construct the target feature matrix based on the intermediate representation of the unified format to be processed;
[0034] Obtain the matrix computation time and the matrix computation resource consumption based on the target feature matrix;
[0035] When the matrix computation time exceeds a preset time threshold and / or the matrix computation resource consumption exceeds a preset storage resource threshold, the matrix computation time and the matrix computation resource consumption are taken as objects to be optimized.
[0036] Optionally, the step of optimizing the object to be optimized based on the target parameter tuning learning parameters using the target automatic parameter tuning strategy includes:
[0037] Set the corresponding initial optimization strategy according to the object to be optimized;
[0038] The initial optimization strategy is adjusted based on the target parameter tuning learning parameters;
[0039] The parameter optimization range is obtained based on the adjusted initial optimization strategy;
[0040] The target automated parameter tuning strategy is used to tune and optimize the object to be optimized within the parameter optimization range to obtain a virtual machine that fits the specified input model.
[0041] Furthermore, to achieve the above objective, the present invention also proposes a virtual machine optimization apparatus based on tensor data computation and inference, which includes:
[0042] The acquisition module is used to acquire the design structure information of the virtual machine to be optimized, and to obtain the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information.
[0043] The setting module is used to set the target parameter tuning learning parameters according to the target intermediate conversion layer;
[0044] The determination module is used to determine the object to be optimized based on the operation performed by the target intermediate conversion layer;
[0045] The optimization module is used to perform parameter tuning optimization on the object to be optimized based on the target parameter tuning learning parameters using a target automatic parameter tuning strategy.
[0046] Furthermore, to achieve the above objectives, the present invention also proposes a virtual machine optimization device based on tensor data computation and inference. The virtual machine optimization device based on tensor data computation and inference includes: a memory, a processor, and a virtual machine optimization program based on tensor data computation and inference stored in the memory and executable on the processor. The virtual machine optimization program based on tensor data computation and inference is configured to implement the virtual machine optimization method based on tensor data computation and inference as described above.
[0047] Furthermore, to achieve the above objectives, the present invention also proposes a storage medium storing a virtual machine optimization program based on tensor data computation and inference, wherein the virtual machine optimization program based on tensor data computation and inference implements the virtual machine optimization method based on tensor data computation and inference as described above when executed by a processor.
[0048] The proposed virtual machine optimization method based on tensor data computation and inference involves: acquiring the design structure information of the virtual machine to be optimized; obtaining the target intermediate transformation layer and its execution operations based on the design structure information; setting target parameter tuning learning parameters based on the target intermediate transformation layer; determining the object to be optimized based on the execution operations of the target intermediate transformation layer; and optimizing the object to be optimized using a target automatic parameter tuning strategy based on the target parameter tuning learning parameters. By setting target parameter tuning learning parameters based on the target intermediate transformation layer of the virtual machine to be optimized, determining the object to be optimized based on the execution operations of the target intermediate transformation layer, and then optimizing the object to be optimized from the perspective of the target parameter tuning learning parameters using an automatic parameter tuning strategy, a virtual machine that fits the specified input model can be obtained, thereby effectively improving the efficiency of tensor data inference.
Attached Figure Description
[0049] Figure 1 is a schematic diagram of the structure of a virtual machine optimization device based on tensor data computation and inference in the hardware operating environment involved in the embodiments of the present invention;
[0050] Figure 2 is a flowchart of the first embodiment of the virtual machine optimization method based on tensor data computation and inference of the present invention;
[0051] Figure 3 is a flowchart of the second embodiment of the virtual machine optimization method based on tensor data computation and inference of the present invention;
[0052] Figure 4 is a schematic diagram of the functional modules of the first embodiment of the virtual machine optimization apparatus based on tensor data computation and inference of the present invention.
[0053] The realization of the objective, functional features, and advantages of the present invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Implementation
[0054] It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the invention.
[0055] Referring to Figure 1, Figure 1 is a schematic diagram of the structure of the virtual machine optimization device based on tensor data computation and inference in the hardware operating environment involved in the embodiments of the present invention.
[0056] As shown in Figure 1, the virtual machine optimization device based on tensor data computation and inference may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement communication between these components. The user interface 1003 may include a display screen or an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk drive. The memory 1005 may also optionally be a storage device independent of the aforementioned processor 1001.
[0057] Those skilled in the art will understand that the structure shown in Figure 1 does not constitute a limitation on the virtual machine optimization device based on tensor data computation and inference, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
[0058] As shown in Figure 1, the memory 1005, which serves as a storage medium, may include an operating system, a network communication module, a user interface module, and a virtual machine optimization program for tensor data computation and inference.
[0059] In the virtual machine optimization device based on tensor data computation and inference shown in Figure 1, the network interface 1004 is mainly used for data communication with the network integrated platform workstation; the user interface 1003 is mainly used for data interaction with the user; the processor 1001 and the memory 1005 can be set in the virtual machine optimization device based on tensor data computation and inference. The device calls, through the processor 1001, the virtual machine optimization program based on tensor data computation and inference stored in the memory 1005 and executes the virtual machine optimization method based on tensor data computation and inference provided in the embodiment of the present invention.
[0060] Based on the above hardware structure, an embodiment of the virtual machine optimization method based on tensor data computation and inference of the present invention is proposed.
[0061] Referring to Figure 2, Figure 2 is a flowchart of the first embodiment of the virtual machine optimization method based on tensor data computation and inference of the present invention.
[0062] In the first embodiment, the virtual machine optimization method based on tensor data computation and inference includes the following steps:
[0063] Step S10: Obtain the design structure information of the virtual machine to be optimized, and obtain the target intermediate conversion layer and the target intermediate conversion layer execution operation based on the design structure information.
[0064] It should be noted that the execution subject of this embodiment is a virtual machine optimization device based on tensor data computation and inference. It can also be other devices that can achieve the same or similar functions, such as a controller for tensor computation virtual machines. This embodiment does not limit this. In this embodiment, a controller for tensor computation virtual machines will be used as an example for explanation.
[0065] It should be understood that the virtual machine to be optimized refers to the virtual machine for tensor data computation and inference that needs to be optimized. By optimizing the virtual machine, the optimized virtual machine is more suitable for the input model. The design structure information refers to the structural information of the virtual machine to be optimized. The design structure information includes the front-end access layer, the target intermediate transformation layer and the terminal execution layer, and is applied to the field of deep learning inference. The target intermediate transformation layer execution operation refers to the execution operation of the target intermediate transformation layer to generate the target self-decoding.
[0066] Further, step S10 includes: obtaining the design structure information of the virtual machine to be optimized, obtaining the corresponding layer set according to the design structure information; selecting the target intermediate conversion layer in the layer set according to the layer characteristics; obtaining the process of generating the target self-decoder from the target unified format intermediate representation according to the target intermediate conversion layer; and obtaining the target intermediate conversion layer execution operation according to the process of generating the target self-decoder from the target unified format intermediate representation.
[0067] It is understandable that the layered set refers to the set of layers of the virtual machine to be optimized. For example, the front-end access layer is located in the first layer of the virtual machine to be optimized, the target intermediate conversion layer is located in the second layer of the virtual machine to be optimized, and the terminal execution layer is located in the third layer of the virtual machine to be optimized. The layered characteristics refer to the processing characteristics of each layer. For example, the characteristic of the front-end access layer is a unified format, the characteristic of the target intermediate conversion layer is to generate target self-decoding, and the characteristic of the terminal execution layer is to generate target device executable files.
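The layer selection described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Layer` class, the `select_layer` helper, and the characteristic strings are hypothetical names introduced here, and matching on a text label is an assumed stand-in for however the layer characteristics are actually compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str            # layer name taken from the description
    position: int        # 1 = first layer, 2 = second, 3 = third
    characteristic: str  # what the layer produces (label is an assumption)


# The three layers named in the description, in order.
LAYER_SET = [
    Layer("front-end access layer", 1, "unified format"),
    Layer("target intermediate conversion layer", 2, "generate target self-decoding"),
    Layer("terminal execution layer", 3, "generate target device executable files"),
]


def select_layer(layers, characteristic):
    """Return the single layer whose characteristic matches, or raise."""
    matches = [layer for layer in layers if layer.characteristic == characteristic]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one layer for {characteristic!r}")
    return matches[0]


target_layer = select_layer(LAYER_SET, "generate target self-decoding")
```

Here `target_layer` is the second layer of the layer set, matching the position given in the description.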
[0068] It should be understood that after selecting the target intermediate conversion layer from the layered set, the process of generating the target self-decoder from the target unified format intermediate representation is obtained. Specifically, the front-end access layer transmits the target unified format intermediate representation to the target intermediate conversion layer, and the target intermediate conversion layer optimizes according to the set data optimization strategy. After optimization, a tensor computation model file is generated, and then the target self-decoder is generated based on the tensor computation model file. At this time, the series of processes for generating the target self-decoder are executed by the target intermediate conversion layer.
[0069] Step S20: Set the target parameter tuning learning parameters according to the target intermediate conversion layer.
[0070] It is understandable that the target parameter tuning learning parameters refer to the parameters used to automatically tune the object to be optimized within a set range. They include the parameter tuning learning type, the target learning object, and the target number of parameter tunings. The parameter tuning learning type refers to the type of problem to be tuned, the target learning object refers to the target of parameter tuning, and the target number of parameter tunings refers to the number of tuning rounds needed to reach that target. The target number of parameter tunings is not unbounded; it is the maximum number of parameter tunings within the target range of the parameter settings.
[0071] Further, the target parameter tuning learning parameters include the parameter tuning learning type, the target learning object, and the target parameter tuning count. Step S20 includes: obtaining the number of data exchanges and data decomposition performance for generating the target self-decoding; setting the parameter tuning learning type and the target learning object based on the number of data exchanges and the data decomposition performance; obtaining the target range of parameter settings based on the target intermediate transformation layer, and obtaining the maximum and minimum values of parameter settings based on the target range; and setting the target parameter tuning count based on the maximum and minimum values of parameter settings.
[0072] It should be understood that the number of data exchanges refers to the number of times the data of the target unified format intermediate representation is exchanged between the video memory and the main memory when the target intermediate transformation layer is optimized. The data decomposition performance refers to the performance of matrix decomposition when tensor calculation is performed on the data of the target unified format intermediate representation. Then, the parameter tuning learning type and target learning object are set according to the number of data exchanges and the data decomposition performance, and the target parameter tuning number is set according to the maximum value and minimum value of the parameter setting within the target range.
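One way to picture step S20 is the sketch below. The text does not state how the learning object is chosen or how the tuning count follows from the maximum and minimum values, so both rules here are assumptions made for illustration: the count is taken as maximum minus minimum, and the learning object is whichever measurement exceeds its hypothetical limit by the larger factor. The names `TuningParams`, `make_tuning_params`, `exchange_limit`, and `decomposition_limit` are likewise invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningParams:
    learning_type: str    # type of problem being tuned
    learning_object: str  # what the tuning tries to improve
    max_trials: int       # upper bound on tuning rounds


def make_tuning_params(data_exchanges, decomposition_seconds, value_range,
                       exchange_limit=100, decomposition_limit=0.01):
    """Build tuning parameters from two measurements and a (min, max) range.

    Assumed rules (not from the source): the trial budget is max - min, and
    the learning object is the measurement furthest above its limit.
    """
    low, high = value_range
    if high <= low:
        raise ValueError("range maximum must exceed range minimum")
    exchange_ratio = data_exchanges / exchange_limit
    decomposition_ratio = decomposition_seconds / decomposition_limit
    if exchange_ratio >= decomposition_ratio:
        learning_object = "data exchanges"
    else:
        learning_object = "matrix decomposition"
    return TuningParams("minimization", learning_object, high - low)


params = make_tuning_params(data_exchanges=250, decomposition_seconds=0.02,
                            value_range=(0, 300))
```

With a range of (0, 300) the sketch yields a budget of 300 tuning rounds, in line with the [0, 300] example given for step S40.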
[0073] Step S30: Determine the object to be optimized based on the target intermediate conversion layer execution operation.
[0074] It should be understood that the object to be optimized refers to the object within the virtual machine to be optimized. That is, the optimization of the virtual machine is achieved by optimizing the object to be optimized. Specifically, the object to be optimized is determined based on the operation performed by the target intermediate transformation layer. The object to be optimized includes, but is not limited to, multi-level operations of multi-level operation interaction data, matrix calculation time, and matrix calculation resource consumption.
[0075] Step S40: Perform parameter tuning optimization on the object to be optimized based on the target parameter tuning learning parameters, using the target automatic parameter tuning strategy.
[0076] Further, step S40 includes: setting a corresponding initial optimization strategy according to the object to be optimized; adjusting the initial optimization strategy according to the target parameter tuning learning parameters; obtaining the parameter optimization range according to the adjusted initial optimization strategy; and performing parameter tuning optimization on the object to be optimized within the parameter optimization range through the target automated parameter tuning strategy to obtain a virtual machine that fits the specified input model.
[0077] It should be understood that the target automated parameter tuning strategy refers to the strategy for tuning and optimizing the parameters of the object to be optimized. After the target parameter tuning learning parameters are obtained, the initial optimization strategy is adjusted based on them, and the parameter optimization range is then obtained from the adjusted initial optimization strategy. A virtual machine suited to the input model can be found within this parameter optimization range. For example, if the parameter optimization range is [0, 300], then after 300 tuning rounds the virtual machine with the highest fit found during tuning is selected as the virtual machine that fits the specified input model.
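The "try up to a fixed number of configurations and keep the best" behaviour can be sketched as a budgeted search. The source does not name the search algorithm, so random search without replacement is swapped in here purely for illustration, and `tune`, `toy_cost`, and the candidate list are hypothetical. In practice the cost would come from measuring the virtual machine on the specified input model.

```python
import random


def tune(candidates, measure_cost, max_trials, seed=0):
    """Evaluate at most max_trials candidates; return (best, best_cost)."""
    rng = random.Random(seed)                   # fixed seed keeps the sketch reproducible
    order = rng.sample(candidates, k=len(candidates))  # shuffled copy, no repeats
    best, best_cost = None, float("inf")
    for candidate in order[:max_trials]:        # never exceed the trial budget
        cost = measure_cost(candidate)
        if cost < best_cost:                    # lower cost means better fit
            best, best_cost = candidate, cost
    return best, best_cost


def toy_cost(tile):
    """Hypothetical cost model: pretend the hardware favours 4*4 tiles."""
    return abs(tile - 4) + 1.0


best_tile, best_cost = tune([32, 16, 8, 4, 2], toy_cost, max_trials=300)
```

Because the budget of 300 exceeds the five candidates, every candidate is evaluated once and the lowest-cost one is returned.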
[0078] This embodiment obtains the design structure information of the virtual machine to be optimized, and obtains the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information; sets target parameter tuning learning parameters based on the target intermediate transformation layer; determines the object to be optimized based on the target intermediate transformation layer execution operation; and optimizes the object to be optimized based on the target parameter tuning learning parameters using a target automatic parameter tuning strategy. Through the above method, by setting target parameter tuning learning parameters based on the target intermediate transformation layer of the virtual machine to be optimized, and determining the object to be optimized based on the target intermediate transformation layer execution operation, and then optimizing the object to be optimized from the perspective of the target parameter tuning learning parameters using a target automatic parameter tuning strategy, a virtual machine that fits the specified input model can be obtained, thereby effectively improving the efficiency of tensor data inference.
[0079] In one embodiment, as shown in Figure 3, a second embodiment of the virtual machine optimization method based on tensor data computation and inference of the present invention is proposed based on the first embodiment, in which step S30 includes:
[0080] Step S301: Obtain the execution nodes of the target intermediate conversion layer according to the target intermediate conversion layer execution operation.
[0081] It should be understood that an execution node refers to a node in each layer of the target intermediate conversion layer that executes the target unified format intermediate representation. There can be multiple execution nodes, and a single node can perform multiple operations.
[0082] Step S302: Obtain the data optimization strategies set in each layer of the target intermediate conversion layer.
[0083] It is understandable that data optimization strategies refer to strategies for optimizing the intermediate representations of the target in a unified format at each layer of the target intermediate transformation layer. Data optimization strategies include, but are not limited to, operator fusion optimization strategies, matrix factorization optimization strategies, and convolution operator optimization strategies.
[0084] Step S303: Obtain the execution operations for each layer based on the execution node and the data optimization strategy for each layer.
[0085] It should be understood that after obtaining the execution nodes, the execution operations at each layer of the execution nodes are obtained according to the data optimization strategy. For example, there are three execution nodes in the target intermediate transformation layer, namely the first execution node, the second execution node, and the third execution node. The layer where the first execution node is located is set with an operator fusion optimization strategy and a matrix factorization optimization strategy. The layer where the second execution node is located is set with a matrix factorization optimization strategy. The layer where the third execution node is located is set with a matrix factorization optimization strategy and a convolution operator optimization strategy. Then, the corresponding execution operations are obtained when each execution node is executed.
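The three-node example above can be written down as a small mapping from execution nodes to the strategies set on their layers. This is only a sketch of the bookkeeping: the dictionary layout and the `execution_operations` helper are assumptions, and an "execution operation" is modelled simply as a (node, strategy) pair.

```python
# Strategies set on the layer of each execution node, as in the example above.
NODE_STRATEGIES = {
    "first execution node": ["operator fusion", "matrix factorization"],
    "second execution node": ["matrix factorization"],
    "third execution node": ["matrix factorization", "convolution operator"],
}


def execution_operations(node_strategies):
    """Expand each node into one (node, strategy) pair per configured strategy."""
    return [(node, strategy)
            for node, strategies in node_strategies.items()
            for strategy in strategies]


operations = execution_operations(NODE_STRATEGIES)
```

For the example configuration this yields five execution operations: two for the first node, one for the second, and two for the third.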
[0086] Step S304: Determine the object to be optimized based on the operations performed at each layer.
[0087] It is understandable that after obtaining the execution operations of each layer, the objects to be optimized are determined based on the execution operations of each layer. The objects to be optimized include, but are not limited to, multi-level operations of multi-level operation interaction data, matrix calculation time, and matrix calculation resource consumption.
[0088] Further, step S304 includes: obtaining a computation graph of a specified input model based on the operations performed at each layer; obtaining memory loading data for each operation based on the computation graph of the specified input model; obtaining multi-layer operation interaction data based on the memory loading data; and taking the multi-layer operations of the multi-layer operation interaction data as objects to be optimized.
[0089] It should be understood that a computation graph refers to the computational structure diagram of the file corresponding to the specified input model. In the computation graph of the model, each operation loads data from memory, performs calculations on the loaded data, and writes the results back to memory once the calculation is complete. Because these operations span multiple layers, they inevitably generate a large amount of data interaction, i.e., multi-layer operation data interaction. The multi-layer operations involved in this interaction therefore need to be optimized; that is, the multi-layer operations are the objects to be optimized. The specific optimization method takes three commonly used operators, conv (convolution), bias (bias addition), and relu (activation function), fuses them, and then groups the fused operators. For example, a 5*5 fused operator forms one group, and a 3*3 fused operator together with four 1*1 fused operators forms another group. With this optimization, the utilization of hardware bandwidth and the data transmission speed can be maximized, and the optimized virtual machine fits the specified input model best.
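The effect of fusing conv, bias, and relu can be illustrated on a one-dimensional signal. This is a toy sketch under stated assumptions: the function names are invented, a 1-D "valid" convolution stands in for the 2-D operators of a real model, and plain Python lists stand in for tensors held in memory. The unfused version materialises two intermediate lists (one after conv, one after bias), while the fused version computes each output element in a single pass.

```python
def conv1d(x, w):
    """1-D valid convolution (no padding, no kernel flip)."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def add_bias(y, b):
    return [v + b for v in y]


def relu(y):
    return [max(0.0, v) for v in y]


def unfused(x, w, b):
    # Three separate operators: each writes a full intermediate result.
    return relu(add_bias(conv1d(x, w), b))


def fused(x, w, b):
    # One operator: conv, bias, and relu are applied per output element,
    # so no intermediate list is written and re-read.
    k = len(w)
    return [max(0.0, sum(x[i + j] * w[j] for j in range(k)) + b)
            for i in range(len(x) - k + 1)]


x = [1.0, -2.0, 3.0, -4.0, 5.0]
w = [1.0, 0.5]
result = fused(x, w, 0.5)
```

Both versions perform the same arithmetic in the same order, so they return identical results; only the number of passes over memory differs.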
[0090] Further, step S304 also includes: obtaining intermediate representations of the unified format to be processed according to the operations performed at each layer; constructing a target feature matrix according to the intermediate representations of the unified format to be processed; obtaining the matrix computation time and matrix computation resources occupied according to the target feature matrix; and when the matrix computation time is greater than a preset time threshold and / or the matrix computation resources occupied are greater than a preset storage resource threshold, taking the matrix computation time and the matrix computation resources occupied as objects to be optimized.
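The threshold test in this step can be sketched as a small predicate. The function name, argument names, and units below are assumptions for illustration; the source only states that exceeding the time threshold and/or the storage resource threshold makes both quantities objects to be optimized.

```python
def objects_to_optimize(time_s, resource_bytes, time_limit_s, resource_limit_bytes):
    """Return the objects to optimize, or an empty list if both limits hold."""
    if time_s > time_limit_s or resource_bytes > resource_limit_bytes:
        # Per the text, exceeding either threshold selects both quantities.
        return ["matrix computation time", "matrix computation resource consumption"]
    return []


# Hypothetical measurements: time is over its limit, memory is not.
selected = objects_to_optimize(time_s=0.8, resource_bytes=1024,
                               time_limit_s=0.5, resource_limit_bytes=4096)
```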
[0091] Understandably, after the intermediate representation of the unified format to be processed is received from the front-end access layer, it is first constructed into a target feature matrix. Matrix multiplication is an essential operation in tensor computation. For example, multiplying two 32*32 matrices has cubic complexity, on the order of 32^3 = 32,768 scalar multiplications for the standard algorithm. The higher the complexity, the more resources and time the matrix computation consumes, so the matrix computation time and resource consumption are optimized. Specifically, the 32*32 matrix is decomposed into smaller matrices, with candidate side lengths of 32, 16, 8, 4, and 2, so that one large matrix multiplication becomes a series of small matrix multiplications. The best decomposition is determined by the target automated parameter tuning strategy. For example, if the strategy determines that the hardware is optimized for 4*4 matrix multiplication, which reduces the matrix computation time and resource consumption, then the 32*32 matrix is decomposed into 64 small 4*4 matrices for computation.
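The decomposition into 4*4 blocks can be checked with a small blocked (tiled) matrix multiplication. This is a minimal pure-Python sketch, assuming square matrices whose side length is divisible by the tile size; `naive_matmul`, `blocked_matmul`, and the test matrices are illustrative names and data, not part of the source.

```python
def naive_matmul(a, b):
    """Reference product using the standard triple loop."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def blocked_matmul(a, b, tile):
    """Multiply square matrices tile by tile; tile must divide the side length."""
    n = len(a)
    if n % tile:
        raise ValueError("tile size must divide the matrix side length")
    c = [[0] * n for _ in range(n)]
    for i0 in range(0, n, tile):              # tile row of the output
        for j0 in range(0, n, tile):          # tile column of the output
            for k0 in range(0, n, tile):      # tiles along the shared dimension
                for i in range(i0, i0 + tile):
                    for j in range(j0, j0 + tile):
                        acc = c[i][j]
                        for k in range(k0, k0 + tile):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


N, TILE = 32, 4
A = [[(3 * i + j) % 7 for j in range(N)] for i in range(N)]
B = [[(i + 2 * j) % 5 for j in range(N)] for i in range(N)]
tiles_per_matrix = (N // TILE) ** 2   # 8 * 8 = 64 tiles of size 4*4
C = blocked_matmul(A, B, TILE)
```

The blocked version performs the same scalar multiplications as the reference; what changes is the order of memory accesses, which is the property a tuner would exploit when picking the tile size for a given device.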
[0092] It should be understood that the object to be optimized can also be a tensor used in convolution operations. For example, consider a convolution in which the input and output tensor sizes are both 32*32, the number of input channels is 3, the number of output channels is 16, and the convolution kernel is 3*3. Based on these parameters, a total of 32*32*3*3*3*16*32*32 multiplication and addition operations are required. Obviously, the computation time of this convolution is relatively long. The convolution is therefore adjusted through the target automatic parameter tuning strategy to find the fastest tensor convolution computation method. Specifically, the input and output tensors are split into candidate sizes of 32, 16, 8, 4, and 2, and the number of channels is split into candidate sizes of 16, 8, 4, and 2.
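The split sizes listed above define a small two-dimensional search space. A minimal sketch of enumerating it and selecting the fastest configuration is given below. The `measure` callback is a hypothetical stand-in for timing a convolution on the target hardware, and exhaustive search is an assumption; the text does not state which search method the tuning strategy uses.

```python
from itertools import product
from typing import Callable, List, Tuple

SPATIAL_SPLITS = (32, 16, 8, 4, 2)   # tensor split sizes named in the text
CHANNEL_SPLITS = (16, 8, 4, 2)       # channel split sizes named in the text

Split = Tuple[int, int]              # (spatial split, channel split)


def candidate_splits() -> List[Split]:
    """All combinations of one spatial split and one channel split."""
    return list(product(SPATIAL_SPLITS, CHANNEL_SPLITS))


def fastest_split(measure: Callable[[Split], float]) -> Split:
    """Return the configuration for which `measure` reports the lowest cost."""
    return min(candidate_splits(), key=measure)
```

The space contains 5 * 4 = 20 configurations, which is small enough to measure exhaustively; larger spaces would typically need a sampling or model-guided search.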
[0093] This embodiment obtains the execution node of the target intermediate conversion layer based on the target intermediate conversion layer execution operation; acquires the data optimization strategies set in each layer of the target intermediate conversion layer; obtains the execution operations of each layer based on the execution node and the data optimization strategies of each layer; and determines the object to be optimized based on the execution operations of each layer. Deriving the object to be optimized step by step in this way, from the execution node through the per-layer data optimization strategies to the per-layer execution operations, effectively improves the accuracy of determining the object to be optimized.
[0094] Furthermore, this embodiment of the invention also proposes a storage medium storing a virtual machine optimization program based on tensor data computation and inference. When the virtual machine optimization program based on tensor data computation and inference is executed by a processor, it implements the steps of the virtual machine optimization method based on tensor data computation and inference as described above.
[0095] Since this storage medium adopts all the technical solutions of all the above embodiments, it has at least all the beneficial effects brought about by the technical solutions of the above embodiments, which will not be repeated here.
[0096] In addition, referring to Figure 4, this invention also proposes a virtual machine optimization device based on tensor data computation and inference, the device comprising:
[0097] The acquisition module 10 is used to acquire the design structure information of the virtual machine to be optimized, and to obtain the target intermediate conversion layer and the target intermediate conversion layer execution operation based on the design structure information.
[0098] The setting module 20 is used to set the target parameter tuning learning parameters according to the target intermediate conversion layer.
[0099] The determination module 30 is used to determine the object to be optimized based on the operation performed by the target intermediate conversion layer.
[0100] The optimization module 40 is used to perform parameter tuning optimization on the object to be optimized based on the target parameter tuning learning parameters through the target automatic parameter tuning strategy.
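The data flow between the four modules can be sketched as a simple pipeline. All names, types, and placeholder values below are hypothetical and chosen only to show which module consumes which output; the text does not prescribe any particular interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class TuningParams:
    learning_type: str     # problem type of parameter tuning learning
    learning_object: str   # target learning object
    tuning_count: int      # target parameter tuning count


@dataclass
class OptimizationDevice:
    # Module 10: design structure info -> (target layer, its execution operation)
    acquire: Callable[[Dict[str, Any]], Tuple[str, str]]
    # Module 20: target layer -> tuning learning parameters
    set_params: Callable[[str], TuningParams]
    # Module 30: execution operation -> objects to be optimized
    determine: Callable[[str], List[str]]
    # Module 40: objects + tuning parameters -> tuning result
    optimize: Callable[[List[str], TuningParams], Dict[str, Any]]

    def run(self, design_info: Dict[str, Any]) -> Dict[str, Any]:
        layer, execution_op = self.acquire(design_info)
        params = self.set_params(layer)
        targets = self.determine(execution_op)
        return self.optimize(targets, params)


def build_demo_device() -> OptimizationDevice:
    """Wire the pipeline with stub modules (placeholder values only)."""
    return OptimizationDevice(
        acquire=lambda info: (info["layers"][1], "ir_to_target_code"),
        set_params=lambda layer: TuningParams("placeholder_type",
                                              "placeholder_object", 8),
        determine=lambda op: ["multi_layer_operations"],
        optimize=lambda targets, p: {"tuned": targets,
                                     "trials": p.tuning_count},
    )
```

Calling `build_demo_device().run({"layers": ["front_end", "intermediate", "back_end"]})` passes the stub values through modules 10, 20, 30, and 40 in that order.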
[0101] This embodiment obtains the design structure information of the virtual machine to be optimized, and obtains the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information; sets target parameter tuning learning parameters based on the target intermediate transformation layer; determines the object to be optimized based on the target intermediate transformation layer execution operation; and optimizes the object to be optimized based on the target parameter tuning learning parameters using a target automatic parameter tuning strategy. Through the above method, by setting target parameter tuning learning parameters based on the target intermediate transformation layer of the virtual machine to be optimized, and determining the object to be optimized based on the target intermediate transformation layer execution operation, and then optimizing the object to be optimized from the perspective of the target parameter tuning learning parameters using a target automatic parameter tuning strategy, a virtual machine that fits the specified input model can be obtained, thereby effectively improving the efficiency of tensor data inference.
[0102] It should be noted that the workflow described above is merely illustrative and does not limit the scope of protection of this invention. In practical applications, those skilled in the art can select some or all of the workflow to achieve the purpose of this embodiment according to actual needs, and no restrictions are imposed here.
[0103] In addition, for technical details not described in detail in this embodiment, please refer to the virtual machine optimization method based on tensor data computation and inference provided in any embodiment of the present invention, which will not be repeated here.
[0104] In one embodiment, the acquisition module 10 is further configured to acquire the design structure information of the virtual machine to be optimized, obtain a corresponding layer set based on the design structure information, select a target intermediate conversion layer in the layer set according to the layer characteristics, acquire the process of generating a target self-decoder from the target unified format intermediate representation based on the target intermediate conversion layer, and obtain the target intermediate conversion layer execution operation based on the process of generating a target self-decoder from the target unified format intermediate representation.
[0105] In one embodiment, the target parameter tuning learning parameters include a parameter tuning learning type, a target learning object, and a target parameter tuning count, and the setting module 20 is further configured to: obtain the number of data exchanges and the data decomposition performance for generating the target self-decoding; set the parameter tuning learning type and the target learning object based on the number of data exchanges and the data decomposition performance; obtain the target range of parameter settings based on the target intermediate conversion layer, and obtain the maximum and minimum values of parameter settings based on the target range; and set the target parameter tuning count based on the maximum and minimum values of parameter settings.
[0106] In one embodiment, the determining module 30 is further configured to: obtain the execution node of the target intermediate conversion layer based on the target intermediate conversion layer execution operation; obtain the data optimization strategies set in each layer of the target intermediate conversion layer; obtain the execution operations of each layer based on the execution node and the data optimization strategies of each layer; and determine the object to be optimized based on the execution operations of each layer.
[0107] In one embodiment, the determining module 30 is further configured to obtain a computation graph of a specified input model based on the operations performed at each layer; obtain memory loading data for each operation based on the computation graph of the specified input model; obtain multi-layer operation interaction data based on the memory loading data; and take the multi-layer operations of the multi-layer operation interaction data as objects to be optimized.
[0108] In one embodiment, the determining module 30 is further configured to obtain intermediate representations of the unified format to be processed based on the operations performed at each layer; construct a target feature matrix based on the intermediate representations of the unified format to be processed; obtain the matrix computation time and matrix computation resources occupied based on the target feature matrix; and when the matrix computation time is greater than a preset time threshold and / or the matrix computation resources occupied are greater than a preset storage resource threshold, use the matrix computation time and the matrix computation resources occupied as objects to be optimized.
[0109] In one embodiment, the optimization module 40 is further configured to: set a corresponding initial optimization strategy according to the object to be optimized; adjust the initial optimization strategy according to the target parameter tuning learning parameters; obtain the parameter optimization range according to the adjusted initial optimization strategy; and perform parameter tuning optimization on the object to be optimized within the parameter optimization range through the target automated parameter tuning strategy to obtain a virtual machine that fits the specified input model.
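Tuning within a parameter optimization range under a fixed trial budget can be sketched as below. Random search is used here purely as a stand-in, since the text does not specify the search method of the target automatic parameter tuning strategy; `measure`, `low`, `high`, and `trials` are hypothetical names for the cost measurement, the range bounds, and the tuning count.

```python
import random
from typing import Callable, Tuple


def tune_within_range(measure: Callable[[int], float],
                      low: int,
                      high: int,
                      trials: int,
                      seed: int = 0) -> Tuple[int, float]:
    """Sample `trials` integer candidates in [low, high]; keep the cheapest."""
    if low > high or trials < 1:
        raise ValueError("need low <= high and at least one trial")
    rng = random.Random(seed)          # seeded so runs are reproducible
    best_value, best_cost = low, float("inf")
    for _ in range(trials):
        candidate = rng.randint(low, high)   # both bounds inclusive
        cost = measure(candidate)
        if cost < best_cost:
            best_value, best_cost = candidate, cost
    return best_value, best_cost
```

A narrower range or a larger trial budget raises the chance of landing on the best value; how the range and budget are derived from the adjusted initial optimization strategy is left to the embodiment.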
[0110] For other embodiments or implementations of the virtual machine optimization device based on tensor data computation and inference described in this invention, refer to the above-described method embodiments; they will not be repeated here.
[0111] Furthermore, it should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.
[0112] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.
[0113] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as read-only memory (ROM) / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, all-in-one platform workstation, or network device, etc.) to execute the methods described in the various embodiments of the present invention.
[0114] The above are merely preferred embodiments of the present invention and do not limit the scope of the patent. Any equivalent structural or procedural transformations made based on the description and drawings of the present invention, or direct or indirect applications in other related technical fields, are similarly included within the scope of patent protection of the present invention.
Claims
1. A virtual machine optimization method based on tensor data computation and inference, characterized in that the virtual machine optimization method based on tensor data computation and inference includes the following steps: obtaining the design structure information of the virtual machine to be optimized, and obtaining the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information; setting target parameter tuning learning parameters according to the target intermediate transformation layer, wherein the target parameter tuning learning parameters include a parameter tuning learning type, a target learning object, and a target parameter tuning count, and the parameter tuning learning type is the problem type of parameter tuning learning; wherein the step of setting target parameter tuning learning parameters according to the target intermediate transformation layer includes: obtaining the number of data exchanges and the data decomposition performance for generating the target self-decoding; setting the parameter tuning learning type and the target learning object according to the number of data exchanges and the data decomposition performance; obtaining the target range of parameter settings based on the target intermediate transformation layer, and obtaining the maximum and minimum values of parameter settings based on the target range; and setting the target parameter tuning count according to the maximum and minimum values of parameter settings; determining the object to be optimized based on the target intermediate transformation layer execution operation; and performing parameter tuning optimization on the object to be optimized based on the target parameter tuning learning parameters through the target automatic parameter tuning strategy.
2. The virtual machine optimization method based on tensor data computation and inference as described in claim 1, characterized in that the step of obtaining the design structure information of the virtual machine to be optimized, and obtaining the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information, includes: obtaining the design structure information of the virtual machine to be optimized, and obtaining the corresponding layer set based on the design structure information; selecting the target intermediate transformation layer from the layer set according to the layer characteristics; obtaining, according to the target intermediate transformation layer, the process of generating the target self-decoding from the target unified format intermediate representation; and obtaining the target intermediate transformation layer execution operation according to the process of generating the target self-decoding from the target unified format intermediate representation.
3. The virtual machine optimization method based on tensor data computation and inference as described in claim 1, characterized in that the step of determining the object to be optimized based on the target intermediate transformation layer execution operation includes: obtaining the execution node of the target intermediate transformation layer based on the target intermediate transformation layer execution operation; obtaining the data optimization strategies set in each layer of the target intermediate transformation layer; obtaining the execution operations of each layer based on the execution node and the data optimization strategies of each layer; and determining the object to be optimized based on the execution operations of each layer.
4. The virtual machine optimization method based on tensor data computation and inference as described in claim 3, characterized in that the step of determining the object to be optimized based on the execution operations of each layer includes: obtaining the computation graph of the specified input model based on the execution operations of each layer; obtaining the memory loading data of each operation based on the computation graph of the specified input model; obtaining multi-layer operation interaction data based on the memory loading data; and taking the multi-layer operations of the multi-layer operation interaction data as the objects to be optimized.
5. The virtual machine optimization method based on tensor data computation and inference as described in claim 3, characterized in that the step of determining the object to be optimized based on the execution operations of each layer includes: obtaining the intermediate representation of the unified format to be processed based on the execution operations of each layer; constructing the target feature matrix based on the intermediate representation of the unified format to be processed; obtaining the matrix computation time and the matrix computation resources occupied based on the target feature matrix; and when the matrix computation time is greater than a preset time threshold and/or the matrix computation resources occupied are greater than a preset storage resource threshold, taking the matrix computation time and the matrix computation resources occupied as objects to be optimized.
6. The virtual machine optimization method based on tensor data computation and inference as described in claim 1, characterized in that the step of performing parameter tuning optimization on the object to be optimized based on the target parameter tuning learning parameters through the target automatic parameter tuning strategy includes: setting a corresponding initial optimization strategy according to the object to be optimized; adjusting the initial optimization strategy according to the target parameter tuning learning parameters; obtaining the parameter optimization range according to the adjusted initial optimization strategy; and performing parameter tuning optimization on the object to be optimized within the parameter optimization range through the target automatic parameter tuning strategy to obtain a virtual machine that fits the specified input model.
7. A virtual machine optimization device based on tensor data computation and inference, characterized in that, The virtual machine optimization device based on tensor data computation and inference includes: The acquisition module is used to acquire the design structure information of the virtual machine to be optimized, and to obtain the target intermediate transformation layer and the target intermediate transformation layer execution operation based on the design structure information. The setting module is used to set the target parameter tuning learning parameters according to the target intermediate conversion layer; The setting module is further configured to: acquire the number of data exchanges and data decomposition performance for generating the target self-decoding; set the parameter tuning learning type and target learning object based on the number of data exchanges and the data decomposition performance; obtain the target range of parameter settings based on the target intermediate transformation layer, and obtain the maximum and minimum values of parameter settings based on the target range; set the target number of parameter tunings based on the maximum and minimum values of parameter settings; the target parameter tuning learning parameters include the parameter tuning learning type, the target learning object, and the target number of parameter tunings, wherein the parameter tuning learning type is the problem type of parameter tuning learning; The determination module is used to determine the object to be optimized based on the operation performed by the target intermediate conversion layer; The optimization module is used to perform parameter tuning optimization on the object to be optimized based on the target parameter tuning learning parameters using a target automatic parameter tuning strategy.
8. A virtual machine optimization device based on tensor data computation and inference, characterized in that, The virtual machine optimization device based on tensor data computation and inference includes: a memory, a processor, and a virtual machine optimization program based on tensor data computation and inference stored in the memory and executable on the processor. The virtual machine optimization program based on tensor data computation and inference is configured to implement the virtual machine optimization method based on tensor data computation and inference as described in any one of claims 1 to 6.
9. A storage medium, characterized in that, The storage medium stores a virtual machine optimization program based on tensor data computation and inference. When the virtual machine optimization program based on tensor data computation and inference is executed by the processor, it implements the virtual machine optimization method based on tensor data computation and inference as described in any one of claims 1 to 6.