Automatic driving model deployment method, device and equipment and storage medium

By constructing a search space and matching association operators, the system automatically adapts to different hardware platforms, solving the problem of low deployment efficiency of autonomous driving algorithm models across different hardware platforms. This enables rapid optimization and deployment, reduces development difficulty, and meets the needs of OEMs.

CN116069340B · Active · Publication Date: 2026-06-26 · GUOKE FOUNDATION STONE (CHONGQING) SOFTWARE CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: GUOKE FOUNDATION STONE (CHONGQING) SOFTWARE CO LTD
Filing Date: 2022-09-14
Publication Date: 2026-06-26

Application Information

Patent Timeline

14 Sep 2022: Application
26 Jun 2026: Publication, CN116069340B

IPC: G06F8/60; G06F16/901; G06F18/22; G06N3/0464

AI Tagging

Technology Topics

Simulation; Deployment algorithm

Technical Efficacy Phrases

Improve model deployment efficiency; Reduce development difficulty


AI Technical Summary

Technical Problem

In existing technologies, the deployment efficiency of autonomous driving algorithm models across different hardware platforms is low, relying on algorithm deployment engineers' in-depth understanding of the characteristics of the hardware platforms, resulting in high development difficulty and long development cycles.

Method used

The configuration file of the hardware platform to be deployed is obtained, and a search space is constructed from the association operators that match its hardware information. The similarity between each computation node and the association operators is determined, candidate operators are selected and combined into candidate deployment algorithm models, and the target deployment algorithm model is screened out using a test dataset.

Benefits of technology

It enables rapid optimization and deployment of autonomous driving algorithm models on different hardware platforms, reduces development difficulty and cycle, improves model deployment efficiency, and meets the needs of OEMs.

✦ Generated by Eureka AI based on patent content.


Patent Text Reader

Abstract

The present disclosure relates to an autonomous driving model deployment method, apparatus, device, and storage medium. A configuration file of a hardware platform to be deployed is obtained and a search space is constructed. For each computation node of an algorithm model to be deployed, the similarity to the corresponding association operators in the search space is determined, and candidate operators are determined according to the similarity. A candidate operator is selected for each computation node, and the selected operators are combined according to the computation node relationship graph to obtain a candidate deployment algorithm model. For a hardware platform on which an autonomous driving algorithm model needs to be deployed, a model adapted to that platform can thus be output automatically according to its hardware information. This addresses the problem that current model deployment work demands a high level of professional ability and experience from algorithm deployment engineers, is inefficient, and has a long cycle. The workload of autonomous driving algorithm model deployment during OEM vehicle development is greatly reduced, and model deployment efficiency is improved.


Description

Technical Field

[0001] This disclosure relates to the field of autonomous driving, and in particular to an autonomous driving model deployment method, apparatus, device, and storage medium.

Background Technology

[0002] In related technologies, vehicles, originally intended as modes of transportation, are gradually evolving towards intelligentization. Concepts such as "intelligent cockpit" and "autonomous driving" have been proposed and are becoming increasingly familiar to the public. Previously, when purchasing a vehicle, consumers often considered its hardware capabilities, such as power, chassis, suspension, electronic control systems, and comfort. Now, a vehicle's level of intelligence has become a crucial factor for consumers. Therefore, to better meet consumer needs and enhance product competitiveness and market share, vehicle manufacturers are developing their own autonomous driving technologies in an effort to gain consumer acceptance.

[0003] When developing autonomous driving solutions for vehicles, OEMs typically select different hardware platforms to deploy their autonomous driving algorithms based on their own needs (such as supply chain and cost-effectiveness). Because the architecture and hardware conditions of different platforms vary, the supported algorithm models may differ. Currently, to ensure that autonomous driving algorithm models are well-adapted to the hardware and achieve good deployment results, algorithm deployment engineers need to be proficient in the characteristics and advantages of the hardware platform itself, and then optimize the algorithm model accordingly and deploy it to the specific hardware platform. This is very time-consuming and labor-intensive, resulting in low model deployment efficiency. Therefore, there is an urgent need for a solution that can automatically adapt autonomous driving algorithm models to different hardware platforms, thereby reducing development difficulty and improving model deployment efficiency.

Summary of the Invention

[0004] To overcome the problems existing in related technologies, this disclosure provides an autonomous driving model deployment method, apparatus, device and storage medium.

[0005] According to a first aspect of the present disclosure, an autonomous driving model deployment method is provided, comprising:

[0006] Obtain the configuration file of the hardware platform to be deployed, the configuration file including hardware information;

[0007] Based on the hardware information, at least one matching association operator is obtained; based on the association operator, a search space is constructed;

[0008] Obtain the computation node relationship graph of the algorithm model to be deployed; the algorithm model to be deployed is composed of a combination of several computation nodes and is used to realize autonomous driving of vehicles.

[0009] For each computation node in the algorithm model to be deployed, determine the similarity between the computation node and the corresponding association operator in the search space;

[0010] The association operators that meet the preset similarity conditions are selected as candidate operators for the corresponding computation nodes;

[0011] Candidate operators corresponding to the computing nodes are selected and combined according to the computing node relationship graph of the algorithm model to be deployed to obtain at least one candidate deployment algorithm model.

[0012] Optionally, the autonomous driving model deployment method further includes:

[0013] After obtaining at least one candidate deployment algorithm model, each candidate deployment algorithm model is tested using a test dataset to obtain test results, and the target deployment algorithm model is determined based on the test results.

[0014] Optionally, determining the target deployment algorithm model based on the test results includes:

[0015] The test results include the test metric values of the candidate deployment algorithm model;

[0016] Based on the test index values and expected target values of each candidate deployment algorithm model, at least one candidate deployment algorithm model that meets the preset deployment requirements is selected as the target deployment algorithm model.

[0017] Optionally, the test metric values include at least one of accuracy, frame rate, time delay, and power consumption.

[0018] Optionally, the hardware information includes first parameter information;

[0019] The step of obtaining at least one matching association operator based on the hardware information, and constructing the search space based on the association operator, includes:

[0020] The first association operator is matched based on the first parameter information, and the search space is constructed based on the set of matched first association operators.

[0021] Optionally, the first parameter information includes at least one of the following: architecture type, clock speed, number of cores, and instruction set.

[0022] Optionally, the hardware information further includes second parameter information, and the model deployment method further includes:

[0023] The second association operator is matched based on the second parameter information, and the search space is constructed by taking the intersection between the set of the second association operators and the set of the first association operators.

[0024] Optionally, determining the similarity between the computation node and the corresponding association operator in the search space includes:

[0025] The first parameter of the computing node corresponds to the second parameter of the association operator;

[0026] The similarity between the first parameter and the second parameter is calculated and used as the similarity between the computing node and the association operator.

[0027] According to a second aspect of the present disclosure, an autonomous driving model deployment apparatus is provided, comprising:

[0028] The first acquisition module is used to acquire the configuration file of the hardware platform to be deployed, the configuration file including hardware information;

[0029] The association search module is used to obtain at least one matching association operator based on the hardware information, and to construct a search space based on the association operator;

[0030] The second acquisition module is used to acquire the computation node relationship graph of the algorithm model to be deployed; the algorithm model to be deployed is composed of a combination of several computation nodes and is used to realize autonomous driving of vehicles.

[0031] The calculation and filtering module is used to determine the similarity between the calculation node and the corresponding association operator in the search space for each calculation node of the algorithm model to be deployed; and select the association operator whose similarity meets the preset conditions as the candidate operator for the corresponding calculation node.

[0032] The model generation module is used to select candidate operators corresponding to the computing nodes and combine them according to the computing node relationship graph of the algorithm model to be deployed to obtain at least one candidate deployment algorithm model.

[0033] According to a third aspect of the present disclosure, an electronic device is provided, comprising: a processor, a memory, and a communication bus;

[0034] The communication bus is used to enable communication between the processor and the memory;

[0035] The processor is used to execute one or more programs stored in the memory to implement the steps of the autonomous driving model deployment method provided in the first aspect of this disclosure.

[0036] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, the computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the autonomous driving model deployment method provided in the first aspect of the present disclosure.

[0037] The technical solutions provided by the embodiments of this disclosure may include the following beneficial effects: Since there are differences in architecture, instruction sets and other related hardware aspects between different hardware platforms, this solution can automatically output an autonomous driving algorithm model adapted to the corresponding hardware platform based on its hardware information, thereby improving the model deployment efficiency of the hardware platform and reducing the difficulty of vehicle development for OEMs.

[0038] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.

Description of the Drawings

[0039] The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure.

[0040] Figure 1 is a flowchart illustrating a model deployment method according to an exemplary embodiment.

[0041] Figure 2 is a schematic diagram illustrating a computation node relationship graph of a model according to an exemplary embodiment.

[0042] Figure 3 is a flowchart illustrating a target deployment algorithm model selection method according to an exemplary embodiment.

[0043] Figure 4 is a flowchart illustrating a search space construction method according to an exemplary embodiment.

[0044] Figure 5 is a flowchart illustrating a search space filtering and optimization method according to an exemplary embodiment.

[0045] Figure 6 is a schematic diagram illustrating a computation node relationship graph of another model according to an exemplary embodiment.

[0046] Figure 7 is a block diagram illustrating a model deployment apparatus according to an exemplary embodiment.

[0047] Figure 8 is a block diagram illustrating another model deployment apparatus according to an exemplary embodiment.

[0048] Figure 9 is a block diagram illustrating yet another model deployment apparatus according to an exemplary embodiment.

[0049] Figure 10 is a block diagram illustrating an electronic device according to an exemplary embodiment.

Detailed Description

[0050] The exemplary embodiments will now be described in detail with reference to the accompanying drawings.

[0051] It should be noted that the relevant embodiments and accompanying drawings are only for describing and illustrating exemplary embodiments provided by this disclosure, and not all embodiments of this disclosure, nor should this disclosure be understood to be limited to the relevant exemplary embodiments.

[0052] It should be noted that the terms "first," "second," etc., used in this disclosure are only used to distinguish different steps, devices, or modules. These terms do not represent any specific technical meaning, nor do they indicate any order or interdependence between them.

[0053] It should be noted that the terms “a,” “a plurality of,” and “at least one” used in this disclosure are illustrative rather than restrictive. Unless otherwise expressly indicated in the context, they should be understood as “one or more.”

[0054] It should be noted that the term "and / or" used in this disclosure is used to describe the relationship between related objects, and generally indicates that there are at least three relationships. For example, A and / or B can at least indicate: the existence of A alone, the existence of both A and B, and the existence of B alone.

[0055] It should be noted that the various steps described in the method embodiments of this disclosure may be performed in different orders and / or in parallel. Unless otherwise specified, the scope of this disclosure is not limited by the order in which the steps are described in the relevant embodiments.

[0056] It should be noted that all actions involving the acquisition of signals, information, or data in this disclosure are carried out in compliance with the relevant data protection laws and policies of the country where the location is situated, and with authorization from the owner of the relevant device.

[0057] Exemplary Method 1

[0058] Figure 1 is a flowchart illustrating a model deployment method according to an exemplary embodiment. As shown in Figure 1, the model deployment method is mainly applied to a computer device or server (hereinafter referred to as the model deployment platform) and automatically adapts the algorithm model to the hardware platform on which autonomous driving algorithms need to be deployed. It includes the following steps.

[0059] In step S110, the configuration file of the hardware platform to be deployed is obtained, and the configuration file includes hardware information.

[0060] The hardware platform to be deployed refers to the platform where autonomous driving algorithms need to be deployed, including but not limited to the vehicle side, the server side used by OEMs to provide autonomous driving services for vehicles, and other hardware platforms that need to realize / assist in realizing autonomous driving.

[0061] The model deployment platform communicates with the hardware platform to be deployed, including but not limited to wired and wireless communication methods, such as 4G (4th generation Mobile Communication Technology) and 5G (5th generation Mobile Communication Technology), to obtain the configuration file.

[0062] The configuration file must contain at least the hardware information for the corresponding hardware platform to be deployed, and the configuration file should be recognizable and readable by the model deployment platform. This hardware information includes, but is not limited to, processor information, memory information, etc., and can be the specific model number of the hardware or specific parameters.
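As a minimal sketch of step S110, the following assumes the configuration file is JSON. The disclosure does not specify a file format, so the field names (`processor`, `architecture`, `clock_ghz`, `cores`, `instruction_sets`, `memory`, `size_gb`) and the `HardwareInfo` record are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareInfo:
    """Hypothetical container for the hardware information in a configuration file."""
    architecture: str
    clock_ghz: float
    cores: int
    instruction_sets: frozenset
    memory_gb: float


def load_hardware_info(config_text: str) -> HardwareInfo:
    """Parse a JSON configuration document into a HardwareInfo record."""
    raw = json.loads(config_text)
    cpu = raw["processor"]
    return HardwareInfo(
        architecture=cpu["architecture"],
        clock_ghz=float(cpu["clock_ghz"]),
        cores=int(cpu["cores"]),
        instruction_sets=frozenset(cpu["instruction_sets"]),
        memory_gb=float(raw["memory"]["size_gb"]),
    )


# Illustrative configuration only; the values do not describe a real platform.
EXAMPLE_CONFIG = """
{
  "processor": {"architecture": "ARM", "clock_ghz": 2.0, "cores": 8,
                "instruction_sets": ["NEON"]},
  "memory": {"size_gb": 16}
}
"""

hw = load_hardware_info(EXAMPLE_CONFIG)
print(hw)
```

A configuration that gives only a hardware model number, which the text also allows, would need an additional lookup step that is not shown here.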

[0063] In step S120, at least one matching association operator is obtained based on the hardware information; and a search space is constructed based on the association operator.

[0064] The process involves associating and matching hardware information with operators in the operator database to obtain associated operators that match the hardware information. This set of associated operators serves as the search space. In other words, associated operators are operators optimized for the hardware and capable of matching relevant hardware performance.

[0065] It should be understood that the operator database can be any existing operator database, and this disclosure does not impose any restrictions. For example, it could be a set of operators optimized for specific hardware. The operator database contains a large number of implemented operators, each with descriptive information corresponding to the hardware information, describing the operator's limitations or scope of application with respect to that hardware information. Therefore, matching association operators can be obtained by association search based on the hardware information of the hardware platform to be deployed. An operator is not necessarily applicable to all hardware. This example obtains the association operators that meet the usage requirements and limitations based on the hardware information of the hardware platform to be deployed, and then constructs a search space from the set of association operators.
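The association search in step S120 can be sketched as a filter over an operator database. The sketch below assumes each operator's descriptive information lists the architectures it supports and the instruction sets it requires; the `Operator` record, the `build_search_space` function, and the three database entries are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical operator record with its hardware applicability description."""
    name: str
    kind: str                  # e.g. "conv", "pool", "fc"
    architectures: frozenset   # architectures the operator supports
    required_isa: frozenset    # instruction sets the operator needs


def build_search_space(operator_db, architecture, instruction_sets):
    """Return the operators whose hardware constraints the platform satisfies."""
    isa = frozenset(instruction_sets)
    return [
        op for op in operator_db
        if architecture in op.architectures and op.required_isa <= isa
    ]


# Illustrative database entries only.
OPERATOR_DB = [
    Operator("conv_neon", "conv", frozenset({"ARM"}), frozenset({"NEON"})),
    Operator("conv_avx2", "conv", frozenset({"x86"}), frozenset({"AVX2"})),
    Operator("pool_generic", "pool", frozenset({"ARM", "x86"}), frozenset()),
]

search_space = build_search_space(OPERATOR_DB, "ARM", {"NEON"})
print([op.name for op in search_space])  # ['conv_neon', 'pool_generic']
```

The x86-only convolution is excluded because its description does not cover the target architecture, which mirrors the statement that an operator is not necessarily applicable to all hardware.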

[0066] In step S130, the computation node relationship graph of the algorithm model to be deployed is obtained; the algorithm model to be deployed is composed of a combination of several computation nodes and is used to realize autonomous driving of vehicles.

[0067] The algorithm model to be deployed can be any existing neural network model applicable to the field of autonomous driving, including but not limited to a Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM) network, Support Vector Machine (SVM), and so on. OEMs can choose flexibly based on factors such as the architecture and performance of their own hardware platform, and this disclosure does not impose any restrictions on this.

[0068] The algorithm model to be deployed can serve as a carrier for autonomous driving algorithms, thereby helping vehicles achieve autonomous driving. The autonomous driving algorithms include, but are not limited to, visual perception algorithms, radar perception algorithms, planning algorithms, etc., and this disclosure does not impose any restrictions on them.

[0069] It should be understood that the algorithm model to be deployed can be viewed as being composed of individual computing nodes, which we can call operators (OPs). In the network model, operators correspond to the computational logic in the layers. For example, a convolutional layer is an operator; a pooling layer is an operator; and the weight summation process in a fully-connected layer (FC layer) is an operator.

[0070] The computation node relationship graph can be used to describe the model's computational inference process, containing the input-output relationships between all computation nodes of the model.

[0071] For ease of understanding, this example provides a feasible computation node relationship graph for an algorithm model to be deployed. It should be understood that this solution can be fully applied to other algorithm models to be deployed. Please refer to Figure 2. Assuming the algorithm model to be deployed contains 3 computing nodes, namely a convolutional layer (Conv), a pooling layer (Pool), and a fully connected layer (FC), the input data x first undergoes convolution in the computing node Conv; the convolution result x1 is then input into the computing node Pool for pooling; the pooling result x2 is then input into the fully connected layer FC for weight summation; and finally the inference result y is output.
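One way to hold the Conv, Pool, FC example in memory is a mapping from each computation node to the nodes whose output it consumes. This representation is an assumption made for illustration, since the disclosure does not prescribe a data structure. The sketch uses the standard-library `graphlib` module (Python 3.9 or later) to recover an order in which the nodes can be evaluated.

```python
from graphlib import TopologicalSorter

# Each key is a computation node; the value is the set of nodes it takes input from.
node_graph = {
    "Conv": set(),      # consumes the model input x
    "Pool": {"Conv"},   # consumes x1, the convolution result
    "FC": {"Pool"},     # consumes x2, the pooling result, and produces y
}

# static_order() yields every node after all of its predecessors.
execution_order = list(TopologicalSorter(node_graph).static_order())
print(execution_order)  # ['Conv', 'Pool', 'FC']
```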

[0072] In step S140, for each computation node in the algorithm model to be deployed, the similarity between the computation node and the corresponding association operator in the search space is determined.

[0073] For example, the similarity between the first parameter of a computation node and the second parameter of an association operator is determined as the similarity between the computation node and the association operator, wherein the first parameter of the computation node corresponds to the second parameter of the association operator.

[0074] The algorithm model to be deployed may be of one type or multiple types. The computational nodes in the algorithm model to be deployed can essentially be considered as operators. Therefore, determining the similarity between computational nodes and associated operators in the search space can be done, for example, based on dimensions such as tensor shape and size. Similarity can be calculated using Euclidean distance, average difference, etc., and this disclosure does not impose any restrictions on this. In other examples of this disclosure, any existing method for determining the similarity between computational operators can be used.
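As a hedged illustration of a shape-based comparison, the sketch below computes the Euclidean distance between two tensor shapes and maps it to a score in (0, 1]. The mapping `1 / (1 + distance)` and the handling of shapes with different ranks are assumptions; the disclosure names Euclidean distance and average difference as options but does not fix how a distance becomes a similarity.

```python
import math


def shape_similarity(node_shape, operator_shape):
    """Map the Euclidean distance between two tensor shapes to a score in (0, 1]."""
    if len(node_shape) != len(operator_shape):
        return 0.0  # assumption: shapes of different rank are treated as dissimilar
    distance = math.dist(node_shape, operator_shape)
    return 1.0 / (1.0 + distance)  # assumption: smaller distance gives higher similarity


same = shape_similarity((1, 3, 224, 224), (1, 3, 224, 224))
different = shape_similarity((1, 3, 224, 224), (1, 3, 112, 112))
print(same, different)
```

Identical shapes score 1.0, and the score falls toward 0 as the shapes diverge, so a "greater than threshold" condition can be applied directly to the result.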

[0075] In step S150, association operators that meet the preset similarity conditions are selected as candidate operators for the corresponding calculation nodes.

[0076] Based on the similarity calculation results, it is determined whether the preset conditions are met. If they are met, the corresponding operator is used as a candidate operator for the calculation node; otherwise, no processing is performed.

[0077] The preset conditions can be set flexibly according to the actual situation, for example by setting a similarity threshold and requiring the similarity to be greater than that threshold.
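Step S150 with a threshold condition can be sketched as follows. The operator names, similarity scores, and the threshold of 0.8 are invented for illustration, and `select_candidates` is a hypothetical helper rather than a function named in the disclosure.

```python
def select_candidates(similarities, threshold):
    """Keep, for each computation node, the operators whose similarity exceeds the threshold."""
    return {
        node: [op for op, score in scores.items() if score > threshold]
        for node, scores in similarities.items()
    }


# Illustrative similarity scores between each node and the operators in the search space.
similarities = {
    "Conv": {"Conv1": 0.92, "Conv2": 0.85, "Conv3": 0.40},
    "Pool": {"Pool1": 0.90, "Pool2": 0.81},
}

candidates = select_candidates(similarities, threshold=0.8)
print(candidates)  # {'Conv': ['Conv1', 'Conv2'], 'Pool': ['Pool1', 'Pool2']}
```

Operators that fail the condition are simply left out, matching the "no processing is performed" branch described above.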

[0078] In step S160, candidate operators corresponding to computing nodes are selected and combined according to the computing node relationship graph of the algorithm model to be deployed to obtain at least one candidate deployment algorithm model.

[0079] Each computing node of the algorithm model to be deployed has at least one candidate operator. Each time candidate operators are combined to generate a candidate deployment algorithm model, each computing node of the algorithm model to be deployed needs to select a corresponding candidate operator. Then, the candidate operators selected by each computing node are combined according to the corresponding computing node relationship graph to obtain a candidate deployment algorithm model.

[0080] Assuming the algorithm model to be deployed contains 3 computing nodes, and each computing node has 2 candidate operators, then 8 (2*2*2) candidate deployment algorithm models will be generated.

[0081] Taking the algorithm model to be deployed shown in Figure 2 as an example, assume that its first computing node Conv has two candidate operators, namely Conv1 and Conv2; the second computing node Pool has two candidate operators, namely Pool1 and Pool2; and the third computing node FC has two candidate operators, namely FC1 and FC2. Each computing node selects a corresponding candidate operator, and the selected operators are combined according to the computing node relationship graph of the algorithm model to be deployed, resulting in 8 candidate deployment algorithm models, as shown in Table 1 below:

[0082] Table 1

[0083]

[0084]
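The 2*2*2 enumeration in step S160 can be reproduced with a Cartesian product over the candidate lists. This is only a sketch of the combination step: the order in which the combinations are listed is a property of `itertools.product` and is not claimed to match the numbering used in Table 1.

```python
from itertools import product

# Candidate operators per computation node, as in the Figure 2 example.
candidates = {
    "Conv": ["Conv1", "Conv2"],
    "Pool": ["Pool1", "Pool2"],
    "FC": ["FC1", "FC2"],
}
node_order = ["Conv", "Pool", "FC"]  # order given by the computation node relationship graph

# One candidate deployment model per way of choosing one operator for every node.
candidate_models = [
    dict(zip(node_order, combo))
    for combo in product(*(candidates[node] for node in node_order))
]

print(len(candidate_models))  # 8
for model in candidate_models:
    print(" -> ".join(model[node] for node in node_order))
```

Because the count is the product of the candidate list lengths, it grows quickly with the number of nodes, which is one reason the similarity condition in step S150 matters.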

[0085] It should be understood that the candidate operators corresponding to different computing nodes may be the same or different. For example, the algorithm model to be deployed may require similar operators at different node positions, and ultimately select the same operator from the operator database, that is, the same candidate operator exists. For example, suppose a certain algorithm model to be deployed contains 4 computing nodes, where the first computing node has two candidate operators, Conv11 and Conv12; the second computing node has two candidate operators, Pool11 and Pool12; and the third computing node has Conv11 and Conv14. Here, since the first and third computing nodes both need to perform convolution calculations, the same candidate operator Conv11 is selected.

[0086] Based on the model deployment method provided in this disclosure, candidate deployment algorithm models that can be adapted to the hardware platform to be deployed and the computing node information of the algorithm model to be deployed can be automatically generated. This solves the problem that current model deployment work relies on the algorithm deployment engineer's mastery of the characteristics and advantages of the hardware platform itself, which requires high professional ability and experience from the algorithm deployment engineer, and has low deployment efficiency and long cycle. Based on this disclosure, it can be applied to various hardware platforms to be deployed, so as to automatically realize the rapid optimization and deployment of autonomous driving algorithm models. This is conducive to shortening the development cycle of the whole vehicle for OEMs, reducing development costs, and also conducive to promoting the development of autonomous driving technology, which has important theoretical and practical significance.

[0087] Exemplary Method 2

[0088] Based on the above exemplary method 1, this example further filters the candidate deployment algorithm models to obtain the target deployment algorithm model, thereby better meeting the deployment requirements of the OEM.

[0089] Referring to Figure 3, the screening process specifically includes:

[0090] In step S310, the candidate deployment algorithm model is tested using the test dataset to obtain the test results corresponding to the model.

[0091] The test results should include test metric values for the corresponding candidate deployment algorithm model, including but not limited to at least one value among accuracy, frame rate, latency, and power consumption. It should be understood that the process of obtaining test results by testing the model using a test dataset is not the focus of this invention and can be achieved using any existing method, which will not be elaborated upon here.

[0092] In step S320, the target deployment algorithm model is determined based on the test results.

[0093] Based on the test results, the final target deployment algorithm model is determined, thereby identifying an optimal or more suitable algorithm model for the OEM's needs. Whether it is optimal or more suitable for the OEM's needs depends on the OEM's expectations for model deployment. Therefore, this example obtains the test results of each candidate deployment algorithm model and makes a comprehensive judgment with the OEM's expected target value to select at least one candidate deployment algorithm model that meets the preset deployment requirements as the target deployment algorithm model.

[0094] It should be understood that, when the test metric values are judged against the expected target values, the test metric values of a candidate deployment algorithm model must reach or exceed the expected target values for the model to be selected as the target deployment algorithm model.

[0095] For example, the test accuracy and test frame rate of a candidate deployment algorithm model should be greater than or equal to the expected accuracy and expected frame rate; and the test time delay and test power consumption should be less than the expected time delay and expected power consumption in order to be selected; conversely, if any of the expected target values are not met, the corresponding candidate deployment algorithm model will be eliminated.

[0096] For candidate deployment algorithm models that meet the expected target value, the candidate deployment algorithm model with the highest accuracy is selected as the target deployment algorithm model and deployed to the hardware platform to achieve autonomous driving.

[0097] It should be understood that the preset deployment requirements can also be flexibly set according to the needs of the OEM. For example, the candidate deployment algorithm model with the highest frame rate can be selected as the target deployment algorithm model, or the candidate deployment algorithm model with the lowest power consumption can be selected as the target deployment algorithm model.
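The screening rule described above can be sketched as a filter followed by a ranking. The metric names (`accuracy`, `fps`, `latency_ms`, `power_w`), the sample test results, and the expected target values are all hypothetical; only the comparison directions (accuracy and frame rate at least the expected value, latency and power consumption below it) follow the text.

```python
def select_target_model(results, expected, rank_key="accuracy"):
    """Drop models that miss any expected target, then return the best by rank_key."""
    def meets(r):
        return (
            r["accuracy"] >= expected["accuracy"]
            and r["fps"] >= expected["fps"]
            and r["latency_ms"] < expected["latency_ms"]
            and r["power_w"] < expected["power_w"]
        )

    passing = [r for r in results if meets(r)]
    if not passing:
        return None  # no candidate satisfies the expected target values
    return max(passing, key=lambda r: r[rank_key])


# Illustrative test results only; not measurements from the disclosure.
results = [
    {"name": "model_1", "accuracy": 0.95, "fps": 32, "latency_ms": 28, "power_w": 14},
    {"name": "model_2", "accuracy": 0.97, "fps": 25, "latency_ms": 35, "power_w": 12},
    {"name": "model_3", "accuracy": 0.93, "fps": 40, "latency_ms": 22, "power_w": 13},
]
expected = {"accuracy": 0.90, "fps": 30, "latency_ms": 30, "power_w": 15}

target = select_target_model(results, expected)
print(target["name"])  # model_1
```

Passing `rank_key="fps"` selects the model with the highest frame rate instead. Ranking by lowest power consumption would need `min` rather than `max` and is left out of this sketch.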

[0098] The model deployment method provided in this disclosure can fully utilize the hardware resources of the hardware platform to be deployed, obtain the optimal deployment solution that meets the needs of the OEM, greatly reduce the deployment cycle of autonomous driving algorithm models for customers on different hardware platforms, and improve development efficiency.

[0099] Exemplary Method 3

[0100] Based on the above example, the hardware information of the hardware platform to be deployed in this example includes the first parameter information, and the search space is constructed based on the first parameter information.

[0101] Referring to Figure 4, the process of constructing the search space includes:

[0102] In step S410, the first association operator is matched according to the first parameter information.

[0103] Based on the first parameter information, an association search is performed with the operators in the operator database to obtain the first association operator that matches the first parameter information.

[0104] It should be understood that the operator database can be any existing operator database, such as a set of operators optimized for specific hardware, and this disclosure does not impose any restrictions. The operator database contains a large number of implemented operators, and each operator has descriptive information corresponding to the first parameter information, which describes the limitations or scope of application of the operator for the corresponding parameter information. Therefore, the matching first associated operator can be obtained by association search based on the first parameter information in the hardware information of the hardware platform to be deployed.

[0105] The first parameter information includes processor parameter information, such as architecture type, clock speed, number of cores, and at least one instruction set.

[0106] The architecture type here refers to the processor architecture, including but not limited to the x86 architecture, the ARM (Advanced RISC Machine) architecture, the MIPS (Microprocessor without Interlocked Pipeline Stages) architecture, and the RISC-V (fifth-generation Reduced Instruction Set Computer) architecture.

[0107] Clock speed, also known as clock frequency, is the operating frequency of the processor core. For example, a clock speed of 2.0 GHz means that the processor generates 2 billion clock pulses per second, with each clock pulse lasting 0.5 nanoseconds.

[0108] The number of cores refers to the physical number of cores present. For example, a dual-core CPU consists of two relatively independent CPU core units, while a quad-core CPU consists of four relatively independent CPU core units.

[0109] An instruction set is the set of instructions a processor uses to perform calculations and control a computer system. Instruction sets are mainly divided into two categories: Reduced Instruction Set Computer (RISC) and Complex Instruction Set Computer (CISC). RISC includes, but is not limited to, ARM, MIPS, and RISC-V, while CISC includes, but is not limited to, x86.

[0110] In some examples of this disclosure, the first association operator that matches the first parameter information can be determined by using a fuzzy search algorithm. In the implemented operator database, operator names are associated with each item of first parameter information (such as processor architecture, clock speed, number of cores, cache size, and instruction set) to obtain the first association operators applicable to the first parameter information; these are put into a cache pool to form the set of first association operators, thereby obtaining the search space.

[0111] By comparing the processor's parameter information with the restrictions or applicable scope of the corresponding operator in the operator database, a match can be determined. For example, if an operator is applicable to processors running on the ARM architecture (assuming no other restrictions), and the first parameter information includes the architecture type as ARM, then this operator can be used as the first associated operator. It should be understood that if any parameter information of the processor does not meet the restrictions or applicable scope of the corresponding operator, then this operator cannot be used as the first associated operator.

[0112] In some specific application scenarios, the hardware platform to be deployed can adopt a heterogeneous architecture of CPU (Central Processing Unit) + XPU. The XPU can be one or more of the following: GPU (Graphics Processing Unit), NPU (Neural Network Processing Unit), ASIC (Application Specific Integrated Circuit), and FPGA (Field Programmable Gate Array). The specific hardware model and heterogeneous architecture chosen for the hardware platform are primarily determined by the OEM's actual needs, such as supply chain and pricing factors. That is, the hardware platform typically includes multiple processors. In this case, for each processor, a first association operator can be obtained based on its first parameter information. Then, a set of first association operators is constructed from the first association operators corresponding to each processor, and this set serves as the search space.

[0113] For example, suppose the hardware platform to be deployed includes processor 1 and processor 2. Based on the parameter information of processor 1, the first association operator that matches is operator m1; based on the parameter information of processor 2, the first association operator that matches is operator m2; then the corresponding set of first association operators is: {m1(processor 1), m2(processor 2)}, and then {m1(processor 1), m2(processor 2)} is used as the search space.

[0114] In step S420, a search space is constructed based on the matching first set of association operators.

[0115] Based on all matching first association operators, construct a set of first association operators, and then use the set of first association operators as the search space.
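Steps S410 and S420 can be sketched as follows. This is a minimal illustration under stated assumptions: the disclosure mentions a fuzzy search over the operator database, whereas this sketch swaps in exact constraint checks for simplicity. The `Processor` and `Operator` structures, their field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    """First parameter information of one processor (hypothetical structure)."""
    name: str
    arch: str
    clock_ghz: float
    cores: int
    instruction_sets: frozenset


@dataclass(frozen=True)
class Operator:
    """An operator plus the restrictions describing where it is applicable."""
    name: str
    archs: frozenset = frozenset()         # empty set means "no architecture restriction"
    min_clock_ghz: float = 0.0
    min_cores: int = 1
    required_isa: frozenset = frozenset()  # instruction sets the operator depends on


def matches(op: Operator, proc: Processor) -> bool:
    """An operator matches only if every one of its restrictions is satisfied."""
    return (
        (not op.archs or proc.arch in op.archs)
        and proc.clock_ghz >= op.min_clock_ghz
        and proc.cores >= op.min_cores
        and op.required_isa <= proc.instruction_sets
    )


def build_search_space(operator_db, processors):
    """S410: match operators per processor. S420: collect the matches as the search space."""
    return {(op.name, p.name) for p in processors for op in operator_db if matches(op, p)}


# Illustrative data mirroring the processor 1 / processor 2 example.
p1 = Processor("processor 1", "ARM", 2.0, 4, frozenset({"NEON"}))
p2 = Processor("processor 2", "x86", 3.0, 8, frozenset({"AVX2"}))
operator_db = [
    Operator("m1", archs=frozenset({"ARM"})),
    Operator("m2", archs=frozenset({"x86"}), required_isa=frozenset({"AVX2"})),
    Operator("m3", min_cores=16),  # matches neither processor
]
search_space = build_search_space(operator_db, [p1, p2])
```

Each element of the search space pairs an operator with the processor that would execute it, which corresponds to the notation m1(processor 1) used in the text.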

[0116] Exemplary Method 4

[0117] Based on the above exemplary method, this example further includes: the hardware information of the hardware platform to be deployed also includes second parameter information, and the search space constructed in exemplary method 3 is filtered and optimized based on the second parameter information. Referring to Figure 5, the process mainly includes:

[0118] In step S510, the second association operator is matched according to the second parameter information in the hardware information of the hardware platform to be deployed.

[0119] In step S520, the intersection between the second set of association operators and the first set of association operators is taken to construct the search space.

[0120] The first association operator in the search space is filtered and updated by using the second parameter information, so that the association operator in the search space not only satisfies the hardware constraints on the first parameter information, but also satisfies the hardware constraints on the second parameter information.

[0121] Based on existing computer system architectures, the processor is one of the main hardware components affecting computer performance. Besides processors, the hardware platform may also include memory, buses, communication modules, and other hardware in specific application scenarios. It should be understood that this disclosure does not limit the specific hardware configuration or heterogeneous architecture of the hardware platform to be deployed; the model deployment method provided in this disclosure is applicable to any existing hardware platform. The more complex the architecture of the hardware platform to be deployed, the more processors it has, the better its performance, and the stronger its compatibility, the larger the search space constructed. This means a greater workload in generating candidate deployment algorithm models and determining the target deployment algorithm model, which inevitably affects deployment efficiency. Furthermore, interactions between processors, and between processors and external systems, typically rely on memory for data access. The deployment efficiency of a candidate deployment algorithm model therefore depends not only on the compatibility between the model and the processors of the hardware platform to be deployed, but also on whether the memory parameters meet the operator applicability requirements. In some examples disclosed herein, the second parameter information includes memory parameter information, including but not limited to memory type, capacity, operating frequency, and read rate. A second association operator is matched based on the memory parameter information, and the intersection of the set of second association operators and the set of first association operators is used to construct the search space. This filters and optimizes the search space constructed based on processor parameter information, improving deployment efficiency while preserving the quality of candidate deployment algorithm models.

[0122] Taking a CPU+ASIC heterogeneous platform as an example, assume its first set of association operators based on processor information is: {a1(CPU), a2(CPU), a3(CPU), a4(CPU), a1(ASIC), a2(ASIC), a5(ASIC)}; and its second set of association operators based on memory parameter information is: {a1, a2, a3, a4}. Taking the intersection of the second set of association operators and the first set of association operators yields: {a1(CPU), a2(CPU), a3(CPU), a4(CPU), a1(ASIC), a2(ASIC)}, which eliminates the association operator a5(ASIC). Since similarity calculation and candidate deployment algorithm model generation are no longer performed for the association operator a5(ASIC), deployment efficiency can be improved. At the same time, the association operator a5 is not suitable for running under the given memory parameters, so the deployment effect of an algorithm model built on this association operator is likely to be poor. Therefore, eliminating this association operator will not lead to the omission of algorithm models with better deployment effect, and thus will not affect the quality of candidate deployment algorithm models.
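The intersection in step S520 can be sketched with the CPU+ASIC example above. In this minimal sketch, the representation of an association operator as an (operator name, processor) pair and the function name `filter_search_space` are assumptions made for illustration; the intersection is taken on the operator name, since the memory-based set is not tied to a processor.

```python
# First set: operators matched per processor, as (operator name, executing processor) pairs.
first_set = {
    ("a1", "CPU"), ("a2", "CPU"), ("a3", "CPU"), ("a4", "CPU"),
    ("a1", "ASIC"), ("a2", "ASIC"), ("a5", "ASIC"),
}
# Second set: operators matched against the memory parameter information.
second_set = {"a1", "a2", "a3", "a4"}


def filter_search_space(first, second):
    """Keep only the operator/processor pairs whose operator also satisfies the memory constraints."""
    return {(op, proc) for (op, proc) in first if op in second}


search_space = filter_search_space(first_set, second_set)
removed = first_set - search_space
```

Here `removed` contains only a5(ASIC), so six association operators remain in the filtered search space.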

[0123] Exemplary Method 5

[0124] Based on the above exemplary method, in order to better meet the OEM's personalized requirements for model deployment, in this example, the configuration file of the hardware platform to be deployed can be provided by the OEM and written in accordance with the fixed format required by the model deployment platform so that it can be obtained and recognized.

[0125] The configuration file can include hardware information about the hardware platform to be deployed and constraint information regarding model deployment. The hardware information primarily describes the characteristics of the hardware platform itself, while the constraint information primarily describes the OEM's deployment requirements / restrictions on the algorithm model.

[0126] Specifically, hardware information includes the processor model and memory information that constitute the hardware platform to be deployed; constraint information includes the algorithm model to be deployed, the test dataset, and the OEM's expected target values for model deployment.

[0127] In some examples disclosed herein, the configuration file format is shown in Table 2 below:

[0128] Table 2 Configuration Files

[0129]

[0130] The processor includes, but is not limited to, CPU, GPU, NPU, ASIC, FPGA, etc.; memory parameter information includes, but is not limited to, memory type, capacity, operating frequency, read speed, etc.

[0131] In some examples disclosed herein, the first parameter information in the hardware information can be obtained based on the corresponding processor model, and the second parameter information can be obtained based on the corresponding memory model. Optionally, based on the processor model provided in the hardware information (such as a specific model of a CPU, GPU, NPU, ASIC, or FPGA), detailed information about the processor, such as processor architecture, clock speed, number of cores, and instruction set, can be queried in the software framework using a lookup table. Similarly, based on the memory model provided in the hardware information, the memory type, capacity, operating frequency, and read speed can also be queried using existing methods.
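A configuration file and the table lookup described above might look like the following sketch. The actual fixed format of Table 2 is not reproduced here, so every key, model name, file path, and parameter value below is a hypothetical placeholder.

```python
# Hypothetical lookup tables mapping a hardware model name to its parameter information.
PROCESSOR_TABLE = {
    "cpu-model-x": {"arch": "ARM", "clock_ghz": 2.0, "cores": 4, "instruction_set": "RISC"},
    "asic-model-y": {"arch": "ASIC", "clock_ghz": 1.0, "cores": 1, "instruction_set": "custom"},
}
MEMORY_TABLE = {
    "mem-model-z": {"type": "LPDDR4", "capacity_gb": 8, "frequency_mhz": 3200, "read_rate_gbps": 25.6},
}

# Hypothetical configuration supplied by the OEM: hardware information plus constraint information.
CONFIG = {
    "hardware": {"processors": ["cpu-model-x", "asic-model-y"], "memory": "mem-model-z"},
    "constraints": {
        "model": "model_to_deploy",       # algorithm model to be deployed
        "test_dataset": "test_dataset",   # dataset used to test candidate models
        "expected": {"accuracy": 0.90, "frame_rate": 30.0, "delay_ms": 50.0, "power_w": 20.0},
    },
}


def resolve_hardware(config):
    """Look up first (processor) and second (memory) parameter information by model name."""
    hardware = config["hardware"]
    first = {model: PROCESSOR_TABLE[model] for model in hardware["processors"]}
    second = MEMORY_TABLE[hardware["memory"]]
    return first, second


first_params, second_params = resolve_hardware(CONFIG)
```

An unknown model name raises `KeyError` in this sketch; a real deployment platform would presumably report an unrecognized configuration more gracefully.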

[0132] In some examples disclosed herein, the first set of associated operators can first be determined based on the processor parameter information of the hardware platform to be deployed. Assuming that the hardware platform to be deployed is a heterogeneous architecture of CPU+ASIC, the first associated operators that match can be obtained based on the parameter information of the CPU and the parameter information of the ASIC chip, thus obtaining the first set of associated operators. Assuming that the first associated operators that match based on the parameter information of the CPU are operators a1, a2, a3, and a4, and the first associated operators that match based on the parameter information of the ASIC are operators a1, a2, and a5, then the corresponding first set of associated operators is: {a1(CPU), a2(CPU), a3(CPU), a4(CPU), a1(ASIC), a2(ASIC), a5(ASIC)}.

[0133] It should be understood that a1(CPU) means that operator a1 is associated with the CPU, and the data processing of operator a1 is executed by the CPU. That is, the execution subject of a1(CPU) is the CPU. This is similar to operator a1(ASIC) in that the operator itself is the same, but the execution subject is different. The data processing of operator a1(ASIC) is executed by ASIC.

[0134] In some examples of this disclosure, when generating candidate deployment algorithm models, operators a1(CPU) and a1(ASIC) are regarded as two different operators, that is, operators a1(CPU) and a1(ASIC) will be used as two different candidate operators to construct different candidate deployment algorithm models.

[0135] Matching association operators separately for each processor helps to obtain more candidate deployment algorithm models that meet the conditions, avoiding the omission of important candidate deployment algorithm model schemes. For example, if the first set of association operators were instead {a1(CPU), a2(CPU), a1(ASIC), a2(ASIC)}, some candidate deployment algorithm model schemes would be omitted. It also keeps unnecessary association operators out of the search space, where they would affect deployment efficiency. For example, if the first set of association operators were instead {a1(CPU), a2(CPU), a3(CPU), a4(CPU), a5(CPU), a1(ASIC), a2(ASIC), a3(ASIC), a4(ASIC), a5(ASIC)}, deployment efficiency would be affected.

[0136] Exemplary Method 6

[0137] Based on the exemplary methods described above, this example provides a method for generating candidate deployment algorithm models based on candidate operators.

[0138] Referring to Figure 6, which shows a computation node relationship graph for an optional algorithm model to be deployed, the algorithm model has four computation nodes, assumed to be node 31, node 32, node 33, and node 34. The search space is {a1(CPU), a2(CPU), a3(CPU), a4(CPU), a1(ASIC), a2(ASIC)}. Calculation shows that the candidate operator for node 31 is a3, for node 32 it is a4, for node 33 it is a1, and for node 34 it is a2. Since candidate operator a1 has two execution schemes (a1(CPU) and a1(ASIC)) and candidate operator a2 also has two execution schemes (a2(CPU) and a2(ASIC)), four candidate deployment algorithm models can be generated, as shown in Table 3 below.

[0139] Table 3

[0140]

[0141]

[0142] In other words, when the same operator corresponds to different processors, different candidate deployment algorithm models can be generated in some examples disclosed herein. For candidate deployment algorithm models with the same operator but different execution subjects, there may be differences in their deployment effects. Therefore, it is beneficial to obtain a target deployment algorithm model with better deployment effect.
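The enumeration of candidate deployment algorithm models in this example can be sketched as follows. The representation of operators as (operator name, processor) pairs and the function name `enumerate_candidates` are assumptions made for illustration; the node numbers, operators, and search space are taken from the example above.

```python
from itertools import product

# Search space from the example, as (operator name, executing processor) pairs.
search_space = {
    ("a1", "CPU"), ("a2", "CPU"), ("a3", "CPU"), ("a4", "CPU"),
    ("a1", "ASIC"), ("a2", "ASIC"),
}
# Candidate operator chosen for each computation node by the similarity step.
node_candidates = {31: "a3", 32: "a4", 33: "a1", 34: "a2"}


def enumerate_candidates(node_candidates, search_space):
    """Build one candidate model per combination of execution schemes across all nodes."""
    nodes = sorted(node_candidates)
    # For each node, list every execution scheme of its candidate operator.
    options = [
        sorted(pair for pair in search_space if pair[0] == node_candidates[node])
        for node in nodes
    ]
    # The Cartesian product assigns exactly one execution scheme to every node.
    return [dict(zip(nodes, combo)) for combo in product(*options)]


models = enumerate_candidates(node_candidates, search_space)
for model in models:
    print({node: f"{op}({proc})" for node, (op, proc) in model.items()})
```

Nodes 31 and 32 each have a single execution scheme and nodes 33 and 34 each have two, so the product yields 1 x 1 x 2 x 2 = 4 candidate deployment algorithm models.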

[0143] Exemplary device

[0144] Figure 7 is a block diagram illustrating a model deployment apparatus according to an exemplary embodiment. Referring to Figure 7, the device 700 includes a first acquisition module 710, an association search module 720, a second acquisition module 730, a calculation and filtering module 740, and a model generation module 750, wherein:

[0145] The first acquisition module 710 is used to acquire the configuration file of the hardware platform to be deployed, the configuration file including hardware information;

[0146] The association search module 720 is used to obtain at least one matching association operator based on hardware information, and to construct a search space based on the association operator;

[0147] The second acquisition module 730 is used to acquire the computation node relationship graph of the algorithm model to be deployed; the algorithm model to be deployed is composed of a combination of several computation nodes and is used to realize the algorithm model for autonomous driving of vehicles.

[0148] The calculation and filtering module 740 is used to determine, for each computation node of the algorithm model to be deployed, the similarity between the computation node and the corresponding association operators in the search space; and to select association operators whose similarity meets the preset conditions as candidate operators for the corresponding computation node.

[0149] The model generation module 750 is used to select candidate operators corresponding to computing nodes and combine them according to the computing node relationship graph of the algorithm model to be deployed to obtain at least one candidate deployment algorithm model.

[0150] In some examples of this disclosure, the hardware information includes first parameter information, and the association search module 720 is used to match a first association operator based on the first parameter information and construct a search space based on the set of matched first association operators.

[0151] The first parameter information includes at least one of the following: architecture type, clock speed, number of cores, and instruction set.

[0152] In some examples of this disclosure, the hardware information also includes second parameter information. The association search module 720 is used to match a second association operator based on the second parameter information and construct a search space by taking the intersection between the set of second association operators and the set of first association operators. This enables the filtering and updating of the search space, improving model deployment efficiency without adversely affecting the quality of the generated candidate deployment algorithm models.

[0153] In some examples of this disclosure, the first parameter of the computation node corresponds to the second parameter of the association operator; the calculation and filtering module 740 is used to calculate the similarity between the first parameter and the second parameter, which serves as the similarity between the computation node and the association operator. A computation node in the algorithm model to be deployed can essentially be regarded as an operator; therefore, the similarity between a computation node and an association operator in the search space can be determined, for example, based on dimensions such as tensor shape and size. The similarity calculation can employ Euclidean distance, average difference, and so on, and this disclosure does not impose any limitation on this. In other examples of this disclosure, any existing method for calculating the similarity between computation operators can be used.
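One possible similarity calculation based on Euclidean distance between tensor shapes is sketched below. The disclosure does not fix a formula, so the mapping from distance to similarity (1 / (1 + distance)), the threshold value, and the function names are all assumptions made for illustration.

```python
import math


def shape_distance(node_shape, op_shape):
    """Euclidean distance between two tensor shapes; shapes of different rank never match."""
    if len(node_shape) != len(op_shape):
        return math.inf
    return math.dist(node_shape, op_shape)


def similarity(node_shape, op_shape):
    """Map distance to a score in [0, 1]; identical shapes score 1.0 (assumed mapping)."""
    return 1.0 / (1.0 + shape_distance(node_shape, op_shape))


def candidate_operators(node_shape, operators, threshold=0.5):
    """Select operators whose similarity to the node meets the preset condition (assumed threshold)."""
    return [name for name, shape in operators.items() if similarity(node_shape, shape) >= threshold]


# Illustrative operator shapes only.
operators = {
    "a1": (1, 3, 224, 224),
    "a2": (1, 3, 112, 112),
    "a3": (1, 3, 224, 225),
}
selected = candidate_operators((1, 3, 224, 224), operators)
```

With these illustrative shapes, a1 and a3 are selected and a2 is rejected. A production system would likely compare more than shape, for example data type or operator kind.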

[0154] Referring to Figure 8, in some examples of this disclosure, the model deployment apparatus 700 further includes a model testing module 760 and a model screening module 770. After candidate deployment algorithm models are obtained by the model generation module 750, the model testing module 760 is used to test each candidate deployment algorithm model using a test dataset to obtain test results; the model screening module 770 is used to determine the target deployment algorithm model based on the test results.

[0155] In some examples disclosed herein, the test results include test metric values of the candidate deployment algorithm model, such as at least one of accuracy, frame rate, time delay, and power consumption. The model screening module 770 is used to select at least one candidate deployment algorithm model that meets the preset deployment requirements as the target deployment algorithm model based on the test metric values and expected target values of each candidate deployment algorithm model.

[0156] It should be understood that when the model screening module 770 judges the test metric values against the expected target values, the test metric values of a candidate deployment algorithm model should reach or exceed the expected target values in order for the model to be selected as the target deployment algorithm model. For example, the test accuracy and test frame rate of the candidate deployment algorithm model should be greater than or equal to the expected accuracy and expected frame rate; and the test time delay and test power consumption should be less than the expected time delay and expected power consumption in order to be selected. Conversely, if any of the expected target values are not met, the corresponding candidate deployment algorithm model will be eliminated.

[0157] For candidate deployment algorithm models that meet the expected target value, the model screening module 770 can use the candidate deployment algorithm model with the highest accuracy as the target deployment algorithm model to be deployed to the hardware platform to achieve autonomous driving.

[0158] It should be understood that the preset deployment conditions can also be flexibly set according to the needs of the OEM. For example, according to the needs of the OEM, the model screening module 770 can select the candidate deployment algorithm model with the highest frame rate as the target deployment algorithm model, or select the candidate deployment algorithm model with the lowest power consumption as the target deployment algorithm model.

[0159] Referring to Figure 9, in some examples of this disclosure, the model deployment apparatus 700 also includes a model deployment module 780 for automatically deploying the target deployment algorithm model to the hardware platform to be deployed, thereby enabling autonomous driving of the vehicle.

[0160] In some examples disclosed herein, the model deployment module 780 may employ OTA (Over-the-Air Technology) to deploy the model.

[0161] The model deployment apparatus provided in this disclosure can fully utilize the hardware resources of the hardware platform to be deployed, obtain the optimal deployment scheme that meets the needs of the OEM, greatly reduce the deployment cycle of autonomous driving algorithm models for customers on different hardware platforms, and improve development efficiency.

[0162] Exemplary electronic devices

[0163] Figure 10 is a block diagram illustrating an electronic device 100 according to an exemplary embodiment. The electronic device 100 may be a third-party platform, server, computer, or other type of electronic device that provides model deployment services.

[0164] Referring to Figure 10, the electronic device 100 may include at least one processor 110 and a memory 120. The processor 110 can execute instructions stored in the memory 120. The processor 110 is communicatively connected to the memory 120 via a data bus. In addition to the memory 120, the processor 110 can also be communicatively connected to an input device 130, an output device 140, and a communication device 150 via the data bus.

[0165] Processor 110 can be any conventional processor, such as a commercially available CPU. The processor may also include, for example, a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA), a System on Chip (SOC), an Application Specific Integrated Circuit (ASIC), or a combination thereof.

[0166] The memory 120 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk or optical disk.

[0167] In this embodiment of the present disclosure, the memory 120 stores executable instructions, and the processor 110 can read the executable instructions from the memory 120 and execute the instructions to implement all or part of the steps of the autonomous driving model deployment method described in any of the exemplary embodiments above.

[0168] Exemplary computer-readable storage media

[0169] In addition to the methods and apparatus described above, exemplary embodiments of this disclosure may also be a computer program product or a computer-readable storage medium storing the computer program product. The computer program product includes computer program instructions that can be executed by a processor to implement all or part of the steps of the autonomous driving model deployment method described in any of the exemplary embodiments above.

[0170] The computer program product can be written in any combination of one or more programming languages to perform the operations of the embodiments of this application. The programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as C or similar languages, and scripting languages (e.g., Python). The program code can be executed entirely on the user's computing device, partially on the user's device, as a standalone software package, partially on the user's computing device and partially on a remote computing device, or entirely on a remote computing device or server.

[0171] The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of readable storage media include: an electrical connection having one or more wires, static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk or optical disk, or any suitable combination thereof.

[0172] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of this disclosure. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.

[0173] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.

Claims

1. A method for deploying an autonomous driving model, characterized in that it includes: obtaining the configuration file of the hardware platform to be deployed, the configuration file including hardware information, and the hardware information including first parameter information; obtaining at least one matching association operator based on the hardware information; constructing a search space based on the association operator, which includes: matching the first association operator according to the first parameter information, and constructing the search space based on the set of matched first association operators; the hardware information further including second parameter information, matching the second association operator according to the second parameter information, and constructing the search space by taking the intersection between the set of second association operators and the set of first association operators; obtaining the computation node relationship graph of the algorithm model to be deployed, the algorithm model to be deployed being an algorithm model that is composed of a combination of several computation nodes and is used to realize autonomous driving of vehicles; for each computation node in the algorithm model to be deployed, determining the similarity between the computation node and the corresponding association operator in the search space; selecting the association operators whose similarity meets the preset conditions as candidate operators for the corresponding computation nodes; and selecting the candidate operators corresponding to the computation nodes and combining them according to the computation node relationship graph of the algorithm model to be deployed to obtain at least one candidate deployment algorithm model.

2. The autonomous driving model deployment method as described in claim 1, characterized in that the model deployment method also includes: after obtaining at least one candidate deployment algorithm model, testing each candidate deployment algorithm model using a test dataset to obtain test results, and determining the target deployment algorithm model based on the test results.

3. The autonomous driving model deployment method as described in claim 2, characterized in that the step of determining the target deployment algorithm model based on the test results includes: the test results including the test metric values of the candidate deployment algorithm model; and based on the test metric values and expected target values of each candidate deployment algorithm model, selecting at least one candidate deployment algorithm model that meets the preset deployment requirements as the target deployment algorithm model.

4. The autonomous driving model deployment method as described in claim 1, characterized in that the first parameter information includes at least one of the following: architecture type, clock speed, number of cores, and instruction set.

5. The autonomous driving model deployment method according to any one of claims 1-4, characterized in that determining the similarity between the computation node and the corresponding association operator in the search space includes: the first parameter of the computation node corresponding to the second parameter of the association operator; and calculating the similarity between the first parameter and the second parameter and using it as the similarity between the computation node and the association operator.

6. An autonomous driving model deployment device, characterized in that it includes: a first acquisition module, used to acquire the configuration file of the hardware platform to be deployed, the configuration file including hardware information, and the hardware information including first parameter information; an association search module, used to obtain at least one matching association operator based on the hardware information and to construct a search space based on the association operator, which includes: matching the first association operator according to the first parameter information, and constructing the search space based on the set of matched first association operators; the hardware information further including second parameter information, matching the second association operator according to the second parameter information, and constructing the search space by taking the intersection between the set of second association operators and the set of first association operators; a second acquisition module, used to acquire the computation node relationship graph of the algorithm model to be deployed, the algorithm model to be deployed being an algorithm model that is composed of a combination of several computation nodes and is used to realize autonomous driving of vehicles; a calculation and filtering module, used to determine, for each computation node of the algorithm model to be deployed, the similarity between the computation node and the corresponding association operator in the search space, and to select the association operators whose similarity meets the preset conditions as candidate operators for the corresponding computation node; and a model generation module, used to select the candidate operators corresponding to the computation nodes and combine them according to the computation node relationship graph of the algorithm model to be deployed to obtain at least one candidate deployment algorithm model.

7. An electronic device, characterized in that the electronic device includes a processor, a memory, and a communication bus; the communication bus is used to enable communication between the processor and the memory; and the processor is used to execute one or more programs stored in the memory to implement the steps of the autonomous driving model deployment method as described in any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, which can be executed by one or more processors to implement the steps of the autonomous driving model deployment method as described in any one of claims 1 to 5.

Citation Information

Patent Citations

CN111966361A
CN113779366A
