A data processing method, apparatus and device

By using encrypted model weights and decryption keys in a data acceleration processing device, the problem of high hardware isolation requirements in existing technologies is solved, and the security protection of model information during runtime is achieved to prevent theft.

CN119378023B · Active · Publication Date: 2026-06-26 · HUAWEI TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HUAWEI TECH CO LTD
Filing Date
2023-07-27
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing technologies for protecting the security of AI models during runtime require hardware isolation technologies such as TEE with high hardware requirements, which are difficult to apply to existing market products. Furthermore, they require significant modifications to the software stack and fail to effectively protect model information.

Method used

In the data acceleration processing device, the model weights are stored in encrypted form in the unprotected storage area and then decrypted in the acceleration processor before being stored in the protected storage area. The general processing device cannot access the plaintext form of the model weights. By combining the use of the decryption key and the signature verification public key, the security of the model weights during transmission and execution is ensured.

Benefits of technology

Without relying on TEE hardware isolation technology, the security of model weights is protected and theft is prevented, thus achieving the security of model information during runtime.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN119378023B_ABST

Abstract

The application provides a data processing method, apparatus, and device, relates to the field of communication technology, and protects model information at runtime without using hardware isolation technology such as a trusted execution environment (TEE). The method is applied to a data acceleration processing device that comprises an acceleration processor and a memory; the memory comprises an unprotected storage area and a protected storage area. The method comprises the following steps: the memory receives model information sent by a general-purpose processing device and stores the model information in the unprotected storage area, which the general-purpose processing device can access, the model information comprising model weights in ciphertext form; the acceleration processor decrypts the ciphertext model weights and stores the resulting plaintext model weights in the protected storage area, which the general-purpose processing device cannot access.

Description

Technical Field

[0001] This application relates to the field of communication technology, and in particular to a data processing method, apparatus and device.

Background Technology

[0002] With the widespread application of artificial intelligence (AI) across various industries, the protection of AI models is becoming increasingly important. AI model protection mainly includes protection in three stages: storage, transmission, and execution. Among these, how to achieve protection during the execution stage has always been a challenging issue for the industry.

[0003] Currently, to protect the security of AI models, electronic devices with confidential computing capabilities are typically used. Taking an electronic device comprising a central processing unit (CPU) and a graphics processing unit (GPU), with the CPU and GPU connected via a peripheral component interconnect express (PCIe) bus, as an example, this requires the CPU to support a trusted execution environment (TEE), the GPU to support multi-instance isolation, and the PCIe bus to support encrypted/decrypted transmission capabilities. This achieves a hardware isolation environment across the CPU and GPU, ensuring the security of the AI models running within it.

[0004] However, this solution places high demands on the hardware of electronic devices, such as requiring the devices to support isolation and encryption/decryption transmission capabilities, making it difficult to apply to existing related products on the market and thus limiting its use.

Summary of the Invention

[0005] This application provides a data processing method, apparatus, and device for protecting the security of model information during runtime without using hardware isolation technologies such as TEE.

[0006] To achieve the above objectives, the embodiments of this application adopt the following technical solutions:

[0007] In a first aspect, a data processing method is provided, applied in a data acceleration processing device (i.e., a device with data acceleration processing capabilities, such as a GPU or NPU), the data acceleration processing device including an acceleration processor and memory, the memory including an unprotected storage area and a protected storage area, the data acceleration processing device being able to communicate with a general-purpose processing device, the method comprising: the memory receiving model information sent by the general-purpose processing device and storing the model information in the unprotected storage area, the general-purpose processing device being able to access the unprotected storage area, the model information including model weights in encrypted form; the acceleration processor decrypting the encrypted model weights and storing the decrypted plaintext model weights in the protected storage area, the general-purpose processing device being unable to access the protected storage area.

[0008] In the above technical solution, the general-purpose processing device can send model information to the data acceleration processing device, and the model weights in the model information are in encrypted form, thereby ensuring the security of the model weights within the general-purpose processing device and during transmission. Furthermore, after decrypting the encrypted model weights, the data acceleration processing device stores the decrypted plaintext model weights in a protected storage area inaccessible to the general-purpose processing device. That is, the general-purpose processing device cannot access the plaintext model weights. Thus, when the data acceleration processing device executes the corresponding computational task based on the plaintext model weights, the security of the model weights during execution is guaranteed. Therefore, compared with related technologies, the embodiments of this application can be used to protect the security of model information during runtime without using hardware isolation technologies such as TEE, thereby preventing the theft of model information.
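The flow described in the first aspect can be sketched as a small simulation. This is an illustrative model only, not the patented implementation: the class names `AcceleratorMemory` and `AccelerationProcessor`, the method names, and the toy cipher `keystream_xor` (a SHA-256 keystream XOR used as a stand-in, because the text does not name a cipher) are all assumptions introduced here.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (SHA-256 keystream XOR). A stand-in only, not a vetted cipher."""
    out = bytearray()
    for block_start in range(0, len(data), 32):
        counter = (block_start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        chunk = data[block_start:block_start + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class AcceleratorMemory:
    """Hypothetical device memory with an unprotected and a protected area."""

    def __init__(self):
        self.unprotected = {}  # reachable by the host and the accelerator
        self._protected = {}   # reachable by the accelerator only

    def host_write(self, name, blob):
        # The general-purpose processor places model information here.
        self.unprotected[name] = blob

    def host_read(self, name, area="unprotected"):
        # Any host access aimed at the protected area is refused.
        if area != "unprotected":
            raise PermissionError("host cannot access the protected area")
        return self.unprotected[name]


class AccelerationProcessor:
    """Hypothetical accelerator that holds the decryption key."""

    def __init__(self, memory, decryption_key):
        self.memory = memory
        self._key = decryption_key

    def load_weights(self, name):
        # Decrypt ciphertext from the unprotected area into the protected area.
        ciphertext = self.memory.unprotected[name]
        self.memory._protected[name] = keystream_xor(self._key, ciphertext)

    def weights(self, name):
        # Plaintext is only ever returned to code running on the accelerator.
        return self.memory._protected[name]
```

In this sketch the host only ever handles ciphertext: it writes the encrypted weights with `host_write`, the accelerator calls `load_weights`, and any host read that targets the protected area raises `PermissionError`.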

[0009] In one possible implementation of the first aspect, the decryption of the encrypted model weights can be executed by a corresponding accelerator or operator within the acceleration processor. For example, the acceleration processor may include a decryption operator AICPU, which can be used to decrypt the encrypted model weights. Furthermore, the data acceleration processing device may also include a TEE, in which the decryption of the encrypted model weights by the acceleration processor can be executed.

[0010] In one possible implementation of the first aspect, the data acceleration processing device further includes persistent memory storing a decryption key; the acceleration processor decrypts the encrypted model weights by: obtaining the decryption key from the persistent memory and decrypting the encrypted model weights according to the decryption key. In the above possible implementation, by storing the decryption key in the persistent memory and decrypting the encrypted model weights according to the decryption key, the security of the model weights is ensured.

[0011] In one possible implementation of the first aspect, the data acceleration processing device further includes a persistent memory storing a signature verification public key, and the model information further includes signature information; the method further includes: the acceleration processor retrieving the signature verification public key from the persistent memory, and determining, based on the signature verification public key, that the signature information of the model information has been successfully verified. In the above possible implementations, by signing and verifying the model information, the integrity of the model information can be guaranteed.
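The verify-before-use step can be illustrated as follows. The text does not name a signature scheme, so this sketch assumes textbook RSA over a SHA-256 digest with a toy modulus built from two Mersenne primes; it is unpadded and far too small for real use. The names `sign`, `verify`, and `accept_model` are hypothetical.

```python
import hashlib

# Toy key material (assumption): two known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, held only by the signer


def digest_int(model_info: bytes) -> int:
    """Reduce the SHA-256 digest of the model information into the range of N."""
    return int.from_bytes(hashlib.sha256(model_info).digest(), "big") % N


def sign(model_info: bytes) -> int:
    """Development-environment side: produce the signature information."""
    return pow(digest_int(model_info), D, N)


def verify(model_info: bytes, signature: int) -> bool:
    """Accelerator side: check the signature with the public key (N, E)."""
    return pow(signature, E, N) == digest_int(model_info)


def accept_model(model_info: bytes, signature: int) -> bytes:
    """Refuse model information whose signature does not verify."""
    if not verify(model_info, signature):
        raise ValueError("signature verification failed")
    return model_info
```

Only the public pair `(N, E)` would need to reside in the persistent memory of the accelerator; the private exponent stays in the environment that produces the signature information.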

[0012] In one possible implementation of the first aspect, the decryption key and / or signature verification public key are stored encrypted. The data acceleration processing device further includes a one-time programmable memory storing a hardware unique key HUK. The method further includes: the acceleration processor retrieving the HUK from the one-time programmable memory and encrypting / decrypting the decryption key and / or signature verification public key according to a derived key of the HUK. In the above possible implementations, by encrypting and storing the decryption key and / or signature verification public key, the confidentiality of the decryption key and / or signature verification public key can be guaranteed, thereby ensuring the security of the model weights.
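A minimal sketch of protecting the stored keys with a key derived from the HUK is given below. The text does not specify the derivation function or the wrapping cipher; this example assumes HKDF with HMAC-SHA256 (RFC 5869) for derivation and a plain XOR with the derived bytes as a stand-in for key wrapping. The function names and the labels passed as `label` are hypothetical.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) extract-and-expand using HMAC-SHA256."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def wrap_key(huk: bytes, key: bytes, label: bytes) -> bytes:
    """Encrypt a stored key under a key derived from the HUK (XOR stand-in)."""
    return xor_bytes(key, hkdf_sha256(huk, label, len(key)))


def unwrap_key(huk: bytes, wrapped: bytes, label: bytes) -> bytes:
    """Recover the stored key; XOR wrapping is its own inverse."""
    return xor_bytes(wrapped, hkdf_sha256(huk, label, len(wrapped)))
```

Using a distinct label per stored item (for example one for the decryption key and one for the signature verification public key) yields independent derived keys from the same HUK, so the HUK itself is never used directly as an encryption key.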

[0013] In one possible implementation of the first aspect, the encrypted model weights and signature information in the model information can be generated in the development environment.

[0014] In one possible implementation of the first aspect, the model weights are stored contiguously in the protected storage area; the method further includes: the accelerator sending the base address of the plaintext model weights in the protected storage area to the general-purpose processing device, the base address being used to determine the logical address of the model weights; the accelerator receiving the logical address of the model weights sent by the general-purpose processing device, and retrieving the plaintext model weights from the protected storage area based on the logical address. In the above possible implementation, the accelerator can send the base address of the plaintext model weights in the protected storage area to the general-purpose processing device, enabling the general-purpose processing device to determine the logical address based on the base address and the address offset of the model weights, and send it to the accelerator. Therefore, when scheduling the data acceleration processing device to execute computational tasks, the general-purpose processing device does not need to read the model weights, but only needs to send the logical address of the model weights, thereby ensuring the security of the model weights.

[0015] In one possible implementation of the first aspect, the method further includes: the accelerator receiving the address offset value of the model weight sent by the general-purpose processing device; the accelerator determining the logical address of the model weight based on the base address of the plaintext model weight in the protected storage area and the address offset value. In the above possible implementation, when scheduling the data acceleration processing device to execute a computational task, the general-purpose processing device does not need to read the model weight, but only needs to send the address offset value of the model weight, thereby ensuring the security of the model weight.
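The two addressing variants above (the general-purpose processing device computing the logical address itself, or sending only an offset) can be sketched as follows. The layout type `WeightSlot`, the class `ProtectedBlock`, and the byte-level addressing are assumptions for illustration; the text only states that the logical address is determined from the base address and the address offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightSlot:
    """Hypothetical layout entry known to the host: where one weight tensor sits."""
    offset: int  # byte offset inside the contiguous plaintext block
    size: int    # length in bytes


class ProtectedBlock:
    """Plaintext weights stored contiguously in the protected storage area."""

    def __init__(self, base_address: int, plaintext: bytes):
        self.base_address = base_address  # may be disclosed to the host
        self._plaintext = plaintext       # never disclosed to the host

    def read_at(self, logical_address: int, size: int) -> bytes:
        """Variant 1: the host sent a logical address it computed itself."""
        start = logical_address - self.base_address
        if start < 0 or start + size > len(self._plaintext):
            raise ValueError("address outside the protected weight block")
        return self._plaintext[start:start + size]

    def read_by_offset(self, offset: int, size: int) -> bytes:
        """Variant 2: the host sent only an offset; the accelerator adds the base."""
        return self.read_at(self.base_address + offset, size)


def host_logical_address(base_address: int, slot: WeightSlot) -> int:
    """Host side of variant 1: base address plus address offset."""
    return base_address + slot.offset
```

In both variants the host handles only addresses or offsets, never weight bytes; the reads are performed on the accelerator side.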

[0016] In one possible implementation of the first aspect, the method further includes: the accelerator receiving memory configuration information and configuring the unprotected storage area and / or the protected storage area in the memory according to the memory configuration information. In the above possible implementations, by configuring a protected storage area in the memory of the data acceleration processing device that is inaccessible to the general-purpose processing device, the security of the model weights can be guaranteed.
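One way to picture the memory configuration step is an address-range check derived from the configuration information. The fields of `MemoryConfig` and the single contiguous protected range are assumptions made for this sketch; the text does not define the format of the memory configuration information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    """Hypothetical memory configuration information."""
    total_size: int       # size of the accelerator memory in bytes
    protected_start: int  # first byte of the protected storage area
    protected_size: int   # length of the protected storage area in bytes


class ConfiguredMemory:
    """Applies a configuration and answers host-access queries."""

    def __init__(self, config: MemoryConfig):
        end = config.protected_start + config.protected_size
        if config.protected_start < 0 or config.protected_size <= 0 or end > config.total_size:
            raise ValueError("protected area must lie inside the memory")
        self.total_size = config.total_size
        self.protected_start = config.protected_start
        self.protected_end = end  # exclusive

    def host_may_access(self, address: int, length: int = 1) -> bool:
        """True only if the whole range is in bounds and avoids the protected area."""
        if address < 0 or length < 1 or address + length > self.total_size:
            return False
        return address + length <= self.protected_start or address >= self.protected_end
```

Everything outside the protected range is treated as the unprotected storage area in this sketch, so a single range is enough to describe both areas.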

[0017] In a second aspect, a data processing method is provided, applied in a general-purpose processing device, which communicates with a data acceleration processing device via a bus. The memory of the data acceleration processing device includes an unprotected storage area accessible by the general-purpose processing device and a protected storage area inaccessible by the general-purpose processing device. The method includes: acquiring model information, the model information including model weights in encrypted form; sending the model information to the acceleration processor, so that the memory of the data acceleration processing device stores the model information in the unprotected storage area; and causing the acceleration processor to store the model weights in plaintext form obtained by decrypting the encrypted model weights in the protected storage area.

[0018] In the above technical solution, the general-purpose processing device can transmit model information to the data acceleration processing device, and the model weights in the model information are in encrypted form, thereby ensuring the security of the model weights within the general-purpose processing device and during transmission. Furthermore, after decrypting the encrypted model weights, the data acceleration processing device stores the decrypted plaintext model weights in a protected storage area inaccessible to the general-purpose processing device. That is, the general-purpose processing device cannot access the plaintext model weights, thus ensuring the security of the model weights during execution when the data acceleration processing device performs the computation task based on the plaintext model weights. Therefore, compared with related technologies, the embodiments of this application can be used to protect the security of model information during runtime without using hardware isolation technologies such as TEE, thereby preventing the theft of model information.

[0019] In one possible implementation of the second aspect, the model information also includes signature information. In the above possible implementations, by signing and verifying the model information, the integrity of the model information can be guaranteed.

[0020] In one possible implementation of the second aspect, the method further includes: the general-purpose processing device receiving the base address of the plaintext model weight in the protected storage area sent by the data acceleration processing device; the general-purpose processing device determining the logical address of the model weight based on the base address and the address offset value of the model weight; and the general-purpose processing device sending the logical address of the model weight to the data acceleration processing device. In the above possible implementation, the general-purpose processing device can receive the base address of the plaintext model weight in the protected storage area sent by the data acceleration processing device, determine the logical address based on the base address and the address offset value of the model weight, and send it to the data acceleration processing device. Therefore, when scheduling the data acceleration processing device to execute a computational task, the general-purpose processing device does not need to read the model weight, but only needs to send the logical address of the model weight, thereby ensuring the security of the model weight.

[0021] In one possible implementation of the second aspect, the method further includes: the general-purpose processing device sending the address offset value of the model weight to the data acceleration processing device, so that the data acceleration processing device determines the logical address of the model weight based on the base address of the plaintext model weight in the protected storage area and the address offset value. In the above possible implementation, when scheduling the data acceleration processing device to execute a computation task, the general-purpose processing device does not need to read the model weight, but only needs to send the address offset value of the model weight, thereby ensuring the security of the model weight.

[0022] In one possible implementation of the second aspect, the method further includes: sending memory configuration information to the data acceleration processing device through the management channel, the memory configuration information being used to configure the unprotected storage area and / or the protected storage area. In the above possible implementations, by configuring a protected storage area in the memory of the data acceleration processing device that is inaccessible to the general-purpose processing device, the security of the model weights can be guaranteed.

[0023] In a third aspect, a data processing method is provided, applied to a data acceleration processing device. The data acceleration processing device includes an accelerator and memory. The memory includes an unprotected storage area and a protected storage area. The unprotected storage area includes model information, which includes model weights in encrypted form. The protected storage area includes model weights in plaintext form after decryption of the encrypted model weights. The method includes: the accelerator acquiring a target task sent by a general-purpose processing device based on the model information; the accelerator acquiring a first logical address of the target weights and reading the target weights from the protected storage area according to the first logical address, wherein the target weights are weights in the model weights required for the execution of the target task, and the general-purpose processing device cannot access the protected storage area; and the accelerator executing the target task according to the target weights.

[0024] In the above technical solution, the data acceleration processing device stores model information, including model weights in encrypted form, in an unprotected storage area, and stores the corresponding decrypted plaintext model weights in a protected storage area that is inaccessible to the general processing device. That is, the general processing device cannot access the plaintext model weights. In this way, when the data acceleration processing device executes the corresponding target task based on the plaintext model weights, it can ensure the security of the model weights during execution.
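The task-execution step of the third aspect can be illustrated with a toy task. The task format (a dictionary carrying `first_logical_address` and `inputs`), the 4-byte integer weight encoding, and the dot-product task are all assumptions chosen to keep the sketch short; the text does not define the task or the weight format.

```python
import struct


class ProtectedWeights:
    """Plaintext weights held contiguously in the protected storage area."""

    def __init__(self, base_address, values):
        self.base_address = base_address
        # Assumed encoding: little-endian 4-byte signed integers.
        self._blob = struct.pack(f"<{len(values)}i", *values)

    def read(self, logical_address, count):
        """Read `count` weights starting at the given logical address."""
        start = logical_address - self.base_address
        end = start + 4 * count
        if start < 0 or end > len(self._blob):
            raise ValueError("address outside the protected weight block")
        return list(struct.unpack(f"<{count}i", self._blob[start:end]))


def execute_target_task(weights, task):
    """Accelerator side: fetch the target weights, then run the task on them."""
    target = weights.read(task["first_logical_address"], len(task["inputs"]))
    # Toy task: dot product of the target weights with the task inputs.
    return sum(w * x for w, x in zip(target, task["inputs"]))
```

Here the task sent by the host names the weights only by address; the weight values are read and consumed on the accelerator, and only the task result is returned.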

[0025] In one possible implementation of the third aspect, the accelerator obtains the first logical address of the target weight by: determining the base address of the plaintext model weight in the protected storage area; sending the base address to the general-purpose processing device, the base address being used to determine the first logical address of the target weight; and receiving the first logical address of the target weight sent by the general-purpose processing device. In the above possible implementation, when scheduling the corresponding target task, the general-purpose processing device does not need to read the target weight, but only needs to send the first logical address of the target weight, thereby ensuring the security of the model weight.

[0026] In a fourth aspect, a data acceleration processing apparatus is provided, comprising a processing unit and a storage unit. The storage unit includes an unprotected storage area and a protected storage area. The storage unit is configured to receive model information sent by a general-purpose processing device and store the model information in the unprotected storage area, which is accessible to the general-purpose processing device. The model information includes model weights in encrypted form. The processing unit is configured to decrypt the encrypted model weights and store the decrypted plaintext model weights in the protected storage area, which is inaccessible to the general-purpose processing device.

[0027] In one possible implementation of the fourth aspect, the data acceleration processing apparatus further includes a persistent memory storing a decryption key; the processing unit is further configured to: retrieve the decryption key from the persistent memory and decrypt the encrypted model weights based on the decryption key.

[0028] In one possible implementation of the fourth aspect, the data acceleration processing device further includes a persistent memory storing a signature verification public key, and the model information further includes signature information; the processing unit is further configured to: retrieve the signature verification public key from the persistent memory, and determine, based on the signature verification public key, that the signature information of the model information has been successfully verified.

[0029] In one possible implementation of the fourth aspect, the decryption key and / or signature verification public key are stored in encrypted form, and the data acceleration processing device further includes a one-time programmable memory storing a hardware unique key HUK; the processing unit is further configured to: retrieve the HUK from the one-time programmable memory, and encrypt and decrypt the decryption key and / or signature verification public key according to a derived key of the HUK.

[0030] In one possible implementation of the fourth aspect, the model weights are stored contiguously in the protected storage area; the processing unit is further configured to: send the base address of the plaintext model weights in the protected storage area to the general processing device, the base address being used to determine the logical address of the model weights; receive the logical address of the model weights sent by the general processing device, and retrieve the plaintext model weights from the protected storage area according to the logical address.

[0031] In one possible implementation of the fourth aspect, the processing unit is further configured to: receive memory configuration information and configure the unprotected storage area and / or the protected storage area according to the memory configuration information.

[0032] In a fifth aspect, a general-purpose processing apparatus is provided for communicating with a data acceleration processing apparatus. The data acceleration processing apparatus includes an acceleration processor and memory. The memory includes an unprotected storage area accessible by the general-purpose processing apparatus and a protected storage area inaccessible by the general-purpose processing apparatus. The general-purpose processing apparatus includes: a processing unit for acquiring model information, the model information including model weights in encrypted form; and a sending unit for sending the model information to the data acceleration processing apparatus, so that the memory in the data acceleration processing apparatus stores the model information in the unprotected storage area, and the acceleration processor stores the model weights in plaintext form after decrypting the encrypted model weights in the protected storage area.

[0033] In one possible implementation of the fifth aspect, the model information also includes signature information used to verify the model information.

[0034] In one possible implementation of the fifth aspect, the model weights are stored contiguously in the protected storage area, and the general-purpose processing device further includes a receiving unit; the receiving unit is configured to receive the base address of the plaintext model weights in the protected storage area sent by the data acceleration processing device; the processing unit is further configured to determine the logical address of the model weights based on the base address and the address offset value of the model weights; and the sending unit is further configured to send the logical address of the model weights to the data acceleration processing device.

[0035] In one possible implementation of the fifth aspect, the sending unit is further configured to: send memory configuration information to the data acceleration processing device, the memory configuration information being used to configure the unprotected storage area and / or the protected storage area.

[0036] In a sixth aspect, a data acceleration processing apparatus is provided, comprising a processing unit and a storage unit, the storage unit including an unprotected storage area and a protected storage area; the storage unit is configured to store model information in the unprotected storage area, the model information including model weights in encrypted form, the protected storage area being inaccessible to a general-purpose processing device; the storage unit is further configured to store the model weights in plaintext form corresponding to the encrypted model weights in the protected storage area, the protected storage area being inaccessible to the general-purpose processing device; the processing unit is configured to obtain a target task sent by the general-purpose processing device based on the model information; the processing unit is further configured to obtain a first logical address of the target weights and read the target weights from the protected storage area according to the first logical address, the target weights being weights in the model weights required for the execution of the target task; the processing unit is further configured to execute the target task according to the target weights.

[0037] In one possible implementation of the sixth aspect, the data acceleration processing apparatus further includes a sending unit and a receiving unit; the processing unit is further configured to determine the base address of the plaintext model weight in the protected storage area; the sending unit is further configured to send the base address of the target weight to the general processing apparatus, the base address being used to determine a first logical address of the target weight; and the receiving unit is configured to receive the first logical address of the target weight sent by the general processing apparatus.

[0038] A seventh aspect provides a data acceleration processing apparatus, the data acceleration processing apparatus including an acceleration processor and a memory, the memory storing instructions that, when the acceleration processor executes the instructions, cause the apparatus to perform a data processing method provided by the first aspect or any possible implementation thereof, or to perform a data processing method provided by the third aspect or any possible implementation thereof.

[0039] In an eighth aspect, a general-purpose processing apparatus is provided, the general-purpose processing apparatus including a processor and a memory storing instructions that, when executed by the processor, cause the apparatus to perform a data processing method as provided in the second aspect or any possible implementation thereof.

[0040] In another aspect of this application, an electronic device is provided, comprising: a data acceleration processing device as provided in any of the preceding aspects, and a general processing device as provided in any of the preceding aspects.

[0041] In another aspect of this application, a terminal device is provided, comprising: a data acceleration processing apparatus as provided in any of the foregoing aspects, and a general processing apparatus as provided in any of the foregoing aspects. For example, the terminal device may be a mobile phone, tablet, computer, camera, wearable device, or in-vehicle device, etc.

[0042] In another aspect, this application provides a computer-readable storage medium storing instructions that, when executed by a device, cause the device to perform a data processing method as provided in the first aspect or any possible implementation thereof.

[0043] In another aspect, this application provides a computer-readable storage medium storing instructions that, when executed by a device, cause the device to perform a data processing method as provided in the second aspect or any possible implementation thereof.

[0044] In another aspect, this application provides a computer-readable storage medium storing instructions that, when executed by a device, cause the device to perform a data processing method as provided in the third aspect or any possible implementation thereof.

[0045] In another aspect of this application, a computer program product is provided, comprising: a computer program (also referred to as code or instructions) that, when run, causes a computer to perform a data processing method as provided by the first aspect or any possible implementation thereof.

[0046] In another aspect of this application, a computer program product is provided, comprising: a computer program (also referred to as code or instructions) that, when run, causes a computer to perform a data processing method as provided by the second aspect or any possible implementation thereof.

[0047] In another aspect of this application, a computer program product is provided, comprising: a computer program (also referred to as code or instructions) that, when run, causes a computer to perform a data processing method as provided by the third aspect or any possible implementation thereof.

[0048] It is understood that, for the beneficial effects achieved by any of the data acceleration processing devices, general processing devices, electronic devices, terminal devices, computer-readable storage media and computer program products provided above, reference may be made to the beneficial effects of the data processing methods provided above; they are not repeated here.

Description of the Drawings

[0049] Figure 1 is a schematic diagram illustrating how to implement runtime protection for an AI model;

[0050] Figure 2 is a schematic diagram illustrating another method for protecting AI models during runtime;

[0051] Figure 3 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application;

[0052] Figure 4 is a schematic diagram of the structure of another electronic device provided in an embodiment of this application;

[0053] Figure 5 is a flowchart illustrating a data processing method provided in an embodiment of this application;

[0054] Figure 6 is a schematic diagram of model information provided in an embodiment of this application;

[0055] Figure 7 is a flowchart illustrating another data processing method provided in an embodiment of this application;

[0056] Figure 8 is a schematic diagram of model weights in plaintext form provided in an embodiment of this application;

[0057] Figure 9 is a schematic diagram of model weights in encrypted form provided in an embodiment of this application;

[0058] Figure 10 is a schematic diagram of a data processing method provided in an embodiment of this application;

[0059] Figure 11 is a schematic diagram of a data acceleration processing device provided in an embodiment of this application;

[0060] Figure 12 is a schematic diagram illustrating another data processing method provided in an embodiment of this application;

[0061] Figure 13 is a flowchart illustrating another data processing method provided in an embodiment of this application;

[0062] Figure 14 is a schematic diagram of the structure of a data acceleration processing device provided in an embodiment of this application;

[0063] Figure 15 is a schematic diagram of another data acceleration processing device provided in an embodiment of this application;

[0064] Figure 16 is a schematic diagram of the structure of a general processing device provided in an embodiment of this application;

[0065] Figure 17 is a schematic diagram of another general processing device provided in an embodiment of this application.

Detailed Implementation

[0066] The technical solutions in the embodiments of this application will be described below with reference to the accompanying drawings. In this application, "at least one" means one or more, and "more than one" means two or more. "And / or" describes the relationship between related objects, indicating that there can be three relationships. For example, A and / or B can mean: A exists alone, A and B exist simultaneously, or B exists alone, where A and B can be singular or plural. The character " / " generally indicates that the related objects before and after are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, a and b, a and c, b and c, or a, b, and c, where a, b, and c can be single or multiple.

[0067] The embodiments of this application use terms such as "first" and "second" to distinguish objects with similar names, functions, or effects. Those skilled in the art will understand that the terms "first" and "second" do not limit the quantity or order of execution. The term "coupling" is used to indicate an electrical connection, including direct connection via wires or terminals or indirect connection via other devices. Therefore, "coupling" should be considered as a broad type of electronic communication connection.

[0068] It should be noted that, in this application, the terms "exemplary" or "for example" are used to indicate that something is being described as an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in this application should not be construed as being more preferred or advantageous than other embodiments or design solutions. Rather, the use of terms such as "exemplary" or "for example" is intended to present the relevant concepts in a concrete manner.

[0069] Before introducing the embodiments of this application, the relevant background technology involved in this application will be described first.

[0070] With the widespread application of artificial intelligence (AI) across various industries, the protection of AI models is becoming increasingly important. AI model protection mainly includes protection in three stages: storage, transmission, and execution. Among these, how to achieve protection during the execution stage has always been a challenging issue in the industry. Related technologies typically employ the following two methods to achieve protection of AI models during the execution stage, which can also be referred to as runtime protection.

[0071] The first method uses electronic devices with confidential computing capabilities. For example, as shown in Figure 1, consider an electronic device comprising a central processing unit (CPU) and a graphics processing unit (GPU) connected via a peripheral component interconnect express (PCIe) bus. The requirements are: the CPU must support a trusted execution environment (TEE); the GPU must support multi-instance isolation; and the PCIe bus must support encrypted transmission. Specifically, each TEE of the CPU is configured with a confidential virtual machine (VM). This confidential VM contains the GPU driver, which transmits data in encrypted form via the PCIe bus. The GPU includes secure GPU instances isolated by a firewall. Each secure GPU instance is configured with a corresponding PCIe virtual function (VF), which enables the secure GPU instance to transmit data via the PCIe bus. Therefore, in this electronic device, a confidential VM in the CPU, a secure GPU instance in the GPU, and the corresponding PCIe VF can form a hardware isolation environment (or a secure environment) spanning the CPU and GPU, thereby ensuring the security of the AI model running in it.

[0072] In the above scheme, the CPU can be used to process AI application inputs and load model files; the GPU can be used to execute corresponding AI computing tasks; and the PCIe bus can be used to transmit and interact with the data flow between the CPU and GPU. In other words, this scheme covers three parts of the data flow design. First, TEE isolation capability is used on the CPU, preventing insecure environments from accessing AI applications and AI models. By connecting to a remote verification server or local verification, the operating status of the GPU environment connected to the TEE can be guaranteed to meet requirements, such as correct firmware version. Second, the operating status on the GPU can be measured and perceived by the CPU; multi-instance isolation provides hardware isolation capabilities, ensuring that AI computing tasks in secure environments and AI computing tasks in insecure environments are isolated and do not interfere with each other. Third, the PCIe bus uses encrypted transmission, ensuring the confidentiality and integrity of the data flow between the CPU and GPU.

[0073] However, this solution places high demands on the hardware of electronic devices, such as requiring the hardware of electronic devices to support isolation capabilities and encryption / decryption transmission capabilities. It is difficult to apply to existing related products on the market (such as electronic devices or chips), meaning that the runtime protection capability of AI models cannot be achieved solely through software enhancement, thus limiting its use.

[0074] The second approach abstracts model-related operation units within a TEE and combines them with the existing computing and storage resources within the TEE to protect the parameters of the AI model during runtime. Specifically, as shown in Figure 2, the electronic device includes a TEE and a non-trusted execution environment (non-TEE). The TEE includes a configuration manager, a model processor, a model manager, an authentication manager, secure storage, and a computation processor. The non-TEE includes an AI service, applications, and an accelerator. The model manager communicates with the configuration manager, model processor, authentication manager, and secure storage. The model processor communicates with the AI service and schedules the computation processor and accelerator to perform corresponding computations. The AI model in the electronic device can be configured after development and training.

[0075] The above solution involves significant modifications to the software stack and does not protect the AI model information sent between the model processor and the accelerator.

[0076] Based on this, this application provides a data processing method that can protect the security of AI models and prevent them from being stolen during offline inference operations without using hardware isolation technologies such as TEE. This method can be applied to electronic devices, including but not limited to: mobile phones, tablets, laptops, PDAs, mobile internet devices (MIDs), wearable devices (such as smartwatches and smart bracelets), in-vehicle devices (such as cars, bicycles, electric vehicles, airplanes, ships, trains, and high-speed trains), virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, smart home devices (such as refrigerators, televisions, air conditioners, and electricity meters), intelligent robots, workshop equipment, wireless terminals in self-driving, remote medical surgery, smart grids, transportation safety, smart cities, or smart homes, and flying equipment (such as intelligent robots, hot air balloons, drones, and airplanes). These electronic devices can also be referred to as terminal devices.

[0077] The specific structure of this electronic device will be described below.

[0078] Figure 3 is a schematic diagram of an electronic device provided in an embodiment of this application. The electronic device includes a general-purpose processing device 100, a data acceleration processing device 200, and a bus 300. The general-purpose processing device 100 and the data acceleration processing device 200 are coupled through the bus 300. The general-purpose processing device 100 can refer to a device with general data processing, instruction or command execution, and task issuance capabilities. The general-purpose processing device 100 can also be referred to as a host, a general-purpose server, or a master device, etc. The data acceleration processing device 200 can refer to a device with data acceleration processing capabilities. The data acceleration processing device 200 can also be referred to as a device, a dedicated processing device, or a slave device, etc.

[0079] The general-purpose processing device 100 may include one or more general-purpose processors, including but not limited to: a central processing unit (CPU), a digital signal processor (DSP), a microcontroller, or a microprocessor. The general-purpose processor can be used to execute various functions of the electronic device and process data to perform overall monitoring of the electronic device, such as running the operating system, user interface, and applications of the electronic device. In some embodiments, the general-purpose processing device 100 may also include a memory (e.g., main memory and external memory), which can be used to store data, software programs, and modules related to the general-purpose processing device 100. For example, the software programs may include an inference framework. For instance, the memory may include a program storage area and a data storage area; the program storage area may store software programs, including instructions formed by code, including but not limited to an operating system and applications required for at least one function, such as sound playback or image playback; the data storage area may store data created during use of the electronic device, such as audio data, image data, and text data. In Figure 3, the general-purpose processing device 100 including a CPU and a first memory is used as an example. The first memory can be DRAM, DDR, or other memory.

[0080] The data acceleration processing device 200 may include one or more acceleration processors, which may include, but are not limited to, graphics processing units (GPUs), neural network processing units (NPUs), DSPs, CPUs, application-specific integrated circuits (ASICs), and complex programmable logic devices (CPLDs). Each acceleration processor may include one or more accelerators (e.g., vector computation units, matrix computation units, and / or scalar computation units) and one or more operators (e.g., decryption operators such as AICPUs, or other custom operators). The accelerators can be used to perform corresponding calculations (e.g., vector computation, matrix computation, and / or scalar computation) under the invocation of the general-purpose processing device 100. In some embodiments, the data acceleration processing device 200 may also include a memory for storing data, software programs, and modules related to the data acceleration processing device 200, such as firmware. In one example, the memory in the data acceleration processing device 200 may include a second memory, which may include an unprotected storage area accessible to the general-purpose processing device 100 and a protected storage area inaccessible to the general-purpose processing device 100. Furthermore, the memory may also include persistent memory and one-time programmable memory (efuse). For example, the persistent memory may include flash memory, which can be used to store decryption keys and / or signature verification public keys, etc., as described below. One-time programmable memory refers to memory that can be written to (or programmed) only once. Optionally, in the embodiments of this application, the one-time programmable memory may be used to store a hardware unique key (HUK), as described below. In Figure 3, the data acceleration processing device 200 including an NPU and a second memory is used as an example.

[0081] It is understood that the memory (e.g., the first memory and the second memory) in the embodiments of this application can refer to the internal memory of the corresponding device, which can be used to temporarily store data while the processor (e.g., a general-purpose processor or an acceleration processor) of the corresponding device is running. It has high access efficiency but small capacity, and the stored data is usually lost after power-off. The aforementioned persistent memory and one-time programmable memory can refer to external memory, which can be used for long-term data storage. It has relatively low access efficiency but large capacity, and the stored data is not lost after power-off. The persistent memory can be a memory that supports multiple writes, whereas the one-time programmable memory allows writing only once.

[0082] The bus 300 can be used to transmit information between the general-purpose processing device 100 and the data acceleration processing device 200, thereby enabling communication between the general-purpose processing device 100 and the data acceleration processing device 200. In some embodiments, the bus 300 may be a PCIe bus or an extended industry standard architecture (EISA) bus, etc.

[0083] Although not shown, the general processing device 100 may also include sensor components (e.g., accelerometer, gyroscope, pressure sensor and / or temperature sensor), multimedia components (e.g., display panel and camera), input / output devices (e.g., mouse and keyboard), communication modules (e.g., Bluetooth module and WiFi module), and power supply components, etc., which will not be described in detail in the embodiments of this application.

[0084] In addition to the aforementioned hardware resources, the electronic device may also include a software architecture running on those hardware resources (e.g., processor and memory). In some embodiments of this application, the software architecture of the general-purpose processing device 100 in the electronic device may include an application layer (e.g., AI application), an application framework layer, a function library layer, and a kernel layer. In one example, as shown in Figure 4, the software architecture may include an inference framework and a device management driver; the inference framework may reside in the application framework layer and can provide the APIs that applications call; the device management driver may reside in the kernel layer and can be used to configure the second memory in the data acceleration processing device 200, such as configuring the protected and unprotected storage areas. In other embodiments of this application, as shown in Figure 4, the data acceleration processing device 200 may also include a corresponding software architecture, such as firmware running on the acceleration processor.

[0085] It is understandable that the hardware resources and software architecture of the electronic device shown in Figure 3 and Figure 4 are merely exemplary. In practical applications, the electronic device may include more or different components or software than those shown in the figures. The above examples do not constitute a limitation on the embodiments of this application.

[0086] Figure 5 is a flowchart of a data processing method provided in an embodiment of this application. This data processing method can be applied to the electronic device described above, which includes a general-purpose processing device and a data acceleration processing device. The following description uses the example of the general-purpose processing device including a general-purpose processor (e.g., CPU) and first memory, and the data acceleration processing device including an acceleration processor (e.g., NPU) and second memory. The method includes the following steps.

[0087] S301: The general processing device acquires model information, which includes model weights in encrypted form.

[0088] The model information can be the model information of an AI model. This model information can be stored in the first memory of the general-purpose processing device. Model weights in encrypted form refer to model weights that have been encrypted. Optionally, the model information can also include multiple computational tasks. The model weights include the weights required by these multiple computational tasks during the computation process. These multiple computational tasks can be in plaintext in the model information, and specifically, they can be computational tasks that need to be executed by the data acceleration processing device, such as AI computational tasks. Optionally, the model information can also include other parameters besides the model weights, such as the input and output parameters required by the model. These other parameters can also be in encrypted form; this embodiment uses the encrypted model weights as an example for illustration.

[0089] In one possible embodiment, the model information is stored in the first memory of the general-purpose processing device, and the general-purpose processor of the general-purpose processing device can retrieve the model information from the first memory. Furthermore, the general-purpose processor of the general-purpose processing device can also parse and transform the model information so that the transformed model information can be recognized and processed by the data acceleration processing device.

[0090] Optionally, the model weights can be stored contiguously in the first memory and can be arranged according to the computation order in the multiple computation tasks. The model weights can include multiple weights, and the order in which the multiple weights are arranged in the first memory is consistent with the computation order of the multiple weights in the multiple computation tasks.

[0091] Furthermore, the model information may also include plaintext computation graph information, which may include multiple computation nodes and the computation order corresponding to these nodes. The multiple weights of the model weights may include the weights used by these computation nodes during the computation process. The multiple computation tasks can also be referred to as the multiple computation tasks corresponding to the computation graph information. Therefore, the order in which the model weights are arranged in the first memory can also be consistent with the computation order of the model weights in the computation graph information.
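As an illustration of the layout described in [0090] and [0091], the following Python sketch packs the weights of a few computation nodes contiguously in computation order and records each weight's byte offset from the start. The `Node` class, the node names, and the byte sizes are hypothetical and are not taken from this application.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical computation node carrying its own raw weight bytes."""
    name: str
    weight: bytes


def pack_weights(nodes_in_compute_order):
    """Concatenate weights contiguously, following the computation order.

    Returns the packed blob and each node's byte offset from the start.
    """
    blob = bytearray()
    offsets = {}
    for node in nodes_in_compute_order:
        offsets[node.name] = len(blob)  # offset of this weight in the blob
        blob += node.weight
    return bytes(blob), offsets


# Illustrative graph: conv1 -> conv2 -> fc (names and sizes are made up).
nodes = [
    Node("conv1", b"\x01" * 8),
    Node("conv2", b"\x02" * 4),
    Node("fc", b"\x03" * 16),
]
blob, offsets = pack_weights(nodes)
print(offsets)
```

Because the storage order matches the computation order, the offset of each weight depends only on the sizes of the weights that precede it.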

[0092] Optionally, the model information may also include signature information, which can be used to verify the other information in the model information (e.g., computation graph information, model weights, and computation tasks) to ensure the integrity of the model information. For example, the signature information may be obtained by applying a preset algorithm to the information in the model information other than the signature information.

[0093] In one possible embodiment, the encrypted model weights and signature information in the model information can be generated in the development environment. For example, as shown in Figure 6, when model information is obtained in the development environment, the plaintext model weights in the model information can be encrypted to obtain model information containing encrypted model weights; the encrypted model information is then signed to obtain model information containing signature information and encrypted model weights.
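The encrypt-then-sign order described in [0093] can be sketched as follows. This is a minimal illustration under stated assumptions: this application does not name a cipher or a signature scheme, so a SHA-256-based XOR keystream stands in for the cipher and HMAC-SHA256 stands in for the signature. Neither stand-in is the scheme of this application, and the keystream is not suitable for production use. The function names `keystream_xor` and `package_model` and all key values are hypothetical.

```python
import hashlib
import hmac


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || counter) blocks.

    Stand-in for a real cipher; applying it twice restores the input.
    """
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        chunk = data[start:start + 32]
        out += bytes(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)


def package_model(graph: bytes, weights: bytes,
                  enc_key: bytes, sign_key: bytes) -> dict:
    # Step 1: encrypt only the weights; the graph stays in plaintext.
    cipher_weights = keystream_xor(enc_key, weights)
    # Step 2: sign everything except the signature itself.
    signed_part = graph + cipher_weights
    signature = hmac.new(sign_key, signed_part, hashlib.sha256).digest()
    return {"graph": graph, "weights": cipher_weights, "signature": signature}


graph = b"conv1->conv2->fc"
weights = bytes(range(40))
model = package_model(graph, weights, enc_key=b"k" * 32, sign_key=b"s" * 32)
print(len(model["weights"]), len(model["signature"]))
```

Signing after encryption means the signature covers the ciphertext, so integrity can be checked before any decryption takes place.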

[0094] It is understood that the above-described encryption of the model weights in the model information within the development environment and the signing of the encrypted model information are merely illustrative. In practical applications, the encryption and signing of the model weights can be performed during subsequent configuration processes, rather than in the development environment. The above examples do not constitute a limitation on the embodiments of this application.

[0095] Furthermore, before acquiring the model information, the general-purpose processing device can first load the model information into the first memory. In one possible embodiment, the general-purpose processing device may include external memory, in which the model information can be configured (or stored) after development, so that the general-purpose processor of the general-purpose processing device can load the model information from the external memory into the first memory.

[0096] Optionally, if the model weights are not encrypted and the model information is not signed in the development environment, the model weights in the model information can be encrypted and the encrypted model information can be signed during the process of configuring the model information in the external memory.

[0097] S302: The general processing device sends the model information to the data acceleration processing device.

[0098] In one possible embodiment, the general-purpose processing device can transmit the model information to the data acceleration processing device via a bus. For example, the bus could be a PCIe bus. When the CPU of the general-purpose processing device obtains the model information, it can send the model information to the data acceleration processing device via the PCIe bus, or directly store the model information in the second memory of the data acceleration processing device via the bus, specifically in the unprotected storage area of the second memory. The general-purpose processing device can access this unprotected storage area. For instance, the CPU of the general-purpose processing device can copy the model information from the general-purpose processing device to the unprotected storage area in the second memory of the data acceleration processing device.

[0099] S303: The second memory of the data acceleration processing device receives the model information and stores the model information in the unprotected storage area of the second memory.

[0100] The second memory may include an unprotected storage area accessible to the general-purpose processing device, or an unprotected storage area operable by the general-purpose processing device, meaning the general-purpose processing device can read and write to this unprotected storage area. In one possible embodiment, when the second memory of the data acceleration processing device receives the model information via the bus, it can store the model information in the unprotected storage area; alternatively, the acceleration processor of the data acceleration processing device can copy the model information from the general-purpose processing device via direct memory access (DMA) and store it in the unprotected storage area.

[0101] Furthermore, the model information stored in the unprotected storage area may include model weights in encrypted form. Additionally, the model information stored in the unprotected storage area may also include at least one of the multiple computational tasks and the computation graph information, which may be in plaintext form.

[0102] Optionally, the model information may further include signature information. When the second memory of the data acceleration processing device stores the model information in the unprotected storage area, the data acceleration processing device may also verify the signature of the model information using a signature verification public key. In one possible embodiment, the data acceleration processing device further includes persistent memory storing a signature verification public key. The acceleration processor of the data acceleration processing device verifies the signature information of the model information, which may specifically include: the acceleration processor retrieving the signature verification public key from the persistent memory and determining that the signature information of the model information has been successfully verified based on the signature verification public key.
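The verification step in [0102] can be sketched as follows, assuming a textbook RSA signature over a SHA-256 digest. This application does not specify the signature algorithm; the parameters below are toy values built from two Mersenne primes, use no padding, and are not secure. The sketch only shows that the verifier needs nothing beyond the public key `(N, E)`, while the private exponent stays with the signer. All names are hypothetical.

```python
import hashlib

# Toy textbook-RSA key (assumption for illustration; NOT secure, no padding).
P, Q = 2**89 - 1, 2**127 - 1        # two Mersenne primes
N, E = P * Q, 65537                 # public key, e.g. kept in persistent memory
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes, modulus: int) -> int:
    """SHA-256 digest of the data, reduced into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def sign(data: bytes) -> int:
    """Signer side (e.g. development environment): uses the private exponent."""
    return pow(digest_as_int(data, N), D, N)


def verify(data: bytes, signature: int, public_key=(N, E)) -> bool:
    """Verifier side: uses only the public key."""
    n, e = public_key
    return pow(signature, e, n) == digest_as_int(data, n)


model_body = b"graph-info" + b"encrypted-weights"
signature = sign(model_body)
ok = verify(model_body, signature)
tampered_ok = verify(model_body + b"x", signature)
print(ok, tampered_ok)
```

In this sketch the untouched model body verifies, while any modification made in the unprotected storage area changes the digest and causes verification to fail.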

[0103] Optionally, the signature verification public key can be encrypted and stored in the persistent memory, so that the general-purpose processing device can decrypt the signature verification public key after obtaining it. The key used to decrypt the signature verification public key can be a derived key of the hardware unique key HUK. The HUK can be stored in the one-time programmable memory.

[0104] In one possible example, the general-purpose processing device can decrypt the encrypted signature verification public key using a derived key from the HUK to obtain the decrypted signature verification public key. In another possible example, when configuring the signature verification public key in the persistent memory of the data acceleration processing device, the signature verification public key can be encrypted using the derived key from the HUK, and then the encrypted signature verification public key can be stored in the persistent memory.

[0105] S304: The acceleration processor of the data acceleration processing device decrypts the encrypted model weights and stores the decrypted plaintext model weights in a protected storage area of the second memory, which is inaccessible to the general-purpose processing device.

[0106] The second memory may also include a protected storage area that is inaccessible to the general-purpose processing device, or a protected storage area that is not operable by the general-purpose processing device, meaning that the general-purpose processing device cannot read or write to the protected storage area.

[0107] Furthermore, the model weights in the model information are in encrypted form. The acceleration processor of the data acceleration processing device can decrypt the encrypted model weights stored in the unprotected storage area to obtain the plaintext model weights. To protect the security of the plaintext model weights, the acceleration processor of the data acceleration processing device can also store the plaintext model weights in a protected storage area inaccessible to the general-purpose processing device.
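The access rule behind S303 and S304 can be modeled with a small Python sketch. It is only an illustration: real isolation would be enforced by the device's memory configuration, not by a Python class, and the single-byte XOR is a placeholder for whatever cipher is actually used. The names `SecondMemory`, `toy_xor`, `"host"`, and `"accelerator"` are hypothetical.

```python
class SecondMemory:
    """Toy model of the second memory with two storage areas."""

    def __init__(self):
        self._areas = {"unprotected": {}, "protected": {}}

    def _check(self, requester: str, area: str) -> None:
        # Only the acceleration processor may touch the protected area.
        if area == "protected" and requester != "accelerator":
            raise PermissionError(f"{requester} cannot access the protected area")

    def write(self, requester: str, area: str, name: str, data: bytes) -> None:
        self._check(requester, area)
        self._areas[area][name] = bytes(data)

    def read(self, requester: str, area: str, name: str) -> bytes:
        self._check(requester, area)
        return self._areas[area][name]


def toy_xor(key: int, data: bytes) -> bytes:
    """Placeholder cipher (single-byte XOR); encrypts and decrypts alike."""
    return bytes(b ^ key for b in data)


KEY = 0x5A
mem = SecondMemory()

# S302/S303: the host places the encrypted weights in the unprotected area.
mem.write("host", "unprotected", "weights", toy_xor(KEY, b"weights"))

# S304: the acceleration processor decrypts into the protected area.
cipher = mem.read("accelerator", "unprotected", "weights")
mem.write("accelerator", "protected", "weights", toy_xor(KEY, cipher))

# The host can still see the ciphertext, but not the plaintext.
try:
    mem.read("host", "protected", "weights")
    host_blocked = False
except PermissionError:
    host_blocked = True
print(host_blocked)
```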

[0108] Optionally, the decryption of the encrypted model weights by the acceleration processor of the data acceleration processing device can be executed by a corresponding accelerator or operator. For example, the data acceleration processing device may include a decryption operator AICPU, which can be used to decrypt the encrypted model weights. In addition, the data acceleration processing device may also be equipped with a TEE, in which case the process of decrypting the encrypted model weights can be executed in the TEE.

[0109] Furthermore, in one possible embodiment, the data acceleration processing device further includes persistent memory, such as flash memory, which stores a decryption key. Accordingly, the acceleration processor of the data acceleration processing device decrypts the encrypted model weights, specifically by: the acceleration processor retrieving the decryption key from the persistent memory and decrypting the encrypted model weights according to the decryption key to obtain the plaintext model weights. The decryption key may be configured in the persistent memory; for example, it can be configured in the persistent memory of the data acceleration processing device during the process of configuring the model information in the external memory of the general-purpose processing device.

[0110] Optionally, the decryption key can be encrypted and stored in the persistent memory, so that the acceleration processor of the data acceleration processing device can decrypt the decryption key after obtaining it. The key used to decrypt the decryption key can be a derived key of the HUK. The data acceleration processing device also includes a one-time programmable memory, in which the HUK can be stored.

[0111] In one possible example, the acceleration processor of the data acceleration processing device can decrypt the encrypted decryption key using the derived key of the HUK to obtain the decryption key in plaintext. In another possible example, when the decryption key is configured in the persistent memory of the data acceleration processing device, the decryption key can be encrypted using the derived key of the HUK, and then the encrypted decryption key can be stored in the persistent memory.
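The key hierarchy in [0110] and [0111] can be sketched as follows. This application does not state how the derived key is computed from the HUK or which cipher protects the stored decryption key, so this sketch assumes HMAC-SHA256 as a simple key derivation function and a plain XOR as a toy wrapping step. The HUK value, the label `b"model-key-wrap"`, and the function names are hypothetical.

```python
import hashlib
import hmac


def derive_key(huk: bytes, label: bytes) -> bytes:
    """Derive a purpose-specific 32-byte key from the HUK.

    HMAC-SHA256 is used here as a simple KDF (an assumption, not the
    derivation defined by this application).
    """
    return hmac.new(huk, label, hashlib.sha256).digest()


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """Toy wrap/unwrap step; stands in for a real key-wrapping cipher."""
    return bytes(a ^ b for a, b in zip(data, pad))


# Hypothetical HUK; on a real device it would live in one-time programmable memory.
huk = bytes(range(32))
model_key = b"0123456789abcdef0123456789abcdef"  # 32-byte decryption key

# Provisioning: encrypt the decryption key before storing it in persistent memory.
wrapped = xor_bytes(model_key, derive_key(huk, b"model-key-wrap"))

# Runtime: the acceleration processor re-derives the key and unwraps.
unwrapped = xor_bytes(wrapped, derive_key(huk, b"model-key-wrap"))
print(unwrapped == model_key)
```

Since the derived key is recomputed from the HUK on demand, only the wrapped form of the decryption key needs to be kept in persistent memory.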

[0112] After the acceleration processor of the data acceleration processing device stores the decrypted plaintext model weights into the protected storage area of the second memory, as shown in Figure 7, the method may also include S305.

[0113] S305: The acceleration processor of the data acceleration processing device executes multiple computational tasks corresponding to the model information based on the model weights in plaintext form.

[0114] After the acceleration processor of the data acceleration processing device obtains the plaintext model weights by decryption, the acceleration processor can execute the multiple computational tasks based on the model weights. Optionally, after completing the multiple computational tasks, the data acceleration processing device can also return the computational results corresponding to the multiple computational tasks to the general processing device.

[0115] In this embodiment, the general-purpose processing device can transmit model information to the data acceleration processing device via a bus, and the model weights in the model information are in encrypted form, thereby ensuring the security of the model weights within the general-purpose processing device and during transmission. Furthermore, after decrypting the encrypted model weights, the data acceleration processing device stores the decrypted plaintext model weights in a protected storage area inaccessible to the general-purpose processing device. That is, the general-purpose processing device cannot access the plaintext model weights, thus ensuring the security of the model weights during execution when the data acceleration processing device performs the computation task based on the plaintext model weights. Therefore, compared with related technologies, this embodiment can protect the security of model information during operation without using hardware isolation technologies such as TEE, thereby preventing the theft of model information.

[0116] Furthermore, as shown in Figure 7, before S305, the method further includes S306 to S309.

[0117] S306: The acceleration processor of the data acceleration processing device sends the base address of the plaintext model weights in the protected storage area to the general processing device.

[0118] The plaintext model weights can be stored contiguously in the protected storage area and can be arranged according to the computation order in the multiple computational tasks. For example, the model weights can include multiple weights, and the order in which these weights are arranged in the protected storage area is consistent with the computation order in the multiple computational tasks. The base address, also known as the starting address, specifically refers to the logical address of the first weight among the multiple weights included in the model weights. This logical address can be the address used by the processor (e.g., an acceleration processor) to locate the data when accessing it.

[0119] For example, suppose the computation graph information in the model information includes multiple computation nodes and the computation order corresponding to these nodes, and the model weights include the multiple weights used by these computation nodes during the computation process. The arrangement order of the plaintext model weights in the protected storage area can then be as shown in Figure 8.

[0120] In one possible embodiment, after storing the plaintext model weights in the protected storage area, the acceleration processor of the data acceleration processing device can determine the base address of the plaintext model weights in the protected storage area based on the position of the first weight in the protected storage area, and send the base address to the general processing device via the bus.

[0121] S307: The general-purpose processing device receives the base address of the plaintext model weights in the protected storage area sent by the data acceleration processing device, and determines the logical address of the model weights based on the base address and the address offset value of the model weights.

[0122] In the general-purpose processing device, the encrypted model weights are stored contiguously and arranged according to the computation order in the multiple computing tasks. For example, the model weights may include multiple weights, and the order in which these weights are arranged in the general-purpose processing device is consistent with the computation order in the computing tasks. The address offset value of the model weights may include the address offset value of any one of the multiple weights. Taking the i-th weight as an example, the address offset value of the i-th weight may refer to the offset between the logical address of the i-th weight and the base address corresponding to the first weight, where i is a positive integer.

[0123] For example, suppose the graph computation information in the model information includes multiple computation nodes and the computation order corresponding to the multiple computation nodes, and the model weights include the multiple weights used by the multiple computation nodes in the computation process. In this case, the arrangement order of the encrypted model weights in the general processing device can be as shown in Figure 9.

[0124] When the encrypted model weights are stored contiguously in the first memory of the general-purpose processing device, the general-purpose processing device can determine the address offset value of any one of the model weights according to the arrangement order in which the encrypted model weights are stored. The address offset value of a weight can be the offset between the logical address of the weight and the base address corresponding to the first weight in the arrangement order of the model weights.

[0125] In one possible embodiment, when the general processing device receives the base address of the plaintext model weight in the protected storage area, the general processing device can determine the logical address of the weight in the protected storage area based on the base address and the address offset value of any one of the model weights.
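The address calculation in S307 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the byte sizes, and the base address value are hypothetical, and the sketch assumes that each weight occupies the same number of bytes in ciphertext and plaintext form, so that offsets derived from the encrypted layout in the general-purpose processing device also apply to the plaintext layout in the protected storage area.

```python
from itertools import accumulate


def address_offsets(weight_sizes):
    """Offset of each weight from the first weight, for contiguous storage.

    weight_sizes lists the byte size of each weight in computation order.
    Weight i starts where weights 0..i-1 end (an exclusive prefix sum).
    """
    return list(accumulate(weight_sizes, initial=0))[:-1]


def logical_addresses(base_address, offsets):
    """Logical address of each weight: base address plus its offset."""
    return [base_address + offset for offset in offsets]


# Hypothetical sizes (bytes) and base address, for illustration only.
sizes = [256, 1024, 512]
offsets = address_offsets(sizes)
addresses = logical_addresses(0x8000_0000, offsets)
```

Only the base address crosses the bus in this sketch; the general-purpose processing device never needs the plaintext weights to compute where each one is stored.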

[0126] S308: The general processing device sends the logical address of the model weight to the data acceleration processing device.

[0127] When the general-purpose processing device determines the logical address of each weight in the model weights, it can send the logical address of each weight in the model weights to the data acceleration processing device via the bus. In one possible embodiment, the general-purpose processing device can send the logical addresses of the weights required for each computation task in the model weights to the data acceleration processing device in the order in which the data acceleration processing device executes the aforementioned plurality of computation tasks.

[0128] S309: The acceleration processor of the data acceleration processing device receives the logical address of the model weight and retrieves the plaintext model weight from the protected storage area according to the logical address.

[0129] After the acceleration processor of the data acceleration processing device receives the logical address of each weight in the model weights, it can, when executing the multiple computing tasks, retrieve the corresponding plaintext model weights from the protected storage area according to the logical addresses, and execute the multiple computing tasks according to the plaintext model weights.

[0130] In another possible embodiment, the acceleration processor of the data acceleration processing device may also determine the logical address of the model weight in the following manner: the general processing device sends the address offset value of the model weight to the data acceleration processing device through the bus; the acceleration processor of the data acceleration processing device receives the address offset value of the model weight through the bus, and determines the logical address of the model weight based on the base address of the model weight in plaintext form in the protected storage area and the address offset value.
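The alternative in which the acceleration processor itself resolves the address can be sketched as follows. The function name and parameters are hypothetical, and the range check is an added assumption that is not stated in this embodiment; it is included only to show where a device might reject an offset that points outside the stored plaintext weights.

```python
def resolve_on_device(base_address, region_size, offset, length):
    """Resolve a weight's logical address from a host-supplied offset.

    base_address: base address of the plaintext weights (known to the device).
    region_size:  total size in bytes of the stored plaintext weights.
    offset:       address offset value received from the host over the bus.
    length:       size in bytes of the requested weight.
    """
    # Assumed safeguard (not from the source): refuse offsets that would
    # read outside the stored weights.
    if offset < 0 or length < 0 or offset + length > region_size:
        raise ValueError("offset outside the stored plaintext weights")
    return base_address + offset


address = resolve_on_device(0x1000, 64, 16, 16)
```

In this variant the base address does not have to leave the data acceleration processing device at all, since only offsets travel over the bus.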

[0131] Furthermore, the unprotected storage area and the protected storage area can be pre-configured by those skilled in the art. Optionally, the protected storage area can be a statically partitioned block of physically contiguous storage, and the capacity of the protected storage area is configurable.

[0132] For example, an unprotected storage area accessible to the general-purpose processing device can be pre-configured in the second memory, and the storage area in the second memory other than the unprotected storage area can be the protected storage area; or, a protected storage area inaccessible to the general-purpose processing device can be pre-configured in the second memory, and the storage area in the second memory other than the protected storage area can be the unprotected storage area; or, both the unprotected storage area and the protected storage area can be configured in the second memory at the same time.

[0133] In one possible embodiment, the bus includes a management channel through which the general-purpose processing device can send memory configuration information to the data acceleration processing device before acquiring the model information. This memory configuration information is used to configure the unprotected storage area and / or the protected storage area. Thus, the acceleration processor of the data acceleration processing device can receive the memory configuration information through the management channel and configure the unprotected storage area and / or the protected storage area in the second memory according to the memory configuration information. The configuration of the protected storage area is explained below.

[0134] When the unprotected storage area and the protected storage area are configured, the second memory of the data acceleration processing device can be cleared first, that is, the data in the second memory is erased to prevent residual data in the second memory from being leaked.

[0135] In addition, the size of the protected storage area can be configured through this management channel, and a statically partitioned physical address contiguous storage area can be used as the protected storage area.
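The clearing and static partitioning steps can be sketched as follows. The `Region` type and `configure_memory` function are hypothetical names, and placing the protected storage area at the top of the second memory is an assumption made only for illustration; the source requires only that the area be statically partitioned, physically contiguous, and of configurable size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    start: int  # first byte of the region
    size: int   # region length in bytes

    @property
    def end(self) -> int:
        return self.start + self.size


def configure_memory(memory: bytearray, protected_size: int):
    """Clear the memory, then split it into unprotected and protected regions."""
    if not 0 <= protected_size <= len(memory):
        raise ValueError("protected size does not fit in the memory")
    # Erase residual data in place before partitioning.
    memory[:] = bytes(len(memory))
    # Assumed layout: one contiguous protected block at the top of memory.
    split = len(memory) - protected_size
    return Region(0, split), Region(split, protected_size)


second_memory = bytearray(b"\xff" * 1024)  # simulated memory with stale data
unprotected, protected = configure_memory(second_memory, 256)
```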

[0136] In addition, after the protected storage area is configured, address mapping is performed. The addresses of the protected storage area do not participate in the address mapping corresponding to the bus. For example, if the bus is a PCIe bus, the page table corresponding to the PCIe bus (also called a PCIe page table) does not include the address mapping of the protected storage area, so the general-purpose processing device cannot access the protected storage area through the bus. However, when the page table of the acceleration processor in the data acceleration processing device is mapped, the addresses of both the protected and unprotected storage areas can participate in the mapping. That is, the page table corresponding to the acceleration processor includes the address mappings of both the protected and unprotected storage areas, which ensures that the acceleration processor can access both areas.
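The effect of the two page tables can be modeled in a few lines. This is a simplified simulation, not a description of real PCIe or accelerator address translation: the page size, the identity mapping, and the function names are all assumptions chosen to show that an address absent from a requester's page table cannot be reached by that requester.

```python
PAGE_SIZE = 4096  # assumed page size


def build_page_tables(unprotected_pages, protected_pages):
    """Return (bus_table, accelerator_table) as page -> physical page maps."""
    # The bus-side table maps only the unprotected pages.
    bus_table = {page: page for page in unprotected_pages}
    # The accelerator-side table maps both unprotected and protected pages.
    accelerator_table = dict(bus_table)
    accelerator_table.update({page: page for page in protected_pages})
    return bus_table, accelerator_table


def translate(page_table, address):
    """Translate an address, failing if the page is not mapped."""
    page, offset = divmod(address, PAGE_SIZE)
    if page not in page_table:
        raise PermissionError(f"address {address:#x} is not mapped")
    return page_table[page] * PAGE_SIZE + offset


# Pages 0-3 are unprotected, pages 4-7 are protected (hypothetical split).
bus_table, accelerator_table = build_page_tables(range(0, 4), range(4, 8))
```

With these tables, a bus-side lookup of any address in pages 4 to 7 raises `PermissionError`, while the same lookup through the accelerator-side table succeeds.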

[0137] Furthermore, when configuring the unprotected storage area and the protected storage area, if there are other running services in the electronic device, the device can be reset to forcibly interrupt the execution of those services.

[0138] The above embodiments describe the technical solutions of the embodiments of this application from the perspective of the interaction between the general-purpose processing device and the data acceleration processing device. In practical applications, the steps corresponding to the general-purpose processing device can be implemented by the processor in the general-purpose processing device and the inference framework running on the processor, and the steps corresponding to the data acceleration processing device can be implemented by the acceleration processor in the data acceleration processing device and the firmware running on the acceleration processor. For ease of understanding, the technical solutions of the embodiments of this application are illustrated below by taking the structures of the electronic device shown in Figures 10 to 12 as examples. In Figures 10 to 12, the case where the processor in the general-purpose processing device includes a CPU is used as an example.

[0139] In one example, as shown in Figure 10, the general-purpose processing device includes a CPU and a first memory, the data acceleration processing device includes an acceleration processor and a second memory, the second memory includes a protected storage area and an unprotected storage area, and the acceleration processor includes multiple accelerators and a decryption operator AICPU. Accordingly, the method may include: S11. The CPU of the general-purpose processing device copies model information to the unprotected storage area of the data acceleration processing device via a bus, where the model information includes computation graph information (plaintext), model weights (ciphertext), and computation tasks (plaintext); S12. The acceleration processor of the data acceleration processing device schedules the decryption operator AICPU to decrypt the model weights and stores the decrypted model weights (plaintext) in the protected storage area of the second memory; S13. The acceleration processor of the data acceleration processing device sends the base address of the model weights in the protected storage area to the general-purpose processing device via the bus; S14. When the CPU of the general-purpose processing device receives the base address, it determines the address offset value of the model weights; S15. The CPU of the general-purpose processing device determines the logical address of the model weights based on the base address and the address offset value, and sends the logical address to the data acceleration processing device via the bus; S16. The acceleration processor of the data acceleration processing device retrieves the model weights from the protected storage area based on the logical address and executes the computation task based on the model weights.
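The sequence S11 to S16 can be walked through with a small simulation. Everything here is a hedged sketch: `keystream_xor` is a toy length-preserving cipher used only as a stand-in for whatever decryption the AICPU operator performs (it is not secure and is not the source's algorithm), the `AcceleratorDevice` class, its base address, and the summing "computation" are hypothetical, and Python attribute naming does not enforce any isolation.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy length-preserving cipher (NOT secure): XOR with a SHA-256 keystream."""
    out = bytearray()
    for start in range(0, len(data), 32):
        block = hashlib.sha256(key + start.to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[start:start + 32], block))
    return bytes(out)


class AcceleratorDevice:
    BASE = 0x4000  # hypothetical base address of the protected storage area

    def __init__(self, decryption_key: bytes):
        self._key = decryption_key
        self.unprotected = b""  # area the host can write over the bus
        self._protected = b""   # stands in for the area the host cannot reach

    def receive_model(self, ciphertext: bytes) -> None:            # S11
        self.unprotected = ciphertext

    def decrypt_weights(self) -> int:                              # S12, S13
        self._protected = keystream_xor(self._key, self.unprotected)
        return self.BASE  # only the base address is returned to the host

    def run_task(self, logical_address: int, length: int) -> int:  # S16
        start = logical_address - self.BASE
        weight = self._protected[start:start + length]
        return sum(weight)  # placeholder for a real computation task


key = b"demo-key"
weights = [b"\x01\x02", b"\x03\x04\x05"]
plaintext = b"".join(weights)
# In practice the model owner would encrypt offline; done inline for the demo.
ciphertext = keystream_xor(key, plaintext)

device = AcceleratorDevice(key)
device.receive_model(ciphertext)
base = device.decrypt_weights()
offsets = [0, len(weights[0])]                                     # S14
addresses = [base + offset for offset in offsets]                  # S15
results = [device.run_task(a, len(w)) for a, w in zip(addresses, weights)]
```

The host side of this sketch handles only ciphertext, offsets, addresses, and task results, which mirrors the division of knowledge described in the example.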

[0140] In another example, with reference to Figure 10 and as shown in Figure 11, the general-purpose processing device and the data acceleration processing device are coupled via a PCIe bus. The general-purpose processing device also includes an inference framework and a device management driver, while the data acceleration processing device also includes flash memory, one-time programmable memory, and firmware. Accordingly, the method may further include: S21. Configuring the data acceleration processing device via the device management driver, where this configuration may include configuring a decryption key and a signature verification public key in the flash memory, and encrypting and protecting the decryption key and the signature verification public key using a key derived from the HUK in the one-time programmable memory; S22. Configuring a protected storage area in the memory of the data acceleration processing device via the device management driver, wherein the PCIe page table maps only the addresses of the unprotected storage area, and the page table of the acceleration processor in the data acceleration processing device maps the addresses of both the unprotected and protected storage areas.
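The key protection in S21 can be illustrated with a short sketch. The source does not specify a key derivation function or a wrapping algorithm, so both choices here are assumptions: HMAC-SHA256 over a label stands in for the derivation, and a plain XOR stands in for the encryption of the stored key. The XOR wrap is a placeholder and is not suitable for real use; the function names and sample values are hypothetical.

```python
import hashlib
import hmac


def derive_key(huk: bytes, label: bytes) -> bytes:
    """Derive a 32-byte key from the HUK (assumed KDF: HMAC-SHA256 of a label)."""
    return hmac.new(huk, label, hashlib.sha256).digest()


def wrap(derived_key: bytes, secret: bytes) -> bytes:
    """Placeholder wrap (NOT secure): XOR the secret with the derived key."""
    if len(secret) > len(derived_key):
        raise ValueError("secret longer than derived key in this toy scheme")
    return bytes(s ^ k for s, k in zip(secret, derived_key))


# XOR is its own inverse, so unwrapping is the same operation.
unwrap = wrap

huk = b"\x11" * 32                   # stands in for the value in OTP memory
decryption_key = b"model-weight-key"  # hypothetical key to be protected
derived = derive_key(huk, b"flash-key-wrap")
stored_in_flash = wrap(derived, decryption_key)
recovered = unwrap(derived, stored_in_flash)
```

Because the derived key depends on the HUK, the wrapped value kept in flash memory is only useful on the device that holds that HUK.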

[0141] In yet another example, with reference to Figure 11 and as shown in Figure 12, the method may further include: S31. The inference framework in the general-purpose processing device loads model information and copies the model information to the unprotected storage area of the data acceleration processing device via a bus; S32. The firmware in the data acceleration processing device obtains the signature verification public key from the flash memory and verifies the signature information of the model information; S33. After successful signature verification, the firmware in the data acceleration processing device obtains the decryption key from the flash memory, decrypts the encrypted model weights stored in the unprotected storage area, and stores the decrypted plaintext model weights in the protected storage area; S34. The firmware in the data acceleration processing device loads the plaintext model weights and executes the calculation task.
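The ordering of S32 and S33, in which decryption proceeds only after the signature check succeeds, can be sketched as follows. The source does not name a signature scheme, so this sketch assumes a textbook RSA signature over a SHA-256 digest with a deliberately small modulus built from two known Mersenne primes. It has no padding and is not secure; it only shows that verification needs the public values alone. All names are hypothetical, and the modular inverse via `pow` requires Python 3.8 or later.

```python
import hashlib

# Toy key pair (NOT secure): modulus from two known Mersenne primes.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                  # public key, stored on the device
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept by the model owner


def digest(model_info: bytes) -> int:
    """SHA-256 digest of the model information, reduced modulo N."""
    return int.from_bytes(hashlib.sha256(model_info).digest(), "big") % N


def sign(model_info: bytes) -> int:
    """Performed by the model owner, outside the device."""
    return pow(digest(model_info), D, N)


def verify(model_info: bytes, signature: int) -> bool:
    """Performed by the firmware using only the public values N and E."""
    return pow(signature, E, N) == digest(model_info)


def load_model(model_info: bytes, signature: int, decrypt):
    """Decrypt only after the signature has been verified (S32 before S33)."""
    if not verify(model_info, signature):
        raise ValueError("signature verification failed")
    return decrypt(model_info)


model_info = b"graph+encrypted-weights"
signature = sign(model_info)
loaded = load_model(model_info, signature, decrypt=lambda data: data[::-1])
```

The `decrypt` callback is a stand-in (here it merely reverses the bytes); the point of the sketch is that tampered model information is rejected before any decryption takes place.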

[0142] In the above example, the firmware of the data acceleration processing device provides the following functions to configure the inference framework to protect the model information during runtime: a) Model information verification: protects the integrity of the model information and prevents it from being tampered with or implanted with malicious operators; b) Model weight decryption: protects the confidentiality of the model weights; c) Protected storage area: a storage area is isolated in the second memory of the data acceleration processing device, which can only be accessed by the acceleration processor of the data acceleration processing device and cannot be accessed by the general processing device through the bus; in addition, storing the decrypted plaintext model weights in the protected storage area in memory can improve the reading efficiency of the acceleration processor; d) Key management: the verification public key and decryption key are encrypted and managed to ensure the confidentiality of the keys.

[0143] In this embodiment, the general-purpose processing device can transmit model information to the data acceleration processing device via a bus, and the model weights in the model information are in encrypted form, thereby ensuring the security of the model weights within the general-purpose processing device and during transmission. Furthermore, after decrypting the encrypted model weights, the data acceleration processing device stores the decrypted plaintext model weights in a protected storage area inaccessible to the general-purpose processing device. That is, the general-purpose processing device cannot access the plaintext model weights, thus ensuring the security of the model weights during execution when the data acceleration processing device performs the computation task based on the plaintext model weights. Therefore, compared with related technologies, this embodiment can protect the security of model information during operation without using hardware isolation technologies such as TEE, thereby preventing the theft of model information.

[0144] Based on this, embodiments of this application also provide another data processing method, which can be applied to the data acceleration processing device described above. As shown in Figure 13, the data processing method includes the following steps.

[0145] S320: The acceleration processor of the data acceleration processing device acquires the target task sent by the general processing device based on the model information.

[0146] The target task can be any one of multiple computational tasks corresponding to the model information. Optionally, the target task may include computation type and operand information. For example, the computation type can be one of addition, subtraction, XOR, OR, or AND operations. The operand information can be used to indicate the storage information of the input data corresponding to the target task, and this storage information can be used to obtain the input data.

[0147] S321: The data acceleration processing device obtains the first logical address of the target weight and reads the target weight from the protected storage area according to the first logical address. The general processing device cannot access the protected storage area.

[0148] In one possible embodiment, the acceleration processor obtaining the first logical address of the target weight may include: the acceleration processor determining the base address of the plaintext model weights in the protected storage area; the acceleration processor sending the base address to the general-purpose processing device, the base address being used to determine the first logical address of the target weight; and the acceleration processor receiving the first logical address of the target weight sent by the general-purpose processing device.

[0149] The target weight is the weight in the model weights required when the target task is executed; that is, the model weights include the target weight. The first logical address of the target weight specifically refers to the logical address of the target weight within the protected storage area.

[0150] S322: The acceleration processor of the data acceleration processing device executes the target task according to the target weight.

[0151] In one possible embodiment, the acceleration processor of the data acceleration processing device can also obtain the corresponding operands based on the operand information included in the target task; the acceleration processor can then execute the target task based on the obtained target weights, operands, and the computation type included in the target task. Afterwards, the acceleration processor can also send the execution result corresponding to the target task to the general-purpose processing device.
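The execution of a target task in S320 to S322 can be sketched as a small dispatch over the computation types listed above. The task dictionary layout, the field names, and the use of plain integers for weights and operands are assumptions made for illustration; the source states only that a target task may carry a computation type and operand information.

```python
import operator

# Computation types named in the description, mapped to Python operators.
OPERATIONS = {
    "add": operator.add,
    "sub": operator.sub,
    "xor": operator.xor,
    "or": operator.or_,
    "and": operator.and_,
}


def execute_task(task, protected_storage, input_data):
    """Execute one target task on the acceleration processor side.

    task:              dict with hypothetical keys "type", "operand_ref",
                       and "weight_address" (the first logical address).
    protected_storage: maps logical addresses to plaintext weights.
    input_data:        maps operand references to input values.
    """
    weight = protected_storage[task["weight_address"]]   # S321
    operand = input_data[task["operand_ref"]]
    return OPERATIONS[task["type"]](operand, weight)     # S322


protected_storage = {0x10: 6}
input_data = {"x": 3}
task = {"type": "add", "operand_ref": "x", "weight_address": 0x10}
result = execute_task(task, protected_storage, input_data)
```

Only `result` would be returned to the general-purpose processing device; the weight read from the protected storage area stays on the device side of the sketch.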

[0152] It can be understood that all relevant content of each step in the method embodiments corresponding to Figures 5 to 12 above can be applied to the method embodiment corresponding to Figure 13, and is not repeated here.

[0153] In this embodiment, the data acceleration processing device can store the plaintext model weights obtained by decrypting the encrypted model weights into a protected storage area inaccessible to the general processing device. That is, the general processing device cannot access the plaintext model weights. In this way, when the data acceleration processing device executes the corresponding target task based on the plaintext model weights, it can ensure the security of the model weights during execution.

[0154] The foregoing mainly describes the solutions provided by the embodiments of this application from the perspective of the interaction between the general-purpose processing device and the data acceleration processing device. It is understood that, in order to achieve the above-mentioned functions, the general-purpose processing device and the data acceleration processing device include hardware structures and / or software modules corresponding to the execution of each function. Those skilled in the art should readily recognize that, in conjunction with the units and algorithm steps of the various examples described in the embodiments disclosed herein, this application can be implemented in hardware or a combination of hardware and computer software. Whether a function is executed in hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0155] This application embodiment can divide the general processing device and the data acceleration processing device into functional modules based on the above method example. For example, each function can be divided into its own functional modules, or two or more functions can be integrated into one processing module. The functional modules can be implemented in hardware or software. It should be noted that the module division in this application embodiment is illustrative and only represents one logical functional division; other division methods may be used in actual implementation. The following explanation uses the division of functional modules according to corresponding functions as an example:

[0156] When integrated units are used, Figure 14 shows a possible structural diagram of the data acceleration processing apparatus involved in the above embodiments. The data acceleration processing apparatus may include a storage unit 401 and a processing unit 402. In one possible embodiment, the storage unit 401 can be used to support the data acceleration processing apparatus in executing S303 of the above method embodiments; the processing unit 402 can be used to support the apparatus in executing S304, S305, and S309 of the above method embodiments. Further, the data acceleration processing apparatus also includes a sending unit 403, which can be used to support the data acceleration processing apparatus in executing S306 of the above method embodiments. In another possible embodiment, the processing unit 402 can be used to support the data acceleration processing apparatus in executing S320 to S322 of the above method embodiments.

[0157] All relevant content of each step involved in the above method embodiments can be referenced from the functional description of the corresponding functional module, and will not be repeated here.

[0158] Based on hardware implementation, the above-mentioned processing unit 402 can be an accelerator processor, the sending unit 403 can be a transmitter, and the transmitter and receiver can be integrated into a transceiver, which can also be called a communication interface.

[0159] Figure 15 is a schematic diagram of a possible structure of the data acceleration processing apparatus according to an embodiment of this application. The data acceleration processing apparatus may include a memory 411 and an acceleration processor 412. The memory 411 stores the program code and data of the apparatus, and the acceleration processor 412 controls the operation of the data acceleration processing apparatus in the above method embodiments. For example, the acceleration processor 412 supports the apparatus in executing S304, S305, and / or S309 in the above method embodiments, or in executing S320 to S322 in the above method embodiments, and / or other processes using the technology described herein. Optionally, the data acceleration processing apparatus may further include a communication interface 413, which supports the apparatus in performing the steps of communicating with a general-purpose processing device in the above method embodiments.

[0160] The accelerator processor 412 can be a GPU, NPU, DSP, CPU, application-specific integrated circuit, processing chip, field-programmable gate array, or other programmable logic device, transistor logic device, hardware component, or any combination thereof. It can implement or execute various logic blocks, modules, and circuits described in connection with the embodiments of this application. The accelerator processor 412 can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, etc.

[0161] Memory 411 can be volatile memory or non-volatile memory, etc. Optionally, memory 411 can be integrated into accelerator processor 412. In one example, memory 411 can include main memory, persistent memory, and one-time programmable memory, etc. For example, the persistent memory can include flash memory, which can be used to store decryption keys and / or signature verification public keys, etc., as described below, and the one-time programmable memory can be used to store the hardware unique key HUK, as described below.

[0162] When integrated units are used, Figure 16 shows a possible structural schematic diagram of the general-purpose processing apparatus involved in the above embodiments. The apparatus may include a processing unit 501 and a sending unit 502. In one possible embodiment, the processing unit 501 is used to support the apparatus in executing S301 and the step of determining the logical address in S307 of the above method embodiments; the sending unit 502 can be used to support the apparatus in executing S302 and S308 of the above method embodiments. Further, the general-purpose processing apparatus also includes a receiving unit 503, which is used to support the general-purpose processing apparatus in executing the step of receiving the base address in S307 of the above method embodiments.

[0163] All relevant content of each step involved in the above method embodiments can be referenced from the functional description of the corresponding functional module, and will not be repeated here.

[0164] Based on hardware implementation, the above-mentioned processing unit 501 can be a processor, the sending unit 502 can be a transmitter, and the receiving unit 503 can be a receiver. The receiver and the transmitter can be integrated into a transceiver, which can also be called a communication interface.

[0165] Figure 17 is a schematic diagram of a possible structure of a general-purpose processing device according to an embodiment of this application. The device includes a memory 511 and a processor 512. The memory 511 stores program code and data of the device, and the processor 512 controls the operation of the general-purpose processing device in the above method embodiments. For example, the processor 512 executes S301 and the step of determining logical addresses in S307 of the above method embodiments, and / or other processes used in the techniques described herein. Optionally, the device may further include a communication interface 513, which supports the device in performing the steps of communicating with the data acceleration processing device in the above method embodiments.

[0166] The processor 512 can be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a processing chip, a field-programmable gate array (FPGA), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute various logic blocks, modules, and circuits described in connection with the embodiments of this application. The processor 512 can also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, etc. The communication interface 513 can be a transceiver, transceiver circuitry, or transceiver interface, etc. The memory 511 can be volatile memory or non-volatile memory, etc.

[0167] For example, the communication interface 513, processor 512, and memory 511 are interconnected via bus 514; bus 514 can be a PCI bus or an EISA bus, etc. Bus 514 can be divided into address bus, data bus, control bus, etc. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.

[0168] Optionally, the memory 511 may be included in the processor 512.

[0169] In another aspect of this application, an electronic device is also provided, comprising any of the data acceleration processing devices and any of the general-purpose processing devices provided above. The data acceleration processing device is used to perform the steps of the data acceleration processing device in the above method embodiments; the general-purpose processing device is used to perform the steps of the general-purpose processing device in the above method embodiments.

[0170] In another aspect of this application, a terminal device is also provided, comprising any of the data acceleration processing devices and any of the general-purpose processing devices provided above. The data acceleration processing device is used to execute the steps of the data acceleration processing device in the above method embodiments; the general-purpose processing device is used to execute the steps of the general-purpose processing device in the above method embodiments.

[0171] For example, the terminal device can be a mobile phone, tablet, computer, camera, wearable device, or vehicle-mounted device, etc.

[0172] The methods provided in this application can be implemented entirely or partially through software, hardware, or a combination thereof. When implemented using software, they can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, a network device, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, twisted pair) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more media. The medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., optical disk), or a semiconductor medium (e.g., solid-state drive), etc.

[0173] In another aspect of this application, a computer-readable storage medium is provided, the computer-readable storage medium including computer instructions that, when executed by a device, cause the device to perform one or more steps of the data acceleration processing apparatus in the above method embodiments.

[0174] In another aspect of this application, a computer-readable storage medium is provided, the computer-readable storage medium including computer instructions, which, when executed by a device, cause the device to perform one or more steps of the general processing apparatus in the data processing method provided in the above method embodiments.

[0175] In another aspect of this application, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to perform one or more steps of the data acceleration processing apparatus described in the above method embodiments.

[0176] In another aspect of this application, a computer program product containing instructions is provided, which, when run on a computer, causes the computer to perform one or more steps of the general processing apparatus described in the above method embodiments.

[0177] Finally, it should be noted that the above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions within the technical scope disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A data processing method, characterized in that, The method is applied in a data acceleration processing device, the data acceleration processing device including an acceleration processor and a memory, the memory including an unprotected storage area and a protected storage area, and the method includes: The memory receives model information sent by a general-purpose processing device and stores the model information in the unprotected storage area. The general-purpose processing device does not support trusted execution environment capabilities. The general-purpose processing device can access the unprotected storage area. The model information includes model weights in encrypted form. The acceleration processor decrypts the encrypted model weights and stores the decrypted plaintext model weights in the protected storage area. The general-purpose processing device cannot access the protected storage area. The plaintext model weights are used by the acceleration processor to execute multiple computational tasks.

2. The method according to claim 1, characterized in that, The data acceleration processing device also includes a persistent memory, which stores a decryption key; The acceleration processor decrypting the encrypted model weights includes: The acceleration processor obtains the decryption key from the persistent memory and decrypts the encrypted model weights according to the decryption key.

3. The method according to claim 1 or 2, characterized in that, The data acceleration processing device further includes a persistent memory storing a signature verification public key, and the model information further includes signature information; the method further includes: The acceleration processor retrieves the signature verification public key from the persistent memory and determines, based on the signature verification public key, that the signature information of the model information is successfully verified.

4. The method according to claim 2, characterized in that, The decryption key and/or the signature verification public key are stored in encrypted form, and the data acceleration processing device further includes a one-time programmable memory storing a hardware unique key (HUK); the method further includes: The acceleration processor retrieves the HUK from the one-time programmable memory and encrypts/decrypts the decryption key and/or the signature verification public key based on a key derived from the HUK.

5. The method according to any one of claims 1-2 and 4, characterized in that, The model weights are stored contiguously in the protected storage area, and the method further includes: The acceleration processor sends the base address of the plaintext model weights in the protected storage area to the general-purpose processing device, the base address being used to determine the logical address of the model weights; The acceleration processor receives the logical address of the model weights sent by the general-purpose processing device, and retrieves the plaintext model weights from the protected storage area according to the logical address.

6. The method according to any one of claims 1-2 and 4, characterized in that, The method further includes: The acceleration processor receives memory configuration information and configures the unprotected storage area and/or the protected storage area according to the memory configuration information.

7. A data processing method, characterized in that, The method is applied in a general-purpose processing device that does not support trusted execution environment capabilities. The general-purpose processing device is used to communicate with a data acceleration processing device, which includes an acceleration processor and memory. The memory includes an unprotected storage area accessible by the general-purpose processing device and a protected storage area inaccessible by the general-purpose processing device. The method includes: Obtaining model information, which includes model weights in encrypted form; Sending the model information to the data acceleration processing device so that the memory in the data acceleration processing device stores the model information in the unprotected storage area, and the acceleration processor decrypts the encrypted model weights and stores the resulting plaintext model weights in the protected storage area. The plaintext model weights are used by the acceleration processor to execute multiple computing tasks.

8. The method according to claim 7, characterized in that, The model information also includes signature information, which is used to verify the model information.

9. The method according to claim 7 or 8, characterized in that, The model weights are stored contiguously in the protected storage area, and the method further includes: Receiving the base address, in the protected storage area, of the plaintext model weights, the base address being sent by the data acceleration processing device; Determining the logical address of the model weights based on the base address and the address offset value of the model weights; Sending the logical address of the model weights to the data acceleration processing device.

10. The method according to claim 7 or 8, characterized in that, The method further includes: Sending memory configuration information to the data acceleration processing device, the memory configuration information being used to configure the unprotected storage area and/or the protected storage area.

11. A data processing method, characterized in that, The method is applied in a data acceleration processing device, the data acceleration processing device including an acceleration processor and memory, the memory including an unprotected storage area and a protected storage area, the unprotected storage area including model information, the model information including model weights in encrypted form, the protected storage area including model weights in plaintext form obtained by decrypting the encrypted model weights, the method including: The acceleration processor acquires a target task sent by a general-purpose processing device based on the model information, and the general-purpose processing device does not support trusted execution environment capabilities. The acceleration processor obtains the first logical address of a target weight and reads the target weight from the protected storage area according to the first logical address. The target weight is the weight, among the model weights, that is needed to execute the target task. The general-purpose processing device cannot access the protected storage area. The acceleration processor executes the target task according to the target weight.

12. The method according to claim 11, characterized in that, The acceleration processor obtaining the first logical address of the target weight includes: The acceleration processor determines the base address of the plaintext model weights in the protected storage area; The acceleration processor sends the base address to the general-purpose processing device, the base address being used to determine the first logical address of the target weight; The acceleration processor receives the first logical address of the target weight sent by the general-purpose processing device.

13. A data acceleration processing device, characterized in that, The data acceleration processing device includes a processing unit and a storage unit, wherein the storage unit includes an unprotected storage area and a protected storage area. The storage unit is used to receive model information sent by the general-purpose processing device and store the model information in the unprotected storage area. The general-purpose processing device does not support trusted execution environment capabilities, but can access the unprotected storage area. The model information includes model weights in encrypted form. The processing unit is used to decrypt the encrypted model weights and store the decrypted plaintext model weights in the protected storage area. The general-purpose processing device cannot access the protected storage area. The plaintext model weights are used by the processing unit to perform multiple computational tasks.

14. The data acceleration processing apparatus according to claim 13, characterized in that, The data acceleration processing device also includes a persistent memory, which stores a decryption key; The processing unit is further configured to obtain the decryption key from the persistent memory and decrypt the encrypted model weights according to the decryption key.

15. The data acceleration processing apparatus according to claim 13 or 14, characterized in that, The data acceleration processing device also includes a persistent memory, which stores a signature verification public key, and the model information also includes signature information. The processing unit is further configured to obtain the signature verification public key from the persistent memory and determine, based on the signature verification public key, that the signature information of the model information is successfully verified.

16. The data acceleration processing apparatus according to claim 14, characterized in that, The decryption key and/or the signature verification public key are stored in encrypted form. The data acceleration processing device also includes a one-time programmable memory, which stores a hardware unique key (HUK). The processing unit is further configured to retrieve the HUK from the one-time programmable memory and encrypt/decrypt the decryption key and/or the signature verification public key according to a key derived from the HUK.

17. The data acceleration processing apparatus according to claim 13, 14, or 16, characterized in that, The model weights are stored contiguously in the protected storage area; The processing unit is further configured to send the base address of the plaintext model weights in the protected storage area to the general-purpose processing device, wherein the base address is used to determine the logical address of the model weights; The processing unit is further configured to receive the logical address of the model weights sent by the general-purpose processing device, and obtain the plaintext model weights from the protected storage area according to the logical address.

18. The data acceleration processing apparatus according to claim 13, 14, or 16, characterized in that, The processing unit is further configured to receive memory configuration information and configure the unprotected storage area and / or the protected storage area according to the memory configuration information.

19. A general-purpose processing device, characterized in that, The general-purpose processing device does not support trusted execution environment capabilities. The general-purpose processing device is used to communicate with a data acceleration processing device, which includes an acceleration processor and memory. The memory includes an unprotected storage area accessible by the general-purpose processing device and a protected storage area inaccessible by the general-purpose processing device. The general-purpose processing device includes: A processing unit configured to acquire model information, the model information including model weights in encrypted form; A sending unit configured to send the model information to the data acceleration processing device, so that the memory in the data acceleration processing device stores the model information in the unprotected storage area, and the acceleration processor decrypts the encrypted model weights and stores the resulting plaintext model weights in the protected storage area. The plaintext model weights are used by the acceleration processor to execute multiple computational tasks.

20. The general-purpose processing device according to claim 19, characterized in that, The model information also includes signature information, which is used to verify the model information.

21. The general-purpose processing device according to claim 19 or 20, characterized in that, The model weights are stored contiguously in the protected storage area, and the general-purpose processing device further includes a receiving unit; The receiving unit is configured to receive the base address of the plaintext model weights in the protected storage area sent by the data acceleration processing device; The processing unit is further configured to determine the logical address of the model weights based on the base address and the address offset value of the model weights; The sending unit is further configured to send the logical address of the model weights to the data acceleration processing device.

22. The general-purpose processing device according to claim 19 or 20, characterized in that, The sending unit is further configured to send memory configuration information to the data acceleration processing device, the memory configuration information being used to configure the unprotected storage area and/or the protected storage area.

23. A data acceleration processing device, characterized in that, The data acceleration processing device includes a processing unit and a storage unit, wherein the storage unit includes an unprotected storage area and a protected storage area. The storage unit is used to store model information in the unprotected storage area, the model information including model weights in encrypted form. The storage unit is also used to store the plaintext model weights corresponding to the encrypted model weights in the protected storage area, and a general-purpose processing device cannot access the protected storage area. The processing unit is used to obtain a target task sent by the general-purpose processing device based on the model information, wherein the general-purpose processing device does not support trusted execution environment capabilities. The processing unit is further configured to obtain a first logical address of a target weight, and read the target weight from the protected storage area according to the first logical address. The target weight is the weight, among the model weights, that is needed to execute the target task. The processing unit is also configured to execute the target task according to the target weight.

24. The data acceleration processing apparatus according to claim 23, characterized in that, The data acceleration processing device further includes a sending unit and a receiving unit; The processing unit is further configured to determine the base address of the plaintext model weights in the protected storage area; The sending unit is configured to send the base address to the general-purpose processing device, the base address being used to determine the first logical address of the target weight; The receiving unit is configured to receive the first logical address of the target weight sent by the general-purpose processing device.

25. A data acceleration processing device, characterized in that, The data acceleration processing device includes an acceleration processor and a memory, wherein the memory stores instructions that, when the acceleration processor executes the instructions, cause the device to perform the data processing method as described in any one of claims 1-6, or to perform the data processing method as described in any one of claims 11-12.

26. A general-purpose processing device, characterized in that, The general-purpose processing device does not support trusted execution environment capabilities. The general-purpose processing device includes a processor and a memory, the memory storing instructions that, when the processor executes the instructions, cause the device to perform the data processing method as described in any one of claims 7-10.

27. An electronic device, characterized in that, The electronic device includes: a data acceleration processing device as described in any one of claims 13-18, any one of claims 23-24, or claim 25, and a general-purpose processing device as described in any one of claims 19-22, or claim 26.

28. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores instructions that, when executed by a device, cause the device to perform the data processing method as described in any one of claims 1-12.

29. A computer program product, characterized in that, The computer program product includes instructions that, when executed by a device, cause the device to perform the data processing method as described in any one of claims 1-12.
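
The addressing exchange recited in claims 5, 9, and 12 can be sketched as follows, purely for illustration. The acceleration side exposes only the base address of the contiguous plaintext weights, the general-purpose side computes a logical address as base address plus offset, and the acceleration side resolves that address inside the protected storage area. The names `ProtectedWeights` and `host_logical_address`, the byte-level addressing, and the bounds check are assumptions of this sketch and are not specified by the claims.

```python
class ProtectedWeights:
    """Hypothetical accelerator-side view of contiguous plaintext weights."""

    def __init__(self, base_address: int, weights: bytes):
        self.base_address = base_address   # the only value shared with the host
        self._weights = weights            # plaintext, never returned to the host

    def read(self, logical_address: int, length: int) -> bytes:
        """Resolve a host-supplied logical address inside the protected area."""
        start = logical_address - self.base_address
        if start < 0 or start + length > len(self._weights):
            raise ValueError("address outside the protected weight range")
        return self._weights[start:start + length]


def host_logical_address(base_address: int, offset: int) -> int:
    """General-purpose side: logical address = base address + address offset."""
    return base_address + offset
```

Under these assumptions the general-purpose processing device needs to know only where a weight lives relative to the base address, never its plaintext value; the read itself happens on the acceleration side.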