A method and device for distributed secure inference of a large model based on a trusted execution environment

CN122247653A · Pending · Publication Date: 2026-06-19 · ZHEJIANG UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: ZHEJIANG UNIV
Filing Date: 2026-02-10
Publication Date: 2026-06-19



Abstract

A method and apparatus for distributed secure inference of large models based on a trusted execution environment (TEE) are disclosed. The method includes: (1) model weight obfuscation, in which the linear-layer weights of the large model are obfuscated offline inside the TEE and the key is stored in the TEE; (2) each computing node starts a secure inference engine and completes node trust verification through remote attestation; (3) during model inference, the TEE encrypts each layer's input activations and sends them to the GPU, which performs the linear computations using the obfuscated weights; (4) the linear computation results are returned to the TEE for deobfuscation and nonlinear computation, and the encrypted intermediate activations are passed to the next node until the inference task is completed. By offloading the computationally intensive linear operators to the GPU while retaining key management and nonlinear computation in the TEE, the invention prevents leakage of model weights and intermediate features in an untrusted environment while maintaining inference efficiency. It addresses the difficulty of balancing secure deployment against efficient inference when large models run in a third-party computing environment, and it is suited to multi-node distributed large-model inference scenarios, where it offers security and scalability.
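The abstract does not say which obfuscation or encryption transform is used, so the following is only a minimal sketch of steps (1), (3), and (4) under a loud assumption: the key is a pair of secret scaled-permutation matrices P and Q, the offline obfuscated weight is P·W·Q, and the input is masked as x·P⁻¹. The untrusted side then computes x·P⁻¹·P·W·Q = x·W·Q, and the trusted side recovers x·W by multiplying with Q⁻¹ before applying the nonlinearity. The names `TeeLayer`, `gpu_linear`, and `random_monomial` are hypothetical, ReLU stands in for whatever nonlinear operator the model uses, and this masking is illustrative rather than a cryptographically strong scheme. Remote attestation and inter-node transfer (step 2 and the hand-off in step 4) are not modeled.

```python
import random

def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def random_monomial(n, rng):
    """Secret scaled-permutation matrix and its inverse (assumed key form)."""
    perm = list(range(n))
    rng.shuffle(perm)
    m = [[0.0] * n for _ in range(n)]
    m_inv = [[0.0] * n for _ in range(n)]
    for i, j in enumerate(perm):
        s = rng.choice([-4.0, -2.0, -0.5, 0.5, 2.0, 4.0])
        m[i][j] = s            # row i has one nonzero entry, in column perm[i]
        m_inv[j][i] = 1.0 / s  # transpose the position and invert the scale
    return m, m_inv

class TeeLayer:
    """Stands in for the trusted side: keeps the keys, W and x stay inside."""

    def __init__(self, weight, rng):
        p, self._p_inv = random_monomial(len(weight), rng)
        q, self._q_inv = random_monomial(len(weight[0]), rng)
        # Step (1), offline: only this obfuscated matrix leaves the TEE.
        self.obfuscated_weight = matmul(matmul(p, weight), q)

    def mask_input(self, x):
        # Step (3): mask the layer input before handing it to the GPU.
        return matmul(x, self._p_inv)

    def finish(self, y_obf):
        # Step (4): deobfuscate, then run the nonlinear part (ReLU here).
        y = matmul(y_obf, self._q_inv)
        return [[max(0.0, v) for v in row] for row in y]

def gpu_linear(x_masked, w_obf):
    """Untrusted side: sees only masked activations and obfuscated weights."""
    return matmul(x_masked, w_obf)

def demo():
    rng = random.Random(0)
    weight = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
    x = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
    layer = TeeLayer(weight, rng)
    secure = layer.finish(gpu_linear(layer.mask_input(x), layer.obfuscated_weight))
    plain = [[max(0.0, v) for v in row] for row in matmul(x, weight)]
    return secure, plain
```

Calling `demo()` returns the output of the split TEE/GPU path alongside the output of an unprotected ReLU(x·W) computed directly; the two agree up to floating-point rounding, which is the property that lets the heavy matrix product run outside the TEE.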
