Joint optimization method and device for graph substitution and parallelization of deep neural network inference
By applying graph substitution and parallelization to the computation graph of a deep neural network, a target joint optimization graph is determined. This addresses the long inference time and low hardware utilization of existing techniques, making fuller use of hardware performance and speeding up inference.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TSINGHUA UNIVERSITY
- Filing Date
- 2024-03-22
- Publication Date
- 2026-06-12
AI Technical Summary
Even after optimization, inference with existing deep neural networks still takes a relatively long time, and existing methods do not fully exploit hardware performance, so hardware utilization remains low.
Multiple computation subgraphs are determined from the initial computation graph. Graph substitution and parallelization are then performed on each subgraph to determine its target joint optimization graph, that is, an optimized graph whose inference time meets a preset requirement.
By making fuller use of hardware performance, the method significantly reduces the inference time of deep neural networks, improving both hardware utilization and inference speed.
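The per-subgraph procedure summarized above can be illustrated with a rough Python sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: the operator names, the latency table `OP_COST_MS`, the fixed-size partitioning, the single conv+relu fusion rule, the toy timing formula, and the helper names (`split_into_subgraphs`, `fuse_conv_relu`, `optimize_subgraph`, `joint_optimize`). The only idea taken from the summary is the overall shape: split the graph into subgraphs, search substitution and parallelization choices together for each subgraph, and check the resulting inference time against a preset requirement.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical per-operator latencies in milliseconds (toy cost model, not measured).
OP_COST_MS = {"conv": 4.0, "relu": 1.0, "conv_relu": 4.2, "matmul": 3.0, "add": 0.5}


@dataclass(frozen=True)
class Plan:
    ops: tuple       # operator sequence after graph substitution
    workers: int     # degree of parallelism assumed for this subgraph
    time_ms: float   # estimated inference time under the toy cost model


def split_into_subgraphs(ops, size):
    """Partition a topologically ordered operator list into fixed-size chunks (assumption)."""
    return [tuple(ops[i:i + size]) for i in range(0, len(ops), size)]


def fuse_conv_relu(ops):
    """One illustrative substitution rule: rewrite conv followed by relu as conv_relu."""
    out, i = [], 0
    while i < len(ops):
        if ops[i] == "conv" and i + 1 < len(ops) and ops[i + 1] == "relu":
            out.append("conv_relu")
            i += 2
        else:
            out.append(ops[i])
            i += 1
    return tuple(out)


def estimate_time_ms(ops, workers):
    """Toy estimate: work divides evenly, plus a fixed sync cost per extra worker."""
    return sum(OP_COST_MS[op] for op in ops) / workers + 0.3 * (workers - 1)


def optimize_subgraph(ops, worker_options=(1, 2, 4)):
    """Evaluate every (substitution, parallelism) pair and keep the fastest plan."""
    variants = [tuple(ops), fuse_conv_relu(ops)]
    candidates = [
        Plan(v, w, estimate_time_ms(v, w)) for v, w in product(variants, worker_options)
    ]
    return min(candidates, key=lambda plan: plan.time_ms)


def joint_optimize(ops, subgraph_size, budget_ms):
    """Optimize each subgraph, then check the summed estimate against the preset budget."""
    plans = [optimize_subgraph(sg) for sg in split_into_subgraphs(ops, subgraph_size)]
    total_ms = sum(plan.time_ms for plan in plans)
    return plans, total_ms, total_ms <= budget_ms
```

A call such as `joint_optimize(["conv", "relu", "matmul", "add", "conv", "relu"], subgraph_size=3, budget_ms=5.0)` returns the chosen plan for each subgraph, the summed estimate, and a flag saying whether the estimate fits the budget. In this toy cost model the two choices do not interact, so the joint search finds the same plan as optimizing them one after the other. A realistic cost model, where a substitution can change how well a subgraph parallelizes, is where searching both together would matter.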
Figure CN118228762B_ABST: abstract drawing