An optimization method and device for improving the global memory access efficiency of the Triton compiler

By adding first axis attribute information and integer divisibility attribute values to the Triton compiler, the axis analysis module and the memory access merging module are optimized. This addresses the low global memory access efficiency of the Triton compiler and yields efficient memory access behavior.

CN122242659A · Pending · Publication Date: 2026-06-19 · HANGZHOU ADVANCED COMPILATION TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HANGZHOU ADVANCED COMPILATION TECHNOLOGY CO LTD
Filing Date
2026-03-13
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In global memory access optimization, the Triton compiler suffers from inaccurate calculation results and low memory access efficiency, especially in the axis analysis optimization pass and the thread layout calculation. As a result, memory accesses cannot be merged or vectorized effectively.

Method used

Add first axis attribute information to the axis analysis module of the Triton compiler to record the continuity information of tensors in different scenarios. In the memory access merging module, align the number of consecutive elements handled by a single thread with the memory access granularity. Add an attribute value with a preset integer divisibility value to the parameter passing part of the operator, and generate a unique identifier for the kernel generated for each added attribute value.
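The last step, attaching a divisibility attribute and deriving a per-attribute kernel identifier, can be illustrated with a small Python sketch. It is not taken from the patent text: the preset value 16, the function names divisibility_attr and kernel_identifier, and the hash-based naming scheme are all assumptions chosen for illustration.

```python
import hashlib

# Assumed preset; the source only says "a preset integer divisibility value".
PRESET_DIVISIBILITY = 16


def divisibility_attr(value, preset=PRESET_DIVISIBILITY):
    """Return the preset when `value` is a multiple of it, else 1 (no guarantee)."""
    return preset if value % preset == 0 else 1


def kernel_identifier(kernel_name, args):
    """Build an identifier that changes whenever the attached attributes change.

    `args` maps hypothetical parameter names to their integer values.
    Two calls whose arguments carry the same attributes share an identifier,
    so they can share one compiled kernel.
    """
    attrs = tuple((name, divisibility_attr(v)) for name, v in sorted(args.items()))
    digest = hashlib.sha256(repr((kernel_name, attrs)).encode()).hexdigest()[:12]
    return f"{kernel_name}_{digest}"
```

Under these assumptions, a call with n = 1024 and a call with n = 2048 map to the same identifier, while n = 1000 (not a multiple of 16) maps to a different one, so a kernel compiled under the stronger divisibility assumption is never reused for an argument that violates it.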

Benefits of technology

It improves the global memory access efficiency of the Triton compiler, ensures the correctness of the calculation results of the axis analysis module, and realizes the efficient memory access behavior of the operator.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure 1
  • Figure 2
Patent Text Reader

Abstract

This invention discloses an optimization method for improving the global memory access efficiency of the Triton compiler. The method includes: adding first axis attribute information to the axis analysis module of the Triton compiler, and recording the continuity information of tensors in different scenarios based on the first axis attribute information; in the memory access merging module of the Triton compiler, aligning the number of consecutive elements handled by a single thread with the memory access granularity during lowering to the LLVM intermediate representation, so as to modify the memory access behavior of operators; and adding an attribute value with a preset integer divisibility value to the parameter passing part of the operator in the intermediate representation of the Triton compiler, and generating a unique identifier for the kernel generated for each added attribute value. This ensures the correctness of the calculation results of the axis analysis module and improves the memory access efficiency of operators.
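As a rough illustration of the second step, the sketch below picks how many consecutive elements one thread should handle so that a single access matches the memory access granularity. It is a simplified model, not the patented procedure: the 128-bit granularity, the function name elems_per_thread, and the way continuity and divisibility are combined are assumptions made for this example.

```python
def elems_per_thread(continuity, divisibility_bytes, elem_bits, access_bits=128):
    """Pick a power-of-two element count per thread for one vectorized access.

    continuity         -- number of consecutive elements along the fastest axis
    divisibility_bytes -- known alignment of the base address, in bytes
    elem_bits          -- element width in bits (assumed to be a multiple of 8)
    access_bits        -- assumed widest single access (128 bits here)
    """
    if elem_bits % 8 != 0 or elem_bits <= 0:
        raise ValueError("elem_bits must be a positive multiple of 8")
    elem_bytes = elem_bits // 8
    # Alignment expressed in elements; at least one element is always allowed.
    aligned_elems = max(divisibility_bytes // elem_bytes, 1)
    # The width is bounded by continuity, alignment, and access granularity.
    limit = max(min(continuity, aligned_elems, access_bits // elem_bits), 1)
    # Round down to a power of two so the access maps onto a vector width.
    width = 1
    while width * 2 <= limit:
        width *= 2
    return width
```

In this model, 32-bit elements with 64 consecutive elements and 16-byte alignment give 4 elements per thread (one 128-bit access), while the same tensor with only 4-byte alignment falls back to 1 element per thread. This is why the accuracy of the continuity and divisibility information directly affects the achievable vector width.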