Method and apparatus for encoding weight parameters of a large language model
A randomized Hadamard transform and a mapping onto a linearly constrained quantization grid are applied to the weight parameters of a large language model. This addresses the inability of existing techniques to combine the advantages of scalar quantization and vector quantization. The result is high quantization accuracy and computational efficiency at low bit widths, which makes the method suitable for edge devices and large-scale serving scenarios.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- 启元实验室
- Filing Date
- 2026-05-22
- Publication Date
- 2026-06-19
AI Technical Summary
Existing techniques cannot combine the advantages of scalar quantization and vector quantization, which leads to performance degradation and low computational efficiency when the model is quantized at low bit widths.
A random Hadamard transform is applied to the weight parameters of the large language model, and the transformed weights are mapped using a mapping grid composed of multiple discrete vectors. Combining this mapping with a linearly constrained quantization grid integrates scalar quantization and vector quantization. An affine transformation and LDL decomposition are used to optimize the quantization process.
The method maintains high quantization accuracy and computational efficiency at low bit widths, reduces storage requirements, and improves the model's deployability on edge devices and in large-scale serving scenarios.
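The randomized Hadamard step named in the summary can be illustrated with a minimal sketch. It is not the patented procedure: the per-entry uniform grid below is an assumed stand-in for the linearly constrained quantization grid, the affine transformation and LDL decomposition are omitted, and every function name (`fwht`, `random_signs`, `randomized_hadamard`, `quantize_uniform`) is hypothetical. The sketch only shows why a random sign flip followed by an orthonormal Hadamard transform helps before quantization: it spreads a single outlier weight across all coordinates, so a low-bit grid covers a smaller range.

```python
import math
import random


def fwht(vec):
    """Orthonormal fast Walsh-Hadamard transform; len(vec) must be a power of two."""
    n = len(vec)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(vec)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in out]


def random_signs(n, seed=0):
    """Random +/-1 diagonal that makes the Hadamard transform 'randomized'."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def randomized_hadamard(vec, signs):
    """Compute H @ diag(signs) @ vec."""
    return fwht([s * x for s, x in zip(signs, vec)])


def inverse_randomized_hadamard(vec, signs):
    """Compute diag(signs) @ H @ vec; H is symmetric and orthonormal, so this inverts the above."""
    return [s * x for s, x in zip(signs, fwht(vec))]


def quantize_uniform(vec, bits):
    """Assumed stand-in grid: symmetric uniform grid with 2**bits levels per entry."""
    levels = 2 ** bits
    max_abs = max(abs(x) for x in vec) or 1.0
    step = 2 * max_abs / (levels - 1)
    codes = [round((x + max_abs) / step) for x in vec]
    dequantized = [c * step - max_abs for c in codes]
    return codes, dequantized


# Illustrative weights with one large outlier (made-up values).
weights = [0.02, -0.01, 0.03, 8.0, -0.02, 0.01, 0.0, -0.03]
signs = random_signs(len(weights), seed=0)

rotated = randomized_hadamard(weights, signs)
codes, dequantized = quantize_uniform(rotated, bits=4)
restored = inverse_randomized_hadamard(dequantized, signs)

print("max |w| before transform:", max(abs(x) for x in weights))
print("max |w| after transform: ", max(abs(x) for x in rotated))
print("4-bit codes:", codes)
```

Only the integer codes, the sign seed, and the grid scale would need to be stored in this sketch; `restored` shows how approximate weights are recovered by dequantizing and inverting the transform.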
Figure CN122242597A_ABST