A binary code analysis method and system for cross-architecture knowledge transfer
A cross-architecture knowledge transfer method for binary code analysis uses word embedding models and linear transformation alignment matrices to establish a unified instruction semantic space. This bridges the analysis differences between CPU architectures, improves analysis accuracy for rarely used CPU architectures, and strengthens cross-architecture applicability.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: SHANGHAI PALMIN TECH
- Filing Date: 2026-03-27
- Publication Date: 2026-06-12
AI Technical Summary
Existing binary code analysis methods are hard to reuse across CPU architectures. Rarely used architectures in particular lack sufficient code samples and labeled data, so analysis tools for them have low accuracy and coverage. In addition, existing cross-architecture conversion methods lose architecture-specific semantic information, which further reduces analysis accuracy.
The method generates a binary code corpus for each specific CPU architecture, trains an instruction vector space on it with a word embedding model, and aligns the instruction semantic spaces of the different architectures with a linear transformation alignment matrix. The result is a unified, general word embedding vector space that enables cross-architecture knowledge transfer.
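The alignment step can be illustrated with a minimal sketch, assuming the alignment matrix is fitted by ordinary least squares on a few anchor instruction pairs; the source does not state how the matrix is computed. The toy vectors, the instruction pairings, and the helper names `learn_alignment` and `nearest` are hypothetical stand-ins for embeddings that a word embedding model would produce from per-architecture corpora.

```python
import numpy as np

def learn_alignment(src_vecs, tgt_vecs):
    """Fit W that minimizes ||src_vecs @ W - tgt_vecs|| (least squares)."""
    W, *_ = np.linalg.lstsq(src_vecs, tgt_vecs, rcond=None)
    return W

def nearest(query, vocab, matrix):
    """Return the token whose row in `matrix` is most cosine-similar to `query`."""
    sims = matrix @ query / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query))
    return vocab[int(np.argmax(sims))]

# Toy 3-dimensional "embeddings" (assumption: real ones come from a trained model).
x86_vocab = ["mov", "add", "call", "ret", "cmp"]
x86_emb = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
])

# Simulate a second architecture whose space is a linear image of the first.
arm_vocab = ["mov", "add", "bl", "bx lr", "cmp"]  # illustrative counterparts
hidden_map = 2.0 * np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])
arm_emb = x86_emb @ hidden_map

# Use the first four instruction pairs as anchors; hold out "cmp".
W = learn_alignment(arm_emb[:4], x86_emb[:4])

# Map the held-out ARM-like vector into the x86-like space and look it up.
mapped = arm_emb[4] @ W
print(nearest(mapped, x86_vocab, x86_emb))  # prints "cmp"
```

In this toy setup the held-out vector lands exactly on its counterpart because the two spaces are related by an exact linear map. With embeddings trained on real corpora the fit would only be approximate, and the choice of anchor pairs would matter.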
The method allows binary code analysis knowledge to be reused across CPU architectures, improves analysis accuracy for rarely used architectures, provides a unified vector space that supports cross-architecture tasks such as instruction prediction and vulnerability detection, and scales well.
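As a rough illustration of what knowledge reuse in a shared vector space could look like, the sketch below fits a nearest-centroid classifier on labeled vectors from one architecture and applies it to a vector from another. This is an assumption for illustration, not the classifier described in the source: the two-dimensional vectors, the labels, and the helpers `fit_centroids` and `predict` are hypothetical, and all vectors are assumed to be already mapped into the unified space.

```python
import numpy as np

def fit_centroids(vectors, labels):
    """Average the vectors of each label into one centroid per class."""
    classes = sorted(set(labels))
    centroids = np.array([
        np.mean([v for v, lab in zip(vectors, labels) if lab == c], axis=0)
        for c in classes
    ])
    return classes, centroids

def predict(vector, classes, centroids):
    """Return the label of the closest centroid (Euclidean distance)."""
    distances = np.linalg.norm(centroids - vector, axis=1)
    return classes[int(np.argmin(distances))]

# Hypothetical code-snippet vectors labeled on a well-covered architecture.
train_vectors = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
train_labels = ["benign", "benign", "vulnerable", "vulnerable"]
classes, centroids = fit_centroids(train_vectors, train_labels)

# A vector from a rarely used architecture, assumed to live in the same space.
query = np.array([0.15, 0.85])
print(predict(query, classes, centroids))  # prints "vulnerable"
```

No labeled data from the second architecture is needed here, because the labels are attached to regions of the shared space and not to any one instruction set.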