A text-guided molecule generation method based on two-stage training and representation alignment regularization

CN122245533A · Pending · Publication Date: 2026-06-19 · DALIAN UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
DALIAN UNIV OF TECH
Filing Date
2026-04-02
Publication Date
2026-06-19

Smart Images

  • Figure CN122245533A_ABST

Abstract

This invention belongs to the interdisciplinary field of artificial intelligence and computational chemistry and discloses a text-guided molecule generation method based on two-stage training and representation alignment regularization. The method comprises four steps: data preparation and representation construction, model structure design, two-stage model training, and inference with output generation. Built on a latent diffusion model, it generates candidate molecular structures that are chemically plausible and semantically consistent with a natural-language description. The two-stage training strategy decouples unconditional molecule-generation pre-training from text-conditional training. The first stage learns the structural distribution of chemical space and establishes a stable molecular structure prior, which alleviates the mutual interference between chemical structure learning and text-semantic alignment and improves the chemical validity of the generated molecules. The representation alignment regularization mechanism aligns the hidden states of the diffusion model with high-quality external molecular semantic representations, explicitly strengthening chemical semantic modeling and improving the semantic consistency and controllability of text-conditioned molecule generation.
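The abstract does not give the loss formulation, so the following is only a minimal sketch of how a representation alignment regularizer might be combined with a diffusion denoising loss. It assumes a mean-squared-error noise-prediction term and a cosine-distance alignment term between hidden states and external molecular representations of the same dimension (no projection layer). The function names and the `align_weight` parameter are hypothetical, and the abstract does not say in which training stage the regularizer is applied, so the weight is left to the caller.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_loss(hidden_states, target_reprs):
    """Mean cosine distance (1 - cosine similarity) over paired vectors.

    hidden_states: internal diffusion-model states, one vector per item.
    target_reprs:  external molecular representations, same shape (assumed).
    """
    if len(hidden_states) != len(target_reprs) or not hidden_states:
        raise ValueError("inputs must be non-empty and the same length")
    distances = [
        1.0 - cosine_similarity(h, t)
        for h, t in zip(hidden_states, target_reprs)
    ]
    return sum(distances) / len(distances)


def denoising_loss(pred_noise, true_noise):
    """Mean squared error between predicted and true noise (assumed objective)."""
    if len(pred_noise) != len(true_noise) or not pred_noise:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred_noise, true_noise)) / len(pred_noise)


def total_loss(pred_noise, true_noise, hidden_states, target_reprs, align_weight):
    """Denoising loss plus a weighted alignment regularizer.

    align_weight is a hypothetical hyperparameter; 0.0 disables the regularizer.
    """
    loss = denoising_loss(pred_noise, true_noise)
    if align_weight != 0.0:
        loss += align_weight * alignment_loss(hidden_states, target_reprs)
    return loss
```

In a two-stage schedule of the kind described, the same `total_loss` could be called in both stages, with the text condition withheld from the model in the first stage and supplied in the second; that choice, like the weight value, is an assumption of this sketch rather than something the abstract specifies.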