Text generation device and text generation model training method through removal of target word noise

A three-stage training process for large language models (LLMs) adds a target word denoising stage between pre-training and fine-tuning so that the model learns domain and task knowledge, addressing the underperformance of existing LLMs on specific tasks and domains.

WO2026127451A1 · PCT designated stage · Publication Date: 2026-06-18 · UNIVERSITY INDUSTRY COOPERATION GROUP OF KYUNG HEE UNIVERSITY

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Application
Current Assignee / Owner: UNIVERSITY INDUSTRY COOPERATION GROUP OF KYUNG HEE UNIVERSITY
Filing Date: 2025-11-26
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Existing large language models (LLMs) underperform on specific tasks and domains because their pre-training and fine-tuning objectives are mismatched and they lack sufficient domain and task knowledge.

Method used

The text generation model is trained in three stages: pre-training, target word denoising (TWD), and fine-tuning. TWD is a post-training stage run on the given dataset: using the task's input text and output text, the model learns to remove targeted noise, and in doing so acquires domain and task knowledge before fine-tuning.
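
As a rough illustration of what a single TWD training pair could look like, the Python sketch below corrupts selected words in a task's output text and keeps the clean output text as the reconstruction target. The summary does not say how target words are chosen or what kind of noise is applied, so the `target_words` set, the `<mask>` and `[SEP]` tokens, and the function name `make_twd_example` are all assumptions made for illustration, not the patent's actual procedure.

```python
import re

MASK_TOKEN = "<mask>"  # assumed noise symbol; the source does not specify one
SEP_TOKEN = "[SEP]"    # assumed separator between input text and noised output

def make_twd_example(input_text, output_text, target_words):
    """Build one (noised source, clean target) pair for denoising.

    Hypothetical helper: every output-text token whose letters match a
    target word is replaced by MASK_TOKEN (attached punctuation is
    dropped together with the token).
    """
    targets = {w.lower() for w in target_words}
    noised_tokens = []
    for token in output_text.split():
        core = re.sub(r"\W+", "", token).lower()  # strip punctuation for matching
        noised_tokens.append(MASK_TOKEN if core in targets else token)
    source = f"{input_text} {SEP_TOKEN} {' '.join(noised_tokens)}"
    return source, output_text

source, target = make_twd_example(
    "Summarize: The patient was given aspirin for pain.",
    "Patient received aspirin.",
    {"aspirin"},
)
print(source)  # Summarize: The patient was given aspirin for pain. [SEP] Patient received <mask>
print(target)  # Patient received aspirin.
```

A model post-trained on such pairs would see the task input together with a corrupted output and be asked to restore the original output, which is one plausible reading of "targeted noise removal".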

Benefits of technology

The method reduces training costs and improves model performance by focusing on domain and task-specific knowledge acquisition, outperforming existing models in various text generation tasks.

Generated by Eureka AI based on patent content.

Abstract

Disclosed are a text generation device and a method for training a text generation model through target word noise removal. The device comprises a training unit and a generation unit. The training unit starts from a text generation model that has been pre-trained on linguistic characteristics using general text from an unlabeled dataset, post-trains it to remove target word noise using the input text and output text of a specific task, and then fine-tunes the post-trained model. The generation unit generates result text by applying the model trained by the training unit to new input text.
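
The division of labor between the two units can be sketched in Python as follows. This is only a structural outline under stated assumptions: `TextGenerationModel` is a stand-in that records which stages it has passed through instead of doing real gradient updates, and the class names, the `add_noise` callable, and all method names are hypothetical rather than taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (input text, output text) of a specific task

@dataclass
class TextGenerationModel:
    """Stand-in for an already pre-trained model (assumption)."""
    stages: List[str] = field(default_factory=lambda: ["pre-training"])

    def update(self, stage: str, pairs: List[Pair]) -> None:
        # A real model would run training steps on `pairs` here.
        self.stages.append(stage)

    def generate(self, text: str) -> str:
        # Placeholder output; a real model would decode result text.
        return f"<generated for: {text}>"

class TrainingUnit:
    """Post-trains with target word denoising, then fine-tunes."""

    def __init__(self, add_noise: Callable[[Pair], Pair]):
        self.add_noise = add_noise  # hypothetical noising function

    def train(self, model: TextGenerationModel, task_pairs: List[Pair]) -> TextGenerationModel:
        denoising_pairs = [self.add_noise(p) for p in task_pairs]
        model.update("target-word-denoising", denoising_pairs)  # post-training
        model.update("fine-tuning", task_pairs)                 # task fine-tuning
        return model

class GenerationUnit:
    """Applies the trained model to new input text."""

    def __init__(self, model: TextGenerationModel):
        self.model = model

    def generate(self, new_input: str) -> str:
        return self.model.generate(new_input)

# Trivial noising function for the demo: mask the whole output text.
trainer = TrainingUnit(add_noise=lambda pair: (pair[0] + " <mask>", pair[1]))
model = trainer.train(TextGenerationModel(), [("question", "answer")])
print(model.stages)  # ['pre-training', 'target-word-denoising', 'fine-tuning']
print(GenerationUnit(model).generate("new question"))
```

The point of the sketch is the ordering: the same task pairs feed both the denoising stage and the fine-tuning stage, and the generation unit only ever sees the model after both have run.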