Document chunking method and system therefor
By chunking documents based on hierarchical structure and context, the method generates high-quality passages, enhancing the accuracy of generative language models in question-answering services.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- POSICUBE CO LTD
- Filing Date
- 2024-12-26
- Publication Date
- 2026-07-02
AI Technical Summary
Existing document chunking methods split text at fixed token-length limits without regard to document context, which produces broken passages and reduces the accuracy of generative language models.
The method and system chunk documents according to their hierarchical structure: the text is preprocessed so that context is maintained, and passages are generated within a preset maximum passage length so that each passage stays semantically consistent and complete.
The method generates high-quality passages that make passage databases more usable and improve the accuracy of generative language models in question-answering services.
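The general idea of hierarchy-aware chunking can be illustrated with a minimal sketch. This is not the patented implementation. It assumes, purely for illustration, that hierarchy is marked by `#`-style headings, that passage length is counted in whitespace-separated words, and that context is kept by prefixing each passage with its heading path. The names `parse_sections`, `chunk_document`, and `max_tokens` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumption: '#'-style headings mark the document hierarchy.
HEADING = re.compile(r"^(#+)\s+(.*)$")


def parse_sections(text: str) -> List[Tuple[List[str], List[str]]]:
    """Return (heading_path, paragraphs) pairs in document order."""
    sections: List[Tuple[List[str], List[str]]] = []
    path: List[str] = []
    paragraphs: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADING.match(line)
        if match:
            # A new heading closes the section collected so far.
            if paragraphs:
                sections.append((list(path), paragraphs))
                paragraphs = []
            level = len(match.group(1))
            # Keep ancestors above this level, then add this heading.
            path = path[: level - 1] + [match.group(2)]
        else:
            paragraphs.append(line)
    if paragraphs:
        sections.append((list(path), paragraphs))
    return sections


def _format(path: List[str], parts: List[str]) -> str:
    body = " ".join(parts)
    return f"[{' > '.join(path)}] {body}" if path else body


def chunk_document(text: str, max_tokens: int = 50) -> List[str]:
    """Pack whole paragraphs into passages without crossing section borders.

    Assumptions of this sketch: length is counted in whitespace-separated
    words, the heading prefix is not counted, and a single paragraph longer
    than max_tokens is kept whole rather than split.
    """
    passages: List[str] = []
    for path, paragraphs in parse_sections(text):
        current: List[str] = []
        used = 0
        for para in paragraphs:
            n = len(para.split())
            if current and used + n > max_tokens:
                passages.append(_format(path, current))
                current, used = [], 0
            current.append(para)
            used += n
        if current:
            passages.append(_format(path, current))
    return passages
```

Because passages never span two sections and each one carries its heading path, a retrieved passage still indicates where in the document it came from, which a fixed-length splitter cannot guarantee. A production system would likely count tokens with the target model's tokenizer instead of splitting on whitespace.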