Document determination system, document determination device, and document determination method
The document determination system improves OCR accuracy by using word vectors and positional information to identify document types, addressing the challenge of varying content in forms.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Patents
- Current Assignee / Owner
- NTT DATA GROUP CORP
- Filing Date
- 2022-07-21
- Publication Date
- 2026-06-19
AI Technical Summary
Existing OCR technologies struggle to identify a document's type from its character information. The problem is most acute when the recognized content mixes predefined format items with user-generated entries that vary significantly, which makes different types of forms hard to distinguish.
The document determination system uses character recognition, word vectors, and positional information to identify document types. It calculates the probability that a word vector occurs in each predefined area and compares these probabilities against predefined document information, which improves the accuracy of OCR.
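As a rough illustration of the probability comparison described above, the following Python sketch buckets recognized words by page area and by nearest word-vector class, then scores the resulting occurrence probabilities against stored per-type reference values. It is a minimal sketch under stated assumptions, not the method claimed in the patent: the toy embeddings, the prototype classes, the two-area page split, the histogram-intersection score, and all names (`EMBEDDINGS`, `PROTOTYPES`, `DOCUMENT_TYPES`, `determine_document_type`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy embeddings; a real system would use a trained word-vector model.
EMBEDDINGS = {
    "invoice": (1.0, 0.0),
    "bill": (0.9, 0.1),
    "total": (0.0, 1.0),
    "amount": (0.1, 0.9),
    "application": (-1.0, 0.0),
    "applicant": (-0.9, 0.1),
}

# Hypothetical prototype vectors used to bucket word vectors into discrete classes.
PROTOTYPES = {"billing": (1.0, 0.0), "sum": (0.0, 1.0), "applying": (-1.0, 0.0)}

# Hypothetical predefined document information:
# expected class probabilities per page area for each document type.
DOCUMENT_TYPES = {
    "invoice": {"top": {"billing": 1.0}, "bottom": {"sum": 1.0}},
    "application_form": {"top": {"applying": 1.0}},
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_prototype(vec):
    """Name of the prototype class whose vector is most similar to vec."""
    return max(PROTOTYPES, key=lambda name: cosine(vec, PROTOTYPES[name]))


def area_of(x, y):
    """Map normalized page coordinates (0..1) to a predefined area (assumed split)."""
    return "top" if y < 0.5 else "bottom"


def occurrence_probabilities(words):
    """words: iterable of (text, x, y). Returns {area: {class: probability}}."""
    counts = defaultdict(Counter)
    for text, x, y in words:
        vec = EMBEDDINGS.get(text.lower())
        if vec is None:
            continue  # skip out-of-vocabulary words, e.g. user-entered values
        counts[area_of(x, y)][nearest_prototype(vec)] += 1
    return {
        area: {cls: n / sum(ctr.values()) for cls, n in ctr.items()}
        for area, ctr in counts.items()
    }


def similarity(observed, reference):
    """Histogram intersection between observed and reference, summed over areas."""
    score = 0.0
    for area, ref in reference.items():
        obs = observed.get(area, {})
        score += sum(min(p, obs.get(cls, 0.0)) for cls, p in ref.items())
    return score


def determine_document_type(words):
    """Return the document type whose reference probabilities best match the page."""
    observed = occurrence_probabilities(words)
    return max(DOCUMENT_TYPES, key=lambda t: similarity(observed, DOCUMENT_TYPES[t]))


# Example: OCR output as (word, x, y) with normalized coordinates.
page = [("Invoice", 0.1, 0.1), ("Taro", 0.5, 0.3), ("Total", 0.2, 0.9), ("Amount", 0.3, 0.9)]
result = determine_document_type(page)
```

In this sketch, the user-entered word "Taro" has no embedding and is ignored, so only the format items contribute to the per-area probabilities. Any other comparison measure could be substituted for the histogram intersection; the source does not specify one.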
The system identifies document types by recognizing words and converting them into word vectors, which improves OCR accuracy by distinguishing between different document formats and user-generated content.