Document processing device, document processing method, and storage medium recording program therefor

A document processing technology, applied in the fields of electrical digital data processing, special data processing applications, natural language data processing, etc., that addresses the problem that a specified accuracy level cannot be achieved.

Status: Inactive
Publication Date: 2006-03-22
FUJIFILM BUSINESS INNOVATION CORP
Cites: 0; Cited by: 2

AI Technical Summary

Problems solved by technology

[0005] However, the techniques disclosed above specify the title of a document based on the presence or absence of formatting (such as underlining) that is unrelated to the meaning of the character strings contained in the paper document to be digitized, or on whether other character strings are present within specified distances. Both criteria are prone to judgment errors, so these techniques cannot achieve a level of accuracy high enough to be practicable.

Examples

Embodiment Construction

[0018] Embodiments according to the present invention will be described below with reference to the drawings.

[0019]

[0020] FIG. 1 is an exemplary block diagram showing the configuration of a document digitization system 10 that includes a document processing apparatus 110 according to a first embodiment of the present invention. The image reading device 120 in FIG. 1 is, for example, a scanner device equipped with an ADF (Auto Document Feeder) or another type of automatic paper feeding mechanism. It reads paper documents placed in the ADF one page at a time and, via a communication line 130 such as a LAN (Local Area Network), transmits document image data corresponding to the read image to the document processing apparatus 110. Note that although this embodiment describes the case where the communication line 130 is a LAN, the line may of course be a WAN (Wide Area Network), the Internet, or the like. It should also be noted that although the case in whi...

Abstract

A document processing device, a document processing method, and a storage medium recording a program therefor. The invention provides a document processing device including: a memory that stores syntax data expressing the syntax of character strings whose probability of being a title of a document is high, or of character strings whose probability of being a title of a document is low; an input unit that inputs document data obtained by digitizing a document; an extraction unit that analyzes the input document data and extracts character string data expressing character strings; a syntax analyzing unit that analyzes the extracted character string data and specifies the syntax of each character string contained in the document corresponding to the document data; and a specifying unit that specifies, from among the extracted character string data, the character string data expressing a title of the document corresponding to the document data, based on the results of specification by the syntax analyzing unit and the content stored in the memory.
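The units named in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: the syntax labels `noun_phrase` and `sentence`, their scores, the verb list, and all function names are hypothetical, and a crude heuristic stands in for a real syntactic parser. The input is assumed to be text that has already been recognized from the scanned image, one character string per line.

```python
from typing import Optional

# "Memory": syntax labels mapped to an assumed title likelihood.
# Labels and scores are hypothetical placeholders, not values from the patent.
SYNTAX_MEMORY = {
    "noun_phrase": 1.0,  # assumed: high probability of being a title
    "sentence": 0.0,     # assumed: low probability of being a title
}

# Toy verb list standing in for a real syntactic parser (assumption).
FINITE_VERBS = {"is", "are", "was", "were", "has", "have", "will"}


def extract_strings(document_data: str) -> list[str]:
    """Extraction unit: one character string per non-empty line."""
    return [line.strip() for line in document_data.splitlines() if line.strip()]


def analyze_syntax(string: str) -> str:
    """Syntax analyzing unit: assign a crude syntax label to one string."""
    words = {w.strip(".,;:!?").lower() for w in string.split()}
    if string.endswith((".", "?", "!")) or words & FINITE_VERBS:
        return "sentence"
    return "noun_phrase"


def specify_title(document_data: str) -> Optional[str]:
    """Specifying unit: pick the string whose syntax scores highest in memory."""
    strings = extract_strings(document_data)
    if not strings:
        return None
    # max() keeps the first of several equally scored strings,
    # so earlier lines win ties.
    return max(strings, key=lambda s: SYNTAX_MEMORY[analyze_syntax(s)])


sample = (
    "We are pleased to send the attached report.\n"
    "Quarterly Sales Report\n"
    "Sales were higher than in the previous quarter."
)
print(specify_title(sample))
```

In this sketch the decision depends only on the syntax of each string, not on underlining or distance to neighboring strings, which is the distinction the abstract draws. A practical system would replace `analyze_syntax` with a proper parser and a richer set of stored syntax patterns.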

Description

Technical field

[0001] The present invention relates to a technique for digitizing a paper document, and in particular to a technique for specifying a title according to the content of a paper document.

Background art

[0002] Paper documents (hereinafter also referred to as "documents") are excellent media for transferring and recording information, but they necessarily come with problems, including the need for storage space such as archives. In addition, when information is recorded in paper documents and saved, and that information is required later, it is necessary to find the paper documents in which the desired information is recorded among a large number of paper documents kept in archives and the like. In other words, recording and saving information in paper documents is not ideal from the standpoint of operational efficiency.

[0003] In such cases, paper documents are usually digitized and stored. Specifically, images corresponding to each pa...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00
CPC: G06F17/271, G06F40/211
Inventor: 增市博, 刘绍明, 田宗道弘, 田川昌俊, 田代洁, 伊滕笃, 石川恭辅, 佐藤直子
Owner: FUJIFILM BUSINESS INNOVATION CORP