Information processing device, information processing system, information processing method, and program

The information processing device automates file naming and indexing for document images using machine learning, reducing user labor by automatically generating file names and search indexes.

JP2026105546A · Pending · Publication Date: 2026-06-26 · CANON KK

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
CANON KK
Filing Date
2024-12-16
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing methods require manual registration of document images with different character string layouts, increasing user labor for generating file names.

Method used

An information processing device with character recognition, acquisition, extraction, and generation means to automatically generate file names and search indexes from document images using machine learning models.

Benefits of technology

Reduces user effort in adding information to document images by automating the file naming and indexing process.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure 2026105546000001_ABST
Patent Text Reader

Abstract

[Problem] To reduce the effort required of users when adding information to document images. [Solution] The information processing device includes: character recognition means for performing character recognition processing on a document image; acquisition means for acquiring information indicating a first attribute for searching the document image and information indicating a second attribute for generating a file name; extraction means for extracting strings corresponding to the first attribute and strings corresponding to the second attribute from the group of strings obtained as a result of the character recognition processing; processing means for associating the strings corresponding to the first attribute with the document image as a search index; and generation means for generating a file name for the document image using the strings corresponding to the second attribute.

Description

Technical Field

[0001] This disclosure relates to processing of document images.

Background Art

[0002] There is a method of automatically generating information such as a file name to be assigned to a document image from the document image.

[0003] Patent Document 1 discloses a method of generating a file name or the like to be assigned to an input document image using the layout information of a registered document. In Patent Document 1, the layout of the character strings and information on the position where a predetermined character string appears in the registered document are registered as information on the registered document. The position of the predetermined character string is then acquired from the registered document whose character string layout is determined to match that of the input document image, and the file name of the input document image is generated using the character string located at the acquired position in the input document image.

Prior Art Documents

Patent Documents

[0004]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0005] However, in the method of Patent Document 1, every document image with a different character string layout must be registered as a registered document, and information on the position of the character string used for the file name must be registered for each registered document. For this reason, if information such as a file name to be assigned to a document image is to be generated automatically by the method of Patent Document 1, the user's labor in registering the registered documents increases.

Means for Solving the Problems

[0006] The information processing device of the present disclosure is characterized by comprising: character recognition means for performing character recognition processing on a document image; acquisition means for acquiring information indicating a first attribute for searching the document image and information indicating a second attribute for generating a file name; extraction means for extracting strings corresponding to the first attribute and strings corresponding to the second attribute from a group of strings obtained as a result of the character recognition processing; processing means for associating the strings corresponding to the first attribute with the document image as a search index; and generation means for generating a file name of the document image using the strings corresponding to the second attribute.

Effects of the Invention

[0007] According to this disclosure, it is possible to reduce the effort required of users when adding information to document images.

Brief Description of the Drawings

[0008]
[Figure 1] A diagram showing an example of the configuration of an information processing system.
[Figure 2] A diagram showing an example of the hardware configuration of each device that makes up an information processing system.
[Figure 3] Sequence diagram of an information processing system.
[Figure 4] A flowchart detailing the process of generating a pre-trained model.
[Figure 5] A diagram showing an example of the file save settings screen.
[Figure 6] A diagram showing an example of the search index settings screen.
[Figure 7] A diagram showing an example of file saving rules and indexing rules.
[Figure 8] A flowchart illustrating the process of saving document image files.
[Figure 9] A diagram showing an example of a document image.
[Figure 10] A diagram showing an example of a file save information confirmation screen.
[Figure 11] A diagram showing an example of a search index confirmation screen.
[Figure 12] A flowchart illustrating the process of saving document image files.
[Figure 13] A diagram showing an example of an item information confirmation screen.

Modes for Carrying Out the Invention

[0009] Embodiments of the technology described herein will be explained below with reference to the drawings. Note that the components described in the following embodiments are illustrative and are not intended to limit the scope of the technology described herein.

[0010] <First Embodiment> [Configuration of the Information Processing System] Figure 1 is a diagram showing an example configuration of an information processing system. As shown in Figure 1, the information processing system 100 includes, for example, an image forming apparatus 110, a learning apparatus 120, and an information processing server 130, which are connected to each other via a network 104. The dotted arrows in Figure 1 indicate the flow of data transmitted and received via the network 104. In the information processing system 100, multiple units of each of the image forming apparatus 110, the learning apparatus 120, and the information processing server 130 may be connected to the network 104, rather than a single unit of each. For example, the information processing server 130 may consist of a first server device with high-speed computing resources and a second server device with large-capacity storage, which are connected to each other via the network 104.

[0011] The image forming apparatus 110 is an information processing device that can be implemented, for example, as an MFP (Multi-Function Peripheral) equipped with multiple functions such as printing, scanning, and faxing. The image forming apparatus 110 has at least an image acquisition unit 119 as a functional unit.

[0012] The image forming apparatus 110 has a scanner device 206 (see Figure 2). The scanner device 206 optically reads a document 111, which is a hard copy document having a character string printed on a recording medium such as paper, and the image acquisition unit 119 performs predetermined image processing on the data obtained as a result to acquire a document image 113, which is a scanned image. Alternatively, for example, the image acquisition unit 119 receives FAX data 112 transmitted from a FAX transmitter (not shown) and performs predetermined FAX image processing to acquire the document image 113. The image acquisition unit 119 transmits the acquired document image 113 to the information processing server 130.

[0013] Note that the image forming apparatus 110 is not limited to an MFP equipped with scanning and FAX functions and may be realized by a PC (Personal Computer) or the like. Specifically, for example, the image forming apparatus 110, which is a PC, may transmit a document image 113 such as a PDF or JPEG generated using a document creation application operating on the PC to the information processing server 130. Image formation is assumed to include generating image data.

[0014] The learning apparatus 120 has functional units of a learning data generation unit 121 and a learning unit 122. The learning data generation unit 121 generates learning data based on a plurality of document image samples 114. Details will be described later.

[0015] The learning unit 122 performs machine learning on a learning model using the learning data generated by the learning data generation unit 121 to generate a trained model. In the present embodiment, the learning unit 122 generates a document type determination model 115a, which is a trained model that outputs information indicating the type (document type) of the document shown in a document image to be processed, such as an invoice, a quotation, a purchase order, or a delivery note. The learning unit 122 also generates an item information extraction model 115b, which is a trained model that outputs information indicating a character string (also referred to as item information) corresponding to an attribute (also referred to as an item) of a character string included in a document image, such as a title, a document number, an issue date, a company name, or an amount.

[0016] The learning device 120 transmits the generated trained model 115 to the information processing server 130 via the network 104. The learning device 120 transmits the trained model 115 before the image forming apparatus 110 transmits the document image 113, and the trained model 115 is stored on the information processing server 130. Because the trained model 115 is stored on the information processing server 130, the learning device 120 may transmit it again only when the trained model 115 has been updated.
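
The update-only transmission described in [0016] could be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `ModelStore` class, the `send_if_updated` helper, the use of a SHA-256 digest to detect a changed model, and the byte-string model payloads are all assumptions introduced here.

```python
import hashlib


class ModelStore:
    """Hypothetical stand-in for the server-side storage of trained models."""

    def __init__(self):
        self._models = {}   # model name -> serialized model bytes
        self._digests = {}  # model name -> SHA-256 hex digest of the stored bytes

    def needs_update(self, name: str, payload: bytes) -> bool:
        # Assumption: a differing digest means the model has been updated.
        return self._digests.get(name) != hashlib.sha256(payload).hexdigest()

    def store(self, name: str, payload: bytes) -> None:
        self._models[name] = payload
        self._digests[name] = hashlib.sha256(payload).hexdigest()


def send_if_updated(store: ModelStore, name: str, payload: bytes) -> bool:
    """Transmit the model only when it differs from the stored copy.

    Returns True when a transmission (here, a local store call) took place.
    """
    if store.needs_update(name, payload):
        store.store(name, payload)
        return True
    return False
```

With this sketch, the first call for a given model name transmits the model, a repeated call with identical bytes does nothing, and a call with changed bytes transmits again.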

[0017] The information processing server 130 has an information processing unit 131 and a data management unit 132.

[0018] The information processing unit 131 receives the document image 113 transmitted from the image forming apparatus 110, performs optical character recognition (OCR) processing on the document image 113, and obtains a group of character strings recognized from the document image 113. The information processing unit 131 uses the document type determination model 115a to obtain a document type determination result 117 indicating which of the preset document types, such as invoice, quotation, purchase order, or delivery note, the document belongs to, and thereby determines the document type. The information processing unit 131 also uses the item information extraction model 115b to obtain, as an item information extraction result 118, item information from the group of character strings recognized from the document image 113, that is, the character strings corresponding to items such as title, document number, issue date, company name, and amount.

[0019] The information processing unit 131 also functions as an acquisition unit to obtain attribute information, an extraction unit to extract strings corresponding to a certain attribute, a generation unit to generate a file name for a document image, and a generation unit to generate a save location for a document image. The information processing unit 131 also functions as a processing unit that assigns a file name to a document image, saves a search index associated with the document image, and saves the document image to the save location. The information processing unit 131 generates the file name using the strings from the document type determination result 117 and the item information extraction result 118. Details will be described later.
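
As a rough sketch of the roles listed in [0019], the snippet below builds a file name from the strings for one set of attributes and a search index from the strings for another set. The attribute keys, the underscore separator, the `.pdf` extension, and the function names are hypothetical choices made for illustration; the disclosure does not specify them. Sanitizing characters that are invalid in file names is left out.

```python
def extract_by_attributes(items: dict, attributes: list) -> list:
    """Return the extracted strings for the requested attributes, in order.

    `items` maps an attribute (item) name to the string extracted for it.
    Attributes for which no string was extracted are skipped.
    """
    return [items[a] for a in attributes if a in items]


def build_file_name(items: dict, name_attributes: list,
                    separator: str = "_", extension: str = ".pdf") -> str:
    """Join the strings for the file-name attributes into a file name."""
    return separator.join(extract_by_attributes(items, name_attributes)) + extension


def build_search_index(items: dict, index_attributes: list) -> dict:
    """Collect the strings for the search attributes as an index record."""
    return {a: items[a] for a in index_attributes if a in items}


# Hypothetical item information extracted from one document image.
items = {
    "title": "Invoice",
    "company_name": "Example Corp",
    "issue_date": "2024-12-16",
    "amount": "1200",
}
file_name = build_file_name(items, ["title", "company_name"])
search_index = build_search_index(items, ["issue_date", "amount", "document_number"])
```

Keeping the two attribute lists separate mirrors the distinction between the attribute used for searching and the attribute used for the file name.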

[0020] The data management unit 132 stores the document image file 116 generated from the document image 113 received from the image forming apparatus 110. The data management unit 132 also stores the file saving rules and indexing rules, which will be described later, in association with the document type.

[0021] Network 104 is implemented using a LAN or WAN and is a communication unit that connects the image forming apparatus 110, the learning apparatus 120, and the information processing server 130 to each other so that data can be sent and received between the devices.

[0022] [Hardware configuration of image forming apparatus] Figure 2 shows an example of the hardware configuration of the image forming apparatus 110, learning apparatus 120, and information processing server 130 included in the information processing system 100.

[0023] Figure 2(a) shows the hardware configuration of the image forming apparatus 110. The image forming apparatus 110 includes a CPU 201, ROM 202, RAM 204, printer device 205, scanner device 206, document transport device 207, storage 208, input device 209, display device 210, and external interface 211. Each part is connected to the others via a data bus 203.

[0024] The CPU 201 is a control unit that controls the overall operation of the image forming apparatus 110. The CPU 201 starts the system of the image forming apparatus 110 by executing a startup program stored in the ROM 202, and then executes a control program stored in the storage 208 to realize the functions of the image forming apparatus 110, such as printing, scanning, and faxing.

[0025] ROM 202 is a storage unit implemented with non-volatile memory that stores the startup program for starting the image forming apparatus 110. Data bus 203 is a communication unit for sending and receiving data between devices constituting the image forming apparatus 110. RAM 204 is a storage unit implemented with volatile memory that is used as work memory when the CPU 201 executes the control program. Storage 208 is a storage unit implemented with an HDD (Hard Disk Drive) or the like that stores the aforementioned control program and document images.

[0026] The printer device 205 is an image output device that prints document images onto a recording medium such as paper. The scanner device 206 is an image input device that optically reads a recording medium such as paper on which text, diagrams, etc., are printed. The data obtained by the scanner device 206 is acquired as a document image. The document transport device 207 is implemented as an ADF (Auto Document Feeder), etc., and detects documents placed on the document glass and transports the detected documents one by one to the scanner device 206.

[0027] The input device 209 is an operation unit implemented as a touch panel or hard keys, which receives operation input from a user using the image forming apparatus 110. The display device 210 is a display unit implemented as a liquid crystal display, which displays the settings screen of the image forming apparatus 110 to the user. The CPU 201 also functions as a display control unit that controls the screen displayed on the display device 210. Furthermore, the screen displayed on the display device 210 may be displayed based on information transmitted and processed by the CPU 261 of the information processing server 130. For this reason, the CPU 261 of the information processing server 130 also functions as a display control unit.

[0028] The external interface 211 is an interface that connects the image forming apparatus 110 and the network 104, and is used to receive fax data from a fax transmitter (not shown) and to send document images to the information processing server 130.

[0029] [Hardware configuration of the learning device] Figure 2(b) shows the hardware configuration of the learning device 120. As shown in Figure 2(b), the learning device 120 has a CPU 231, ROM 232, RAM 234, storage 235, input device 236, display device 237, external interface 238, and GPU 239, and each part is connected to the others via a data bus 233.

[0030] The CPU 231 is a control unit that controls the entire operation of the learning device 120. The CPU 231 starts the system of the learning device 120 by executing a boot program stored in the ROM 232. The CPU 231 then generates a trained model 115 for document type determination or item information extraction by executing a training program stored in the storage 235.

[0031] ROM 232 is a storage unit implemented with non-volatile memory that stores the boot program for starting the learning device 120. Data bus 233 is a communication unit for sending and receiving data between devices that make up the learning device 120. RAM 234 is a storage unit implemented with volatile memory that is used as work memory when the CPU 231 executes the training program. Storage 235 is a storage unit implemented with an HDD (Hard Disk Drive) or the like that stores the aforementioned training program and document images.

[0032] The input device 236 is an operation unit implemented as a mouse or keyboard, which receives operation input from an engineer controlling the learning device 120. The display device 237 is a display unit implemented as a liquid crystal display, which displays the settings screen of the learning device 120 to the engineer. The external interface 238 is an interface that connects the learning device 120 to the network 104, and receives document image samples 114 from the outside and sends trained models 115 to the information processing server 130.

[0033] The GPU 239 is an arithmetic unit composed of image processing processors. For example, the GPU 239 performs calculations to generate a trained model 115 based on a set of strings contained in a given document image, according to control commands given by the CPU 231.

[0034] Each functional unit included in the learning device 120 shown in Figure 1 is realized by the CPU 231 executing a predetermined program, but is not limited to this. Other hardware, such as a GPU 239 for speeding up calculations or an FPGA (Field Programmable Gate Array) (not shown), may also be used. Each functional unit may be realized through the cooperation of software and hardware such as dedicated ICs, or some or all of the functions may be realized by hardware alone.

[0035] [Hardware configuration of the information processing server] Figure 2(c) shows the hardware configuration of the information processing server 130. As shown in Figure 2(c), the information processing server 130 consists of a CPU 261, ROM 262, RAM 264, storage 265, input device 266, display device 267, and external interface 268, which are connected to each other via a data bus 263.

[0036] The CPU 261 is a control unit that controls the overall operation of the information processing server 130. The CPU 261 starts up the information processing server 130 system by executing a boot program stored in the ROM 262, and performs information processing such as optical character recognition (OCR) and information extraction by executing information processing programs stored in the storage 265.

[0037] ROM 262 is a storage unit implemented with non-volatile memory that stores the boot program for starting the information processing server 130. Data bus 263 is a communication unit for sending and receiving data between devices that make up the information processing server 130. RAM 264 is a storage unit implemented with volatile memory that is used as work memory when the CPU 261 executes the information processing program. Storage 265 is a storage unit implemented with an HDD (Hard Disk Drive) or the like that stores the aforementioned information processing program, trained model 115, document image file 116, document type determination result 117, item information extraction result 118, etc.

[0038] The input device 266 is an operation unit implemented as a mouse and keyboard, etc., which receives operation input to the information processing server 130 from users or engineers using the information processing server 130. The display device 267 is a display unit implemented as a liquid crystal display, etc., which displays the settings screen of the information processing server 130, etc., to users or engineers using the information processing server 130.

[0039] The external interface 268 is an interface that connects the information processing server 130 and the network 104, and receives the trained model 115 from the learning device 120 and the document image 113 from the image forming apparatus 110.

[0040] Each functional unit included in the information processing server 130 in Figure 1 is realized by the CPU 261 executing a predetermined program, but is not limited to this. Other hardware, such as a GPU (Graphics Processing Unit) or FPGA (Field Programmable Gate Array), may also be used to speed up calculations. Each functional unit may be realized through the cooperation of software and hardware such as dedicated ICs, or some or all of the functions may be realized by hardware alone.

[0041] [Sequence for generating a trained model] Figure 3 shows the usage sequence of the information processing system 100 shown in Figure 1. The symbol "S" in the description of each process indicates a step in the sequence, and the same applies to subsequent flowcharts. For ease of explanation, user or engineer operations are also described using steps.

[0042] Figure 3(a) shows the development flow by engineers for the document type determination model 115a and the item information extraction model 115b.

[0043] In S301, the engineer involved with the information processing system 100 inputs multiple document image samples 114, which are sample images representing documents, into the learning device 120. The document image samples 114 are document images corresponding to document types such as invoices, quotations, purchase orders, and delivery notes.

[0044] In S302, the learning data generation unit 121 of the learning device 120 generates first learning data based on the document image sample 114, and the learning unit 122 generates a document type determination model 115a by performing machine learning using the first learning data.

[0045] In S303, the learning device 120 transmits the generated document type determination model 115a to the information processing server 130. The data management unit 132 of the information processing server 130 stores the document type determination model 115a in the storage 265.

[0046] In S304, the learning data generation unit 121 of the learning device 120 generates second learning data based on the document image sample 114, and the learning unit 122 generates an item information extraction model 115b by performing machine learning using the second learning data.

[0047] In S305, the learning device 120 transmits the generated item information extraction model 115b to the information processing server 130. The data management unit 132 of the information processing server 130 saves the item information extraction model 115b in the storage 265. Details of S302 to S305 in Figure 3(a) will be described later with reference to Figure 4.

[0048] In S306, the engineer registers the default settings for file saving rules with the information processing server 130.

[0049] In S307, the engineer registers the default settings for the indexing rules with the information processing server 130.

[0050] The information processing server 130 saves the default settings for file saving rules and indexing rules (collectively referred to as file saving rules) based on the information entered by the engineer. These default settings are applied when a user uses the information processing system 100 without setting the file saving rules and indexing rules in advance. As described later, users can change the default settings for file saving rules and indexing rules to their desired settings.
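
The fallback from user custom settings to the engineer's default settings can be pictured as a simple lookup keyed by document type. The rule contents, the key names, and the `resolve_rules` function below are hypothetical; the actual rule format is the one shown in Figure 7, which is not reproduced here.

```python
# Hypothetical default rules registered by the engineer (S306, S307),
# stored in association with the document type.
DEFAULT_RULES = {
    "invoice": {
        "file_name": ["title", "company_name"],
        "index": ["issue_date", "amount"],
    },
    "quotation": {
        "file_name": ["title", "document_number"],
        "index": ["company_name"],
    },
}


def resolve_rules(document_type: str, user_rules: dict, default_rules: dict) -> dict:
    """Return the user's custom rule for the document type if one exists,
    otherwise fall back to the default rule for that document type."""
    if document_type in user_rules:
        return user_rules[document_type]
    return default_rules[document_type]


# A user who customized only the invoice rules still gets defaults elsewhere.
user_rules = {"invoice": {"file_name": ["company_name", "issue_date"], "index": ["amount"]}}
invoice_rule = resolve_rules("invoice", user_rules, DEFAULT_RULES)
quotation_rule = resolve_rules("quotation", user_rules, DEFAULT_RULES)
```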

[0051] [Sequence of processes for saving document images] Figure 3(b) is a diagram illustrating the process flow for assigning a file name and a search index to a document image 113 acquired by the image forming apparatus 110 in accordance with user instructions, and saving the file to a predetermined storage location.

[0052] In S311, the user configures custom file saving rules.

[0053] In S312, the user configures custom indexing rules. The configuration screens for user configuration in S311 and S312 will be described later. The data management unit 132 of the information processing server 130 saves the file saving rules and indexing rules entered by the user as user custom settings for the information processing system 100.

[0054] In S313, the user places a paper document (original) into the image forming apparatus 110 and instructs the image forming apparatus 110 to perform a scan of the document.

[0055] In S314, the scanner device 206 of the image forming apparatus 110 reads the placed paper document, and the image acquisition unit 119 generates a document image, which is an image representing the scanned document. The image acquisition unit 119 then transmits the generated document image to the information processing server 130 as the document image to be processed. The information processing unit 131 of the information processing server 130 performs character recognition (OCR) processing on the received document image to be processed and obtains a group of character strings recognized from the document image.

[0056] In S315, the information processing unit 131 of the information processing server 130 inputs data of a group of strings from the document image to be processed into the document type determination model 115a, and uses the output result to determine the document type of the document image to be processed.

[0057] In S316, the information processing unit 131 of the information processing server 130 inputs data of strings recognized from the document image to be processed into the item information extraction model 115b, and extracts item information from the input strings that correspond to each item.

[0058] In S317, the information processing unit 131 of the information processing server 130 generates a file name and save location (referred to as file save information) based on the document type and item information.

[0059] In S318, the information processing unit 131 of the information processing server 130 extracts a string to be used as a search index based on the document type and item information.

[0060] In S319, the information processing unit 131 of the information processing server 130 displays the generated file name, save location, and string to be used as a search index as candidates to the user on the display device 210 of the image forming apparatus 110.

[0061] In S320, the user modifies the file name, save location, and string to be assigned as a search index via the input device 209 of the image forming apparatus 110. If the user gives an instruction not to modify the candidates displayed in S319, the file name, save location, and search index string generated by the information processing server 130 are adopted as specified.

[0062] In S321, the information processing unit 131 of the information processing server 130 assigns a user-specified file name and search index to the document image file to be processed, and saves the document image file to the user-specified save location.
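
The confirmation and saving steps S319 to S321 might be sketched as below: the generated candidates are held in one record, the user's edits overwrite only the fields that were changed, and the result is saved. The `SaveInfo` record, the in-memory `storage` dictionary, and the path format are assumptions for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SaveInfo:
    """Candidates generated by the server (S317, S318) and shown in S319."""
    file_name: str
    save_location: str
    search_index: tuple  # strings to be assigned as the search index


def apply_user_edits(candidate: SaveInfo, edits: dict) -> SaveInfo:
    """S320: overwrite only the fields the user changed.

    An empty `edits` dict corresponds to the user accepting the candidates.
    """
    return replace(candidate, **edits)


def save_document(storage: dict, image: bytes, info: SaveInfo) -> str:
    """S321: store the image under location/name together with its index."""
    path = f"{info.save_location.rstrip('/')}/{info.file_name}"
    storage[path] = {"image": image, "search_index": list(info.search_index)}
    return path


candidate = SaveInfo("Invoice_Example Corp.pdf", "/invoices/", ("2024-12-16",))
final = apply_user_edits(candidate, {"file_name": "inv-001.pdf"})
storage = {}
saved_path = save_document(storage, b"image-bytes", final)
```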

[0063] [Generating machine learning models] Figure 4 is a flowchart illustrating the details of the document type determination model 115a and item information extraction model 115b generation process shown in S302-S305 of Figure 3(a) executed by the learning device 120. The series of processes shown in the flowchart of Figure 4 will be explained as being performed by the CPU 231 of the learning device 120 loading program code stored in ROM 232 or storage 235 into RAM 234 and executing it. Some or all of the functions of the steps in Figure 4 may be executed by the GPU 239, or they may be implemented by hardware such as an ASIC or electronic circuit.

[0064] In S401, the CPU 231 acquires multiple document image samples entered by the engineer in S301 in Figure 3. For example, sample images of documents created with different layouts for each issuing company, such as invoices, quotations, and purchase orders, which are generally referred to as semi-standard forms, are acquired.

[0065] In S402, the CPU 231 performs block selection (BS) processing and optical character recognition (OCR) processing on the document image sample acquired in S401, and obtains a set of recognized strings from the document image sample.

[0066] Block selection (BS) processing is a process that selects block regions so that a document image is divided into units of objects that make up the document image, and determines the attributes of each block region. Specifically, it is a process that determines the attributes of, for example, characters, photographs, and diagrams, and divides the document image into block regions with different attributes. Block selection (BS) processing can be implemented using known region determination techniques. The data of the string group obtained as a result of OCR processing may be, for example, strings of words that make up the document image, which are arranged within the document image with spaces or lines separating them, and read sequentially in a predetermined reading order based on the arrangement information. Alternatively, the data of the string group obtained may be, for example, strings of words that have been divided using a morphological analysis method from the text that makes up the document image, and read sequentially in a predetermined reading order based on the arrangement information.
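
One way to read recognized words "sequentially in a predetermined reading order based on the arrangement information" is sketched below. The order used here (top to bottom by line, then left to right within a line), the `(text, x, y)` word format, and the `line_tolerance` parameter are assumptions; the disclosure does not fix a particular order.

```python
def reading_order(words, line_tolerance=10):
    """Order OCR words top to bottom, then left to right within each line.

    `words` is a list of (text, x, y) tuples, where (x, y) is the top-left
    corner of the word in pixels. Words whose y differs from the first word
    of the current line by at most `line_tolerance` are treated as one line.
    """
    ordered = sorted(words, key=lambda w: (w[2], w[1]))  # by y, then x
    lines = []
    for word in ordered:
        if lines and abs(word[2] - lines[-1][0][2]) <= line_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    result = []
    for line in lines:
        result.extend(text for text, _x, _y in sorted(line, key=lambda w: w[1]))
    return result


words = [("Corp", 120, 52), ("Invoice", 10, 10), ("Example", 10, 50), ("No.123", 200, 12)]
strings = reading_order(words)
```

The tolerance absorbs small vertical jitter so that words on the same printed line are not split apart by a few pixels of skew.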

[0067] In S403, the CPU 231 obtains a first ground truth label indicating the document type for each of the multiple document image samples acquired in S401. Document types include, for example, invoices, quotations, purchase orders, and delivery notes. The first ground truth label may be assigned manually by an engineer for each document image sample, or it may be obtained automatically by inputting the document image samples into a previously generated model that outputs document types. The CPU 231 generates first training data, which is machine learning data that pairs the group of strings recognized from a document image sample with the first ground truth label for that sample. First training data is generated for each of the multiple document image samples.
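
A minimal sketch of the first training data, assuming each record pairs the recognized strings of one sample with an integer id for its document type. The label names, the record layout, and the function name are hypothetical.

```python
# Hypothetical label set; the ids are simply positions in this tuple.
DOCUMENT_TYPES = ("invoice", "quotation", "purchase_order", "delivery_note")


def make_first_training_data(samples):
    """Pair each sample's recognized strings with its document type label.

    `samples` is an iterable of (recognized_strings, document_type) pairs.
    """
    records = []
    for strings, label in samples:
        if label not in DOCUMENT_TYPES:
            raise ValueError(f"unknown document type: {label!r}")
        records.append({
            "strings": list(strings),
            "label_id": DOCUMENT_TYPES.index(label),
        })
    return records


first_training_data = make_first_training_data([
    (["Invoice", "No.123", "Example Corp"], "invoice"),
    (["Quotation", "Q-42"], "quotation"),
])
```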

[0068] In S404, the CPU 231 obtains a second ground truth label indicating which item each string to be extracted from the string group obtained in S402 corresponds to. These items may include, for example, title, document number, issue date, company name, and total amount. The second ground truth label may be assigned manually by an engineer, or it may be assigned automatically by inputting the document image sample into a previously generated model that extracts item information. The CPU 231 then generates second training data in which the second ground truth label is assigned to each string that corresponds to an item among the string group recognized from the document image sample. Second training data is generated for each of the multiple document image samples.
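
The second training data can be pictured as a per-string labeling in which only the strings to be extracted carry an item label. In the sketch below, unlabeled strings are paired with `None`; that convention, the item names, and the annotation format (string index to item name) are assumptions made for illustration.

```python
# Hypothetical item names.
ITEMS = ("title", "document_number", "issue_date", "company_name", "amount")


def make_second_training_data(strings, annotations):
    """Attach an item label to each string that should be extracted.

    `annotations` maps the index of a string in `strings` to its item name.
    Strings without an annotation are paired with None (not extracted).
    """
    labeled = []
    for index, text in enumerate(strings):
        item = annotations.get(index)
        if item is not None and item not in ITEMS:
            raise ValueError(f"unknown item: {item!r}")
        labeled.append((text, item))
    return labeled


strings = ["Invoice", "No.", "123", "Total", "1,200"]
second_training_data = make_second_training_data(
    strings, {0: "title", 2: "document_number", 4: "amount"}
)
```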

[0069] In S405, the CPU 231 generates a document type determination model 115a by machine learning using the first training data. The document type determination model 115a is a trained model that, when data of the group of strings recognized from the document image to be processed is input, outputs information indicating the document type of that document image from among the trained document types. The generated document type determination model 115a is, for example, a trained model that has been trained to output a label corresponding to the first ground truth label. The document type determination model 115a is generated, for example, by training a prepared learning model so that it outputs a predetermined document type label as an inference result for the features of the input string group.

[0070] In S406, CPU231 generates an item information extraction model 115b by machine learning using the second training data. The item information extraction model 115b is a trained model that takes as input feature data of the string group contained in the document image to be processed and outputs information on the strings in that group that correspond to items. The generated item information extraction model 115b is, for example, a trained model that has been trained to output labels corresponding to the second ground truth labels. It is generated by training a prepared learning model so that, when feature data of a string group is input, it outputs item labels for strings that are to be extracted and outputs no label for strings that are not to be extracted.

[0071] Known methods can be used to generate the document type determination model 115a and the item information extraction model 115b. For example, the inputs can be feature vectors representing the features of each string, obtained with Word2Vec, fastText, BERT, XLNet, ALBERT, or the like, together with the position coordinates at which the string is placed in the document image. Specifically, for example, a single string can be converted into a 768-dimensional numeric feature vector by using a BERT language model pre-trained on general text (e.g., the full text of Wikipedia articles). The learning model can be a generally known machine learning algorithm such as logistic regression, a decision tree, a random forest, a support vector machine, or a neural network. Specifically, for example, one of the predetermined document type labels or item information labels can be output as the inference result according to the output values of a fully connected layer of a neural network that takes the feature vector output by the BERT language model as input.
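The last step of this paragraph, selecting a label from the output of a fully connected layer, can be sketched in plain Python as follows. This is a toy under stated assumptions: the feature vector has 3 dimensions instead of 768, the weights are hand-picked rather than learned, applying softmax to the layer output is an assumption, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fully_connected(features: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    biases: Sequence[float]) -> List[float]:
    """One output value per label: dot(weight_row, features) + bias."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(values: Sequence[float]) -> List[float]:
    """Turn layer outputs into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def infer_label(features: Sequence[float],
                weights: Sequence[Sequence[float]],
                biases: Sequence[float],
                labels: Sequence[str]) -> Tuple[str, List[float]]:
    """Return the label with the highest probability and all probabilities."""
    probabilities = softmax(fully_connected(features, weights, biases))
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best], probabilities


# Toy stand-in for a feature vector produced by a language model.
label, probs = infer_label(
    features=[1.0, 0.0, 2.0],
    weights=[[2.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
    biases=[0.0, 0.0],
    labels=["invoice", "quotation"],
)
```

In practice the weights and biases would come from training on the first or second training data.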

[0072] In S407, the CPU 231 sends the generated document type determination model 115a to the information processing server 130. The document type determination model 115a is then stored in the storage 265 within the information processing server 130.

[0073] In S408, the CPU 231 sends the generated item information extraction model 115b to the information processing server 130. The item information extraction model 115b is then stored in the storage 265 within the information processing server 130.

[0074] [Regarding setting file saving rules] In this embodiment, the information processing server 130 executes a process to generate a file name to be assigned to the document image (document image file generated from the document image) acquired by the image forming apparatus 110, using a string extracted from the document image. Furthermore, the information processing server 130 executes a process to generate the save location for the document image (document image file), using a string extracted from the document image.

[0075] The information processing server 130 generates filenames based on filename generation rules and save destinations based on save destination generation rules. The filename generation rules and save destination generation rules are collectively called file saving rules. File saving rules are set by an engineer or a user. In this embodiment, file saving rules are described as being set and saved for each document type, but a common file saving rule may be used for all document types.

[0076] Figure 5 shows an example of a file save settings screen with which users set file saving rules. For example, when a user instructs the image forming apparatus 110, via the input device 209, to display the file save settings screen 500, the information processing server 130 sends the necessary information to the image forming apparatus 110, and the file save settings screen 500 is displayed on the display device 210. When the user makes a setting via the file save settings screen 500 in S311, the setting is sent from the image forming apparatus 110 to the information processing server 130. As a result, the information processing server 130 saves the file saving rules set by the user.

[0077] The file save settings screen 500 includes a document type display area 501 that displays the type of document to which the settings apply, a file name generation rule input area 502, a save destination generation rule input area 503, and a confirmation button 504.

[0078] The currently saved file name generation rules are displayed in the file name generation rule input area 502, and the currently saved save destination generation rules are displayed in the save destination generation rule input area 503. If the user changes the file name generation rules or save destination generation rules via the file save settings screen 500, the changed file name generation rules or save destination generation rules are displayed. If the user has not made any changes, the file name generation rules or save destination generation rules set by the engineer are displayed.

[0079] Users can change the file name generation rule or the save destination generation rule by changing the string contained in the file name generation rule input area 502 or the save destination generation rule input area 503.

[0080] File name generation rules and save location generation rules consist of a combination of a predefined string and an item name. In the file name or save location generation process, the item name is replaced with the string corresponding to the item indicated by the item name. Item names included in file saving rules are enclosed in <> to distinguish them from the predefined string. For example, in the file name generation process, the string <item name> in the file name generation rule is replaced with the string corresponding to the item indicated by that item name to generate the file name.

[0081] For example, in the input area 502 of the file name generation rule in Figure 5, "Invoice" and "_" are predefined strings, and <Issue Date>, <Title>, and <Document Number> represent item names. These item names indicate items that are examples of items (second attribute) used to generate file names. Also, in the input area 503 of the save destination generation rule in Figure 5, <Company Name> represents an item name. This item name indicates an example of an item (third attribute) used to generate a save destination.
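The replacement of <item name> placeholders can be sketched as below. The function name apply_rule and the regular expression are assumptions for illustration. The embodiment does not say what happens when no string was extracted for an item, so leaving the placeholder untouched in that case is also an assumption.

```python
import re
from typing import Mapping

# An item name is any run of characters enclosed in < and >.
ITEM_PATTERN = re.compile(r"<([^<>]+)>")


def apply_rule(rule: str, item_info: Mapping[str, str]) -> str:
    """Replace each <item name> in the rule with the string for that item."""
    def substitute(match: "re.Match[str]") -> str:
        item_name = match.group(1)
        # Assumption: keep the placeholder when the item has no string.
        return item_info.get(item_name, match.group(0))

    return ITEM_PATTERN.sub(substitute, rule)


file_name = apply_rule(
    "Invoice_<Issue Date>_<Title>_<Document Number>",
    {
        "Issue Date": "20240605",
        "Title": "○○ Invoice",
        "Document Number": "BN0037",
    },
)
# file_name == "Invoice_20240605_○○ Invoice_BN0037"
```

The same function could be applied to a save destination generation rule, since both kinds of rule share the placeholder syntax.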

[0082] The item names are not limited to the names of items corresponding to strings extracted from the document image (referred to as "written information items"), such as title and company name. They may also be the names of items related to the process by which the image forming apparatus 110 acquires the document image (referred to as "input information items"), such as scan date and user name.

[0083] [Regarding the setting of indexing rules] In this embodiment, in order to improve the searchability of document images (document image files generated from document images) acquired by the image forming apparatus 110, the information processing server 130 performs a process of extracting and saving a string to be used as a search index. The search index is saved in association with the document image (document image file). Then, in the search process, the search index is used so that document images (document image files) associated with the search index that matches the search word are output as search results. If the search index is not saved, for example, document images with file names that at least partially match the search word will be output as search results. On the other hand, by saving the search index, the search index is used in addition to the file name for searching, thus improving searchability.
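The difference between searching by file name alone and searching with a saved search index can be sketched as follows. StoredFile and search are hypothetical names, and treating a match as a case-sensitive substring match for both the file name and the index is an assumption; the embodiment only states that the file name may match at least partially.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StoredFile:
    file_name: str
    search_index: List[str] = field(default_factory=list)


def search(files: List[StoredFile], word: str, use_index: bool = True) -> List[str]:
    """Return the names of files whose file name or search index matches the word."""
    hits = []
    for stored in files:
        in_name = word in stored.file_name
        in_index = use_index and any(word in entry for entry in stored.search_index)
        if in_name or in_index:
            hits.append(stored.file_name)
    return hits


files = [
    StoredFile("scan_0001.pdf", ["ABC Corporation", "BN0037"]),
    StoredFile("Invoice_20240605.pdf"),
]
with_index = search(files, "ABC")                      # found through the index
without_index = search(files, "ABC", use_index=False)  # file names only
```

Here the first file is found only when the index is consulted, because its file name does not contain the search word.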

[0084] The information processing server 130 extracts strings to be used as search indexes based on indexing rules. It then presents the extracted strings to the user as candidates for search indexes and, based on the user's instructions, saves the search indexes associated with document image files.

[0085] Indexing rules are set by an engineer or a user. In this embodiment, indexing rules are described as being set for each document type, but a common indexing rule may be set for all document types.

[0086] Figure 6 shows an example of a screen with which a user sets indexing rules. Figure 6(a) shows an example of a search index settings screen 600. The search index settings screen 600 is displayed, for example, on the display device 210 of the image forming apparatus 110. For example, when a user instructs the image forming apparatus 110, via the input device 209, to display the search index settings screen 600, the information processing server 130 sends the information necessary to display the search index settings screen 600 to the image forming apparatus 110. The image forming apparatus 110 then displays the search index settings screen 600 on the display device 210. When the user makes a setting via the search index settings screen 600 in S312, the setting is sent from the image forming apparatus 110 to the information processing server 130. As a result, the information processing server 130 saves the indexing rules set by the user.

[0087] The search index settings screen 600 includes a document type display area 601 that displays the document type to which the settings apply, an indexing rule setting area 602, a change button 603, and a confirm button 604.

[0088] The indexing rule setting area 602 displays the currently saved indexing rules. If the user previously modified the indexing rules via the search index settings screen 600, the modified indexing rules will be displayed. If the user has not made any changes, the indexing rules set by the engineer will be displayed. Users can modify the indexing rules by changing the items contained in the indexing rule setting area 602.

[0089] The indexing rules consist of item names corresponding to strings to be used as search indexes. If a user wants to change the item names included in the indexing rule setting area 602, the user presses the change button 603. When the change button 603 is pressed, the detailed settings screen 610 shown in Figure 6(b) is displayed on the display device 210.

[0090] The detailed settings screen 610 includes a list of item names indicating items that can be used as a search index, and a checkbox corresponding to each item name. In this embodiment, the items extracted as a search index are the input information items and the written information items mentioned above. Note that only one of these two kinds of items may be used instead. The input information item area 611 includes checkboxes for selecting item names indicating input information items such as scan date and user name. The written information item area 612 includes checkboxes for selecting item names indicating written information items such as title and company name. When the user presses the reset button 613, the selection state of the checkboxes on the detailed settings screen 610 returns to the default state.

[0091] When the user presses the confirm button 614, the item names corresponding to the checkboxes selected on the detailed settings screen 610 are reflected in the indexing rule setting area 602 of the search index settings screen 600. For example, in the written information item area 612, "Title," "Company Name," "Document Number," and "Issue Date" are selected, so the items indicated by these item names are set as the indexing rule. These items are examples of items (first attributes) for searching document images.

[0092] Furthermore, the items extracted as a search index are not limited to input information items or written information items; for example, items of storage information such as file name and file save location may also be included. In this case, the user may be allowed to select which items of storage information such as file name and file save location should be included in the indexing rules.

[0093] Figure 7 shows an example of file saving rules and indexing rules stored on the information processing server 130. Initially, the file saving rules and indexing rules set by the engineer are stored on the information processing server 130.

[0094] When the user presses the confirmation button 504 on the file save settings screen 500, the file name generation rule is changed to the content displayed in the file name generation rule input area 502, as shown in Figure 7. The save destination generation rule is likewise changed to the content displayed in the save destination generation rule input area 503. Furthermore, when the user presses the confirm button 604 on the search index settings screen 600, the indexing rule is changed to the content displayed in the indexing rule setting area 602, as shown in Figure 7.
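One way to picture the per-document-type rules held by the information processing server 130 is the small in-memory store below. The names FileRules and RuleStore, the method names, and the example save destination rule are assumptions for this sketch; the actual content of Figure 7 is not reproduced here.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class FileRules:
    file_name_rule: str
    save_destination_rule: str
    indexing_rule: List[str]  # item names used as search indexes


class RuleStore:
    """Holds one set of rules per document type, starting from engineer defaults."""

    def __init__(self, engineer_defaults: Dict[str, FileRules]) -> None:
        self._rules = dict(engineer_defaults)

    def get(self, document_type: str) -> FileRules:
        return self._rules[document_type]

    def set_file_saving_rules(self, document_type: str,
                              file_name_rule: str,
                              save_destination_rule: str) -> None:
        """Called when the settings on the file save settings screen are confirmed."""
        self._rules[document_type] = replace(
            self._rules[document_type],
            file_name_rule=file_name_rule,
            save_destination_rule=save_destination_rule,
        )

    def set_indexing_rule(self, document_type: str, item_names: Iterable[str]) -> None:
        """Called when the settings on the search index settings screen are confirmed."""
        self._rules[document_type] = replace(
            self._rules[document_type], indexing_rule=list(item_names)
        )


store = RuleStore({
    "Invoice": FileRules(
        file_name_rule="Invoice_<Issue Date>_<Title>_<Document Number>",
        save_destination_rule="<Company Name>",  # hypothetical rule text
        indexing_rule=["Title", "Company Name"],
    )
})
store.set_indexing_rule("Invoice", ["Title", "Company Name", "Document Number", "Issue Date"])
```

Updating one kind of rule leaves the other rules for that document type unchanged.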

[0095] [Saving document image files] Figure 8 is a flowchart illustrating the process by which the information processing server 130 saves the document image file received from the image forming apparatus 110. Specifically, Figure 8 is a flowchart that details steps S315 to S321 in Figure 3(b). The flowchart in Figure 8 begins when the information processing server 130 receives the document image transmitted from the image forming apparatus 110 in S314 of Figure 3(b). Before the start of the flowchart in Figure 8, the information processing server 130 has saved the document type determination model, item information extraction model, file saving rules, and indexing rules.

[0096] The series of processes shown in the flowchart of Figure 8 will be explained assuming that the CPU 261 of the information processing server 130 loads the program code stored in ROM 262 or storage 265 into RAM 264 and executes it. Some or all of the steps in Figure 8 may be implemented by hardware such as an ASIC or electronic circuit.

[0097] In S801, the CPU 261 acquires the document image 113 transmitted from the image forming apparatus 110 as the document image to be processed.

[0098] Figure 9 shows an example of a document image acquired in S801. As shown in Figure 9, the document image 113 includes strings 901 to 905 that indicate item information corresponding to predetermined items (predetermined attributes), such as title, issue date, document number, name of the issuing company, and total amount. In the explanation of each step in Figure 8, it will be assumed that the document image to be processed is the document image shown in Figure 9.

[0099] In S802, the CPU261 performs block selection (BS) processing and optical character recognition (OCR) processing, as described in S402, on the document image to be processed, and obtains the set of character strings contained in the document image to be processed.

[0100] In S803, the CPU 261 inputs the string information obtained in S802 into the document type determination model 115a stored in storage 265, and determines the document type of the document image to be processed based on the output result.

[0101] A group of strings is generally a sequence of string units called tokens, arranged according to a predetermined reading order. In this embodiment, the input group of strings is converted into a feature vector that represents the characteristics of the entire group, and the document type is determined from the output of the document type determination model, which is a multi-class classifier that takes the feature vector as input. In this embodiment, the document type determination model outputs a probability value (0 to 1) for each document type (invoice, quotation, purchase order, delivery note, and contract), and the document type is determined based on the output probability values. For example, the CPU 261 determines a document type whose probability value is greater than or equal to a predetermined threshold (0.9) to be the document type of the document image to be processed, but the method of determining the document type from the output is not limited to this. For example, the document type with the maximum probability value may be determined to be the document type of the document image to be processed. Furthermore, the method of determining the document type is not limited to using a trained model, and known techniques other than those of this embodiment may be used.
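The two decision methods mentioned here, a threshold of 0.9 and the maximum probability, can be sketched as follows. The function names are hypothetical, and returning None when no document type reaches the threshold is an assumption, since the embodiment does not describe that case.

```python
from typing import Mapping, Optional


def determine_by_threshold(probabilities: Mapping[str, float],
                           threshold: float = 0.9) -> Optional[str]:
    """Return the most probable document type if it reaches the threshold."""
    best = max(probabilities, key=lambda name: probabilities[name])
    if probabilities[best] >= threshold:
        return best
    return None  # assumption: no document type is determined


def determine_by_maximum(probabilities: Mapping[str, float]) -> str:
    """Return the document type with the highest probability value."""
    return max(probabilities, key=lambda name: probabilities[name])


confident = {"invoice": 0.95, "quotation": 0.02, "purchase order": 0.01,
             "delivery note": 0.01, "contract": 0.01}
uncertain = {"invoice": 0.55, "quotation": 0.40, "purchase order": 0.02,
             "delivery note": 0.02, "contract": 0.01}
```

With these sample values, the threshold method accepts "invoice" only for the first mapping, while the maximum method returns "invoice" for both.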

[0102] In S804, the CPU 261 inputs the string information obtained in S802 into the item information extraction model 115b stored in storage 265 and, based on the output, obtains the strings corresponding to predetermined items as item information. As in S803, the string group consists of string units called tokens arranged in a predetermined reading order. In this embodiment, the input string group is converted, using the aforementioned BERT, into a 768-dimensional feature vector that represents the characteristics of the entire group. When this feature vector is input, the item information extraction model 115b outputs, for each string unit, either a label indicating the corresponding item name or a label indicating that the unit does not correspond to any item.
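Turning per-token labels into item information can be sketched as below. The function name collect_item_info is hypothetical, None stands in for the label meaning "no item", and joining the tokens of one item with a single space is an assumption about how string units are recombined.

```python
from typing import Dict, List, Optional, Sequence


def collect_item_info(tokens: Sequence[str],
                      labels: Sequence[Optional[str]]) -> Dict[str, str]:
    """Group tokens by their item label, keeping the reading order."""
    grouped: Dict[str, List[str]] = {}
    for token, label in zip(tokens, labels):
        if label is None:  # token does not correspond to any item
            continue
        grouped.setdefault(label, []).append(token)
    # Assumption: tokens belonging to one item are joined with a space.
    return {name: " ".join(parts) for name, parts in grouped.items()}


item_info = collect_item_info(
    tokens=["○○", "Invoice", "No.", "BN0037", "ABC", "Corporation"],
    labels=["Title", "Title", None, "Document Number",
            "Company Name", "Company Name"],
)
```

The resulting mapping from item name to string is the form that the rule-based steps from S805 onward can consume.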

[0103] In this embodiment, the extracted strings are normalized before being obtained as item information. Alphanumeric characters are converted to half-width characters, and dates are standardized to the format of a 4-digit year (YYYY) + 2-digit month (MM) + 2-digit day (DD). For example, the date string 902 in the document image is "June 5, 2024", but the item information for the item name "Issue Date" is obtained as "20240605". Likewise, if the document image to be processed is the document image in Figure 9, the item information corresponding to the item name "Title" is "○○ Invoice", the item information corresponding to the item name "Document Number" is "BN0037", and the item information corresponding to the item name "Company Name" is "ABC Corporation".
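A minimal sketch of this normalization follows. The function names are hypothetical. The set of accepted date formats (an English month name followed by day and year, or a numeric year/month/day) is an assumption, as is returning unrecognized date strings unchanged.

```python
import re

# Full-width digits and Latin letters sit 0xFEE0 above their ASCII counterparts.
FULL_TO_HALF = {
    code: code - 0xFEE0
    for start, end in ((0xFF10, 0xFF19), (0xFF21, 0xFF3A), (0xFF41, 0xFF5A))
    for code in range(start, end + 1)
}

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}


def to_half_width(text: str) -> str:
    """Convert full-width alphanumeric characters to half-width."""
    return text.translate(FULL_TO_HALF)


def normalize_date(text: str) -> str:
    """Standardize a date string to YYYYMMDD when its format is recognized."""
    text = to_half_width(text).strip()
    named = re.fullmatch(r"([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})", text)
    if named and named.group(1) in MONTHS:
        month, day, year = MONTHS[named.group(1)], int(named.group(2)), int(named.group(3))
        return f"{year:04d}{month:02d}{day:02d}"
    numeric = re.fullmatch(r"(\d{4})[/.-](\d{1,2})[/.-](\d{1,2})", text)
    if numeric:
        year, month, day = (int(part) for part in numeric.groups())
        return f"{year:04d}{month:02d}{day:02d}"
    return text  # assumption: leave unrecognized formats as they are


issue_date = normalize_date("June 5, 2024")  # "20240605"
```

A production implementation would likely need more date formats, including Japanese-era and kanji date notations, which this sketch does not attempt.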

[0104] In S805, CPU261 generates a file name using the item information (string) obtained in S804, according to the file name generation rules associated with the document type obtained in S803.

[0105] Specifically, CPU261 retrieves the item names included in the file name generation rules. That is, CPU261 retrieves the item names included in the file name generation rules as information indicating the second attribute.

[0106] CPU261 extracts the item information (string) that corresponds to the acquired item name from the item information (string) acquired by S804.

[0107] Then, CPU261 replaces the item names in the file name generation rules with the corresponding strings to generate the file name.

[0108] Similarly, CPU261 obtains the item name included in the destination generation rule associated with the document type obtained in S803. That is, CPU261 obtains the item name included in the destination generation rule as information indicating the third attribute. CPU261 extracts the item information (string) corresponding to the obtained item name from the item information (string) obtained in S804. Then CPU261 replaces the item name in the destination generation rule with the string corresponding to the item name and generates the destination. In this embodiment, the file name and destination are collectively referred to as file storage information.

[0109] In S806, CPU261 displays (recommends) the file name and save location generated in S805 as candidates for the file name and save location of the document image to be processed.

[0110] Figure 10 shows an example of a file save information confirmation screen, which is a screen for displaying candidate file names and save locations. The file save information confirmation screen 1000 is a screen for the user to confirm the file name and save location generated by the information processing server 130 and to correct them as needed. The file save information confirmation screen 1000 is displayed, for example, on the display device 210 of the image forming apparatus 110. For example, in S806, the CPU 261 sends information to the image forming apparatus 110 for displaying the file save information confirmation screen 1000. The file save information confirmation screen 1000 includes a file name display area 1001 for displaying the file name generated in S805 and a save location display area 1002 for displaying the save location generated in S805.

[0111] The file save information confirmation screen 1000 in Figure 10 is an example of the screen displayed when "Invoice" is obtained in S803 as the document type of the document image to be processed. The file name display area 1001 recommends and displays the file name generated according to the file name generation rule associated with "Invoice". Assume that the file name generation rule associated with "Invoice" is "Invoice_<Issue Date>_<Title>_<Document Number>". In this case, the CPU 261 generates the file name by replacing <Issue Date>, <Title>, and <Document Number> with the corresponding item information strings. That is, "<Issue Date>" is replaced with the item information "20240605", which corresponds to the issue date; "<Title>" is replaced with the item information "○○ Invoice", which corresponds to the title; and "<Document Number>" is replaced with the item information "BN0037", which corresponds to the document number. The resulting file name, "Invoice_20240605_○○ Invoice_BN0037", is recommended and displayed in the file name display area 1001.

[0112] Similarly, the save location display area 1002 recommends and displays the save location generated according to the save location generation rule associated with the document type "Invoice" obtained in S803. For example, suppose the save location generation rule specifies that the subfolder name is replaced with the item information for the company name. In this case, the CPU 261 generates, as the save location, an address whose subfolder is "ABC Corporation," the item information corresponding to the company name, and causes it to be recommended and displayed in the save location display area 1002.

[0113] The CPU 261 instructs the image forming apparatus 110 to display the file name display area 1001 and the save location display area 1002 as text boxes. Therefore, the file save information confirmation screen 1000 is configured so that the user can edit the recommended file name and save location to their desired strings.

[0114] In S807, when the user presses the OK button 1003 on the file save information confirmation screen 1000, the CPU 261 obtains the file name displayed in the file name display area 1001 and the save destination displayed in the save destination display area 1002.

[0115] In S808, CPU261 retrieves the item names included in the indexing rule. That is, CPU261 retrieves the item names included in the indexing rule as information indicating the first attribute.

[0116] For item names relating to written information, the CPU 261 extracts the corresponding item information (strings) from the item information (strings) acquired in S804. For item names relating to input information, the CPU 261 extracts the corresponding item information (strings) from the input information of the image forming apparatus 110.
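The extraction in S808 can be sketched as a lookup of the item names in the indexing rule against two sources. The function name and the dictionary representation of written information and input information are assumptions for illustration.

```python
from typing import Dict, Mapping, Sequence, Tuple


def extract_index_candidates(
    indexing_rule: Sequence[str],
    written_info: Mapping[str, str],
    input_info: Mapping[str, str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (input information candidates, written information candidates)."""
    from_input = {name: input_info[name]
                  for name in indexing_rule if name in input_info}
    from_written = {name: written_info[name]
                    for name in indexing_rule if name in written_info}
    return from_input, from_written


input_candidates, written_candidates = extract_index_candidates(
    indexing_rule=["Scan Date", "Title", "Company Name"],
    written_info={"Title": "○○ Invoice", "Company Name": "ABC Corporation",
                  "Document Number": "BN0037"},
    input_info={"Scan Date": "20240610", "User Name": "user01"},  # sample values
)
```

Items that are not named in the indexing rule, such as the document number and the user name in this sample, are not offered as candidates.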

[0117] In S809, CPU261 displays the string extracted in S808 as a candidate for a search index to be saved in association with the document image being processed (recommendation display).

[0118] Figure 11 shows an example of a search index confirmation screen for presenting search index candidates to the user. The search index confirmation screen 1100 is a screen for the user to confirm the strings of search index candidates extracted by the information processing server 130 and to modify them as needed. The search index confirmation screen 1100 is displayed on the display device 210 of the image forming apparatus 110. For this reason, in S809, the CPU 261 transmits information to the image forming apparatus 110 for displaying the search index confirmation screen 1100.

[0119] The search index confirmation screen 1100 includes an input information display area 1101 and a written information display area 1102. The input information display area 1101 displays the strings to be assigned as search indexes that were extracted from the input information in S808 according to the indexing rules corresponding to the document type obtained in S803. The written information display area 1102 displays, as candidates for search indexes, the strings extracted from the item information in S808 according to the same indexing rules.

[0120] The CPU 261 causes the written information display area 1102 to be displayed as a text box. The search index confirmation screen 1100 is therefore configured so that the user can edit the strings displayed as search index candidates in the written information display area 1102 into desired strings.

[0121] For example, in the written information display area 1102 of Figure 11, the strings extracted in S808 according to the indexing rules associated with "Invoice" are recommended and displayed as candidates for the search index. Assume that the indexing rules associated with "Invoice" include <Title>, <Company Name>, <Document Number>, and <Issue Date>. In this case, the item information corresponding to the items indicated by these item names is extracted from the item information obtained in S804, and the strings representing the extracted item information are displayed in the written information display area 1102.

[0122] In S810, the CPU 261 retrieves the strings displayed in the input information display area 1101 and the written information display area 1102 when the user presses the confirm button 1103 on the search index confirmation screen 1100.

[0123] In S811, CPU261 generates a document image file from the document image to be processed, and assigns the file name obtained in S807 as the file name of that document image file.

[0124] In S812, CPU261 associates the string obtained in S810 with the document image file, which has been assigned a filename, and saves it as a search index.

[0125] In this embodiment, the CPU 261 assigns a string to be used as a search index to the document image file as metadata. For example, if the document image file is a PDF file, the index is assigned to the document image file using a PDF editing tool. The method of associating and saving the search index with the document image file is not limited. Alternatively, for example, the search index information may be stored separately from the document image file in a database or within the system in association with the document image file, and managed so that the user can refer to it as needed. In this case, the CPU 261 may display a screen that allows the user to select the output format, and the search index may be saved in the output format selected by the user.
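As one illustration of the alternative in which the search index is kept in a database separately from the document image file, the sketch below uses SQLite from the Python standard library. The table layout, the function names, and the sample path are assumptions; the embodiment does not name a database or schema.

```python
import sqlite3
from typing import List, Mapping


def create_index_store(connection: sqlite3.Connection) -> None:
    """Create a table that associates index strings with a file path."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS search_index ("
        "file_path TEXT NOT NULL, item_name TEXT NOT NULL, value TEXT NOT NULL)"
    )


def save_search_index(connection: sqlite3.Connection,
                      file_path: str,
                      index: Mapping[str, str]) -> None:
    """Save each index string in association with the document image file."""
    connection.executemany(
        "INSERT INTO search_index (file_path, item_name, value) VALUES (?, ?, ?)",
        [(file_path, name, value) for name, value in index.items()],
    )
    connection.commit()


def find_files(connection: sqlite3.Connection, word: str) -> List[str]:
    """Return paths of files having an index string that contains the word."""
    rows = connection.execute(
        "SELECT DISTINCT file_path FROM search_index "
        "WHERE instr(value, ?) > 0 ORDER BY file_path",
        (word,),
    )
    return [row[0] for row in rows]


connection = sqlite3.connect(":memory:")
create_index_store(connection)
save_search_index(
    connection,
    "ABC Corporation/Invoice_20240605_BN0037.pdf",  # hypothetical path
    {"Company Name": "ABC Corporation", "Document Number": "BN0037"},
)
matches = find_files(connection, "BN00")
```

Keeping the index outside the file means it can be updated without rewriting the document image file, at the cost of keeping the two in sync.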

[0126] In S813, CPU261 saves the document image file associated with the search index to the storage location obtained in S807.

[0127] Note that the order of processing in the flowchart in Figure 8 is not limited. For example, assigning file names and assigning search indexes may be performed at the same time.

[0128] As described above, this embodiment makes it possible to infer, from the strings contained in a document image, which strings represent item information. As a result, item information can be extracted from the document image to be processed without registering the layout of the document image in advance. This embodiment therefore reduces the effort required of the user to generate file names and to extract strings to be used as search indexes. Furthermore, since the search index can be saved together with the document image file, the searchability of the document image file is improved compared with saving only the file.

[0129] <Second Embodiment> The first embodiment was described as accepting user modification instructions for both file storage information and search index information. This embodiment describes a method for accepting user modification instructions for item information. This embodiment will mainly describe the differences from the first embodiment. Unless otherwise specified, the configuration and processing are the same as in the first embodiment.

[0130] [Storage of document image files] Figure 12 is a flowchart illustrating the process by which the information processing server 130 saves the document image file received from the image forming apparatus 110. That is, Figure 12 is a flowchart corresponding to the flowchart in Figure 8 in the second embodiment. For this reason, in Figure 12, steps that are the same as those in Figure 8 are numbered the same as in Figure 8. In this embodiment, once item information contained in the document image to be processed is acquired in S804, the process proceeds to S1201.

[0131] In S1201, CPU261 displays the item information extracted in S804 to the user.

[0132] Figure 13 shows an example of an item information confirmation screen for displaying item information for each item name extracted from the document image to be processed. The item information confirmation screen 1300 in Figure 13 is displayed on the display device 210 of the image forming apparatus 110. Therefore, in S1201, the CPU 261 transmits information to the image forming apparatus 110 for displaying the item information confirmation screen 1300.

[0133] The item information confirmation screen 1300 includes an item information display area 1301, preview areas 1302-1304, and a confirmation button 1305.

[0134] The item information display area 1301 is an area that displays the item information extracted in S804. The CPU 261 instructs the image forming apparatus 110 to display the item information display area 1301 as a text box. Therefore, the item information confirmation screen 1300 is configured so that the user can edit the item information extracted in S804 into a desired string. The CPU 261 may also configure the item information display area 1301 to display only the strings (item information) corresponding to the items indicated by the item names included in the file name generation rule, the item names included in the save destination generation rule, and the item names included in the indexing rule. By displaying in this way, item names common to each rule are displayed together as one. Therefore, the user can edit the strings corresponding to the items indicated by the common item names together via the item information confirmation screen 1300.

[0135] The CPU 261 generates a file name according to the file name generation rules using the item information displayed in the item information display area 1301, as described in S805. The generated file name is displayed in the file name preview area 1302. The CPU 261 also generates a save destination according to the save destination generation rules using the item information displayed in the item information display area 1301, as described in S805. The generated save destination is displayed in the save destination preview area 1303.

[0136] Furthermore, CPU 261 extracts item information from the item information displayed in item information display area 1301 according to the indexing rules, using the method described in S808. The string representing the extracted item information is displayed in the search index preview area 1304.

[0137] If the item information contained in the item information display area 1301 is modified by the user, the CPU 261 retrieves the modified information from the image forming apparatus 110 and generates new strings for the file name, save location, and search index using the modified item information. The CPU 261 then processes the file name preview area 1302, save location preview area 1303, and search index preview area 1304 to display the newly generated strings for the file name, save location, and search index.
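The regeneration of the three previews from a single set of item information can be sketched as follows. The function names and the save destination rule text are assumptions, and missing items are simply left as placeholders or skipped in this sketch.

```python
import re
from typing import Dict, List, Mapping, Sequence, Union

ITEM_PATTERN = re.compile(r"<([^<>]+)>")


def apply_rule(rule: str, item_info: Mapping[str, str]) -> str:
    """Replace each <item name> in the rule with the string for that item."""
    return ITEM_PATTERN.sub(
        lambda match: item_info.get(match.group(1), match.group(0)), rule)


def build_previews(item_info: Mapping[str, str],
                   file_name_rule: str,
                   save_destination_rule: str,
                   indexing_rule: Sequence[str]) -> Dict[str, Union[str, List[str]]]:
    """Derive file name, save destination, and search index from one source."""
    return {
        "file_name": apply_rule(file_name_rule, item_info),
        "save_destination": apply_rule(save_destination_rule, item_info),
        "search_index": [item_info[name] for name in indexing_rule
                         if name in item_info],
    }


rules = dict(
    file_name_rule="Invoice_<Issue Date>_<Document Number>",
    save_destination_rule="Invoices/<Company Name>",  # hypothetical rule text
    indexing_rule=["Company Name", "Document Number"],
)
extracted = {"Issue Date": "20240605", "Document Number": "BN0037",
             "Company Name": "ABC Corporation"}
before = build_previews(extracted, **rules)

# The user corrects one item; every preview that uses it changes together.
edited = dict(extracted, **{"Company Name": "ABC Corp."})
after = build_previews(edited, **rules)
```

Because all three previews are computed from the same item information, one correction to the company name updates both the save destination and the search index.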

[0138] Alternatively, if the item information is modified by the user, the newly generated file name, save location, and search index strings may be displayed on the item information confirmation screen 1300 after the user has confirmed them. For example, if the item information is modified by the user, the file save information confirmation screen 1000 in Figure 10 and the search index confirmation screen 1100 in Figure 11 may be displayed. The CPU 261 may then display the newly generated file name, save location, and search index strings on the item information confirmation screen 1300 when it detects that the OK button 1003 in Figure 10 or the confirm button 1103 in Figure 11 has been pressed. The explanation now returns to the flowchart in Figure 12.

[0139] In S1202, when the user presses the OK button 1305 included in the item information confirmation screen 1300, the CPU 261 finalizes each string displayed in the item information display area 1301 as the string corresponding to its item name.

[0140] In S1203, the CPU 261 assigns, to the document image file generated from the document image to be processed, the file name that was displayed in the file name preview area 1302 at the time the OK button 1305 was pressed.

[0141] In S1204, the CPU 261 saves the string that was displayed in the search index preview area 1304 at the time the OK button 1305 was pressed as a search index, in association with the document image file to which the file name has been assigned. The method of saving the search index in association with the document image file is not limited. For example, as in the first embodiment, the search index may be attached to the document image file as metadata.

[0142] In S1205, the CPU 261 saves the document image file associated with the search index to the save destination that was displayed in the save destination preview area 1303 at the time the OK button 1305 was pressed.
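Steps S1203 to S1205 may be sketched as follows. Because the method of associating the search index is not limited, this sketch stores it in a sidecar JSON file next to the document, which is an assumption and not the metadata attachment of the first embodiment. The function name, the sidecar naming, and the sample values are likewise hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_confirmed_document(image_bytes: bytes, file_name: str,
                            save_destination: Path, search_index: str) -> Path:
    """Sketch of S1203 to S1205 once the item information is confirmed."""
    save_destination.mkdir(parents=True, exist_ok=True)
    # S1203: assign the previewed file name to the document image file.
    document_path = save_destination / file_name
    # S1205: store the file at the previewed save destination.
    document_path.write_bytes(image_bytes)
    # S1204: associate the search index with the file (sidecar is an assumption).
    sidecar = document_path.with_name(document_path.name + ".index.json")
    sidecar.write_text(json.dumps({"search_index": search_index}), encoding="utf-8")
    return document_path


# Usage with placeholder bytes in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    saved = save_confirmed_document(
        b"%PDF-1.7 placeholder",
        "Invoice_ExampleCorp_2024-12-16.pdf",
        Path(root) / "Invoice" / "ExampleCorp",
        "ExampleCorp 2024-12-16",
    )
    stored_index = json.loads(
        saved.with_name(saved.name + ".index.json").read_text(encoding="utf-8")
    )["search_index"]
    saved_name = saved.name
```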

[0143] As explained above, according to this embodiment, when an item name is common to the file saving rules and the indexing rules, a single modification of the string corresponding to that item name is reflected in the file name, save destination, and search index strings. This reduces the effort required from the user when generating file names and adding indexes.

[0144] <Other Embodiments> In the embodiment described above, the processing performed by the information processing server 130 may also be performed by the image forming apparatus 110. For example, the processing performed by the information processing server 130 may be performed by the CPU 201 or the information processing unit included in the image forming apparatus 110.

[0145] This disclosure can also be implemented by supplying a program that implements one or more of the functions of the above-described embodiments to a system or device via a network or storage medium, and by having one or more processors in the computer of that system or device read and execute the program. It can also be implemented by a circuit (e.g., an ASIC) that implements one or more functions.

[0146] The above-described embodiments include the following configurations.

[0147] (Configuration 1) An information processing device comprising: character recognition means for performing character recognition processing on a document image; acquisition means for acquiring information indicating a first attribute for searching the document image and information indicating a second attribute for generating a file name; extraction means for extracting strings corresponding to the first attribute and strings corresponding to the second attribute from the group of strings obtained as a result of the character recognition processing; processing means for associating the strings corresponding to the first attribute with the document image as a search index; and generation means for generating a file name for the document image using the strings corresponding to the second attribute.

(Configuration 2) The information processing device according to configuration 1, wherein the extraction means extracts the strings corresponding to the first attribute and the strings corresponding to the second attribute based on the output obtained by inputting the group of strings obtained as a result of the character recognition processing into a trained model that has been trained to output strings corresponding to predetermined attributes.

(Configuration 3) The information processing device according to configuration 1 or 2, further comprising display control means for performing processing to display the strings corresponding to the first attribute extracted by the extraction means as candidates for a search index for searching the document image.

(Configuration 4) The information processing device according to configuration 3, wherein the display control means displays a screen for receiving, from the user, an instruction to modify a string displayed as a candidate, and the processing means associates, with the document image as the search index, both the strings that were not modified by the user and the strings that were modified by the user among the strings corresponding to the first attribute extracted by the extraction means.

(Configuration 5) The information processing device according to any one of configurations 1 to 4, wherein the processing means associates the search index with the document image by adding the search index as metadata to a file generated from the document image.

(Configuration 6) The information processing device according to any one of configurations 1 to 5, further comprising second display control means for displaying the file name generated by the generation means as a candidate file name to be assigned to the document image, wherein, if the generated file name is modified by the user, the generation means assigns the modified file name to the document image.

(Configuration 7) The information processing device according to configuration 2, wherein the predetermined attributes include the first attribute and the second attribute, the extraction means extracts strings corresponding to the predetermined attributes, and the device further comprises display control means for displaying a screen for receiving, from the user, an instruction to modify a string corresponding to the predetermined attributes.

(Configuration 8) The information processing device according to configuration 7, wherein, if the user modifies a string corresponding to the predetermined attributes, the extraction means extracts the strings corresponding to the first attribute and the strings corresponding to the second attribute from the strings corresponding to the predetermined attributes after the user's modification.

(Configuration 9) The information processing device according to any one of configurations 1 to 8, wherein the extraction means further extracts strings corresponding to a third attribute from the group of strings obtained as a result of the character recognition processing, the device further comprising: second generation means for generating a storage location using the strings corresponding to the third attribute; and second processing means for performing processing to save the document image to the generated storage location.

(Configuration 10) The information processing device according to any one of configurations 1 to 9, further comprising setting means for setting the first attribute and the second attribute.

(Configuration 11) The information processing device according to any one of configurations 1 to 10, further comprising: management means for managing the first attribute and the second attribute in association with each document type; and determination means for determining the document type of the document image, wherein the extraction means extracts strings corresponding to the first attribute associated with the determined document type and strings corresponding to the second attribute associated with the determined document type.

(Configuration 12) The information processing device according to configuration 11, wherein the determination means determines the document type of the document image based on the result obtained by inputting the group of strings obtained as a result of the character recognition processing into the trained model.

(Configuration 13) The information processing device according to configuration 3, further comprising receiving means for receiving the document image from an image forming apparatus operated by the user, wherein the display control means performs processing for display on a display unit of the image forming apparatus.

(Configuration 14) An information processing system comprising: the information processing device according to any one of configurations 1 to 13; and an image forming apparatus that transmits the document image to the information processing device.

(Configuration 15) An information processing method comprising: a character recognition step of performing character recognition processing on a document image; an acquisition step of acquiring information indicating a first attribute for searching the document image and information indicating a second attribute for generating a file name; an extraction step of extracting strings corresponding to the first attribute and strings corresponding to the second attribute from the group of strings obtained as a result of the character recognition processing; a processing step of associating the strings corresponding to the first attribute with the document image as a search index; and a generation step of generating a file name for the document image using the strings corresponding to the second attribute.

(Configuration 16) A program for causing a computer to function as each of the means of the information processing device according to any one of configurations 1 to 13.

[Explanation of Symbols]

[0148] 130 Information processing server
131 Information processing unit

Claims

1. An information processing device comprising: character recognition means for performing character recognition processing on a document image; acquisition means for acquiring information indicating a first attribute for searching the document image and information indicating a second attribute for generating a file name; extraction means for extracting strings corresponding to the first attribute and strings corresponding to the second attribute from the group of strings obtained as a result of the character recognition processing; processing means for associating the strings corresponding to the first attribute with the document image as a search index; and generation means for generating a file name for the document image using the strings corresponding to the second attribute.

2. The information processing device according to claim 1, wherein the extraction means extracts the strings corresponding to the first attribute and the strings corresponding to the second attribute based on the output obtained by inputting the group of strings obtained as a result of the character recognition processing into a trained model that has been trained to output strings corresponding to predetermined attributes.

3. The information processing device according to claim 1, further comprising display control means for performing processing to display the strings corresponding to the first attribute extracted by the extraction means as candidates for a search index for searching the document image.

4. The information processing device according to claim 3, wherein the display control means displays a screen for receiving, from the user, an instruction to modify a string displayed as a candidate, and the processing means associates, with the document image as the search index, both the strings that were not modified by the user and the strings that were modified by the user among the strings corresponding to the first attribute extracted by the extraction means.

5. The information processing device according to claim 1, wherein the processing means associates the search index with the document image by adding the search index as metadata to a file generated from the document image.

6. The information processing device according to claim 1, further comprising display control means for displaying the file name generated by the generation means as a candidate file name to be assigned to the document image, wherein, if the generated file name is modified by the user, the generation means assigns the modified file name to the document image.

7. The information processing device according to claim 2, wherein the predetermined attributes include the first attribute and the second attribute, the extraction means extracts strings corresponding to the predetermined attributes, and the device further comprises display control means for displaying a screen for receiving, from the user, an instruction to modify a string corresponding to the predetermined attributes.

8. The information processing device according to claim 7, wherein, if a string corresponding to the predetermined attributes is modified by the user, the extraction means extracts the strings corresponding to the first attribute and the strings corresponding to the second attribute from the strings corresponding to the predetermined attributes after the modification by the user.

9. The information processing device according to claim 1, wherein the extraction means further extracts strings corresponding to a third attribute from the group of strings obtained as a result of the character recognition processing, the device further comprising: second generation means for generating a storage location using the strings corresponding to the third attribute; and second processing means for performing processing to save the document image to the generated storage location.

10. The information processing device according to claim 1, further comprising setting means for setting the first attribute and the second attribute.

11. The information processing device according to claim 1, further comprising: management means for managing the first attribute and the second attribute in association with each document type; and determination means for determining the document type of the document image, wherein the extraction means extracts strings corresponding to the first attribute associated with the determined document type and strings corresponding to the second attribute associated with the determined document type.

12. The information processing device according to claim 11, wherein the determination means determines the document type of the document image based on the result obtained by inputting the group of strings obtained as a result of the character recognition processing into the trained model.

13. The information processing device according to claim 3, further comprising receiving means for receiving the document image from an image forming apparatus operated by the user, wherein the display control means performs processing for display on a display unit of the image forming apparatus.

14. An information processing system comprising: the information processing device according to claim 1; and an image forming apparatus that transmits the document image to the information processing device.

15. An information processing method comprising: a character recognition step of performing character recognition processing on a document image; an acquisition step of acquiring information indicating a first attribute for searching the document image and information indicating a second attribute for generating a file name; an extraction step of extracting strings corresponding to the first attribute and strings corresponding to the second attribute from the group of strings obtained as a result of the character recognition processing; a processing step of associating the strings corresponding to the first attribute with the document image as a search index; and a generation step of generating a file name for the document image using the strings corresponding to the second attribute.

16. A program for causing a computer to function as each of the means of the information processing device according to any one of claims 1 to 13.