Information processing device and program
The information processing device enhances generative AI accuracy in business applications by structuring data processing into stages, ensuring high-quality QA information generation at lower costs through summary acquisition and evaluation units.
Patent Information
- Authority / Receiving Office
- JP
- Patent Type
- Applications
- Current Assignee / Owner
- SEIKO SOLUTIONS
- Filing Date
- 2024-12-03
- Publication Date
- 2026-06-15
AI Technical Summary
Existing methods for improving the accuracy of generative AI in business applications, such as LLMs, are costly and do not always achieve high-quality QA information generation, despite efforts like increasing learning data and manual data quality improvement.
The proposed information processing device includes a summary information acquisition unit, a question generation perspective acquisition unit, a QA generation reference document acquisition unit, and evaluation units. It structures the data processing into stages and uses a generative AI system to generate and refine QA information, producing high-quality output at lower cost.
The device enables the generation of high-quality QA information efficiently by optimizing data processing stages, reducing costs, and improving answer accuracy through structured data extraction and evaluation.
Description
【Technical Field】 【0001】 The present invention relates to an information processing apparatus and a program. 【Background Art】 【0002】 With the development and spread of computer technology, documents that have hitherto been in paper form are being digitized. For example, Patent Document 1 discloses a technique for improving business efficiency by presenting a template of a contract document. In recent years, there have been active attempts to apply LLMs (Large Language Models; large-scale language models) that perform advanced natural language processing and generative AI that generates arbitrary text, images, etc. to business. 【Prior Art Documents】 【Patent Documents】 【0003】 【Patent Document 1】 Japanese Unexamined Patent Application Publication No. 2024-31109 【Summary of the Invention】 【Problems to be Solved by the Invention】 【0004】 In order to utilize generative AI or the like in business, it is required that the accuracy of the generated product (for example, an answer to a user's question, etc.) be high. As methods for improving the answer accuracy of generative AI, for example, solutions such as increasing the number of learning data, devising a method for chunking learning data, and improving the quality of learning data (QA information) by manually preparing the learning data of generative AI are generally known. However, even if the above solutions are used, it is not always possible to achieve high answer accuracy, and even if it is achieved, there is a problem that it requires a huge cost. 【0005】 Therefore, the present invention has been made in view of the above points, and an object thereof is to provide a technique capable of generating high-quality QA information at low cost. 
【Means for Solving the Problems】 【0006】 One aspect of the present invention is an information processing device comprising: a summary information acquisition unit that acquires summary information which is a summary of a business document containing information about business procedures; a question generation perspective acquisition unit that acquires a question generation perspective which is a perspective relating to the direction of questions to be generated and which includes business content and business procedures; a first supply unit that supplies the summary information and the question generation perspective to a generation unit to generate questions in line with the question generation perspective from the summary information; a QA generation reference document acquisition unit that acquires one or more QA generation reference documents containing information about answers; a second supply unit that supplies one or more of the QA generation reference documents and the questions to a generation unit to extract question information relating to the questions and answer information relating to the answers to the questions from one or more of the QA generation reference documents, and generates QA information which is a combination of the question information and the answer information; and a QA information output unit that outputs the generated QA information. 【0007】 Furthermore, in one embodiment of the present invention, the information processing device further comprises a question evaluation receiving unit that receives evaluations of the generated questions, and the second supply unit generates the QA information based on the questions that have received evaluations of a predetermined standard or higher. 
【0008】 Furthermore, in one aspect of the present invention, the information processing device further comprises a QA evaluation receiving unit that receives evaluations of the generated QA information, and the QA information output unit outputs the QA information that has received an evaluation of a predetermined standard or higher to a storage unit. 【0009】 Furthermore, in one embodiment of the present invention, the QA information does not include any information unrelated to either the question or the answer to the question. 【0010】 Furthermore, one aspect of the present invention is a program that causes a computer to execute: a summary information acquisition step of acquiring summary information which is a summary of business documents containing information about business procedures; a question generation perspective acquisition step of acquiring a question generation perspective which is a perspective relating to the direction of questions to be generated and which includes business content and business procedures; a first supply step of supplying the summary information and the question generation perspective to a generation step so as to generate questions in line with the question generation perspective from the summary information; a QA generation reference document acquisition step of acquiring one or more QA generation reference documents containing information about answers; a second supply step of supplying one or more of the QA generation reference documents and the questions to a generation step so as to extract question information relating to the questions and answer information relating to answers to the questions from one or more of the QA generation reference documents, and generate QA information which is a combination of the question information and the answer information; and a QA information output step of outputting the generated QA information. 
[Effects of the Invention] 【0011】 According to the present invention, high-quality QA information can be generated at low cost. [Brief Description of the Drawings] 【0012】 [Figure 1] This is a diagram to explain RAG (Retrieval-Augmented Generation). [Figure 2] This is a block diagram illustrating an example configuration of the information processing system 1 according to the embodiment. [Figure 3] This figure shows an example of the process related to generating questions that is performed by the information processing device 10 according to the embodiment. [Figure 4] This figure shows an example of the process for generating QA information that is executed by the information processing device 10 according to the embodiment. [Figure 5] This is a diagram illustrating an example of preprocessing according to the embodiment. [Figure 6] This diagram illustrates an example of the processing flow of the information processing device 10 according to the embodiment. [Figure 7] This figure shows a schematic example of the hardware configuration of the information processing device 90 applied to this embodiment. [Figure 8] This is the first diagram illustrating an example of conventional preprocessing. [Figure 9] This is the second diagram illustrating an example of conventional preprocessing. [Modes for Carrying Out the Invention] 【0013】 [Comparison with conventional technology] Figure 1 is a diagram illustrating RAG. RAG has traditionally been known as one of the techniques for improving the accuracy of answers given by an LLM. RAG is a technique that searches for and retrieves documents (business documents, etc.) related to a question and inputs them into the generative AI along with the question, thereby enabling more specialized and accurate answers. This method is currently the most effective and economical way to achieve answers based on knowledge beyond the general knowledge possessed by the generative AI. 
RAG can be broadly divided into three processes: preprocessing, document search, and answer generation. 【0014】 Preprocessing is the process of transforming various business data into a format in which a search model can easily search the business documents. This allows the generative AI to produce answers that are professional and accurate enough for practical use. Furthermore, it eliminates the need for manual question and answer creation, reducing the cost of creating Q&A documents. 【0015】 FIG. 8 is a first diagram for explaining an example of conventional preprocessing. FIG. 8 shows an example of chunking a registered original document, based on an algorithm set for RAG, under predetermined conditions such as paragraphs or the number of characters. Generally, in RAG, chunks containing content similar to the user's question are used for document search and answer generation. However, there are cases where the content that should be the answer to the user's question is not included in the chunk containing content similar to the user's question and therefore cannot be used for the answer (see FIG. 8(A)). Also, since chunks that do not contain words close to the user's question are not hit in document search, there are cases where the chunk containing the content for the answer cannot be appropriately retrieved (see FIG. 8(B)). In these cases, the generated answer is based only on the general knowledge of the LLM, or in the worst case no answer can be obtained, making it difficult to use for business support. 【0016】 FIG. 9 is a second diagram for explaining an example of conventional preprocessing. FIG. 9 shows an example in which the document is kept as one long chunk, without chunking, to avoid the separation of information caused by chunking as shown in FIG. 8. Generally, with preprocessing that keeps the document as one long chunk, the answer accuracy is higher than in the case of chunking. 
However, when making the document into one chunk as a long text, there is a problem that the amount of input documents for the generative AI increases, so the cost and the response time required for the answer increase. Also, the longer the document becomes, the more inaccurate the content recognition by the generative AI becomes, so there is a possibility of inconsistency in the answer results (see FIG. 9(A)). Also, when there is no identical or similar phrase to the user's question within the chunk (in the document), it does not hit in document search, so the problem that the chunk (document) containing the content for the answer cannot be appropriately searched still exists (see FIG. 9(B)). 【0017】 The information processing apparatus and program according to the embodiment solve the above-described problems, generate data suitable for document search by appropriately processing various data, and realize an improvement in the answer accuracy by the generation AI at low cost. Note that the information processing apparatus and program according to the embodiment are not limited to being used for preprocessing of RAG. The information processing apparatus and program according to the embodiment may be used, for example, for generating document information used for fine-tuning or the like, or may be used for creating Q&A. 【0018】 [Embodiment] Regarding the information processing apparatus and program according to the present embodiment, suitable embodiments will be described in detail below with reference to the accompanying drawings. In the description of the drawings, the same or similar parts are denoted by the same or similar reference numerals. Note that the present embodiment is not limited to these embodiments, and also includes those with various modifications or improvements. 
That is, the constituent elements described below include those that can be easily assumed by those skilled in the art and those that are substantially the same, and the constituent elements described below can be combined as appropriate. Further, in the present embodiment, various omissions, substitutions, or changes of the constituent elements may be made without departing from the gist of the present invention. 【0019】 Note that in all the drawings for explaining the embodiment, those having the same function are denoted by the same reference numerals, and repeated explanations are omitted. Also, in the present application, "based on XX" means "at least based on XX", and includes cases where it is based on another element in addition to XX. Also, "based on XX" is not limited to the case where XX is directly used, and includes cases where it is based on something obtained by performing operations or processing on XX. Also, "something based on XX" may include XX itself. "XX" is an arbitrary element (for example, arbitrary information). Hereinafter, embodiments of the present invention will be described with reference to the drawings. 【0020】 [Configuration Example of Information Processing System] Figure 2 is a block diagram illustrating an example configuration of an information processing system 1 according to an embodiment. The information processing system 1 comprises an information processing device 10, a user terminal 20, and a generation device 30. One user terminal 20 may be connected to one information processing device 10, or multiple user terminals 20 may be connected. 【0021】 The information processing device 10 and the user terminal 20, and the information processing device 10 and the generation device 30 are connected to each other via a network NW (network NW1, network NW2). The network NW may be a wireless communication network or a wired communication network. 
The network NW may be configured using, for example, the Internet or a local area network (LAN). The network NW may be configured by combining multiple networks. 【0022】 [Example of an information processing device configuration] The information processing device 10 comprises an arithmetic unit 11 and a storage unit 13. 【0023】 The arithmetic unit 11 is equipped with a central processing unit (CPU) and operates based on the programs and data stored in the storage unit 13, providing various functions. 【0024】 The storage unit 13 is composed of, for example, a hard disk drive or semiconductor memory (flash memory, RAM, ROM), and stores various types of information, such as programs and data read by the arithmetic unit 11. The storage unit 13 may also be implemented by a virtual storage device, such as a cloud server, located outside the information processing device 10. 【0025】 [Example of user terminal configuration] The user terminal 20 is, for example, an information processing device such as a personal computer, tablet, or smartphone. The user terminal 20 comprises a user terminal display unit 21, a user terminal operation unit 22, a user terminal calculation unit 23, and a user terminal storage unit 24. 【0026】 The user terminal display unit 21 is equipped with, for example, a liquid crystal display and displays various images. In the following description, the display of an image by the user terminal display unit 21 is also referred to as presenting an image to the user. The user may be a person who uses the information processing device 10, or a person who receives an answer from the generating AI to a user question. 【0027】 The user terminal operation unit 22 may, for example, be equipped with a touch panel to detect user operations. Alternatively, the user terminal operation unit 22 may acquire user operation information from input devices such as a keyboard or mouse. 
【0028】 The user terminal calculation unit 23 (not shown) is equipped with a central processing unit and operates based on programs and data stored in the user terminal storage unit 24 (not shown), providing various functions. 【0029】 The user terminal storage unit 24 is composed of, for example, a hard disk drive or semiconductor memory, and stores various types of information, such as programs and data read by the user terminal calculation unit 23. The user terminal storage unit 24 may also be implemented by a virtual storage device, such as a cloud server, located outside the user terminal 20. 【0030】 [Example of generation device configuration] The generation device 30 is an AI that has been pre-trained using natural language, images, or audio. The generation device 30 may be built by the user or it may be built externally (for example, an LLM). Also, if the generation device 30 is built by the user, it may be provided within the information processing device 10. Hereinafter, the process of supplying information to the generation device 30 to generate desired information will be simply referred to as "generating". The generation device 30 may also be called a generation unit. 【0031】 [Functions of the information processing device] The arithmetic unit 11 includes, as its functional units, a communication unit 110, a display control unit 111, a business document acquisition unit 112, a summary information generation unit 113, a summary information acquisition unit 114, a question generation perspective acquisition unit 115, a first supply unit 116, a question acquisition unit 117, a question evaluation reception unit 118, a QA generation reference document acquisition unit 119, a second supply unit 120, a QA information acquisition unit 121, a QA evaluation reception unit 122, and a QA information output unit 123. Each of these functional units may be implemented using electronic circuits as needed. 
Furthermore, the functional units do not all have to be included in a single device, and the information processing device 10 may be configured from multiple devices. 【0032】 The communication unit 110 communicates information with the user terminal 20. 【0033】 The display control unit 111 controls the display on the user terminal 20. Specifically, the display control unit 111 generates the screen to be displayed on the user terminal display unit 21 and transmits it to the user terminal 20 via the communication unit 110. The user terminal 20 displays the screen generated by the display control unit 111. 【0034】 [Function of the information processing device: Question generation] Figure 3 shows an example of the question generation process performed by the information processing device 10 according to the embodiment. The information processing device 10 generates comprehensive questions that meet the user's objectives through the processing performed by the functional units from the business document acquisition unit 112 to the question evaluation reception unit 118. 【0035】 The business document acquisition unit 112 acquires business document D1. The business document acquisition unit 112 may acquire business document D1 stored in the storage unit 13, or it may acquire business document D1 uploaded based on user operations. 【0036】 Business document D1 may include, for example, know-how related to the business, the overall progress of the business, the steps for each task, the procedures for each unit of work, the rules for performing the business, and the objectives of the business. It is desirable that business document D1 includes information about the business procedures in order to generate more specialized answers. Business document D1 may be, for example, a procedure manual, a manual or equivalent document, or a business flowchart. 
Business document D1 may be a document showing internal information accumulated by the company to which the user belongs, or it may be a commercially available document. Business document D1 may be in various data formats such as text format, PDF format, and image format. 【0037】 The summary information generation unit 113 generates summary information SI by summarizing the business document D1 acquired by the business document acquisition unit 112. The summary information generation unit 113 may also generate summary information SI by extracting pre-summarized content such as headings and a table of contents, which is a list of headings, shown in the business document D1. 【0038】 A pre-summarized document is a summary of the document's content, including its content and its context. A table of contents, which is a list of headings, is, for example, located at the beginning of a document and shows an overview of the document's content and its order of presentation. 【0039】 A heading is, for example, a name or sentence that summarizes the content of a document, and contains less information than the entire document. Headings may also include major headings (e.g., "Section" or "Chapter") that include a word indicating the overall theme of the document, or subheadings and minor headings (e.g., "Section," "Item," or "Sub-item") that include more specific words and summarize the content of the target text or paragraph. In this case, business document D1 may be a document with a two- or more-level structure, such as text, headings that divide the text into multiple sections, major headings, and subheadings. Furthermore, headings may be presented in a format different from typical text, such as ending with a noun, consisting of a single sentence, not containing periods (.) or commas (,), having a heading number such as 1.1 at the beginning, or being written in bold. Headings may also be words that appear multiple times throughout the document. 
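The heading cues listed in paragraph 【0039】 can be illustrated with a minimal Python sketch. It only covers two of the cues (a leading heading number such as 1.1, and the absence of periods and commas); the function names, the regular expressions, and the 60-character length limit are assumptions made for illustration and are not taken from the publication.

```python
import re

# Leading heading number such as "1", "1.1" or "2.3.4." followed by whitespace.
NUMBER_PREFIX = re.compile(r"^\d+(?:\.\d+)*\.?\s+")
# Punctuation that usually marks running text rather than a heading.
SENTENCE_PUNCTUATION = re.compile(r"[.,。、]")


def looks_like_heading(line: str, max_len: int = 60) -> bool:
    """Heuristic: short line, optional heading number, no sentence punctuation."""
    text = line.strip()
    if not text or len(text) > max_len:  # max_len is an assumed cut-off
        return False
    # Drop the heading number first so that "1.1" is not read as punctuation.
    body = NUMBER_PREFIX.sub("", text)
    return bool(body) and not SENTENCE_PUNCTUATION.search(body)


def extract_summary_information(lines: list[str]) -> list[str]:
    """Collect heading-like lines as candidate summary information SI."""
    return [line.strip() for line in lines if looks_like_heading(line)]
```

A real document would likely need the other cues as well (bold text, noun endings, repeated words), which require layout or language analysis beyond this sketch.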
【0040】 The summary information generation unit 113 may use all elements included in the pre-summarized content as summary information SI, or it may use only some of them as summary information SI. For example, Figure 3 shows an example in which the summary information generation unit 113 generates summary information SI by extracting the main headings "A1" to "C1" and the subheadings "A11" to "C14". This embodiment is not limited to this example, and the summary information generation unit 113 may, for example, not use the main headings as summary information SI, but only use the subheadings as summary information SI. 【0041】 Furthermore, the summary information generation unit 113 may generate one piece of summary information SI from one heading, generate multiple pieces of summary information SI from one heading, or generate one or more pieces of summary information SI from multiple headings. 【0042】 Furthermore, the summary information generation unit 113 may generate summary information SI by summarizing the content shown in the business document D1 as necessary, such as when the business document D1 contains no pre-summarized content. The summary information generation unit 113 may also generate summary information SI by summarizing an arbitrary range of text, such as paragraph by paragraph or section by section, using, for example, a trained model for natural language processing such as an LLM. In addition, the summary information generation unit 113 may generate summary information SI based on word occurrence conditions, such as words appearing at the beginning of a paragraph, or words that frequently appear within a predetermined range, such as the entire document or each paragraph. 【0043】 The summary information acquisition unit 114 acquires summary information SI generated by the summary information generation unit 113. The summary information acquisition unit 114 may also acquire summary information SI entered based on user operations. 
The number of summary information acquired by the summary information acquisition unit 114 may be one, but it is desirable to acquire multiple summaries from the viewpoint of creating comprehensive questions. 【0044】 The question generation perspective acquisition unit 115 acquires one or more question generation perspectives. The question generation perspective acquisition unit 115 may acquire question generation perspectives from the user terminal 20 based on user input operations, acquire those previously stored by the user from the storage unit 13, acquire those generated or deemed necessary by a function unit or generation device 30 (not shown), or acquire them by two or more of these methods. 【0045】 Question generation perspectives are viewpoints regarding the direction of questions generated from summary information (SI) in order to ensure that the questions generated align with the user's objectives. Question generation perspectives may differ depending on the purpose of the questions to be generated or the Q&A to be generated, and they may also differ for each business document D1. For example, question generation perspectives may include "business content" to generate questions about the business content shown in business document D1, "purpose" to generate questions about the purpose of the business content, "procedure" to generate questions about the specific implementation methods of the business content, "terminology" to generate questions about the terminology included in business document D1, "proficiency level" to generate questions that take into account the user's experience (e.g., new employee (up to 3 years of employment)), and "generation AI recommendation" to generate questions that the generation AI deems necessary. It is desirable that the question generation perspectives include "business content" and "procedure" in order to generate answers that are professional and highly accurate enough to be used in business. 
【0046】 The first supply unit 116 generates questions Q by supplying summary information SI and question generation perspectives to the generation device 30 along with information instructing the creation of questions Q (for example, a sentence containing the instructions). The generation device 30 generates questions Q from the summary information SI in accordance with the question generation perspectives. The first supply unit 116 may generate one question Q from one piece of summary information SI, multiple questions Q from one piece of summary information SI, or one or more questions Q from multiple pieces of summary information SI. It is desirable that multiple questions Q be generated from one or more pieces of summary information SI in order to comprehensively create specialized questions Q. For example, the first supply unit 116 may generate questions Q from one piece of summary information SI in accordance with the question generation perspectives of "business content," "purpose," "procedure," and "terminology." The first supply unit 116 may also determine the viewpoint of the generated questions Q (for example, a newcomer's viewpoint) based on "proficiency level." Specifically, the first supply unit 116 may generate questions Q from the summary information SI "A11," such as "What is A11? (Terminology)," "What is the purpose of A11? (Purpose)," "What are the business contents of A11? (Business content)," and "What are the necessary steps to carry out the business of A11? (Procedure)." It may also generate questions Q at a level of expertise that a new employee might ask. The question acquisition unit 117 acquires the questions Q generated by the generation device 30. The first supply unit 116 and the question acquisition unit 117 may together be referred to as the question generation unit. 
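As a rough illustration of paragraph 【0046】, the following Python sketch assembles an instruction from summary information SI and question generation perspectives and passes it to a caller-supplied function that stands in for the generation device 30. The prompt wording, the `PERSPECTIVE_HINTS` table, and the function names are hypothetical; the publication does not specify a prompt format or an API.

```python
from typing import Callable, Optional, Sequence

# Hypothetical one-line hints per perspective; the wording is an assumption.
PERSPECTIVE_HINTS = {
    "business content": "what the work consists of",
    "purpose": "why the work is performed",
    "procedure": "the concrete steps needed to carry out the work",
    "terminology": "the meaning of the terms used",
}


def build_question_prompt(
    summary_items: Sequence[str],
    perspectives: Sequence[str],
    proficiency: Optional[str] = None,
) -> str:
    """Combine SI, perspectives, and the instruction sentence into one prompt."""
    lines = ["Create questions about each summary item below.", "Summary items:"]
    lines += [f"- {item}" for item in summary_items]
    lines.append("Write one question per line for each of these perspectives:")
    for perspective in perspectives:
        hint = PERSPECTIVE_HINTS.get(perspective, "")
        lines.append(f"- {perspective}: {hint}" if hint else f"- {perspective}")
    if proficiency:
        # The "proficiency level" perspective sets the asker's viewpoint.
        lines.append(f"Phrase the questions as they would be asked by: {proficiency}")
    return "\n".join(lines)


def generate_questions(
    summary_items: Sequence[str],
    perspectives: Sequence[str],
    generate: Callable[[str], str],  # stand-in for the generation device 30
    proficiency: Optional[str] = None,
) -> list[str]:
    """Supply the prompt to the generator and split its reply into questions Q."""
    raw = generate(build_question_prompt(summary_items, perspectives, proficiency))
    return [line.strip() for line in raw.splitlines() if line.strip()]
```

Passing the generator as a plain callable keeps the sketch independent of any particular LLM service, which matches the statement that the generation device 30 may be built by the user or externally.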
【0047】 The questions Q generated by the first supply unit 116 may be evaluated by a question evaluation unit (not shown) in the information processing device 10 to determine whether they were appropriately generated, or they may be displayed on the user terminal 20 and evaluated by the user. For evaluation by the question evaluation unit, a trained model that has been trained using, for example, past evaluation history may be used. When presenting the generated questions Q to the user, multiple questions Q may be displayed and evaluated together. When questions Q are generated comprehensively, evaluating each question Q individually would be time-consuming. By having the display control unit 111 display multiple questions Q together and having the user evaluate whether the overall tendency of the generated questions Q is in line with the purpose, multiple questions Q can be evaluated easily. Note that the process of generating QA information, described later, may proceed without evaluating the questions Q. 【0048】 The question evaluation reception unit 118 receives evaluations of the questions Q acquired by the question acquisition unit 117. The question evaluation reception unit 118 may, for example, acquire evaluations based on user operations from the user terminal 20, or acquire evaluation results from the question evaluation unit or the like. The evaluation may be expressed as a binary "good / bad" or on a scale of three or more levels. As described above, by executing the processing from the business document acquisition unit 112 to the question evaluation reception unit 118, high-quality questions Q can be comprehensively generated. 
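The evaluation handling in paragraphs 【0047】 and 【0048】 can be sketched as follows in Python, assuming that an evaluation is stored as an integer score (0 = "bad", 1 = "good", or a multi-level value). The class and function names and the default threshold are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class EvaluatedQuestion:
    text: str
    score: int  # assumed encoding: 0 = bad, 1 = good, or a multi-level score


def evaluate_together(questions: Iterable[str], score: int) -> List[EvaluatedQuestion]:
    """Apply one evaluation to a group of questions shown to the user together."""
    return [EvaluatedQuestion(text=q, score=score) for q in questions]


def select_questions(evaluated: Iterable[EvaluatedQuestion], threshold: int = 1) -> List[str]:
    """Keep only questions whose evaluation meets the (assumed) threshold."""
    return [q.text for q in evaluated if q.score >= threshold]
```

With a binary evaluation the threshold of 1 keeps only questions rated "good"; with a multi-level scale the caller would choose the threshold.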
【0049】 The information processing device 10 may, with respect to questions Q that received a low evaluation, use an AI capable of inputting and outputting natural language, images, or audio to correct and re-evaluate the content of questions Q, referring to the evaluation criteria for questions Q and the source documents for creating the QA information (QA generation reference document D2 described later). 【0050】 [Function of information processing device: Generation of QA information] Figure 4 shows an example of the process for generating QA information executed by the information processing device 10 according to the embodiment. The information processing device 10 can generate and output documents that are comprehensive and lead to improved answer accuracy through the processing executed by the functional units from the QA generation reference document acquisition unit 119 to the QA information output unit 123. QA information is information that combines question information about a question and answer information about the answer to that question. 【0051】 The QA generation reference document acquisition unit 119 acquires one or more QA generation reference documents D2. Similar to the business document acquisition unit 112, the QA generation reference document acquisition unit 119 may acquire QA generation reference documents D2 stored in the storage unit 13, or it may acquire QA generation reference documents D2 uploaded based on user operations. The QA generation reference document acquisition unit 119 may also acquire matching QA generation reference documents D2 by similarity search based on summary information SI or question Q words. 【0052】 QA generation reference document D2 is a document that contains information related to the answer, such as information that answers question Q, or information related to the answer to question Q. QA generation reference document D2 may also be a business document D1. 
Information that answers question Q is information described in procedure manuals, manuals and equivalent documents, business flows, reference books that describe specialized knowledge in the business or industry, and the like, and relates to the specific work content. Information related to the answer to question Q may include information that serves as a hint for the answer, such as information indicating that there is work to be done in a particular situation even if the specific work content is not shown. It may also include information about specific contacts, such as organizational charts, contact information, the usage and registrants of group email addresses, contact information of related companies, and project member lists, as well as information about the responsibilities of each organization and each employee, such as the areas and responsibilities of each branch. Such information may be combined with other QA generation reference documents D2 to form an answer. This information related to the answer is called answer information AI. Furthermore, QA generation reference document D2, like business document D1, may be, for example, a procedure manual, a manual or equivalent document, a document containing internal information accumulated by the company to which the user belongs, or a commercially available document, and it may be in various data formats such as text format, PDF format, and image format. 【0053】 Furthermore, the QA generation reference document D2 contains question information QI related to question Q, such as wording that is the same as or similar to question Q, or related content that is not the exact content asked by question Q. 
The question information QI, which is information related to question Q, only needs to be included in at least one of the QA generation reference documents D2 that the QA generation reference document acquisition unit 119 acquires, and it is not necessary for a single document to contain both the question information QI related to question Q and the answer information AI related to the answer to question Q. 【0054】 The second supply unit 120 causes QA information QAI to be generated by supplying question Q and QA generation reference document D2 to the generation device 30. For each question Q, the generation device 30 generates QA information QAI, which is a combination of question information QI and answer information AI from one or more QA generation reference documents D2. As a specific example, consider the case where QA information QAI is generated for the question Q "What are the business contents of A11?". Based on the question Q "What are the business contents of A11?", the second supply unit 120 extracts a sentence (question information QI) that is identical or similar to the phrase "What are the business contents of A11?" from one or more QA generation reference documents D2. Based on the same question Q, the second supply unit 120 also extracts a sentence (answer information AI) that relates to the business contents of "A11" from one or more QA generation reference documents D2. The second supply unit 120 generates QA information QAI for the question Q "What are the business contents of A11?" by combining the extracted question information QI and answer information AI. 
To prevent discrepancies in the answers of the generative AI and the like, it is desirable that the second supply unit 120 either removes information other than the question information QI and answer information AI from the generated QA information QAI, or generates the QA information QAI without including such information in the first place. The QA information acquisition unit 121 acquires the QA information QAI generated by the generation device 30. The second supply unit 120 and the QA information acquisition unit 121 may together be referred to as the QA information generation unit. 【0055】 The second supply unit 120 may generate QA information QAI based on a question Q that has been evaluated at a predetermined level or higher by user operation or by a question evaluation unit (not shown). An evaluation at a predetermined level or higher may be, for example, an evaluation of "good," or an evaluation above a threshold predetermined by the user. In other words, the information processing device 10 makes it easier to generate the QA information QAI desired by the user by using a question Q that is in line with the user's purpose when generating the QA information QAI. 【0056】 Similar to the evaluation of question Q, the generated QA information QAI may be evaluated by a QA information evaluation unit (not shown) of the information processing device 10 to determine whether it was properly generated, or it may be displayed on the user terminal 20 and evaluated by the user. 【0057】 The first supply unit 116 and the question evaluation reception unit 118 may generate information using the same generation device 30, or they may generate information using different generation devices. 【0058】 The QA evaluation receiving unit 122 receives evaluations of the QA information QAI acquired by the QA information acquisition unit 121. 
The QA evaluation receiving unit 122 may, for example, acquire evaluations based on user operations from the user terminal 20, or it may acquire evaluation results from a QA information evaluation unit (not shown) or the like. The evaluation may be expressed in the same format as the evaluation received by the question evaluation receiving unit 118. As described above, by executing the processing from the QA generation reference document acquisition unit 119 through the QA evaluation receiving unit 122, the high quality of the QA information QAI can be ensured. 【0059】 The QA information output unit 123 outputs QA information QAI that has received an evaluation of a predetermined level or higher through user operation or evaluation by a QA information evaluation unit (not shown). The QA information output unit 123 may, for example, output the QA information QAI that has received an evaluation of a predetermined level or higher to a storage unit such as the storage unit 13 and store it. The QA information QAI stored in the storage unit 13 is used as reference information in answer generation. By using QA information QAI that has received an evaluation of a predetermined level or higher in answer generation, the generative AI that answers user questions can generate more specialized and accurate answers. 【0060】 [Comparison with conventional pre-processing] Figure 5 is a diagram illustrating an example of pre-processing according to the embodiment. Figure 5 shows an example of performing a document search using QA information QAI generated by the information processing device 10 according to the embodiment. The information processing device 10 generates QA information QAI by extracting question information QI and answer information AI from QA generation reference document D2, so the question and answer exist in one chunk (Figure 5(A)). 
Furthermore, when question information QI and answer information AI are extracted from multiple QA generation reference documents D2, QA information QAI can be generated even if no single QA generation reference document D2 contains both the question information QI and the answer information AI (Figure 5(B)). Therefore, the problem of not being able to provide an appropriate answer because the question or answer is not included in one chunk (see Figure 8) can be solved. In addition, QA information QAI does not include content unrelated to the user question (information other than question information QI and answer information AI). Therefore, the problem of subtle discrepancies in the answer content of the generative AI and the like due to unnecessary information (see Figure 9) can be solved. Information other than the question information QI and answer information AI may include, for example, answer information AI for other questions Q that cannot serve as an answer to the target question Q, or the name and page number of the QA generation reference document D2 indicating the source of the information. 【0061】 [An example of the processing flow of an information processing device] Figure 6 is a diagram illustrating an example of the processing flow of the information processing device 10 according to the embodiment. The business document acquisition unit 112 acquires the business document D1 (step S101). The summary information generation unit 113 generates summary information SI. The summary information acquisition unit 114 acquires the summary information SI (step S102). The question generation perspective acquisition unit 115 acquires the question generation perspective (step S103). The first supply unit 116 causes a question Q in line with the question generation perspective to be generated from the summary information SI, and the question acquisition unit 117 acquires it (step S104). 
The question evaluation reception unit 118 accepts an evaluation of the question Q (step S105). The QA generation reference document acquisition unit 119 acquires one or more QA generation reference documents D2 (step S106). The second supply unit 120 causes QA information QAI to be generated by extracting question information QI and answer information AI from the QA generation reference document D2, and the QA information acquisition unit 121 acquires it (step S107). The QA evaluation reception unit 122 accepts evaluations of the QA information QAI (step S108). The QA information output unit 123 outputs the QA information QAI to the storage unit 13 or the like, making it accessible to the generative AI (step S109). 【0062】 In the flow described above, the acquisition of business document D1 (step S101) and the acquisition of QA generation reference document D2 (step S106) may be performed at the same time. For example, the user may upload business document D1 and QA generation reference document D2 simultaneously. In this case, the business document acquisition unit 112 and the QA generation reference document acquisition unit 119 may be implemented by a single functional unit. Furthermore, when the user inputs summary information SI, the acquisition of business document D1 (step S101) and the generation of summary information SI may be omitted. In this case as well, the acquisition of summary information SI (step S102) and the acquisition of QA generation reference document D2 (step S106) may be performed at the same time. 【0063】 In the above description, an example is shown in which the question evaluation receiving unit 118 acquires information indicating the evaluation result. However, this embodiment is not limited to this example, and the question evaluation receiving unit 118 is not necessarily required to acquire information indicating the evaluation result. 
If the user evaluates that question Q has not been generated appropriately, the user may instruct the information processing device 10 to regenerate question Q by operating the user terminal operation unit 22. The information processing device 10 will regenerate the question Q according to the instruction, so question Q that has received an evaluation below a predetermined level will not be used to generate the QA information QAI. On the other hand, if the user evaluates that question Q has been generated appropriately, the user will instruct the information processing device 10 to continue processing by operating the user terminal operation unit 22. The information processing device 10 will continue processing according to the instruction, so the QA information QAI will be generated using question Q that has received an evaluation of a predetermined level or higher. In such a case, the question evaluation receiving unit 118 can fully fulfill its role if it receives an instruction from the user to continue processing as an evaluation result. In other words, the evaluation received by the question evaluation receiving unit 118 may be an instruction to either continue processing or to regenerate. The evaluation in the QA evaluation receiving unit 122 is the same as the evaluation in the question evaluation receiving unit 118. 
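The flow of steps S104 to S109, with each evaluation expressed as an instruction to continue or to regenerate, can be sketched as follows. This is a minimal illustration and not the implementation of the embodiment: the names `run_staged_flow`, `QAInfo`, `CONTINUE`, and `REGENERATE`, the retry limit, and the callables standing in for the generation device 30 and the evaluation receiving units are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical instruction values; the text only says the evaluation may be
# an instruction to continue processing or to regenerate.
CONTINUE = "continue"
REGENERATE = "regenerate"


@dataclass(frozen=True)
class QAInfo:
    question_info: str  # corresponds to question information QI
    answer_info: str    # corresponds to answer information AI


def run_staged_flow(
    summary: str,
    perspective: str,
    reference_docs: List[str],
    generate_questions: Callable[[str, str], List[str]],
    generate_qa: Callable[[str, List[str]], Optional[QAInfo]],
    evaluate_questions: Callable[[List[str]], str],
    evaluate_qa: Callable[[QAInfo], str],
    max_attempts: int = 3,  # assumed limit; the source does not specify one
) -> List[QAInfo]:
    # Steps S104-S105: generate questions and regenerate them until the
    # evaluation is an instruction to continue.
    questions: List[str] = []
    for _ in range(max_attempts):
        candidate = generate_questions(summary, perspective)
        if evaluate_questions(candidate) == CONTINUE:
            questions = candidate
            break

    # Steps S107-S109: generate QA information per accepted question and
    # keep only the items whose evaluation is an instruction to continue.
    stored: List[QAInfo] = []
    for question in questions:
        qa = generate_qa(question, reference_docs)
        if qa is not None and evaluate_qa(qa) == CONTINUE:
            stored.append(qa)
    return stored


def demo():
    calls = []

    def generate_questions(summary, perspective):
        # Stand-in for the generation device: the first attempt is
        # deliberately off-topic so that regeneration is exercised.
        calls.append(perspective)
        if len(calls) == 1:
            return ["What is the weather today?"]
        return ["What are the business contents of A11?"]

    def evaluate_questions(questions):
        on_topic = all("A11" in q for q in questions)
        return CONTINUE if on_topic else REGENERATE

    def generate_qa(question, docs):
        hits = [d for d in docs if "A11" in d]
        return QAInfo(question, hits[0]) if hits else None

    stored = run_staged_flow(
        summary="Summary of the order-handling manual.",
        perspective="business content and business procedures",
        reference_docs=["Lunch menu.", "A11 handles invoice approval."],
        generate_questions=generate_questions,
        generate_qa=generate_qa,
        evaluate_questions=evaluate_questions,
        evaluate_qa=lambda qa: CONTINUE,
    )
    return stored, len(calls)
```

In this sketch a rejected set of questions never reaches the QA generation stage, which mirrors the statement that a question Q evaluated below the predetermined level is not used to generate QA information QAI.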
【0064】 [Summary of Embodiments] According to the embodiment described above, the information processing device 10 includes: a summary information acquisition unit 114 that acquires summary information SI, which is a summary of a business document D1 containing information about business procedures; a question generation perspective acquisition unit 115 that acquires a question generation perspective, which is a perspective on the direction of the questions to be generated and which includes business content and business procedures; a first supply unit 116 that supplies the summary information SI and the question generation perspective to the generation unit, causing it to generate one or more questions Q in line with the question generation perspective from the summary information SI; a question acquisition unit 117 that acquires the generated questions Q; a QA generation reference document acquisition unit 119 that acquires one or more QA generation reference documents D2 that include information about the answer; a second supply unit 120 that supplies one or more of the QA generation reference documents D2 and the question Q to the generation unit, causing it to extract question information QI related to question Q and answer information AI related to the answer to question Q from the one or more QA generation reference documents D2, and to generate QA information QAI, which is a combination of the question information QI and the answer information AI; a QA information acquisition unit 121 that acquires the generated QA information QAI; and a QA information output unit 123 that outputs the generated QA information QAI. In other words, the information processing device 10 generates question Q as preparation for generating QA information QAI. When generating QA information QAI, if only a long reference document is input, there is a possibility that irrelevant QA information QAI will be generated. 
By dividing the generation process into stages as described above, it is possible to comprehensively generate highly accurate QA information QAI at low cost using a general LLM. 【0065】 Furthermore, when the information processing device 10 is used for pre-processing documents, it is possible to easily generate information (QA information QAI) in which question information QI and answer information AI are included in one chunk, thereby improving the accuracy of the answers generated by the generative AI. In addition, since QA information QAI is generated by extracting question information QI and answer information AI, the amount of data per chunk becomes smaller, which reduces search costs and improves response speed. 【0066】 Furthermore, according to the embodiment described above, the information processing device 10 further includes a question evaluation receiving unit 118 that receives an evaluation (including an instruction to continue processing) of a question Q, and the second supply unit 120 causes QA information QAI to be generated based on questions Q that have received an evaluation of a predetermined standard or higher. In other words, according to the information processing device 10 of the embodiment, it is possible to check in advance the question Q to be used to generate the QA information QAI. If the generation trend of the questions Q is in line with the purpose, there is a high probability that the generated QA information QAI will also be in line with the purpose. The information processing device 10, which can check questions Q in advance, can improve the quality of the generated QA information QAI. 
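The single-chunk property described above can be illustrated with a small sketch in which sentences relevant to a question are collected from several reference documents and unrelated sentences are left out. Plain keyword matching is used here purely as a stand-in for the extraction performed by the generative AI; the function names, the `Q:`/`A:` chunk layout, and the sample documents are hypothetical.

```python
from typing import List, Optional


def extract_sentences(docs: List[str], keyword: str) -> List[str]:
    """Collect sentences that mention the keyword, in document order.

    Keyword matching is an assumption made for this sketch; the embodiment
    delegates extraction to a generation device.
    """
    hits: List[str] = []
    for doc in docs:
        for sentence in doc.split(". "):
            text = sentence.strip().rstrip(".")
            if keyword in text:
                hits.append(text + ".")
    return hits


def build_qa_chunk(question: str, keyword: str, docs: List[str]) -> Optional[str]:
    """Build one chunk holding the question and only the related answer text."""
    answers = extract_sentences(docs, keyword)
    if not answers:
        return None  # nothing related to the question was found
    return f"Q: {question}\nA: {' '.join(answers)}"


# Two hypothetical reference documents, each mixing related and unrelated text.
doc_a = "A11 is the order intake task. Lunch is served at noon."
doc_b = "The cafeteria is on floor 2. A11 is handled by the Tokyo branch."

chunk = build_qa_chunk("What are the business contents of A11?", "A11", [doc_a, doc_b])
print(chunk)
```

The resulting chunk combines answer text from both documents while omitting the sentences about lunch and the cafeteria, which corresponds to generating QA information QAI without information unrelated to the question or its answer.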
【0067】 Furthermore, according to the embodiment described above, the information processing device 10 further includes a QA evaluation receiving unit 122 that receives evaluations (including storage instructions) of the generated QA information QAI, and the QA information output unit 123 outputs the QA information QAI that has received an evaluation of a predetermined standard or higher to the storage unit 13. In other words, according to the information processing device 10 of the embodiment, the quality of the QA information QAI can be confirmed, and consequently, the accuracy of the answers given by the generative AI can be improved. 【0068】 Furthermore, according to the embodiment described above, the QA information QAI does not include any information unrelated to either the question Q or the answer to the question Q. In other words, the information processing device 10 according to the embodiment removes information unnecessary for the answer from the QA information QAI. This prevents discrepancies from occurring in the answers generated by the generative AI and the like. 【0069】 [Example Hardware Configuration] Figure 7 is a schematic diagram of an example hardware configuration of an information processing device 90 applied to this embodiment. The information processing device 90 comprises a processor 91, main memory 92, communication interface 93, auxiliary storage device 94, input / output interface 95, and internal bus 96. The processor 91, main memory 92, communication interface 93, auxiliary storage device 94, and input / output interface 95 are connected to each other via the internal bus 96 so as to be able to communicate with each other. The information processing device 90 may be applied to, for example, the information processing device 10. In this case, for example, the communication unit 110 may be configured using the communication interface 93. For example, the storage unit 13 may be configured using the auxiliary storage device 94. 
Also, the functional units shown in the calculation unit 11 may be configured using the processor 91 and the main memory 92. 【0070】 Furthermore, the functional units of the information processing device 10 from the business document acquisition unit 112 through the QA information output unit 123 may be implemented by the user terminal calculation unit 23 of the user terminal 20. In this case, the functions performed by the storage unit 13 may be implemented by the user terminal storage unit 24, a cloud server, or the like. 【0071】 The information processing device 10 may be implemented using multiple information processing devices. For example, the information processing device 10 may be implemented using cloud-based devices. The calculation unit 11 and the storage unit 13 may each be implemented in different information processing devices. For example, in the information processing device 10, business documents D1, QA generation reference documents D2, summary information SI, questions Q, and QA information QAI may be stored in a distributed manner across multiple information processing devices. 【0072】 Furthermore, all or part of the functions of each part of the information processing device 10 in the above-described embodiment may be realized by recording a program for realizing these functions on a computer-readable recording medium, and having a computer system read and execute the program recorded on this recording medium. The term "computer system" here includes an operating system and hardware such as peripheral devices. 【0073】 Furthermore, "computer-readable recording media" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, as well as storage units such as hard disks built into computer systems. 
In addition, "computer-readable recording media" may include those that dynamically hold programs for a short period of time, such as communication lines used when transmitting programs over networks such as the Internet or communication lines such as telephone lines, and those that hold programs for a certain period of time, such as volatile memory inside computer systems that act as servers or clients in such cases. Moreover, the above-mentioned program may be for the purpose of realizing some of the functions described above, and may also be able to realize the above-mentioned functions in combination with programs already recorded in the computer system. 【0074】 Although one embodiment of this invention has been described in detail above with reference to the drawings, the specific configuration is not limited to that described above, and various design changes can be made without departing from the spirit of this invention. Furthermore, the configurations described in each embodiment and example above may be combined. 
[Explanation of symbols] 【0075】 1...Information processing system, 10...Information processing device, 11...Calculation unit, 110...Communication unit, 111...Display control unit, 112...Business document acquisition unit, 113...Summary information generation unit, 114...Summary information acquisition unit, 115...Question generation perspective acquisition unit, 116...First supply unit, 117...Question acquisition unit, 118...Question evaluation reception unit, 119...QA generation reference document acquisition unit, 120...Second supply unit, 121...QA information acquisition unit, 122...QA evaluation reception unit, 123...QA information output unit, 13...Storage unit, 20...User terminal, 21...User terminal display unit, 22...User terminal operation unit, 23...User terminal calculation unit, 24...User terminal storage unit, 30...Generation device, D1...Business document, SI...Summary information, Q...Question, QI...Question information, AI...Answer information, QAI...QA information, D2...QA generation reference document
Claims
[Claim 1] An information processing device comprising: a summary information acquisition unit that acquires summary information summarizing a business document containing information about business procedures; a question generation perspective acquisition unit that acquires a question generation perspective, which is a perspective on the direction of the questions to be generated and which includes business content and business procedures; a first supply unit that supplies the summary information and the question generation perspective to a generation unit, thereby causing questions in line with the question generation perspective to be generated from the summary information; a QA generation reference document acquisition unit that acquires one or more QA generation reference documents containing information about answers; a second supply unit that supplies one or more of the QA generation reference documents and the questions to the generation unit, thereby causing question information relating to the questions and answer information relating to the answers to the questions to be extracted from the one or more QA generation reference documents, and causing QA information to be generated by combining the question information and the answer information; and a QA information output unit that outputs the generated QA information. [Claim 2] The information processing device according to claim 1, further comprising a question evaluation receiving unit that receives evaluations of the generated questions, wherein the second supply unit causes the QA information to be generated based on the questions that have received an evaluation of a predetermined standard or higher. [Claim 3] The information processing device according to claim 2, further comprising a QA evaluation receiving unit that receives evaluations of the generated QA information, wherein the QA information output unit outputs, to a storage unit, the QA information that has received an evaluation of a predetermined standard or higher. 
[Claim 4] The information processing device according to claim 1, wherein the QA information does not include any information unrelated to either the question or the answer to the question. [Claim 5] A program that causes a computer to execute: a summary information acquisition step of acquiring summary information summarizing a business document containing information about business procedures; a question generation perspective acquisition step of acquiring a question generation perspective, which is a perspective on the direction of the questions to be generated and which includes business content and business procedures; a first supply step of supplying the summary information and the question generation perspective to a generation unit, thereby causing questions in line with the question generation perspective to be generated from the summary information; a QA generation reference document acquisition step of acquiring one or more QA generation reference documents containing information about answers; a second supply step of supplying one or more of the QA generation reference documents and the questions to the generation unit, thereby causing question information relating to the questions and answer information relating to the answers to the questions to be extracted from the one or more QA generation reference documents, and causing QA information to be generated by combining the question information and the answer information; and a QA information output step of outputting the generated QA information.