Information processing device

The information processing device enhances chatbot accuracy by performing morphological analysis and synonym-based searches to align large language model responses with user intent, addressing the issue of incomplete information extraction in RAG systems.

JP2026100416A · Pending · Publication Date: 2026-06-19 · TOYOTA JIDOSHA KK

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
TOYOTA JIDOSHA KK
Filing Date
2024-12-09
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing chatbots using Retrieval-Augmented Generation (RAG) with large language models often fail to accurately extract relevant information from knowledge bases, leading to incorrect answers due to incomplete or misinterpreted search results.

Method used

An information processing device performs morphological analysis on user input, acquires synonyms, and searches a database using these synonyms to enhance the accuracy of large language models by retrieving relevant data from a knowledge base.

Benefits of technology

Improves the accuracy of large language model responses by ensuring the extracted answers align with user intent through morphological analysis and synonym-based searches.

✦ Generated by Eureka AI based on patent content.


Abstract

[Problem] To improve the response accuracy of large-scale language models. [Solution] The information processing device (10) includes an analysis means (111) that performs morphological analysis on a question sentence input by a user, an acquisition means (112) that obtains synonyms for at least one word contained in the question sentence based on the results of the morphological analysis, and a search means (113) that searches a database (30) based on the question sentence and the acquired synonyms.

Description

Technical Field

[0001] The present invention relates to the technical field of information processing devices.

Background Art

[0002] As a device of this type, a device has been proposed that, for example, generates query data from a document for a language model and uses pairs of the document and the query data to train a search model for a chatbot (see Patent Document 1).

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] Chatbots have been proposed that use Retrieval-Augmented Generation (RAG), a mechanism that combines a large language model (LLM) with a search of a specific information source (hereinafter referred to as a "knowledge base" where appropriate) so as to give the large language model its own information source. A large language model is a language model constructed using a very large dataset and deep learning technology. In a chatbot using RAG, the information source may be given to the large language model by inputting part of the knowledge base search results into the model. However, the search may fail to extract relevant data. For example, when the knowledge base is searched with the word "wiper", data including "direction indicator" may not be extracted. As a result, the answer generated by the large language model may differ from the answer sought by the user.

[0005] This invention has been made in view of the above circumstances, and aims to provide an information processing device that can improve the response accuracy of large-scale language models.

Means for Solving the Problem

[0006] An information processing device according to one aspect of the present invention comprises: an analysis means for performing morphological analysis on a question sentence input by a user; an acquisition means for obtaining synonyms for at least one word contained in the question sentence based on the results of the morphological analysis; and a search means for searching a database based on the question sentence and the acquired synonyms.

Brief Description of the Drawings

[0007]
[Figure 1] A diagram showing the configuration of an information processing system according to an embodiment.
[Figure 2] A block diagram showing an example of the configuration of the arithmetic unit according to the embodiment.

Modes for Carrying Out the Invention

[0008] Embodiments relating to the information processing device will be described with reference to Figures 1 and 2. In Figure 1, the information processing system 1 comprises an information processing device 10, a server 20, and a knowledge base 30. The information processing device 10, server 20, and knowledge base 30 are configured to communicate with each other via a network NW. Server 20 is a server for operating a large-scale language model (LLM). For this reason, server 20 may be referred to as an LLM server. Server 20 may be a cloud server.

[0009] The knowledge base 30 may contain multiple text data entries. Each text data entry may be fragmented data generated by splitting a single document. Such fragmented data may be referred to as "chunks." Specific examples of methods for splitting a single document include splitting at fixed lengths, splitting into sentences based on sentence delimiters, and splitting based on document structure such as Markdown. Each text data entry may be vectorized text data. In other words, the knowledge base 30 may be a vector database (vector store).
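The three splitting methods named in paragraph [0009] can be illustrated with a minimal Python sketch. The function names, the delimiter set, and the rule of starting a new chunk at every Markdown heading are assumptions made for illustration; the specification does not prescribe any particular implementation.

```python
import re


def split_fixed(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_sentences(text: str) -> list[str]:
    """Split after sentence delimiters (assumed set: . ! ? and their
    full-width Japanese counterparts), dropping empty pieces."""
    parts = re.split(r"(?<=[.!?。！？])\s*", text)
    return [p for p in parts if p]


def split_markdown(text: str) -> list[str]:
    """Start a new chunk at every line that begins with '#'
    (an assumed, simplified notion of Markdown structure)."""
    chunks: list[str] = []
    current: list[str] = []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

The sentence splitter is deliberately naive: it would also split at a decimal point, so a production chunker would likely need a more careful delimiter rule.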

[0010] (Information processing device 10) In Figure 1, the information processing device 10 comprises an arithmetic unit 11, a storage device 12, a communication device 13, an input device 14, and an output device 15. The arithmetic unit 11, storage device 12, communication device 13, input device 14, and output device 15 are connected via a data bus 16. The information processing device 10 may be a personal computer, a tablet terminal, or a smartphone.

[0011] The arithmetic unit 11 may have a processor. The arithmetic unit 11 may have a single processor or multiple processors. In other words, the arithmetic unit 11 may have one or more processors. Furthermore, the processor may be a multi-core processor. If the arithmetic unit 11 has a single processor that is a multi-core processor, then logically, the arithmetic unit 11 can be said to have multiple processors.

[0012] The processor may be at least one of the following: CPU (Central Processing Unit), GPU (Graphics Processing Unit), FPGA (Field Programmable Gate Array), and TPU (Tensor Processing Unit).

[0013] The storage device 12 may be at least one of the following: RAM (Random Access Memory), ROM (Read Only Memory), hard disk drive, magneto-optical disk drive, SSD (Solid State Drive), and optical disk array. In other words, the storage device 12 may be implemented by a single device or by multiple devices.

[0014] The communication device 13 may be capable of communicating with a device external to the information processing device 10 (for example, the server 20). The communication device 13 may use either wired or wireless communication.

[0015] The input device 14 is a device capable of receiving information input to the information processing device 10 from an external source. The input device 14 may include an operating device (e.g., a keyboard, mouse, touch panel, etc.) that can be operated by the user of the information processing device 10. The input device 14 may include a recording medium reader capable of reading information recorded on a recording medium that can be attached to and detached from the information processing device 10, such as a USB (Universal Serial Bus) memory. When information is input to the information processing device 10 via the communication device 13 (in other words, when the information processing device 10 acquires information via the communication device 13), the communication device 13 may function as an input device.

[0016] The output device 15 is a device capable of outputting information to the outside of the information processing device 10. The output device 15 has a display device 151 capable of outputting visual information such as characters and images as the above information. The output device 15 may also have a speaker capable of outputting auditory information such as sound as the above information. The output device 15 may also have a vibration motor capable of outputting tactile information such as vibration as the above information. The output device 15 may also have a printer. The output device 15 may be capable of outputting information to a recording medium that can be attached to and detached from the information processing device 10, such as a USB memory stick. When the information processing device 10 outputs information via the communication device 13, the communication device 13 may function as an output device.

[0017] The storage device 12 is capable of storing desired data. The storage device 12 may store the computer program CP that the arithmetic unit 11 will execute. The storage device 12 may temporarily store data that the arithmetic unit 11 will use temporarily when the arithmetic unit 11 is executing the computer program CP.

[0018] Furthermore, the computer program CP may be recorded on a non-transitory computer-readable recording medium. In this case, the computer program CP may be stored in the storage device 12 by reading the recording medium using a recording medium reading device (not shown) provided in the information processing device 10. At least one of the following may be used as the recording medium: an optical disc, a magnetic medium, a magneto-optical disc, a semiconductor memory, and any other medium capable of storing a program. Alternatively, the computer program CP may be obtained from a device (not shown) external to the information processing device 10 via the communication device 13. In other words, the computer program CP may be downloaded from an external device to the storage device 12 of the information processing device 10.

[0019] The arithmetic unit 11 (for example, a processor) may execute the processing that the information processing device 10 should perform together with the storage device 12 in which the computer program CP is stored (in other words, together with the storage device 12 and the computer program CP stored in the storage device 12). For example, when the arithmetic unit 11 executes the computer program CP, logical functional blocks for executing the processing that the information processing device 10 should perform may be realized within the arithmetic unit 11 (for example, within the processor).

[0020] The server 20 and the knowledge base 30 provide a chatbot service using RAG. To use the chatbot service, the arithmetic unit 11 of the information processing device 10 has an analysis unit 111, an acquisition unit 112, a search unit 113, and an input unit 114 (see Figure 2). The analysis unit 111, the acquisition unit 112, the search unit 113, and the input unit 114 may be realized as the above-described logical functional blocks. Alternatively, at least one of these units may be realized as a physical processing circuit, or in a form in which logical functional blocks and physical processing circuits are mixed.

[0021] For example, the user may use the chatbot service via the information processing device 10. In this case, the user may input a question sentence via the input device 14 of the information processing device 10. Here, the "question sentence" is not limited to an interrogative sentence; it is a concept that also covers sentences expressing requests, instructions, commands, and the like, such as "Tell me about ****" and "Answer ****". That is, the "question sentence" may mean any sentence that requests an answer from the other party.

[0022] The analysis unit 111 of the information processing device 10 performs morphological analysis processing on the question sentence. Various existing techniques can be applied to the morphological analysis processing, so a detailed description is omitted. The acquisition unit 112 of the information processing device 10 acquires a synonym for at least one word included in the question sentence based on the result of the morphological analysis processing. For example, the acquisition unit 112 may acquire the synonym using a dictionary tool.
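Because paragraph [0022] leaves the analysis technique and the dictionary tool open, the following Python sketch only illustrates the data flow between the two units. Everything in it is a stand-in: the regex tokenizer with a toy lexicon replaces a real morphological analyzer, and `LEXICON`, `SYNONYMS`, `analyze`, and `acquire_synonyms` are hypothetical names with invented example entries.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    surface: str  # the token as it appears in the question
    base: str     # dictionary (base) form
    pos: str      # coarse part of speech


# Hypothetical toy lexicon; a real analyzer ships its own dictionary.
LEXICON = {
    "wipers": ("wiper", "noun"),
    "wiper": ("wiper", "noun"),
    "stopped": ("stop", "verb"),
}

# Hypothetical synonym dictionary keyed by base form.
SYNONYMS = {
    "wiper": ["windshield wiper", "windscreen wiper"],
    "stop": ["halt"],
}


def analyze(question: str) -> list[Morpheme]:
    """Stand-in for the analysis unit: tokenize, then look up base form and POS."""
    morphemes = []
    for token in re.findall(r"[A-Za-z]+", question.lower()):
        base, pos = LEXICON.get(token, (token, "other"))
        morphemes.append(Morpheme(token, base, pos))
    return morphemes


def acquire_synonyms(morphemes: list[Morpheme],
                     pos_filter: tuple[str, ...] = ("noun",)) -> dict[str, list[str]]:
    """Stand-in for the acquisition unit: look up synonyms for selected words."""
    return {
        m.base: SYNONYMS[m.base]
        for m in morphemes
        if m.pos in pos_filter and m.base in SYNONYMS
    }
```

Restricting the lookup to nouns is one possible design choice, made here because the analysis result exposes parts of speech; the specification only requires synonyms for at least one word of the question sentence.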

[0023] The search unit 113 of the information processing device 10 may search the knowledge base 30 based on the question sentence and the synonym. At this time, the search unit 113 may extract one or more text data entries related to the question sentence from the knowledge base 30. The search unit 113 may output a search result including the one or more extracted text data entries.
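A search based on both the question sentence and the synonyms can be sketched as simple query expansion. The scoring rule below (the number of query terms shared with each chunk) and the names `terms_of` and `search` are assumptions for illustration; the specification does not state how the search unit ranks entries, and a vector store would use embedding similarity instead.

```python
import re


def terms_of(text: str) -> set[str]:
    """Lower-cased alphabetic terms of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def search(question: str, synonyms: list[str], chunks: list[str],
           top_k: int = 2) -> list[str]:
    """Return up to `top_k` chunks that share terms with the expanded query."""
    query_terms = terms_of(question)
    for synonym in synonyms:
        query_terms |= terms_of(synonym)  # expand the query with synonym terms
    scored = []
    for chunk in chunks:
        score = len(query_terms & terms_of(chunk))
        if score > 0:
            scored.append((score, chunk))
    # sort() is stable, so chunks with equal scores keep their original order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


chunks = [
    "Replace each windscreen blade once a year.",
    "Check tire pressure monthly.",
    "The wiper switch has an AUTO position.",
]
with_synonyms = search("Tell me about wiper maintenance.", ["windscreen blade"], chunks)
without_synonyms = search("Tell me about wiper maintenance.", [], chunks)
```

In this toy data the first chunk never mentions "wiper", so it is found only when the hypothetical synonym "windscreen blade" is added to the query.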

[0024] The input unit 114 of the information processing device 10 transmits to the server 20, via the communication device 13, a prompt that includes the question sentence and the text data forming part of the search result. As a result, the prompt is input into the large language model.
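How a prompt might combine the question sentence with the retrieved text data is sketched below. The template wording and the function name `build_prompt` are assumptions; the specification states only that the prompt includes both parts. Transmission to the server 20 is omitted because no endpoint or protocol is described.

```python
def build_prompt(question: str, retrieved: list[str]) -> str:
    """Assemble a prompt from retrieved text data and the question sentence."""
    # Number each retrieved entry so the model can tell them apart.
    context = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


prompt = build_prompt(
    "Tell me about wiper maintenance.",
    ["Replace each windscreen blade once a year."],
)
```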

[0025] The server 20 transmits the answer to the question sentence, generated by the large language model, to the information processing device 10. For example, the arithmetic unit 11 of the information processing device 10 may control the display device 151 to display the answer generated by the large language model.

[0026] (Technical effect) In the present embodiment, morphological analysis processing is performed on the question sentence input by the user. Based on the result of the morphological analysis processing, a synonym for at least one word included in the question sentence is obtained. The knowledge base 30 is then searched based on the question sentence and the synonym. Text data related to the user's question sentence can therefore be expected to be extracted from the knowledge base 30, and the answer of the large language model to the question sentence can in turn be expected to be the answer sought by the user. Thus, the information processing device 10 according to the present embodiment can improve the answer accuracy of the large language model.
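The order of processing summarized in paragraph [0026] can be sketched as a small Python class whose fields mirror the units 111 to 114. The class name, the callable signatures, and the stub functions are hypothetical; in particular, `ask_llm` merely stands in for the round trip to the server 20 and echoes the prompt so that the flow is visible.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DeviceSketch:
    analyze: Callable[[str], list[str]]            # analysis unit 111
    acquire: Callable[[list[str]], list[str]]      # acquisition unit 112
    search: Callable[[str, list[str]], list[str]]  # search unit 113
    ask_llm: Callable[[str], str]                  # stand-in for server 20

    def answer(self, question: str) -> str:
        words = self.analyze(question)             # morphological analysis
        synonyms = self.acquire(words)             # synonym acquisition
        results = self.search(question, synonyms)  # knowledge base search
        # Role of input unit 114: combine search results and question.
        prompt = "\n".join(results) + "\n\nQuestion: " + question
        return self.ask_llm(prompt)


# Stub components, for illustration only.
device = DeviceSketch(
    analyze=lambda q: q.lower().rstrip("?.!").split(),
    acquire=lambda words: ["windscreen blade"] if "wiper" in words else [],
    search=lambda q, syns: [f"chunk matching {s}" for s in syns],
    ask_llm=lambda prompt: "ECHO: " + prompt,
)
reply = device.answer("Tell me about the wiper")
```

Passing the units in as callables keeps the sketch independent of any particular analyzer, dictionary, or model server.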

[0027] Various aspects of the invention derived from the embodiments described above will be described below.

[0028] An information processing device according to one aspect of the invention includes an analysis means that performs morphological analysis processing on a question sentence input by a user, an acquisition means that acquires a synonym for at least one word included in the question sentence based on the result of the morphological analysis processing, and a search means that searches a database based on the question sentence and the acquired synonym.

[0029] In the above-described embodiment, "Knowledge Base 30" corresponds to an example of a "database," "Analysis Unit 111" corresponds to an example of an "analysis means," "Acquisition Unit 112" corresponds to an example of an "acquisition means," and "Search Unit 113" corresponds to an example of a "search means."

[0030] The information processing device according to the above aspect may include an input means for inputting the question sentence and the search results of the database into a large language model. In the above embodiment, the "input unit 114" corresponds to an example of the "input means".

[0031] The present invention is not limited to the embodiments described above, and can be modified as appropriate without contradicting the gist or idea of the invention as can be read from the claims and specification as a whole. Information processing devices that involve such modifications are also included within the technical scope of the present invention.

Explanation of Symbols

[0032] 1...Information processing system, 10...Information processing device, 20...Server, 30...Knowledge base, 111...Analysis unit, 112...Acquisition unit, 113...Search unit, 114...Input unit

Claims

1. An information processing device comprising: an analysis means that performs morphological analysis on a question sentence input by a user; an acquisition means that obtains, based on the results of the morphological analysis, synonyms for at least one word contained in the question sentence; and a search means that searches a database based on the question sentence and the acquired synonyms.

2. The information processing device according to claim 1, further comprising an input means for inputting the question sentence and the search results of the database into a large language model.