Information processing system
The information processing system addresses the challenge of uncontrolled search targets in chatbots by using a database with keyword-enhanced text data to refine searches, enabling precise retrieval of relevant information.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- TOYOTA JIDOSHA KK
- Filing Date
- 2024-12-10
- Publication Date
- 2026-06-22
AI Technical Summary
Existing chatbots using large language models struggle with limiting the search target, as users cannot control the information sources accessed by the model.
An information processing system has a database in which text data is registered together with additional information, including keywords. The user limits the search target by entering keywords, and a search unit narrows the search based on the entered keywords and the additional information.
The user can narrow the search target and thereby control which information is retrieved for the large language model.
Abstract
Description
Technical Field
[0001] The present invention relates to the technical field of information processing systems.
Background Art
[0002] As this type of system, for example, an apparatus for generating a document to be input as a prompt to a large language model (LLM) has been proposed (see Patent Document 1).
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] For example, a chatbot has been proposed that uses Retrieval-Augmented Generation (RAG), a mechanism that provides a large language model with its own information source by combining the model with a search of a specific information source (hereinafter referred to as a "knowledge base" where appropriate). In such a chatbot, it is often impossible for a user to limit the search target. Note that a large language model is a language model constructed using a very large dataset and deep learning technology.
[0005] The present invention has been made in view of the above circumstances, and an object thereof is to provide an information processing system that allows a user to limit a search target.
Means for Solving the Problems
[0006] An information processing system according to one aspect of the present invention comprises: a database in which a plurality of text data, each with additional information including keywords, is registered; and an extraction means that, based on a question and keywords entered by a user, extracts from the database text data that has additional information including keywords corresponding to the entered keywords and that is related to the entered question.
Brief Description of the Drawings
[0007]
Figure 1 is a diagram showing the configuration of an information processing system according to an embodiment.
Figure 2 is a block diagram showing an example of the configuration of a computing device according to the embodiment.
Figure 3 is a diagram showing an example of a displayed image.
Modes for Carrying Out the Invention
[0008] Embodiments of the information processing system will be described with reference to Figures 1 to 3. In Figure 1, the information processing system 1 comprises an information processing device 10, a server 20, and a knowledge base 30. The information processing device 10, the server 20, and the knowledge base 30 are configured to communicate with each other via a network NW.
[0009] The server 20 and the knowledge base 30 provide a chatbot using RAG. The server 20 is a server for operating a large language model (LLM) and may therefore be referred to as an LLM server. The server 20 may also be a cloud server.
[0010] Knowledge base 30 contains multiple text data. In other words, multiple text data are registered in knowledge base 30. In this embodiment in particular, each of the multiple text data is assigned additional information (e.g., metadata). For example, the additional information may include classification information for classifying the text data. For example, the classification information may include at least one of the following: keywords related to the text data, the field to which the text data belongs, and the type of text data. The additional information may be manually assigned to the text data by the person who registers the text data in knowledge base 30 (e.g., the administrator of knowledge base 30). The additional information may also be automatically assigned to the text data by a predetermined device (not shown) applying predetermined processing (e.g., natural language processing) to the text data registered in knowledge base 30.
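As an illustration only, the registration of text data together with its additional information might be sketched as follows. The class and attribute names (AdditionalInfo, TextEntry, KnowledgeBase, field_name, text_type) are hypothetical and do not appear in the specification; the sketch covers only the case in which the person registering the text supplies the additional information manually.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class AdditionalInfo:
    """Classification information attached to one text entry (hypothetical names)."""
    keywords: frozenset = frozenset()   # keywords related to the text data
    field_name: Optional[str] = None    # field to which the text data belongs
    text_type: Optional[str] = None     # type of the text data


@dataclass(frozen=True)
class TextEntry:
    text: str
    info: AdditionalInfo


class KnowledgeBase:
    """In-memory stand-in for knowledge base 30."""

    def __init__(self) -> None:
        self._entries: List[TextEntry] = []

    def register(
        self,
        text: str,
        keywords: Iterable[str] = (),
        field_name: Optional[str] = None,
        text_type: Optional[str] = None,
    ) -> TextEntry:
        # Keywords are lower-cased so later comparisons are case-insensitive
        # (an assumption of this sketch, not a requirement of the source).
        info = AdditionalInfo(
            keywords=frozenset(k.lower() for k in keywords),
            field_name=field_name,
            text_type=text_type,
        )
        entry = TextEntry(text=text, info=info)
        self._entries.append(entry)
        return entry

    def entries(self) -> Tuple[TextEntry, ...]:
        return tuple(self._entries)
```

Keeping the additional information in a separate object from the text makes it straightforward to add further classification attributes later without changing how the text itself is stored.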
[0011] For example, the text data included in (in other words, registered in) Knowledge Base 30 may be fragmented data generated by splitting a document. Fragmented data may be referred to as "chunks." Methods for splitting a document include, for example, splitting it at fixed lengths, splitting it sentence by sentence based on sentence delimiters, or splitting it based on structure such as Markdown. Furthermore, each of the multiple fragmented data may be vectorized and registered in Knowledge Base 30. In other words, Knowledge Base 30 may be a vector database / vector store.
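The three splitting methods mentioned above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sentence splitter treats ".", "!", "?" and "。" as delimiters (so it would also split a decimal such as "3.14"), and the structural splitter only recognizes Markdown heading lines. Vectorization of the resulting chunks is not shown.

```python
import re
from typing import List


def split_fixed(text: str, size: int) -> List[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_sentences(text: str) -> List[str]:
    """Split after each sentence delimiter, keeping the delimiter."""
    parts = re.split(r"(?<=[.!?。])\s*", text)
    return [p.strip() for p in parts if p.strip()]


def split_markdown(text: str) -> List[str]:
    """Start a new chunk at every line that begins with '#'."""
    chunks: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]
```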
[0012] The information processing device 10 comprises an arithmetic unit 11, a storage device 12, a communication device 13, an input device 14, and an output device 15. The arithmetic unit 11, storage device 12, communication device 13, input device 14, and output device 15 are connected via a data bus 16. The information processing device 10 may be a personal computer, a tablet terminal, or a smartphone.
[0013] The arithmetic unit 11 may have a processor. The arithmetic unit 11 may have a single processor or multiple processors. In other words, the arithmetic unit 11 may have one or more processors. Furthermore, the processor may be a multi-core processor. If the arithmetic unit 11 has a single processor that is a multi-core processor, then logically, the arithmetic unit 11 can be said to have multiple processors.
[0014] The processor may be at least one of the following: CPU (Central Processing Unit), GPU (Graphics Processing Unit), FPGA (Field Programmable Gate Array), and TPU (Tensor Processing Unit).
[0015] The storage device 12 may be at least one of the following: RAM (Random Access Memory), ROM (Read Only Memory), hard disk drive, magneto-optical disk drive, SSD (Solid State Drive), and optical disk array. In other words, the storage device 12 may be implemented by a single device or by multiple devices.
[0016] The communication device 13 may be capable of communicating with a device external to the information processing device 10 (for example, the server 20). The communication device 13 may use either wired or wireless communication.
[0017] The input device 14 is a device capable of receiving information input to the information processing device 10 from an external source. The input device 14 may include an operating device (e.g., a keyboard, mouse, touch panel, etc.) that can be operated by the user of the information processing device 10. The input device 14 may include a recording medium reader capable of reading information recorded on a recording medium that can be attached to and detached from the information processing device 10, such as a USB (Universal Serial Bus) memory. When information is input to the information processing device 10 via the communication device 13 (in other words, when the information processing device 10 acquires information via the communication device 13), the communication device 13 may function as an input device.
[0018] The output device 15 is a device capable of outputting information to the outside of the information processing device 10. The output device 15 has a display device 151 capable of outputting visual information such as characters and images as the above information. The output device 15 may also have a speaker capable of outputting auditory information such as sound as the above information. The output device 15 may also have a vibration motor capable of outputting tactile information such as vibration as the above information. The output device 15 may also have a printer. The output device 15 may be capable of outputting information to a recording medium that can be attached to and detached from the information processing device 10, such as a USB memory stick. When the information processing device 10 outputs information via the communication device 13, the communication device 13 may function as an output device.
[0019] The storage device 12 is capable of storing desired data. The storage device 12 may store the computer program CP that the arithmetic unit 11 will execute. The storage device 12 may temporarily store data that the arithmetic unit 11 will use temporarily when the arithmetic unit 11 is executing the computer program CP.
[0020] Furthermore, the computer program CP may be recorded on a computer-readable and non-transitory recording medium. In this case, the computer program CP may be stored in the storage device 12 by reading the recording medium using a recording medium reader (not shown) provided in the information processing device 10. As the recording medium, at least one of an optical disk, a magnetic medium, a magneto-optical disk, a semiconductor memory, and any other medium capable of storing a program may be used. Furthermore, the computer program CP may be acquired from a device (not shown) outside the information processing device 10 via the communication device 13. In other words, the computer program CP may be downloaded from an external device to the storage device 12 of the information processing device 10.
[0021] The arithmetic unit 11 (e.g., a processor) may execute the processing to be performed by the information processing device 10 together with the storage device 12 in which the computer program CP is stored (in other words, together with the storage device 12 and the computer program CP stored in the storage device 12). For example, by executing the computer program CP, a logical functional block for executing the processing to be performed by the information processing device 10 may be realized in the arithmetic unit 11 (e.g., in the processor).
[0022] The arithmetic unit 11 of the information processing device 10 has a search unit 111, an input unit 112, and an acquisition unit 113 (see Figure 2) for using the chatbot provided by the server 20 and the knowledge base 30. The search unit 111, the input unit 112, and the acquisition unit 113 may be realized as the above-described logical functional blocks. Furthermore, at least one of the search unit 111, the input unit 112, and the acquisition unit 113 may be realized as a physical processing circuit. At least one of the search unit 111, the input unit 112, and the acquisition unit 113 may be realized in a form in which logical functional blocks and physical processing circuits are mixed.
[0023] For example, user U may use a chatbot using information processing device 10. For example, user U may input a question sentence to the chatbot using information processing device 10. At this time, an image 200 shown in FIG. 3 may be displayed on display device 151 of information processing device 10. Image 200 may include an input field 201 for user U to input a question sentence and an input field 202 for user U to input a keyword. After a question sentence is input into input field 201 and a keyword is input into input field 202, user U may select button 203 via input device 14.
[0024] Note that the "question sentence" is not limited to an interrogative sentence. It is a concept that also covers sentences expressing requests, instructions, commands, and the like, such as "Tell me about ****" and "Answer ****". That is, the "question sentence" may mean any sentence that requests an answer from the other party.
[0025] Search unit 111 of information processing device 10 searches knowledge base 30 based on the question sentence and keyword input by user U. For example, search unit 111 may identify text data to be searched from a plurality of text data registered in knowledge base 30 based on the input keyword and additional information attached to each text data. That is, search unit 111 may narrow down the search target based on the input keyword and additional information. For example, when the additional information includes the keyword, search unit 111 may identify the text data to which the additional information including the keyword corresponding to the input keyword is attached as the text data to be searched. For example, search unit 111 may calculate a search score indicating the degree of relevance between the input question sentence and the text data identified as the search target. Search unit 111 may extract text data having a search score equal to or higher than a predetermined value as text data related to the question sentence. For example, search unit 111 may extract text data to which additional information including the keyword corresponding to the input keyword is attached and that is related to the input question sentence.
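The two steps described above (narrowing by keyword, then extracting by search score) might look like the following sketch. Several points are assumptions made for illustration: "corresponding" keywords are taken to mean a case-insensitive exact match on at least one entered keyword, the search score is a simple word-overlap ratio standing in for whichever existing scoring method is actually used, and the names Entry, search_score, search, and the default threshold of 0.5 are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Entry:
    text: str
    keywords: frozenset  # keywords held in the additional information


def _tokens(s: str) -> Set[str]:
    return set(re.findall(r"\w+", s.lower()))


def search_score(question: str, text: str) -> float:
    """Stand-in relevance score: share of question words found in the text."""
    q = _tokens(question)
    if not q:
        return 0.0
    return len(q & _tokens(text)) / len(q)


def search(
    entries: Iterable[Entry],
    question: str,
    keywords: Iterable[str],
    threshold: float = 0.5,
) -> List[Entry]:
    wanted = {k.lower() for k in keywords}
    # Step 1: narrow the search target using the entered keywords and the
    # additional information. With no entered keywords, nothing matches here.
    candidates = [
        e for e in entries if wanted & {k.lower() for k in e.keywords}
    ]
    # Step 2: keep candidates whose score is at or above the threshold,
    # ordered from most to least relevant.
    scored = [(search_score(question, e.text), e) for e in candidates]
    hits = [(s, e) for s, e in scored if s >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [e for _, e in hits]
```

In this sketch an entry whose text mentions the topic but whose additional information lacks the entered keyword is never scored, which is the narrowing effect the paragraph describes.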
[0026] Here, the refinement of the search target and the calculation of the search score are explained separately, but in practice, the refinement of the search target and the calculation of the search score may be performed simultaneously. Furthermore, the search unit 111 may extract multiple text data related to the question. That is, the search unit 111 may extract one or more text data related to the question from the knowledge base 30. Furthermore, various existing methods can be applied to the calculation method of the search score. Therefore, a detailed explanation of the calculation method of the search score is omitted.
[0027] The input unit 112 of the information processing device 10 sends a prompt containing the question sentence and the text data related to the question sentence to the server 20 via the communication device 13. As a result, the prompt is input to the large language model. The server 20 sends the answer to the question sentence generated by the large language model to the information processing device 10. The acquisition unit 113 of the information processing device 10 acquires the answer sent by the server 20. The arithmetic unit 11 of the information processing device 10 may control the display device 151 to display the answer.
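The exchange described above can be illustrated with the following sketch. The prompt template, the function names build_prompt and ask, and the send_to_server callable are all hypothetical; the specification does not define the prompt format or the interface of the server 20, so the server is represented here by any function that takes a prompt string and returns an answer string.

```python
from typing import Callable, Sequence


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Combine the question with the extracted text data (hypothetical template)."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the reference text below.\n\n"
        f"Reference text:\n{context}\n\n"
        f"Question: {question}"
    )


def ask(
    question: str,
    passages: Sequence[str],
    send_to_server: Callable[[str], str],
) -> str:
    prompt = build_prompt(question, passages)  # role of the input unit 112
    answer = send_to_server(prompt)            # server 20 runs the model
    return answer                              # role of the acquisition unit 113
```

Passing the server call in as a parameter keeps the sketch independent of any particular model or network interface, and a stub function can be substituted for it when trying the flow locally.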
[0028] (Technical effects) In this embodiment, the knowledge base 30 contains multiple text data entries, each with additional information attached. When the knowledge base 30 is searched, the search target is narrowed down based on keywords entered by the user (e.g., user U) and the additional information. In other words, in this embodiment, the user (e.g., user U) can narrow down the search target by entering keywords. Therefore, according to the information processing system 1 of this embodiment, the user can limit the search target.
[0029] Various aspects of the invention derived from the embodiments described above are described below.
[0030] An information processing system according to one aspect of the invention comprises: a database in which a plurality of text data, each with additional information including keywords, is registered; and an extraction means that, based on a question and keywords entered by a user, extracts from the database text data to which additional information including keywords corresponding to the entered keywords is attached and which is related to the entered question. In the above embodiment, "knowledge base 30" corresponds to an example of a "database," and "search unit 111" corresponds to an example of an "extraction means."
[0031] In the information processing system according to the above embodiment, the additional information may be added manually. The information processing system according to the above embodiment may include a display means for displaying an image that includes a first field for the user to input a question and a second field for the user to input keywords. In the above embodiment, the "display device 151" corresponds to an example of the "display means".
[0032] The present invention is not limited to the embodiments described above, and can be modified as appropriate without contradicting the gist or idea of the invention as can be read from the claims and specification as a whole. Information processing systems involving such modifications are also included within the technical scope of the present invention.
Explanation of Symbols
[0033] 1... Information processing system, 10... Information processing device, 20... Server, 30... Knowledge base, 111... Search unit, 112... Input unit, 113... Acquisition unit
Claims
1. An information processing system comprising: a database in which a plurality of text data, each with additional information including keywords, is registered; and an extraction means for extracting from the database, based on a question and keywords entered by a user, text data to which additional information including keywords corresponding to the entered keywords is attached and which is related to the entered question.
2. The information processing system according to claim 1, wherein the additional information is added manually.
3. The information processing system according to claim 1, further comprising a display means for displaying an image that includes a first field for the user to enter a question and a second field for the user to enter keywords.