system

The system addresses the challenge of obtaining and organizing reliable information by collecting, generating, and formatting user requests into easily understandable documents, ensuring quick and accurate information delivery.

JP2026105494A · Pending · Publication Date: 2026-06-26 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-16
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Individual users face challenges in quickly obtaining and organizing reliable information from vast amounts of data, which is time-consuming and often of questionable reliability, making it difficult to utilize effectively.

Method used

A system that receives user requests, collects and organizes information, generates documents using natural language processing, formats them for easy understanding, and verifies reliability by cross-referencing with other sources, providing the information in a user-friendly format.

Benefits of technology

Enables users to efficiently and accurately obtain necessary information in a reliable and easily understandable format, saving time and ensuring the accuracy of the content.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure 2026105494000001_ABST

Abstract

[Problem] To provide a system. [Solution] The system includes: means for receiving user requests; means for collecting information based on the received requests; means for organizing the collected information and generating documents; means for formatting the generated documents and providing them to the user; means for presenting information to provide guidance by audio and visual means; and means for verifying information by cross-referencing it with reliable sources.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] In modern society, it is difficult for individual users to quickly obtain the information they need in a well-organized form. In particular, ensuring the reliability of information and extracting only what is necessary from a vast amount of information takes considerable time and effort. As a result, users cannot make effective use of the information they seek, and the information does little to improve their quality of life.

Means for Solving the Problems

[0005] The present invention provides a system that receives user requests, collects information based on the received requests, and organizes the collected information to generate documents. In addition, it provides a system that formats the generated documents and provides them to the user, verifies the reliability of the information related to the requests, and adds source information to the generated documents, thereby enabling users to quickly and accurately obtain the information they need.

[0006] "User" refers to an individual or organization that intends to use this system to acquire and organize information.

[0007] A "request" refers to a specific request from a user regarding the information or results they desire from the system.

[0008] "Information" refers to the knowledge, data, or materials requested by the user, and includes text, images, links, etc.

[0009] "Means of collection" refers to the function of acquiring relevant information from the internet or databases based on user requests.

[0010] "Means of organization" refers to the function of structuring collected information and classifying and arranging it in a way that is easy for users to understand.

[0011] "Means of generation" refers to the function of creating documents using natural language processing and other methods based on organized information.

[0012] "Means of provision" refers to the function of presenting the generated document to the user in a format that can be displayed, transmitted, or downloaded.

[0013] "Means of verifying reliability" refers to a function that compares and confirms the accuracy of collected information with other sources.

[0014] "Source information" refers to data that indicates the origin or source of the collected information.

Brief Description of the Drawings

[0015]

[Figure 1] It is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] It is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] It is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] It is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] It is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] It is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] It is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which multiple emotions are mapped.
[Figure 10] It shows an emotion map to which multiple emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.

Embodiments for Carrying Out the Invention

[0016] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0017] First, the terms used in the following description will be explained.

[0018] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).

[0019] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which the processor uses as work memory.

[0020] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0021] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0022] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."

[0023] [First Embodiment]

[0024] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0025] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0026] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0027] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0028] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0029] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0030] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0031] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0032] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0033] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0034] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0035] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0036] The system of the present invention efficiently acquires the information desired by the user and provides it as an accurate manual. This system integrates the processes of information collection, organization, generation, provision, and reliability verification. Specific embodiments of each function are described below.

[0037] User request received

[0038] First, the user enters a request through the terminal. This request is a specific request for information, such as "how to obtain a passport" or "how to enjoy a particular tourist destination." The terminal receives this request and sends it to the server through the interface.

[0039] Information gathering and organization

[0040] The server collects relevant information from the internet and specific databases based on the user's request. The collected information is then categorized according to the request, so that it is organized in the way the user expects and is easy to understand.

[0041] Document generation

[0042] Using the organized information, the server generates documents using natural language generation technology. These documents are presented in a user-friendly and easy-to-understand format, clearly structured with chapters, bullet points, and other visual aids.

[0043] Document provision

[0044] The generated document is formatted as a PDF and laid out to highlight important information. The document is then sent to the user's device, where the user can view or download it.

[0045] Verification of information reliability and provision of source information

[0046] Furthermore, the server verifies the reliability of the collected information by comparing it with other reliable sources. During this process, source information and reference links are added to the documents to ensure users can use the information with confidence.

[0047] Specific example

[0048] For example, if a request is made for "How to enjoy Enoshima," the server will gather information on tourist spots, access, recommended activities and restaurants related to this destination, and organize it into an optimal travel plan. Then, based on this information, a guidebook-style manual will be created and provided to the user as a PDF. Links to the tourist destination's official website and travel review sites will also be included in the manual, eliminating the need for the user to conduct additional research themselves.

[0049] As described above, the system of the present invention provides users with easily understandable and useful information through a series of processes.

[0050] The following describes the processing flow.

[0051] Step 1:

[0052] The user uses the terminal's interface to enter an information request and presses the send button. For example, a request might be, "I want to know tourist information for a specific area." The terminal receives this request from the user and sends it to the server.

[0053] Step 2:

[0054] The server analyzes the request received from the terminal and extracts relevant keywords. Based on these keywords, it initiates an information search across internet sources and specified databases.

[0055] Step 3:

[0056] The server filters the collected information, selecting only the most reliable data. In doing so, it prioritizes information from official sources and specialized databases.

[0057] Step 4:

[0058] The server categorizes the selected information and further organizes it based on its content. This organized information is then used for document generation in the next step.

[0059] Step 5:

[0060] The server uses natural language generation (NLG) to create user-friendly text based on organized information. The document is structured according to specific chapters and flows, clearly explaining key points.

[0061] Step 6:

[0062] The server visually formats the generated text and outputs it in PDF format. During this process, the layout and design are configured with user readability in mind.

[0063] Step 7:

[0064] The server cross-checks the reliability of the information and adds source information and relevant links to the document. This helps users gain a deeper understanding of the document's content.

[0065] Step 8:

[0066] The terminal receives the PDF manual sent from the server and displays it to the user. The user can view or save this manual and print it for use as needed.
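The server-side part of this flow (Steps 2, 3, 5, and 7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in the disclosure: the stop-word list, the `Item` record, the in-memory `corpus`, and all function names are hypothetical, a substring match stands in for the search, a text template stands in for natural language generation, and categorization (Step 4) and PDF output (Step 6) are left out.

```python
from dataclasses import dataclass

# Hypothetical stop-word list; the disclosure does not specify one.
STOP_WORDS = {"i", "want", "to", "know", "how", "the", "a", "an", "for", "of", "in"}


@dataclass(frozen=True)
class Item:
    text: str
    source: str      # where the item came from (URL or database name)
    official: bool   # True for official sources or specialized databases


def extract_keywords(request: str) -> list[str]:
    """Step 2: lower-case the request and drop stop words and punctuation."""
    words = (w.strip(".,!?\"'") for w in request.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def search(keywords: list[str], corpus: list[Item]) -> list[Item]:
    """Step 2: keep items that mention at least one keyword."""
    return [it for it in corpus if any(k in it.text.lower() for k in keywords)]


def filter_reliable(items: list[Item]) -> list[Item]:
    """Step 3: prefer official sources; fall back to everything if none exist."""
    official = [it for it in items if it.official]
    return official or items


def build_document(request: str, items: list[Item]) -> str:
    """Steps 5 and 7: a title, one bullet per item, then the list of sources."""
    lines = [f"Guide: {request}", ""]
    lines += [f"- {it.text}" for it in items]
    lines += ["", "Sources:"]
    lines += [f"- {s}" for s in sorted({it.source for it in items})]
    return "\n".join(lines)


def handle_request(request: str, corpus: list[Item]) -> str:
    """Run keyword extraction, search, filtering, and document assembly in order."""
    keywords = extract_keywords(request)
    return build_document(request, filter_reliable(search(keywords, corpus)))
```

Calling `handle_request("How to enjoy Enoshima", corpus)` returns a plain-text guide whose bullets come only from items flagged as official whenever at least one such item matches.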

[0067] (Example 1)

[0068] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0069] When users acquire specific information, challenges include organizing the collected information, verifying its reliability, and providing it in a user-friendly format. If this process is done manually, it is time-consuming and labor-intensive, and concerns remain regarding the accuracy and reliability of the collected information. Therefore, there is a need for a system that organizes and provides information to users in an efficient and reliable manner.

[0070] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0071] In this invention, the server includes means for receiving user requests, collecting information based on the received requests, and organizing it by category; means for generating documents using natural language generation technology based on the organized information; and means for formatting the generated documents and providing them to the user. As a result, users can quickly and efficiently obtain the information they need and use it with confidence regarding its reliability.

[0072] "Means for receiving user requests" refers to the function that allows users to input requests for information acquisition into the system via their terminals and to receive those inputs.

[0073] "Means of collecting information" refers to the process of obtaining relevant data from the internet or specific databases based on a received request.

[0074] "Methods for organizing by category" refer to methods for classifying collected information into different categories according to the requirements, thereby enabling quick access to the desired information.

[0075] "Methods for generating documents using natural language generation technology" refers to methods that utilize AI technology to generate text based on information organized by category, thereby creating documents in a format that is easy for humans to read.

[0076] "Means of formatting and providing documents to users" refers to the process of formatting generated documents into an appropriate format, such as PDF, and providing them in a form that users can access.

[0077] "Means of verifying the reliability of information by comparing it with other reliable sources" refers to methods of verifying the accuracy of collected information by cross-referencing it with other public sources or reliable data.

[0078] "Means of adding information about the source" refers to the process of clearly indicating the source or origin of information in a generated document so that users can verify the origin and background information of the information.

[0079] This system is designed to enable users to quickly and efficiently obtain specific information and provide it as a reliable document. The following describes specific embodiments of the present invention.

[0080] The user first uses their device to request the information they need. This request can be entered via keyboard or voice input and may include specific details such as "a tourist guide for Shonan." Once the user sends the request, the device transfers this data to the server.

[0081] The server collects information based on this request. The hardware used is a server with a high-speed CPU, and the software utilizes web crawler technology and APIs. The server accesses the internet and specific databases to retrieve relevant information and organizes this information into categories.

[0082] Next, the server generates a document using a generative AI model. This model integrates natural language processing technology, allowing it to generate a human-readable document based on a prompt such as, for example, "Please explain the procedure for applying for a passport." This document is then visually organized and presented clearly, utilizing chapters and bullet points.

[0083] The generated document is formatted using PDF creation software, resulting in a visually easy-to-understand format. User-friendliness is prioritized, with important information highlighted. Finally, the server sends this PDF to the user via their terminal, allowing them to view or download the information.

[0084] In this process, the server verifies the reliability of the collected information against other reliable data sources and adds source information and reference links to the document. This allows users to use the information with confidence and saves them the trouble of conducting additional research themselves. Overall, this system provides an efficient and reliable method of information acquisition and delivery.
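One way to picture the cross-referencing described here is a simple agreement rule: a statement is kept only when enough independent sources report the same value, and those sources are recorded so they can be added to the document. The sketch below assumes this rule; the `Claim` record, the `min_sources` threshold, and the normalization step are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    topic: str    # what the statement is about, e.g. "hours"
    value: str    # the statement itself
    source: str   # origin of the statement


def normalize(value: str) -> str:
    """Collapse case and whitespace so trivially different wordings match."""
    return " ".join(value.lower().split())


def cross_reference(claims: list[Claim], min_sources: int = 2) -> dict[str, dict]:
    """Keep, per topic, the value backed by the most distinct sources.

    A topic is reported only if its best value reaches `min_sources`
    (an assumed threshold). The supporting sources are returned so they
    can be attached to the document as source information.
    """
    support: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for c in claims:
        support[c.topic][normalize(c.value)].add(c.source)

    verified = {}
    for topic, values in support.items():
        value, sources = max(values.items(), key=lambda kv: len(kv[1]))
        if len(sources) >= min_sources:
            verified[topic] = {"value": value, "sources": sorted(sources)}
    return verified
```

A topic reported by a single source is dropped rather than marked as unverified; a real system might instead keep it with a warning.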

[0085] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0086] Step 1:

[0087] The user enters an information request using a terminal. Using an input device (keyboard or voice input), they might request, for example, "Shonan tourist guide." The entered data is sent to the server in a structured format. Specifically, the user enters the request and presses the submit button, at which point the data is processed.

[0088] Step 2:

[0089] The server collects information based on information requests received from users. Using web crawler technology and APIs, it retrieves relevant information from the internet and specific databases. During this process, database searches are performed using keywords related to the requested topic, and the collected data is returned. Specifically, the server parses the request and accesses external sources to retrieve relevant information.

[0090] Step 3:

[0091] The server organizes the collected information by category. Here, a text analysis algorithm is used to divide the information into different categories based on its content. The input to this process is the collected raw data, and the output is data organized by category. Specifically, the server applies the algorithm to classify the information and stores it in a structured format.

[0092] Step 4:

[0093] The server generates text using a generative AI model based on organized information. The input is categorized data, and the output is a document in a human-readable format. Specifically, prompt sentences are fed into the AI model to generate the document. In this process, the AI model outputs natural language based on past training data.

[0094] Step 5:

[0095] The server formats the generated document into PDF format. The input here is the initial generated text, and the output is a visually formatted PDF document. Specifically, the server uses PDF creation software to adjust the document's layout. Particularly important sections are highlighted by changing font size and color.

[0096] Step 6:

[0097] The server verifies the reliability of the collected information and adds source information to the document. The inputs are the organized information and the reliable data sources it is compared against, and the output is the final document, checked for logical consistency and with source information added. Specifically, the server verifies reliability by cross-referencing with multiple reliable sources.

[0098] Step 7:

[0099] Ultimately, the server sends the formatted PDF to the user via the terminal. The input is the completed PDF document, and the output is the electronic file sent to the user. Specifically, the server transfers the document to the user's terminal via the network connection. The user can then receive, view, or download it.
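Steps 3 to 5 of this flow (categorize, structure into chapters and bullet points, highlight important parts) can be illustrated as follows. The sketch is an assumption-laden stand-in: the disclosure only says "a text analysis algorithm" and "PDF creation software", so the keyword rules in `CATEGORY_KEYWORDS`, the function names, and the plain-text rendering with `**` markers are all hypothetical.

```python
# Hypothetical category rules; the disclosure does not name the algorithm.
CATEGORY_KEYWORDS = {
    "Access": ("station", "train", "bus"),
    "Food": ("restaurant", "cafe", "seafood"),
    "Sights": ("shrine", "beach", "aquarium"),
}


def categorize(snippets: list[str]) -> dict[str, list[str]]:
    """Step 3: assign each snippet to the first category whose keyword it contains."""
    organized: dict[str, list[str]] = {}
    for snippet in snippets:
        lowered = snippet.lower()
        category = next(
            (name for name, words in CATEGORY_KEYWORDS.items()
             if any(w in lowered for w in words)),
            "Other",
        )
        organized.setdefault(category, []).append(snippet)
    return organized


def render(title: str, organized: dict[str, list[str]], important: set[str]) -> str:
    """Steps 4-5: one numbered chapter per category, one bullet per snippet.

    Snippets listed in `important` are wrapped in ** ** as a plain-text
    stand-in for the font-size and color highlighting applied in the PDF.
    """
    lines = [title, "=" * len(title)]
    for number, category in enumerate(sorted(organized), start=1):
        lines += ["", f"{number}. {category}"]
        for snippet in organized[category]:
            mark = "**" if snippet in important else ""
            lines.append(f"  - {mark}{snippet}{mark}")
    return "\n".join(lines)
```

Passing the output of `categorize` to `render` yields a chaptered outline that a PDF library could then lay out; the choice of library is left open here because the disclosure does not name one.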

[0100] (Application Example 1)

[0101] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0102] In modern society, users have a growing need to quickly and accurately access the information they need from a vast amount of data. However, much of this information is scattered, and some of it is of questionable reliability, making it extremely difficult for users to organize and quickly access it. Furthermore, providing this information through audio and visual means requires additional technological capabilities.

[0103] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0104] In this invention, the server includes means for receiving user requests, means for collecting information based on the received requests, means for organizing the collected information and generating documents, means for formatting the generated documents and providing them to the user, means for presenting information for audio and visual guidance, and means for verifying information by comparing it with reliable information sources. This enables users to quickly and accurately obtain reliable information and to facilitate smooth introduction and purchase decision support through audio and visual guidance.

[0105] "Means for receiving user requests" refers to devices or systems that capture requests via voice input, touch interfaces, etc., in order to accurately obtain the information and support that users are seeking.

[0106] "Means of collecting information" refers to algorithms and programs that search for and retrieve relevant information from the internet or specific databases based on user requests.

[0107] "Means of generating documents" refer to the processes and tools used to construct documents in a user-friendly format, based on organized information and utilizing natural language generation technology.

[0108] "Means of formatting and providing documents to users" refers to a system for visually arranging generated documents in PDF or other formats in an easily viewable manner and distributing them in a format that users can view.

[0109] "Means of presenting information for audio and visual guidance" refers to technologies that use displays and speakers to display or play information in an easy-to-understand format, making it easier for users to access the information.

[0110] "Means of verifying information by cross-referencing with reliable sources" refers to systems and methods that compare and verify collected information with other certified sources in order to ensure its accuracy and reliability.

[0111] To implement this application, the system first receives a user request via a terminal. The user inputs the request using voice or text, for example, "Tell me about a good restaurant nearby." This request is sent to the server, which then collects relevant information from the internet or databases based on the request.

[0112] The server organizes the collected information by category and generates documents using a generative AI model. These documents are structured with chapters and bullet points to facilitate user understanding. For audio and visual guidance, the information is displayed on the terminal's screen and played back through the speaker using speech synthesis software.

[0113] Furthermore, the server verifies the reliability of the information by cross-referencing it with other reliable sources. This process allows users to use the information with confidence. For example, if a user who has moved to a new area asks, "What are some recommended tourist spots nearby?", the server will gather information on nearby tourist attractions and provide a visual map and audio guide.

[0114] An example of a prompt message is: "We are developing a system that can provide surrounding information based on the user's voice requests. Please generate reliable guidance and visual instructions."
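A prompt like this would typically be combined with the user's request and the organized information before being passed to the generative model. The helper below is a minimal sketch of that assembly step only; the function name `build_prompt`, the section labels, and the closing formatting instruction are assumptions, and no model call is shown.

```python
def build_prompt(request: str, organized: dict[str, list[str]], instruction: str) -> str:
    """Combine an instruction, the user's request, and categorized facts into one prompt."""
    parts = [
        instruction.strip(),
        "",
        f"User request: {request.strip()}",
        "",
        "Collected information:",
    ]
    for category, facts in organized.items():
        parts.append(f"[{category}]")
        parts.extend(f"- {fact}" for fact in facts)
    # Assumed closing instruction, mirroring the chapter-and-bullet structure in the text.
    parts += ["", "Structure the answer with chapters and bullet points."]
    return "\n".join(parts)
```

Keeping the collected facts in the prompt, grouped by category, gives the model the material it is expected to organize rather than leaving it to recall details on its own.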

[0115] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0116] Step 1:

[0117] The user enters a request using a terminal. In the case of voice input, an audio signal is sent to the terminal via the microphone. The terminal's speech recognition software (e.g., Google® Speech-to-Text) converts the audio into text format and sends it to the server as a request. The input is in the form of audio or text, and the output is text request data.

[0118] Step 2:

[0119] The server parses the received text request and begins data processing to gather information relevant to the request. Here, a natural language processing algorithm is used to analyze the request text and extract relevant keywords. The input is the user's request text, and the output is a list of keywords.

[0120] Step 3:

[0121] The server uses extracted keywords to collect information from the internet and specific databases. This process utilizes web crawling techniques and API calls. The collected information is categorized based on the request. The input is a list of keywords, and the output is organized information data categorized by type.

[0122] Step 4:

[0123] The server generates a document using a generative AI model (e.g., OpenAI® GPT) based on the organized information. This document is structured with chapters and bullet points to make it easy for the user to understand. The input is organized information data, and the output is text data in a document format.

[0124] Step 5:

[0125] The generated document is formatted in PDF format and laid out to highlight important information. Furthermore, the server uses text-to-speech software (e.g., Amazon Polly) to generate voice guidance. The input is the generated document, and the output is a PDF file and an audio file.

[0126] Step 6:

[0127] Before delivery, the server verifies the document's reliability against third-party sources. Source information and reference links are added to the document. Only highly reliable information is provided to the user. The input is the generated document, and the output is the document with source information added.

[0128] Step 7:

[0129] The terminal provides the user with the final document and audio. Visual information is displayed on the screen, and audio guidance is played from the speaker. The information is downloadable, and the user can review it and take action. The input is reliable documents and audio, and the output is the provision of information to the end user.
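The inputs and outputs listed for each step suggest a pipeline in which speech recognition, document generation, and speech synthesis are interchangeable components. The sketch below wires them together as plain callables; `Guidance`, `run_flow`, and the three hook parameters are hypothetical names, and no real speech or language-model service is called.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Guidance:
    document: str   # text shown on the terminal's display
    audio: bytes    # synthesized speech played through the speaker


def run_flow(
    user_input: Union[str, bytes],
    transcribe: Callable[[bytes], str],   # Step 1: speech recognition (hypothetical hook)
    generate: Callable[[str], str],       # Steps 2-4 and 6: collect, organize, write, verify
    synthesize: Callable[[str], bytes],   # Step 5: text-to-speech (hypothetical hook)
) -> Guidance:
    """Run the flow for one request and return what the terminal presents in Step 7."""
    # Step 1: audio is converted to text; text input is passed through unchanged.
    request = transcribe(user_input) if isinstance(user_input, bytes) else user_input
    document = generate(request)
    return Guidance(document=document, audio=synthesize(document))
```

Passing the services in as functions keeps the flow testable with simple stand-ins and avoids tying the sketch to any particular vendor's API.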

[0130] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0131] This invention relates to a system that recognizes the user's emotions and adjusts the format and content of the information provided based on those emotions. The system includes an emotion engine and enables more personalized information provision in response to the user's requests. Specific embodiments are described below.

[0132] User emotion recognition

[0133] Users enter requests via voice or text through their device. This interface uses an emotion engine to analyze the user's emotions in real time based on their words, actions, and input. For example, if a user indicates an urgent situation such as "I urgently need to know how to obtain a passport," the emotion engine recognizes the stress and urgency.

[0134] Information gathering and coordination

[0135] The server collects information based on the results of the emotion engine's analysis in response to requests received from users. For example, for users in a hurry, it prioritizes collecting and selecting concise information that can be read quickly. In this way, it flexibly adjusts the information according to the user's emotions.

[0136] Document generation

[0137] The server customizes the document generation process based on the collected information to reflect the user's emotional state. If the emotion engine determines that the user's emotions are stable, it provides detailed information; if their emotions are unstable, it presents the information simply and intuitively.

[0138] Document provision

[0139] Users are provided with formatted documents via their devices. These documents are optimized based on an emotion engine and presented in the most appropriate format according to the user's emotional state. Users can then utilize this information in a timely and stress-free manner.

[0140] Specific example

[0141] For example, if a user is urgently seeking useful information while planning an overseas trip, the emotion engine detects their level of urgency, and the server quickly provides the most relevant and concise information. Documents are written in an emotion-appropriate tone and include links and checklists to help users find the information they need in a short amount of time.

[0142] By combining an emotion engine, the system of this invention goes beyond simply providing information; it understands the user's emotions and provides information tailored to their needs, thereby improving the user experience.

[0143] The following describes the processing flow.

[0144] Step 1:

[0145] Users enter requests via their devices and send them in voice or text format. This information is freely expressed according to the user's intent and purpose.

[0146] Step 2:

[0147] The device is equipped with an emotion engine that analyzes input voice and text to recognize the user's emotions. For example, it can identify emotions such as "hurried" or "excited" from the tone of voice and the content of the text.
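For the text portion of this step, a minimal rule-based sketch is given below. It is not the emotion engine itself: it ignores tone of voice, and the cue words in `EMOTION_CUES`, the label set, and the function name `estimate_emotion` are hypothetical.

```python
# Hypothetical cue phrases per emotion; matching is by simple substring search.
EMOTION_CUES = {
    "hurried": ("urgent", "asap", "immediately", "hurry"),
    "anxious": ("worried", "afraid", "nervous", "anxious"),
    "excited": ("excited", "can't wait", "looking forward"),
}


def estimate_emotion(text: str) -> str:
    """Return the emotion whose cues occur most often, or "calm" if none occur."""
    lowered = text.lower()
    scores = {
        emotion: sum(lowered.count(cue) for cue in cues)
        for emotion, cues in EMOTION_CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "calm"


print(estimate_emotion("I urgently need to know how to obtain a passport"))
```

A trained classifier would replace the cue table in practice, but the interface (text in, emotion label out) would stay the same.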

[0148] Step 3:

[0149] The server receives the user's request and emotion data from the terminal and begins processing. Based on the received information, it extracts keywords and begins collecting related information.

[0150] Step 4:

[0151] The server combines the collected information with the results of the emotion engine and filters and sorts the information according to the user's emotional state. For example, if anxiety is detected, it prioritizes simple, actionable steps that can be performed immediately.
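The filtering and sorting described in this step could look like the following sketch. The `actionable` flag, the emotion labels, and the cut-off of three items are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class InfoItem:
    text: str
    actionable: bool  # assumed flag: True if the item is a step the user can act on


def prioritize(items: list[InfoItem], emotion: str, max_items_when_stressed: int = 3) -> list[InfoItem]:
    """For stressed users, put short actionable items first and trim the list."""
    if emotion in {"anxious", "hurried"}:
        ranked = sorted(items, key=lambda item: (not item.actionable, len(item.text)))
        return ranked[:max_items_when_stressed]
    return list(items)  # calm users receive everything in the original order
```

Sorting on `(not item.actionable, len(item.text))` places actionable items before the others and shorter items before longer ones within each group.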

[0152] Step 5:

[0153] The server generates documents using organized information. In doing so, it adjusts the tone and style of the text according to the emotions expressed. For calm situations, it includes details; for anxious situations, it uses concise and reassuring language.

[0154] Step 6:

[0155] Once a document is generated, the server converts it to PDF format and arranges the layout to reflect emotionally relevant elements. This ensures that the information is visually harmonious.

[0156] Step 7:

[0157] The device receives the generated PDF document and provides it to the user. The user can efficiently and comfortably utilize the information by viewing it in an emotionally optimized format.

[0158] (Example 2)

[0159] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0160] Traditional information delivery systems do not take the user's emotional state into account when providing information, and thus the user experience is not necessarily improved. In particular, necessary information is not delivered quickly and appropriately when users are stressed or in a hurry.

[0161] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0162] In this invention, the server includes means for recognizing the emotional state of a user based on a received user request using an emotion analysis system, means for selecting and collecting information based on the emotional state, and means for generating a document using a generative AI model that takes the emotional state into consideration and providing it to the user. This makes it possible to provide information that is tailored to the user's emotional state, thereby improving the user experience.

[0163] An "information processing device" is a device that receives user requests in the form of voice or text and prepares the data necessary for subsequent processing.

[0164] An "emotion analysis system" is a technology or device that analyzes a user's words, actions, and input content to recognize that person's emotional state in real time.

[0165] A "generative AI model" is an artificial intelligence technology that generates text based on collected information and analysis results, and presents it to users in a more user-friendly way.

[0166] "Adding a source of information" means citing, in a generated document, the source of the information on which the document is based so that its origin is clearly indicated.

[0167] A description of embodiments for carrying out the present invention will be provided.

[0168] This system has the ability to recognize the user's emotions in real time and flexibly adjust the information provided according to that state. To achieve this, the system mainly uses the following hardware and software components.

[0169] Terminal

[0170] The terminal is a device that receives requests from users via voice or text. Voice requests are converted into text using speech recognition technology, and each request is then converted into a format that is easy to process. The terminal functions as the user interface, handling both information input and result output.

[0171] Server

[0172] The server receives requests sent from terminals and recognizes the user's emotional state through an emotion analysis system. This system utilizes natural language processing technology to analyze input words and their context, identifying emotions such as stress, tension, and relief. Furthermore, based on these analysis results, the server quickly collects relevant information from the internet and databases.

[0173] Generative AI Models

[0174] The collected information is transformed into a user-optimized document format using a generative AI model. The generative AI model extracts the key points of the information while adjusting the document's tone and style to reflect the user's emotional state.

[0175] Document provision

[0176] The generated document is provided to the user via a terminal. The terminal can format the document, such as by adding diagrams or lists, to make it easier to understand intuitively.

[0177] Specific example

[0178] For example, if a user is planning an overseas trip and feels an urgent need to know how to obtain a passport, the system analyzes that feeling, gathers relevant information, and generates a concise and easy-to-understand document. The document includes links and checklists, formatted to allow users to quickly obtain the necessary information.

[0179] Example of a prompt

[0180] "Please generate text that provides stress-reducing information for users who are in a hurry when planning their trip."
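A prompt of this kind could be assembled from the request and the estimated emotion, as in the sketch below. The tone instructions in `TONE_BY_EMOTION` and the function name `build_prompt` are hypothetical, and no particular generative AI API is assumed.

```python
# Hypothetical mapping from emotional state to a tone instruction.
TONE_BY_EMOTION = {
    "hurried": "Be concise and list the steps as a short checklist.",
    "anxious": "Use reassuring language and start with immediately actionable steps.",
    "calm": "Provide detailed explanations with background information.",
}


def build_prompt(request: str, emotion: str) -> str:
    """Combine the user request with a tone instruction for the detected emotion."""
    tone = TONE_BY_EMOTION.get(emotion, TONE_BY_EMOTION["calm"])
    return (
        f"User request: {request}\n"
        f"Estimated emotional state: {emotion}\n"
        f"Instruction: {tone}"
    )


print(build_prompt("How do I obtain a passport?", "hurried"))
```

Unknown emotion labels fall back to the "calm" instruction so that a prompt is always produced.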

[0181] In this way, by combining emotion analysis and generative AI models, this system can not only provide information but also deliver information that is optimally tailored to the user's emotions.

[0182] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0183] Step 1:

[0184] The terminal receives voice or text input from the user. This input concerns the information the user is requesting. In the case of voice input, the terminal uses speech recognition technology to convert it into text and prepares it for transmission to the server. At this point, the output is text data in a format that the server can parse.

[0185] Step 2:

[0186] The server receives text data sent from the terminal and analyzes the user's emotions using an emotion analysis system. The input is text data recording the user's requests, and natural language processing techniques are used to analyze the words and context to identify emotions such as stress, tension, and relief. The output is the analysis result indicating the user's emotional state.

[0187] Step 3:

[0188] The server selects and collects information appropriate to the user based on the results of the emotion analysis. Specifically, it prioritizes information that matches the user's situation when searching the internet and databases. The input for this step is the results of the emotion analysis and the user's request, and the output is a selected and organized collection of information.

[0189] Step 4:

[0190] The server uses a generative AI model based on collected information to generate documents tailored to the user. The input consists of organized information and analyzed emotional states. The generative AI model extracts key points from the information while considering the tone and style appropriate to the user's emotional state to generate the text. The output is a customized document.

[0191] Step 5:

[0192] The terminal receives the document generated by the server and prepares it for the user. In this step, the obtained document is formatted to make it easy for the user to understand. For example, diagrams and list formats are used to aid visual understanding. The input is the generated document, and the output is the formatted document displayed on the terminal.
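Steps 2 through 5 above form a simple pipeline in which each step consumes the output of the previous one. The following sketch shows only that data flow; every stage is a caller-supplied function, and the names `run_pipeline`, `analyze_emotion`, `collect`, `generate`, and `format_for_display` are assumptions. In the described system, the last stage runs on the terminal rather than the server.

```python
from typing import Callable


def run_pipeline(
    request_text: str,
    analyze_emotion: Callable[[str], str],
    collect: Callable[[str, str], list[str]],
    generate: Callable[[list[str], str], str],
    format_for_display: Callable[[str], str],
) -> str:
    """Chain the steps: emotion analysis, collection, generation, formatting."""
    emotion = analyze_emotion(request_text)       # Step 2: emotional state
    information = collect(request_text, emotion)  # Step 3: selected information
    document = generate(information, emotion)     # Step 4: customized document
    return format_for_display(document)           # Step 5: formatted for the user
```

Because the stages are injected, each one can be replaced (for example, a rule-based emotion analyzer with a trained model) without changing the overall flow.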

[0193] (Application Example 2)

[0194] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0195] In today's information society, there is a demand for information tailored to the individual emotions of each user. However, existing information systems generally only provide standardized information, making it difficult to offer services that are adapted to the user's psychological state. In particular, when providing information and support within the family, there is a need for a means to accurately recognize the emotions and psychological states of each family member and provide optimized information.

[0196] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0197] In this invention, the server includes emotion recognition means for analyzing the user's emotional state, means for collecting information based on received requests, and means for organizing the collected information and generating documents. This makes it possible to tailor and provide information based on the user's emotions.

[0198] "Users" refers to the individual people who receive the information provision service.

[0199] A "request" refers to the information requests or questions that users input into the system.

[0200] "Means of collecting information" refers to a system that finds and compiles relevant data in response to user requests.

[0201] "Means of generating documents" refers to the process of organizing collected information based on user requests and outputting it in a formalized form.

[0202] "Means of providing to users" refers to methods and equipment for presenting generated documents in a format that users can view.

[0203] "Emotion recognition means" refers to technology that analyzes and identifies emotions from a user's facial expressions and voice data.

[0204] "Means of adjusting and providing information content" refers to methods for optimizing and presenting the format and content of information according to the user's emotions.

[0205] This invention provides an emotion recognition and information provision system implemented in a home robot. This system senses the user's emotional state and provides information corresponding to that state. By utilizing a variety of devices, the system achieves more intuitive and personalized information delivery.

[0206] The server uses speech recognition software and video analysis technology as means of emotion recognition. Specifically, it utilizes Google Cloud's natural language processing service to analyze emotions from facial expressions and tone of voice. This makes it possible to understand the user's emotions in real time. Furthermore, audio and video data are collected via a home robot equipped with a camera and microphone. This home robot is equipped with an Intel RealSense camera and a microphone array for remote speech recognition, and it acquires emotion data.

[0207] The device uses a generative AI model based on acquired emotional data to collect and generate appropriate information. This process utilizes machine learning platforms such as TensorFlow®. When the user is experiencing stress, it can provide concise and timely information; when their emotions are stable, it can present detailed information in an easy-to-understand format. When providing information, it can also provide guidance via voice using speech synthesis technology such as Amazon Polly.

[0208] For example, if a user asks, "What should I cook today?", the robot will suggest a suitable menu based on the user's mood. Alternatively, by using a prompt such as, "A 13-year-old child is feeling stressed while studying math. Please provide suggestions for improving this situation," the robot can offer advice tailored to the home environment. This ensures reliable and valuable information is provided to the user.

[0209] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0210] Step 1:

[0211] Users input requests via voice or text through a home robot. If the input is voice, the robot's microphone captures it and sends it to the server as voice data. If the input is text, it is sent directly to the server as text data.

[0212] Step 2:

[0213] The server analyzes the received audio data using Google Cloud's natural language processing service to identify the user's emotions. The audio data is broken down into words and phrases, and features such as tone and speed of voice are analyzed. The results of this analysis are output as the emotional state.

[0214] Step 3:

[0215] The server then uses a generative AI model to collect information based on user requests. Considering the user's emotional state, it selects short, concise information for stressed users and detailed, comprehensive information for users who are more relaxed. The collected information is then passed on to the next step.

[0216] Step 4:

[0217] The server organizes the collected information and generates documents. It utilizes machine learning platforms such as TensorFlow to construct information layouts tailored to emotional states. The generated documents are arranged and formatted in a way that is easily understandable to the user.

[0218] Step 5:

[0219] The home robot, acting as a terminal, provides users with generated documents. The information is presented visually via a display or aurally using Amazon Polly's speech synthesis technology. Based on this information, users can make informed decisions and take appropriate actions.

[0220] Step 6:

[0221] The user reviews the suggested information. If they find it particularly helpful in reducing stress or solving problems, they can ask additional questions. These additional questions return them to step 1, starting a new information gathering cycle.

[0222] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0223] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0224] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0225] [Second Embodiment]

[0226] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0227] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0228] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0229] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0230] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0231] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0232] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0233] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0234] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0235] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0236] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0237] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0238] The system of the present invention efficiently acquires the information desired by the user and provides it as an accurate manual. This system integrates the processes of information collection, organization, generation, provision, and reliability verification. Specific embodiments of each function are described below.

[0239] User request received

[0240] First, the user enters a request through the terminal. This request is a specific request for information, such as "how to obtain a passport" or "how to enjoy a particular tourist destination." The terminal receives this request and sends it to the server through the interface.

[0241] Information gathering and organization

[0242] The server collects relevant information from the internet and specific databases based on the user's request. The collected information is then categorized according to the request, making it easy to understand what the user expects.

[0243] Document generation

[0244] Using the organized information, the server generates documents using natural language generation technology. These documents are presented in a user-friendly and easy-to-understand format, clearly structured with chapters, bullet points, and other visual aids.

[0245] Document provision

[0246] The generated document is formatted in PDF format and laid out to highlight important information. The document is then sent to the user via their device, who can view or download it.

[0247] Verification of information reliability and provision of source information

[0248] Furthermore, the server verifies the reliability of the collected information by comparing it with other reliable sources. During this process, source information and reference links are added to the documents to ensure users can use the information with confidence.

[0249] Specific example

[0250] For example, if a request is made for "How to enjoy Enoshima," the server will gather information on tourist spots, access, recommended activities and restaurants related to this destination, and organize it into an optimal travel plan. Then, based on this information, a guidebook-style manual will be created and provided to the user as a PDF. Links to the tourist destination's official website and travel review sites will also be included in the manual, eliminating the need for the user to conduct additional research themselves.

[0251] As described above, the system of the present invention provides users with easily understandable and useful information through a series of processes.

[0252] The following describes the processing flow.

[0253] Step 1:

[0254] The user uses the terminal's interface to enter an information request and press the send button. For example, a request might be, "I want to know tourist information for a specific area." The terminal receives this request from the user and sends it to the server.

[0255] Step 2:

[0256] The server analyzes the request received from the terminal and extracts relevant keywords. Based on these keywords, it initiates an information search across internet sources and specified databases.

[0257] Step 3:

[0258] The server filters the collected information, selecting only the most reliable data. In doing so, it prioritizes information from official sources and specialized databases.
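A simple way to express this prioritization is to rank records by source type, as sketched below. The `source_type` labels, their ordering in `SOURCE_PRIORITY`, and the function name `select_reliable` are assumptions made for illustration.

```python
# Assumed ranking of source types: lower numbers are treated as more reliable.
SOURCE_PRIORITY = {"official": 0, "specialized_database": 1, "other": 2}


def _priority(record: dict) -> int:
    """Look up a record's rank; unknown source types rank with "other"."""
    return SOURCE_PRIORITY.get(record["source_type"], SOURCE_PRIORITY["other"])


def select_reliable(records: list[dict], max_priority: int = 1) -> list[dict]:
    """Keep records from sufficiently reliable sources, most reliable first."""
    kept = [record for record in records if _priority(record) <= max_priority]
    return sorted(kept, key=_priority)
```

With the default `max_priority=1`, only records from official sources and specialized databases are retained.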

[0259] Step 4:

[0260] The server categorizes the selected information and further organizes it based on its content. This organized information is then used for document generation in the next step.

[0261] Step 5:

[0262] The server uses natural language generation (NLG) to create user-friendly text based on organized information. The document is structured according to specific chapters and flows, clearly explaining key points.

[0263] Step 6:

[0264] The server visually formats the generated text and outputs it in PDF format. During this process, the layout and design are configured with user readability in mind.

[0265] Step 7:

[0266] The server cross-checks the reliability of the information and adds source information and relevant links to the document. This helps users gain a deeper understanding of the document's content.

[0267] Step 8:

[0268] The terminal receives the PDF manual sent from the server and displays it to the user. The user can view or save this manual and print it for use as needed.

[0269] (Example 1)

[0270] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0271] When users acquire specific information, challenges include organizing the collected information, verifying its reliability, and providing it in a user-friendly format. If this process is done manually, it is time-consuming and labor-intensive, and concerns remain regarding the accuracy and reliability of the collected information. Therefore, there is a need for a system that organizes and provides information to users in an efficient and reliable manner.

[0272] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0273] In this invention, the server includes means for receiving user requests, collecting information based on the received requests, and organizing it by category; means for generating documents using natural language generation technology based on the organized information; and means for formatting the generated documents and providing them to the user. As a result, users can quickly and efficiently obtain the information they need and use it with confidence regarding its reliability.

[0274] "Means for receiving user requests" refers to the function that allows users to input requests for information acquisition into the system via their terminals and to receive those inputs.

[0275] "Means of collecting information" refers to the process of obtaining relevant data from the internet or specific databases based on a received request.

[0276] "Methods for organizing by category" refer to methods for classifying collected information into different categories according to the requirements, thereby enabling quick access to the desired information.

[0277] "Methods for generating documents using natural language generation technology" refers to methods that utilize AI technology to generate text based on information organized by category, thereby creating documents in a format that is easy for humans to read.

[0278] "The means for formatting and providing documents to users" refers to the process of formatting the generated documents into an appropriate format, such as PDF, and providing them in a form accessible to users.

[0279] "The means for verifying the reliability of information by comparing it with other reliable information sources" refers to the method of verifying the accuracy of the collected information by comparing it with other public information sources or reliable data.

[0280] "The means for annotating information sources" refers to the process of explicitly indicating the reference sources or citations in the generated documents so that users can confirm the origin and background information of the information.

[0281] This system enables users to quickly and efficiently obtain specific information, which is provided as a highly reliable document. Specific embodiments of the present invention are described below.

[0282] First, the user uses the terminal to request the required information. The request is entered by keyboard or voice and specifies concrete content, such as "A tourist guide to Shonan". When the user sends the request, the terminal transfers this data to the server.

[0283] Based on this request, the server collects information. As hardware, a server with a CPU capable of high-speed processing is used, and for software, web crawler technology and APIs are used. The server accesses the Internet or a specific database to obtain relevant information and organizes this information by category.

[0284] Next, the server uses a generative AI model to generate a document. The generative AI model integrates natural language processing technology, and based on a prompt sentence such as "Please tell me the procedure for applying for a passport", it can generate a document in a form that is easy for people to read. This document is visually easy to understand and edited in a structure that utilizes chapter headings and bullet points.

[0285] The generated document is formatted using PDF creation software, resulting in a visually easy-to-understand form. Consideration is given to usability, such as highlighting important information. Finally, the server sends this PDF to the user via the terminal, and the user can view or download the information.

[0286] In this process, the server verifies the reliability of the collected information against other reliable data sources and attaches source information and reference links to the document. This allows users to use the information with confidence and saves them the trouble of additional investigation. Overall, this system realizes an efficient and reliable method for information acquisition and provision.

[0287] The flow of the specific process in Example 1 will be described using FIG. 11.

[0288] Step 1:

[0289] The user inputs an information request using the terminal. Using an input device (keyboard or voice input), the user requests, for example, "A tourist guide to Shonan". The input data is sent to the server in a structured format. As a specific operation, the user inputs a request and presses the send button, and the data is processed.
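The "structured format" mentioned in Step 1 is not defined in the specification. As a minimal sketch, it could be a small JSON payload; the `InfoRequest` class and its field names (`text`, `input_mode`) are assumptions made only for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InfoRequest:
    """Hypothetical request payload sent from the terminal to the server."""
    text: str        # the request as typed, or as recognized from speech
    input_mode: str  # "keyboard" or "voice"


def to_payload(request: InfoRequest) -> str:
    """Serialize the request to a JSON string for transmission."""
    if request.input_mode not in ("keyboard", "voice"):
        raise ValueError(f"unsupported input mode: {request.input_mode}")
    return json.dumps(asdict(request), ensure_ascii=False)


payload = to_payload(
    InfoRequest(text="A tourist guide to Shonan", input_mode="keyboard")
)
```

Validating the input mode before serializing keeps malformed requests from reaching the server-side collection step.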

[0290] Step 2:

[0291] The server collects information based on the information request received from the user. By using web crawler technology and APIs, relevant information is obtained from the Internet or specific databases. At this time, a database search is performed using keywords related to the requested topic, and the collected data is returned. As a specific operation, the server analyzes the request and accesses external sources to obtain relevant information.

[0292] Step 3:

[0293] The server organizes the collected information by category. Here, a text analysis algorithm is used to divide the information into different categories based on its content. The input to this process is the collected raw data, and the output is data organized by category. Specifically, the server applies the algorithm to classify the information and stores it in a structured format.
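The specification does not name the text analysis algorithm used in Step 3. The sketch below substitutes a simple keyword-overlap classifier to show the input (raw snippets) and output (snippets grouped by category). The category vocabulary and the sample snippets are hypothetical.

```python
from collections import defaultdict

# Hypothetical category vocabulary; a real system would configure or learn this.
CATEGORY_KEYWORDS = {
    "food": {"restaurant", "cafe", "seafood"},
    "sightseeing": {"beach", "temple", "island", "aquarium"},
    "access": {"train", "station", "bus"},
}


def categorize(snippets: list[str]) -> dict[str, list[str]]:
    """Assign each snippet to the category whose keywords it mentions most."""
    organized: dict[str, list[str]] = defaultdict(list)
    for snippet in snippets:
        cleaned = snippet.lower().replace(",", " ").replace(".", " ")
        words = set(cleaned.split())
        scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
        best = max(scores, key=scores.get)
        # Snippets that match no category keyword go to "other".
        organized[best if scores[best] > 0 else "other"].append(snippet)
    return dict(organized)


# Made-up sample data for illustration only.
raw = [
    "Enoshima island has an aquarium near the beach.",
    "A seafood restaurant sits next to the harbor.",
    "Take the train to Fujisawa station.",
    "Opening hours vary by season.",
]
organized = categorize(raw)
```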

[0294] Step 4:

[0295] The server generates text using a generative AI model based on organized information. The input is categorized data, and the output is a document in a human-readable format. Specifically, prompt sentences are fed into the AI model to generate the document. In this process, the AI model outputs natural language based on past training data.
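As a sketch of how categorized data might be turned into a prompt sentence in Step 4, the function below assembles plain text from a category-to-facts mapping. The prompt wording and the function name `build_prompt` are assumptions. The call to the generative AI model itself is deliberately left out, since the specification does not fix a particular model API.

```python
def build_prompt(topic: str, organized: dict[str, list[str]]) -> str:
    """Assemble a prompt that asks for a structured, easy-to-read document."""
    lines = [
        f"Write an easy-to-read guide about: {topic}",
        "Use chapter headings and bullet points.",
        "Use only the facts listed below.",
        "",
    ]
    for category, facts in organized.items():
        lines.append(f"## {category}")
        lines.extend(f"- {fact}" for fact in facts)
    return "\n".join(lines)


prompt = build_prompt("Shonan", {"food": ["Seafood near the harbor."]})
```

Restricting the model to the listed facts is one way to keep the generated text tied to the collected information that is verified in a later step.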

[0296] Step 5:

[0297] The server formats the generated document into PDF format. The input here is the initial generated text, and the output is a visually formatted PDF document. Specifically, the server uses PDF creation software to adjust the document's layout. Particularly important sections are highlighted by changing font size and color.

[0298] Step 6:

[0299] The server verifies the reliability of the collected information and adds source information to the document. The input is the organized information together with the reliable data sources against which it is compared, and the output is the final document, checked for logical consistency and with source information added. Specifically, the server verifies reliability by cross-referencing with multiple reliable sources.
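The cross-referencing in Step 6 could be sketched as follows, assuming a claim counts as verified when at least two trusted sources contain the same statement. That threshold, the exact-match comparison after normalization, and the source names are all assumptions. A practical system would need fuzzy or semantic matching.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


def verify_claims(
    claims: list[str],
    trusted_sources: dict[str, list[str]],
    min_agreeing: int = 2,
) -> list[dict]:
    """Keep claims supported by at least `min_agreeing` sources, with source names."""
    verified = []
    for claim in claims:
        key = normalize(claim)
        supporters = sorted(
            name
            for name, statements in trusted_sources.items()
            if key in {normalize(s) for s in statements}
        )
        if len(supporters) >= min_agreeing:
            verified.append({"claim": claim, "sources": supporters})
    return verified


# Made-up statements and placeholder source names, for illustration only.
trusted = {
    "source_a": ["The aquarium opens at 9:00."],
    "source_b": ["the aquarium opens at  9:00."],
    "source_c": ["The beach is closed in winter."],
}
result = verify_claims(
    ["The aquarium opens at 9:00.", "The beach is closed in winter."], trusted
)
```

Returning the supporting source names alongside each claim gives the later step what it needs to attach source information to the document.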

[0300] Step 7:

[0301] Finally, the server sends the formatted PDF to the user via the terminal. The input is the completed PDF document, and the output is the electronic file sent to the user. Specifically, the server transfers the document to the user's terminal via the network connection. The user can then receive, view, or download it.

[0302] (Application Example 1)

[0303] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0304] In modern society, users have a growing need to quickly and accurately access the information they need from a vast amount of data. However, much of this information is scattered, and some of it is of questionable reliability, making it extremely difficult for users to organize and quickly access it. Furthermore, providing this information through audio and visual means requires additional technological capabilities.

[0305] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0306] In this invention, the server includes means for receiving user requests, means for collecting information based on the received requests, means for organizing the collected information and generating documents, means for formatting the generated documents and providing them to the user, means for presenting information for audio and visual guidance, and means for verifying information by comparing it with reliable information sources. This enables users to obtain reliable information quickly and accurately, and supports smooth guidance and purchase decisions through audio and visual presentation.

[0307] "Means for receiving user requests" refers to devices or systems that capture requests via voice input, touch interfaces, etc., in order to accurately obtain the information and support that users are seeking.

[0308] The "means for collecting information" refers to algorithms and programs for detecting and obtaining relevant information from the Internet or specific databases based on user requests.

[0309] The "means for generating a document" refers to processes and tools for constructing a document in a user-friendly format by making full use of natural language generation technology based on the organized information.

[0310] The "means for formatting and providing the document to the user" refers to a system for visually arranging the generated document in PDF or other formats in an easy-to-understand manner and distributing it in a state where the user can view it.

[0311] The "means for presenting information for voice and visual guidance" refers to technologies for displaying or playing information in an easy-to-understand form using a display or speaker to make it easier for users to access the information.

[0312] The "means for verifying information by comparison with a highly reliable information source" refers to a system or method for comparing and confirming with other authenticated information sources to ensure the accuracy and reliability of the collected information.

[0313] To realize this application example, the system first receives the user's request via the terminal. The user inputs the request by voice or text, for example, "Tell me about good restaurants nearby". This request is sent to the server, and the server collects relevant information from the Internet or databases based on the request.

[0314] The server organizes the collected information by category and generates a document using a generative AI model. This document is structured using headings and bullet points to be easy for the user to understand. For voice and visual guidance, the information is displayed on the terminal's display and played from the speaker using voice synthesis software.

[0315] Furthermore, the server verifies the reliability of the information by cross-referencing it with other reliable sources. This process allows users to use the information with confidence. For example, if a user who has moved to a new area asks, "What are some recommended tourist spots nearby?", the server will gather information on nearby tourist attractions and provide a visual map and audio guide.

[0316] An example of a prompt message is: "We are developing a system that can provide surrounding information based on the user's voice requests. Please generate reliable guidance and visual instructions."

[0317] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0318] Step 1:

[0319] The user enters a request using a device. In the case of voice input, an audio signal is sent to the device via the microphone. The device's speech recognition software (e.g., Google Speech-to-Text) converts the audio into text and sends it to the server as a request. The input is in the form of audio or text, and the output is text request data.

[0320] Step 2:

[0321] The server parses the received text request and begins data processing to gather information relevant to the request. Here, a natural language processing algorithm is used to analyze the request text and extract relevant keywords. The input is the user's request text, and the output is a list of keywords.
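The natural language processing algorithm in Step 2 is not specified. As a stand-in, the sketch below extracts keywords by tokenizing the request and dropping stopwords. The stopword list is a small hypothetical sample.

```python
import re

# Small hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {
    "a", "an", "the", "me", "about", "tell", "please", "to", "of",
    "in", "for", "what", "are", "some", "is",
}


def extract_keywords(request_text: str) -> list[str]:
    """Return unique non-stopword tokens in their order of appearance."""
    tokens = re.findall(r"[a-z0-9]+", request_text.lower())
    seen: set[str] = set()
    keywords: list[str] = []
    for token in tokens:
        if token not in STOPWORDS and token not in seen:
            seen.add(token)
            keywords.append(token)
    return keywords


keywords = extract_keywords("Tell me about good restaurants nearby")
```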

[0322] Step 3:

[0323] The server uses extracted keywords to collect information from the internet and specific databases. This process utilizes web crawling techniques and API calls. The collected information is categorized based on the request. The input is a list of keywords, and the output is organized information data categorized by type.

[0324] Step 4:

[0325] The server generates a document using a generative AI model (e.g., OpenAI GPT) based on the organized information. This document is structured with chapters and bullet points to make it easy for the user to understand. The input is organized information data, and the output is text data in a document format.

[0326] Step 5:

[0327] The generated document is formatted in PDF format and laid out to highlight important information. Furthermore, the server uses text-to-speech software (e.g., Amazon Polly) to generate voice guidance. The input is the generated document, and the output is a PDF file and an audio file.

[0328] Step 6:

[0329] The server then verifies the document's reliability against third-party sources. Source information and reference links are added to the document. Only highly reliable information is provided to the user. The input is the generated document, and the output is the document with source information added.
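Adding source information and reference links could be as simple as appending a numbered source list to the document text. The sketch below assumes that layout. The `Sources:` heading, the numbering style, and the example.com link are illustrative choices, not taken from the specification.

```python
def append_sources(document: str, sources: list[tuple[str, str]]) -> str:
    """Append a numbered list of (title, url) pairs to the document text."""
    if not sources:
        return document
    lines = [document.rstrip(), "", "Sources:"]
    for index, (title, url) in enumerate(sources, start=1):
        lines.append(f"[{index}] {title} <{url}>")
    return "\n".join(lines)


# Placeholder title and URL, for illustration only.
final_doc = append_sources(
    "Guide text", [("Example source", "https://example.com/guide")]
)
```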

[0330] Step 7:

[0331] The terminal provides the user with the final document and audio. Visual information is displayed on the screen, and audio guidance is played from the speaker. The information is downloadable, and the user can review it and take action. The input is reliable documents and audio, and the output is the provision of information to the end user.

[0332] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0333] This invention relates to a system that recognizes the user's emotions and adjusts the format and content of the information provided based on the recognized emotions. This system includes an emotion engine and enables more personalized information provision in response to the user's requests. Specific embodiments are described below.

[0334] User emotion recognition

[0335] Users enter requests via voice or text through their device. This interface uses an emotion engine to analyze the user's emotions in real time based on their words, actions, and input. For example, if a user indicates an urgent situation such as "I urgently need to know how to obtain a passport," the emotion engine recognizes the stress and urgency.

[0336] Information gathering and coordination

[0337] The server collects information based on the results of the emotion engine's analysis in response to requests received from users. For example, for users in a hurry, it prioritizes collecting and selecting concise information that can be read quickly. In this way, it flexibly adjusts the information according to the user's emotions.

[0338] Document generation

[0339] The server customizes the document generation process based on the collected information to reflect the user's emotional state. If the emotion engine determines that the user's emotions are stable, it provides detailed information; if their emotions are unstable, it presents the information simply and intuitively.

[0340] Document provision

[0341] Users are provided with formatted documents via their devices. These documents are optimized based on an emotion engine and presented in the most appropriate format according to the user's emotional state. Users can then utilize this information in a timely and stress-free manner.

[0342] Specific example

[0343] For example, if a user is urgently seeking useful information while planning an overseas trip, the emotion engine detects their level of urgency, and the server quickly provides the most relevant and concise information. Documents are written in an emotion-appropriate tone and include links and checklists to help users find the information they need in a short amount of time.

[0344] By combining an emotion engine, the system of this invention goes beyond simply providing information; it understands the user's emotions and provides information tailored to their needs, thereby improving the user experience.

[0345] The following describes the processing flow.

[0346] Step 1:

[0347] Users enter requests via their devices and send them in voice or text format. This information is freely expressed according to the user's intent and purpose.

[0348] Step 2:

[0349] The device is equipped with an emotion engine that analyzes input voice and text to recognize the user's emotions. For example, it can identify emotions such as "hurried" or "excited" from the tone of voice and the content of the text.
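The internals of the emotion engine are not disclosed. Purely as an illustration of the text side of Step 2, the sketch below labels a request by counting cue words. The cue lists and the `neutral` fallback are assumptions, and tone-of-voice analysis is outside the sketch.

```python
import re

# Hypothetical cue words per emotion; not taken from the specification.
EMOTION_CUES = {
    "hurried": {"urgently", "urgent", "immediately", "asap", "hurry"},
    "excited": {"excited", "thrilled"},
    "anxious": {"worried", "anxious", "afraid"},
}


def estimate_emotion(text: str) -> str:
    """Return the emotion with the most cue-word hits, or "neutral" if none."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    scores = {emotion: len(words & cues) for emotion, cues in EMOTION_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


emotion = estimate_emotion("I urgently need to know how to obtain a passport")
```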

[0350] Step 3:

[0351] The server receives the user's request and the emotion data from the terminal and begins appropriate processing. Based on the received information, it extracts keywords and begins collecting related information.

[0352] Step 4:

[0353] The server combines the collected information with the results of the emotion engine and filters and sorts the information according to the user's emotional state. For example, if anxiety is detected, it prioritizes simple, actionable steps that can be performed immediately.
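The filtering and sorting in Step 4 might look like the sketch below. The `Item` fields (`relevance`, `actionable`), the set of emotions treated as stressed, and the cut-off of three items are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical unit of collected information."""
    text: str
    relevance: float  # assumed score between 0 and 1
    actionable: bool  # True if the item is a step the user can act on now


STRESSED_STATES = {"anxious", "hurried"}


def select_for_emotion(
    items: list[Item], emotion: str, limit_when_stressed: int = 3
) -> list[Item]:
    """Rank by relevance; for stressed users, put actionable items first and trim."""
    ranked = sorted(items, key=lambda item: item.relevance, reverse=True)
    if emotion in STRESSED_STATES:
        # sorted() is stable, so relevance order is kept within each group.
        ranked = sorted(ranked, key=lambda item: not item.actionable)
        return ranked[:limit_when_stressed]
    return ranked


# Made-up sample items, for illustration only.
items = [
    Item("Background on passport types", 0.9, False),
    Item("Book an appointment online", 0.7, True),
    Item("Bring two photos", 0.8, True),
    Item("History of passports", 0.2, False),
]
short_list = select_for_emotion(items, "anxious", limit_when_stressed=2)
```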

[0354] Step 5:

[0355] The server generates documents using organized information. In doing so, it adjusts the tone and style of the text according to the emotions expressed. For calm situations, it includes details; for anxious situations, it uses concise and reassuring language.

[0356] Step 6:

[0357] Once a document is generated, the server converts it to PDF format and arranges the layout to reflect emotionally relevant elements. This ensures that the information is visually harmonious.

[0358] Step 7:

[0359] The device receives the generated PDF document and provides it to the user. The user can efficiently and comfortably utilize the information by viewing it in an emotionally optimized format.

[0360] (Example 2)

[0361] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0362] Conventional information delivery systems do not take the user's emotional state into account when providing information, and thus the user experience is not necessarily improved. In particular, there is a challenge in that necessary information is not delivered quickly and appropriately when users are stressed or in a hurry.

[0363] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0364] In this invention, the server includes means for recognizing the emotional state of a user based on a received user request using an emotion analysis system, means for selecting and collecting information based on the emotional state, and means for generating a document using a generative AI model that takes the emotional state into consideration and providing it to the user. This makes it possible to provide information that is tailored to the user's emotional state, thereby improving the user experience.

[0365] An "information processing device" is a device that receives user requests in the form of voice or text and prepares the data necessary for subsequent processing.

[0366] An "emotion analysis system" is a technology or device that analyzes a user's words, actions, and input content to recognize that person's emotional state in real time.

[0367] A "generative AI model" is an artificial intelligence technology that generates text based on collected information and analysis results, and presents it to users in a more user-friendly way.

[0368] "Adding a source of information" means adding that source to a generated document in order to clearly indicate the source of the information on which it is based.

[0369] A description of embodiments for carrying out the present invention will be provided.

[0370] This system has the ability to recognize the user's emotions in real time and flexibly adjust the information provided according to that state. To achieve this, the system mainly uses the following hardware and software components.

[0371] Terminal

[0372] The terminal is a device that receives requests from users via voice or text. Voice requests are converted into text using speech recognition technology, and each request is then converted into a format that is easy to process. The terminal functions as the user interface, handling both information input and result output.

[0373] Server

[0374] The server receives requests sent from terminals and recognizes the user's emotional state through an emotion analysis system. This system utilizes natural language processing technology to analyze input words and their context, identifying emotions such as stress, tension, and relief. Furthermore, based on these analysis results, the server quickly collects relevant information from the internet and databases.

[0375] Generative AI Models

[0376] The collected information is transformed into a user-optimized document format using a generative AI model. The generative AI model extracts the key points of the information while adjusting the document's tone and style to reflect the user's emotional state.

[0377] Document provision

[0378] The generated document is provided to the user via a terminal. The terminal can format the document, such as by adding diagrams or lists, to make it easier to understand intuitively.

[0379] Specific example

[0380] For example, if a user is planning an overseas trip and feels an urgent need to know how to obtain a passport, the system analyzes that feeling, gathers relevant information, and generates a concise and easy-to-understand document. The document includes links and checklists, formatted to allow users to quickly obtain the necessary information.

[0381] Example of a prompt

[0382] "Please generate text that provides stress-reducing information for users who are in a hurry when planning their trip."

[0383] In this way, by combining emotion analysis and generative AI models, this system can not only provide information but also deliver information that is optimally tailored to the user's emotions.

[0384] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0385] Step 1:

[0386] The terminal receives voice or text input from the user. This input concerns the information the user is requesting. In the case of voice input, the terminal uses speech recognition technology to convert it into text and prepares it for transmission to the server. At this point, the output is text data in a format that the server can parse.

[0387] Step 2:

[0388] The server receives text data sent from the terminal and analyzes the user's emotions using an emotion analysis system. The input is text data recording the user's requests, and natural language processing techniques are used to analyze the words and context to identify emotions such as stress, tension, and relief. The output is the analysis result indicating the user's emotional state.

[0389] Step 3:

[0390] The server selects and collects information appropriate to the user based on the results of the emotion analysis. Specifically, it searches the internet and databases and prioritizes information that matches the user's situation. The input for this step is the results of the emotion analysis and the user's request, and the output is a selected and organized collection of information.

[0391] Step 4:

[0392] The server uses a generative AI model based on collected information to generate documents tailored to the user. The input consists of organized information and analyzed emotional states. The generative AI model extracts key points from the information while considering the tone and style appropriate to the user's emotional state to generate the text. The output is a customized document.
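One way to pass the analyzed emotional state to the generative AI model in Step 4 is to fold a tone instruction into the prompt. The sketch below assumes that approach. The emotion labels follow the examples named in the text (stress, tension, relief), while the instruction wording and the function name are hypothetical.

```python
# Hypothetical tone instructions keyed by the emotion labels named in the text.
TONE_BY_EMOTION = {
    "stress": "Be concise and reassuring. Lead with a short checklist.",
    "tension": "Be calm and direct. Keep sentences short.",
    "relief": "Include full details and helpful background.",
}
DEFAULT_TONE = "Use a neutral, informative tone."


def build_emotion_aware_prompt(request: str, emotion: str, facts: list[str]) -> str:
    """Combine the request, the emotional state, and collected facts into a prompt."""
    tone = TONE_BY_EMOTION.get(emotion, DEFAULT_TONE)
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return (
        f"User request: {request}\n"
        f"Detected emotional state: {emotion}\n"
        f"Style instruction: {tone}\n"
        "Extract the key points from the facts below and write the document.\n"
        f"{fact_lines}"
    )


prompt = build_emotion_aware_prompt(
    "How do I obtain a passport?", "stress", ["Bring two photos."]
)
```

Falling back to a neutral tone for unrecognized labels avoids failing when the emotion analysis returns a state the table does not cover.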

[0393] Step 5:

[0394] The terminal receives the document generated by the server and prepares it for the user. In this step, the obtained document is formatted to make it easy for the user to understand. For example, diagrams and list formats are used to aid visual understanding. The input is the generated document, and the output is the formatted document displayed on the terminal.

[0395] (Application Example 2)

[0396] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0397] In today's information society, there is a demand for information tailored to the individual emotions of each user. However, existing information systems generally only provide standardized information, making it difficult to offer services that are adapted to the user's psychological state. In particular, when providing information and support within the family, there is a need for a means to accurately recognize the emotions and psychological states of each family member and provide optimized information.

[0398] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0399] In this invention, the server includes emotion recognition means for analyzing the user's emotional state, means for collecting information based on received requests, and means for organizing the collected information and generating documents. This makes it possible to tailor and provide information based on the user's emotions.

[0400] "Users" refers to individual people who are the recipients of the service to which information is provided.

[0401] A "request" refers to the information requests or questions that users input into the system.

[0402] "Means of collecting information" refers to a system that finds and compiles relevant data in response to user requests.

[0403] "Means of generating documents" refers to the process of organizing collected information based on user requests and outputting it in a formalized form.

[0404] "Means of providing to users" refers to methods and equipment for presenting generated documents in a format that users can view.

[0405] "Emotion recognition means" refers to technology that analyzes and identifies emotions from a user's facial expressions and voice data.

[0406] "Means of adjusting and providing information content" refers to methods for optimizing and presenting the format and content of information according to the user's emotions.

[0407] This invention provides an emotion recognition and information provision system implemented in a home robot. This system senses the user's emotional state and provides information corresponding to that state. By utilizing a variety of devices, the system achieves more intuitive and personalized information delivery.

[0408] The server uses speech recognition software and video analysis technology as means of emotion recognition. Specifically, it utilizes Google Cloud's natural language processing service to analyze emotions from facial expressions and tone of voice. This makes it possible to understand the user's emotions in real time. Furthermore, audio and video data are collected via a home robot equipped with a camera and microphone. This home robot is equipped with an Intel RealSense camera and a microphone array for remote speech recognition, and it acquires emotion data.

[0409] The device uses a generative AI model based on acquired emotional data to collect and generate appropriate information. This process utilizes machine learning platforms such as TensorFlow. When the user is experiencing stress, it can provide concise and timely information; when their emotions are stable, it can present detailed information in an easy-to-understand format. When providing information, it can also provide guidance via voice using speech synthesis technology such as Amazon Polly.

[0410] For example, if a user asks, "What should I cook today?", the robot will suggest a suitable menu based on the user's mood. Alternatively, by using a prompt such as, "A 13-year-old child is feeling stressed while studying math. Please provide suggestions for improving this situation," the robot can offer advice tailored to the home environment. This ensures reliable and valuable information is provided to the user.

[0411] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0412] Step 1:

[0413] Users input requests via voice or text through a home robot. If the input is voice, the robot's microphone captures it and sends it to the server as voice data. If the input is text, it is sent directly to the server as text data.

[0414] Step 2:

[0415] The server analyzes the received audio data using Google Cloud's natural language processing service to identify the user's emotions. The audio data is broken down into words and phrases, and features such as tone and speed of voice are analyzed. The results of this analysis are output as the emotional state.

[0416] Step 3:

[0417] The server then uses a generative AI model to collect information based on user requests. Considering the user's emotional state, it selects short, concise information for stressed users and detailed, comprehensive information for users who are more relaxed. The collected information is then passed on to the next step.

[0418] Step 4:

[0419] The server organizes the collected information and generates documents. It utilizes machine learning platforms such as TensorFlow to construct information layouts tailored to emotional states. The generated documents are arranged and formatted in a way that is easily understandable to the user.

[0420] Step 5:

[0421] The home robot, acting as a terminal, provides users with generated documents. The information is presented visually via a display or aurally using Amazon Polly's speech synthesis technology. Based on this information, users can make informed decisions and take appropriate actions.

[0422] Step 6:

[0423] The user reviews the suggested information. If they find it particularly helpful in reducing stress or solving problems, they can ask additional questions. These additional questions return them to step 1, starting a new information gathering cycle.

[0424] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0425] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0426] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0427] [Third Embodiment]

[0428] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0429] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0430] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0431] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0432] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0433] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0434] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0435] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0436] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0437] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0438] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0439] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0440] The system of the present invention efficiently acquires the information desired by the user and provides it as an accurate manual. This system integrates the processes of information collection, organization, generation, provision, and reliability verification. Specific embodiments of each function are described below.

[0441] User request received

[0442] First, the user enters a request through the terminal. This request is a specific request for information, such as "how to obtain a passport" or "how to enjoy a particular tourist destination." The terminal receives this request and sends it to the server through the interface.

[0443] Information gathering and organization

[0444] The server collects relevant information from the internet and specific databases based on the user's request. The collected information is then categorized according to the request so that the content the user expects is easy to grasp.

[0445] Document generation

[0446] Using the organized information, the server generates documents using natural language generation technology. These documents are presented in a user-friendly and easy-to-understand format, clearly structured with chapters, bullet points, and other visual aids.

[0447] Document provision

[0448] The generated document is formatted in PDF format and laid out to highlight important information. The document is then sent to the user's terminal, where the user can view or download it.

[0449] Verification of information reliability and provision of source information

[0450] Furthermore, the server verifies the reliability of the collected information by comparing it with other reliable sources. During this process, source information and reference links are added to the documents to ensure users can use the information with confidence.

[0451] Specific example

[0452] For example, if a request is made for "How to enjoy Enoshima," the server will gather information on tourist spots, access, recommended activities and restaurants related to this destination, and organize it into an optimal travel plan. Then, based on this information, a guidebook-style manual will be created and provided to the user as a PDF. Links to the tourist destination's official website and travel review sites will also be included in the manual, eliminating the need for the user to conduct additional research themselves.

[0453] As described above, the system of the present invention provides users with easily understandable and useful information through a series of processes.

[0454] The following describes the processing flow.

[0455] Step 1:

[0456] The user enters an information request through the terminal's interface and presses the send button. For example, a request might be, "I want to know tourist information for a specific area." The terminal receives this request from the user and sends it to the server.

[0457] Step 2:

[0458] The server analyzes the request received from the terminal and extracts relevant keywords. Based on these keywords, it initiates an information search across internet sources and specified databases.

[0459] Step 3:

[0460] The server filters the collected information, selecting only the most reliable data. In doing so, it prioritizes information from official sources and specialized databases.
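The prioritization in Step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the tier rules, the domain suffixes, the placeholder host db.example.org, and the names source_tier and select_reliable are all introduced here for illustration only.

```python
from urllib.parse import urlparse

# Hypothetical trust tiers: a lower number means a higher priority.
OFFICIAL_SUFFIXES = (".go.jp", ".gov")   # assumed markers of official sources
SPECIALIZED_HOSTS = {"db.example.org"}   # placeholder specialized database host


def source_tier(url: str) -> int:
    """Return 0 for official sources, 1 for specialized databases, 2 otherwise."""
    host = urlparse(url).hostname or ""
    if host.endswith(OFFICIAL_SUFFIXES):
        return 0
    if host in SPECIALIZED_HOSTS:
        return 1
    return 2


def select_reliable(items: list[dict], max_tier: int = 1) -> list[dict]:
    """Keep items whose source tier is at most max_tier, official sources first."""
    kept = [item for item in items if source_tier(item["url"]) <= max_tier]
    # sorted() is stable, so items within the same tier keep their original order.
    return sorted(kept, key=lambda item: source_tier(item["url"]))
```

With the default max_tier of 1, items from general web pages are dropped and official sources are listed ahead of specialized databases.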

[0461] Step 4:

[0462] The server categorizes the selected information and further organizes it based on its content. This organized information is then used for document generation in the next step.

[0463] Step 5:

[0464] The server uses natural language generation (NLG) to create user-friendly text based on organized information. The document is structured according to specific chapters and flows, clearly explaining key points.

[0465] Step 6:

[0466] The server visually formats the generated text and outputs it in PDF format. During this process, the layout and design are configured with user readability in mind.

[0467] Step 7:

[0468] The server cross-checks the reliability of the information and adds source information and relevant links to the document. This helps users gain a deeper understanding of the document's content.
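Adding source information and links to a document, as in Step 7, might look like the sketch below. The function name append_sources, the "Sources" heading, and the (title, URL) pair layout are assumptions made for illustration; the disclosure does not specify how sources are laid out.

```python
def append_sources(document: str, sources: list[tuple[str, str]]) -> str:
    """Append a numbered 'Sources' section of (title, url) pairs, skipping duplicate URLs."""
    seen = set()
    lines = [document.rstrip(), "", "Sources"]
    for title, url in sources:
        if url in seen:
            continue  # list each link only once
        seen.add(url)
        lines.append(f"{len(seen)}. {title}: {url}")
    return "\n".join(lines) + "\n"
```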

[0469] Step 8:

[0470] The terminal receives the PDF manual sent from the server and displays it to the user. The user can view or save this manual and print it for use as needed.

[0471] (Example 1)

[0472] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0473] When users acquire specific information, challenges include organizing the collected information, verifying its reliability, and providing it in a user-friendly format. If this process is done manually, it is time-consuming and labor-intensive, and concerns remain regarding the accuracy and reliability of the collected information. Therefore, there is a need for a system that organizes and provides information to users in an efficient and reliable manner.

[0474] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0475] In this invention, the server includes means for receiving user requests, collecting information based on the received requests, and organizing it by category; means for generating documents using natural language generation technology based on the organized information; and means for formatting the generated documents and providing them to the user. As a result, users can quickly and efficiently obtain the information they need and use it with confidence regarding its reliability.

[0476] "Means for receiving user requests" refers to the function that allows users to input requests for information acquisition into the system via their terminals and to receive those inputs.

[0477] "Means of collecting information" refers to the process of obtaining relevant data from the internet or specific databases based on a received request.

[0478] "Methods for organizing by category" refer to methods for classifying collected information into different categories according to the requirements, thereby enabling quick access to the desired information.

[0479] "Methods for generating documents using natural language generation technology" refers to methods that utilize AI technology to generate text based on information organized by category, thereby creating documents in a format that is easy for humans to read.

[0480] "Means of formatting and providing documents to users" refers to the process of formatting generated documents into an appropriate format, such as PDF, and providing them in a form that users can access.

[0481] "Means of verifying the reliability of information by comparing it with other reliable sources" refers to methods of verifying the accuracy of collected information by cross-referencing it with other public sources or reliable data.

[0482] "Means of adding information about the source" refers to the process of clearly indicating the source of information in a generated document so that users can check where the information came from and its background.

[0483] This system is designed to enable users to obtain specific information quickly and efficiently, and to provide that information as a reliable document. The following describes specific embodiments of the present invention.

[0484] The user first uses their device to request the information they need. This request can be entered via keyboard or voice input and may include specific details such as "a tourist guide for Shonan." Once the user sends the request, the device transfers this data to the server.

[0485] The server collects information based on this request. The hardware used is a server with a high-speed CPU, and the software utilizes web crawler technology and APIs. The server accesses the internet and specific databases to retrieve relevant information and organizes this information into categories.

[0486] Next, the server generates a document using a generative AI model. This model integrates natural language processing technology, allowing it to generate a human-readable document based on a prompt such as, for example, "Please explain the procedure for applying for a passport." This document is then visually organized and presented clearly, utilizing chapters and bullet points.

[0487] The generated document is formatted using PDF creation software, resulting in a visually easy-to-understand format. User-friendliness is prioritized, with important information highlighted. Finally, the server sends this PDF to the user via their terminal, allowing them to view or download the information.

[0488] In this process, the server verifies the reliability of the collected information against other reliable data sources and adds source information and reference links to the document. This allows users to use the information with confidence and saves them the trouble of conducting additional research themselves. Overall, this system provides an efficient and reliable method of information acquisition and delivery.

[0489] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0490] Step 1:

[0491] The user enters an information request using a terminal. Using an input device (keyboard or voice input), they might request, for example, "Shonan tourist guide." The entered data is sent to the server in a structured format. Specifically, the user enters the request and presses the submit button, at which point the data is processed.
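The "structured format" in Step 1 is not specified in this description. As one possible reading, the sketch below assumes JSON; the class InfoRequest, its field names, and the function to_payload are hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InfoRequest:
    text: str        # the request as typed or as recognized from speech
    input_mode: str  # "keyboard" or "voice" (assumed labels)


def to_payload(request: InfoRequest) -> str:
    """Validate the request and serialize it to a JSON string for the server."""
    if request.input_mode not in ("keyboard", "voice"):
        raise ValueError("input_mode must be 'keyboard' or 'voice'")
    if not request.text.strip():
        raise ValueError("request text must not be empty")
    # ensure_ascii=False keeps non-ASCII request text (e.g. Japanese) readable.
    return json.dumps(asdict(request), ensure_ascii=False)
```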

[0492] Step 2:

[0493] The server collects information based on information requests received from users. Using web crawler technology and APIs, it retrieves relevant information from the internet and specific databases. During this process, database searches are performed using keywords related to the requested topic, and the collected data is returned. Specifically, the server parses the request and accesses external sources to retrieve relevant information.

[0494] Step 3:

[0495] The server organizes the collected information by category. Here, a text analysis algorithm is used to divide the information into different categories based on its content. The input to this process is the collected raw data, and the output is data organized by category. Specifically, the server applies the algorithm to classify the information and stores it in a structured format.
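The text analysis algorithm in Step 3 is not named. The sketch below assumes the simplest possible stand-in, keyword overlap against a fixed vocabulary; the categories, the term sets, and the function categorize are illustrative assumptions rather than the disclosed method.

```python
from collections import defaultdict

# Hypothetical category vocabulary for a tourist-guide request.
CATEGORY_TERMS = {
    "access": {"train", "station", "bus", "parking"},
    "food": {"restaurant", "cafe", "seafood"},
    "sights": {"shrine", "beach", "aquarium", "lighthouse"},
}


def categorize(snippets: list[str]) -> dict[str, list[str]]:
    """Group raw text snippets under the category whose terms they mention most."""
    grouped = defaultdict(list)
    for snippet in snippets:
        words = set(snippet.lower().replace(",", " ").replace(".", " ").split())
        best, best_hits = "other", 0  # snippets with no matching term go to "other"
        for category, terms in CATEGORY_TERMS.items():
            hits = len(words & terms)
            if hits > best_hits:
                best, best_hits = category, hits
        grouped[best].append(snippet)
    return dict(grouped)
```

The input is the collected raw data as a list of snippets, and the output is a dictionary keyed by category, matching the input and output described for this step.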

[0496] Step 4:

[0497] The server generates text using a generative AI model based on organized information. The input is categorized data, and the output is a document in a human-readable format. Specifically, prompt sentences are fed into the AI model to generate the document. In this process, the AI model outputs natural language based on past training data.
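How the categorized data becomes a prompt is left open in Step 4. The sketch below shows one way to assemble such a prompt; the wording of the instructions and the function build_prompt are assumptions, and the call to the generative AI model itself is deliberately omitted because no specific model interface is described.

```python
def build_prompt(request: str, categorized: dict[str, list[str]]) -> str:
    """Assemble a prompt that asks for one chapter per category, in bullet points."""
    lines = [
        f"Write a clear, chaptered guide answering the request: {request}",
        "Use only the facts listed below, one chapter per category, with bullet points.",
        "",
    ]
    for category, facts in categorized.items():
        lines.append(f"## {category}")
        lines.extend(f"- {fact}" for fact in facts)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```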

[0498] Step 5:

[0499] The server formats the generated document into PDF format. The input here is the initial generated text, and the output is a visually formatted PDF document. Specifically, the server uses PDF creation software to adjust the document's layout. Particularly important sections are highlighted by changing font size and color.

[0500] Step 6:

[0501] The server verifies the reliability of the collected information and adds source information to the document. The inputs are the organized information and the reliable data sources against which it is compared, and the output is the final document, checked for logical consistency and with source information added. Specifically, the server verifies reliability by cross-referencing with multiple reliable sources.
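Cross-referencing against multiple sources, as in Step 6, can be illustrated with the sketch below. It assumes that each source has already been reduced to a set of normalized statements, so that agreement is a simple membership test; real claim matching would be considerably harder. The function verify_claims and the min_sources threshold are hypothetical.

```python
def verify_claims(
    claims: list[str],
    sources: dict[str, set[str]],
    min_sources: int = 2,
) -> list[dict]:
    """Keep claims stated by at least min_sources sources and record which ones."""
    verified = []
    for claim in claims:
        supporting = sorted(name for name, stated in sources.items() if claim in stated)
        if len(supporting) >= min_sources:
            verified.append({"claim": claim, "sources": supporting})
    return verified
```

The recorded source names can then be written into the document as source information.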

[0502] Step 7:

[0503] Finally, the server sends the formatted PDF to the user via the terminal. The input is the completed PDF document, and the output is the electronic file sent to the user. Specifically, the server transfers the document to the user's terminal via the network connection. The user can then receive, view, or download it.

[0504] (Application Example 1)

[0505] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0506] In modern society, users have a growing need to quickly and accurately access the information they need from a vast amount of data. However, much of this information is scattered, and some of it is of questionable reliability, making it extremely difficult for users to organize and quickly access it. Furthermore, providing this information through audio and visual means requires additional technological capabilities.

[0507] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0508] In this invention, the server includes means for receiving user requests, means for collecting information based on the received requests, means for organizing the collected information and generating documents, means for formatting the generated documents and providing them to the user, means for presenting information for audio and visual guidance, and means for verifying information by comparing it with reliable information sources. This enables users to quickly and accurately obtain reliable information and to facilitate smooth introduction and purchase decision support through audio and visual guidance.

[0509] "Means for receiving user requests" refers to devices or systems that capture requests via voice input, touch interfaces, etc., in order to accurately obtain the information and support that users are seeking.

[0510] "Means of collecting information" refers to algorithms and programs that search for and retrieve relevant information from the internet or specific databases based on user requests.

[0511] "Means of generating documents" refer to the processes and tools used to construct documents in a user-friendly format, based on organized information and utilizing natural language generation technology.

[0512] "Means of formatting and providing documents to users" refers to a system for visually arranging generated documents in PDF or other formats in an easily viewable manner and distributing them in a format that users can view.

[0513] "Means of presenting information for audio and visual guidance" refers to technologies that use displays and speakers to display or play information in an easy-to-understand format, making it easier for users to access the information.

[0514] "Means of verifying information by cross-referencing with reliable sources" refers to systems and methods that compare and verify collected information with other certified sources in order to ensure its accuracy and reliability.

[0515] To implement this application, the system first receives a user request via a terminal. The user inputs the request using voice or text, for example, "Tell me about a good restaurant nearby." This request is sent to the server, which then collects relevant information from the internet or databases based on the request.

[0516] The server organizes the collected information by category and generates documents using a generative AI model. These documents are structured with chapters and bullet points to facilitate user understanding. For audio and visual guidance, the information is displayed on the terminal's screen and played back through the speaker using speech synthesis software.

[0517] Furthermore, the server verifies the reliability of the information by cross-referencing it with other reliable sources. This process allows users to use the information with confidence. For example, if a user who has moved to a new area asks, "What are some recommended tourist spots nearby?", the server will gather information on nearby tourist attractions and provide a visual map and audio guide.

[0518] An example of a prompt message is: "We are developing a system that can provide surrounding information based on the user's voice requests. Please generate reliable guidance and visual instructions."

[0519] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0520] Step 1:

[0521] The user enters a request using a device. In the case of voice input, an audio signal is sent to the device via the microphone. The device's speech recognition software (e.g., Google Speech-to-Text) converts the audio into text and sends it to the server as a request. The input is in the form of audio or text, and the output is text request data.

[0522] Step 2:

[0523] The server parses the received text request and begins data processing to gather information relevant to the request. Here, a natural language processing algorithm is used to analyze the request text and extract relevant keywords. The input is the user's request text, and the output is a list of keywords.
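The natural language processing algorithm in Step 2 is not identified. As a minimal stand-in, the sketch below tokenizes the request and removes stop words; the stop-word list and the function extract_keywords are assumptions, and the approach only suits space-delimited text such as English.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller, language-specific one.
STOP_WORDS = {
    "a", "an", "the", "me", "about", "tell", "what", "are", "some",
    "is", "to", "how", "i", "of", "for", "in",
}


def extract_keywords(request_text: str) -> list[str]:
    """Return lowercase keywords in order of first appearance, without stop words."""
    tokens = re.findall(r"[a-z0-9]+", request_text.lower())
    keywords = []
    for token in tokens:
        if token not in STOP_WORDS and token not in keywords:
            keywords.append(token)
    return keywords
```

For the request "Tell me about a good restaurant nearby." this yields the keyword list good, restaurant, nearby.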

[0524] Step 3:

[0525] The server uses extracted keywords to collect information from the internet and specific databases. This process utilizes web crawling techniques and API calls. The collected information is categorized based on the request. The input is a list of keywords, and the output is organized information data categorized by type.

[0526] Step 4:

[0527] The server generates a document using a generative AI model (e.g., OpenAI GPT) based on the organized information. This document is structured with chapters and bullet points to make it easy for the user to understand. The input is organized information data, and the output is text data in a document format.

[0528] Step 5:

[0529] The generated document is formatted in PDF format and laid out to highlight important information. Furthermore, the server uses text-to-speech software (e.g., Amazon Polly) to generate voice guidance. The input is the generated document, and the output is a PDF file and an audio file.

[0530] Step 6:

[0531] Next, the server verifies the document's reliability against third-party sources and adds source information and reference links to the document. Only highly reliable information is provided to the user. The input is the generated document, and the output is the document with source information added.

[0532] Step 7:

[0533] The terminal provides the user with the final document and audio. Visual information is displayed on the screen, and audio guidance is played from the speaker. The information is downloadable, and the user can review it and take action. The input is reliable documents and audio, and the output is the provision of information to the end user.

[0534] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0535] This invention relates to a system that recognizes the user's emotions and adjusts the format and content of information provided based on that information. This system includes an emotion engine and enables more personalized information provision in response to the user's requests. Specific embodiments are described below.

[0536] User emotion recognition

[0537] Users enter requests via voice or text through their device. This interface uses an emotion engine to analyze the user's emotions in real time based on their words, actions, and input. For example, if a user indicates an urgent situation such as "I urgently need to know how to obtain a passport," the emotion engine recognizes the stress and urgency.

[0538] Information gathering and coordination

[0539] The server collects information based on the results of the emotion engine's analysis in response to requests received from users. For example, for users in a hurry, it prioritizes collecting and selecting concise information that can be read quickly. In this way, it flexibly adjusts the information according to the user's emotions.

[0540] Document generation

[0541] The server customizes the document generation process based on the collected information to reflect the user's emotional state. If the emotion engine determines that the user's emotions are stable, it provides detailed information; if their emotions are unstable, it presents the information simply and intuitively.

[0542] Document provision

[0543] Users are provided with formatted documents via their devices. These documents are optimized based on the emotion engine's analysis and presented in the most appropriate format according to the user's emotional state. Users can then utilize this information in a timely and stress-free manner.

[0544] Specific example

[0545] For example, if a user is urgently seeking useful information while planning an overseas trip, the emotion engine detects their level of urgency, and the server quickly provides the most relevant and concise information. Documents are written in an emotion-appropriate tone and include links and checklists to help users find the information they need in a short amount of time.

[0546] By combining an emotion engine, the system of this invention goes beyond simply providing information; it understands the user's emotions and provides information tailored to their needs, thereby improving the user experience.

[0547] The following describes the processing flow.

[0548] Step 1:

[0549] Users enter requests via their devices and send them in voice or text format. The request may be phrased freely, according to the user's intent and purpose.

[0550] Step 2:

[0551] The device is equipped with an emotion engine that analyzes input voice and text to recognize the user's emotions. For example, it can identify emotions such as "hurried" or "excited" from the tone of voice and the content of the text.
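The emotion engine's internals are not disclosed. The sketch below covers only the text side of Step 2 with a rule-based cue-word lookup; tone-of-voice analysis is out of scope here. The cue lists, the labels, and the function detect_emotion are illustrative assumptions and not the emotion identification model 59.

```python
import re

# Hypothetical cue words for each emotion label.
EMOTION_CUES = {
    "hurried": {"urgently", "urgent", "immediately", "hurry", "quickly"},
    "excited": {"excited", "thrilled"},
}


def detect_emotion(text: str) -> str:
    """Return the label with the most cue-word matches, or "neutral" if none match."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    scores = {label: len(words & cues) for label, cues in EMOTION_CUES.items()}
    label, score = max(scores.items(), key=lambda pair: pair[1])
    return label if score > 0 else "neutral"
```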

[0552] Step 3:

[0553] The server receives user requests and emotion data from the terminal and begins appropriate processing. Based on the received information, it extracts keywords and begins collecting related information.

[0554] Step 4:

[0555] The server combines the collected information with the results of the emotion engine and filters and sorts the information according to the user's emotional state. For example, if anxiety is detected, it prioritizes simple, actionable steps that can be performed immediately.
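The filtering and sorting in Step 4 might be sketched as follows. The item fields (title, steps, actionable), the emotion labels, and the function prioritize are hypothetical; the description states only that simple, immediately actionable steps are prioritized when anxiety is detected.

```python
def prioritize(items: list[dict], emotion: str, limit: int = 3) -> list[dict]:
    """Return items ordered for the given emotional state.

    Each item is assumed to carry 'title', 'steps' (number of steps to follow),
    and 'actionable' (whether the user can act on it immediately).
    """
    if emotion in ("anxious", "hurried"):
        actionable = [item for item in items if item["actionable"]]
        # Shortest procedures first, capped at `limit` entries.
        return sorted(actionable, key=lambda item: item["steps"])[:limit]
    return list(items)  # other states: keep everything in the original order
```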

[0556] Step 5:

[0557] The server generates documents using organized information. In doing so, it adjusts the tone and style of the text according to the emotions expressed. For calm situations, it includes details; for anxious situations, it uses concise and reassuring language.
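One way to express the tone and style adjustment in Step 5 is a lookup from emotional state to a style profile, which can then be turned into an instruction for the generative AI model. The profile fields, the numeric limits, and the names StyleProfile, style_for, and style_instruction are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleProfile:
    detail: str       # "detailed" or "concise"
    tone: str         # e.g. "neutral" or "reassuring"
    max_bullets: int  # assumed cap on bullet points per chapter


# Hypothetical mapping from emotional state to writing style.
STYLES = {
    "calm": StyleProfile("detailed", "neutral", 10),
    "anxious": StyleProfile("concise", "reassuring", 3),
}


def style_for(emotion: str) -> StyleProfile:
    """Fall back to the calm profile for emotions without a dedicated entry."""
    return STYLES.get(emotion, STYLES["calm"])


def style_instruction(emotion: str) -> str:
    """Render the profile as a sentence that can be appended to a prompt."""
    style = style_for(emotion)
    return (
        f"Write in a {style.tone} tone and a {style.detail} style, "
        f"with at most {style.max_bullets} bullet points per chapter."
    )
```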

[0558] Step 6:

[0559] Once a document is generated, the server converts it to PDF format and arranges the layout to reflect emotionally relevant elements. This ensures that the information is visually harmonious.

[0560] Step 7:

[0561] The device receives the generated PDF document and provides it to the user. The user can efficiently and comfortably utilize the information by viewing it in an emotionally optimized format.

[0562] (Example 2)

[0563] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0564] Traditional information delivery systems do not take the user's emotional state into account when providing information, and thus the user experience is not necessarily improved. In particular, there is a challenge in that necessary information is not delivered quickly and appropriately when users are stressed or in a hurry.

[0565] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0566] In this invention, the server includes means for recognizing the emotional state of a user based on a received user request using an emotion analysis system, means for selecting and collecting information based on the emotional state, and means for generating a document using a generative AI model that takes the emotional state into consideration and providing it to the user. This makes it possible to provide information that is tailored to the user's emotional state, thereby improving the user experience.

[0567] An "information processing device" is a device that receives user requests in the form of voice or text and prepares the data necessary for subsequent processing.

[0568] An "emotion analysis system" is a technology or device that analyzes a user's words, actions, and input content to recognize that person's emotional state in real time.

[0569] A "generative AI model" is an artificial intelligence technology that generates text based on collected information and analysis results, and presents it to users in a more user-friendly way.

[0570] "Adding a source of information" means adding that source to a generated document in order to clearly indicate the source of the information on which it is based.

[0571] A description of embodiments for carrying out the present invention will be provided.

[0572] This system has the ability to recognize the user's emotions in real time and flexibly adjust the information provided according to that state. To achieve this, the system mainly uses the following hardware and software components.

[0573] terminal

[0574] The terminal is a device that receives requests from users via voice or text. Voice requests are converted into text using speech recognition technology, and all requests are then converted into a format that is easy to process. The terminal functions as the user interface, handling both information input and result output.

[0575] server

[0576] The server receives requests sent from terminals and recognizes the user's emotional state through an emotion analysis system. This system utilizes natural language processing technology to analyze input words and their context, identifying emotions such as stress, tension, and relief. Furthermore, based on these analysis results, the server quickly collects relevant information from the internet and databases.

[0577] Generative AI Models

[0578] The collected information is transformed into a user-optimized document format using a generative AI model. The generative AI model extracts the key points of the information while adjusting the document's tone and style to reflect the user's emotional state.

[0579] Document provision

[0580] The generated document is provided to the user via a terminal. The terminal can format the document, such as by adding diagrams or lists, to make it easier to understand intuitively.

[0581] Specific example

[0582] For example, if a user is planning an overseas trip and feels an urgent need to know how to obtain a passport, the system analyzes that feeling, gathers relevant information, and generates a concise and easy-to-understand document. The document includes links and checklists, formatted to allow users to quickly obtain the necessary information.

[0583] Example of a prompt

[0584] "Please generate text that provides stress-reducing information for users who are in a hurry when planning their trip."

[0585] In this way, by combining emotion analysis and generative AI models, this system can not only provide information but also deliver information that is optimally tailored to the user's emotions.

[0586] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0587] Step 1:

[0588] The terminal receives voice or text input from the user. This input concerns the information the user is requesting. In the case of voice input, the terminal uses speech recognition technology to convert it into text and prepares it for transmission to the server. At this point, the output is text data in a format that the server can parse.

[0589] Step 2:

[0590] The server receives text data sent from the terminal and analyzes the user's emotions using an emotion analysis system. The input is text data recording the user's requests, and natural language processing techniques are used to analyze the words and context to identify emotions such as stress, tension, and relief. The output is the analysis result indicating the user's emotional state.

[0591] Step 3:

[0592] The server selects and collects information appropriate to the user based on the results of the emotion analysis. Specifically, it prioritizes information that matches the user's situation when searching the internet and databases. The input for this step is the results of the emotion analysis and the user's requests, and the output is a selected and organized collection of information.

[0593] Step 4:

[0594] The server uses a generative AI model based on collected information to generate documents tailored to the user. The input consists of organized information and analyzed emotional states. The generative AI model extracts key points from the information while considering the tone and style appropriate to the user's emotional state to generate the text. The output is a customized document.

[0595] Step 5:

[0596] The terminal receives the document generated by the server and prepares it for the user. In this step, the obtained document is formatted to make it easy for the user to understand. For example, diagrams and list formats are used to aid visual understanding. The input is the generated document, and the output is the formatted document displayed on the terminal.

[0597] (Application Example 2)

[0598] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0599] In today's information society, there is a demand for information tailored to the individual emotions of each user. However, existing information systems generally only provide standardized information, making it difficult to offer services that are adapted to the user's psychological state. In particular, when providing information and support within the family, there is a need for a means to accurately recognize the emotions and psychological states of each family member and provide optimized information.

[0600] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0601] In this invention, the server includes emotion recognition means for analyzing the user's emotional state, means for collecting information based on received requests, and means for organizing the collected information and generating documents. This makes it possible to tailor and provide information based on the user's emotions.

[0602] "Users" refers to individual people who are the recipients of the service to which information is provided.

[0603] A "request" refers to the information requests or questions that users input into the system.

[0604] "Means of collecting information" refers to a system that finds and compiles relevant data in response to user requests.

[0605] "Means of generating documents" refers to the process of organizing collected information based on user requests and outputting it in a formalized form.

[0606] "Means of providing to users" refers to methods and equipment for presenting generated documents in a format that users can view.

[0607] "Emotion recognition means" refers to technology that analyzes and identifies emotions from a user's facial expressions and voice data.

[0608] "Means of adjusting and providing information content" refers to methods for optimizing and presenting the format and content of information according to the user's emotions.

[0609] This invention provides an emotion recognition and information provision system implemented in a home robot. This system senses the user's emotional state and provides information corresponding to that state. By utilizing a variety of devices, the system achieves more intuitive and personalized information delivery.

[0610] The server uses speech recognition software and video analysis technology as means of emotion recognition. Specifically, it utilizes Google Cloud's natural language processing service to analyze emotions from facial expressions and tone of voice. This makes it possible to understand the user's emotions in real time. Furthermore, audio and video data are collected via a home robot equipped with a camera and microphone. This home robot is equipped with an Intel RealSense camera and a microphone array for remote speech recognition, and it acquires emotion data.

[0611] The device uses a generative AI model based on acquired emotional data to collect and generate appropriate information. This process utilizes machine learning platforms such as TensorFlow. When the user is experiencing stress, it can provide concise and timely information; when their emotions are stable, it can present detailed information in an easy-to-understand format. When providing information, it can also provide guidance via voice using speech synthesis technology such as Amazon Polly.

[0612] For example, if a user asks, "What should I cook today?", the robot will suggest a suitable menu based on the user's mood. Alternatively, by using a prompt such as, "A 13-year-old child is feeling stressed while studying math. Please provide suggestions for improving this situation," the robot can offer advice tailored to the home environment. This ensures reliable and valuable information is provided to the user.

[0613] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0614] Step 1:

[0615] Users input requests via voice or text through a home robot. If the input is voice, the robot's microphone captures it and sends it to the server as voice data. If the input is text, it is sent directly to the server as text data.

[0616] Step 2:

[0617] The server analyzes the received audio data using Google Cloud's natural language processing service to identify the user's emotions. The audio data is broken down into words and phrases, and features such as tone and speed of voice are analyzed. The results of this analysis are output as the emotional state.

[0618] Step 3:

[0619] The server then uses a generative AI model to collect information based on user requests. Considering the user's emotional state, it selects short, concise information for stressed users and detailed, comprehensive information for users who are more relaxed. The collected information is then passed on to the next step.

[0620] Step 4:

[0621] The server organizes the collected information and generates documents. It utilizes machine learning platforms such as TensorFlow to construct information layouts tailored to emotional states. The generated documents are arranged and formatted in a way that is easily understandable to the user.

[0622] Step 5:

[0623] The home robot, acting as a terminal, provides users with generated documents. The information is presented visually via a display or aurally using Amazon Polly's speech synthesis technology. Based on this information, users can make informed decisions and take appropriate actions.

[0624] Step 6:

[0625] The user reviews the suggested information. If they find it particularly helpful in reducing stress or solving problems, they can ask additional questions. These additional questions return them to step 1, starting a new information gathering cycle.
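
The emotion-dependent selection in Step 3 (short, concise information for stressed users and detailed information for relaxed users) can be illustrated with a minimal sketch. The class `InfoItem`, the label set `STRESSED_STATES`, and the limit of three items are hypothetical and are not specified in this disclosure; the emotion label is assumed to have been produced already in Step 2.

```python
from dataclasses import dataclass


@dataclass
class InfoItem:
    title: str
    summary: str   # one-sentence version of the item
    details: str   # longer explanation of the item


# Hypothetical labels; the text only distinguishes stressed users from stable ones.
STRESSED_STATES = {"stressed", "hurried", "anxious"}


def select_for_emotion(items, emotion, max_items_when_stressed=3):
    """Return (title, text) pairs sized to the user's emotional state."""
    if emotion in STRESSED_STATES:
        # Stressed users get only the first few items, in short form.
        return [(i.title, i.summary) for i in items[:max_items_when_stressed]]
    # Stable users get every item with the full explanation.
    return [(i.title, f"{i.summary} {i.details}") for i in items]
```

Keeping the selection rule separate from emotion recognition means the same rule could be reused whether the emotion label comes from voice, video, or text analysis.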

[0626] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0627] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, and inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0628] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0629] [Fourth Embodiment]

[0630] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0631] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0632] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0633] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0634] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0635] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0636] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0637] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0638] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0639] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0640] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0641] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0642] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0643] The system of the present invention efficiently acquires the information desired by the user and provides it as an accurate manual. This system integrates the processes of information collection, organization, generation, provision, and reliability verification. Specific embodiments of each function are described below.

[0644] User request received

[0645] First, the user enters a request through the terminal. This request is a specific request for information, such as "how to obtain a passport" or "how to enjoy a particular tourist destination." The terminal receives this request and sends it to the server through the interface.

[0646] Information gathering and organization

[0647] The server collects relevant information from the internet and specific databases based on the user's request. The collected information is then categorized according to the request, so that it is organized in the form the user expects.

[0648] Document generation

[0649] Using the organized information, the server generates documents using natural language generation technology. These documents are presented in a user-friendly and easy-to-understand format, clearly structured with chapters, bullet points, and other visual aids.

[0650] Document provision

[0651] The generated document is formatted in PDF format and laid out to highlight important information. The document is then sent to the user via their device, who can view or download it.

[0652] Verification of information reliability and provision of source information

[0653] Furthermore, the server verifies the reliability of the collected information by comparing it with other reliable sources. During this process, source information and reference links are added to the documents to ensure users can use the information with confidence.
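
One way to picture this verification is as a check that each collected statement is backed by at least one source on a list of reliable hosts, with the matching links appended to the text. The following is a minimal sketch under that assumption; the `Claim` class, the host names in `TRUSTED_HOSTS` (placeholder `.example` domains), and the `min_trusted` threshold are hypothetical and do not appear in this disclosure.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Claim:
    text: str
    source_urls: list = field(default_factory=list)


# Hypothetical allow-list of hosts treated as reliable (placeholder domains).
TRUSTED_HOSTS = {"official.example", "reviews.example"}


def trusted_sources(claim):
    """Return the claim's source URLs whose host is on the allow-list."""
    return [u for u in claim.source_urls if urlparse(u).hostname in TRUSTED_HOSTS]


def verify_and_annotate(claims, min_trusted=1):
    """Keep claims backed by enough trusted sources and append their links."""
    lines = []
    for claim in claims:
        sources = trusted_sources(claim)
        if len(sources) >= min_trusted:
            lines.append(f"{claim.text} [Sources: {', '.join(sources)}]")
    return lines
```

Raising `min_trusted` to 2 would require corroboration by two independent reliable sources before a statement is included.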

[0654] Specific example

[0655] For example, if a request is made for "How to enjoy Enoshima," the server will gather information on tourist spots, access, recommended activities and restaurants related to this destination, and organize it into an optimal travel plan. Then, based on this information, a guidebook-style manual will be created and provided to the user as a PDF. Links to the tourist destination's official website and travel review sites will also be included in the manual, eliminating the need for the user to conduct additional research themselves.

[0656] As described above, the system of the present invention provides users with easily understandable and useful information through a series of processes.

[0657] The following describes the processing flow.

[0658] Step 1:

[0659] The user enters an information request through the terminal's interface and presses the send button. For example, a request might be, "I want to know tourist information for a specific area." The terminal receives this request from the user and sends it to the server.

[0660] Step 2:

[0661] The server analyzes the request received from the terminal and extracts relevant keywords. Based on these keywords, it initiates an information search across internet sources and specified databases.

[0662] Step 3:

[0663] The server filters the collected information, selecting only the most reliable data. In doing so, it prioritizes information from official sources and specialized databases.

[0664] Step 4:

[0665] The server categorizes the selected information and further organizes it based on its content. This organized information is then used for document generation in the next step.

[0666] Step 5:

[0667] The server uses natural language generation (NLG) to create user-friendly text based on organized information. The document is structured according to specific chapters and flows, clearly explaining key points.

[0668] Step 6:

[0669] The server visually formats the generated text and outputs it in PDF format. During this process, the layout and design are configured with user readability in mind.

[0670] Step 7:

[0671] The server cross-checks the reliability of the information and adds source information and relevant links to the document. This helps users gain a deeper understanding of the document's content.

[0672] Step 8:

[0673] The terminal receives the PDF manual sent from the server and displays it to the user. The user can view or save this manual and print it for use as needed.
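
Steps 2 and 3 of this flow (keyword extraction and prioritization of official sources) can be sketched as follows. This is only an illustration: the stop-word list, the `source_type` labels, and the ranking in `SOURCE_PRIORITY` are hypothetical, and an actual implementation could use any keyword extraction or reliability scoring method.

```python
import re

# Hypothetical stop-word list and source ranking, for illustration only.
STOP_WORDS = {"i", "want", "to", "know", "for", "a", "the", "of", "in", "how"}
SOURCE_PRIORITY = {"official": 0, "specialized_db": 1, "other": 2}


def extract_keywords(request):
    """Lower-case the request and drop stop words, preserving word order."""
    words = re.findall(r"[a-z0-9]+", request.lower())
    return [w for w in words if w not in STOP_WORDS]


def prioritize(results, keep=2):
    """Sort results so official sources come first, then keep the top few."""
    ranked = sorted(results, key=lambda r: SOURCE_PRIORITY.get(r["source_type"], 2))
    return ranked[:keep]
```

Because Python's `sorted` is stable, results of the same source type keep the order in which they were collected.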

[0674] (Example 1)

[0675] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0676] When users acquire specific information, challenges include organizing the collected information, verifying its reliability, and providing it in a user-friendly format. If this process is done manually, it is time-consuming and labor-intensive, and concerns remain regarding the accuracy and reliability of the collected information. Therefore, there is a need for a system that organizes and provides information to users in an efficient and reliable manner.

[0677] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0678] In this invention, the server includes means for receiving user requests, collecting information based on the received requests, and organizing it by category; means for generating documents using natural language generation technology based on the organized information; and means for formatting the generated documents and providing them to the user. As a result, users can quickly and efficiently obtain the information they need and use it with confidence regarding its reliability.

[0679] "Means for receiving user requests" refers to the function that allows users to input requests for information acquisition into the system via their terminals and to receive those inputs.

[0680] "Means of collecting information" refers to the process of obtaining relevant data from the internet or specific databases based on a received request.

[0681] "Methods for organizing by category" refer to methods for classifying collected information into different categories according to the requirements, thereby enabling quick access to the desired information.

[0682] "Methods for generating documents using natural language generation technology" refers to methods that utilize AI technology to generate text based on information organized by category, thereby creating documents in a format that is easy for humans to read.

[0683] "Means of formatting and providing documents to users" refers to the process of formatting generated documents into an appropriate format, such as PDF, and providing them in a form that users can access.

[0684] "Means of verifying the reliability of information by comparing it with other reliable sources" refers to methods of verifying the accuracy of collected information by cross-referencing it with other public sources or reliable data.

[0685] "Means of adding information about the source" refers to the process of clearly indicating the source of information in a generated document so that users can verify where the information came from and its background.

[0686] This system is designed to enable users to quickly and efficiently obtain specific information and provide it as a reliable document. The following describes specific embodiments of the present invention.

[0687] The user first uses their device to request the information they need. This request can be entered via keyboard or voice input and may include specific details such as "a tourist guide for Shonan." Once the user sends the request, the device transfers this data to the server.

[0688] The server collects information based on this request. The hardware used is a server with a high-speed CPU, and the software utilizes web crawler technology and APIs. The server accesses the internet and specific databases to retrieve relevant information and organizes this information into categories.

[0689] Next, the server generates a document using a generative AI model. This model integrates natural language processing technology, allowing it to generate a human-readable document based on a prompt such as, for example, "Please explain the procedure for applying for a passport." This document is then visually organized and presented clearly, utilizing chapters and bullet points.

[0690] The generated document is formatted using PDF creation software, resulting in a visually easy-to-understand format. User-friendliness is prioritized, with important information highlighted. Finally, the server sends this PDF to the user via their terminal, allowing them to view or download the information.

[0691] In this process, the server verifies the reliability of the collected information against other reliable data sources and adds source information and reference links to the document. This allows users to use the information with confidence and saves them the trouble of conducting additional research themselves. Overall, this system provides an efficient and reliable method of information acquisition and delivery.

[0692] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0693] Step 1:

[0694] The user enters an information request using a terminal. Using an input device (keyboard or voice input), they might request, for example, "Shonan tourist guide." The entered data is sent to the server in a structured format. Specifically, the user enters the request and presses the submit button, at which point the data is processed.

[0695] Step 2:

[0696] The server collects information based on information requests received from users. Using web crawler technology and APIs, it retrieves relevant information from the internet and specific databases. During this process, database searches are performed using keywords related to the requested topic, and the collected data is returned. Specifically, the server parses the request and accesses external sources to retrieve relevant information.

[0697] Step 3:

[0698] The server organizes the collected information by category. Here, a text analysis algorithm is used to divide the information into different categories based on its content. The input to this process is the collected raw data, and the output is data organized by category. Specifically, the server applies the algorithm to classify the information and stores it in a structured format.

[0699] Step 4:

[0700] The server generates text using a generative AI model based on organized information. The input is categorized data, and the output is a document in a human-readable format. Specifically, prompt sentences are fed into the AI model to generate the document. In this process, the AI model outputs natural language based on past training data.

[0701] Step 5:

[0702] The server formats the generated document into PDF format. The input here is the initial generated text, and the output is a visually formatted PDF document. Specifically, the server uses PDF creation software to adjust the document's layout. Particularly important sections are highlighted by changing font size and color.

[0703] Step 6:

[0704] The server verifies the reliability of the collected information and adds source information to the document. The inputs are the organized information and the reliable data sources against which it is compared, and the output is the final document, checked for logical consistency and with source information added. Specifically, the server verifies reliability by cross-referencing with multiple reliable sources.

[0705] Step 7:

[0706] Finally, the server sends the formatted PDF to the user via the terminal. The input is the completed PDF document, and the output is the electronic file sent to the user. Specifically, the server transfers the document to the user's terminal via the network connection. The user can then receive, view, or download it.
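
The categorization in Step 3 and the chapter and bullet-point structure of the generated document can be illustrated with a simple rule-based sketch. The cue words in `CATEGORY_CUES`, the category names, and the plain-text rendering are hypothetical; the disclosure only states that a text analysis algorithm and a generative AI model are used, and PDF conversion is outside the scope of this sketch.

```python
from collections import defaultdict

# Hypothetical category rules: a category matches if any cue word appears.
CATEGORY_CUES = {
    "Access": ("station", "train", "bus"),
    "Food": ("restaurant", "cafe", "seafood"),
    "Sightseeing": ("shrine", "beach", "aquarium"),
}


def categorize(snippets):
    """Group text snippets by the first category whose cue word they contain."""
    groups = defaultdict(list)
    for text in snippets:
        lowered = text.lower()
        for category, cues in CATEGORY_CUES.items():
            if any(cue in lowered for cue in cues):
                groups[category].append(text)
                break
        else:
            # No cue matched, so the snippet goes to a catch-all category.
            groups["Other"].append(text)
    return dict(groups)


def render_document(title, groups):
    """Render one numbered chapter per category, with bullet points."""
    lines = [title, ""]
    for number, (category, texts) in enumerate(sorted(groups.items()), start=1):
        lines.append(f"{number}. {category}")
        lines.extend(f"  - {t}" for t in texts)
        lines.append("")
    return "\n".join(lines).rstrip()
```

In this sketch the categorized data plays the role of the Step 4 input; a generative AI model would replace `render_document` when natural-language prose is required.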

[0707] (Application Example 1)

[0708] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0709] In modern society, users have a growing need to quickly and accurately access the information they need from a vast amount of data. However, much of this information is scattered, and some of it is of questionable reliability, making it extremely difficult for users to organize and quickly access it. Furthermore, providing this information through audio and visual means requires additional technological capabilities.

[0710] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0711] In this invention, the server includes means for receiving user requests, means for collecting information based on the received requests, means for organizing the collected information and generating documents, means for formatting the generated documents and providing them to the user, means for presenting information for audio and visual guidance, and means for verifying information by comparing it with reliable information sources. This enables users to quickly and accurately obtain reliable information, and the audio and visual guidance supports smooth introductions and purchase decisions.

[0712] "Means for receiving user requests" refers to devices or systems that capture requests via voice input, touch interfaces, etc., in order to accurately obtain the information and support that users are seeking.

[0713] "Means of collecting information" refers to algorithms and programs that search for and retrieve relevant information from the internet or specific databases based on user requests.

[0714] "Means of generating documents" refer to the processes and tools used to construct documents in a user-friendly format, based on organized information and utilizing natural language generation technology.

[0715] "Means of formatting and providing documents to users" refers to a system for visually arranging generated documents in PDF or other formats in an easily viewable manner and distributing them in a format that users can view.

[0716] "Means of presenting information for audio and visual guidance" refers to technologies that use displays and speakers to display or play information in an easy-to-understand format, making it easier for users to access the information.

[0717] "Means of verifying information by cross-referencing with reliable sources" refers to systems and methods that compare and verify collected information with other certified sources in order to ensure its accuracy and reliability.

[0718] To implement this application, the system first receives a user request via a terminal. The user inputs the request using voice or text, for example, "Tell me about a good restaurant nearby." This request is sent to the server, which then collects relevant information from the internet or databases based on the request.

[0719] The server organizes the collected information by category and generates documents using a generative AI model. These documents are structured with chapters and bullet points to facilitate user understanding. For audio and visual guidance, the information is displayed on the terminal's screen and played back through the speaker using speech synthesis software.

[0720] Furthermore, the server verifies the reliability of the information by cross-referencing it with other reliable sources. This process allows users to use the information with confidence. For example, if a user who has moved to a new area asks, "What are some recommended tourist spots nearby?", the server will gather information on nearby tourist attractions and provide a visual map and audio guide.

[0721] An example of a prompt message is: "We are developing a system that can provide surrounding information based on the user's voice requests. Please generate reliable guidance and visual instructions."

[0722] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0723] Step 1:

[0724] The user enters a request using a device. In the case of voice input, an audio signal is sent to the device via the microphone. The device's speech recognition software (e.g., Google Speech-to-Text) converts the audio into text and sends it to the server as a request. The input is in the form of audio or text, and the output is text request data.

[0725] Step 2:

[0726] The server parses the received text request and begins data processing to gather information relevant to the request. Here, a natural language processing algorithm is used to analyze the request text and extract relevant keywords. The input is the user's request text, and the output is a list of keywords.

[0727] Step 3:

[0728] The server uses extracted keywords to collect information from the internet and specific databases. This process utilizes web crawling techniques and API calls. The collected information is categorized based on the request. The input is a list of keywords, and the output is organized information data categorized by type.

[0729] Step 4:

[0730] The server generates a document using a generative AI model (e.g., OpenAI GPT) based on the organized information. This document is structured with chapters and bullet points to make it easy for the user to understand. The input is organized information data, and the output is text data in a document format.

[0731] Step 5:

[0732] The generated document is formatted in PDF format and laid out to highlight important information. Furthermore, the server uses text-to-speech software (e.g., Amazon Polly) to generate voice guidance. The input is the generated document, and the output is a PDF file and an audio file.

[0733] Step 6:

[0734] Next, the server verifies the document's reliability against third-party sources. Source information and reference links are added to the document. Only highly reliable information is provided to the user. The input is the generated document, and the output is the document with source information added.

[0735] Step 7:

[0736] The terminal provides the user with the final document and audio. Visual information is displayed on the screen, and audio guidance is played from the speaker. The information is downloadable, and the user can review it and take action. The input is reliable documents and audio, and the output is the provision of information to the end user.
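
The input and output of each step above can be summarized as a small data-flow sketch. The function `handle_request`, the `Delivery` class, and the three injected callables are hypothetical names introduced for illustration; in practice the callables would wrap a speech recognition service, the collection and generation steps, and a speech synthesis service, none of whose APIs are shown here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Delivery:
    document: str   # text shown on the terminal's display
    audio: bytes    # synthesized voice guidance played on the speaker


def handle_request(
    text: Optional[str],
    audio: Optional[bytes],
    transcribe: Callable[[bytes], str],
    build_document: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Delivery:
    """Normalize the input to text, then produce the document and its audio."""
    if text is None:
        if audio is None:
            raise ValueError("either text or audio input is required")
        text = transcribe(audio)         # Step 1: voice input becomes text
    document = build_document(text)      # Steps 2-4 and 6, collapsed into one callable
    # Steps 5 and 7: pair the document with voice guidance for the terminal.
    return Delivery(document=document, audio=synthesize(document))
```

Passing the stages in as callables keeps the flow testable with simple stand-ins, for example `transcribe=lambda a: "good restaurant nearby"`.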

[0737] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0738] This invention relates to a system that recognizes the user's emotions and adjusts the format and content of information provided based on that information. This system includes an emotion engine and enables more personalized information provision in response to the user's requests. Specific embodiments are described below.

[0739] User emotion recognition

[0740] Users enter requests via voice or text through their device. This interface uses an emotion engine to analyze the user's emotions in real time based on their words, actions, and input. For example, if a user indicates an urgent situation such as "I urgently need to know how to obtain a passport," the emotion engine recognizes the stress and urgency.

[0741] Information gathering and coordination

[0742] The server collects information based on the results of the emotion engine's analysis in response to requests received from users. For example, for users in a hurry, it prioritizes collecting and selecting concise information that can be read quickly. In this way, it flexibly adjusts the information according to the user's emotions.

[0743] Document generation

[0744] The server customizes the document generation process based on the collected information to reflect the user's emotional state. If the emotion engine determines that the user's emotions are stable, it provides detailed information; if their emotions are unstable, it presents the information simply and intuitively.

[0745] Document provision

[0746] Users are provided with formatted documents via their devices. These documents are optimized based on an emotion engine and presented in the most appropriate format according to the user's emotional state. Users can then utilize this information in a timely and stress-free manner.

[0747] Specific example

[0748] For example, if a user is urgently seeking useful information while planning an overseas trip, the emotion engine detects their level of urgency, and the server quickly provides the most relevant and concise information. Documents are written in an emotion-appropriate tone and include links and checklists to help users find the information they need in a short amount of time.

[0749] By combining an emotion engine, the system of this invention goes beyond simply providing information; it understands the user's emotions and provides information tailored to their needs, thereby improving the user experience.

[0750] The following describes the processing flow.

[0751] Step 1:

[0752] Users enter requests via their devices and send them in voice or text format. This information is freely expressed according to the user's intent and purpose.

[0753] Step 2:

[0754] The device is equipped with an emotion engine that analyzes input voice and text to recognize the user's emotions. For example, it can identify emotions such as "hurried" or "excited" from the tone of voice and the content of the text.

[0755] Step 3:

[0756] The server receives user requests and sentiment data from the terminal and begins appropriate processing. Based on the received information, it extracts keywords and begins collecting related information.

[0757] Step 4:

[0758] The server combines the collected information with the results of the emotion engine and filters and sorts the information according to the user's emotional state. For example, if anxiety is detected, it prioritizes simple, actionable steps that can be performed immediately.

[0759] Step 5:

[0760] The server generates documents using organized information. In doing so, it adjusts the tone and style of the text according to the emotions expressed. For calm situations, it includes details; for anxious situations, it uses concise and reassuring language.

[0761] Step 6:

[0762] Once a document is generated, the server converts it to PDF format and arranges the layout to reflect emotionally relevant elements. This ensures that the information is visually harmonious.

[0763] Step 7:

[0764] The device receives the generated PDF document and provides it to the user. The user can efficiently and comfortably utilize the information by viewing it in an emotionally optimized format.
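Steps 4 and 5 of the flow above can be illustrated with a minimal Python sketch. It is not the implementation of the disclosed system: the `Item` fields, the emotion labels `"anxious"` and `"calm"`, the `limit_when_anxious` parameter, and the header strings are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    actionable: bool   # hypothetical flag: can the user act on this right away?
    relevance: float   # hypothetical relevance score in [0, 1]


def select_items(items, emotion, limit_when_anxious=3):
    """Step 4 (sketch): filter and sort collected items by emotional state."""
    ranked = sorted(items, key=lambda item: item.relevance, reverse=True)
    if emotion == "anxious":
        # For an anxious user, keep only a few immediately actionable items.
        return [item for item in ranked if item.actionable][:limit_when_anxious]
    # Otherwise keep everything, most relevant first.
    return ranked


def render(items, emotion):
    """Step 5 (sketch): choose a tone for the header and list the items."""
    if emotion == "anxious":
        header = "Here is what you can do right now:"
    else:
        header = "Detailed information:"
    return "\n".join([header] + [f"- {item.text}" for item in items])
```

In this sketch, an anxious user receives at most three actionable items under a reassuring header, while any other emotion label yields the full list ordered by relevance.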

[0765] (Example 2)

[0766] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0767] Conventional information delivery systems do not take the user's emotional state into account when providing information, and therefore do not necessarily improve the user experience. In particular, necessary information is not delivered quickly and appropriately when users are stressed or in a hurry.

[0768] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0769] In this invention, the server includes means for recognizing the emotional state of a user based on a received user request using an emotion analysis system, means for selecting and collecting information based on the emotional state, and means for generating a document using a generative AI model that takes the emotional state into consideration and providing it to the user. This makes it possible to provide information that is tailored to the user's emotional state, thereby improving the user experience.

[0770] An "information processing device" is a device that receives user requests in the form of voice or text and prepares the data necessary for subsequent processing.

[0771] An "emotion analysis system" is a technology or device that analyzes a user's words, actions, and input content to recognize that person's emotional state in real time.

[0772] A "generative AI model" is an artificial intelligence technology that generates text based on collected information and analysis results, and presents it to users in a more user-friendly way.

[0773] "Adding a source of information" means adding that source to a generated document in order to clearly indicate the source of the information on which it is based.

[0774] A description of embodiments for carrying out the present invention will be provided.

[0775] This system has the ability to recognize the user's emotions in real time and flexibly adjust the information provided according to that state. To achieve this, the system mainly uses the following hardware and software components.

[0776] Terminal

[0777] The terminal is a device that receives requests from users via voice or text. Requests entered by the user are converted into text using speech recognition technology and then into a format that is easy to process. The terminal functions as the user interface, playing a role in both information input and result output.

[0778] Server

[0779] The server receives requests sent from terminals and recognizes the user's emotional state through an emotion analysis system. This system utilizes natural language processing technology to analyze input words and their context, identifying emotions such as stress, tension, and relief. Furthermore, based on these analysis results, the server quickly collects relevant information from the internet and databases.

[0780] Generative AI Models

[0781] The collected information is transformed into a user-optimized document format using a generative AI model. The generative AI model extracts the key points of the information while adjusting the document's tone and style to reflect the user's emotional state.

[0782] Document provision

[0783] The generated document is provided to the user via a terminal. The terminal can format the document, such as by adding diagrams or lists, to make it easier to understand intuitively.

[0784] Specific example

[0785] For example, if a user is planning an overseas trip and feels an urgent need to know how to obtain a passport, the system analyzes that feeling, gathers relevant information, and generates a concise and easy-to-understand document. The document includes links and checklists, formatted to allow users to quickly obtain the necessary information.

[0786] Example of a prompt

[0787] "Please generate text that provides stress-reducing information for users who are in a hurry when planning their trip."

[0788] In this way, by combining emotion analysis and generative AI models, this system can not only provide information but also deliver information that is optimally tailored to the user's emotions.
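As an illustration of how a prompt like the one above might be assembled from the analysis result, the following Python sketch builds a prompt string from the request, the detected emotion, and the collected information. The `TONE_HINTS` table, the emotion labels, and the prompt layout are hypothetical and are not specified in this disclosure.

```python
# Hypothetical mapping from a detected emotion label to a style instruction.
TONE_HINTS = {
    "hurried": "Keep the text short, reassuring, and organized as a checklist.",
    "calm": "Include background details and explanations.",
}


def build_prompt(request, emotion, collected_info):
    """Combine the request, emotion label, and collected information into one prompt."""
    hint = TONE_HINTS.get(emotion, "Use a neutral, clear tone.")
    info_lines = "\n".join(f"- {entry}" for entry in collected_info)
    return (
        f"User request: {request}\n"
        f"Detected emotional state: {emotion}\n"
        f"Instruction: {hint}\n"
        f"Collected information:\n{info_lines}"
    )
```

The resulting string would then be passed to whichever generative AI model the server uses; unknown emotion labels fall back to a neutral instruction.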

[0789] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0790] Step 1:

[0791] The terminal receives voice or text input from the user. This input concerns the information the user is requesting. In the case of voice input, the terminal uses speech recognition technology to convert it into text and prepares it for transmission to the server. At this point, the output is text data in a format that the server can parse.

[0792] Step 2:

[0793] The server receives text data sent from the terminal and analyzes the user's emotions using an emotion analysis system. The input is text data recording the user's requests, and natural language processing techniques are used to analyze the words and context to identify emotions such as stress, tension, and relief. The output is the analysis result indicating the user's emotional state.

[0794] Step 3:

[0795] The server selects and collects information appropriate to the user based on the results of the emotion analysis. Specifically, it searches the internet and databases and prioritizes information that matches the user's situation. The input for this step is the emotion analysis result and the user's request, and the output is a selected and organized collection of information.

[0796] Step 4:

[0797] The server uses a generative AI model based on collected information to generate documents tailored to the user. The input consists of organized information and analyzed emotional states. The generative AI model extracts key points from the information while considering the tone and style appropriate to the user's emotional state to generate the text. The output is a customized document.

[0798] Step 5:

[0799] The terminal receives the document generated by the server and prepares it for the user. In this step, the obtained document is formatted to make it easy for the user to understand. For example, diagrams and list formats are used to aid visual understanding. The input is the generated document, and the output is the formatted document displayed on the terminal.
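The input and output of Steps 2 to 5 above can be sketched as a chain of functions. This is only an illustrative outline under stated assumptions: `run_pipeline` and the `toy_*` stand-ins are hypothetical names, the keyword check is a placeholder for the emotion analysis system, and the "fact" strings are placeholder data rather than real collected information.

```python
def run_pipeline(request_text, analyze_emotion, collect, generate, format_doc):
    """Chain Steps 2 to 5; Step 1 (speech-to-text on the terminal) is assumed done."""
    emotion = analyze_emotion(request_text)   # Step 2: text -> emotional state
    info = collect(request_text, emotion)     # Step 3: request + emotion -> information
    document = generate(info, emotion)        # Step 4: information + emotion -> document
    return format_doc(document)               # Step 5: document -> formatted document


# Toy stand-ins (hypothetical) so the sketch runs without external services.
def toy_analyze(text):
    urgent_words = ("urgent", "hurry", "asap")
    return "stress" if any(word in text.lower() for word in urgent_words) else "relief"


def toy_collect(text, emotion):
    facts = ["fact A", "fact B", "fact C"]  # placeholder data
    return facts[:2] if emotion == "stress" else facts


def toy_generate(info, emotion):
    opening = "Quick steps:" if emotion == "stress" else "Here are the details:"
    return [opening] + info


def toy_format(document):
    title, *points = document
    return "\n".join([title] + [f"{n}. {p}" for n, p in enumerate(points, start=1)])
```

Calling `run_pipeline("I urgently need a passport", toy_analyze, toy_collect, toy_generate, toy_format)` returns a short numbered list under the heading "Quick steps:". Passing the stages in as functions keeps each step replaceable, for example by a real emotion analysis service or generative AI model.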

[0800] (Application Example 2)

[0801] Next, we will describe Application Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0802] In today's information society, there is a demand for information tailored to the individual emotions of each user. However, existing information systems generally only provide standardized information, making it difficult to offer services that are adapted to the user's psychological state. In particular, when providing information and support within the family, there is a need for a means to accurately recognize the emotions and psychological states of each family member and provide optimized information.

[0803] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0804] In this invention, the server includes emotion recognition means for analyzing the user's emotional state, means for collecting information based on received requests, and means for organizing the collected information and generating documents. This makes it possible to tailor and provide information based on the user's emotions.

[0805] "Users" refers to the individual people who receive the information provided by the service.

[0806] A "request" refers to the information requests or questions that users input into the system.

[0807] "Means of collecting information" refers to a system that finds and compiles relevant data in response to user requests.

[0808] "Means of generating documents" refers to the process of organizing collected information based on user requests and outputting it in a formalized form.

[0809] "Means of providing to users" refers to methods and equipment for presenting generated documents in a format that users can view.

[0810] "Emotion recognition means" refers to technology that analyzes and identifies emotions from a user's facial expressions and voice data.

[0811] "Means of adjusting and providing information content" refers to methods for optimizing and presenting the format and content of information according to the user's emotions.

[0812] This invention provides an emotion recognition and information provision system implemented in a home robot. This system senses the user's emotional state and provides information corresponding to that state. By utilizing a variety of devices, the system achieves more intuitive and personalized information delivery.

[0813] The server uses speech recognition software and video analysis technology as means of emotion recognition. Specifically, it utilizes Google Cloud's natural language processing service to analyze emotions from facial expressions and tone of voice. This makes it possible to understand the user's emotions in real time. Furthermore, audio and video data are collected via a home robot equipped with a camera and microphone. This home robot is equipped with an Intel RealSense camera and a microphone array for remote speech recognition, and it acquires emotion data.

[0814] The device uses a generative AI model based on acquired emotional data to collect and generate appropriate information. This process utilizes machine learning platforms such as TensorFlow. When the user is experiencing stress, it can provide concise and timely information; when their emotions are stable, it can present detailed information in an easy-to-understand format. When providing information, it can also provide guidance via voice using speech synthesis technology such as Amazon Polly.

[0815] For example, if a user asks, "What should I cook today?", the robot will suggest a suitable menu based on the user's mood. Alternatively, by using a prompt such as, "A 13-year-old child is feeling stressed while studying math. Please provide suggestions for improving this situation," the robot can offer advice tailored to the home environment. This ensures reliable and valuable information is provided to the user.

[0816] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0817] Step 1:

[0818] Users input requests via voice or text through a home robot. If the input is voice, the robot's microphone captures it and sends it to the server as voice data. If the input is text, it is sent directly to the server as text data.

[0819] Step 2:

[0820] The server analyzes the received audio data using Google Cloud's natural language processing service to identify the user's emotions. The audio data is broken down into words and phrases, and features such as tone and speed of voice are analyzed. The results of this analysis are output as the emotional state.

[0821] Step 3:

[0822] The server then uses a generative AI model to collect information based on user requests. Considering the user's emotional state, it selects short, concise information for stressed users and detailed, comprehensive information for users who are more relaxed. The collected information is then passed on to the next step.

[0823] Step 4:

[0824] The server organizes the collected information and generates documents. It utilizes machine learning platforms such as TensorFlow to construct information layouts tailored to emotional states. The generated documents are arranged and formatted in a way that is easily understandable to the user.

[0825] Step 5:

[0826] The home robot, acting as a terminal, provides users with generated documents. The information is presented visually via a display or aurally using Amazon Polly's speech synthesis technology. Based on this information, users can make informed decisions and take appropriate actions.

[0827] Step 6:

[0828] The user reviews the suggested information. If they find it particularly helpful in reducing stress or solving problems, they can ask additional questions. These additional questions return them to step 1, starting a new information gathering cycle.
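The cycle described in Step 6, in which each follow-up question re-enters the flow at Step 1, can be sketched as a simple loop. The names `run_session` and `handle_request` are hypothetical; `handle_request` stands in for Steps 1 to 5 as a whole.

```python
def run_session(questions, handle_request):
    """Run one information-gathering cycle per question and keep a transcript."""
    transcript = []
    for question in questions:
        # Each follow-up question re-enters the flow at Step 1. The transcript
        # of earlier exchanges is passed along so the handler can use it.
        document = handle_request(question, transcript)
        transcript.append((question, document))
    return transcript
```

Passing the earlier exchanges to the handler is a design choice of this sketch, not something the disclosure specifies; it would let later answers take previous questions into account.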

[0829] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0830] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives, as input, prompts containing instructions and inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0831] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0832] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0833] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0834] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0835] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further an emotion lies toward the outside of the Emotion Map 400, the more visible it becomes (that is, the more it is expressed in actions).

[0836] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "reaction," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0837] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
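The layout of the emotion map described above (reaction on the left, situation on the right, pleasure at the top, displeasure at the bottom, and more primitive emotions nearer the center) can be expressed as a small Python sketch. The coordinate convention (x increasing to the right, y increasing upward, center at the origin) and the radius threshold of 0.5 are assumptions made for illustration; the disclosure does not give numeric coordinates.

```python
import math


def describe_position(x, y, inner_radius=0.5):
    """Classify a point on a hypothetical 2D emotion map centered at (0, 0)."""
    radius = math.hypot(x, y)
    # Left half: reaction side. Right half: situation side.
    # On the vertical axis, both kinds of emotion are located.
    if x > 0:
        side = "situation"
    elif x < 0:
        side = "reaction"
    else:
        side = "both"
    # Upper half: pleasure. Lower half: displeasure.
    if y > 0:
        valence = "pleasure"
    elif y < 0:
        valence = "displeasure"
    else:
        valence = "neutral"
    # Near the center: primitive emotions. Further out: expressed in actions.
    layer = "inner (primitive)" if radius < inner_radius else "outer (expressed in actions)"
    return {"radius": radius, "side": side, "valence": valence, "layer": layer}
```

For instance, a point at the 3 o'clock position such as `(0.8, 0.0)` is classified as lying on the situation side in the outer layer.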

[0838] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
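The disclosure does not state how the network is trained so that neighboring emotions receive similar values. One possible approach, shown here purely as an assumption, is to build soft training targets that decay with distance on the map. The positions in `POSITIONS`, the Gaussian weighting, and the `sigma` parameter are all hypothetical.

```python
import math

# Hypothetical 2D positions of a few emotions on the map (not from the disclosure).
POSITIONS = {
    "reassured": (0.60, 0.10),
    "calm": (0.62, 0.05),
    "confident": (0.65, 0.15),
    "anxious": (0.60, -0.50),
}


def soft_targets(label, positions=POSITIONS, sigma=0.2):
    """Return target emotion values that are higher for emotions near `label`."""
    cx, cy = positions[label]
    weights = {
        name: math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
        for name, (x, y) in positions.items()
    }
    total = sum(weights.values())
    # Normalize so the target values sum to 1.
    return {name: weight / total for name, weight in weights.items()}
```

With these assumed positions, the targets for "calm" assign comparable values to "reassured" and "confident" and a much smaller value to "anxious", which mirrors the behavior described for Figure 10.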

[0839] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0840] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0841] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0842] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0843] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0844] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0845] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0846] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0847] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0848] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0849] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0850] The following is further disclosed regarding the embodiments described above.

[0851] (Claim 1)

[0852] Means for receiving user requests,

[0853] Means for collecting information based on the received requests,

[0854] Means for organizing the collected information and generating documents,

[0855] Means for formatting the generated documents and providing them to the user,

[0856] A system comprising the above means.

[0857] (Claim 2)

[0858] The system according to claim 1, further comprising means for verifying the reliability of information related to a request.

[0859] (Claim 3)

[0860] The system according to claim 1, further comprising means for adding source information to the generated document.

[0861] "Example 1"

[0862] (Claim 1)

[0863] Means for receiving user requests,

[0864] Means for collecting information based on the received requests and organizing it by category,

[0865] Means for generating documents from the organized information using natural language generation technology,

[0866] Means for formatting the generated documents and providing them to the user,

[0867] A system comprising the above means.

[0868] (Claim 2)

[0869] The system according to claim 1, further comprising means for verifying the reliability of collected information by comparing it with other reliable sources.

[0870] (Claim 3)

[0871] The system according to claim 1, further comprising means for adding information about the source to the generated document.

[0872] "Application Example 1"

[0873] (Claim 1)

[0874] Means for receiving user requests,

[0875] Means for collecting information based on the received requests,

[0876] Means for organizing the collected information and generating documents,

[0877] Means for formatting the generated documents and providing them to the user,

[0878] Means for presenting information by providing guidance through audio and visual output,

[0879] Means for verifying information by cross-referencing it with reliable sources,

[0880] A system comprising the above means.

[0881] (Claim 2)

[0882] The system according to claim 1, further comprising means for verifying the reliability of information related to a request, and providing voice guidance.

[0883] (Claim 3)

[0884] The system according to claim 1, further comprising means for adding source information to the generated document and for providing a visual display in response to a user's voice request.

[0885] "Example 2, combined with an emotion engine"

[0886] (Claim 1)

[0887] Means for receiving user requests via an information processing device,

[0888] Means for recognizing the user's emotional state from the received request using an emotion analysis system,

[0889] Means for selecting and collecting information based on the emotional state,

[0890] Means for organizing the collected information in light of the emotional state and generating documents using a generative AI model,

[0891] Means for formatting the generated documents according to the user's emotional state and providing them to the user,

[0892] A system comprising the above means.

[0893] (Claim 2)

[0894] The system according to claim 1, further comprising means for verifying the reliability of selected information.

[0895] (Claim 3)

[0896] The system according to claim 1, further comprising means for adding information sources to the generated documents.

[0897] "Application Example 2, combined with an emotion engine"

[0898] (Claim 1)

[0899] Means for receiving user requests,

[0900] Means for collecting information based on the received requests,

[0901] Means for organizing the collected information and generating documents,

[0902] Means for formatting the generated documents and providing them to the user,

[0903] Emotion recognition means for analyzing the user's emotional state,

[0904] Means for adjusting and providing information content based on the user's emotions,

[0905] A system comprising the above means.

[0906] (Claim 2)

[0907] The system according to claim 1, further comprising means for verifying the reliability of information related to a request.

[0908] (Claim 3)

[0909] The system according to claim 1, further comprising means for adding source information to the generated document.

[Explanation of Symbols]

[0910]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. A system comprising: means for receiving user requests; means for collecting information based on the received requests; means for organizing the collected information and generating documents; means for formatting the generated documents and providing them to the user; means for presenting information by providing guidance through audio and visual output; and means for verifying information by cross-referencing it with reliable sources.

2. The system according to claim 1, further comprising means for verifying the reliability of information related to a request, and providing voice guidance.

3. The system according to claim 1, further comprising means for adding source information to the generated document and for providing a visual display in response to a user's voice request.