system
The information processing system addresses the challenge of providing natural and emotionally responsive character interactions by allowing users to select characters and using AI models to generate personalized and emotionally tailored responses, enhancing user experience.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-16
- Publication Date
- 2026-06-26
Abstract
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a character of the chatbot, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance as a response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] In the real world, communicating personally with a specific character or idol is difficult for many people. There is a need for an environment in which users can interact with such a character with a sense of presence. Furthermore, ordinary dialogue systems have difficulty updating information in real time and conducting natural conversations that include character-specific expressions. Thus, there is a need for a reliable dialogue system that provides new experiential value to users.
Means for Solving the Problems
[0005] This invention provides an information processing system that allows a user to select a specific character and engage in a natural dialogue with that character. The user terminal receives the character selection and transmits this information to the server. Subsequently, the server associates the latest information obtained from external information sources with the character and continuously updates it. The server analyzes the user's message, and based on the analysis results, an AI model is used to generate a response appropriate to the character. This response is then transmitted to the user terminal, allowing the user to experience a natural dialogue that includes real-time updates. In this way, the system solves the problem by enabling dialogue with a character that has the sense of presence and individuality that the user desires.
[0006] "Character selection" is the act of a user deciding which specific character they wish to interact with from among several options.
[0007] A "user terminal" is a device used by a user to interact with a character, and includes computing devices such as smartphones and tablets on which the LINE app or similar applications are installed.
[0008] A "server" refers to a computer system that manages a series of processes, including receiving and analyzing messages from users and generating character-based responses.
[0009] "External information sources" refer to external data providers that systems access to obtain the latest information, such as news sites, data feeds, and APIs on the internet.
[0010] An "AI model" refers to a group of machine learning and deep learning algorithms that use natural language processing technology to generate appropriate responses to user messages.
[0011] "Analysis" is the process of understanding and structuring the content and intent of messages received from users using natural language processing technology and other methods.
[0012] "Response generation" refers to the process of generating natural conversations that are tailored to the character's characteristics, using analysis results and related information.
[0013] An "information processing system" refers to a collection of technological elements that provide a technical foundation for enabling interaction between users and characters, through the coordinated operation of users, terminals, and servers.
Brief Description of the Drawings
[0014]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] This is a conceptual diagram showing an example of the essential functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] This shows an emotion map where multiple emotions are mapped.
[Figure 10] This shows an emotion map where multiple emotions are mapped.
[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Example 2 when combined with an emotion engine.
[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when combined with an emotion engine.
Mode for Carrying Out the Invention
[0015] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0016] First, the terms used in the following description will be explained.
[0017] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.
[0018] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.
[0019] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs and various parameters. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0020] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0021] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."
[0022] [First Embodiment]
[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0031] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with the specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0035] This invention provides users with a new value in their conversational experience by constructing an information processing system that enables users to interact with specific characters in real time. The system mainly consists of a user terminal, a server, and an external information source.
[0036] The devices used by users are digital devices such as smartphones and tablets, and they connect to the server using the LINE app. Through the LINE interface, users can select the character they want to interact with, and the selected character information is sent to the server via the device.
[0037] The server associates the received character selection data with the user profile and stores it in the system's database. It then accesses external sources to collect the latest data, such as news, technology information, and trends. The server analyzes the collected data using its proprietary natural language processing technology and adjusts the information to match the personality and tone of the chosen character.
[0038] When a user sends a message to a character via the LINE app, the device relays the message to a server. The server analyzes the message, identifying its content and purpose. Based on the analysis, an AI model generates the optimal response, adjusting it to incorporate language and tone appropriate for the character. This generated response is then sent from the server to the device and displayed within the LINE app.
[0039] For example, if a user asks, "What's a good movie out there lately?", the server can refer to the latest movie information from external sources and respond in the character's tone, "I recommend XYZ Movies as a recent popular title." In this way, users can enjoy natural conversations with the character they have chosen.
[0040] In this embodiment, users can enjoy a rich conversational experience by engaging in natural conversations with characters and being provided with constantly updated information.
[0041] The following describes the processing flow.
[0042] Step 1:
[0043] The user accesses the system through the LINE app and selects a specific character. The device receives the user's selection and sends that information to the server.
[0044] Step 2:
[0045] The server stores the received character selection information in the database along with the user's profile information. This establishes an association between the user and the character.
[0046] Step 3:
[0047] The server regularly accesses external information sources to retrieve the latest news, technology, fashion, and other information. The server analyzes the collected data and updates the information associated with each character.
[0048] Step 4:
[0049] The user sends a message to the character using the LINE app. The device then forwards this message to the server.
[0050] Step 5:
[0051] The server analyzes messages sent by users using natural language processing tools to identify the intent and content of the messages.
[0052] Step 6:
[0053] Based on the analyzed message content, the server uses an AI model to generate a response tailored to the character. The generated response is adjusted to match the character's tone and style.
[0054] Step 7:
[0055] The server sends the generated response to the device. The device displays the response on the LINE app, and the user checks the message from the character.
[0056] Step 8:
[0057] The user receives a response from the character and can continue the conversation. The system repeats this process, maintaining a continuous conversation with the user.
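As a minimal illustrative sketch of Steps 1 to 8 above, the following Python code models the server as an in-memory object. All names (Character, Server, sign_off, and the keyword-based analyze method) are hypothetical and are not part of the disclosure. In particular, the keyword lookup merely stands in for the natural language processing and the AI model, and no messaging application or external information source is actually contacted.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    name: str
    sign_off: str  # hypothetical stand-in for the character's tone and style
    latest_info: dict[str, str] = field(default_factory=dict)  # topic -> fact


class Server:
    def __init__(self, characters: list[Character]) -> None:
        self.characters = {c.name: c for c in characters}
        self.selections: dict[str, str] = {}  # user id -> character name

    def select_character(self, user_id: str, name: str) -> None:
        # Steps 1-2: store the selection in association with the user.
        if name not in self.characters:
            raise KeyError(f"unknown character: {name}")
        self.selections[user_id] = name

    def update_info(self, name: str, topic: str, fact: str) -> None:
        # Step 3: associate newly collected information with a character.
        self.characters[name].latest_info[topic] = fact

    @staticmethod
    def analyze(message: str) -> str:
        # Step 5: toy intent analysis that returns a topic keyword.
        return "movie" if "movie" in message.lower() else "general"

    def handle_message(self, user_id: str, message: str) -> str:
        # Steps 4-7: analyze the message and build an in-character reply.
        character = self.characters[self.selections[user_id]]
        topic = self.analyze(message)
        fact = character.latest_info.get(topic)
        body = f"I recommend {fact}." if fact else "Tell me more!"
        return f"{body} {character.sign_off}"


if __name__ == "__main__":
    server = Server([Character("Aoi", "Enjoy!")])
    server.select_character("user-1", "Aoi")
    server.update_info("Aoi", "movie", "XYZ Movies")
    # Step 8 corresponds to calling handle_message again for each new message.
    print(server.handle_message("user-1", "What's a good movie out there lately?"))
```

In this sketch the character's individuality is reduced to a fixed closing phrase; an actual system would presumably pass the character's profile to the AI model instead.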
[0058] (Example 1)
[0059] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0060] With the advancement of modern information and communication technology, there is a growing need for systems that allow users to obtain the latest information in real time while engaging in natural conversations with interactive objects. However, existing systems often fail to adequately ensure real-time information and natural conversation, making it difficult to improve the user experience.
[0061] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0062] In this invention, the server includes means for receiving selection information from a communication device that allows the user to select a dialogue object and transmitting it to a management device; means for collecting the latest data from an external information source and updating the information by associating that data with the dialogue object; means for analyzing the communication content received from the user and generating a response appropriate to the dialogue object based on that communication content; and means for adjusting the response using a generative AI model tailored to the characteristics of the dialogue object. As a result, the user can converse with the selected dialogue object in real time and in a natural manner and instantly obtain the latest information.
[0063] A "communication device" is a digital device used by a user to select a dialogue object and communicate, and includes smartphones and tablets.
[0064] A "management device" is a device that collects data from external information sources and generates responses based on selected information and analysis results from communication devices; a server falls into this category.
[0065] A "dialogue object" is a character that the user selects and engages in conversation with.
[0066] "External information sources" refer to information providers from which the management device collects data, such as news, trend information, and technology information.
[0067] A "generative AI model" is an artificial intelligence technology that generates the optimal response based on the communication content received from the user and adjusts the response to suit the dialogue object.
[0068] This invention is an information processing system that enables users to have real-time conversations with dialogue objects using a communication device. The communication device is a digital device such as a smartphone or tablet, and connects to a management device using an interactive application. A server is used as the management device and performs major processing such as information collection, analysis, and generation.
[0069] Users can use a communication device to select dialogue objects through a dedicated interface (e.g., a messaging application). The selected information is then sent from the communication device to the server.
[0070] The server accesses external information sources and collects up-to-date data, including news, trends, and technical information, using APIs and web crawling technologies. This data is stored in a database on the server and analyzed using natural language processing techniques. The analyzed information is then tailored based on the characteristics of the corresponding dialogue object.
[0071] When a user sends a message through a communication device, the communication device relays the message to the server. The server performs natural language analysis on the message, interprets its intent, and then uses a generative AI model to generate the optimal response. This response is adjusted to match the tone and style of the dialogue object.
[0072] For example, if a user asks, "What are some recent music events worth checking out?", the server can retrieve the latest information on music events from external sources and generate a response that takes into account the language used by the dialogue object. This would provide the user with a natural conversational experience, such as, "For the latest event information, we recommend the Y Music Festival held at location X."
[0073] For example, a prompt message for an AI model might be something like, "When a user asks about recent music events, generate a response in the tone of a dialogue object based on external information." This prompt forms the basis for the AI model to generate an appropriate response that is relevant to the user's question.
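As one possible illustration of how such a prompt could be assembled, the following Python sketch joins the dialogue object's name and tone, the collected external information, and the user's message into a single string. The function name build_prompt, its parameters, and the prompt layout are assumptions made for illustration only; the disclosure does not specify a prompt format, and no AI model is called here.

```python
def build_prompt(character: str, tone: str, user_message: str,
                 external_items: list[str]) -> str:
    """Assemble one prompt string for a generative AI model (hypothetical format)."""
    lines = [
        f"You are the dialogue object '{character}'.",
        f"Respond in this tone: {tone}.",
        "Base the response on the external information below.",
        "External information:",
    ]
    # One bullet per collected item; state explicitly when nothing was collected.
    lines += [f"- {item}" for item in external_items] or ["- (none available)"]
    lines.append(f"User message: {user_message}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        "Mika",
        "cheerful and casual",
        "What are some recent music events worth checking out?",
        ["Y Music Festival held at location X"],
    ))
```

Keeping the external information in a separate, clearly labeled part of the prompt is one way to let the model distinguish retrieved facts from the user's own words.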
[0074] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0075] Step 1:
[0076] The user selects a dialogue object using a communication device. The communication device sends this selection information to the server as a data packet. The input is information about the dialogue object selected by the user, and the output is the data sent to the server. In this step, information is collected from the user interface.
[0077] Step 2:
[0078] The server associates the received selection information with the user's profile and stores it in the database. The input is the selection data from the communication device, and the output is the operation to save it to the database. Here, data organization is performed using the profile management function.
[0079] Step 3:
[0080] The server accesses external information sources and collects the necessary data. This includes news, trends, and technology information. The input is the requests sent to external APIs, and the output is the latest retrieved data. Data ingestion is achieved using API calls and web crawling technologies.
[0081] Step 4:
[0082] The server analyzes the collected data using natural language processing techniques and adjusts it to match the characteristics of the dialogue object. The input is data obtained from an external source, and the output is the adjusted data. This step involves analysis using an NLP algorithm.
[0083] Step 5:
[0084] A user sends a message using a communication device. The communication device relays this message to a server. The input is the user's message, and the output is the message transfer to the server. Here, user interaction is relayed.
[0085] Step 6:
[0086] The server analyzes the received message and understands its intent. The input is the user message, and the output is the analysis result. In this step, intent analysis is performed using machine learning techniques.
[0087] Step 7:
[0088] The server generates responses using a generative AI model based on the analysis results, and adjusts them to match the characteristics of the dialogue object. The input consists of the analyzed user message and necessary external data, and the output is the adjusted response. Instructions are given to the AI model using prompts to generate natural-sounding responses.
[0089] Step 8:
[0090] The server sends the generated response to the communication device, which then displays the response to the user. The input is the response data from the server, and the output is the display on the user interface. This step involves data transfer and display.
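Steps 3 and 4 of the flow above (collecting external data and adjusting it to the dialogue object) can be sketched as follows. This is only a rough illustration under stated assumptions: each external information source is represented by a hypothetical Python callable rather than a real API call or web crawler, and the "characteristics of the dialogue object" are reduced to a list of interest keywords, which the disclosure does not prescribe.

```python
from typing import Callable

# Hypothetical stand-in for an API call or crawl that returns text items.
Source = Callable[[], list[str]]


def collect(sources: list[Source]) -> list[str]:
    """Step 3: gather items from every source, skipping sources that fail."""
    items: list[str] = []
    for fetch in sources:
        try:
            items.extend(fetch())
        except Exception:
            # In this sketch a failing source is ignored so that the
            # remaining sources can still be used.
            continue
    # Remove duplicates while preserving the order of first appearance.
    return list(dict.fromkeys(items))


def tailor(items: list[str], interests: list[str]) -> list[str]:
    """Step 4: keep only items that mention one of the interest keywords."""
    keywords = [word.lower() for word in interests]
    return [item for item in items
            if any(word in item.lower() for word in keywords)]


if __name__ == "__main__":
    def news() -> list[str]:
        return ["Y Music Festival announced", "New phone released"]

    def trends() -> list[str]:
        return ["New phone released", "Jazz night at X"]

    collected = collect([news, trends])
    print(tailor(collected, ["music", "jazz"]))
```

Keyword filtering is used here only because it is easy to verify; the analysis by an NLP algorithm described in Step 4 would replace the tailor function in practice.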
[0091] (Application Example 1)
[0092] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0093] In today's home environment, there is a need for information processing technology that allows users to naturally acquire the information they desire in their daily lives, and to obtain it through friendly interactions with specific characters. Furthermore, this requires that home robots facilitate smooth interaction with users, thereby improving convenience and enrichment, and enabling effective communication.
[0094] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0095] In this invention, the server includes means for receiving character selection from a user device and transmitting it to an information processing device; means for acquiring the latest information from an external information source, adjusting the information based on the character's personality, and associating it with the character; means for analyzing a message received from the user, generating a response appropriate to the character based on its content, and outputting the generated response as voice; and means for providing the response through a home robot to support the user's daily life. This makes it possible for the user to acquire everyday information while engaging in natural conversation with a specific character.
[0096] An "information processing system" is a technological foundation that enables users to interact with specific characters, and it performs tasks such as message analysis and response generation.
[0097] A "user device" is a digital device used by a user to interact with a character, and includes smartphones and tablets.
[0098] An "information processing device" is a computer system that receives and processes information from a user device, and includes servers.
[0099] "External information sources" refer to databases and online services that provide the latest data on news, technology, trends, and more.
[0100] A "character" is a virtual person or animal with its own personality that the user interacts with, and is used as the object of that interaction.
[0101] "Analysis means" refers to the technical process performed to process messages received from users and understand their intent and content.
[0102] "Response generation" is the process of generating a response appropriate to the character based on the analyzed message.
[0103] A "home robot" is a robotic device placed in a home environment to interact with and assist users.
[0104] "Adjusting information based on individuality" is the process of transforming and adapting acquired information to match the speech patterns and behaviors of a specific character.
[0105] To realize this invention, the user terminal, information processing device (server), and home robot must all work in coordination. A smartphone or tablet is used as the user terminal, and a messaging application such as the LINE app is launched to interact with a specific character. When the user selects a character and sends a message, the terminal transfers the data to the information processing device.
[0106] The information processing device (server) analyzes messages received from the user using a natural language processing engine. Based on the analyzed data, and combined with the latest information obtained from external sources, a generative AI model generates a response appropriate to the character. The generated response is sent to a home robot in text format and output as speech using speech synthesis technology. For example, it is possible to utilize Google's Text-to-Speech service.
[0107] The home robot receives responses from the information processing device and engages in natural conversation with the user. This allows users to obtain everyday information while enjoying conversations with the character simply by asking the home robot questions in their living room. For example, when a user asks, "What's the weather like today?", the robot might respond, "It's sunny today, and the temperature is 25 degrees. You'll need sunglasses."
[0108] In a concrete example, if a user asks about a particular hobby, the system could respond based on acquired external information, saying, "XYZ is currently a popular trend." In the prompt, the information processing device can instruct the generative AI model in the following way: "User question: What's the weather like today? Generate response from smart robot: Generate a response based on weather forecast API data and add advice for going out."
[0109] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0110] Step 1:
[0111] The user launches the LINE app on their device, selects a specific character, and enters a message. The input consists of the user message and character selection data. This data is sent from the device to the server.
[0112] Step 2:
[0113] The server associates the character selection data received from the terminal with the user profile and stores it in a database. This records the character selected by each user, creating the foundation for providing personalized dialogue experiences.
[0114] Step 3:
[0115] The server accesses external information sources to retrieve the latest information related to the user's message. Input consists of data requests from external sources. Data processing involves extracting necessary information based on specific keywords and adjusting it to suit the character's personality.
[0116] Step 4:
[0117] Messages received from users are analyzed on the server using natural language processing. The input is the user message, and the data processing involves grammatical analysis and semantic interpretation. As a result, the message's intent is extracted.
[0118] Step 5:
[0119] Based on the analysis results and acquired external information, the server generates character-appropriate responses using a generative AI model. Through the generated prompts, the AI model outputs appropriate response sentences. This includes adjusting the tone and wording to match the character's personality.
[0120] Step 6:
[0121] The generated response is sent from the server to the terminal, and then forwarded to the home robot. The input is the response text from the server, and the output is audio data. The robot uses speech synthesis technology to convert the text into speech.
[0122] Step 7:
[0123] The home robot completes the interaction by providing an audible response to the user. Here, the user receives the voice output and obtains information relevant to the intent of their question.
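The relay in Steps 6 and 7 above (server to terminal, terminal to home robot, text to speech) might be sketched as below. The classes HomeRobot and Terminal and the function fake_synthesize are hypothetical names introduced for illustration. The synthesizer is deliberately a stub that encodes the text as bytes; it does not call any real speech synthesis service, and a real engine would have to be substituted for it.

```python
from typing import Callable

# Hypothetical signature for a speech synthesizer: text in, audio bytes out.
Synthesizer = Callable[[str], bytes]


class HomeRobot:
    def __init__(self, synthesize: Synthesizer) -> None:
        self._synthesize = synthesize
        self.played: list[bytes] = []  # audio clips "played" so far

    def speak(self, text: str) -> bytes:
        # Step 6 (second half): convert the response text into audio data.
        audio = self._synthesize(text)
        # Step 7: output the audio; here it is only recorded.
        self.played.append(audio)
        return audio


class Terminal:
    def __init__(self, robot: HomeRobot) -> None:
        self._robot = robot

    def on_server_response(self, text: str) -> bytes:
        # Step 6 (first half): forward the server's response to the robot.
        return self._robot.speak(text)


def fake_synthesize(text: str) -> bytes:
    # Stub only: stands in for a real text-to-speech engine.
    return text.encode("utf-8")


if __name__ == "__main__":
    robot = HomeRobot(fake_synthesize)
    terminal = Terminal(robot)
    terminal.on_server_response("It's sunny today, and the temperature is 25 degrees.")
    print(len(robot.played))
```

Injecting the synthesizer as a callable keeps the relay logic independent of whichever speech synthesis technology the robot actually uses.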
[0124] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0125] This invention provides an information processing system that incorporates an emotion engine that recognizes the user's emotions when they interact with a specific character and adjusts the character's responses accordingly. This system consists of a user terminal, a server, and the emotion engine.
[0126] The user's device is a smartphone or tablet, and it connects to the server using the LINE app. The user selects a character they want to interact with through the LINE app, and the conversation with the character begins. When the user sends a message, the device forwards that message to the server.
[0127] When the server analyzes a user's message, it uses an emotion engine to recognize their emotions. The emotion engine utilizes natural language processing techniques to identify the user's emotional state based on the message's context and vocabulary. This emotion analysis influences the AI model that generates the character's response. The AI model selects appropriate tones and expressions to generate a response that is appropriate to the user's emotions.
[0128] The generated responses express emotions such as joy, sadness, and surprise, making them more relatable to the user. These responses are sent from the server to the device and displayed on the LINE app. In this way, users can continue interacting with the character while feeling an emotional connection.
[0129] For example, if a user sends a message saying, "Something really great happened today!", the server's emotion engine can recognize this emotion and have the character respond with something like, "That's wonderful! What happened?", sharing in the joy. This allows the user to enjoy a more intimate conversation with the character.
[0130] In this way, the present invention provides a dialogue system that includes adjustments based on the user's emotion recognition, enabling a more natural and enriching dialogue experience.
[0131] The following describes the processing flow.
[0132] Step 1:
[0133] The user accesses the system through the LINE app and selects a specific character they want to interact with. The device then sends this selection information to the server.
[0134] Step 2:
[0135] The server stores the selected character information in a database, associating it with the user's profile. This ensures that interactions with the character are properly managed in future conversations.
[0136] Step 3:
[0137] When a user sends a message to a character, the device forwards that message to the server. The message content is sent to the server as text data.
[0138] Step 4:
[0139] The server uses an emotion engine to analyze the messages sent by the user. Based on the message content, context, and vocabulary used, it identifies the user's emotional state.
[0140] Step 5:
[0141] Based on the analyzed emotional information, the server's AI model generates a character response. The emotional information is reflected in the tone and content of the response, resulting in a response that includes emotional resonance.
[0142] Step 6:
[0143] The generated response is sent from the server to the device and displayed to the user on the LINE app. The user receives an emotionally appropriate response from the character and continues the conversation.
[0144] Step 7:
[0145] The user can continue the conversation based on the response they receive. The server repeats this process, continuously updating the character's responses in response to new messages sent by the user.
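Steps 1 to 7 above can be illustrated with a minimal sketch. It assumes a keyword-based stand-in for the emotion engine and a canned-response stand-in for the AI model; the keyword lists, the character name "Mio", and the class and function names are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass, field

# Hypothetical keyword lists; the disclosure does not specify how the
# emotion engine maps vocabulary to emotions.
EMOTION_KEYWORDS = {
    "joy": ("great", "happy", "wonderful"),
    "sadness": ("sad", "lonely", "tired"),
}


def estimate_emotion(message: str) -> str:
    """Step 4: return the first emotion whose keyword appears in the message."""
    text = message.lower()
    for emotion, keywords in EMOTION_KEYWORDS.items():
        if any(word in text for word in keywords):
            return emotion
    return "neutral"


def generate_response(character: str, emotion: str) -> str:
    """Step 5: stand-in for the AI model; picks a canned line by emotion."""
    lines = {
        "joy": "That's wonderful! What happened?",
        "sadness": "That sounds hard. Do you want to talk about it?",
        "neutral": "I see. Tell me more.",
    }
    return f"[{character}] {lines[emotion]}"


@dataclass
class Server:
    # Step 2: user id -> selected character (stands in for the database).
    selections: dict = field(default_factory=dict)

    def select_character(self, user_id: str, character: str) -> None:
        self.selections[user_id] = character

    def handle_message(self, user_id: str, message: str) -> str:
        character = self.selections[user_id]          # Step 2 lookup
        emotion = estimate_emotion(message)           # Step 4
        return generate_response(character, emotion)  # Steps 5 and 6


server = Server()
server.select_character("user-1", "Mio")
reply = server.handle_message("user-1", "Something really great happened today!")
print(reply)
```

Calling `handle_message` again for each new message corresponds to the repetition described in Step 7.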
[0146] (Example 2)
[0147] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0148] Conventional information processing systems have the problem that user-character interactions are mechanical and that it is difficult to respond appropriately to the user's emotional state. As a result, it is difficult to provide a natural and rich dialogue experience.
[0149] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0150] In this invention, the server includes means for analyzing received messages and identifying emotional states using natural language processing techniques, means for generating adapted responses using generative models based on the identified emotional states, and means for transmitting the generated responses to the user terminal. This makes it possible to generate adjusted responses based on the user's emotions.
[0151] A "user terminal" is a communication device connected to an information processing device, and is a device used by users to input or receive information.
[0152] "Character selection" refers to the act of a user choosing a virtual entity with whom they will engage in dialogue.
[0153] A "data processing device" refers to a device that receives, transmits, analyzes, and generates responses to information via a communication network.
[0154] "Natural language processing technology" is a general term for technologies that enable computers to understand and process human language.
[0155] "Emotional state" refers to the emotional responses and tendencies of the user, and is the subject of analysis.
[0156] A "generative model" refers to a computational algorithm or program used to generate a response based on input information.
[0157] An "adapted response" refers to the result of a dialogue that has been appropriately adjusted based on the user's emotional state and the context of the conversation.
[0158] This invention provides an information processing system that recognizes a user's emotions and generates a corresponding response when the user engages in a natural conversation with a specific character. The system incorporates a user terminal, a server, and a generative AI model that performs emotion analysis.
[0159] The user terminal is a communication device such as a smartphone or tablet. The user connects to the server using an application such as the LINE app and begins interacting with the character. When the user sends a message, the terminal forwards that message to the server.
[0160] The server uses natural language processing techniques to perform sentiment analysis when processing received user messages. Specifically, it analyzes the context of the received message and uses its built-in software to identify the user's emotional state. The results of this sentiment analysis are used to influence the generative AI model and generate appropriately tailored responses.
[0161] For example, if a user sends a message saying, "Something really great happened today!", the server's sentiment analysis system recognizes the emotion "happy." Based on this, the AI model generates an empathetic response to the user's feelings, such as, "That's wonderful! What happened?" Prompts may be used in this response generation process, for example, "A user has reported some good news. Please think of a response that shares their joy."
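The selection of a prompt according to the recognized emotion could look like the following sketch. Only the "happy" prompt is quoted from the text above; the "sad" entry, the default prompt, and the function name `build_prompt` are assumptions added for illustration.

```python
# Instruction text per recognized emotion. The "happy" entry is the prompt
# quoted in the text; the other entries are hypothetical.
PROMPTS_BY_EMOTION = {
    "happy": (
        "A user has reported some good news. "
        "Please think of a response that shares their joy."
    ),
    "sad": (
        "A user has reported something upsetting. "
        "Please think of a response that comforts them."
    ),
}
DEFAULT_PROMPT = "Please think of a natural response to the user's message."


def build_prompt(emotion: str, user_message: str) -> str:
    """Combine the emotion-specific instruction with the user's message."""
    instruction = PROMPTS_BY_EMOTION.get(emotion, DEFAULT_PROMPT)
    return f"{instruction}\nUser message: {user_message}"


prompt = build_prompt("happy", "Something really great happened today!")
print(prompt)
```

Falling back to a default prompt keeps the sketch usable when the sentiment analysis returns an emotion that has no dedicated instruction.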
[0162] The generated response is sent back to the user's device from the server and displayed on the LINE app screen. This allows users to engage in conversations with characters that reflect their actual emotions, enabling them to experience natural and rich communication.
[0163] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0164] Step 1:
[0165] Users launch the LINE app on a device such as a smartphone or tablet and select the character they want to interact with.
[0166] Input: User character selection
[0167] Operation: The user selects a character using the LINE app interface.
[0168] Output: Information about the selected character is sent from the terminal to the server.
[0169] Step 2:
[0170] The user types and sends a message to the selected character.
[0171] Input: User's message text
[0172] Operation: The user enters text into the message input field in the LINE app and taps the send button.
[0173] Output: The entered message text is processed on the terminal and forwarded to the server.
[0174] Step 3:
[0175] The server analyzes the user's message and performs sentiment recognition.
[0176] Input: Sent message text
[0177] Operation: The server uses natural language processing techniques to analyze messages and identify the user's emotional state from the message's context and vocabulary.
[0178] Output: The user's emotional state is determined, and the results of the emotion analysis are passed to the generative AI model.
[0179] Step 4:
[0180] The server generates character responses using a generative AI model based on the emotion analysis results.
[0181] Input: Sentiment analysis results
[0182] Operation: Based on the sentiment analysis results, the server inputs prompt sentences into the generative AI model, which then generates a response that takes into account appropriate tone and content.
[0183] Output: Response text generated using the prompt.
[0184] Step 5:
[0185] The server sends the generated response to the user's terminal.
[0186] Input: Response text
[0187] Operation: The server packages the generated response as a message and sends it to the user terminal via the communication protocol.
[0188] Output: The response message is displayed in the user's LINE app.
[0189] Step 6:
[0190] The user receives a response generated within the LINE app and continues the conversation with the character.
[0191] Input: Response message from the server
[0192] Operation: The user reads the response displayed on the LINE app interface and decides whether to continue the conversation by typing a new message.
[0193] Output: Input of a new message or end of the interaction.
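Step 5 states that the server packages the generated response as a message before sending it. A minimal sketch of such packaging is shown below, assuming a JSON envelope; the field names (`to`, `character`, `type`, `text`, `sent_at`) are hypothetical, and the actual message format and communication protocol are not specified in the disclosure.

```python
import json
from datetime import datetime, timezone


def package_response(user_id: str, character: str, text: str) -> str:
    """Serialize a character response into a JSON message (hypothetical schema)."""
    envelope = {
        "to": user_id,
        "character": character,
        "type": "text",
        "text": text,
        "sent_at": datetime.now(timezone.utc).isoformat(),
    }
    # ensure_ascii=False keeps non-ASCII text (for example Japanese) readable.
    return json.dumps(envelope, ensure_ascii=False)


def unpack_response(raw: str) -> str:
    """Terminal side: extract the text to display to the user."""
    return json.loads(raw)["text"]


raw = package_response("user-1", "Mio", "That's wonderful! What happened?")
print(unpack_response(raw))
```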
[0194] (Application Example 2)
[0195] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0196] In modern times, interaction with virtual characters is becoming commonplace, and there is a growing demand for richer communication experiences, particularly in the fields of entertainment and interaction. Existing systems struggle to adequately recognize the user's emotional state and generate empathetic responses accordingly. As a result, users may experience unnaturalness or low satisfaction. To address this challenge, a system is needed that recognizes the user's emotions and generates natural and empathetic responses.
[0197] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0198] In this invention, the server includes means for receiving a virtual character selection from a user device and transmitting it to a computer; means for obtaining the latest information from an external information resource and updating that information in association with the virtual character; means for analyzing a message received from the user and identifying its emotional state using natural language processing technology; means for generating a response for the virtual character based on the emotion analysis results and transmitting the generated response to the user device; and means for utilizing an AI model using prompt sentences to generate an emotion-appropriate response. This enables natural dialogue that is in line with the user's emotions.
[0199] An "information processing device" is a computer system that processes and analyzes data based on user input and generates responses or outputs.
[0200] A "user device" is a device operated by a user to communicate with an information processing device, and includes smartphones and tablets.
[0201] "Virtual character selection" refers to the act of selecting a digital personality or anthropomorphic character to interact with.
[0202] A "computer" is an electronic device equipped with a central processing unit for processing information and performing necessary calculations.
[0203] "External information resources" refer to databases and information provision services that exist outside the system and are used to retrieve information as needed.
[0204] "Latest information" refers to the most recent data obtained from external information resources that reflects the current situation and state.
[0205] "Natural language processing technology" is a technology that analyzes human language and uses computers to understand and process its meaning and emotions.
[0206] "Emotional state" is an evaluation that indicates the emotional state and psychological tendencies contained in the user's input text.
[0207] A "prompt sentence" is text that provides instructions or context for an AI model to generate a response.
[0208] An "AI model" is an artificial intelligence program that uses machine learning algorithms to learn patterns from data and generate responses.
[0209] The system of this invention is designed to enable natural communication based on the user's emotion recognition through interaction with a virtual character. This system consists mainly of a user terminal, a server, and an emotion engine.
[0210] The user selects a virtual character using a user device such as a smartphone or tablet and begins an interaction. The user device sends the character selection information to the server. The device then acts as a relay that receives messages from the user and forwards them to the server.
[0211] The server is responsible for processing incoming user messages. An emotion engine built into the server analyzes the emotional state of the message using natural language processing techniques. The analyzed emotional state is then used by a generative AI model, along with specific prompts, to derive a response appropriate to the user's emotions. This AI model generates responses using advanced machine learning algorithms, accessed for example through the OpenAI® API, enabling the virtual character to engage in empathetic and natural dialogue with the user.
[0212] As a concrete example of this invention, consider a case where a user sends the message "I'm tired." In this case, the emotion engine identifies "fatigue" as the emotional state and supplies the AI model with a prompt sentence based on it, "Please write a response in an appropriate tone based on the user's emotion {fatigue}," so that the model generates a gentle and encouraging response. In this way, the system provides an empathetic response according to the user's emotional state, offering an engaging conversational experience.
[0213] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0214] Step 1:
[0215] The user selects a virtual character using their user terminal and begins an interaction. The input at this time is the selection information for the virtual character, and the terminal sends this information to the server. This determines the context of the interaction with the character.
[0216] Step 2:
[0217] The user enters a message into the user terminal. This input is the user's message content, and the terminal forwards this message to the server. This prepares the foundation for the message to be parsed.
[0218] Step 3:
[0219] The server analyzes received user messages using an emotion engine. The input is the user's message, and the emotional state is identified using natural language processing techniques. This process outputs the user's emotions as states such as "joy" or "sadness."
[0220] Step 4:
[0221] The server generates prompt sentences based on the sentiment analysis results. The input is the analyzed sentiment state, and the prompt sentence is output by inserting the sentiment state into the prompt sentence template. This prepares appropriate input for the AI model.
[0222] Step 5:
[0223] The server inputs the prompt into the generative AI model to generate a response. The input is the generated prompt, and the AI model uses this prompt to generate a response that is appropriate to the user's emotions. This step provides the response content to be sent to the user.
[0224] Step 6:
[0225] The server sends the generated response to the user's terminal. The input is the response generated by the AI model, which is sent to the user's terminal as output. This process allows the user to receive emotionally empathetic responses from the virtual character.
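Steps 3 to 5 above, in particular the insertion of the emotional state into a prompt sentence template, can be sketched as follows. The template wording follows the example prompt given earlier for the "fatigue" case; writing the slot as a named placeholder `{emotion}`, and passing the emotion engine and the AI model in as callables, are assumptions made so the sketch runs without any external service.

```python
from typing import Callable

# Template based on the example prompt in the text, with the emotion slot
# written as a named placeholder (an assumption about the template syntax).
PROMPT_TEMPLATE = (
    "Please write a response in an appropriate tone "
    "based on the user's emotion {emotion}."
)


def build_prompt(emotion: str) -> str:
    """Step 4: insert the identified emotional state into the template."""
    return PROMPT_TEMPLATE.format(emotion=emotion)


def respond(message: str,
            emotion_engine: Callable[[str], str],
            model: Callable[[str], str]) -> str:
    """Steps 3 to 5: analyze the emotion, build the prompt, call the model."""
    emotion = emotion_engine(message)
    return model(build_prompt(emotion))


# Hypothetical stand-ins for the emotion engine and the generative AI model.
def toy_emotion_engine(message: str) -> str:
    return "fatigue" if "tired" in message.lower() else "neutral"


def echo_model(prompt: str) -> str:
    return f"(model received: {prompt})"


print(respond("I'm tired.", toy_emotion_engine, echo_model))
```

Because the model is injected as a plain callable, the same `respond` function could wrap a call to a hosted generative AI service without changing Steps 3 and 4.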
[0226] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0227] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0228] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0229] [Second Embodiment]
[0230] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0231] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0232] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0233] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0234] The microphone 238 accepts instructions from the user 20 by receiving the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0235] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0236] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0237] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0238] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0239] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0240] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0241] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0242] This invention provides users with a new value in their conversational experience by constructing an information processing system that enables users to interact with specific characters in real time. The system mainly consists of a user terminal, a server, and an external information source.
[0243] The devices used by users are digital devices such as smartphones and tablets, and they connect to the server using the LINE app. Through the LINE interface, users can select the character they want to interact with, and the selected character information is sent to the server via the device.
[0244] The server associates the received character selection data with the user profile and stores it in the system's database. It then accesses external sources to collect the latest data, such as news, technology information, and trends. The server analyzes the collected data using its proprietary natural language processing technology and adjusts the information to match the personality and tone of the chosen character.
[0245] When a user sends a message to a character via the LINE app, the device relays the message to a server. The server analyzes the message, identifying its content and purpose. Based on the analysis, an AI model generates the optimal response, adjusting it to incorporate language and tone appropriate for the character. This generated response is then sent from the server to the device and displayed within the LINE app.
[0246] For example, if a user asks, "What's a good movie out there lately?", the server can refer to the latest movie information from external sources and respond in the character's tone, "I recommend XYZ Movies as a recent popular title." In this way, users can enjoy natural conversations with the character they have chosen.
[0247] In this embodiment, users can enjoy a rich conversational experience by engaging in natural conversations with characters and being provided with constantly updated information.
[0248] The following describes the processing flow.
[0249] Step 1:
[0250] The user accesses the system through the LINE app and selects a specific character. The device receives the user's selection and sends that information to the server.
[0251] Step 2:
[0252] The server stores the received character selection information in the database along with the user's profile information. This establishes an association between the user and the character.
[0253] Step 3:
[0254] The server accesses external information sources to regularly retrieve the latest news, technology, fashion, and other information. The collected data is analyzed by the server and updated in relation to the characters.
[0255] Step 4:
[0256] The user sends a message to the character using the LINE app. The device then forwards this message to the server.
[0257] Step 5:
[0258] The server analyzes messages sent by users using natural language processing tools to identify the intent and content of the messages.
[0259] Step 6:
[0260] Based on the analyzed message content, the server uses an AI model to generate a response tailored to the character. The generated response is adjusted to match the character's tone and style.
[0261] Step 7:
[0262] The server sends the generated response to the device. The device displays the response on the LINE app, and the user checks the message from the character.
[0263] Step 8:
[0264] The user receives a response from the character and can continue the conversation. The system repeats this process, maintaining a continuous conversation with the user.
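Step 3 describes regularly retrieving the latest information and keeping it associated with the character. One way to sketch this is a small store that refetches an entry once it is older than a refresh interval. The class name, the one-hour default interval, the `fetch` callable, and the explicit `now` argument are all assumptions; the disclosure does not state how or how often updates occur.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CharacterInfoStore:
    """Keeps the latest external items per character and refreshes stale ones."""
    fetch: Callable[[str], list]      # hypothetical: topic -> list of items
    max_age_seconds: float = 3600.0   # hypothetical refresh interval
    _items: dict = field(default_factory=dict)
    _fetched_at: dict = field(default_factory=dict)

    def latest(self, character: str, topic: str, now: float) -> list:
        key = (character, topic)
        fetched = self._fetched_at.get(key)
        if fetched is None or now - fetched > self.max_age_seconds:
            # Step 3: retrieve fresh data and associate it with the character.
            self._items[key] = self.fetch(topic)
            self._fetched_at[key] = now
        return self._items[key]


calls = []


def fake_fetch(topic: str) -> list:
    """Stand-in for an external information source."""
    calls.append(topic)
    return [f"{topic} headline #{len(calls)}"]


store = CharacterInfoStore(fetch=fake_fetch, max_age_seconds=60)
first = store.latest("Mio", "movies", now=0)
cached = store.latest("Mio", "movies", now=30)      # still fresh: no new fetch
refreshed = store.latest("Mio", "movies", now=100)  # stale: fetched again
print(first, cached, refreshed)
```

Passing `now` explicitly, rather than reading the clock inside the method, keeps the refresh behavior easy to verify.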
[0265] (Example 1)
[0266] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0267] With the advancement of modern information and communication technology, there is a growing need for systems that allow users to obtain the latest information in real time while engaging in natural conversations with dialogue objects. However, existing systems often fail to deliver both up-to-date information and natural conversation, making it difficult to improve the user experience.
[0268] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0269] In this invention, the server includes means for receiving selection information from a communication device that allows the user to select a dialogue object and transmitting it to a management device; means for collecting the latest data from an external information source and updating the information by associating that data with the dialogue object; means for analyzing the communication content received from the user and generating a response appropriate to the dialogue object based on that communication content; and means for adjusting the response using a generative AI model tailored to the characteristics of the dialogue object. As a result, the user can converse with the selected dialogue object in real time and in a natural manner and instantly obtain the latest information.
[0270] A "communication device" is a digital device used by a user to select a dialogue object and communicate, and includes smartphones and tablets.
[0271] A "management device" is a device that collects data from external information sources and generates responses based on selected information and analysis results from communication devices; a server falls into this category.
[0272] A "dialogue object" is a character that the user selects and engages in conversation with.
[0273] "External information sources" refer to information providers from which the management device collects data, such as news, trend information, and technology information.
[0274] A "generative AI model" is an artificial intelligence technology that generates the optimal response based on the communication content received from the user and adjusts the response to suit the dialogue object.
[0275] This invention is an information processing system that enables users to have real-time conversations with dialogue objects using a communication device. The communication device is a digital device such as a smartphone or tablet, and connects to a management device using an interactive application. A server is used as the management device and performs major processing such as information collection, analysis, and generation.
[0276] Users can use a communication device to select dialogue objects through a dedicated interface (e.g., a messaging application). The selection information is then sent from the communication device to the server.
[0277] The server accesses external information sources and collects up-to-date data from them, including news, trends, and technical information, using APIs and web crawling technologies. This data is stored in a database on the server and analyzed using natural language processing techniques. The analyzed information is then tailored based on the characteristics of the corresponding dialogue object.
[0278] When a user sends a message through a communication device, the terminal relays the message to a server. The server performs natural language analysis on the message, interprets its intent, and then uses a generative AI model to generate the optimal response. This response is adjusted to match the tone and style of the dialogue object.
[0279] For example, when the user asks "What are the recent popular music events?", the server can obtain the latest information about music events from external information sources and generate a response that reflects the diction of the dialogue object, such as "As the latest event information, the Y Music Festival held at Location X is recommended." This provides a natural dialogue experience for the user.
[0280] An example of a prompt sentence for the AI model is an instruction such as "When the user asks about recent music events, generate a response in the tone of the dialogue object based on external information." This prompt serves as the basis for the AI model to generate an appropriate response to the user's question.
[0281] The flow of the specific process in Example 1 will be described using Figure 11.
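A prompt of this kind has to carry three things to the AI model: the tone of the dialogue object, the external information, and the user's question. The sketch below shows one possible way to assemble them; the function name, the layout of the prompt, and the tone description "cheerful and casual" are hypothetical.

```python
def build_dialogue_prompt(tone: str, external_items: list, question: str) -> str:
    """Assemble one prompt from the tone, the collected data, and the question."""
    facts = "\n".join(f"- {item}" for item in external_items) or "- (none available)"
    return (
        "Generate a response in the tone of the dialogue object "
        "based on external information.\n"
        f"Tone of the dialogue object: {tone}\n"
        f"External information:\n{facts}\n"
        f"User question: {question}"
    )


prompt = build_dialogue_prompt(
    tone="cheerful and casual",
    external_items=["Y Music Festival held at Location X"],
    question="What are the recent popular music events?",
)
print(prompt)
```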
[0282] Step 1:
[0283] The user selects a dialogue object using the communication device. The communication device transmits this selection information to the server as a data packet. The input is the information of the dialogue object selected by the user, and the output is the transmission data to the server. In this step, information is collected from the user interface.
[0284] Step 2:
[0285] The server associates the received selection information with the user's profile and saves it in the database. The input is the selection data from the communication device, and the output is the save operation to the database. Here, the data is organized using the profile management function.
[0286] Step 3:
[0287] The server accesses external information sources and collects the necessary data. This includes news, trends, and technology information. The input is a request to an external API, and the output is the latest retrieved data. Data ingestion is achieved using API calls and web crawling technologies.
[0288] Step 4:
[0289] The server analyzes the collected data using natural language processing techniques and adjusts it to match the characteristics of the dialogue object. The input is data obtained from an external source, and the output is the adjusted data. This step involves analysis using an NLP algorithm.
[0290] Step 5:
[0291] A user sends a message using a communication device. The communication device relays this message to a server. The input is the user's message, and the output is the message transfer to the server. Here, user interaction is relayed.
[0292] Step 6:
[0293] The server analyzes the received message and understands its intent. The input is the user message, and the output is the analysis result. In this step, intent analysis is performed using machine learning techniques.
[0294] Step 7:
[0295] The server generates responses using a generative AI model based on the analysis results, and adjusts them to match the characteristics of the dialogue object. The input consists of the analyzed user message and necessary external data, and the output is the adjusted response. Instructions are given to the AI model using prompts to generate natural-sounding responses.
[0296] Step 8:
[0297] The server sends the generated response to the communication device, which then displays the response to the user. The input is the response data from the server, and the output is the display on the user interface. This step involves data transfer and display.
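Steps 6 and 7 depend on matching the intent of the user's message to the external data collected in Steps 3 and 4. The sketch below uses simple keyword matching as a stand-in for the machine-learning intent analysis mentioned in Step 6; the topic names, keyword lists, and the `COLLECTED` table are hypothetical.

```python
from typing import Optional

# Hypothetical topic keywords; keyword matching stands in for the
# machine-learning intent analysis described in Step 6.
TOPIC_KEYWORDS = {
    "music": ("music", "concert", "festival"),
    "movies": ("movie", "film"),
}

# Hypothetical result of Steps 3 and 4: collected data grouped by topic.
COLLECTED = {
    "music": ["Y Music Festival held at Location X"],
    "movies": [],
}


def detect_topic(message: str) -> Optional[str]:
    """Step 6: return the first topic whose keyword occurs in the message."""
    text = message.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def select_external_data(message: str) -> list:
    """Pick the collected items that Step 7 would pass to the AI model."""
    topic = detect_topic(message)
    return COLLECTED.get(topic, []) if topic else []


print(select_external_data("What are the recent popular music events?"))
```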
[0298] (Application Example 1)
[0299] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0300] In today's home environment, there is a need for information processing technology that lets users naturally obtain the information they want in daily life through friendly interactions with specific characters. This in turn requires home robots that interact smoothly with users, improving convenience, enriching daily life, and enabling effective communication.
[0301] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0302] In this invention, the server includes means for receiving character selection from a user device and transmitting it to an information processing device; means for acquiring the latest information from an external information source, adjusting the information based on its personality and associating it with the character; means for analyzing a message received from the user, generating a response appropriate to the character based on its content, and outputting the generated response as voice; and means for providing the response through a home robot to support the user's daily life. This makes it possible for the user to acquire everyday information while engaging in natural conversation with a specific character.
[0303] An "information processing system" is a technological foundation that enables users to interact with specific characters, and it performs tasks such as message analysis and response generation.
[0304] The "user device" is a digital device used by a user to interact with a character, including smartphones and tablets.
[0305] The "information processing device" is a computer system that receives information from a user device and performs processing, including servers.
[0306] The "external information source" refers to databases and online services that provide up-to-date data such as news, technical information, and trends.
[0307] A "character" is a virtual person, animal, or similar entity with a personality, which serves as the target of the user's interaction.
[0308] The "analysis means" is a technical process that processes messages received from a user and understands their intent and content.
[0309] "Response generation" is a process of generating a response suitable for a character based on the analyzed message.
[0310] The "home robot" is a robot device arranged to interact with and assist a user in a home environment.
[0311] "Adjusting information based on personality" is a process of converting and adapting the acquired information to match the tone and behavior of a specific character.
[0312] To realize this invention, a user terminal, an information processing device (server), and a home robot need to operate in cooperation. A smartphone or tablet is used as the user terminal, and a messaging application such as the LINE app is launched to interact with a specific character. When the user selects a character and sends a message, the terminal transfers the data to the information processing device.
[0313] The information processing device (server) analyzes messages received from the user using a natural language processing engine. Based on the analyzed data, and combined with the latest information obtained from external sources, a generative AI model generates a response appropriate to the character. The generated response is sent to a home robot in text format and output as speech using speech synthesis technology. For example, Google's Text-to-Speech service can be used.
[0314] The home robot receives responses from the information processing device and engages in natural conversation with the user. This allows users to obtain everyday information while enjoying conversations with the character simply by asking the home robot questions in their living room. For example, when a user asks, "What's the weather like today?", the robot might respond, "It's sunny today, and the temperature is 25 degrees. You'll need sunglasses."
[0315] In a concrete example, if a user asks about a particular hobby, the system could respond based on acquired external information, saying, "XYZ is currently a popular trend." In the prompt, the information processing device can instruct the generative AI model in the following way: "User question: What's the weather like today? Generate response from smart robot: Generate a response based on weather forecast API data and add advice for going out."
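The prompt described above could be assembled as in the following minimal sketch. The function name `build_weather_prompt` and the forecast fields `condition` and `temperature_c` are hypothetical; the source does not specify the weather forecast API or its data format, and the prompt wording is lightly adapted from the example.

```python
def build_weather_prompt(question: str, forecast: dict) -> str:
    """Assemble a prompt from the user's question and forecast data.

    The forecast keys used here are assumptions for illustration; a real
    weather API would define its own field names.
    """
    return (
        f"User question: {question}\n"
        f"Weather forecast data: {forecast['condition']}, "
        f"{forecast['temperature_c']} degrees\n"
        "Generate a response based on the weather forecast data "
        "and add advice for going out."
    )


prompt = build_weather_prompt(
    "What's the weather like today?",
    {"condition": "sunny", "temperature_c": 25},
)
print(prompt)
```

Keeping the forecast data on its own line of the prompt lets the model ground its advice (for example, suggesting sunglasses) in the retrieved values rather than in its own guess.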
[0316] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0317] Step 1:
[0318] The user launches the LINE app on their device, selects a specific character, and enters a message. The input consists of the user message and character selection data. This data is sent from the device to the server.
[0319] Step 2:
[0320] The server associates the character selection data received from the terminal with the user profile and stores it in a database. This records the character selected by each user, creating the foundation for providing personalized dialogue experiences.
[0321] Step 3:
[0322] The server accesses external information sources to retrieve the latest information related to the user's message. The input is a data request to the external sources. Data processing involves extracting the necessary information based on specific keywords and adjusting it to suit the character's personality.
[0323] Step 4:
[0324] Messages received from users are analyzed on the server using natural language processing. The input is the user message, and the data processing involves grammatical analysis and semantic interpretation. As a result, the message's intent is extracted.
[0325] Step 5:
[0326] Based on the analysis results and the acquired external information, the server generates character-appropriate responses using a generative AI model. Given the generated prompt, the AI model outputs appropriate response sentences. This includes adjusting the tone and wording to match the character's personality.
[0327] Step 6:
[0328] The generated response is sent from the server to the terminal, and then forwarded to the home robot. The input is the response text from the server, and the output is audio data. The robot uses speech synthesis technology to convert the text into speech.
[0329] Step 7:
[0330] The home robot completes the interaction by providing an audible response to the user. Here, the user receives the voice output and obtains information relevant to the intent of their question.
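Steps 3 to 6 above can be sketched as a single function, under several assumptions. The function `handle_message` and its parameters are hypothetical, the external source, generative AI model, and speech synthesizer are passed in as plain callables, and the intent analysis of Step 4 is folded into the model call rather than implemented separately.

```python
from typing import Callable


def handle_message(
    message: str,
    character: str,
    fetch_info: Callable[[str], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run Steps 3 to 6 for one user message and return audio data."""
    # Step 3: retrieve external information related to the message.
    info = fetch_info(message)
    # Steps 4 and 5: build a prompt and let the model produce the reply.
    prompt = (
        f"Character: {character}\n"
        f"User message: {message}\n"
        f"External information: {info}\n"
        "Reply in the character's tone."
    )
    reply = generate(prompt)
    # Step 6: convert the reply text into speech.
    return synthesize(reply)


# Stand-ins for the external source, the generative AI model, and the
# speech synthesizer; real services would replace these lambdas.
audio = handle_message(
    "What's the weather like today?",
    "robot friend",
    fetch_info=lambda msg: "sunny, 25 degrees",
    generate=lambda prompt: "It's sunny today, and the temperature is 25 degrees.",
    synthesize=lambda text: text.encode("utf-8"),
)
print(len(audio))
```

Passing the three services as arguments keeps the flow testable without network access; a deployment would supply real clients in place of the stand-ins.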
[0331] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0332] This invention provides an information processing system that incorporates an emotion engine that recognizes and adjusts the user's emotions when they interact with a specific character. This system consists of a user terminal, a server, and the emotion engine.
[0333] The user's device is a smartphone or tablet, and it connects to the server using the LINE app. The user selects a character they want to interact with through the LINE app, and the conversation with the character begins. When the user sends a message, the device forwards that message to the server.
[0334] When the server analyzes a user's message, it uses an emotion engine to recognize their emotions. The emotion engine utilizes natural language processing techniques to identify the user's emotional state based on the message's context and vocabulary. This emotion analysis influences the AI model that generates the character's response. The AI model selects appropriate tones and expressions to generate a response that is appropriate to the user's emotions.
[0335] The generated responses express emotions such as joy, sadness, and surprise, making them more relatable to the user. These responses are sent from the server to the device and displayed on the LINE app. In this way, users can continue interacting with the character while feeling an emotional connection.
[0336] For example, if a user sends a message saying, "Something really great happened today!", the server's emotion engine can recognize this emotion and have the character respond with something like, "That's wonderful! What happened?", sharing in the joy. This allows the user to enjoy a more intimate conversation with the character.
[0337] In this way, the present invention provides a dialogue system that includes adjustments based on the user's emotion recognition, enabling a more natural and enriching dialogue experience.
[0338] The following describes the processing flow.
[0339] Step 1:
[0340] The user accesses the system through the LINE app and selects a specific character they want to interact with. The device then sends this selection information to the server.
[0341] Step 2:
[0342] The server stores the selected character information in a database, associating it with the user's profile. This ensures that interactions with the character are properly managed in future conversations.
[0343] Step 3:
[0344] When a user sends a message to a character, the device forwards that message to the server. The message content is sent to the server as text data.
[0345] Step 4:
[0346] The server uses an emotion engine to analyze the messages sent by the user. Based on the message content, context, and vocabulary used, it identifies the user's emotional state.
[0347] Step 5:
[0348] Based on the analyzed emotional information, the server's AI model generates a character response. The emotional information is reflected in the tone and content of the response, resulting in a response that includes emotional resonance.
[0349] Step 6:
[0350] The generated response is sent from the server to the device and displayed to the user on the LINE app. The user receives an emotionally appropriate response from the character and continues the conversation.
[0351] Step 7:
[0352] The user can continue the conversation based on the response they receive. The server repeats this process, continuously updating the character's responses in response to new messages sent by the user.
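Steps 4 and 5 of this flow can be illustrated with a deliberately simple stand-in. The sketch below replaces the emotion engine with keyword matching and the AI model with fixed replies; the keyword sets, the function names, and every reply other than the "joy" example from the text are assumptions made for illustration only.

```python
import re

# Toy vocabulary; an actual emotion engine would use a trained model.
EMOTION_KEYWORDS = {
    "joy": {"great", "happy", "wonderful"},
    "sadness": {"sad", "lonely", "terrible"},
    "surprise": {"wow", "unbelievable", "surprised"},
}


def estimate_emotion(message: str) -> str:
    """Step 4: return the first emotion whose keyword occurs as a word."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for emotion, keywords in EMOTION_KEYWORDS.items():
        if words & keywords:
            return emotion
    return "neutral"


def respond(message: str) -> str:
    """Step 5: pick a reply whose tone matches the estimated emotion."""
    replies = {
        "joy": "That's wonderful! What happened?",
        "sadness": "I'm sorry to hear that. Do you want to talk about it?",
        "surprise": "Really? Tell me more!",
        "neutral": "I see. Please go on.",
    }
    return replies[estimate_emotion(message)]


print(respond("Something really great happened today!"))
```

Matching on whole words rather than substrings avoids false hits such as "happy" inside "unhappy", which matters even in a toy classifier.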
[0353] (Example 2)
[0354] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0355] Conventional information processing systems have the problem that interactions between the user and the character are mechanical and that it is difficult to respond appropriately to the user's emotional state. As a result, it is difficult to provide a natural and rich dialogue experience.
[0356] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0357] In this invention, the server includes means for analyzing received messages and identifying emotional states using natural language processing techniques, means for generating adapted responses using generative models based on the identified emotional states, and means for transmitting the generated responses to the user terminal. This makes it possible to generate adjusted responses based on the user's emotions.
[0358] A "user terminal" is a communication device connected to an information processing device, and is a device used by users to input or receive information.
[0359] "Character selection" refers to the act of a user choosing a virtual entity with whom they will engage in dialogue.
[0360] A "data processing device" refers to a device that receives, transmits, analyzes, and generates responses to information via a communication network.
[0361] "Natural language processing technology" is a general term for technologies that enable computers to understand and process human language.
[0362] "Emotional state" refers to the emotional responses and tendencies of the user, and is the subject of analysis.
[0363] A "generative model" refers to a computational algorithm or program used to generate a response based on input information.
[0364] An "adapted response" refers to the result of a dialogue that has been appropriately adjusted based on the user's emotional state and the context of the conversation.
[0365] This invention provides an information processing system that recognizes a user's emotions and generates a corresponding response when the user engages in a natural conversation with a specific character. The system incorporates a user terminal, a server, and a generative AI model that performs emotion analysis.
[0366] The user terminal is a communication device such as a smartphone or tablet. The user connects to the server through an application such as the LINE app and begins interacting with the character. When the user sends a message, the terminal forwards that message to the server.
[0367] The server uses natural language processing techniques to perform sentiment analysis when processing received user messages. Specifically, it analyzes the context of the received message and uses its built-in software to identify the user's emotional state. The results of this sentiment analysis are used to influence the generative AI model and generate appropriately tailored responses.
[0368] For example, if a user sends a message saying, "Something really great happened today!", the server's sentiment analysis system recognizes the emotion "happy." Based on this, the AI model generates an empathetic response to the user's feelings, such as, "That's wonderful! What happened?" Prompts may be used in this response generation process, for example, "A user has reported some good news. Please think of a response that shares their joy."
[0369] The generated response is sent back to the user's device from the server and displayed on the LINE app screen. This allows users to engage in conversations with characters that reflect their actual emotions, enabling them to experience natural and rich communication.
[0370] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0371] Step 1:
[0372] Users launch the LINE app on a device such as a smartphone or tablet and select the character they want to interact with.
[0373] Input: User character selection
[0374] Operation: The user selects a character using the LINE app interface.
[0375] Output: Information about the selected character is sent from the terminal to the server.
[0376] Step 2:
[0377] The user types and sends a message to the selected character.
[0378] Input: User's message text
[0379] Operation: The user enters text into the message input field in the LINE app and taps the send button.
[0380] Output: The entered message text is processed on the terminal and forwarded to the server.
[0381] Step 3:
[0382] The server analyzes the user's message and performs sentiment recognition.
[0383] Input: Sent message text
[0384] Operation: The server uses natural language processing techniques to analyze messages and identify the user's emotional state from the message's context and vocabulary.
[0385] Output: The user's emotional state is determined, and the results of the emotion analysis are passed to the generative AI model.
[0386] Step 4:
[0387] The server generates character responses using a generative AI model based on the emotion analysis results.
[0388] Input: Sentiment analysis results
[0389] Operation: Based on the sentiment analysis results, the server inputs prompt sentences into the generative AI model, which then generates a response that takes into account appropriate tone and content.
[0390] Output: Response text generated using the prompt.
[0391] Step 5:
[0392] The server sends the generated response to the user's terminal.
[0393] Input: Response text
[0394] Operation: The server packages the generated response as a message and sends it to the user terminal via the communication protocol.
[0395] Output: The response message is displayed in the user's LINE app.
[0396] Step 6:
[0397] The user receives a response generated within the LINE app and continues the conversation with the character.
[0398] Input: Response message from the server
[0399] Operation: The user reads the response displayed on the LINE app interface and decides whether to continue the conversation by typing a new message.
[0400] Output: Input of a new message or end of the interaction.
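The packaging in Step 5 could look like the following sketch, which serializes the response as JSON. The `ResponseMessage` fields and the choice of JSON are assumptions; the source only says that the response is packaged as a message and sent via the communication protocol, and this is not the message format of any particular messaging service.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ResponseMessage:
    # All field names are hypothetical.
    user_id: str
    character_id: str
    emotion: str
    text: str


def package_response(message: ResponseMessage) -> str:
    """Serialize the response so it can be sent to the user terminal."""
    return json.dumps(asdict(message), ensure_ascii=False)


def unpack_response(payload: str) -> ResponseMessage:
    """Rebuild the message on the terminal side before display."""
    return ResponseMessage(**json.loads(payload))


payload = package_response(
    ResponseMessage("user-1", "character-a", "joy", "That's wonderful! What happened?")
)
print(unpack_response(payload).text)
```

Carrying the identified emotion alongside the text would let the terminal adapt its presentation, although the source does not require this.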
[0401] (Application Example 2)
[0402] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the smart glasses 214 as the "terminal".
[0403] In modern times, interaction with virtual characters is becoming commonplace, and there is a growing demand for richer communication experiences, particularly in the fields of entertainment and interaction. Existing systems struggle to adequately recognize the user's emotional state and generate empathetic responses accordingly. As a result, users may experience unnaturalness or low satisfaction. To address this challenge, a system is needed that recognizes the user's emotions and generates natural and empathetic responses.
[0404] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0405] In this invention, the server includes means for receiving a virtual character selection from a user device and transmitting it to a computer; means for obtaining the latest information from an external information resource and updating that information in association with the virtual character; means for analyzing a message received from the user and identifying its emotional state using natural language processing technology; means for generating a response for the virtual character based on the emotion analysis results and transmitting the generated response to the user device; and means for utilizing an AI model using prompt sentences to generate an emotion-appropriate response. This enables natural dialogue that is in line with the user's emotions.
[0406] An "information processing device" is a computer system that processes and analyzes data based on user input and generates responses or outputs.
[0407] A "user device" is a device operated by a user to communicate with an information processing device, and includes smartphones and tablets.
[0408] "Virtual character selection" refers to the act of selecting a digital personality or anthropomorphic character to interact with.
[0409] A "computer" is an electronic device equipped with a central processing unit for processing information and performing necessary calculations.
[0410] "External information resources" refer to databases and information provision services that exist outside the system and are used to retrieve information as needed.
[0411] "Latest information" refers to the most recent data obtained from external information resources that reflects the current situation and state.
[0412] "Natural language processing technology" is a technology that analyzes human language and uses computers to understand and process its meaning and emotions.
[0413] "Emotional state" is an evaluation that indicates the emotional state and psychological tendencies contained in the user's input text.
[0414] A "prompt sentence" is text that provides instructions or context for an AI model to generate a response.
[0415] An "AI model" is an artificial intelligence program that uses machine learning algorithms to learn patterns from data and generate responses.
[0416] The system of this invention is designed to enable natural communication based on the user's emotion recognition through interaction with a virtual character. This system consists mainly of a user terminal, a server, and an emotion engine.
[0417] The user selects a virtual character using a user device such as a smartphone or tablet and begins an interaction. The user device sends the character selection information to the server. From then on, the device acts as a relay that receives messages from the user and forwards them to the server.
[0418] The server is responsible for processing incoming user messages. An emotion engine built into the server analyzes the emotional state of the message using natural language processing techniques. The analyzed emotional state is then passed, along with specific prompts, to a generative AI model, which derives a response appropriate to the user's emotions. The AI model is built on advanced machine learning algorithms and may be accessed through a service such as the OpenAI API, enabling the virtual character to engage in empathetic and natural dialogue with the user.
[0419] As a concrete example of this invention, consider a case where a user sends the message "I'm tired." In this case, the emotion engine identifies "fatigue" as the emotional state and supplies the AI model with a prompt sentence based on it: "Please write a response in an appropriate tone based on the user's emotion {fatigue}," so that the model generates a gentle and encouraging response. In this way, the system provides an empathetic response according to the user's emotional state, offering an engaging conversational experience.
[0420] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0421] Step 1:
[0422] The user selects a virtual character using their user terminal and begins an interaction. The input at this time is the selection information for the virtual character, and the terminal sends this information to the server. This determines the context of the interaction with the character.
[0423] Step 2:
[0424] The user enters a message into the user terminal. This input is the user's message content, and the terminal forwards this message to the server. This prepares the foundation for the message to be parsed.
[0425] Step 3:
[0426] The server analyzes received user messages using an emotion engine. The input is the user's message, and the emotional state is identified using natural language processing techniques. This process outputs the user's emotions as states such as "joy" or "sadness."
[0427] Step 4:
[0428] The server generates prompt sentences based on the sentiment analysis results. The input is the analyzed sentiment state, and the prompt sentence is output by inserting the sentiment state into the prompt sentence template. This prepares appropriate input for the AI model.
[0429] Step 5:
[0430] The server inputs the prompt into the generative AI model to generate a response. The input is the generated prompt, and the AI model uses this prompt to generate a response that is appropriate to the user's emotions. This step provides the response content to be sent to the user.
[0431] Step 6:
[0432] The server sends the generated response to the user's terminal. The input is the response generated by the AI model, which is sent to the user's terminal as output. This process allows the user to receive emotionally empathetic responses from the virtual character.
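The template insertion in Step 4 can be sketched as follows. The template text follows the "fatigue" example given earlier; the placeholder name `emotion`, the set of known emotions, and the fallback to "neutral" are assumptions added for illustration.

```python
PROMPT_TEMPLATE = (
    "Please write a response in an appropriate tone "
    "based on the user's emotion {emotion}."
)

# Hypothetical set of labels the emotion engine is assumed to output.
KNOWN_EMOTIONS = {"joy", "sadness", "fatigue"}


def build_prompt(emotion: str) -> str:
    """Step 4: insert the identified emotional state into the template."""
    if emotion not in KNOWN_EMOTIONS:
        # Fall back to a neutral label rather than passing unknown text through.
        emotion = "neutral"
    return PROMPT_TEMPLATE.format(emotion=emotion)


print(build_prompt("fatigue"))
```

Restricting the inserted value to a fixed set of labels keeps arbitrary user text out of the instruction part of the prompt.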
[0433] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input in response to the result of the specific processing. The control unit 46A transmits the audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0434] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions are input to the data generation model 58, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0435] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0436] [Third Embodiment]
[0437] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0438] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0439] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0440] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0441] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0442] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0443] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0444] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0445] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0446] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0447] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0448] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0449] This invention provides users with a new value in their conversational experience by constructing an information processing system that enables users to interact with specific characters in real time. The system mainly consists of a user terminal, a server, and an external information source.
[0450] The devices used by users are digital devices such as smartphones and tablets, and they connect to the server using the LINE app. Through the LINE interface, users can select the character they want to interact with, and the selected character information is sent to the server via the device.
[0451] The server associates the received character selection data with the user profile and stores it in the system's database. It then accesses external sources to collect the latest data, such as news, technology information, and trends. The server analyzes the collected data using its proprietary natural language processing technology and adjusts the information to match the personality and tone of the chosen character.
[0452] When a user sends a message to a character via the LINE app, the device relays the message to a server. The server analyzes the message, identifying its content and purpose. Based on the analysis, an AI model generates the optimal response, adjusting it to incorporate language and tone appropriate for the character. This generated response is then sent from the server to the device and displayed within the LINE app.
[0453] For example, if a user asks, "What's a good movie out there lately?", the server can refer to the latest movie information from external sources and respond in the character's tone, "I recommend XYZ Movies as a recent popular title." In this way, users can enjoy natural conversations with the character they have chosen.
[0454] In this embodiment, users can enjoy a rich conversational experience by engaging in natural conversations with characters and being provided with constantly updated information.
[0455] The following describes the processing flow.
[0456] Step 1:
[0457] The user accesses the system through the LINE app and selects a specific character. The device receives the user's selection and sends that information to the server.
[0458] Step 2:
[0459] The server stores the received character selection information in the database along with the user's profile information. This establishes an association between the user and the character.
[0460] Step 3:
[0461] The server accesses external information sources to regularly retrieve the latest news, technology, fashion, and other information. The collected data is analyzed by the server and updated in relation to the characters.
[0462] Step 4:
[0463] The user sends a message to the character using the LINE app. The device then forwards this message to the server.
[0464] Step 5:
[0465] The server analyzes messages sent by users using natural language processing tools to identify the intent and content of the messages.
[0466] Step 6:
[0467] Based on the analyzed message content, the server uses an AI model to generate a response tailored to the character. The generated response is adjusted to match the character's tone and style.
[0468] Step 7:
[0469] The server sends the generated response to the device. The device displays the response on the LINE app, and the user checks the message from the character.
[0470] Step 8:
[0471] The user receives a response from the character and can continue the conversation. The system repeats this process, maintaining a continuous conversation with the user.
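The associations built in Steps 2 and 3 can be illustrated with a small relational sketch. The table and column names below are hypothetical, and an in-memory SQLite database stands in for the database 24; the source does not describe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE selections (user_id TEXT PRIMARY KEY, character_id TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE character_info ("
    "character_id TEXT, topic TEXT, content TEXT, "
    "PRIMARY KEY (character_id, topic))"
)


def save_selection(user_id: str, character_id: str) -> None:
    """Step 2: associate the user with the selected character."""
    conn.execute(
        "INSERT OR REPLACE INTO selections VALUES (?, ?)", (user_id, character_id)
    )


def update_info(character_id: str, topic: str, content: str) -> None:
    """Step 3: keep only the latest item per character and topic."""
    conn.execute(
        "INSERT OR REPLACE INTO character_info VALUES (?, ?, ?)",
        (character_id, topic, content),
    )


def latest_info(user_id: str, topic: str):
    """Look up the latest item for the character this user selected."""
    row = conn.execute(
        "SELECT i.content FROM selections s "
        "JOIN character_info i ON i.character_id = s.character_id "
        "WHERE s.user_id = ? AND i.topic = ?",
        (user_id, topic),
    ).fetchone()
    return row[0] if row else None


save_selection("user-1", "character-a")
update_info("character-a", "movies", "older title")
update_info("character-a", "movies", "XYZ Movies")
print(latest_info("user-1", "movies"))
```

Because each (character, topic) pair holds a single row, a regular refresh overwrites stale entries, and the response step always reads the most recent information for the user's selected character.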
[0472] (Example 1)
[0473] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0474] With the advancement of modern information and communication technology, there is a growing need for systems that allow users to obtain the latest information in real time while engaging in natural conversations with dialogue objects. However, existing systems often fail to deliver both up-to-date information and natural conversation, making it difficult to improve the user experience.
[0475] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0476] In this invention, the server includes means for receiving selection information from a communication device on which the user selects a dialogue object and transmitting it to a management device; means for collecting the latest data from an external information source and updating the information by associating that data with the dialogue object; means for analyzing the communication content received from the user and generating a response appropriate to the dialogue object based on that communication content; and means for adjusting the response using a generative AI model tailored to the characteristics of the dialogue object. As a result, the user can converse with the selected dialogue object naturally and in real time and can obtain the latest information immediately.
[0477] A "communication device" is a digital device used by a user to select a dialogue object and communicate with it, and includes smartphones and tablets.
[0478] A "management device" is a device that collects data from external information sources and generates responses based on selected information and analysis results from communication devices; a server falls into this category.
[0479] A "dialogue object" is a character that the user selects and engages in conversation with.
[0480] "External information sources" refer to information providers from which the management device collects data, such as news, trend information, and technology information.
[0481] A "generative AI model" is an artificial intelligence technology that generates the optimal response based on the communication content received from the user and adjusts the response to suit the dialogue object.
[0482] This invention is an information processing system that enables users to have real-time conversations with dialogue objects using a communication device. The communication device is a digital device such as a smartphone or tablet, and connects to the management device through a dialogue application. A server is used as the management device and performs the main processing, such as information collection, analysis, and response generation.
[0483] Users can select a dialogue object on the communication device through a dedicated interface (e.g., a messaging application). The selection information is then sent from the communication device to the server.
[0484] The server accesses external information sources and collects up-to-date data, including news, trends, and technical information, using APIs and web crawling technologies. This data is stored in a database on the server and analyzed using natural language processing techniques. The analyzed information is then adjusted to match the characteristics of the corresponding dialogue object.
[0485] When a user sends a message through the communication device, the device relays the message to the server. The server performs natural language analysis on the message, interprets its intent, and then uses a generative AI model to generate the optimal response. This response is adjusted to match the tone and style of the dialogue object.
[0486] For example, if a user asks, "What are some recent music events worth checking out?", the server can retrieve the latest information on music events from external sources and generate a response that takes into account the language used by the dialogue object. This would provide the user with a natural conversational experience, such as, "For the latest event information, we recommend the Y Music Festival held at location X."
[0487] For example, a prompt for the AI model might read, "When a user asks about recent music events, generate a response in the tone of the dialogue object based on external information." This prompt gives the AI model the basis for generating a response that is relevant to the user's question.
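One way such a prompt could be assembled from the dialogue object's tone, the collected external information, and the user's question is sketched below. The class `DialogueObject`, the function `build_prompt`, and the exact prompt wording are assumptions made for illustration; the disclosure does not prescribe a prompt format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogueObject:
    name: str
    tone: str  # short description of how the character speaks


def build_prompt(obj: DialogueObject, question: str, external_facts: list[str]) -> str:
    """Combine character tone, collected facts, and the user's question into one prompt."""
    # Fall back to an explicit marker when nothing was collected.
    facts = "\n".join(f"- {fact}" for fact in external_facts)
    if not facts:
        facts = "- (no external information available)"
    return (
        f"You are {obj.name}. Speak in this tone: {obj.tone}.\n"
        f"Use only the following external information:\n{facts}\n"
        f"User question: {question}\n"
        f"Answer in the tone of {obj.name}."
    )
```

Keeping the external information in a separate, clearly delimited section of the prompt is a design choice that makes it easier to see which facts the generated response was allowed to draw on.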
[0488] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0489] Step 1:
[0490] The user selects a dialogue object using a communication device. The communication device sends this selection information to the server as a data packet. The input is information about the dialogue object selected by the user, and the output is the data sent to the server. In this step, information is collected from the user interface.
[0491] Step 2:
[0492] The server associates the received selection information with the user's profile and stores it in the database. The input is the selection data from the communication device, and the output is the operation to save it to the database. Here, data organization is performed using the profile management function.
[0493] Step 3:
[0494] The server accesses external information sources and collects the necessary data. This includes news, trends, and technology information. The input is the requests sent to external APIs, and the output is the latest retrieved data. Data ingestion is achieved using API calls and web crawling technologies.
[0495] Step 4:
[0496] The server analyzes the collected data using natural language processing techniques and adjusts it to match the characteristics of the dialogue object. The input is data obtained from an external source, and the output is the adjusted data. This step involves analysis using an NLP algorithm.
[0497] Step 5:
[0498] A user sends a message using a communication device. The communication device relays this message to a server. The input is the user's message, and the output is the message transfer to the server. Here, user interaction is relayed.
[0499] Step 6:
[0500] The server analyzes the received message and understands its intent. The input is the user message, and the output is the analysis result. In this step, intent analysis is performed using machine learning techniques.
[0501] Step 7:
[0502] The server generates responses using a generative AI model based on the analysis results, and adjusts them to match the characteristics of the dialogue object. The input consists of the analyzed user message and necessary external data, and the output is the adjusted response. Instructions are given to the AI model using prompts to generate natural-sounding responses.
[0503] Step 8:
[0504] The server sends the generated response to the communication device, which then displays the response to the user. The input is the response data from the server, and the output is the display on the user interface. This step involves data transfer and display.
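Steps 3 and 4 above, collecting external data and associating it with each dialogue object, might look like the following sketch. The names `Item`, `ObjectKnowledge`, and `refresh` are hypothetical, the `fetch` callable stands in for the API calls and web crawling, and matching by category is a deliberately simple substitute for the natural language processing analysis described in Step 4.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Item:
    category: str  # e.g. "news", "trends", "technology"
    text: str


@dataclass
class ObjectKnowledge:
    """Latest information currently associated with one dialogue object."""
    interests: set[str]
    items: list[Item] = field(default_factory=list)


def refresh(knowledge: dict[str, ObjectKnowledge], fetch: Callable[[], list[Item]]) -> None:
    """Replace each dialogue object's items with the newly fetched items that match its interests."""
    latest = fetch()
    for entry in knowledge.values():
        entry.items = [item for item in latest if item.category in entry.interests]
```

Running `refresh` on a schedule would keep the association current; passing the fetcher in as a parameter lets the update logic be exercised without any network access.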
[0505] (Application Example 1)
[0506] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0507] In today's home environment, there is a need for information processing technology that allows users to naturally acquire the information they desire in their daily lives, and to obtain it through friendly interactions with specific characters. This in turn requires home robots that interact smoothly with users, making daily life more convenient and enriching and enabling effective communication.
[0508] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0509] In this invention, the server includes means for receiving character selection from a user device and transmitting it to an information processing device; means for acquiring the latest information from an external information source, adjusting the information based on the character's personality, and associating it with the character; means for analyzing a message received from the user, generating a response appropriate to the character based on its content, and outputting the generated response as voice; and means for providing the response through a home robot to support the user's daily life. This makes it possible for the user to acquire everyday information while engaging in natural conversation with a specific character.
[0510] An "information processing system" is a technological foundation that enables users to interact with specific characters, and it performs tasks such as message analysis and response generation.
[0511] A "user device" is a digital device used by a user to interact with a character, and includes smartphones and tablets.
[0512] An "information processing device" is a computer system that receives and processes information from a user device, and includes servers.
[0513] "External information sources" refer to databases and online services that provide the latest data on news, technology, trends, and more.
[0514] A "character" is a virtual person or animal with its own personality that the user interacts with, and is used as the object of that interaction.
[0515] "Analysis means" refers to the technical process performed to process messages received from users and understand their intent and content.
[0516] "Response generation" is the process of generating a response appropriate to the character based on the analyzed message.
[0517] A "home robot" is a robotic device placed in a home environment to interact with and assist users.
[0518] "Adjusting information based on individuality" is the process of transforming and adapting acquired information to match the speech patterns and behaviors of a specific character.
[0519] To realize this invention, the user terminal, information processing device (server), and home robot must all work in coordination. A smartphone or tablet is used as the user terminal, and a messaging application such as the LINE app is launched to interact with a specific character. When the user selects a character and sends a message, the terminal transfers the data to the information processing device.
[0520] The information processing device (server) analyzes messages received from the user using a natural language processing engine. Based on the analyzed data, and combined with the latest information obtained from external sources, a generative AI model generates a response appropriate to the character. The generated response is sent to a home robot in text format and output as speech using speech synthesis technology. For example, Google's Text-to-Speech service can be used.
[0521] The home robot receives responses from the information processing device and engages in natural conversation with the user. This allows users to obtain everyday information while enjoying conversations with the character simply by asking the home robot questions in their living room. For example, when a user asks, "What's the weather like today?", the robot might respond, "It's sunny today, and the temperature is 25 degrees. You'll need sunglasses."
[0522] As another concrete example, if a user asks about a particular hobby, the system could respond based on acquired external information, saying, "XYZ is currently a popular trend." As for the prompt, the information processing device can instruct the generative AI model as follows: "User question: What's the weather like today? Response to generate from the home robot: a response based on weather forecast API data, with added advice for going out."
[0523] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0524] Step 1:
[0525] The user launches the LINE app on their device, selects a specific character, and enters a message. The input consists of the user message and character selection data. This data is sent from the device to the server.
[0526] Step 2:
[0527] The server associates the character selection data received from the terminal with the user profile and stores it in a database. This records the character selected by each user, creating the foundation for providing personalized dialogue experiences.
[0528] Step 3:
[0529] The server accesses external information sources to retrieve the latest information related to the user's message. Input consists of data requests from external sources. Data processing involves extracting necessary information based on specific keywords and adjusting it to suit the character's personality.
[0530] Step 4:
[0531] Messages received from users are analyzed on the server using natural language processing. The input is the user message, and the data processing involves grammatical analysis and semantic interpretation. As a result, the message's intent is extracted.
[0532] Step 5:
[0533] Based on the analysis results and acquired external information, the server generates character-appropriate responses using a generative AI model. Given the generated prompt, the AI model outputs appropriate response sentences. This includes adjusting the tone and wording to match the character's personality.
[0534] Step 6:
[0535] The generated response is sent from the server to the terminal, and then forwarded to the home robot. The input is the response text from the server, and the output is audio data. The robot uses speech synthesis technology to convert the text into speech.
[0536] Step 7:
[0537] The home robot completes the interaction by providing an audible response to the user. Here, the user receives the voice output and obtains information relevant to the intent of their question.
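The hand-off in Steps 6 and 7, where the text response is forwarded to the home robot and converted into speech, can be sketched as below. All names here (`SpeechSynthesizer`, `FakeSynthesizer`, `HomeRobot`, `forward_to_robot`) are hypothetical. The fake synthesizer only encodes the text as bytes and is an assumption that stands in for a real text-to-speech service, whose API is not shown here.

```python
from typing import Protocol


class SpeechSynthesizer(Protocol):
    """Anything that can turn response text into audio data."""

    def synthesize(self, text: str) -> bytes: ...


class FakeSynthesizer:
    """Stand-in for a real text-to-speech service; it just encodes the text as UTF-8."""

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


class HomeRobot:
    def __init__(self, synthesizer: SpeechSynthesizer) -> None:
        self._synthesizer = synthesizer
        self.played: list[bytes] = []  # audio "played" so far, recorded for inspection

    def speak(self, response_text: str) -> bytes:
        """Convert the response text to audio data and output it."""
        audio = self._synthesizer.synthesize(response_text)
        self.played.append(audio)
        return audio


def forward_to_robot(robot: HomeRobot, response_text: str) -> bytes:
    """Terminal-side relay: pass the server's text response on to the robot."""
    return robot.speak(response_text)
```

Because the robot depends only on the `SpeechSynthesizer` interface, a concrete speech synthesis service could be substituted for `FakeSynthesizer` without changing the relay logic.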
[0538] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0539] This invention provides an information processing system that incorporates an emotion engine that recognizes the user's emotions when they interact with a specific character and adjusts the character's responses accordingly. This system consists of a user terminal, a server, and the emotion engine.
[0540] The user's device is a smartphone or tablet, and it connects to the server using the LINE app. The user selects a character they want to interact with through the LINE app, and the conversation with the character begins. When the user sends a message, the device forwards that message to the server.
[0541] When the server analyzes a user's message, it uses an emotion engine to recognize their emotions. The emotion engine utilizes natural language processing techniques to identify the user's emotional state based on the message's context and vocabulary. This emotion analysis influences the AI model that generates the character's response. The AI model selects appropriate tones and expressions to generate a response that is appropriate to the user's emotions.
[0542] The generated responses express emotions such as joy, sadness, and surprise, making them more relatable to the user. These responses are sent from the server to the device and displayed on the LINE app. In this way, users can continue interacting with the character while feeling an emotional connection.
[0543] For example, if a user sends a message saying, "Something really great happened today!", the server's emotion engine can recognize this emotion and have the character respond with something like, "That's wonderful! What happened?", sharing in the joy. This allows the user to enjoy a more intimate conversation with the character.
[0544] In this way, the present invention provides a dialogue system that adjusts its responses based on recognition of the user's emotions, enabling a more natural and enriching dialogue experience.
[0545] The following describes the processing flow.
[0546] Step 1:
[0547] The user accesses the system through the LINE app and selects a specific character they want to interact with. The device then sends this selection information to the server.
[0548] Step 2:
[0549] The server stores the selected character information in a database, associating it with the user's profile. This ensures that interactions with the character are properly managed in future conversations.
[0550] Step 3:
[0551] When a user sends a message to a character, the device forwards that message to the server. The message content is sent to the server as text data.
[0552] Step 4:
[0553] The server uses an emotion engine to analyze the messages sent by the user. Based on the message content, context, and vocabulary used, it identifies the user's emotional state.
[0554] Step 5:
[0555] Based on the analyzed emotional information, the server's AI model generates a character response. The emotional information is reflected in the tone and content of the response, resulting in a response that includes emotional resonance.
[0556] Step 6:
[0557] The generated response is sent from the server to the device and displayed to the user on the LINE app. The user receives an emotionally appropriate response from the character and continues the conversation.
[0558] Step 7:
[0559] The user can continue the conversation based on the response they receive. The server repeats this process, continuously updating the character's responses in response to new messages sent by the user.
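As a rough illustration of Steps 4 and 5, the sketch below estimates an emotion from the vocabulary of a message and maps it to a tone instruction. It is a deliberately simple keyword approach and not the emotion engine of the disclosure: the word lists, the `neutral` fallback, and the names `LEXICON`, `TONE`, `estimate_emotion`, and `tone_instruction` are all assumptions.

```python
import re
from collections import Counter

# Hypothetical vocabulary lists; a real emotion engine would be far richer
# and would also take the message context into account.
LEXICON = {
    "joy": {"great", "happy", "wonderful", "glad"},
    "sadness": {"sad", "lonely", "unhappy", "miss"},
    "surprise": {"wow", "unbelievable", "unexpected"},
}

# Hypothetical tone instructions that could be passed to the response generator.
TONE = {
    "joy": "share the user's joy",
    "sadness": "comfort the user gently",
    "surprise": "show interest and ask what happened",
    "neutral": "answer in the character's usual tone",
}


def estimate_emotion(message: str) -> str:
    """Return the emotion whose vocabulary appears most often, or "neutral" if none does."""
    words = re.findall(r"[a-z']+", message.lower())
    counts = Counter()
    for emotion, vocabulary in LEXICON.items():
        counts[emotion] = sum(1 for word in words if word in vocabulary)
    emotion, hits = counts.most_common(1)[0]
    return emotion if hits > 0 else "neutral"


def tone_instruction(message: str) -> str:
    """Map the estimated emotion to an instruction for the response generator."""
    return TONE[estimate_emotion(message)]
```

With these word lists, the message "Something really great happened today!" is classified as `joy`, which matches the example response in which the character shares in the user's joy.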
[0560] (Example 2)
[0561] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0562] Conventional information processing systems have the problem that interactions between users and characters are mechanical and that it is difficult for them to respond appropriately to the user's emotional state. As a result, it is difficult to provide a natural and rich dialogue experience.
[0563] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0564] In this invention, the server includes means for analyzing received messages and identifying emotional states using natural language processing techniques, means for generating adapted responses using generative models based on the identified emotional states, and means for transmitting the generated responses to the user terminal. This makes it possible to generate adjusted responses based on the user's emotions.
[0565] A "user terminal" is a communication device connected to an information processing device, and is a device used by users to input or receive information.
[0566] "Character selection" refers to the act of a user choosing a virtual entity with whom they will engage in dialogue.
[0567] A "data processing device" refers to a device that receives, transmits, analyzes, and generates responses to information via a communication network.
[0568] "Natural language processing technology" is a general term for technologies that enable computers to understand and process human language.
[0569] "Emotional state" refers to the emotional responses and tendencies of the user, and is the subject of analysis.
[0570] A "generative model" refers to a computational algorithm or program used to generate a response based on input information.
[0571] An "adapted response" refers to the result of a dialogue that has been appropriately adjusted based on the user's emotional state and the context of the conversation.
[0572] This invention provides an information processing system that recognizes a user's emotions and generates a corresponding response when the user engages in a natural conversation with a specific character. The system incorporates a user terminal, a server, and a generative AI model that performs emotion analysis.
[0573] The user terminal is a communication device such as a smartphone or tablet. It allows the user to connect to the server using an application such as the LINE app and begin interacting with the character. When the user sends a message, the terminal forwards that message to the server.
[0574] The server uses natural language processing techniques to perform sentiment analysis when processing received user messages. Specifically, it analyzes the context of the received message and uses its built-in software to identify the user's emotional state. The results of this sentiment analysis are used to influence the generative AI model and generate appropriately tailored responses.
[0575] For example, if a user sends a message saying, "Something really great happened today!", the server's sentiment analysis system recognizes the emotion "happy." Based on this, the AI model generates an empathetic response to the user's feelings, such as, "That's wonderful! What happened?" Prompts may be used in this response generation process, for example, "A user has reported some good news. Please think of a response that shares their joy."
[0576] The generated response is sent back to the user's device from the server and displayed on the LINE app screen. This allows users to engage in conversations with characters that reflect their actual emotions, enabling them to experience natural and rich communication.
[0577] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0578] Step 1:
[0579] Users launch the LINE app on a device such as a smartphone or tablet and select the character they want to interact with.
[0580] Input: User character selection
[0581] Operation: The user selects a character using the LINE app interface.
[0582] Output: Information about the selected character is sent from the terminal to the server.
[0583] Step 2:
[0584] The user types and sends a message to the selected character.
[0585] Input: User's message text
[0586] Operation: The user enters text into the message input field in the LINE app and taps the send button.
[0587] Output: The entered message text is processed on the terminal and forwarded to the server.
[0588] Step 3:
[0589] The server analyzes the user's message and performs sentiment recognition.
[0590] Input: Sent message text
[0591] Operation: The server uses natural language processing techniques to analyze messages and identify the user's emotional state from the message's context and vocabulary.
[0592] Output: The user's emotional state is determined, and the results of the emotion analysis are passed to the generative AI model.
[0593] Step 4:
[0594] The server generates character responses using a generative AI model based on the emotion analysis results.
[0595] Input: Sentiment analysis results
[0596] Operation: Based on the sentiment analysis results, the server inputs prompt sentences into the generative AI model, which then generates a response that takes into account appropriate tone and content.
[0597] Output: Response text generated using the prompt.
[0598] Step 5:
[0599] The server sends the generated response to the user's terminal.
[0600] Input: Response text
[0601] Operation: The server packages the generated response as a message and sends it to the user terminal via the communication protocol.
[0602] Output: The response message will be displayed on the user's LINE app.
[0603] Step 6:
[0604] The user receives a response generated within the LINE app and continues the conversation with the character.
[0605] Input: Response message from the server
[0606] Operation: The user reads the response displayed on the LINE app interface and decides whether to continue the conversation by typing a new message.
[0607] Output: Input of a new message or end of the interaction.
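Step 5 above says that the server packages the generated response as a message before sending it to the user terminal. The disclosure does not specify a wire format, so the sketch below simply assumes JSON encoded as UTF-8; the field names and the functions `package` and `unpack` are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ResponseMessage:
    user_id: str
    character: str
    emotion: str  # result of the emotion analysis in Step 3
    text: str     # response text generated in Step 4


def package(message: ResponseMessage) -> bytes:
    """Server side: serialize the response for transmission to the user terminal."""
    return json.dumps(asdict(message), ensure_ascii=False).encode("utf-8")


def unpack(payload: bytes) -> ResponseMessage:
    """Terminal side: restore the response message from the received bytes."""
    return ResponseMessage(**json.loads(payload.decode("utf-8")))
```

Setting `ensure_ascii=False` keeps non-ASCII text, such as Japanese responses, readable in the payload rather than escaping it.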
[0608] (Application Example 2)
[0609] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0610] In modern times, interaction with virtual characters is becoming commonplace, and there is a growing demand for richer communication experiences, particularly in the fields of entertainment and interaction. Existing systems struggle to adequately recognize the user's emotional state and generate empathetic responses accordingly. As a result, users may experience unnaturalness or low satisfaction. To address this challenge, a system is needed that recognizes the user's emotions and generates natural and empathetic responses.
[0611] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0612] In this invention, the server includes means for receiving a virtual character selection from a user device and transmitting it to a computer; means for obtaining the latest information from an external information resource and updating that information in association with the virtual character; means for analyzing a message received from the user and identifying its emotional state using natural language processing technology; means for generating a response for the virtual character based on the emotion analysis results and transmitting the generated response to the user device; and means for utilizing an AI model using prompt sentences to generate an emotion-appropriate response. This enables natural dialogue that is in line with the user's emotions.
[0613] An "information processing device" is a computer system that processes and analyzes data based on user input and generates responses or outputs.
[0614] A "user device" is a device operated by a user to communicate with an information processing device, and includes smartphones and tablets.
[0615] "Virtual character selection" refers to the act of selecting a digital personality or anthropomorphic character to interact with.
[0616] A "computer" is an electronic device equipped with a central processing unit for processing information and performing necessary calculations.
[0617] "External information resources" refer to databases and information provision services that exist outside the system and are used to retrieve information as needed.
[0618] "Latest information" refers to the most recent data obtained from external information resources that reflects the current situation and state.
[0619] "Natural language processing technology" is a technology that analyzes human language and uses computers to understand and process its meaning and emotions.
[0620] "Emotional state" is an evaluation that indicates the emotional state and psychological tendencies contained in the user's input text.
[0621] A "prompt message" is text that provides instructions or context for an AI model to generate a response.
[0622] An "AI model" is an artificial intelligence program that uses machine learning algorithms to learn patterns from data and generate responses.
[0623] The system of this invention is designed to enable natural communication with a virtual character based on recognition of the user's emotions. This system consists mainly of a user terminal, a server, and an emotion engine.
[0624] The user selects a virtual character using a user device such as a smartphone or tablet and begins an interaction. The user device sends the character selection information to the server. The device also acts as a relay that receives messages from the user and forwards them to the server.
[0625] The server is responsible for processing incoming user messages. An emotion engine built into the server analyzes the emotional state of the message using natural language processing techniques. The analyzed emotional state is then used by a generative AI model, along with specific prompts, to derive a response appropriate to the user's emotions. This AI model generates responses through a machine-learning-based service such as the OpenAI API, enabling the virtual character to provide empathetic and natural dialogue with the user.
[0626] As a concrete example of this invention, consider a case where a user sends the message "I'm tired." In this case, the emotion engine identifies "fatigue" as an emotional state and supplies the AI model with a prompt message based on it: "Please write a response in an appropriate tone based on the user's emotion {fatigue}," generating a gentle and encouraging response. In this way, the system provides an empathetic response according to the user's emotional state, offering an engaging conversational experience.
[0627] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0628] Step 1:
[0629] The user selects a virtual character using their user terminal and begins an interaction. The input at this time is the selection information for the virtual character, and the terminal sends this information to the server. This determines the context of the interaction with the character.
[0630] Step 2:
[0631] The user enters a message into the user terminal. This input is the user's message content, and the terminal forwards this message to the server. This prepares the foundation for the message to be parsed.
[0632] Step 3:
[0633] The server analyzes received user messages using an emotion engine. The input is the user's message, and the emotional state is identified using natural language processing techniques. This process outputs the user's emotions as states such as "joy" or "sadness."
[0634] Step 4:
[0635] The server generates prompt sentences based on the sentiment analysis results. The input is the analyzed sentiment state, and the prompt sentence is output by inserting the sentiment state into the prompt sentence template. This prepares appropriate input for the AI model.
[0636] Step 5:
[0637] The server inputs the prompt into the generative AI model to generate a response. The input is the generated prompt, and the AI model uses this prompt to generate a response that is appropriate to the user's emotions. This step provides the response content to be sent to the user.
[0638] Step 6:
[0639] The server sends the generated response to the user's terminal. The input is the response generated by the AI model, which is sent to the user's terminal as output. This process allows the user to receive emotionally empathetic responses from the virtual character.
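Step 4 above, inserting the identified emotional state into a prompt template, can be sketched as follows. The template text follows the example prompt given for this application example; the set `KNOWN_EMOTIONS`, the `neutral` fallback, and the function `build_emotion_prompt` are assumptions added for illustration.

```python
# Template taken from the example prompt; {emotion} is the slot to be filled.
PROMPT_TEMPLATE = (
    "Please write a response in an appropriate tone "
    "based on the user's emotion {emotion}."
)

# Hypothetical set of states the emotion engine may output.
KNOWN_EMOTIONS = {"joy", "sadness", "fatigue"}


def build_emotion_prompt(emotion: str, user_message: str) -> str:
    """Insert the emotional state into the template and append the user's message."""
    # Unknown states fall back to "neutral" so the template is always filled.
    state = emotion if emotion in KNOWN_EMOTIONS else "neutral"
    instruction = PROMPT_TEMPLATE.format(emotion=state)
    return f"{instruction}\nUser message: {user_message}"
```

For the message "I'm tired." with the state `fatigue`, the resulting prompt is the instruction sentence followed by the user's message on a second line, which is then passed to the generative AI model in Step 5.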
[0640] The specific processing unit 290 transmits the result of the specific processing to the headset-type terminal 314. In the headset-type terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0641] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions given in the prompt, and outputs the inference results in a data format such as audio data or text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0642] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0643] [Fourth Embodiment]
[0644] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0645] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0646] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0647] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0648] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. It converts the captured voice into audio data and outputs the audio data to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0649] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0650] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0651] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
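As a rough illustration of how emotions might be mapped onto the controlled object 443, the following Python sketch pairs each emotion label with an LED state and a gesture. The emotion labels, the `Expression` type, the LED state names, and the gesture names are all hypothetical; the disclosure does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical command set for the controlled object 443."""
    eye_led: str   # illumination state of the LEDs in the eyes
    gesture: str   # gesture produced by the arm, hand, and foot motors


# Assumed mapping; the labels and values are illustrative only.
EXPRESSIONS = {
    "joy": Expression(eye_led="bright_yellow", gesture="raise_both_arms"),
    "sadness": Expression(eye_led="dim_blue", gesture="lower_both_arms"),
    "surprise": Expression(eye_led="blinking_white", gesture="step_back"),
}
NEUTRAL = Expression(eye_led="steady_white", gesture="stand_still")


def express(emotion: str) -> Expression:
    """Return the expression for an emotion, falling back to neutral."""
    return EXPRESSIONS.get(emotion, NEUTRAL)
```

Keeping the mapping in a table rather than in branching logic would let new emotions be added without touching the lookup function.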
[0652] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0653] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0654] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0655] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0656] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0657] This invention provides users with a new value in their conversational experience by constructing an information processing system that enables users to interact with specific characters in real time. The system mainly consists of a user terminal, a server, and an external information source.
[0658] The devices used by users are digital devices such as smartphones and tablets, and they connect to the server using the LINE app. Through the LINE interface, users can select the character they want to interact with, and the selected character information is sent to the server via the device.
[0659] The server associates the received character selection data with the user profile and stores it in the system's database. It then accesses external sources to collect the latest data, such as news, technology information, and trends. The server analyzes the collected data using its proprietary natural language processing technology and adjusts the information to match the personality and tone of the chosen character.
[0660] When a user sends a message to a character via the LINE app, the device relays the message to a server. The server analyzes the message, identifying its content and purpose. Based on the analysis, an AI model generates the optimal response, adjusting it to incorporate language and tone appropriate for the character. This generated response is then sent from the server to the device and displayed within the LINE app.
[0661] For example, if a user asks, "What's a good movie out there lately?", the server can refer to the latest movie information from external sources and respond in the character's tone, "I recommend XYZ Movies as a recent popular title." In this way, users can enjoy natural conversations with the character they have chosen.
[0662] In this embodiment, users can enjoy a rich conversational experience by engaging in natural conversations with characters and being provided with constantly updated information.
[0663] The following describes the processing flow.
[0664] Step 1:
[0665] The user accesses the system through the LINE app and selects a specific character. The device receives the user's selection and sends that information to the server.
[0666] Step 2:
[0667] The server stores the received character selection information in the database along with the user's profile information. This establishes an association between the user and the character.
[0668] Step 3:
[0669] The server accesses external information sources to regularly retrieve the latest news, technology, fashion, and other information. The server analyzes the collected data and updates the information associated with each character.
[0670] Step 4:
[0671] The user sends a message to the character using the LINE app. The device then forwards this message to the server.
[0672] Step 5:
[0673] The server analyzes messages sent by users using natural language processing tools to identify the intent and content of the messages.
[0674] Step 6:
[0675] Based on the analyzed message content, the server uses an AI model to generate a response tailored to the character. The generated response is adjusted to match the character's tone and style.
[0676] Step 7:
[0677] The server sends the generated response to the device. The device displays the response on the LINE app, and the user checks the message from the character.
[0678] Step 8:
[0679] The user receives a response from the character and can continue the conversation. The system repeats this process, maintaining a continuous conversation with the user.
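The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the `Server` class, its method names, the prompt wording, and the `echo_model` placeholder are all hypothetical, and the AI model is injected as a plain function so that no particular model API is implied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Server:
    """Hypothetical server holding the state used in Steps 2, 3, and 6."""
    generate: Callable[[str], str]                                    # stand-in for the AI model
    selections: Dict[str, str] = field(default_factory=dict)          # user id -> character
    latest_info: Dict[str, List[str]] = field(default_factory=dict)   # character -> facts

    def select_character(self, user_id: str, character: str) -> None:
        # Step 2: associate the user with the selected character.
        self.selections[user_id] = character

    def update_info(self, character: str, facts: List[str]) -> None:
        # Step 3: replace the character's information with the latest data.
        self.latest_info[character] = list(facts)

    def handle_message(self, user_id: str, message: str) -> str:
        # Step 6: build a prompt and generate a reply.
        # The message analysis of Step 5 is omitted from this sketch.
        character = self.selections[user_id]
        facts = "; ".join(self.latest_info.get(character, []))
        prompt = (
            f"Character: {character}\n"
            f"Latest information: {facts}\n"
            f"User message: {message}\n"
            "Reply in the character's tone."
        )
        return self.generate(prompt)


def echo_model(prompt: str) -> str:
    """Placeholder model that only reports which character was requested."""
    character = prompt.splitlines()[0].split(": ", 1)[1]
    return f"[{character}] reply"
```

In use, a caller would construct `Server(generate=echo_model)`, call `select_character` and `update_info`, and then call `handle_message` once per incoming message, which corresponds to the repetition described in Step 8.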
[0680] (Example 1)
[0681] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0682] With the advancement of modern information and communication technology, there is a growing need for systems that allow users to obtain the latest information in real time while engaging in natural conversations with interactive objects. However, existing systems often fail to adequately ensure real-time information and natural conversation, making it difficult to improve the user experience.
[0683] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0684] In this invention, the server includes means for receiving selection information from a communication device that allows the user to select an interactive object and transmitting it to a management device; means for collecting the latest data from an external information source and updating the information by associating that data with the interactive object; means for analyzing the communication content received from the user and generating a response appropriate to the interactive object based on that communication content; and means for adjusting the response using a generative AI model tailored to the characteristics of the interactive object. As a result, the user can converse with the selected interactive object in real time and in a natural manner and instantly obtain the latest information.
[0685] A "communication device" is a digital device used by a user to select an interactive object and communicate, and includes smartphones and tablets.
[0686] A "management device" is a device that collects data from external information sources and generates responses based on selected information and analysis results from communication devices; a server falls into this category.
[0687] A "dialogue object" (also referred to as an "interactive object" in this description) is a character that the user selects and engages in conversation with.
[0688] "External information sources" refer to information providers from which the management device collects data, such as news, trend information, and technology information.
[0689] A "generative AI model" is an artificial intelligence technology that generates the optimal response based on the communication content received from the user and adjusts the response to suit the dialogue object.
[0690] This invention is an information processing system that enables users to have real-time conversations with interactive objects using a communication device. The communication device is a digital device such as a smartphone or tablet, and connects to a management device using an interactive application. A server is used as the management device and performs major processing such as information collection, analysis, and generation.
[0691] Users can use a communication device to select interactive objects through a dedicated interface (e.g., a messaging application). The selected information is then sent from the communication device to the server.
[0692] The server accesses external information sources and collects up-to-date data, including news, trends, and technical information, using APIs and web crawling technologies. This data is stored in a database on the server and analyzed using natural language processing techniques. The analyzed information is then tailored based on the characteristics of the corresponding dialogue object.
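One way to picture the collection and association described above is the following Python sketch. It assumes, purely for illustration, that each dialogue object is associated with data by topic; the `Item` and `StoredItem` types, the `refresh` function, and the topic names are hypothetical. The fetcher is passed in as a function so that the sketch does not depend on any specific API or crawler.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Item:
    topic: str   # e.g. "news", "trends", "technology"
    text: str


@dataclass(frozen=True)
class StoredItem:
    item: Item
    fetched_at: datetime   # when the item was collected


def refresh(
    fetch: Callable[[], Iterable[Item]],
    interests: Dict[str, List[str]],
    now: datetime,
) -> Dict[str, List[StoredItem]]:
    """Fetch items once and associate each with every object interested in its topic."""
    store: Dict[str, List[StoredItem]] = {name: [] for name in interests}
    for item in fetch():
        for name, topics in interests.items():
            if item.topic in topics:
                store[name].append(StoredItem(item, now))
    return store
```

Recording `fetched_at` alongside each item is a design choice of this sketch; it would let the server discard stale entries when the data is refreshed.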
[0693] When a user sends a message through a communication device, the terminal relays the message to a server. The server performs natural language analysis on the message, interprets its intent, and then uses a generative AI model to generate the optimal response. This response is adjusted to match the tone and style of the dialogue object.
[0694] For example, if a user asks, "What are some recent music events worth checking out?", the server can retrieve the latest information on music events from external sources and generate a response that takes into account the language used by the dialogue object. This would provide the user with a natural conversational experience, such as, "For the latest event information, we recommend the Y Music Festival held at location X."
[0695] For example, a prompt for the AI model might be something like, "When a user asks about recent music events, generate a response in the tone of the dialogue object based on external information." This prompt forms the basis for the AI model to generate an appropriate response that is relevant to the user's question.
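A prompt of this kind could be assembled from the dialogue object's attributes and the collected data, as in the following Python sketch. The function name, its parameters, and the exact prompt wording are assumptions made for illustration and do not appear in the disclosure.

```python
from typing import List


def build_prompt(object_name: str, tone: str, question: str, facts: List[str]) -> str:
    """Assemble a prompt that asks for an answer in the dialogue object's tone."""
    # Fall back to a placeholder line when no external information is available.
    fact_lines = "\n".join(f"- {fact}" for fact in facts) or "- (none available)"
    return (
        f"You are {object_name}. Speak in a {tone} tone.\n"
        "Answer the user's question using only the external information below.\n"
        f"External information:\n{fact_lines}\n"
        f"User question: {question}"
    )
```

Separating the tone, the external information, and the question into distinct parts of the prompt would make it easier to change one without affecting the others.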
[0696] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0697] Step 1:
[0698] The user selects an interactive object using a communication device. The communication device sends this selection information to the server as a data packet. The input is information about the interactive object selected by the user, and the output is the data sent to the server. In this step, information is collected from the user interface.
[0699] Step 2:
[0700] The server associates the received selection information with the user's profile and stores it in the database. The input is the selection data from the communication device, and the output is the operation to save it to the database. Here, data organization is performed using the profile management function.
[0701] Step 3:
[0702] The server accesses external information sources and collects the necessary data. This includes news, trends, and technology information. The input is a request to an external API, and the output is the latest retrieved data. Data ingestion is achieved using API calls and web crawling technologies.
[0703] Step 4:
[0704] The server analyzes the collected data using natural language processing techniques and adjusts it to match the characteristics of the dialogue object. The input is data obtained from an external source, and the output is the adjusted data. This step involves analysis using an NLP algorithm.
[0705] Step 5:
[0706] A user sends a message using a communication device. The communication device relays this message to a server. The input is the user's message, and the output is the message transfer to the server. Here, user interaction is relayed.
[0707] Step 6:
[0708] The server analyzes the received message and understands its intent. The input is the user message, and the output is the analysis result. In this step, intent analysis is performed using machine learning techniques.
[0709] Step 7:
[0710] The server generates responses using a generative AI model based on the analysis results, and adjusts them to match the characteristics of the dialogue object. The input consists of the analyzed user message and necessary external data, and the output is the adjusted response. Instructions are given to the AI model using prompts to generate natural-sounding responses.
[0711] Step 8:
[0712] The server sends the generated response to the communication device, which then displays the response to the user. The input is the response data from the server, and the output is the display on the user interface. This step involves data transfer and display.
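The input and output of Step 6 can be illustrated with the following Python sketch. Note that the text describes intent analysis by machine learning; this sketch deliberately substitutes a plain keyword rule, which is not the described technique, only to show the shape of the data passed to Step 7. The intent labels, the keyword table, and the `Analysis` type are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical intent labels and trigger keywords; a real system would learn these.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "ask_events": ("event", "festival", "concert"),
    "ask_news": ("news", "headline"),
}


@dataclass(frozen=True)
class Analysis:
    intent: str                 # label handed to response generation (Step 7)
    matched: Tuple[str, ...]    # keywords that triggered the label


def analyze(message: str) -> Analysis:
    """Return the first intent whose keywords appear in the message."""
    lowered = message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        matched = tuple(k for k in keywords if k in lowered)
        if matched:
            return Analysis(intent, matched)
    return Analysis("chitchat", ())
```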
[0713] (Application Example 1)
[0714] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0715] In today's home environment, there is a need for information processing technology that allows users to naturally acquire the information they desire in their daily lives through friendly interactions with specific characters. Furthermore, home robots are required to interact smoothly with users, making daily life more convenient and enriching and enabling effective communication.
[0716] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0717] In this invention, the server includes means for receiving character selection from a user device and transmitting it to an information processing device; means for acquiring the latest information from an external information source, adjusting the information based on its personality and associating it with the character; means for analyzing a message received from the user, generating a response appropriate to the character based on its content, and outputting the generated response as voice; and means for providing the response through a home robot to support the user's daily life. This makes it possible for the user to acquire everyday information while engaging in natural conversation with a specific character.
[0718] An "information processing system" is a technological foundation that enables users to interact with specific characters, and it performs tasks such as message analysis and response generation.
[0719] A "user device" is a digital device used by a user to interact with a character, and includes smartphones and tablets.
[0720] An "information processing device" is a computer system that receives and processes information from a user device, and includes servers.
[0721] "External information sources" refer to databases and online services that provide the latest data on news, technology, trends, and more.
[0722] A "character" is a virtual person or animal with its own personality that the user interacts with, and is used as the object of that interaction.
[0723] "Analysis means" refers to the technical process performed to process messages received from users and understand their intent and content.
[0724] "Response generation" is the process of generating a response appropriate to the character based on the analyzed message.
[0725] A "home robot" is a robotic device placed in a home environment to interact with and assist users.
[0726] "Adjusting information based on individuality" is the process of transforming and adapting acquired information to match the speech patterns and behaviors of a specific character.
[0727] To realize this invention, the user terminal, information processing device (server), and home robot must all work in coordination. A smartphone or tablet is used as the user terminal, and a messaging application such as the LINE app is launched to interact with a specific character. When the user selects a character and sends a message, the terminal transfers the data to the information processing device.
[0728] The information processing device (server) analyzes messages received from the user using a natural language processing engine. Based on the analyzed data, and combined with the latest information obtained from external sources, a generative AI model generates a response appropriate to the character. The generated response is sent to a home robot in text format and output as speech using speech synthesis technology. For example, Google's Text-to-Speech service can be used.
[0729] The home robot receives responses from the information processing device and engages in natural conversation with the user. This allows users to obtain everyday information while enjoying conversations with the character simply by asking the home robot questions in their living room. For example, when a user asks, "What's the weather like today?", the robot might respond, "It's sunny today, and the temperature is 25 degrees. You'll need sunglasses."
[0730] In a concrete example, if a user asks about a particular hobby, the system could respond based on acquired external information, saying, "XYZ is currently a popular trend." In the prompt, the information processing device can instruct the generative AI model in the following way: "User question: What's the weather like today? Generate response from smart robot: Generate a response based on weather forecast API data and add advice for going out."
[0731] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0732] Step 1:
[0733] The user launches the LINE app on their device, selects a specific character, and enters a message. The input consists of the user message and character selection data. This data is sent from the device to the server.
[0734] Step 2:
[0735] The server associates the character selection data received from the terminal with the user profile and stores it in a database. This records the character selected by each user, creating the foundation for providing personalized dialogue experiences.
[0736] Step 3:
[0737] The server accesses external information sources to retrieve the latest information related to the user's message. The input is a data request to an external source. Data processing involves extracting necessary information based on specific keywords and adjusting it to suit the character's personality.
[0738] Step 4:
[0739] Messages received from users are analyzed on the server using natural language processing. The input is the user message, and the data processing involves grammatical analysis and semantic interpretation. As a result, the message's intent is extracted.
[0740] Step 5:
[0741] Based on the analysis results and acquired external information, the server generates character-appropriate responses using a generative AI model. Given a prompt, the AI model outputs appropriate response sentences. This includes adjusting the tone and wording to match the character's personality.
[0742] Step 6:
[0743] The generated response is sent from the server to the terminal, and then forwarded to the home robot. The input is the response text from the server, and the output is audio data. The robot uses speech synthesis technology to convert the text into speech.
[0744] Step 7:
[0745] The home robot completes the interaction by providing an audible response to the user. Here, the user receives the voice output and obtains information relevant to the intent of their question.
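Steps 6 and 7 above, in which the response text travels from the server through the terminal to the home robot and is converted to speech, might be sketched in Python as follows. The `HomeRobot` and `Terminal` classes and the `fake_tts` placeholder are hypothetical. The synthesizer is injected as a function, so no particular speech synthesis service or its API is assumed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class HomeRobot:
    """Hypothetical robot that turns response text into audio (Steps 6 and 7)."""
    synthesize: Callable[[str], bytes]   # stand-in for a speech synthesis service

    def speak(self, text: str) -> bytes:
        # Convert the response text into audio data.
        return self.synthesize(text)


@dataclass
class Terminal:
    """Hypothetical terminal that forwards server responses to the robot."""
    robot: HomeRobot

    def on_response(self, text: str) -> bytes:
        # Forward the response text received from the server to the robot.
        return self.robot.speak(text)


def fake_tts(text: str) -> bytes:
    """Placeholder synthesizer: returns UTF-8 bytes instead of real audio."""
    return text.encode("utf-8")
```

A real deployment would replace `fake_tts` with a call to whichever speech synthesis service is chosen; the rest of the sketch would stay unchanged.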
[0746] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0747] This invention provides an information processing system that incorporates an emotion engine that recognizes the user's emotions when they interact with a specific character and adjusts the character's responses accordingly. This system consists of a user terminal, a server, and the emotion engine.
[0748] The user's device is a smartphone or tablet, and it connects to the server using the LINE app. The user selects a character they want to interact with through the LINE app, and the conversation with the character begins. When the user sends a message, the device forwards that message to the server.
[0749] When the server analyzes a user's message, it uses an emotion engine to recognize their emotions. The emotion engine utilizes natural language processing techniques to identify the user's emotional state based on the message's context and vocabulary. This emotion analysis influences the AI model that generates the character's response. The AI model selects appropriate tones and expressions to generate a response that is appropriate to the user's emotions.
[0750] The generated responses express emotions such as joy, sadness, and surprise, making them more relatable to the user. These responses are sent from the server to the device and displayed on the LINE app. In this way, users can continue interacting with the character while feeling an emotional connection.
[0751] For example, if a user sends a message saying, "Something really great happened today!", the server's emotion engine can recognize this emotion and have the character respond with something like, "That's wonderful! What happened?", sharing in the joy. This allows the user to enjoy a more intimate conversation with the character.
[0752] In this way, the present invention provides a dialogue system that includes adjustments based on the user's emotion recognition, enabling a more natural and enriching dialogue experience.
[0753] The following describes the processing flow.
[0754] Step 1:
[0755] The user accesses the system through the LINE app and selects a specific character they want to interact with. The device then sends this selection information to the server.
[0756] Step 2:
[0757] The server stores the selected character information in a database, associating it with the user's profile. This ensures that interactions with the character are properly managed in future conversations.
[0758] Step 3:
[0759] When a user sends a message to a character, the device forwards that message to the server. The message content is sent to the server as text data.
[0760] Step 4:
[0761] The server uses an emotion engine to analyze the messages sent by the user. Based on the message content, context, and vocabulary used, it identifies the user's emotional state.
[0762] Step 5:
[0763] Based on the analyzed emotional information, the server's AI model generates a character response. The emotional information is reflected in the tone and content of the response, resulting in a response that includes emotional resonance.
[0764] Step 6:
[0765] The generated response is sent from the server to the device and displayed to the user on the LINE app. The user receives an emotionally appropriate response from the character and continues the conversation.
[0766] Step 7:
[0767] The user can continue the conversation based on the response they receive. The server repeats this process, continuously updating the character's responses in response to new messages sent by the user.
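Step 4 above can be pictured with the following Python sketch. It is a simple word-lexicon stand-in and is not the emotion identification model 59, whose internals are not specified in this description. The lexicon entries, the emotion labels, and the function name are hypothetical.

```python
from collections import Counter
import re

# Hypothetical lexicon mapping words to emotion labels; illustrative only.
LEXICON = {
    "great": "joy", "wonderful": "joy", "happy": "joy",
    "sad": "sadness", "lonely": "sadness",
    "wow": "surprise", "unbelievable": "surprise",
}


def estimate_emotion(message: str) -> str:
    """Return the most frequent lexicon emotion in the message, or 'neutral'."""
    words = re.findall(r"[a-z']+", message.lower())
    counts = Counter(LEXICON[w] for w in words if w in LEXICON)
    if not counts:
        return "neutral"
    return counts.most_common(1)[0][0]
```

The returned label would then be passed to the AI model in Step 5 so that the tone and content of the response reflect it.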
[0768] (Example 2)
[0769] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0770] Conventional information processing systems have the problem that user-character interactions are mechanical and that it is difficult for the system to respond appropriately to the user's emotional state. As a result, it is difficult to provide a natural and rich dialogue experience.
[0771] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0772] In this invention, the server includes means for analyzing received messages and identifying emotional states using natural language processing techniques, means for generating adapted responses using generative models based on the identified emotional states, and means for transmitting the generated responses to the user terminal. This makes it possible to generate adjusted responses based on the user's emotions.
[0773] A "user terminal" is a communication device connected to an information processing device, and is a device used by users to input or receive information.
[0774] "Character selection" refers to the act of a user choosing a virtual entity with whom they will engage in dialogue.
[0775] A "data processing device" refers to a device that receives and transmits information via a communication network, analyzes that information, and generates responses to it.
[0776] "Natural language processing technology" is a general term for technologies that enable computers to understand and process human language.
[0777] "Emotional state" refers to the emotional responses and tendencies of the user, and is the subject of analysis.
[0778] A "generative model" refers to a computational algorithm or program used to generate a response based on input information.
[0779] An "adapted response" refers to the result of a dialogue that has been appropriately adjusted based on the user's emotional state and the context of the conversation.
[0780] This invention provides an information processing system that recognizes a user's emotions and generates a corresponding response when the user engages in a natural conversation with a specific character. The system incorporates a user terminal, a server, and a generative AI model that performs emotion analysis.
[0781] The user terminal will be a communication device such as a smartphone or tablet. This allows the user to connect to the server using an application such as the LINE app and begin interacting with the character. When the user sends a message, the terminal forwards that message to the server.
[0782] The server uses natural language processing techniques to perform sentiment analysis when processing received user messages. Specifically, it analyzes the context of the received message and uses its built-in software to identify the user's emotional state. The results of this sentiment analysis are used to influence the generative AI model and generate appropriately tailored responses.
[0783] For example, if a user sends a message saying, "Something really great happened today!", the server's sentiment analysis system recognizes the emotion "happy." Based on this, the AI model generates an empathetic response to the user's feelings, such as, "That's wonderful! What happened?" Prompts may be used in this response generation process, for example, "A user has reported some good news. Please think of a response that shares their joy."
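The way a recognized emotion could select the prompt might be sketched in Python as follows. Only the "happy" instruction is taken from the example in the text; the "sad" entry, the default instruction, and the function name are assumptions added for illustration.

```python
# Instruction per emotion label. The "happy" wording follows the example prompt
# in the text; the other entries are hypothetical.
EMOTION_INSTRUCTIONS = {
    "happy": "A user has reported some good news. Please think of a response that shares their joy.",
    "sad": "A user seems to be feeling down. Please think of a response that comforts them.",
}
DEFAULT_INSTRUCTION = "Please think of a friendly response to the user."


def emotion_prompt(emotion: str, message: str) -> str:
    """Combine the instruction for the recognized emotion with the user message."""
    instruction = EMOTION_INSTRUCTIONS.get(emotion, DEFAULT_INSTRUCTION)
    return f"{instruction}\nUser message: {message}"
```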
[0784] The generated response is sent back to the user's device from the server and displayed on the LINE app screen. This allows users to engage in conversations with characters that reflect their actual emotions, enabling them to experience natural and rich communication.
[0785] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0786] Step 1:
[0787] Users launch the LINE app on a device such as a smartphone or tablet and select the character they want to interact with.
[0788] Input: User character selection
[0789] Operation: The user selects a character using the LINE app interface.
[0790] Output: Information about the selected character is sent from the terminal to the server.
[0791] Step 2:
[0792] The user types and sends a message to the selected character.
[0793] Input: User's message text
[0794] Action: The user enters text into the message input field in the LINE app and taps the send button.
[0795] Output: The entered message text is processed on the terminal and forwarded to the server.
[0796] Step 3:
[0797] The server analyzes the user's message and performs sentiment recognition.
[0798] Input: Sent message text
[0799] Operation: The server uses natural language processing techniques to analyze messages and identify the user's emotional state from the message's context and vocabulary.
[0800] Output: The user's emotional state is determined, and the results of the emotion analysis are passed to the generative AI model.
[0801] Step 4:
[0802] The server generates character responses using a generative AI model based on the emotion analysis results.
[0803] Input: Sentiment analysis results
[0804] Operation: Based on the sentiment analysis results, the server inputs prompt sentences into the generative AI model, which then generates a response that takes into account appropriate tone and content.
[0805] Output: Response text generated using the prompt.
[0806] Step 5:
[0807] The server sends the generated response to the user's terminal.
[0808] Input: Response text
[0809] Operation: The server packages the generated response as a message and sends it to the user terminal via the communication protocol.
[0810] Output: The response message will be displayed on the user's LINE app.
[0811] Step 6:
[0812] The user receives a response generated within the LINE app and continues the conversation with the character.
[0813] Input: Response message from the server
[0814] Operation: The user reads the response displayed on the LINE app interface and decides whether to continue the conversation by typing a new message.
[0815] Output: Input of a new message or end of the interaction.
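The repetition of Steps 2 through 6 can be sketched in Python as a loop over incoming messages. The function name and the two injected callables are hypothetical: `recognize` stands in for the emotion analysis of Step 3 and `generate` for the generative AI model of Step 4, so that no specific model or library is assumed.

```python
from typing import Callable, Iterable, List, Tuple


def run_conversation(
    user_messages: Iterable[str],
    recognize: Callable[[str], str],
    generate: Callable[[str, str], str],
) -> List[Tuple[str, str, str]]:
    """Process messages in order and return (message, emotion, response) triples."""
    transcript: List[Tuple[str, str, str]] = []
    for message in user_messages:                 # Step 2: the user sends a message
        emotion = recognize(message)              # Step 3: emotion recognition
        response = generate(emotion, message)     # Step 4: response generation
        transcript.append((message, emotion, response))  # Step 5: response returned
    return transcript                             # Step 6: ends when no new message arrives
```

Passing the recognizer and generator as parameters keeps the loop independent of how either is implemented, which makes the flow easy to exercise with simple placeholders.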
[0816] (Application Example 2)
[0817] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0818] In modern times, interaction with virtual characters is becoming commonplace, and there is a growing demand for richer communication experiences, particularly in the fields of entertainment and interaction. Existing systems struggle to adequately recognize the user's emotional state and generate empathetic responses accordingly. As a result, users may experience unnaturalness or low satisfaction. To address this challenge, a system is needed that recognizes the user's emotions and generates natural and empathetic responses.
[0819] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0820] In this invention, the server includes means for receiving a virtual character selection from a user device and transmitting it to a computer; means for obtaining the latest information from an external information resource and updating that information in association with the virtual character; means for analyzing a message received from the user and identifying its emotional state using natural language processing technology; means for generating a response for the virtual character based on the emotion analysis results and transmitting the generated response to the user device; and means for utilizing an AI model using prompt sentences to generate an emotion-appropriate response. This enables natural dialogue that is in line with the user's emotions.
[0821] An "information processing device" is a computer system that processes and analyzes data based on user input and generates responses or outputs.
[0822] A "user device" is a device operated by a user to communicate with an information processing device, and includes smartphones and tablets.
[0823] "Virtual character selection" refers to the act of selecting a digital personality or anthropomorphic character to interact with.
[0824] A "computer" is an electronic device equipped with a central processing unit for processing information and performing necessary calculations.
[0825] "External information resources" refer to databases and information provision services that exist outside the system and are used to retrieve information as needed.
[0826] "Latest information" refers to the most recent data obtained from external information resources that reflects the current situation and state.
[0827] "Natural language processing technology" is a technology that analyzes human language and uses computers to understand and process its meaning and emotions.
[0828] "Emotional state" is an evaluation that indicates the emotional state and psychological tendencies contained in the user's input text.
[0829] A "prompt message" is text that provides instructions or context for an AI model to generate a response.
[0830] An "AI model" is an artificial intelligence program that uses machine learning algorithms to learn patterns from data and generate responses.
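The means for obtaining the latest information and updating it in association with the virtual character can be pictured as a per-character record that is refreshed from an external source. The sketch below is one possible reading under stated assumptions: `CharacterProfile`, `CharacterStore`, and the injected `fetch` callable are hypothetical names, and the external information resource is represented only by a function that returns a list of strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CharacterProfile:
    """Hypothetical record tying a virtual character to its latest information."""
    name: str
    latest_info: List[str] = field(default_factory=list)


class CharacterStore:
    def __init__(self, fetch: Callable[[str], List[str]]) -> None:
        # `fetch` stands in for an external information resource; it takes a
        # character name and returns a list of recent items.
        self._fetch = fetch
        self._profiles: Dict[str, CharacterProfile] = {}

    def select(self, name: str) -> CharacterProfile:
        """Register the character chosen on the user device."""
        return self._profiles.setdefault(name, CharacterProfile(name))

    def refresh(self, name: str) -> CharacterProfile:
        """Replace the stored information with the newest fetched items."""
        profile = self.select(name)
        profile.latest_info = list(self._fetch(name))
        return profile
```

Passing the fetcher in as a parameter keeps the store independent of any particular database or information service.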
[0831] The system of this invention is designed to enable natural communication with a virtual character based on recognition of the user's emotions. The system consists mainly of a user terminal, a server, and an emotion engine.
[0832] The user selects a virtual character using a user device such as a smartphone or tablet and begins an interaction. The user device sends the character selection information to the server. The device then acts as a relay that receives messages from the user and forwards them to the server.
[0833] The server is responsible for processing incoming user messages. An emotion engine built into the server analyzes the emotional state of the message using natural language processing techniques. The identified emotional state is then passed, together with a specific prompt, to a generative AI model to derive a response appropriate to the user's emotions. The AI model is a machine-learning-based model accessed, for example, through a service such as the OpenAI API, and it enables the virtual character to engage in empathetic and natural dialogue with the user.
[0834] As a concrete example of this invention, consider a case where a user sends the message "I'm tired." In this case, the emotion engine identifies "fatigue" as an emotional state and supplies the AI model with a prompt message based on it: "Please write a response in an appropriate tone based on the user's emotion {fatigue}," generating a gentle and encouraging response. In this way, the system provides an empathetic response according to the user's emotional state, offering an engaging conversational experience.
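The "I'm tired" example can be sketched as a template with a slot for the identified emotion. This is a minimal illustration, assuming a keyword table in place of the emotion engine's natural language processing; `PROMPT_TEMPLATE`, `KEYWORD_TO_EMOTION`, `identify_emotion`, and `build_prompt` are hypothetical names, and only the template wording is taken from the text.

```python
from typing import Dict

# Template wording follows the example in the text; {emotion} is the slot
# that receives the identified emotional state.
PROMPT_TEMPLATE = (
    "Please write a response in an appropriate tone "
    "based on the user's emotion {emotion}."
)

# Hypothetical keyword table; a real emotion engine would use NLP instead.
KEYWORD_TO_EMOTION: Dict[str, str] = {
    "tired": "fatigue",
    "glad": "joy",
    "lonely": "sadness",
}


def identify_emotion(message: str) -> str:
    """Return the first matching emotion label, or "neutral" if none match."""
    lowered = message.lower()
    for keyword, emotion in KEYWORD_TO_EMOTION.items():
        if keyword in lowered:
            return emotion
    return "neutral"


def build_prompt(message: str) -> str:
    """Insert the identified emotion into the prompt template."""
    return PROMPT_TEMPLATE.format(emotion=identify_emotion(message))


print(build_prompt("I'm tired."))
# prints: Please write a response in an appropriate tone based on the user's emotion fatigue.
```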
[0835] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0836] Step 1:
[0837] The user selects a virtual character using their user terminal and begins an interaction. The input at this time is the selection information for the virtual character, and the terminal sends this information to the server. This determines the context of the interaction with the character.
[0838] Step 2:
[0839] The user enters a message into the user terminal. The input is the content of the user's message, and the terminal forwards the message to the server, where it becomes available for analysis.
[0840] Step 3:
[0841] The server analyzes received user messages using an emotion engine. The input is the user's message, and the emotional state is identified using natural language processing techniques. This process outputs the user's emotions as states such as "joy" or "sadness."
[0842] Step 4:
[0843] The server generates a prompt sentence based on the emotion analysis results. The input is the identified emotional state, and the server outputs the prompt sentence by inserting that state into a prompt sentence template. This prepares appropriate input for the AI model.
[0844] Step 5:
[0845] The server inputs the prompt into the generative AI model to generate a response. The input is the generated prompt, and the AI model uses this prompt to generate a response that is appropriate to the user's emotions. This step provides the response content to be sent to the user.
[0846] Step 6:
[0847] The server sends the generated response to the user's terminal. The input is the response generated by the AI model, which is sent to the user's terminal as output. This process allows the user to receive emotionally empathetic responses from the virtual character.
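Steps 1 to 6 can be summarized as one server-side function that chains the emotion engine and the generative AI model. The sketch below is an assumption-laden outline rather than the disclosed implementation: `Session`, `handle_message`, and the wording of the prompt are hypothetical, and both the emotion engine and the model are passed in as plain callables so that the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Session:
    """Step 1: the selected character fixes the context of the interaction."""
    user_id: str
    character: str


def handle_message(
    session: Session,
    message: str,
    emotion_engine: Callable[[str], str],
    model: Callable[[str], str],
) -> str:
    """Steps 3 to 6 on the server for one message received in Step 2."""
    emotion = emotion_engine(message)                      # Step 3
    prompt = (                                             # Step 4
        f"You are {session.character}. The user's emotion is {emotion}. "
        f"Reply in a fitting tone to: {message}"
    )
    response = model(prompt)                               # Step 5
    return response                                        # Step 6: sent back


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any external service.
    def toy_engine(text: str) -> str:
        return "sadness" if "sad" in text.lower() else "joy"

    def toy_model(prompt: str) -> str:
        return f"(generated from: {prompt})"

    print(handle_message(Session("user-1", "Hana"), "I feel sad", toy_engine, toy_model))
```

Injecting the two callables mirrors the separation in the text between the emotion engine and the AI model, and lets either one be replaced independently.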
[0848] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0849] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in a data format such as audio data or text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0850] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0851] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0852] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0853] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0854] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further an emotion lies toward the outside of the emotion map 400, the more visible it becomes (that is, the more it is expressed in actions).
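The layout of the emotion map 400 described above can be read as a two-dimensional coordinate system. The sketch below is only an illustration of that reading: `MapPosition`, `describe`, the unit outer radius, and the 0.5 boundary between inner and outer layers are all assumptions, and the actual positions of individual emotions in Figure 9 are not reproduced.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MapPosition:
    """Point on the emotion map; x grows to the right, y grows upward."""
    x: float
    y: float

    @property
    def radius(self) -> float:
        """Distance from the center of the concentric circles."""
        return math.hypot(self.x, self.y)


def describe(pos: MapPosition, outer_radius: float = 1.0) -> dict:
    """Read off the qualitative regions named in the text.

    The 0.5 split between "inner" and "outer" is an arbitrary assumption.
    """
    return {
        # Right half: situational judgment. Left half: reactions in the brain.
        "side": "situation" if pos.x > 0 else "reaction" if pos.x < 0 else "center",
        # Upper half: pleasure. Lower half: displeasure.
        "valence": "pleasure" if pos.y > 0 else "displeasure" if pos.y < 0 else "neutral",
        # Inside: inner thoughts. Outside: emotions expressed in actions.
        "layer": "inner thought" if pos.radius < 0.5 * outer_radius else "expressed action",
    }
```

For example, a point near the 3 o'clock position such as `MapPosition(0.8, 0.1)` falls on the situation side, slightly on the pleasure side, in the outer layer.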
[0855] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
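The idea that moving toward an ideal balance produces pleasure and moving away produces discomfort can be sketched as a comparison of deviations. The text does not say how deviation is measured, so the sum of absolute differences used below is an assumption, as are the function name `comfort_change` and the example quantities.

```python
from typing import Dict


def comfort_change(
    previous: Dict[str, float],
    current: Dict[str, float],
    ideal: Dict[str, float],
) -> str:
    """Compare total distance from the ideal before and after a change."""
    def deviation(state: Dict[str, float]) -> float:
        # Assumed measure: sum of absolute differences from the ideal values.
        return sum(abs(state[key] - ideal[key]) for key in ideal)

    before, after = deviation(previous), deviation(current)
    if after < before:
        return "pleasure"      # balances moved toward the ideal
    if after > before:
        return "discomfort"    # balances moved away from the ideal
    return "unchanged"


# Hypothetical robot state: battery level (1.0 is full) and body tilt.
ideal = {"battery": 1.0, "tilt": 0.0}
print(comfort_change({"battery": 0.5, "tilt": 0.0},
                     {"battery": 0.25, "tilt": 0.0}, ideal))
# prints: discomfort
```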
[0856] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0857] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
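Once the neural network has produced an emotion value for each emotion, determining the user's emotion amounts to reading off the strongest value, with neighbouring emotions scoring close to it. The sketch below covers only that final step and assumes the values are already available: the function `determine_emotion`, the `tolerance` parameter, and the numeric scale are hypothetical, and no trained network is included.

```python
from typing import Dict, List, Tuple


def determine_emotion(
    values: Dict[str, float], tolerance: float = 0.5
) -> Tuple[str, List[str]]:
    """Return the strongest emotion and others within `tolerance` of it."""
    if not values:
        raise ValueError("no emotion values supplied")
    top = max(values, key=values.get)
    near = sorted(
        name for name, score in values.items()
        if name != top and values[top] - score <= tolerance
    )
    return top, near


# Illustrative values only; the numeric scale is an assumption.
example = {"reassured": 3.0, "calm": 2.75, "confident": 2.5, "anxious": 0.5}
print(determine_emotion(example))
# prints: ('reassured', ['calm', 'confident'])
```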
[0858] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0859] In the above embodiment, an example was given in which the specific processing is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and the specific processing may be distributed across multiple computers, including the computer 22. For example, the data generation model 58 may be provided in a device external to the data processing device 12, and the external device may generate data according to the input data.
[0860] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.
[0861] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0862] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0863] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.
[0864] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0865] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0866] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0867] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0868] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0869] The following is further disclosed regarding the embodiments described above.
[0870] (Claim 1)
[0871] An information processing system that enables conversation with a specific character,
[0872] A means of receiving character selections from the user terminal and sending them to the server,
[0873] A means of obtaining the latest information from external sources and updating that information in relation to the character,
[0874] A means for analyzing messages received from users and generating character-appropriate responses based on their content,
[0875] A means for sending the generated response to the user terminal,
[0876] A system that includes this.
[0877] (Claim 2)
[0878] The system according to claim 1, which selectively uses information obtained from external sources to generate responses based on the intent of the analyzed user message.
[0879] (Claim 3)
[0880] The system according to claim 1, which selects the optimal model from among multiple AI models and generates a response based on the character selection and the results of user message analysis.
[0881] "Example 1"
[0882] (Claim 1)
[0883] A means for receiving selection information from a communication device that allows the user to select an interactive object, and transmitting it to a management device,
[0884] A means of collecting the latest data from external sources and updating information by associating that data with interactive objects,
[0885] A means for analyzing the communication content received from the user and generating a response appropriate to the dialogue object based on that communication content,
[0886] Means for transmitting the generated response to a communication device,
[0887] A means of adjusting responses using a generative AI model tailored to the characteristics of the interactive object,
[0888] A system that includes this.
[0889] (Claim 2)
[0890] The system according to claim 1, which selectively uses data collected from external information sources to generate a response based on the intent of the analyzed communication content.
[0891] (Claim 3)
[0892] The system according to claim 1, which selects the optimal model from among multiple generative AI models and generates a response based on the selection of an interactive object and the results of analyzing the communication content.
[0893] "Application Example 1"
[0894] (Claim 1)
[0895] An information processing system that enables conversation with a specific character,
[0896] A means for receiving character selection from a user device and transmitting it to an information processing device,
[0897] A means of obtaining the latest information from external sources, adjusting the information based on individual characteristics, and associating it with characters,
[0898] A means for analyzing a message received from a user, generating a character-appropriate response based on its content, and outputting the generated response as audio,
[0899] A means of providing responses through home robots and supporting the user's daily life,
[0900] A system that includes this.
[0901] (Claim 2)
[0902] The system according to claim 1, wherein a response is generated using speech synthesis technology based on the intent of an analyzed user message and output from a home robot.
[0903] (Claim 3)
[0904] The system according to claim 1, which selects the optimal model from among multiple generative AI models based on character selection and user message analysis results, and adjusts user interaction in a home robot.
[0905] "Example 2 of combining an emotion engine"
[0906] (Claim 1)
[0907] A means for receiving character selection from a user terminal and transmitting it to a data processing device,
[0908] A means of analyzing received messages and identifying emotional states using natural language processing techniques,
[0909] A means for generating an adapted response using a generative model based on an identified emotional state,
[0910] A means for sending the generated response to the user terminal,
[0911] A system that includes this.
[0912] (Claim 2)
[0913] The system according to claim 1, which adjusts responses based on the emotional state of analyzed user messages to make interactions with users more natural.
[0914] (Claim 3)
[0915] The system according to claim 1, which selects an appropriate generative model and generates a response based on character selection and sentiment analysis results of user messages.
[0916] "Application example 2 when combining with an emotional engine"
[0917] (Claim 1)
[0918] An information processing device,
[0919] A means for receiving a virtual character selection from a user device and transmitting it to a computer,
[0920] A means of obtaining the latest information from external information resources and updating that information by associating it with a virtual character,
[0921] A means of analyzing messages received from users and identifying their emotional state using natural language processing technology,
[0922] A means for generating a virtual character's response based on emotion analysis results and transmitting the generated response to a user device,
[0923] A means of utilizing AI models using prompt sentences to generate responses adapted to emotions,
[0924] A system that includes this.
[0925] (Claim 2)
[0926] The system according to claim 1, which selectively uses information obtained from external information resources to generate responses based on the intent and emotion of the analyzed user message, and adjusts the tone and expression.
[0927] (Claim 3)
[0928] The system according to claim 1, which selects the most suitable AI model from among multiple AI models based on the selection of a virtual character and the results of analyzing user messages, and generates a response using the selected model.
[Explanation of Symbols]
[0929] 10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. An information processing system that enables conversation with a specific character,
A means for receiving character selection from a user device and transmitting it to an information processing device,
A means of obtaining the latest information from external sources, adjusting the information based on individual characteristics, and associating it with characters,
A means for analyzing a message received from a user, generating a character-appropriate response based on its content, and outputting the generated response as audio,
A means of providing responses through home robots and supporting the user's daily life,
A system that includes this.
2. The system according to claim 1, wherein a response is generated using speech synthesis technology based on the intent of the analyzed user message and output from a home robot.
3. The system according to claim 1, which selects the optimal model from among multiple generative AI models based on character selection and user message analysis results, and adjusts user interaction in a home robot.