system
The system addresses the lack of personalization and user interaction in beverage systems by analyzing user inputs for preferences and emotions, suggesting tailored beverages, and fostering community engagement.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-12
- Publication Date
- 2026-06-24
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, and includes steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a character of the chatbot, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] In a conventional beverage providing system, when a user selects a beverage according to their preferences and feelings, sufficient information is not provided, and it is difficult to immediately propose and provide a personalized beverage. In particular, when a user desires an optimal beverage according to their mood and situation at that time, the conventional method is not sufficient to meet that requirement. Therefore, the present invention aims to solve these problems.
Means for Solving the Problems
[0005] This invention includes means for analyzing user input information using natural language processing to identify the user's preferences and current emotions. Furthermore, it includes means for suggesting appropriate beverages based on this information and provides a function for quickly preparing and serving the beverage selected by the user. It also includes means for recording the user's selection history and updating the popular beverage rankings based on this history, thereby constantly offering new options to the user and promoting interaction within the community. In this way, it is possible to realize a beverage experience tailored to the individual needs of each user.
[0006] "Input information" refers to text or voice data that the user provides to the system via their device.
[0007] "Natural language processing means" refers to technologies that analyze input information received from users to identify their preferences and emotions.
[0008] "Emotion" is an element that represents the user's psychological state and is information that should be considered when suggesting beverages.
[0009] The "candidate generation means" is a function that selects an appropriate beverage to offer to the user based on identified preferences and emotions.
[0010] "Beverage production means" refers to the technical means for preparing and providing the beverage selected by the user.
[0011] "History management means" refers to a function that records the user's selection history and uses that data to update the popularity ranking.
[0012] A "community provision tool" is a function that provides a platform for users to share information and interact with other users.
[0013] "Voice analysis means" refers to a technology that converts a user's voice input into text data and analyzes its intent.
Brief Description of the Drawings
[0014]
[Figure 1] It is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] It is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] It is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] It is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] It is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] It is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] It is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 10] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.
Embodiments for Carrying Out the Invention
[0015] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0016] First, the terms used in the following description will be explained.
[0017] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).
[0018] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.
[0019] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0020] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0021] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when three or more items are linked by "and/or."
[0022] [First Embodiment]
[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0031] As shown in Figure 2, in the data processing device 12, specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with the specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0035] This invention is a system built to provide a personalized beverage experience according to the user's mood and preferences. The main components of the system include a terminal with a user interface, a server for data analysis and processing, and means for creating and serving beverages.
[0036] First, the user connects to the system using the terminal and inputs information about their mood and preferences. The terminal receives this information as voice or text and sends it to the server. If the input is voice data, the server automatically converts it to text and then uses natural language processing technology to analyze the user's emotions and preferences in detail.
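The handling of voice and text input described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names TextInput, VoiceInput, to_text, and fake_transcribe are hypothetical and do not appear in the disclosure, and the transcriber is a stand-in for whatever speech recognition function the server actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class TextInput:
    text: str  # text typed on the terminal


@dataclass
class VoiceInput:
    audio: bytes  # raw audio captured by the terminal's microphone


UserInput = Union[TextInput, VoiceInput]


def to_text(user_input: UserInput, transcribe: Callable[[bytes], str]) -> str:
    """Return plain text for either input type.

    `transcribe` is a hypothetical hook standing in for the server's
    speech recognition function; it is only called for voice input.
    """
    if isinstance(user_input, VoiceInput):
        return transcribe(user_input.audio).strip()
    return user_input.text.strip()


def fake_transcribe(audio: bytes) -> str:
    # Demonstration-only transcriber: treats the bytes as UTF-8 text.
    # A real system would call a speech recognition service here.
    return audio.decode("utf-8")


typed = to_text(TextInput("  I want to relax "), fake_transcribe)
spoken = to_text(VoiceInput(b"I want to relax"), fake_transcribe)
```

Passing the transcriber in as a parameter keeps the later analysis step independent of the input channel, so the same natural language processing can run on typed and spoken input.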
[0037] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the analysis results. The server searches a database and generates a list of beverage candidates that may satisfy the user's mood. These candidates are visually displayed on the terminal's screen and offered to the user as options.
[0038] When a user selects a beverage, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. Based on this information, the terminal presents the beverage ingredients and preparation procedure to the user. In particular, if an automated cooking robot is installed, the selected beverage is prepared and served automatically.
[0039] Furthermore, the server records the user's selection history and analyzes this data in real time to update the popularity rankings. This ranking is displayed on the user's terminal as a reference for understanding trends. The system also provides a community function, allowing users to share their ratings and opinions on beverages with other users. In this way, the invention can provide users with a comprehensive and personalized beverage experience.
[0040] For example, if a user enters "I'm feeling restless and would like a relaxing drink" into the terminal, the server will suggest options such as "lavender lemonade" or "chamomile tea." The beverage is then prepared and served according to the user's selection, and the popularity ranking is updated. In this way, a beverage experience tailored to the user's current state of mind can be easily provided.
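As a rough illustration of how such an input could lead to such suggestions, the following Python sketch uses simple keyword matching. This is a deliberately simplified stand-in: the disclosure refers to natural language processing without naming an algorithm, and the keyword lexicon, the mood tags, and the function names detect_moods and suggest are all assumptions made for this example. The drink names are taken from examples in this document.

```python
import re

# Hypothetical keyword lexicon: each mood tag lists words that hint at it.
MOOD_KEYWORDS = {
    "relax": {"restless", "relax", "relaxing", "calm", "stressed"},
    "refresh": {"tired", "refresh", "refreshing", "sluggish"},
}

# Hypothetical menu keyed by mood tag.
MENU = {
    "relax": ["lavender lemonade", "chamomile tea"],
    "refresh": ["mint lemonade", "green smoothie"],
}


def detect_moods(text: str) -> list[str]:
    """Return the mood tags whose keywords appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return [mood for mood, keys in MOOD_KEYWORDS.items() if words & keys]


def suggest(text: str) -> list[str]:
    """Collect the menu entries for every detected mood tag."""
    drinks: list[str] = []
    for mood in detect_moods(text):
        drinks.extend(MENU[mood])
    return drinks


suggestions = suggest("I'm feeling restless and would like a relaxing drink")
```

A production system would presumably replace detect_moods with a trained language model or an external analysis service; the surrounding flow (text in, mood tags out, menu lookup) would stay the same.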
[0041] The following describes the processing flow.
[0042] Step 1:
[0043] The user accesses the system from the terminal and opens the free chat screen or voice input screen. The terminal enters input waiting mode and prompts the user to enter information about their mood or preferences.
[0044] Step 2:
[0045] The user inputs information such as "I want to relax" via text or voice. The device receives the input and, if it's voice, automatically converts it into text data.
[0046] Step 3:
[0047] The terminal sends user input to the server. The server applies natural language processing techniques to analyze the received text data.
[0048] Step 4:
[0049] The server identifies the user's emotions and preferences from the analyzed information. Based on this identification, it searches the database for relevant beverage candidates.
[0050] Step 5:
[0051] The server generates a list of beverage suggestions that match the user's mood based on the search results. This list is sent to the terminal and displayed on its screen.
[0052] Step 6:
[0053] The user selects their desired beverage from the options displayed on the terminal. The terminal reports the user's selection to the server.
[0054] Step 7:
[0055] The server sends the recipe and preparation instructions for the selected beverage to the terminal. The terminal displays this information and instructs the user on the preparation steps.
[0056] Step 8:
[0057] If automated cooking equipment is available, the terminal issues a beverage preparation command to the equipment, initiating the automated process. The user then receives the prepared beverage.
[0058] Step 9:
[0059] The server records the user's selection history and updates the popularity rankings based on that data. This ranking information is provided to the terminal, allowing users to view the latest beverage trends.
[0060] Step 10:
[0061] Users can access the community via their terminals and share ratings and opinions about beverages with other users. The server manages and displays this information.
[0062] (Example 1)
[0063] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0064] In modern society, there is a demand for personalized experiences based on the diverse preferences and emotions of users. However, conventional beverage serving systems have difficulty accurately suggesting beverages that reflect users' preferences and moods at any given time. Furthermore, there is a lack of mechanisms to encourage interaction among users, which hinders the formation of communities.
[0065] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0066] In this invention, the server includes a language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; and a beverage production means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to suggest and provide beverages that are tailored to the individual user's state and preferences. Furthermore, community building is promoted as users interact and share evaluations and comments with one another.
[0067] A "language processing means" is a means for analyzing user input information to identify the user's preferences and emotions.
[0068] A "candidate generation means" is a means for suggesting appropriate beverages based on the user's identified preferences and emotions.
[0069] A "beverage production means" is a means for preparing and providing a beverage selected by the user.
[0070] A "history management means" is a means of recording and analyzing a user's selection history to update the popularity rankings.
[0071] "Display means" refers to a means of presenting information to the user visually.
[0072] "Input assistance means" refers to means for supporting voice or text input.
[0073] This invention is a method for implementing a system designed to provide personalized beverages tailored to the user's preferences and emotions. The invention comprises a terminal with a user interface, a server for data processing, and means for creating and serving beverages.
[0074] First, the user connects to the system by operating the terminal and inputs information about their mood and preferences via voice or text. The terminal receives this information and sends it to the server over the network. The server's speech recognition function may convert the voice data into text, for example, by using a speech recognition API. In addition, the server uses a natural language processing library to analyze the user's emotions and preferences.
[0075] Based on the analysis results, the server generates beverage candidates. Using a database management system, it efficiently searches for beverages that match the user's preferences and generates a list. The generated candidates are sent to the terminal and presented to the user as options.
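The database lookup described above can be sketched with Python's standard sqlite3 module. This is a minimal sketch under assumptions: the disclosure does not specify a schema, so the drinks and drink_tags tables, the tag values, and the find_candidates function are hypothetical, and an in-memory database stands in for the server's database management system.

```python
import sqlite3

# In-memory database standing in for the server's beverage database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE drink_tags (
        drink_id INTEGER NOT NULL REFERENCES drinks(id),
        tag TEXT NOT NULL
    );
    INSERT INTO drinks VALUES
        (1, 'lime mint smoothie'),
        (2, 'chamomile tea'),
        (3, 'lavender lemonade');
    INSERT INTO drink_tags VALUES
        (1, 'refreshment'),
        (2, 'relaxation'),
        (3, 'relaxation');
    """
)


def find_candidates(conn, tag, limit=5):
    """Return up to `limit` beverage names carrying the given mood tag."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM drinks AS d
        JOIN drink_tags AS t ON t.drink_id = d.id
        WHERE t.tag = ?
        ORDER BY d.name
        LIMIT ?
        """,
        (tag, limit),  # parameters are bound, never interpolated into the SQL
    ).fetchall()
    return [name for (name,) in rows]


refreshing = find_candidates(conn, "refreshment")
relaxing = find_candidates(conn, "relaxation")
```

Binding the tag as a query parameter matters here because the tag is ultimately derived from free-form user input.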
[0076] When a user selects a beverage, the server records the selection and sends beverage preparation instructions to the terminal. Based on this information, the terminal shows the user the necessary ingredients and procedure. If an automated preparation device is available, the selected beverage is automatically prepared and served.
[0077] For example, a user might type "I'm feeling restless, so I'd like a relaxing drink" into the terminal. The server analyzes this prompt and suggests options such as "lavender lemonade" or "chamomile tea." Based on the user's selection, the beverage is prepared, and in some cases, the popularity ranking is updated.
[0078] This invention specifically demonstrates the implementation of a system that can provide a beverage experience tailored to the needs of individual users and promote interaction among users.
[0079] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0080] Step 1:
[0081] The user uses a device to input information about their mood and preferences via voice or text. Upon receiving the input data, the device prepares the data according to the input format and sends it to the server. An internet connection is used for sending data to the server. A specific example of this process would be the user typing "I want to refresh" into a text box and clicking the submit button.
[0082] Step 2:
[0083] The server validates the data received from the terminal. If audio data is sent, it is converted to text using a speech recognition API. Based on this conversion, a natural language processing library is used to analyze the text data and extract the user's emotions and preferences. The input is audio or text data, and the output is user emotion and preference information based on the analysis.
[0084] Step 3:
[0085] The server uses the analysis results to generate beverage candidates. It executes appropriate queries against the database management system to retrieve beverages that match the user's mood and preferences. For example, it uses SQL to retrieve beverages related to "refreshment" and generates candidates such as "lime mint smoothie." The input is the user's mood and preference information, and the output is a list of beverage candidates.
[0086] Step 4:
[0087] The server sends the generated beverage candidates to the terminal, which then visually presents them to the user. The user interface displays the candidates in an easy-to-select format, allowing the user to choose. Specifically, the terminal displays the beverage names and brief descriptions on the screen in a list or card format. The input is candidate data from the server, and the output is the visual display presented to the user.
[0088] Step 5:
[0089] The user selects their desired beverage from the presented options. The terminal sends this selection back to the server. The server records the information about the selected beverage and sends instructions to the terminal for the beverage production process. For example, the user selects "Lime Mint Smoothie," and the terminal notifies the server of this selection. The input is the user's selection, and the output is the selection notification to the server.
[0090] Step 6:
[0091] The terminal receives instructions from the server and displays the beverage ingredients and preparation steps to the user. If an automated cooking device is available, the terminal sends a signal to the device, and the selected beverage is automatically prepared. Specific display details include steps such as, "Required ingredients: lime, mint, honey; Preparation steps: mix the ingredients." Input is the preparation instructions from the server, and output is the display to the user or a signal to the cooking device.
[0092] Step 7:
[0093] The server stores the user's selection history in a database and analyzes and updates the popularity ranking. The updated ranking is sent to the terminal in real time and presented to the user as the latest information. Specifically, the server records data with a command such as "INSERT INTO history (user_id, drink_id) VALUES (...)" and updates the ranking using an analysis algorithm. The input is the selection history, and the output is the ranking information.
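The history recording and ranking update in Step 7 can be sketched as follows, again with Python's sqlite3 module. The history table and its user_id and drink_id columns follow the INSERT statement quoted in the step; everything else is an assumption made for illustration, including the drinks table, the selected_at column, the function names, and the use of a simple selection count as the "analysis algorithm", which the disclosure does not define.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE drinks (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE history (
        user_id INTEGER NOT NULL,
        drink_id INTEGER NOT NULL REFERENCES drinks(id),
        selected_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    INSERT INTO drinks VALUES (1, 'lime mint smoothie'), (2, 'chamomile tea');
    """
)


def record_selection(conn, user_id, drink_id):
    """Store one selection; the `with` block commits on success."""
    with conn:
        conn.execute(
            "INSERT INTO history (user_id, drink_id) VALUES (?, ?)",
            (user_id, drink_id),
        )


def popularity_ranking(conn, top_n=10):
    """Rank beverages by selection count (ties broken by name)."""
    return conn.execute(
        """
        SELECT d.name, COUNT(*) AS selections
        FROM history AS h
        JOIN drinks AS d ON d.id = h.drink_id
        GROUP BY d.id
        ORDER BY selections DESC, d.name ASC
        LIMIT ?
        """,
        (top_n,),
    ).fetchall()


for user_id, drink_id in [(101, 1), (102, 1), (103, 2)]:
    record_selection(conn, user_id, drink_id)

ranking = popularity_ranking(conn)
```

Computing the ranking with a query at read time keeps it consistent with the history table; a system with heavy traffic might instead maintain a cached counter, which is a design choice the disclosure leaves open.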
[0094] (Application Example 1)
[0095] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0096] In modern urban environments, there is a need for personalized beverage delivery systems that meet the diverse needs of residents. In particular, providing optimal beverages based on individual moods and preferences while fostering interaction among residents is a challenging task. Furthermore, real-time beverage delivery is essential to providing an efficient and seamless experience.
[0097] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0098] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions, a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions, and a communication means that cooperates with beverage supply facilities in a smart city environment to provide beverages to the user in real time. This enables the immediate provision of optimal beverages tailored to the individual needs of the user, encouraging interaction among residents and making the beverage experience more efficient.
[0099] "Natural language processing means" refers to technologies that analyze user input information and identify the user's preferences and emotions from that information.
[0100] A "candidate generation means" is a means for extracting and processing the information necessary to suggest the most suitable beverage to a user based on their identified preferences and emotions.
[0101] "Beverage production means" refers to a series of processes for preparing and providing the beverage selected by the user.
[0102] A "history management means" is a means of recording a user's selection history and using that data to analyze and update the popularity rankings.
[0103] "Communication means" refers to the interfaces and protocols necessary for beverage supply equipment and servers to cooperate within a smart city environment.
[0104] This invention is a system for providing personalized beverages tailored to user needs in a smart city. The system mainly consists of a server, a terminal with a user interface, and beverage supply equipment within the smart city environment.
[0105] The server uses natural language processing technology to analyze voice or text input from the user to identify their preferences and emotions. To do this, it uses the Google® Cloud Speech-to-Text API to convert voice data into text and the Google Cloud Natural Language API to analyze the text data. Based on the analysis results, the server selects beverages from its database that match the user's preferences and displays them as candidates on the terminal.
[0106] Users select a beverage from a list of options displayed on their terminal. This information is sent to the server, and the selected beverage is recorded in a history management system and used to update the popularity ranking among residents. The selection is then communicated to vending machines and cafes within the smart city, whose beverage supply equipment prepares the beverage in real time.
[0107] For example, if a user enters a request into the terminal saying, "I'm tired today, so I'd like a refreshing drink," the server will suggest options such as "mint lemonade" or "green smoothie." Depending on the user's selection, the chosen beverage will be immediately served at a cafe within the smart city, and this will also be reflected in the rankings.
[0108] An example of a prompt for this system would be: "Describe the process of suggesting the most suitable beverage when a user says they want to refresh themselves, and then providing that beverage immediately within the city."
[0109] The flow of the specific processing in Application Example 1 will be explained using Figure 12.
[0110] Step 1:
[0111] Users use their devices to input information about their mood and preferences via voice or text. If the input data is voice, the Google Cloud Speech-to-Text API is used to convert the voice to text and send it to the server as text data.
[0112] Step 2:
[0113] The server analyzes the received text data using the Google Cloud Natural Language API. This analysis extracts keywords and sentiments to identify the user's preferences and state. The input data is text, and the output is the analysis result, which identifies the user's preferences and sentiments.
[0114] Step 3:
[0115] The server searches a beverage database for a suitable beverage based on identified preferences and emotions. It selects multiple beverage candidates using a candidate generation mechanism. The input is analytical information based on the user's preferences, and the output is a list of beverage candidates.
[0116] Step 4:
[0117] The server sends a list of options to the terminal. The terminal visually displays this list in its user interface, offering it to the user as a selection. The input is the beverage option list, and the output is the visual presentation to the user.
[0118] Step 5:
[0119] The user selects a beverage from the displayed beverage options. The terminal sends the user's selection information, including the ID of the selected beverage, to the server. The input is the user's selection, and the output is the transmission of the beverage ID based on that selection.
[0120] Step 6:
[0121] The server records the received beverage IDs using a history management system and updates the popularity ranking. Simultaneously, it transmits this information to beverage supply equipment within the smart city, instructing it to begin preparation. The input is the beverage ID, and the output is the history update and supply instruction.
[0122] Step 7:
[0123] The beverage supply equipment within the smart city prepares the beverage and provides it to the user based on the received instruction. The user can pick up the beverage at a designated location. The input is a supply instruction, and the output is the preparation and provision of the beverage.
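Steps 6 and 7 above can be sketched in Python as a history update followed by a supply instruction handed to the equipment. The disclosure does not define a message format or transport, so this is only an assumption-laden illustration: the JSON fields (type, user_id, beverage_id, pickup_point), the identifiers, and the function names are hypothetical, an in-memory Counter stands in for the history management system, and a local queue stands in for the communication means between the server and the supply equipment.

```python
import json
import queue
from collections import Counter

# In-memory stand-in for the history management system.
selection_counts = Counter()

# Local queue standing in for the link between server and supply equipment.
supply_queue = queue.Queue()


def handle_selection(user_id, beverage_id, pickup_point):
    """Record the selection, then enqueue a supply instruction."""
    selection_counts[beverage_id] += 1
    instruction = {
        "type": "prepare",            # hypothetical message type
        "user_id": user_id,
        "beverage_id": beverage_id,
        "pickup_point": pickup_point,  # hypothetical designated location
    }
    supply_queue.put(json.dumps(instruction))
    return instruction


def equipment_poll():
    """Equipment side: take the next instruction and decode it."""
    return json.loads(supply_queue.get_nowait())


handle_selection("u1", "mint-lemonade", "cafe-a")
handle_selection("u2", "mint-lemonade", "cafe-a")
handle_selection("u3", "green-smoothie", "vending-7")

first_job = equipment_poll()
ranking = selection_counts.most_common()
```

Serializing the instruction to JSON before queuing mirrors what a networked deployment would need, since the equipment would receive the instruction as a message rather than as a shared Python object.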
[0124] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0125] This invention relates to a technology that improves the accuracy of a system that suggests and provides the optimal beverage based on the user's mood and preferences by using emotion recognition. In addition to a terminal equipped with a user interface, a server for data analysis and processing, and means for generating and providing beverages, the system incorporates an emotion engine.
[0126] First, the user begins inputting information through the terminal. The user inputs information about their mood and preferences in text or voice. The terminal receives this input, automatically converts any voice data into text, and sends it to the server. The server not only uses natural language processing technology to analyze the input data, but also identifies the user's emotions using an emotion engine.
[0127] Next, the server uses a candidate generation mechanism to suggest a beverage that best matches the user's mood, based on emotional data obtained from the emotion engine. Emotional data plays a particularly important role in evaluating the candidates, improving the accuracy of the suggestions. The generated candidates are sent to the terminal and visually displayed on the user interface.
[0128] Once the user selects their desired beverage from the presented options, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. The terminal then provides the user with a detailed recipe and preparation instructions, and, if necessary, activates an automated cooking robot to dispense the beverage.
[0129] Furthermore, the server integrates and records user selection history and sentiment data, updating popularity rankings in real time. This allows users to stay up-to-date on the latest trends. The system also includes a community feature, allowing users to share their ratings and opinions about beverages with other users.
[0130] For example, if a user types "I want to lift my spirits a little" into their device, the server uses natural language processing and an emotion engine to suggest "uplifting drinks" such as "Lemon Mint Sparkle" or "Ginger Blend Tea." Based on the emotion analysis results, the suggestions are fine-tuned, allowing the user to choose a more appropriate beverage. The selected beverage is quickly prepared, providing an experience that is tailored to the user's mood.
[0131] The following describes the processing flow.
[0132] Step 1:
[0133] The user accesses the system from their device and launches either the free chat screen or the voice input screen. The device then prepares to accept user input.
[0134] Step 2:
[0135] The user enters information about their mood in text or voice, for example, "I want to relax." If voice input is available, the device automatically converts it to text.
[0136] Step 3:
[0137] The terminal sends the user's text input data to the server. The server applies natural language processing algorithms to extract key information about the user's emotions and preferences from the wording of the input.
[0138] Step 4:
[0139] The server passes the analyzed data to the emotion engine to identify more detailed emotional states. This process allows the user's emotions to be recognized as concrete indicators.
[0140] Step 5:
[0141] Based on the output from the emotion engine, the server refers to a beverage database to select the most suitable beverage candidate. The candidate generation system then lists beverages that match the user's emotions.
[0142] Step 6:
[0143] The server sends a list of selected beverage options to the terminal. The terminal displays an interface that visually presents the beverage options to the user and prompts them to make a selection.
[0144] Step 7:
[0145] The user selects their desired beverage from the presented options. The selection information is sent to the server via the device.
[0146] Step 8:
[0147] The server sends the terminal a detailed recipe and preparation instructions for the selected beverage. The terminal displays the ingredients and preparation method to the user and assists them.
[0148] Step 9:
[0149] If possible, the terminal sends instructions to the automated cooking robot to prepare the beverage, and the beverage is then prepared for the user.
[0150] Step 10:
[0151] The server records user selection history and sentiment data, and updates the popularity rankings. Users can refer to these rankings to see what other users have chosen.
[0152] Step 11:
[0153] Users can access community features through their devices to share ratings and feedback, and exchange opinions with other users. The server oversees this activity and continuously updates community information.
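The candidate evaluation in Steps 4 and 5 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the emotion scores are assumed to come from the emotion engine as a label-to-score mapping, and the catalogue tags, the `EMOTION_WEIGHT` value, and all function names are hypothetical. The beverage names are taken from the example in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    moods: frozenset    # emotion labels the drink is assumed to suit
    flavors: frozenset  # flavor tags matched against stated preferences


# Hypothetical catalogue; the tags are invented for illustration.
CATALOGUE = (
    Candidate("Lemon Mint Sparkle", frozenset({"uplifting"}), frozenset({"citrus", "sparkling"})),
    Candidate("Ginger Blend Tea", frozenset({"uplifting"}), frozenset({"spicy", "hot"})),
    Candidate("Chamomile Tea", frozenset({"relaxing"}), frozenset({"herbal", "hot"})),
)

# Assumption: an emotion match counts twice as much as a flavor match,
# reflecting the emphasis on emotional data when evaluating candidates.
EMOTION_WEIGHT = 2.0
PREFERENCE_WEIGHT = 1.0


def score(candidate, emotions, preferences):
    """Combine emotion-engine scores with the number of matching flavor tags."""
    emotion_part = sum(emotions.get(mood, 0.0) for mood in candidate.moods)
    preference_part = len(candidate.flavors & preferences)
    return EMOTION_WEIGHT * emotion_part + PREFERENCE_WEIGHT * preference_part


def rank_candidates(emotions, preferences, limit=3):
    """Return up to `limit` beverage names, best match first, ties broken by name."""
    scored = [(score(c, emotions, preferences), c.name) for c in CATALOGUE]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]
```

For instance, `rank_candidates({"uplifting": 0.9}, {"hot"})` places "Ginger Blend Tea" first, because it matches both the estimated emotion and the stated preference.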
[0154] (Example 2)
[0155] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0156] Existing systems that suggest the optimal beverage based on the user's mood and preferences struggle to accurately analyze the user's complex emotional state. Furthermore, they lack sufficient functionality to integrate past selection history with real-time emotions to provide trend information, resulting in a lack of consistency in the user experience.
[0157] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0158] In this invention, the server includes a text analysis means that analyzes user input information to identify the user's state and preferences; a candidate generation means that suggests an appropriate beverage based on the identified state and preferences; and a beverage production means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to accurately analyze the user's emotional state and suggest an appropriate beverage based on that analysis.
[0159] "Text analysis means" refers to an information processing system that analyzes user input information to identify the user's state and preferences.
[0160] A "candidate generation means" is a device or system that has the function of selecting appropriate beverage candidates based on specified conditions or preferences.
[0161] "Beverage production means" refers to a series of devices and processes for preparing and serving beverages selected by the user.
[0162] "History management means" refers to a device or system that records and analyzes a user's selection history and updates popularity ratings accordingly.
[0163] "Emotion recognition means" refers to technology that analyzes complex emotional states from user input information and identifies emotion labels.
[0164] This invention is implemented as a system that suggests and provides the optimal beverage based on the user's emotions and preferences. The system includes a terminal with a user interface, a server responsible for data analysis, and means for generating the beverage. Furthermore, emotion recognition means are incorporated to accurately analyze the emotional state.
[0165] The terminal receives input from the user (text or voice) and, in the case of voice input, automatically converts it into text data. Automatic speech recognition software can be used for this conversion. The converted text data is then sent to the server.
[0166] The server analyzes the received data using the text analysis means to identify the user's state and preferences. A natural language processing library is used for the analysis. Furthermore, the emotion recognition means identifies the emotional state from the user's input. Based on the information obtained through this process, the server uses the candidate generation means to generate the most suitable beverage candidates.
[0167] The generated beverage candidates are sent back to the terminal and visually displayed on the user interface. The user can select their desired beverage from the displayed options. This selection is sent to the server, and the beverage is prepared through the beverage production system. Automated cooking robots may be used in this process.
[0168] As a concrete example, consider a scenario where a user types "I want to lift my spirits a little" into their device. The server then uses natural language processing and emotion recognition to present options such as "Lemon Mint Sparkle" or "Ginger Blend Tea." The selected beverage is then prepared and served promptly, improving the user's experience.
[0169] Example prompt for a generative AI model: "Generate a list of beverages to suggest when the user types 'I want to relax.'"
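A prompt of this form could be assembled from the user's input as sketched below. The function name `build_beverage_prompt` and the `max_items` parameter are hypothetical additions for illustration; how the prompt is actually passed to the data generation model 58 is not shown here.

```python
def build_beverage_prompt(user_text: str, max_items: int = 3) -> str:
    """Embed the user's input in an instruction for a generative AI model."""
    cleaned = user_text.strip()
    if not cleaned:
        raise ValueError("user_text must not be empty")
    # repr() adds the surrounding quotes; it switches to double quotes
    # automatically when the input itself contains an apostrophe.
    return (
        f"Generate a list of up to {max_items} beverages to suggest "
        f"when the user types {cleaned!r}"
    )
```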
[0170] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0171] Step 1:
[0172] Users input information about their mood and preferences via text or voice through their device. If the input is voice, the device uses speech recognition technology to convert the voice data into text. This converted text data is then sent to the server as output.
[0173] Step 2:
[0174] The server receives text data from the terminal as input. It then uses a natural language processing library to analyze the text data and extract keywords related to the user's state and preferences. This analysis process outputs information about the user's current emotions and preferences.
[0175] Step 3:
[0176] The server uses emotion recognition tools to further examine the analysis results and identify the user's emotion label (e.g., joy, calmness). The input is the analysis result from the previous step, and the server generates emotion data obtained from that input as output.
[0177] Step 4:
[0178] The server takes identified emotion data as input and uses a candidate generation mechanism to generate a list of optimal beverages. It searches the database for relevant beverage information and generates suggestions corresponding to the emotion data as output.
[0179] Step 5:
[0180] The generated list of beverage options is sent to the terminal and visually displayed on the user interface. The user interface provides descriptions and images for each beverage, making it easy for the user to select their desired beverage.
[0181] Step 6:
[0182] Information about the beverage selected by the user is sent from the terminal to the server. The server receives this information as input, generates preparation instructions for the selected beverage as output, and sends them to the terminal.
[0183] Step 7:
[0184] The terminal uses the output received from the server to activate an automated cooking robot and prepare the selected beverage. Once the beverage is ready, it is served to the user, and the process is complete.
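Steps 2 to 4 above each take the previous step's output as input. One way to picture that chaining is the sketch below, which assumes a simple keyword table in place of a real natural language processing library and emotion recognition means. The keyword table, the label-to-beverage table, and the function names are all hypothetical; only the labels "joy" and "calmness" and the beverage names appear in the description.

```python
import re
from collections import Counter

# Hypothetical lookup tables standing in for the analysis and database components.
KEYWORD_TO_LABEL = {
    "relax": "calmness",
    "calm": "calmness",
    "celebrate": "joy",
    "happy": "joy",
}
BEVERAGES_BY_LABEL = {
    "calmness": ["Chamomile Tea"],
    "joy": ["Lemon Mint Sparkle", "Ginger Blend Tea"],
}


def extract_keywords(text):
    """Step 2: keep only the tokens that appear in the keyword table."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token in KEYWORD_TO_LABEL]


def identify_emotion_label(keywords):
    """Step 3: pick the label supported by the most keywords, or None."""
    if not keywords:
        return None
    counts = Counter(KEYWORD_TO_LABEL[keyword] for keyword in keywords)
    return counts.most_common(1)[0][0]


def generate_candidates(label):
    """Step 4: look up beverages for the label; unknown labels give no candidates."""
    return list(BEVERAGES_BY_LABEL.get(label, []))


def suggest(text):
    """Chain Steps 2 to 4 for one piece of user input."""
    return generate_candidates(identify_emotion_label(extract_keywords(text)))
```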
[0185] (Application Example 2)
[0186] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0187] Providing users with the most suitable beverages quickly and accurately, tailored to their emotions and preferences, is crucial for increasing user satisfaction. However, conventional technology has struggled to accurately analyze user emotions and provide appropriate beverages based on that analysis. Furthermore, features for updating selection history and trend information in real time and sharing information with other users were limited, so the user experience lacked a sense of unity and shared experience.
[0188] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0189] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; a beverage production means that works in conjunction with a home appliance to prepare and serve the beverage selected by the user from the suggested candidates; and a history management means that records and analyzes the user's selection history to update popular trend information in real time and visually display it on the user interface. This makes it possible to suggest and serve the optimal beverage based on the user's emotions and to promote communication through selection history and trend information.
[0190] "Natural language processing means that analyzes user input information to identify preferences and emotions" refers to data processing technology that analyzes voice or text data provided by the user to identify the user's preferences and mood.
[0191] "Candidate generation means that suggests appropriate beverages based on the identified preferences and emotions" refers to technology that generates and presents beverage options matching the user's current mood, based on an analysis of the user's emotions.
[0192] "Beverage production means that works in conjunction with a home appliance" refers to a means for preparing and serving the beverage selected by the user using automated household equipment.
[0193] "A history management means that records and analyzes a user's selection history to update popular trend information in real time and visually display it on the user interface" is a technology for accumulating and analyzing a user's past selection data and generating and displaying the latest trend information.
[0194] "Information exchange and provision means" refers to a means of exchanging evaluations and opinions about beverages with other users and communicating with them.
[0195] "Voice analysis means" refers to a technology that instantly converts a user's voice input into text data and analyzes their emotional state.
[0196] The system for realizing this invention has a structure in which the user, server, and terminal communicate. A specific embodiment thereof is described below.
[0197] The server receives voice or text data sent by the user and analyzes the input information using natural language processing techniques. This process utilizes the Google Cloud Natural Language API to identify the user's preferences and emotions.
[0198] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the identified sentiment data. In this process, the sentiment analysis results from Amazon Comprehend are utilized to generate beverage options, which are then presented to the user.
[0199] When a user selects a beverage, the server stores the user's selection information in Firebase and coordinates with home appliances to produce the beverage. Specifically, it controls home robots and smart appliances to prepare and serve the desired beverage.
[0200] Furthermore, the server analyzes the user's selection history and updates popular trend information in real time. This information is visually displayed in the user interface, and users can share their evaluations and opinions with other users through information exchange and provision mechanisms.
[0201] For example, if a user tells the device by voice, "I want to relax a little today," the system converts the voice data into text and analyzes the user's desire to relax using Amazon Comprehend. The server then suggests beverages such as "chamomile tea" or "lavender latte," which the user selects. After the selection, a robot automatically prepares the beverage and serves it to the user. In addition, trending information for the day is displayed in real time, and users can also view ratings and opinions from other users.
[0202] An example of a prompt is, "What beverage would you recommend drinking when you feel like this: when you want to relax?"
[0203] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0204] Step 1:
[0205] The user enters their beverage request into the device via voice or text. The entered voice data is converted into text data within the device using the Google Cloud Speech-to-Text service. The output of this conversion process is the text data.
[0206] Step 2:
[0207] The device sends text data to the server. The server uses this text data as input and performs natural language processing using the Google Cloud Natural Language API to identify the user's preferences and emotions. This step generates sentiment analysis results and user preference data.
[0208] Step 3:
[0209] Based on the sentiment analysis results obtained using Amazon Comprehend, the server generates beverage suggestions that match the user's mood. These suggestions are sent to the device and displayed as beverage options on the user interface. This step outputs a list of suggestions.
[0210] Step 4:
[0211] The user selects a beverage from the displayed options. The selected beverage data is sent from the terminal to the server. At this time, the user's selection information is recorded.
[0212] Step 5:
[0213] The server receives the user's selection information and stores that data in Firebase. The stored data is used for history management and generating trend information. In this step, the selection history data is updated.
[0214] Step 6:
[0215] The server sends control commands to home appliances to prepare the selected beverage. Home robots and smart appliances work together to prepare the selected beverage and serve it to the user. At this point, the finished beverage is dispensed and served.
[0216] Step 7:
[0217] The server analyzes trend information in real time based on selection history and trend data, and displays it visually in the user interface. Furthermore, it facilitates the sharing of evaluations and opinions about beverages to enable information exchange with other users. In this step, updated trend information and a sharing interface are provided.
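The history and trend handling in Steps 5 and 7 can be sketched as a sliding-window counter. This is an illustration under stated assumptions: an in-memory queue stands in for the Firebase store named above, the 24-hour window is an arbitrary choice, selections are assumed to arrive in chronological order, and the class and method names are hypothetical.

```python
from collections import Counter, deque
from datetime import datetime, timedelta


class TrendTracker:
    """Keeps recent selections and reports the most-selected beverages."""

    def __init__(self, window=timedelta(hours=24)):
        self.window = window
        self._events = deque()  # (timestamp, beverage), oldest first

    def record(self, beverage, at):
        """Store one selection; callers are assumed to record in time order."""
        self._events.append((at, beverage))

    def top(self, now, limit=3):
        """Drop selections older than the window, then count what remains."""
        cutoff = now - self.window
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()
        counts = Counter(name for _, name in self._events)
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:limit]
```

Recomputing the counts on each call keeps the sketch short; a production system would more likely maintain the counts incrementally as selections arrive and expire.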
[0218] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0219] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0220] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0221] [Second Embodiment]
[0222] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0223] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0224] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0225] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0226] The microphone 238 accepts instructions from the user 20 by receiving the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.
[0227] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0228] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0229] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0230] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0231] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0232] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0233] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0234] This invention is a system built to provide a personalized beverage experience according to the user's mood and preferences. The main components of the system include a terminal with a user interface, a server for data analysis and processing, and means for creating and serving beverages.
[0235] First, the user connects to the system using a device and inputs information about their mood and preferences. The device receives this information as voice or text and sends it to the server. If it is voice data, the server automatically converts it to text and then uses natural language processing technology to analyze the user's emotions and preferences in detail.
[0236] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the analysis results. The server searches a database and generates a list of beverage candidates that may satisfy the user's mood. These candidates are visually displayed on the terminal's screen and offered to the user as options.
[0237] When a user selects a beverage, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. Based on this information, the terminal details the beverage ingredients and preparation procedure to the user. In particular, if an automated cooking robot is installed, the selected beverage is prepared and served automatically.
[0238] Furthermore, the server records the user's selection history and analyzes this data in real time to update the popularity rankings. This ranking is displayed on the user's device as a reference for understanding trends. The system also provides a community function, allowing users to share their ratings and opinions on beverages with other users. In this way, the invention can provide users with a comprehensive and personalized beverage experience.
[0239] For example, if a user enters "I'm feeling restless and would like a relaxing drink" into their device, the server will suggest options such as "lavender lemonade" or "chamomile tea." The beverage is then prepared and served according to the user's selection, and the popularity ranking is updated. In this way, a beverage experience tailored to the user's current state of mind can be easily provided.
[0240] The following describes the processing flow.
[0241] Step 1:
[0242] The user connects to the device and opens the free chat screen or voice input screen. The device enters input waiting mode and prompts the user to enter information about their mood or preferences.
[0243] Step 2:
[0244] The user inputs information such as "I want to relax" via text or voice. The device receives the input and, if it's voice, automatically converts it into text data.
[0245] Step 3:
[0246] The terminal sends user input to the server. The server applies natural language processing techniques to analyze the received text data.
[0247] Step 4:
[0248] The server identifies the user's emotions and preferences from the analyzed information. Based on this identification, it searches the database for relevant beverage candidates.
[0249] Step 5:
[0250] The server generates a list of beverage suggestions that match the user's mood based on the search results. This list is sent to the device and displayed visually on the device.
[0251] Step 6:
[0252] The user selects their desired beverage from the options displayed on the terminal. The terminal reports the user's selection to the server.
[0253] Step 7:
[0254] The server sends the recipe and preparation instructions for the selected beverage to the terminal. The terminal displays this information and instructs the user on the preparation steps.
[0255] Step 8:
[0256] If automated cooking equipment is available, the terminal issues a beverage preparation command to the equipment, initiating the automated process. The user then receives the prepared beverage.
[0257] Step 9:
[0258] The server records the user's selection history and updates the popularity rankings based on that data. This ranking information is provided to the terminal, allowing users to view the latest beverage trends.
[0259] Step 10:
[0260] Users can access the community via their devices and share ratings and opinions about beverages with other users. The server manages and displays this information.
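The terminal-side branch in Steps 7 and 8 (always show the recipe, and additionally drive automated equipment when it is present) can be sketched as below. The `PreparationInstructions` structure, the function names, and the idea of passing the equipment as a callable are assumptions made for illustration; the description does not specify the message format or the equipment interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class PreparationInstructions:
    """Hypothetical payload the server sends after a selection."""
    beverage: str
    ingredients: List[str]
    steps: List[str]


def render_instructions(instructions: PreparationInstructions) -> str:
    """Format the recipe as the text the terminal shows to the user."""
    lines = [
        f"Beverage: {instructions.beverage}",
        "Ingredients: " + ", ".join(instructions.ingredients),
    ]
    lines += [f"{i}. {step}" for i, step in enumerate(instructions.steps, start=1)]
    return "\n".join(lines)


def handle_instructions(
    instructions: PreparationInstructions,
    robot: Optional[Callable[[PreparationInstructions], None]] = None,
):
    """Return the display text and whether automated preparation was started."""
    text = render_instructions(instructions)
    if robot is not None:
        robot(instructions)  # hand the same instructions to the equipment
        return text, True
    return text, False
```

Treating the equipment as an optional callable keeps the manual path and the automated path on the same code, which mirrors the "if available" wording of Step 8.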
[0261] (Example 1)
[0262] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0263] In modern society, there is a demand for personalized experiences based on the diverse preferences and emotions of users. However, conventional beverage serving systems have difficulty accurately suggesting beverages that reflect users' preferences and moods at any given time. Furthermore, there is a lack of mechanisms to encourage interaction among users, which hinders the formation of communities.
[0264] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0265] In this invention, the server includes a language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; and a beverage production means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to suggest and provide beverages that are tailored to the individual user's state and preferences. Furthermore, community building is promoted because users can interact with one another by sharing evaluations and comments.
[0266] "Language processing means" are tools for analyzing user input information to identify user preferences and emotions.
[0267] "Candidate generation means" is a means for suggesting appropriate beverages based on the identified preferences and emotions of the user.
[0268] "Beverage production means" is a means for preparing and providing the beverage selected by the user.
[0269] "History management means" is a means for recording and analyzing the user's selection history to update the popularity rankings.
[0270] "Display means" refers to a means of presenting information to the user visually.
[0271] "Input assistance means" refers to means for supporting voice or text input.
[0272] This invention is a method for implementing a system designed to provide personalized beverages tailored to the user's preferences and emotions. The invention comprises a terminal with a user interface, a server for data processing, and means for creating and serving beverages.
[0273] First, the user connects to the system by operating a device and inputs information about their mood and preferences via voice or text. The device receives this information and sends it to the server over the network. If the input is voice data, the server's speech recognition function may convert it into text, for example by using a speech recognition API. The server then analyzes the user's emotions and preferences using a natural language processing library.
[0274] Based on the analysis results, the server generates beverage candidates. Using a database management system, it efficiently searches for beverages that match the user's preferences and generates a list. The generated candidates are sent to the terminal and presented to the user as options.
[0275] When a user selects a beverage, the server records the selection and sends instructions to the terminal for the beverage preparation device. Based on this information, the terminal guides the user with the necessary ingredients and procedures. If an automated preparation device is available, the selected beverage is automatically prepared and served.
[0276] For example, a user might type "I'm feeling restless, so I'd like a relaxing drink" into the terminal. The server analyzes this prompt and suggests options such as "lavender lemonade" or "chamomile tea." Based on the user's selection, the beverage is prepared, and in some cases, the popularity ranking is updated.
[0277] This invention specifically demonstrates the implementation of a system that can provide a beverage experience tailored to the needs of individual users and promote interaction among users.
[0278] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0279] Step 1:
[0280] The user uses a device to input information about their mood and preferences via voice or text. Upon receiving the input data, the device prepares the data according to the input format and sends it to the server. An internet connection is used for sending data to the server. A specific example of this process would be the user typing "I want to refresh" into a text box and clicking the submit button.
[0281] Step 2:
[0282] The server validates the data received from the terminal. If audio data is sent, it is converted to text using a speech recognition API. Based on this conversion, a natural language processing library is used to analyze the text data and extract the user's emotions and preferences. The input is audio or text data, and the output is user emotion and preference information based on the analysis.
[0283] Step 3:
[0284] The server uses the analysis results to generate beverage candidates. It executes appropriate queries against the database management system to retrieve beverages suitable for the user's emotions and preferences from the database. As a specific example, it uses SQL to retrieve beverages associated with "refreshing" and generates candidates such as "lime mint smoothie". The input is the user's emotion and preference information, and the output is a list of beverage candidates.
[0285] Step 4:
[0286] The server sends the generated beverage candidates to the terminal, and the terminal visually presents them to the user. The user interface displays the candidates in an easy-to-select form to enable the user to make a selection. As a specific operation, the terminal displays the beverage names and brief descriptions on the screen in a list or card format. The input is the candidate data from the server, and the output is the visual display for the user.
[0287] Step 5:
[0288] The user selects the desired beverage from the presented beverage candidates. The terminal returns that selection to the server. The server records the information of the selected beverage and sends an instruction for the beverage generation means to the terminal. As a specific operation, when the user selects "lime mint smoothie," the terminal notifies the server of the selection. The input is the user's selection, and the output is the selection notification to the server.
[0289] Step 6:
[0290] The terminal receives the instruction from the server and displays the ingredients and preparation procedure of the beverage to the user. If there is an automatic cooking device, the terminal sends a signal to the cooking device and the selected beverage is automatically prepared. Specific displays include procedures such as "Required ingredients: lime, mint, honey; Preparation procedure: Mix the ingredients." The input is the preparation instruction from the server, and the output is the display to the user or the signal to the cooking device.
[0291] Step 7:
[0292] The server stores the user's selection history in a database and analyzes and updates the popularity ranking. The updated ranking is sent to the terminal in real time and presented to the user as the latest information. Specifically, the server records data with a command such as "INSERT INTO history (user_id, drink_id) VALUES (...)" and updates the ranking using an analysis algorithm. The input is the selection history, and the output is the ranking information.
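The database operations in Steps 3 and 7 can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The history table and its user_id and drink_id columns follow the command quoted above; the drinks table, its tag column, the sample rows, and the function names are assumptions introduced only for illustration, and a simple count of selections stands in for the unspecified analysis algorithm.

```python
import sqlite3


def open_db():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE drinks (
            drink_id INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            tag      TEXT NOT NULL          -- assumed mood/preference label
        );
        CREATE TABLE history (
            user_id  INTEGER NOT NULL,
            drink_id INTEGER NOT NULL REFERENCES drinks(drink_id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO drinks (drink_id, name, tag) VALUES (?, ?, ?)",
        [
            (1, "lime mint smoothie", "refreshing"),
            (2, "mint lemonade", "refreshing"),
            (3, "chamomile tea", "relaxing"),
        ],
    )
    return conn


def find_candidates(conn, tag):
    """Step 3: retrieve beverage names that match the analyzed tag."""
    rows = conn.execute(
        "SELECT name FROM drinks WHERE tag = ? ORDER BY drink_id", (tag,)
    )
    return [name for (name,) in rows]


def record_selection(conn, user_id, drink_id):
    """Step 7: store one selection in the history table."""
    conn.execute(
        "INSERT INTO history (user_id, drink_id) VALUES (?, ?)",
        (user_id, drink_id),
    )
    conn.commit()


def popularity_ranking(conn):
    """Step 7: rank beverages by number of selections (ties sorted by name)."""
    rows = conn.execute(
        """
        SELECT d.name, COUNT(*) AS picks
        FROM history AS h
        JOIN drinks AS d ON d.drink_id = h.drink_id
        GROUP BY d.drink_id
        ORDER BY picks DESC, d.name ASC
        """
    )
    return rows.fetchall()
```

Parameterized queries are used so that values derived from user input are never concatenated into the SQL text.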
[0293] (Application Example 1)
[0294] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0295] In modern urban environments, there is a need for personalized beverage delivery systems that meet the diverse needs of residents. In particular, providing optimal beverages based on individual moods and preferences and fostering interaction among residents are challenging tasks. Furthermore, real-time beverage delivery is essential for providing an efficient and seamless experience.
[0296] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0297] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions, a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions, and a communication means that cooperates with beverage supply facilities in a smart city environment to provide beverages to the user in real time. This enables the immediate provision of optimal beverages tailored to the individual needs of the user, promotes interaction among residents, and realizes an efficient beverage experience.
[0298] "Natural language processing means" refers to technologies that analyze user input information and identify the user's preferences and emotions from that information.
[0299] A "candidate generation means" is a means for extracting and processing the information necessary to suggest the most suitable beverage to a user based on their identified preferences and emotions.
[0300] "Beverage production means" refers to a series of processes for preparing and specifically providing the beverage selected by the user.
[0301] A "history management means" is a means of recording a user's selection history and using that data to analyze and update the popularity rankings.
[0302] "Communication means" refers to the interfaces and protocols necessary for beverage supply equipment and servers to cooperate within a smart city environment.
[0303] This invention is a system for providing personalized beverages tailored to user needs in a smart city. The system mainly consists of a server, a terminal with a user interface, and beverage supply equipment within the smart city environment.
[0304] The server uses natural language processing technology to analyze voice or text input from the user to identify their preferences and emotions. To do this, it converts voice data to text using the Google Cloud Speech-to-Text API and analyzes the text data using the Google Cloud Natural Language API. Based on the analysis results, the server selects beverages from its database that match the user's preferences and has them displayed as options on the terminal.
[0305] The user makes a selection from the beverage candidates displayed through the terminal. This information is sent to the server, where the selected beverage is recorded by the history management means and used to update the popularity ranking among residents. The selection is relayed through the communication means to vending machines and cafes within the smart city, and these beverage supply facilities prepare the beverage in real time.
[0306] As a specific example, when the user inputs a request to the terminal stating "I'm tired today and want a refreshing drink", the server proposes options such as "mint lemonade" or "green smoothie". Depending on the user's selection, the chosen beverage is immediately provided at a cafe within the smart city and is also reflected in the ranking.
[0307] Examples of prompt sentences for this system include ones like "Please explain the process of proposing the optimal beverage when the user indicates a desire to refresh and immediately providing that beverage within the city."
[0308] The flow of the specific process in Application Example 1 will be described using Figure 12.
[0309] Step 1:
[0310] The user uses the terminal to input information regarding mood and preferences in voice or text. If the input data is voice, the Google Cloud Speech-to-Text API is used to convert the voice to text and send it to the server as text data.
[0311] Step 2:
[0312] The server analyzes the received text data using the Google Cloud Natural Language API. In this analysis, keywords and emotions are extracted to identify the user's preferences and state. The input is text data, and the output is the analysis result, namely the information identifying the user's preferences and emotions.
[0313] Step 3:
[0314] The server searches a beverage database for a suitable beverage based on identified preferences and emotions. It selects multiple beverage candidates using a candidate generation mechanism. The input is analytical information based on the user's preferences, and the output is a list of beverage candidates.
[0315] Step 4:
[0316] The server sends the list of options to the terminal. The terminal visually displays this list in its user interface and offers the entries to the user as options. The input is the beverage option list, and the output is the visual presentation to the user.
[0317] Step 5:
[0318] The user selects a beverage from the displayed beverage options. The terminal sends the user's selection information to the server, along with the ID of the selected beverage as data. The input is the user's selection, and the output is the transmission of the beverage ID based on that selection.
[0319] Step 6:
[0320] The server records the received beverage IDs using a history management system and updates the popularity ranking. Simultaneously, it transmits this information to beverage supply equipment within the smart city, instructing it to begin preparation. The input is the beverage ID, and the output is the history update and supply instruction.
[0321] Step 7:
[0322] The beverage supply equipment within the smart city prepares the beverage based on the received supply instruction and provides it to the user. The user can pick up the beverage at a designated location. The input is the supply instruction, and the output is the preparation and provision of the beverage.
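Step 6 of this flow, in which the server records the beverage ID, updates the ranking, and issues a supply instruction, can be sketched as follows. The class names, the field names, the facility identifiers, and the in-memory outbox that stands in for the communication means are all assumptions made for illustration; the description does not specify a message format or transport.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SupplyInstruction:
    """Hypothetical message sent to beverage supply equipment."""
    beverage_id: str
    facility_id: str
    user_id: str


@dataclass
class SelectionHandler:
    """Records selections and queues supply instructions (in-memory stand-in)."""
    counts: Counter = field(default_factory=Counter)
    outbox: list = field(default_factory=list)

    def handle(self, user_id, beverage_id, facility_id):
        # History update: count one more selection of this beverage.
        self.counts[beverage_id] += 1
        # Supply instruction: queued here instead of being sent over a network.
        instruction = SupplyInstruction(beverage_id, facility_id, user_id)
        self.outbox.append(instruction)
        return instruction

    def ranking(self, top_n=3):
        # Most-selected first; ties are broken alphabetically for stable output.
        ordered = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [beverage_id for beverage_id, _ in ordered[:top_n]]
```

In a deployed system the outbox would presumably be replaced by whatever protocol the beverage supply equipment accepts; keeping it as a plain list makes the ordering of history update and supply instruction easy to inspect.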
[0323] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0324] This invention relates to a technology that improves the accuracy of a system that suggests and provides the optimal beverage based on the user's mood and preferences by using emotion recognition. In addition to a terminal equipped with a user interface, a server for data analysis and processing, and means for generating and providing beverages, the system incorporates an emotion engine.
[0325] First, the user begins inputting information through the device. The user inputs information about their mood and preferences in text or voice. The device receives this, automatically converts the voice data into text, and sends it to the server. The server not only utilizes natural language processing technology to analyze the input data, but also identifies the user's emotions using an emotion engine.
[0326] Next, the server uses a candidate generation mechanism to suggest a beverage that best matches the user's mood, based on emotional data obtained from the emotion engine. Emotional data plays a particularly important role in evaluating the candidates, improving the accuracy of the suggestions. The generated candidates are sent to the terminal and visually displayed on the user interface.
[0327] Once the user selects their desired beverage from the presented options, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. The terminal then provides the user with a detailed recipe and preparation instructions, and, if necessary, activates an automated cooking robot to dispense the beverage.
[0328] Furthermore, the server integrates and records user selection history and sentiment data, updating popularity rankings in real time. This allows users to stay up-to-date on the latest trends. The system also includes a community feature, allowing users to share their ratings and opinions about beverages with other users.
[0329] For example, if a user types "I want to lift my spirits a little" into their device, the server uses natural language processing and an emotion engine to suggest "uplifting drinks" such as "Lemon Mint Sparkle" or "Ginger Blend Tea." Based on the emotion analysis results, the suggestions are fine-tuned, allowing the user to choose a more appropriate beverage. The selected beverage is quickly prepared, providing an experience that is tailored to the user's mood.
[0330] The following describes the processing flow.
[0331] Step 1:
[0332] The user accesses the system from their device and launches either the free chat screen or the voice input screen. The device then prepares to accept user input.
[0333] Step 2:
[0334] The user enters information about their mood in text or voice, for example, "I want to relax." If the input is voice, the device automatically converts it to text.
[0335] Step 3:
[0336] The terminal sends the user's text input data to the server. The server applies natural language processing algorithms to analyze key information related to the user's emotions and preferences from their words.
[0337] Step 4:
[0338] The server passes the analyzed data to the emotion engine to identify more detailed emotional states. This process allows the user's emotions to be recognized as concrete indicators.
[0339] Step 5:
[0340] Based on the output from the emotion engine, the server refers to a beverage database to select the most suitable beverage candidate. The candidate generation system then lists beverages that match the user's emotions.
[0341] Step 6:
[0342] The server sends a list of selected beverage options to the terminal. The terminal displays an interface that visually presents the beverage options to the user and prompts them to make a selection.
[0343] Step 7:
[0344] The user selects their desired beverage from the presented options. The selection information is sent to the server via the device.
[0345] Step 8:
[0346] The server sends the terminal a detailed recipe and preparation instructions for the selected beverage. The terminal displays the ingredients and preparation method to the user and assists them.
[0347] Step 9:
[0348] If possible, the terminal sends instructions to the automated cooking robot to prepare the beverage, and the beverage is then prepared for the user.
[0349] Step 10:
[0350] The server records user selection history and sentiment data, and updates the popularity rankings. Users can refer to these rankings to see what other users have chosen.
[0351] Step 11:
[0352] Users can access community features through their devices to share ratings and feedback, and exchange opinions with other users. The server oversees this activity and continuously updates community information.
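Steps 4 and 5 of this flow, in which emotion data is used to evaluate beverage candidates, can be sketched as follows. This is a simple keyword-based stand-in and is not the emotion identification model 59 itself; the emotion labels, the keyword sets, and the affinity weights are hypothetical values chosen only so that the example from the description produces a result.

```python
import re

# Hypothetical keyword lexicon: emotion label -> trigger words.
EMOTION_KEYWORDS = {
    "tired": {"tired", "exhausted", "sleepy"},
    "low": {"lift", "down", "sad"},
    "tense": {"relax", "restless", "stressed"},
}

# Hypothetical affinity of each beverage for each emotion label (0.0 to 1.0).
BEVERAGE_AFFINITY = {
    "Lemon Mint Sparkle": {"low": 0.9, "tired": 0.6},
    "Ginger Blend Tea": {"low": 0.7, "tense": 0.4},
    "chamomile tea": {"tense": 0.9},
}


def estimate_emotions(text):
    """Return normalized emotion weights found in the text (empty if none)."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    hits = {label: len(words & kws) for label, kws in EMOTION_KEYWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in hits.items() if n}


def suggest(text, top_n=2):
    """Rank beverages by the emotion-weighted sum of their affinities."""
    emotions = estimate_emotions(text)
    scored = []
    for name, affinity in BEVERAGE_AFFINITY.items():
        score = sum(w * affinity.get(label, 0.0) for label, w in emotions.items())
        if score > 0:
            scored.append((score, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]
```

With these assumed weights, the input "I want to lift my spirits a little" yields "Lemon Mint Sparkle" followed by "Ginger Blend Tea", matching the example given in the description.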
[0353] (Example 2)
[0354] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0355] Existing systems that suggest the optimal beverage based on the user's mood and preferences struggle to accurately analyze the user's complex emotional state. Furthermore, they lack sufficient functionality to integrate past selection history with real-time emotions to provide trend information, resulting in a lack of consistency in the user experience.
[0356] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0357] In this invention, the server includes a text analysis means that analyzes user input information to identify a state and preferences; a candidate generation means that suggests an appropriate beverage based on the identified state and preferences; and a beverage generation means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to accurately analyze the user's emotional state and suggest an appropriate beverage based on that analysis.
[0358] A "text analysis means" is an information processing system that analyzes user input information to identify their state and preferences.
[0359] A "candidate generation means" is a device or system that has the function of selecting appropriate beverage candidates based on specified conditions or preferences.
[0360] "Beverage production means" refers to a series of devices and processes for preparing and serving beverages selected by the user.
[0361] "History management means" refers to a device or system that records and analyzes a user's selection history and updates popularity ratings accordingly.
[0362] "Emotion recognition means" refers to technology that analyzes complex emotional states from user input information and identifies emotion labels.
[0363] This invention is implemented as a system that suggests and provides the optimal beverage based on the user's emotions and preferences. The system includes a terminal with a user interface, a server responsible for data analysis, and means for generating the beverage. Furthermore, emotion recognition means are incorporated to accurately analyze the emotional state.
[0364] The terminal receives input from the user (text or voice) and, in the case of voice input, automatically converts it into text data. Automatic speech recognition software can be used for this conversion. The converted text data is then sent to the server.
[0365] The server analyzes the received data using the text analysis means to identify the user's state and preferences. A natural language processing library is used for the analysis. Furthermore, the emotion recognition means identifies the emotional state from the user's input. Based on the information obtained through this process, the server uses the candidate generation means to generate the most suitable beverage candidates.
[0366] The generated beverage candidates are sent back to the terminal and visually displayed on the user interface. The user can select their desired beverage from the displayed options. This selection is sent to the server, and the beverage is prepared through the beverage production system. Automated cooking robots may be used in this process.
[0367] As a concrete example, consider a scenario where a user types "I want to lift my spirits a little" into their device. The server then uses natural language processing and emotion recognition to present options such as "Lemon Mint Sparkle" or "Ginger Blend Tea." The selected beverage is quickly prepared, enhancing the user's experience through its delivery.
[0368] Example prompt for a generative AI model: "Generate a list of beverages to suggest when the user types 'I want to relax.'"
[0369] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0370] Step 1:
[0371] Users input information about their mood and preferences via text or voice through their device. If the input is voice, the device uses speech recognition technology to convert the voice data into text. This converted text data is then sent to the server as output.
[0372] Step 2:
[0373] The server receives text data from the terminal as input. It then uses a natural language processing library to analyze the text data and extract keywords related to the user's state and preferences. This analysis process outputs information about the user's current emotions and preferences.
[0374] Step 3:
[0375] The server uses emotion recognition tools to further examine the analysis results and identify the user's emotion label (e.g., joy, calmness). The input is the analysis result from the previous step, and the server generates emotion data obtained from that input as output.
[0376] Step 4:
[0377] The server takes identified emotion data as input and uses a candidate generation mechanism to generate a list of optimal beverages. It searches the database for relevant beverage information and generates suggestions corresponding to the emotion data as output.
[0378] Step 5:
[0379] The generated list of beverage options is sent to the terminal and visually displayed on the user interface. The user interface provides descriptions and images for each beverage, making it easy for the user to select their desired beverage.
[0380] Step 6:
[0381] Information about the beverage selected by the user is sent from the terminal to the server. The server receives this information as input, generates preparation instructions for the selected beverage as output, and sends them to the terminal.
[0382] Step 7:
[0383] The terminal uses the output received from the server to activate an automated cooking robot and prepare the selected beverage. Once the beverage is ready, it is served to the user, and the process is complete.
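Steps 6 and 7 can be sketched as follows: the server builds a preparation instruction, and the terminal either forwards it to an automated cooking robot or, when no robot is available, displays the ingredients and procedure as described for the earlier flows. The data structure, the function names, and the callable that stands in for the robot interface are assumptions; the recipe reuses the "lime, mint, honey" example quoted earlier in this description.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PreparationInstruction:
    """Hypothetical payload sent from the server to the terminal."""
    beverage: str
    ingredients: tuple[str, ...]
    steps: tuple[str, ...]


# Hypothetical recipe store held by the server.
RECIPES = {
    "lime mint smoothie": PreparationInstruction(
        beverage="lime mint smoothie",
        ingredients=("lime", "mint", "honey"),
        steps=("Mix the ingredients.",),
    ),
}


def build_instruction(beverage: str) -> PreparationInstruction:
    """Server side: look up the preparation instruction for a selection."""
    return RECIPES[beverage]


def handle_instruction(
    instruction: PreparationInstruction,
    robot: Optional[Callable[[PreparationInstruction], None]] = None,
) -> str:
    """Terminal side: signal the robot if present, else return display text."""
    if robot is not None:
        robot(instruction)  # assumed robot interface: any callable
        return f"Preparing {instruction.beverage} automatically."
    ingredients = ", ".join(instruction.ingredients)
    procedure = " ".join(instruction.steps)
    return f"Required ingredients: {ingredients}; Preparation procedure: {procedure}"
```

Passing the robot as an optional callable keeps the terminal logic identical whether or not cooking hardware is attached.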
[0384] (Application Example 2)
[0385] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the smart glasses 214 as the "terminal".
[0386] Providing users with the most suitable beverages quickly and accurately, tailored to their emotions and preferences, is crucial for increasing user satisfaction. However, conventional technology has struggled to accurately analyze user emotions and provide appropriate beverages based on that analysis. Furthermore, features for updating selection history and trend information in real time and sharing information with other users were limited, resulting in a lack of a sense of unity and shared experience in the user experience.
[0387] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0388] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; a beverage production means that works in conjunction with a home appliance to prepare and serve the beverage selected by the user from the suggested candidates; and a history management means that records and analyzes the user's selection history to update popular trend information in real time and visually display it on the user interface. This makes it possible to suggest and serve the optimal beverage based on the user's emotions and to promote communication through selection history and trend information.
[0389] "A natural language processing means that analyzes user input information to identify preferences and emotions" refers to data processing technology that analyzes voice or text data provided by users to identify individual preferences and moods.
[0390] "A candidate generation means that suggests appropriate beverages based on identified preferences and emotions" is a technology that generates and presents beverage options that match the user's mood at the time, based on an analysis of the user's emotions.
[0391] "A beverage production means that works in conjunction with a home appliance" is a means for preparing and serving a beverage selected by the user using automated household equipment.
[0392] "A history management means that records and analyzes a user's selection history to update popular trend information in real time and visually display it on the user interface" is a technology for accumulating and analyzing a user's past selection data and generating and displaying the latest trend information.
[0393] "Information exchange and provision means" refers to a means of exchanging evaluations and opinions about beverages with other users and communicating with them.
[0394] "Voice analysis means" refers to a technology that instantly converts a user's voice input into text data and analyzes their emotional state.
[0395] The system for realizing this invention has a structure in which the user, server, and terminal communicate. A specific embodiment thereof is described below.
[0396] The server receives voice or text data sent by the user and analyzes the input information using natural language processing techniques. This process utilizes the Google Cloud Natural Language API to identify the user's preferences and emotions.
[0397] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the identified sentiment data. In this process, the sentiment analysis results from Amazon Comprehend are utilized to generate beverage options, which are then presented to the user.
[0398] When a user selects a beverage, the server stores the user's selection information in Firebase and coordinates with home appliances to produce the beverage. Specifically, it controls home robots and smart appliances to prepare and serve the desired beverage.
[0399] Furthermore, the server analyzes the user's selection history and updates popular trend information in real time. This information is visually displayed in the user interface, and users can share their evaluations and opinions with other users through information exchange and provision mechanisms.
[0400] For example, if a user tells the device by voice, "I want to relax a little today," the system converts the voice data into text and analyzes the user's desire to relax using Amazon Comprehend. The server then suggests beverages such as "chamomile tea" or "lavender latte," which the user selects. After the selection, a robot automatically prepares the beverage and serves it to the user. In addition, trending information for the day is displayed in real time, and users can also view ratings and opinions from other users.
[0401] An example of a prompt is, "What beverage would you recommend for someone who feels like this: 'I want to relax'?"
[0402] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0403] Step 1:
[0404] The user enters their beverage request into the device via voice or text. If the input is voice, the voice data is converted into text data within the device using the Google Cloud Speech-to-Text service. The output of this step is the text data.
[0405] Step 2:
[0406] The device sends text data to the server. The server uses this text data as input and performs natural language processing using the Google Cloud Natural Language API to identify the user's preferences and emotions. This step generates sentiment analysis results and user preference data.
[0407] Step 3:
[0408] Based on the sentiment analysis results, the server uses Amazon Comprehend to generate beverage suggestions that match the user's mood. These suggestions are sent to the device and displayed as beverage options on the user interface. This step outputs a list of suggestions.
[0409] Step 4:
[0410] The user selects a beverage from the displayed options. The selected beverage data is sent from the terminal to the server. At this time, the user's selection information is recorded.
[0411] Step 5:
[0412] The server receives the user's selection information and stores that data in Firebase. The stored data is used for history management and generating trend information. In this step, the selection history data is updated.
[0413] Step 6:
[0414] The server sends control commands to home appliances to prepare the selected beverage. Home robots and smart appliances work together to prepare the selected beverage and serve it to the user. At this point, the finished beverage is dispensed and served.
[0415] Step 7:
[0416] The server analyzes the selection history in real time to update the trend information, and the updated trend information is displayed visually in the user interface. Furthermore, the server facilitates the sharing of evaluations and opinions about beverages to enable information exchange with other users. In this step, updated trend information and a sharing interface are provided.
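The real-time trend update in Step 7 can be sketched as a count of selections within a recent time window. The description stores the selection history in Firebase; this sketch keeps it in memory instead, and the class name, the window length, and the assumption that selections arrive in non-decreasing timestamp order are all introduced for illustration.

```python
from collections import Counter, deque


class TrendTracker:
    """Counts beverage selections inside a sliding time window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # (timestamp, beverage), oldest first

    def record(self, beverage, timestamp):
        # Assumes timestamps are recorded in non-decreasing order.
        self.events.append((timestamp, beverage))

    def trending(self, now, top_n=3):
        # Drop selections that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()
        counts = Counter(beverage for _, beverage in self.events)
        # Most-selected first; ties are broken alphabetically.
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

A windowed count is only one possible reading of "popular trend information"; a decayed score or a per-day ranking would fit the description equally well.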
[0417] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0418] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
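The pairing of a prompt containing instructions with inference data can be sketched as follows for the text-only case. No call to ChatGPT, Gemini, or any other model is made here; the structure, the prompt wording, the one-name-per-line output convention, and the function names are assumptions introduced only to show how the two inputs might be kept separate and how a text result might be turned into a candidate list.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInput:
    """Hypothetical container for what is passed to the generative model."""
    prompt: str          # the instruction
    inference_data: str  # the user's text (audio or images are not covered)


def build_model_input(user_text: str, max_items: int = 3) -> ModelInput:
    """Combine a fixed instruction with the user's text input."""
    prompt = (
        f"Suggest up to {max_items} beverages that fit the user's mood and "
        "preferences. Answer with one beverage name per line."
    )
    return ModelInput(prompt=prompt, inference_data=user_text.strip())


def parse_suggestions(model_output: str, max_items: int = 3) -> list[str]:
    """Turn a line-per-item text answer into a de-duplicated list of names."""
    names: list[str] = []
    for line in model_output.splitlines():
        # Strip list markers such as "1. " or "- " from the start of the line.
        name = line.strip().lstrip("-*0123456789. ").strip()
        if name and name not in names:
            names.append(name)
    return names[:max_items]
```

Keeping the instruction and the inference data in separate fields mirrors the distinction drawn in the paragraph above and avoids mixing user text into the instruction itself.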
[0419] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0420] [Third Embodiment]
[0421] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0422] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0423] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0424] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0425] The microphone 238 receives instructions from the user 20 by receiving the voice uttered by the user 20. The microphone 238 captures the voice uttered by the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0426] Camera 42 is a small digital camera equipped with an optical system including a lens, an aperture, and a shutter, and with an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor. It captures images of the area around the user 20 (for example, an imaging range defined by an angle of view equivalent to the field of vision of a typical healthy person).
[0427] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0428] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0429] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0430] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0431] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0432] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0433] This invention is a system built to provide a personalized beverage experience according to the user's mood and preferences. The main components of the system include a terminal with a user interface, a server for data analysis and processing, and means for creating and serving beverages.
[0434] First, the user connects to the system using a device and inputs information about their mood and preferences. The device receives this information as voice or text and sends it to the server. If it is voice data, the server automatically converts it to text and then uses natural language processing technology to analyze the user's emotions and preferences in detail.
[0435] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the analysis results. The server searches a database and generates a list of beverage candidates that may satisfy the user's mood. These candidates are visually displayed on the terminal's screen and offered to the user as options.
[0436] When a user selects a beverage, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. Based on this information, the terminal details the beverage ingredients and preparation procedure to the user. In particular, if an automated cooking robot is installed, the selected beverage is prepared and served automatically.
[0437] Furthermore, the server records the user's selection history and analyzes this data in real time to update the popularity rankings. The rankings are displayed on the terminal as a reference for understanding trends. The system also provides a community function, allowing users to share their ratings and opinions on beverages with other users. In this way, the invention can provide users with a comprehensive and personalized beverage experience.
[0438] For example, if a user enters "I'm feeling restless and would like a relaxing drink" into their device, the server will suggest options such as "lavender lemonade" or "chamomile tea." The beverage is then prepared and served according to the user's selection, and the popularity ranking is updated. In this way, a beverage experience tailored to the user's current state of mind can be easily provided.
[0439] The following describes the processing flow.
[0440] Step 1:
[0441] The user connects to the device and opens the free chat screen or voice input screen. The device enters input waiting mode and prompts the user to enter information about their mood or preferences.
[0442] Step 2:
[0443] The user inputs information such as "I want to relax" via text or voice. The terminal receives the input and, if the input is voice, automatically converts it into text data.
[0444] Step 3:
[0445] The terminal sends user input to the server. The server applies natural language processing techniques to analyze the received text data.
[0446] Step 4:
[0447] The server identifies the user's emotions and preferences from the analyzed information. Based on this identification, it searches the database for relevant beverage candidates.
[0448] Step 5:
[0449] The server generates a list of beverage suggestions that match the user's mood based on the search results. This list is sent to the terminal and displayed on its screen.
[0450] Step 6:
[0451] The user selects their desired beverage from the options displayed on the terminal. The terminal reports the user's selection to the server.
[0452] Step 7:
[0453] The server sends the recipe and preparation instructions for the selected beverage to the terminal. The terminal displays this information and instructs the user on the preparation steps.
[0454] Step 8:
[0455] If automated cooking equipment is available, the terminal issues a beverage preparation command to the equipment, initiating the automated process. The user then receives the prepared beverage.
[0456] Step 9:
[0457] The server records the user's selection history and updates the popularity rankings based on that data. This ranking information is provided to the terminal, allowing users to view the latest beverage trends.
[0458] Step 10:
[0459] Users can access the community via their devices and share ratings and opinions about beverages with other users. The server manages and displays this information.
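As a minimal sketch of Steps 3 to 5 above, the analysis and candidate generation could be approximated with a simple keyword lookup. The keyword table, the mood tags, and the function names below are hypothetical and stand in for the natural language processing and the beverage database, whose details are not specified here.

```python
# Hypothetical keyword table: maps words in the user's text to a mood tag.
MOOD_KEYWORDS = {
    "relax": "relaxing",
    "relaxing": "relaxing",
    "restless": "relaxing",
    "refresh": "refreshing",
    "refreshing": "refreshing",
    "tired": "refreshing",
}

# Hypothetical table standing in for the beverage database.
BEVERAGES_BY_MOOD = {
    "relaxing": ["lavender lemonade", "chamomile tea"],
    "refreshing": ["mint lemonade", "green smoothie"],
}


def identify_moods(text: str) -> list[str]:
    """Return the mood tags found in the text, in order of first appearance."""
    moods: list[str] = []
    for raw in text.lower().split():
        word = raw.strip(".,!?\"'")  # drop punctuation at the word edges
        mood = MOOD_KEYWORDS.get(word)
        if mood is not None and mood not in moods:
            moods.append(mood)
    return moods


def suggest_beverages(text: str) -> list[str]:
    """Build the candidate list that the server sends to the terminal."""
    candidates: list[str] = []
    for mood in identify_moods(text):
        candidates.extend(BEVERAGES_BY_MOOD.get(mood, []))
    return candidates
```

A production system would presumably replace the keyword table with a trained language model; the lookup is used here only to make the data flow between the steps concrete.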
[0460] (Example 1)
[0461] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0462] In modern society, there is a demand for personalized experiences based on the diverse preferences and emotions of users. However, conventional beverage serving systems have difficulty accurately suggesting beverages that reflect users' preferences and moods at any given time. Furthermore, there is a lack of mechanisms to encourage interaction among users, which hinders the formation of communities.
[0463] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0464] In this invention, the server includes a language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; and a beverage production means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to suggest and provide beverages that are tailored to the individual user's state and preferences. Furthermore, users interact with one another by sharing evaluations and comments, which promotes community building.
[0465] "Language processing means" refers to a means for analyzing user input information to identify the user's preferences and emotions.
[0466] "Candidate generation means" refers to a means for suggesting appropriate beverages based on the identified preferences and emotions of the user.
[0467] "Beverage production means" refers to a means for preparing and providing the beverage selected by the user.
[0468] "History management means" refers to a means for recording and analyzing a user's selection history to update the popularity rankings.
[0469] "Display means" refers to a means of presenting information to the user visually.
[0470] "Input assistance means" refers to means for supporting voice or text input.
[0471] This invention is a method for implementing a system designed to provide personalized beverages tailored to the user's preferences and emotions. The invention comprises a terminal with a user interface, a server for data processing, and means for creating and serving beverages.
[0472] First, the user connects to the system by operating the terminal and inputs information about their mood and preferences via voice or text. The terminal receives this information and sends it to the server over the network. The server may convert voice data into text, for example, by using a speech recognition API. The server then uses a natural language processing library to analyze the user's emotions and preferences.
[0473] Based on the analysis results, the server generates beverage candidates. Using a database management system, it efficiently searches for beverages that match the user's preferences and generates a list. The generated candidates are sent to the terminal and presented to the user as options.
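The database search described above can be illustrated with Python's built-in sqlite3 module. This is only a sketch: the `beverages` table, its `mood` column, and the sample rows are assumptions made for illustration, and the actual schema and database management system are not specified in this disclosure.

```python
import sqlite3


def create_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical beverages table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE beverages (id INTEGER PRIMARY KEY, name TEXT, mood TEXT)"
    )
    conn.executemany(
        "INSERT INTO beverages (name, mood) VALUES (?, ?)",
        [
            ("lime mint smoothie", "refreshing"),
            ("lavender lemonade", "relaxing"),
            ("chamomile tea", "relaxing"),
        ],
    )
    return conn


def find_candidates(conn: sqlite3.Connection, mood: str) -> list[str]:
    """Return the names of beverages tagged with the given mood, ordered by name."""
    rows = conn.execute(
        # A parameterized query keeps user-derived values out of the SQL text.
        "SELECT name FROM beverages WHERE mood = ? ORDER BY name",
        (mood,),
    ).fetchall()
    return [name for (name,) in rows]
```

For example, `find_candidates(create_demo_db(), "relaxing")` returns the two beverages tagged as relaxing in the sample data.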
[0474] When a user selects a beverage, the server records the selection and sends instructions to the terminal for the beverage preparation device. Based on this information, the terminal guides the user with the necessary ingredients and procedures. If an automated preparation device is available, the selected beverage is automatically prepared and served.
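The branch between guiding the user and starting an automated preparation device might be sketched as follows. The `PreparationInstructions` structure, the `RecordingRobot` stand-in, and the `prepare` method are hypothetical; the interface of an actual preparation device is not described here.

```python
from dataclasses import dataclass


@dataclass
class PreparationInstructions:
    """Hypothetical payload the server sends to the terminal."""

    beverage: str
    ingredients: list[str]
    steps: list[str]


class RecordingRobot:
    """Stand-in for an automated preparation device; it only records commands."""

    def __init__(self) -> None:
        self.commands: list[str] = []

    def prepare(self, instructions: PreparationInstructions) -> None:
        self.commands.append(instructions.beverage)


def handle_instructions(instructions, robot=None) -> str:
    """Start the device if one is available; otherwise return guidance text."""
    if robot is not None:
        robot.prepare(instructions)
        return f"Preparing {instructions.beverage} automatically."
    ingredients = ", ".join(instructions.ingredients)
    steps = "; ".join(instructions.steps)
    return f"Required ingredients: {ingredients}; Preparation steps: {steps}"
```

Passing the device as an optional argument keeps the terminal logic identical in both cases, with only the final action differing.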
[0475] For example, a user might type "I'm feeling restless, so I'd like a relaxing drink" into the terminal. The server analyzes this prompt and suggests options such as "lavender lemonade" or "chamomile tea." Based on the user's selection, the beverage is prepared, and in some cases, the popularity ranking is updated.
[0476] In this way, the invention can be implemented as a system that provides a beverage experience tailored to the needs of individual users and promotes interaction among users.
[0477] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0478] Step 1:
[0479] The user uses a device to input information about their mood and preferences via voice or text. Upon receiving the input data, the device prepares the data according to the input format and sends it to the server. An internet connection is used for sending data to the server. A specific example of this process would be the user typing "I want to refresh" into a text box and clicking the submit button.
[0480] Step 2:
[0481] The server validates the data received from the terminal. If audio data is sent, it is converted to text using a speech recognition API. Based on this conversion, a natural language processing library is used to analyze the text data and extract the user's emotions and preferences. The input is audio or text data, and the output is user emotion and preference information based on the analysis.
[0482] Step 3:
[0483] The server uses the analysis results to generate beverage candidates. It executes appropriate queries against the database management system to retrieve beverages that match the user's mood and preferences. For example, it uses SQL to retrieve beverages related to "refreshment" and generates candidates such as "lime mint smoothie." The input is the user's mood and preference information, and the output is a list of beverage candidates.
[0484] Step 4:
[0485] The server sends the generated beverage candidates to the terminal, which then visually presents them to the user. The user interface displays the candidates in an easy-to-select format, allowing the user to choose. Specifically, the terminal displays the beverage names and brief descriptions on the screen in a list or card format. The input is candidate data from the server, and the output is the visual display presented to the user.
[0486] Step 5:
[0487] The user selects their desired beverage from the presented options. The terminal sends this selection back to the server. The server records the information about the selected beverage and sends instructions to the terminal for the beverage production process. For example, the user selects "Lime Mint Smoothie," and the terminal notifies the server of this selection. The input is the user's selection, and the output is the selection notification to the server.
[0488] Step 6:
[0489] The terminal receives instructions from the server and displays the beverage ingredients and preparation steps to the user. If an automated cooking device is available, the terminal sends a signal to the device, and the selected beverage is automatically prepared. A specific example of the displayed content is "Required ingredients: lime, mint, honey; Preparation steps: mix the ingredients." The input is the preparation instructions from the server, and the output is the display to the user or a signal to the cooking device.
[0490] Step 7:
[0491] The server stores the user's selection history in a database and analyzes and updates the popularity ranking. The updated ranking is sent to the terminal in real time and presented to the user as the latest information. Specifically, the server records data with a command such as "INSERT INTO history (user_id, drink_id) VALUES (...)" and updates the ranking using an analysis algorithm. The input is the selection history, and the output is the ranking information.
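The recording and ranking in Step 7 can be sketched with sqlite3, reusing the `history (user_id, drink_id)` columns from the INSERT command quoted above. The ranking query is an assumption: the "analysis algorithm" is not specified, so a simple count of selections per beverage is used here for illustration.

```python
import sqlite3


def create_history_db() -> sqlite3.Connection:
    """Create an in-memory database holding the selection history."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE history (user_id INTEGER, drink_id INTEGER)")
    return conn


def record_selection(conn: sqlite3.Connection, user_id: int, drink_id: int) -> None:
    """Store one selection, mirroring the INSERT command in the description."""
    conn.execute(
        "INSERT INTO history (user_id, drink_id) VALUES (?, ?)",
        (user_id, drink_id),
    )


def popularity_ranking(conn: sqlite3.Connection, limit: int = 10) -> list[tuple[int, int]]:
    """Return (drink_id, selection count) pairs, most selected first.

    Ties are broken by ascending drink_id so that the order is deterministic.
    """
    return conn.execute(
        "SELECT drink_id, COUNT(*) AS picks FROM history "
        "GROUP BY drink_id ORDER BY picks DESC, drink_id ASC LIMIT ?",
        (limit,),
    ).fetchall()
```

Recomputing the ranking from the history table on each request is the simplest choice; a deployment with heavy traffic might instead maintain counters incrementally.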
[0492] (Application Example 1)
[0493] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0494] In modern urban environments, there is a need for personalized beverage delivery systems that meet the diverse needs of residents. In particular, providing optimal beverages based on individual moods and preferences, and fostering interaction among residents, presents a challenging task. Furthermore, real-time beverage delivery is essential, providing an efficient and seamless experience.
[0495] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0496] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions, a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions, and a communication means that cooperates with beverage supply facilities in a smart city environment to provide beverages to the user in real time. This enables the immediate provision of optimal beverages tailored to the individual needs of the user, facilitates interaction among residents, and makes the beverage experience efficient.
[0497] "Natural language processing means" refers to technologies that analyze user input information and identify the user's preferences and emotions from that information.
[0498] "Candidate generation means" refers to a means for extracting and processing the information necessary to suggest the most suitable beverage to a user based on their identified preferences and emotions.
[0499] "Beverage production means" refers to a series of processes for preparing and specifically providing the beverage selected by the user.
[0500] "History management means" refers to a means for recording a user's selection history and using that data to analyze and update the popularity rankings.
[0501] "Communication means" refers to the interfaces and protocols necessary for beverage supply equipment and servers to cooperate within a smart city environment.
[0502] This invention is a system for providing personalized beverages tailored to user needs in a smart city. The system mainly consists of a server, a terminal with a user interface, and beverage supply equipment within the smart city environment.
[0503] The server uses natural language processing technology to analyze voice or text input from the user to identify their preferences and emotions. To do this, it converts voice data to text using the Google Cloud Speech-to-Text API and analyzes the text data using the Google Cloud Natural Language API. Based on the analysis results, the server selects beverages from its database that match the user's preferences and displays them as options on the device.
[0504] Users select a beverage from a list of options displayed on their device. This information is sent to the server, and the selected beverage is recorded in a history management system and used to update the popularity ranking among residents. The selection is then communicated to vending machines and cafes within the smart city, and the beverage supply equipment prepares the beverage in real time.
[0505] For example, if a user enters a request into the terminal saying, "I'm tired today, so I'd like a refreshing drink," the server will suggest options such as "mint lemonade" or "green smoothie." Depending on the user's selection, the chosen beverage will be immediately served at a cafe within the smart city, and this will also be reflected in the rankings.
[0506] An example of a prompt for this system would be: "Describe the process of suggesting the most suitable beverage when a user says they want to refresh themselves, and then providing that beverage immediately within the city."
[0507] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0508] Step 1:
[0509] Users use their devices to input information about their mood and preferences via voice or text. If the input data is voice, the Google Cloud Speech-to-Text API is used to convert the voice to text and send it to the server as text data.
[0510] Step 2:
[0511] The server analyzes the received text data using the Google Cloud Natural Language API. This analysis extracts keywords and sentiments to identify the user's preferences and state. The input data is text, and the output is the analysis result, which identifies the user's preferences and sentiments.
[0512] Step 3:
[0513] The server searches a beverage database for a suitable beverage based on identified preferences and emotions. It selects multiple beverage candidates using a candidate generation mechanism. The input is analytical information based on the user's preferences, and the output is a list of beverage candidates.
[0514] Step 4:
[0515] The server sends a list of options to the terminal. The terminal visually displays this list in its user interface, offering the entries to the user as options to select from. The input is the beverage option list, and the output is the visual presentation to the user.
[0516] Step 5:
[0517] The user selects a beverage from the displayed beverage options. The terminal sends the user's selection information to the server, along with the ID of the selected beverage as data. The input is the user's selection, and the output is the transmission of the beverage ID based on that selection.
[0518] Step 6:
[0519] The server records the received beverage IDs using a history management system and updates the popularity ranking. Simultaneously, it transmits this information to beverage supply equipment within the smart city, instructing it to begin preparation. The input is the beverage ID, and the output is the history update and supply instruction.
[0520] Step 7:
[0521] Beverage supply systems within smart cities accurately prepare and provide beverages to users based on received information. Users can pick up their beverages at designated locations. The input is a supply instruction, and the output is the preparation and provision of beverages.
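Steps 5 and 6 above, in which the server records the beverage ID and issues a supply instruction, could look roughly like the following. The message fields, the JSON encoding, and the class names are hypothetical; the actual interface to beverage supply equipment in a smart city is not defined in this disclosure.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass
class SupplyInstruction:
    """Hypothetical message sent to beverage supply equipment."""

    beverage_id: str
    user_id: str
    pickup_location: str


class SelectionHandler:
    """Server-side handling: record the selection, then build the instruction."""

    def __init__(self) -> None:
        # Per-beverage counts stand in for the history management system.
        self.selection_counts = Counter()

    def handle_selection(self, user_id: str, beverage_id: str, pickup_location: str) -> str:
        self.selection_counts[beverage_id] += 1
        instruction = SupplyInstruction(beverage_id, user_id, pickup_location)
        # Serialize to JSON; the transport to the equipment is out of scope here.
        return json.dumps(asdict(instruction), sort_keys=True)

    def ranking(self) -> list[tuple[str, int]]:
        """Return (beverage ID, count) pairs, most selected first."""
        return self.selection_counts.most_common()
```

Recording the selection before building the instruction mirrors the order given in Step 6, where the history update and the supply instruction are both outputs of the same input.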
[0522] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0523] This invention relates to a technology that improves the accuracy of a system that suggests and provides the optimal beverage based on the user's mood and preferences by using emotion recognition. In addition to a terminal equipped with a user interface, a server for data analysis and processing, and means for generating and providing beverages, the system incorporates an emotion engine.
[0524] First, the user begins inputting information through the device. The user inputs information about their mood and preferences in text or voice. The device receives this, automatically converts the voice data into text, and sends it to the server. The server not only utilizes natural language processing technology to analyze the input data, but also identifies the user's emotions using an emotion engine.
[0525] Next, the server uses a candidate generation mechanism to suggest a beverage that best matches the user's mood, based on emotional data obtained from the emotion engine. Emotional data plays a particularly important role in evaluating the candidates, improving the accuracy of the suggestions. The generated candidates are sent to the terminal and visually displayed on the user interface.
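One way emotional data could enter the evaluation of candidates is as a weighted score. The sketch below assumes that the emotion engine outputs an intensity between 0 and 1 for each emotion tag and that each beverage carries a hypothetical affinity for those tags; neither representation is specified in this disclosure.

```python
def rank_candidates(candidates: dict, emotion_scores: dict, top_n: int = 3) -> list[str]:
    """Rank beverages by how well they match the estimated emotions.

    candidates: beverage name -> {emotion tag: affinity between 0 and 1}
    emotion_scores: emotion tag -> intensity between 0 and 1 (assumed engine output)
    """
    scored = []
    for name, affinities in candidates.items():
        # Sum of intensity times affinity over all estimated emotions.
        score = sum(
            intensity * affinities.get(emotion, 0.0)
            for emotion, intensity in emotion_scores.items()
        )
        scored.append((score, name))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_n]]
```

With this scoring, a beverage that strongly matches the dominant emotion is placed ahead of one that weakly matches several emotions, which is one possible reading of "fine-tuning" the suggestions.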
[0526] Once the user selects their desired beverage from the presented options, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. The terminal then provides the user with a detailed recipe and preparation instructions, and, if necessary, activates an automated cooking robot to dispense the beverage.
[0527] Furthermore, the server integrates and records user selection history and sentiment data, updating popularity rankings in real time. This allows users to stay up-to-date on the latest trends. The system also includes a community feature, allowing users to share their ratings and opinions about beverages with other users.
[0528] For example, if a user types "I want to lift my spirits a little" into their device, the server uses natural language processing and an emotion engine to suggest "uplifting drinks" such as "Lemon Mint Sparkle" or "Ginger Blend Tea." Based on the emotion analysis results, the suggestions are fine-tuned, allowing the user to choose a more appropriate beverage. The selected beverage is quickly prepared, providing an experience that is tailored to the user's mood.
[0529] The following describes the processing flow.
[0530] Step 1:
[0531] The user accesses the system from their device and launches either the free chat screen or the voice input screen. The device then prepares to accept user input.
[0532] Step 2:
[0533] The user enters information about their mood in text or voice, for example, "I want to relax." If the input is voice, the device automatically converts it to text.
[0534] Step 3:
[0535] The terminal sends the user's text input data to the server. The server applies natural language processing algorithms to analyze key information related to the user's emotions and preferences from their words.
[0536] Step 4:
[0537] The server passes the analyzed data to the emotion engine to identify more detailed emotional states. This process allows the user's emotions to be recognized as concrete indicators.
[0538] Step 5:
[0539] Based on the output from the emotion engine, the server refers to a beverage database to select the most suitable beverage candidate. The candidate generation system then lists beverages that match the user's emotions.
[0540] Step 6:
[0541] The server sends a list of selected beverage options to the terminal. The terminal displays an interface that visually presents the beverage options to the user and prompts them to make a selection.
[0542] Step 7:
[0543] The user selects their desired beverage from the presented options. The selection information is sent to the server via the device.
[0544] Step 8:
[0545] The server sends the terminal a detailed recipe and preparation instructions for the selected beverage. The terminal displays the ingredients and instructions to the user, providing assistance.
[0546] Step 9:
[0547] If possible, the terminal sends instructions to the automated cooking robot to prepare the beverage, and the beverage is then prepared for the user.
[0548] Step 10:
[0549] The server records user selection history and sentiment data, and updates the popularity rankings. Users can refer to these rankings to see what other users have chosen.
[0550] Step 11:
[0551] Users can access community features through their devices to share ratings and feedback, and exchange opinions with other users. The server oversees this activity and continuously updates community information.
[0552] (Example 2)
[0553] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0554] Existing systems that suggest the optimal beverage based on the user's mood and preferences struggle to accurately analyze the user's complex emotional state. Furthermore, they lack sufficient functionality to integrate past selection history with real-time emotions to provide trend information, resulting in a lack of consistency in the user experience.
[0555] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0556] In this invention, the server includes a text analysis means that analyzes user input information to identify a state and preferences; a candidate generation means that suggests an appropriate beverage based on the identified state and preferences; and a beverage production means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to accurately analyze the user's emotional state and suggest an appropriate beverage based on that analysis.
[0557] "Text analysis means" refers to an information processing system that analyzes user input information to identify the user's state and preferences.
[0558] A "candidate generation means" is a device or system that has the function of selecting appropriate beverage candidates based on specified conditions or preferences.
[0559] "Beverage production means" refers to a series of devices and processes for preparing and serving beverages selected by the user.
[0560] "History management means" refers to a device or system that records and analyzes a user's selection history and updates popularity ratings accordingly.
[0561] "Emotion recognition means" refers to technology that analyzes complex emotional states from user input information and identifies emotion labels.
[0562] This invention is implemented as a system that suggests and provides the optimal beverage based on the user's emotions and preferences. The system includes a terminal with a user interface, a server responsible for data analysis, and means for generating the beverage. Furthermore, emotion recognition means are incorporated to accurately analyze the emotional state.
[0563] The terminal receives input from the user (text or voice) and, in the case of voice input, automatically converts it into text data. Automatic speech recognition software can be used for this conversion. The converted text data is then sent to the server.
[0564] The server analyzes the received data using the text analysis means to identify the user's state and preferences. A natural language processing library is used for the analysis. Furthermore, the emotion recognition means identifies the emotional state from the user's input. Based on the information obtained through this process, the server uses the candidate generation means to generate the most suitable beverage candidates.
[0565] The generated beverage candidates are sent back to the terminal and visually displayed on the user interface. The user can select their desired beverage from the displayed options. This selection is sent to the server, and the beverage is prepared through the beverage production system. Automated cooking robots may be used in this process.
[0566] As a concrete example, consider a scenario where a user types "I want to lift my spirits a little" into their device. The server then uses natural language processing and emotion recognition to present options such as "Lemon Mint Sparkle" or "Ginger Blend Tea." The selected beverage is quickly prepared and served, improving the user's experience.
[0567] Example prompt for a generative AI model: "Generate a list of beverages to suggest when the user types 'I want to relax.'"
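The example prompt above can be assembled from the user's input with a small helper, and the model's reply can be split into beverage names. Both functions are illustrative: the reply format (one beverage per line, optionally bulleted or numbered) is an assumption, and no particular generative AI model or API is implied.

```python
def build_beverage_prompt(user_text: str) -> str:
    """Embed the user's input in the example prompt from the description."""
    cleaned = " ".join(user_text.split())  # collapse stray whitespace and newlines
    return f"Generate a list of beverages to suggest when the user types '{cleaned}'"


def parse_beverage_list(response_text: str) -> list[str]:
    """Split a reply into beverage names, assuming one name per line.

    Leading bullets and numbering are removed; a name that itself starts
    with a digit would lose that digit, so this is only a rough sketch.
    """
    names = []
    for line in response_text.splitlines():
        name = line.strip().lstrip("-*0123456789. ").strip()
        if name:
            names.append(name)
    return names
```

Keeping prompt construction and reply parsing in separate functions allows either to be changed when a different model or reply format is used.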
[0568] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0569] Step 1:
[0570] Users input information about their mood and preferences via text or voice through their device. If the input is voice, the device uses speech recognition technology to convert the voice data into text. This converted text data is then sent to the server as output.
[0571] Step 2:
[0572] The server receives text data from the terminal as input. It then uses a natural language processing library to analyze the text data and extract keywords related to the user's state and preferences. This analysis process outputs information about the user's current emotions and preferences.
[0573] Step 3:
[0574] The server uses the emotion recognition means to further examine the analysis results and identify the user's emotion label (e.g., joy, calmness). The input is the analysis result from the previous step, and the output is the emotion data derived from it.
[0575] Step 4:
[0576] The server takes identified emotion data as input and uses a candidate generation mechanism to generate a list of optimal beverages. It searches the database for relevant beverage information and generates suggestions corresponding to the emotion data as output.
[0577] Step 5:
[0578] The generated list of beverage options is sent to the terminal and visually displayed on the user interface. The user interface provides descriptions and images for each beverage, making it easy for the user to select their desired beverage.
[0579] Step 6:
[0580] Information about the beverage selected by the user is sent from the terminal to the server. The server receives this information as input, generates preparation instructions for the selected beverage as output, and sends them to the terminal.
[0581] Step 7:
[0582] The terminal uses the output received from the server to activate an automated cooking robot and prepare the selected beverage. Once the beverage is ready, it is served to the user, and the process is complete.
[0583] (Application Example 2)
[0584] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0585] Providing users with the most suitable beverages quickly and accurately, tailored to their emotions and preferences, is crucial for increasing user satisfaction. However, conventional technology has struggled to accurately analyze user emotions and provide appropriate beverages based on that analysis. Furthermore, features for updating selection history and trend information in real time and sharing information with other users were limited, resulting in a lack of a sense of unity and shared experience in the user experience.
[0586] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0587] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; a beverage production means that works in conjunction with a home appliance to prepare and serve the beverage selected by the user from the suggested candidates; and a history management means that records and analyzes the user's selection history to update popular trend information in real time and visually display it on the user interface. This makes it possible to suggest and serve the optimal beverage based on the user's emotions and to promote communication through selection history and trend information.
[0588] "Natural language processing means that analyze user input information to identify preferences and emotions" refers to data processing technology that analyzes voice or text data provided by users to identify individual preferences and moods.
[0589] "Candidate generation means for suggesting appropriate beverages based on identified preferences and emotions" refers to a technology that generates and presents beverage options that match the user's mood at the time, based on an analysis of the user's emotions.
[0590] "Beverage production means that works in conjunction with a home appliance" refers to a means for preparing and serving a beverage selected by the user using automated household equipment.
[0591] "A history management means that records and analyzes a user's selection history to update popular trend information in real time and visually display it on the user interface" is a technology for accumulating and analyzing a user's past selection data and generating and displaying the latest trend information.
[0592] "Information exchange and provision means" refers to a means of exchanging evaluations and opinions about beverages with other users and communicating with them.
[0593] "Voice analysis means" refers to a technology that instantly converts a user's voice input into text data and analyzes their emotional state.
[0594] The system for realizing this invention has a structure in which the user, server, and terminal communicate. A specific embodiment thereof is described below.
[0595] The server receives voice or text data sent by the user and analyzes the input information using natural language processing techniques. This process utilizes the Google Cloud Natural Language API to identify the user's preferences and emotions.
[0596] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the identified sentiment data. In this process, the sentiment analysis results from Amazon Comprehend are utilized to generate beverage options, which are then presented to the user.
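How sentiment analysis results might be turned into beverage options can be sketched as below. The input dictionary is assumed to follow the shape of an Amazon Comprehend sentiment detection response (a `Sentiment` label plus a `SentimentScore` map), which should be confirmed against the service documentation; the mapping from sentiment labels to beverages and the confidence threshold are purely hypothetical.

```python
# Hypothetical mapping from a sentiment label to beverage options.
SUGGESTIONS_BY_SENTIMENT = {
    "NEGATIVE": ["chamomile tea", "lavender latte"],
    "POSITIVE": ["Lemon Mint Sparkle", "Ginger Blend Tea"],
    "MIXED": ["mint lemonade", "chamomile tea"],
    "NEUTRAL": ["mint lemonade", "green smoothie"],
}


def suggest_from_sentiment(result: dict, min_confidence: float = 0.5) -> list[str]:
    """Choose beverage options from a sentiment analysis result.

    result is assumed to look like:
    {"Sentiment": "NEGATIVE", "SentimentScore": {"Negative": 0.8, ...}}
    """
    label = result["Sentiment"]
    # Score keys are assumed to be capitalized forms of the label.
    confidence = result["SentimentScore"].get(label.capitalize(), 0.0)
    if confidence < min_confidence:
        label = "NEUTRAL"  # fall back when the analysis is not confident
    return SUGGESTIONS_BY_SENTIMENT.get(label, SUGGESTIONS_BY_SENTIMENT["NEUTRAL"])
```

The function takes a plain dictionary rather than calling the service directly, so the candidate generation logic can be exercised without network access or credentials.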
[0597] When a user selects a beverage, the server stores the user's selection information in Firebase and coordinates with home appliances to produce the beverage. Specifically, it controls home robots and smart appliances to prepare and serve the desired beverage.
[0598] Furthermore, the server analyzes the user's selection history and updates popular trend information in real time. This information is visually displayed in the user interface, and users can share their evaluations and opinions with other users through information exchange and provision mechanisms.
[0599] For example, if a user tells the device by voice, "I want to relax a little today," the system converts the voice data into text and analyzes the user's desire to relax using Amazon Comprehend. The server then suggests beverages such as "chamomile tea" or "lavender latte," which the user selects. After the selection, a robot automatically prepares the beverage and serves it to the user. In addition, trending information for the day is displayed in real time, and users can also view ratings and opinions from other users.
[0600] An example of a prompt is, "What beverage would you recommend drinking when you feel like this: when you want to relax?"
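The prompt in the preceding paragraph can be read as a fixed template with the user's stated feeling filled in. The following is a minimal sketch under that assumption; the names PROMPT_TEMPLATE and build_prompt are hypothetical and do not appear in the disclosure.

```python
# Hypothetical template modeled on the example prompt in the text.
PROMPT_TEMPLATE = (
    "What beverage would you recommend drinking when you feel like this: {feeling}?"
)


def build_prompt(feeling: str) -> str:
    """Insert the user's stated feeling into the prompt template."""
    # Trim whitespace and trailing sentence punctuation so the template
    # supplies the only closing question mark.
    cleaned = feeling.strip().rstrip(".!?")
    if not cleaned:
        raise ValueError("feeling must not be empty")
    return PROMPT_TEMPLATE.format(feeling=cleaned)


prompt = build_prompt("when you want to relax")
print(prompt)
```

The resulting string would then be passed to whichever generative model the server uses; that call is outside the scope of this sketch.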
[0601] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0602] Step 1:
[0603] The user enters their beverage request into the device via voice or text. The entered voice data is converted into text data within the device using the Google Cloud Speech-to-Text service. The output of this conversion process is the text data.
[0604] Step 2:
[0605] The device sends text data to the server. The server uses this text data as input and performs natural language processing using the Google Cloud Natural Language API to identify the user's preferences and emotions. This step generates sentiment analysis results and user preference data.
[0606] Step 3:
[0607] Based on the sentiment analysis results, including those obtained with Amazon Comprehend, the server generates beverage suggestions that match the user's mood. These suggestions are sent to the device and displayed as beverage options on the user interface. This step outputs a list of suggestions.
[0608] Step 4:
[0609] The user selects a beverage from the displayed options. The selected beverage data is sent from the terminal to the server. At this time, the user's selection information is recorded.
[0610] Step 5:
[0611] The server receives the user's selection information and stores that data in Firebase. The stored data is used for history management and generating trend information. In this step, the selection history data is updated.
[0612] Step 6:
[0613] The server sends control commands to home appliances to prepare the selected beverage. Home robots and smart appliances work together to prepare the selected beverage and serve it to the user. At this point, the finished beverage is dispensed and served.
[0614] Step 7:
[0615] The server analyzes trend information in real time based on selection history and trend data, and displays it visually in the user interface. Furthermore, it facilitates the sharing of evaluations and opinions about beverages to enable information exchange with other users. In this step, updated trend information and a sharing interface are provided.
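One possible reading of the trend update in Step 7 is a count of selections within a recent time window. The sketch below assumes a 24-hour window and an in-memory list of (timestamp, beverage) pairs; the function name trending, the window length, and the sample data are all illustrative assumptions, and the disclosure does not specify how trends are computed.

```python
from collections import Counter
from datetime import datetime, timedelta


def trending(history, now, window=timedelta(hours=24), top_n=3):
    """Return the top_n (beverage, count) pairs selected within `window` before `now`."""
    cutoff = now - window
    counts = Counter(drink for ts, drink in history if cutoff <= ts <= now)
    # Sort by count (descending), then by name so ties are ordered deterministically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


now = datetime(2024, 12, 12, 18, 0)
history = [
    (now - timedelta(hours=1), "chamomile tea"),
    (now - timedelta(hours=2), "lavender latte"),
    (now - timedelta(hours=3), "chamomile tea"),
    (now - timedelta(days=2), "lavender latte"),  # outside the 24-hour window
]
todays_trend = trending(history, now)
print(todays_trend)
```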
[0616] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and the display 343 to output the result of the specific processing. The microphone 238 acquires audio representing the user's input in response to the result of the specific processing. The control unit 46A transmits the audio data acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0617] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0618] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0619] [Fourth Embodiment]
[0620] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0621] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0622] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0623] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0624] The microphone 238 accepts instructions from the user 20 by receiving the speech of the user 20. The microphone 238 captures the speech of the user 20, converts the captured speech into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.
[0625] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0626] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0627] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
[0628] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0629] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0630] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0631] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0632] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0633] This invention is a system built to provide a personalized beverage experience according to the user's mood and preferences. The main components of the system include a terminal with a user interface, a server for data analysis and processing, and means for creating and serving beverages.
[0634] First, the user connects to the system using a device and inputs information about their mood and preferences. The device receives this information as voice or text and sends it to the server. If it is voice data, the server automatically converts it to text and then uses natural language processing technology to analyze the user's emotions and preferences in detail.
[0635] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the analysis results. The server searches a database and generates a list of beverage candidates that may satisfy the user's mood. These candidates are visually displayed on the terminal's screen and offered to the user as options.
[0636] When a user selects a beverage, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. Based on this information, the terminal details the beverage ingredients and preparation procedure to the user. In particular, if an automated cooking robot is installed, the selected beverage is prepared and served automatically.
[0637] Furthermore, the server records the user's selection history and analyzes this data in real time to update the popularity rankings. This ranking is displayed on the user's device as a reference for understanding trends. The system also provides a community function, allowing users to share their ratings and opinions on beverages with other users. In this way, the invention can provide users with a comprehensive and personalized beverage experience.
[0638] For example, if a user enters "I'm feeling restless and would like a relaxing drink" into their device, the server will suggest options such as "lavender lemonade" or "chamomile tea." The beverage is then prepared and served according to the user's selection, and the popularity ranking is updated. In this way, a beverage experience tailored to the user's current state of mind can be easily provided.
[0639] The following describes the processing flow.
[0640] Step 1:
[0641] The user connects to the device and opens the free chat screen or voice input screen. The device enters input waiting mode and prompts the user to enter information about their mood or preferences.
[0642] Step 2:
[0643] The user inputs information such as "I want to relax" via text or voice. The device receives the input and, if it's voice, automatically converts it into text data.
[0644] Step 3:
[0645] The terminal sends user input to the server. The server applies natural language processing techniques to analyze the received text data.
[0646] Step 4:
[0647] The server identifies the user's emotions and preferences from the analyzed information. Based on this identification, it searches the database for relevant beverage candidates.
[0648] Step 5:
[0649] The server generates a list of beverage suggestions that match the user's mood based on the search results. This list is sent to the device and displayed visually on the device.
[0650] Step 6:
[0651] The user selects their desired beverage from the options displayed on the terminal. The terminal reports the user's selection to the server.
[0652] Step 7:
[0653] The server sends the recipe and preparation instructions for the selected beverage to the terminal. The terminal displays this information and instructs the user on the preparation steps.
[0654] Step 8:
[0655] If automated cooking equipment is available, the terminal issues a beverage preparation command to the equipment, initiating the automated process. The user then receives the prepared beverage.
[0656] Step 9:
[0657] The server records the user's selection history and updates the popularity rankings based on that data. This ranking information is provided to the terminal, allowing users to view the latest beverage trends.
[0658] Step 10:
[0659] Users can access the community via their devices and share ratings and opinions about beverages with other users. The server manages and displays this information.
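Steps 3 to 5 above (analyze the text, identify the mood, look up matching beverages) can be illustrated with a deliberately simple stand-in. The sketch below replaces natural language processing with keyword matching and the database with an in-memory table; both substitutions, and the names MOOD_KEYWORDS, BEVERAGES, identify_moods and suggest, are assumptions made for illustration only. The beverage names are taken from the examples in this description.

```python
# Hypothetical keyword table standing in for natural language processing.
MOOD_KEYWORDS = {
    "relax": ["relax", "restless", "calm"],
    "refresh": ["refresh", "tired"],
}

# Hypothetical in-memory stand-in for the beverage database.
BEVERAGES = {
    "relax": ["lavender lemonade", "chamomile tea"],
    "refresh": ["mint lemonade", "green smoothie"],
}


def identify_moods(text):
    """Return the moods whose keywords occur in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return [
        mood
        for mood, words in MOOD_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]


def suggest(text):
    """Return beverage candidates for every identified mood, without duplicates."""
    candidates = []
    for mood in identify_moods(text):
        for drink in BEVERAGES[mood]:
            if drink not in candidates:
                candidates.append(drink)
    return candidates


suggestions = suggest("I'm feeling restless and would like a relaxing drink")
print(suggestions)
```

Substring matching is crude (it would also match unrelated words that contain a keyword), which is one reason a real system would rely on a proper language analysis service instead.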
[0660] (Example 1)
[0661] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0662] In modern society, there is a demand for personalized experiences based on the diverse preferences and emotions of users. However, conventional beverage serving systems have difficulty accurately suggesting beverages that reflect users' preferences and moods at any given time. Furthermore, there is a lack of mechanisms to encourage interaction among users, which hinders the formation of communities.
[0663] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0664] In this invention, the server includes a language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; and a beverage production means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to suggest and provide beverages that are tailored to the individual user's state and preferences. Furthermore, community building is promoted because users can interact with other users and share evaluations and comments.
[0665] "Language processing means" refers to a means for analyzing user input information to identify the user's preferences and emotions.
[0666] "Candidate generation means" refers to a means for suggesting appropriate beverages based on the identified preferences and emotions of the user.
[0667] "Beverage production means" refers to a means for preparing and providing the beverage selected by the user.
[0668] "History management means" refers to a means for recording and analyzing the user's selection history to update the popularity rankings.
[0669] "Display means" refers to a means of presenting information to the user visually.
[0670] "Input assistance means" refers to means for supporting voice or text input.
[0671] This invention is a method for implementing a system designed to provide personalized beverages tailored to the user's preferences and emotions. The invention comprises a terminal with a user interface, a server for data processing, and means for creating and serving beverages.
[0672] First, the user connects to the system by operating a device and inputs information about their mood and preferences via voice or text. The device receives this information and sends it to the server over the network. If the input is voice data, the server converts it into text, for example by using a speech recognition API. The server then analyzes the user's emotions and preferences using a natural language processing library.
[0673] Based on the analysis results, the server generates beverage candidates. Using a database management system, it efficiently searches for beverages that match the user's preferences and generates a list. The generated candidates are sent to the terminal and presented to the user as options.
[0674] When a user selects a beverage, the server records the selection and sends instructions to the terminal for the beverage preparation device. Based on this information, the terminal guides the user with the necessary ingredients and procedures. If an automated preparation device is available, the selected beverage is automatically prepared and served.
[0675] For example, a user might type "I'm feeling restless, so I'd like a relaxing drink" into the terminal. The server analyzes this prompt and suggests options such as "lavender lemonade" or "chamomile tea." Based on the user's selection, the beverage is prepared, and in some cases, the popularity ranking is updated.
[0676] This invention specifically demonstrates the implementation of a system that can provide a beverage experience tailored to the needs of individual users and promote interaction among users.
[0677] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0678] Step 1:
[0679] The user uses a device to input information about their mood and preferences via voice or text. Upon receiving the input data, the device prepares the data according to the input format and sends it to the server. An internet connection is used for sending data to the server. A specific example of this process would be the user typing "I want to refresh" into a text box and clicking the submit button.
[0680] Step 2:
[0681] The server validates the data received from the terminal. If audio data is sent, it is converted to text using a speech recognition API. Based on this conversion, a natural language processing library is used to analyze the text data and extract the user's emotions and preferences. The input is audio or text data, and the output is user emotion and preference information based on the analysis.
[0682] Step 3:
[0683] The server uses the analysis results to generate beverage candidates. It executes appropriate queries against the database management system to retrieve beverages that match the user's mood and preferences. For example, it uses SQL to retrieve beverages related to "refreshment" and generates candidates such as "lime mint smoothie." The input is the user's mood and preference information, and the output is a list of beverage candidates.
[0684] Step 4:
[0685] The server sends the generated beverage candidates to the terminal, which then visually presents them to the user. The user interface displays the candidates in an easy-to-select format, allowing the user to choose. Specifically, the terminal displays the beverage names and brief descriptions on the screen in a list or card format. The input is candidate data from the server, and the output is the visual display presented to the user.
[0686] Step 5:
[0687] The user selects their desired beverage from the presented options. The terminal sends this selection back to the server. The server records the information about the selected beverage and sends instructions to the terminal for the beverage production process. For example, the user selects "Lime Mint Smoothie," and the terminal notifies the server of this selection. The input is the user's selection, and the output is the selection notification to the server.
[0688] Step 6:
[0689] The terminal receives instructions from the server and displays the beverage ingredients and preparation steps to the user. If an automated cooking device is available, the terminal sends a signal to the device, and the selected beverage is automatically prepared. Specific display details include steps such as, "Required ingredients: lime, mint, honey; Preparation steps: mix the ingredients." Input is the preparation instructions from the server, and output is the display to the user or a signal to the cooking device.
[0690] Step 7:
[0691] The server stores the user's selection history in a database and analyzes and updates the popularity ranking. The updated ranking is sent to the terminal in real time and presented to the user as the latest information. Specifically, the server records data with a command such as "INSERT INTO history (user_id, drink_id) VALUES (...)" and updates the ranking using an analysis algorithm. The input is the selection history, and the output is the ranking information.
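The database operations mentioned in Steps 3 and 7 (retrieving beverages by mood with SQL, recording a selection with INSERT INTO history, and updating the ranking) can be sketched with an in-memory SQLite database. Only the history table and its user_id and drink_id columns come from the text; the drinks table, its columns, the sample rows, and the function names are assumptions, and counting selections per beverage is just one simple choice of "analysis algorithm".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: only history(user_id, drink_id) is named in the text.
    CREATE TABLE drinks (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        mood TEXT NOT NULL
    );
    CREATE TABLE history (
        user_id INTEGER NOT NULL,
        drink_id INTEGER NOT NULL REFERENCES drinks(id)
    );
    INSERT INTO drinks (id, name, mood) VALUES
        (1, 'lime mint smoothie', 'refreshment'),
        (2, 'chamomile tea', 'relaxation');
    """
)


def candidates_for(mood):
    """Step 3: retrieve beverages tagged with the identified mood."""
    rows = conn.execute(
        "SELECT id, name FROM drinks WHERE mood = ? ORDER BY name", (mood,)
    )
    return rows.fetchall()


def record_selection(user_id, drink_id):
    """Step 7: store one selection; `with conn` commits the transaction."""
    with conn:
        conn.execute(
            "INSERT INTO history (user_id, drink_id) VALUES (?, ?)",
            (user_id, drink_id),
        )


def popularity_ranking():
    """Step 7: rank beverages by the number of recorded selections."""
    return conn.execute(
        """
        SELECT d.name, COUNT(*) AS picks
        FROM history AS h JOIN drinks AS d ON d.id = h.drink_id
        GROUP BY d.id, d.name
        ORDER BY picks DESC, d.name
        """
    ).fetchall()


print(candidates_for("refreshment"))
record_selection(1, 1)
record_selection(2, 1)
record_selection(1, 2)
print(popularity_ranking())
```

Parameterized queries (the ? placeholders) are used so that user-derived values are never spliced into the SQL text.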
[0692] (Application Example 1)
[0693] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0694] In modern urban environments, there is a need for personalized beverage delivery systems that meet the diverse needs of residents. In particular, providing optimal beverages based on individual moods and preferences, and fostering interaction among residents, presents a challenging task. Furthermore, real-time beverage delivery is essential, providing an efficient and seamless experience.
[0695] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0696] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions, a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions, and a communication means that cooperates with beverage supply facilities in a smart city environment to provide beverages to the user in real time. This enables the immediate provision of optimal beverages tailored to the individual needs of the user, encourages interaction among residents, and makes the beverage experience more efficient.
[0697] "Natural language processing means" refers to technologies that analyze user input information and identify the user's preferences and emotions from that information.
[0698] "Candidate generation means" refers to a means for extracting and processing the information necessary to suggest the most suitable beverage to a user based on their identified preferences and emotions.
[0699] "Beverage production means" refers to a series of processes for preparing and providing the beverage selected by the user.
[0700] "History management means" refers to a means for recording a user's selection history and using that data to analyze and update the popularity rankings.
[0701] "Communication means" refers to the interfaces and protocols necessary for beverage supply equipment and servers to cooperate within a smart city environment.
[0702] This invention is a system for providing personalized beverages tailored to user needs in a smart city. The system mainly consists of a server, a terminal with a user interface, and beverage supply equipment within the smart city environment.
[0703] The server uses natural language processing technology to analyze voice or text input from the user to identify their preferences and emotions. To do this, it converts voice data to text using the Google Cloud Speech-to-Text API and analyzes the text data using the Google Cloud Natural Language API. Based on the analysis results, the server selects beverages from its database that match the user's preferences and displays them as options on the device.
[0704] The user selects a beverage from the list of options displayed on the device. The selection is sent to the server, recorded in the history management system, and used to update the popularity ranking among residents. The selection is then communicated to vending machines and cafes within the smart city, and the beverage supply equipment prepares the beverage in real time.
[0705] For example, if a user enters a request into the terminal saying, "I'm tired today, so I'd like a refreshing drink," the server will suggest options such as "mint lemonade" or "green smoothie." Depending on the user's selection, the chosen beverage will be immediately served at a cafe within the smart city, and this will also be reflected in the rankings.
[0706] An example of a prompt for this system would be: "Describe the process of suggesting the most suitable beverage when a user says they want to refresh themselves, and then providing that beverage immediately within the city."
[0707] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0708] Step 1:
[0709] Users use their devices to input information about their mood and preferences via voice or text. If the input data is voice, the Google Cloud Speech-to-Text API is used to convert the voice to text, which is then sent to the server as text data.
[0710] Step 2:
[0711] The server analyzes the received text data using the Google Cloud Natural Language API. This analysis extracts keywords and sentiments to identify the user's preferences and state. The input data is text, and the output is the analysis result, which identifies the user's preferences and sentiments.
[0712] Step 3:
[0713] The server searches a beverage database for a suitable beverage based on identified preferences and emotions. It selects multiple beverage candidates using a candidate generation mechanism. The input is analytical information based on the user's preferences, and the output is a list of beverage candidates.
[0714] Step 4:
[0715] The server sends a list of options to the terminal. The terminal visually displays this list in its user interface, offering it to the user as a selection. The input is the beverage option list, and the output is the visual presentation to the user.
[0716] Step 5:
[0717] The user selects a beverage from the displayed beverage options. The terminal sends the user's selection information to the server, along with the ID of the selected beverage as data. The input is the user's selection, and the output is the transmission of the beverage ID based on that selection.
[0718] Step 6:
[0719] The server records the received beverage IDs using a history management system and updates the popularity ranking. Simultaneously, it transmits this information to beverage supply equipment within the smart city, instructing it to begin preparation. The input is the beverage ID, and the output is the history update and supply instruction.
[0720] Step 7:
[0721] Beverage supply systems within smart cities accurately prepare and provide beverages to users based on received information. Users can pick up their beverages at designated locations. The input is a supply instruction, and the output is the preparation and provision of beverages.
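Steps 5 to 7 describe the server receiving a beverage ID, updating the ranking, and sending a supply instruction to equipment in the smart city. The disclosure does not define the message format, so the sketch below assumes a small JSON payload; the SupplyInstruction fields, the identifiers, and the handle_selection function are all hypothetical.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SupplyInstruction:
    """Hypothetical message sent to beverage supply equipment."""
    user_id: str
    beverage_id: str
    pickup_location: str


# In-memory stand-in for the history management system's ranking counts.
ranking = Counter()


def handle_selection(user_id, beverage_id, pickup_location):
    """Record the selection, then build the supply instruction as JSON text."""
    ranking[beverage_id] += 1
    instruction = SupplyInstruction(user_id, beverage_id, pickup_location)
    # sort_keys makes the serialized form stable, which simplifies logging and testing.
    return json.dumps(asdict(instruction), sort_keys=True)


payload = handle_selection("user-1", "mint-lemonade", "cafe-a")
print(payload)
print(ranking.most_common(1))
```

How the payload reaches the equipment (for example, over HTTP or a message queue) is left open here, since the text only states that the server and the equipment cooperate through communication means.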
[0722] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0723] This invention relates to a technology that improves the accuracy of a system that suggests and provides the optimal beverage based on the user's mood and preferences by using emotion recognition. In addition to a terminal equipped with a user interface, a server for data analysis and processing, and means for generating and providing beverages, the system incorporates an emotion engine.
[0724] First, the user begins inputting information through the device. The user inputs information about their mood and preferences in text or voice. The device receives this, automatically converts the voice data into text, and sends it to the server. The server not only utilizes natural language processing technology to analyze the input data, but also identifies the user's emotions using an emotion engine.
[0725] Next, the server uses a candidate generation mechanism to suggest a beverage that best matches the user's mood, based on emotional data obtained from the emotion engine. Emotional data plays a particularly important role in evaluating the candidates, improving the accuracy of the suggestions. The generated candidates are sent to the terminal and visually displayed on the user interface.
[0726] Once the user selects their desired beverage from the presented options, the server records the selection and sends instructions regarding the beverage preparation method to the terminal. The terminal then provides the user with a detailed recipe and preparation instructions, and, if necessary, activates an automated cooking robot to dispense the beverage.
[0727] Furthermore, the server integrates and records user selection history and sentiment data, updating popularity rankings in real time. This allows users to stay up-to-date on the latest trends. The system also includes a community feature, allowing users to share their ratings and opinions about beverages with other users.
[0728] For example, if a user types "I want to lift my spirits a little" into their device, the server uses natural language processing and an emotion engine to suggest "uplifting drinks" such as "Lemon Mint Sparkle" or "Ginger Blend Tea." Based on the emotion analysis results, the suggestions are fine-tuned, allowing the user to choose a more appropriate beverage. The selected beverage is quickly prepared, providing an experience that is tailored to the user's mood.
[0729] The following describes the processing flow.
[0730] Step 1:
[0731] The user accesses the system from their device and launches either the free chat screen or the voice input screen. The device then prepares to accept user input.
[0732] Step 2:
[0733] The user enters information about their mood in text or voice, for example, "I want to relax." If the input is voice, the device automatically converts it to text.
[0734] Step 3:
[0735] The terminal sends the user's text input data to the server. The server applies natural language processing algorithms to analyze key information related to the user's emotions and preferences from their words.
[0736] Step 4:
[0737] The server passes the analyzed data to the emotion engine to identify more detailed emotional states. This process allows the user's emotions to be recognized as concrete indicators.
[0738] Step 5:
[0739] Based on the output from the emotion engine, the server refers to a beverage database to select the most suitable beverage candidate. The candidate generation system then lists beverages that match the user's emotions.
[0740] Step 6:
[0741] The server sends a list of selected beverage options to the terminal. The terminal displays an interface that visually presents the beverage options to the user and prompts them to make a selection.
[0742] Step 7:
[0743] The user selects their desired beverage from the presented options. The selection information is sent to the server via the device.
[0744] Step 8:
[0745] The server sends the terminal a detailed recipe and preparation instructions for the selected beverage. The terminal displays the ingredients and instructions to the user, providing assistance.
[0746] Step 9:
[0747] If possible, the terminal sends instructions to the automated cooking robot to prepare the beverage, and the beverage is then prepared for the user.
[0748] Step 10:
[0749] The server records user selection history and sentiment data, and updates the popularity rankings. Users can refer to these rankings to see what other users have chosen.
[0750] Step 11:
[0751] Users can access community features through their devices to share ratings and feedback, and exchange opinions with other users. The server oversees this activity and continuously updates community information.
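As a non-limiting illustration of Step 10, the recording of the selection history and the updating of the popularity ranking might be sketched as follows. The class name `SelectionHistory`, its methods, the in-memory storage, and the tie-breaking rule are assumptions made for this sketch and are not specified in the disclosure.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SelectionHistory:
    """Hypothetical in-memory store for user selections (Step 10)."""
    records: list = field(default_factory=list)      # (user_id, beverage, emotion)
    counts: Counter = field(default_factory=Counter)  # beverage -> number of selections

    def record(self, user_id: str, beverage: str, emotion: str) -> None:
        # Keep the raw history and update the per-beverage tally.
        self.records.append((user_id, beverage, emotion))
        self.counts[beverage] += 1

    def ranking(self, top_n: int = 3) -> list:
        # Sort by descending count, then by name so that ties are deterministic.
        ordered = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return ordered[:top_n]


history = SelectionHistory()
history.record("user-a", "chamomile tea", "calm")
history.record("user-b", "lavender latte", "calm")
history.record("user-c", "chamomile tea", "tired")
print(history.ranking())
```

A persistent database would replace the in-memory list in practice; the sketch only shows that the ranking can be derived directly from the recorded selections.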
[0752] (Example 2)
[0753] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0754] Existing systems that suggest the optimal beverage based on the user's mood and preferences struggle to accurately analyze the user's complex emotional state. Furthermore, they lack sufficient functionality to integrate past selection history with real-time emotions to provide trend information, resulting in a lack of consistency in the user experience.
[0755] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0756] In this invention, the server includes a text analysis means that analyzes user input information to identify the user's state and preferences; a candidate generation means that suggests an appropriate beverage based on the identified state and preferences; and a beverage production means that prepares and provides the beverage selected by the user from the suggested candidates. This makes it possible to accurately analyze the user's emotional state and suggest an appropriate beverage based on that analysis.
[0757] A "text analysis means" is an information processing system that analyzes user input information to identify the user's state and preferences.
[0758] A "candidate generation means" is a device or system that has the function of selecting appropriate beverage candidates based on specified conditions or preferences.
[0759] "Beverage production means" refers to a series of devices and processes for preparing and serving beverages selected by the user.
[0760] "History management means" refers to a device or system that records and analyzes a user's selection history and updates popularity ratings accordingly.
[0761] "Emotion recognition means" refers to technology that analyzes complex emotional states from user input information and identifies emotion labels.
[0762] This invention is implemented as a system that suggests and provides the optimal beverage based on the user's emotions and preferences. The system includes a terminal with a user interface, a server responsible for data analysis, and means for generating the beverage. Furthermore, emotion recognition means are incorporated to accurately analyze the emotional state.
[0763] The terminal receives input from the user (text or voice) and, in the case of voice input, automatically converts it into text data. Automatic speech recognition software can be used for this conversion. The converted text data is then sent to the server.
[0764] The server analyzes the received data using the text analysis means to identify the user's state and preferences. A natural language processing library is used for the analysis. Furthermore, the emotion recognition means identifies the emotional state from the user's input. Based on the information obtained through this process, the server uses the candidate generation means to generate the most suitable beverage candidates.
[0765] The generated beverage candidates are sent back to the terminal and displayed on the user interface. The user selects the desired beverage from the displayed options. This selection is sent to the server, and the beverage is prepared by the beverage production means. An automated cooking robot may be used in this process.
[0766] As a concrete example, consider a scenario where a user types "I want to lift my spirits a little" into their device. The server then uses natural language processing and emotion recognition to present options such as "Lemon Mint Sparkle" or "Ginger Blend Tea." The selected beverage is then promptly prepared and served to the user.
[0767] Example prompt for a generative AI model: "Generate a list of beverages to suggest when the user types 'I want to relax.'"
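The example prompt above could, for instance, be assembled from the user's input as in the following sketch. The function name `build_beverage_prompt` and the `max_items` parameter are hypothetical; the disclosure gives only the prompt wording, not how it is constructed.

```python
def build_beverage_prompt(user_text: str, max_items: int = 3) -> str:
    """Embed the user's input in an instruction for a generative AI model."""
    cleaned = " ".join(user_text.split())  # collapse stray whitespace
    return (
        f"Generate a list of up to {max_items} beverages to suggest "
        f"when the user types '{cleaned}'"
    )


prompt = build_beverage_prompt("  I want to relax. ")
print(prompt)
```

The resulting string would then be passed to whichever generative AI model the server uses; that call is omitted here because the disclosure does not name a specific interface.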
[0768] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0769] Step 1:
[0770] Users input information about their mood and preferences via text or voice through their device. If the input is voice, the device uses speech recognition technology to convert the voice data into text. This converted text data is then sent to the server as output.
[0771] Step 2:
[0772] The server receives text data from the terminal as input. It then uses a natural language processing library to analyze the text data and extract keywords related to the user's state and preferences. This analysis process outputs information about the user's current emotions and preferences.
[0773] Step 3:
[0774] The server uses emotion recognition tools to further examine the analysis results and identify the user's emotion label (e.g., joy, calmness). The input is the analysis result from the previous step, and the server generates emotion data obtained from that input as output.
[0775] Step 4:
[0776] The server takes identified emotion data as input and uses a candidate generation mechanism to generate a list of optimal beverages. It searches the database for relevant beverage information and generates suggestions corresponding to the emotion data as output.
[0777] Step 5:
[0778] The generated list of beverage options is sent to the terminal and visually displayed on the user interface. The user interface provides descriptions and images for each beverage, making it easy for the user to select their desired beverage.
[0779] Step 6:
[0780] Information about the beverage selected by the user is sent from the terminal to the server. The server receives this information as input, generates preparation instructions for the selected beverage as output, and sends them to the terminal.
[0781] Step 7:
[0782] The terminal uses the output received from the server to activate an automated cooking robot and prepare the selected beverage. Once the beverage is ready, it is served to the user, and the process is complete.
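Steps 2 to 4 above can be sketched, under simplifying assumptions, as a small pipeline. The keyword table, the emotion labels, the beverage table, and the three function names are hypothetical stand-ins for the natural language processing library, the emotion recognition means, and the beverage database, none of which is specified in detail in the disclosure.

```python
import re

# Hypothetical keyword-to-emotion table (Steps 2 and 3).
EMOTION_KEYWORDS = {
    "relax": "calmness",
    "calm": "calmness",
    "lift": "joy",
    "spirits": "joy",
    "happy": "joy",
}

# Hypothetical beverage database keyed by emotion label (Step 4).
BEVERAGE_DB = {
    "calmness": ["Herbal Tea", "Warm Milk"],
    "joy": ["Lemon Mint Sparkle", "Ginger Blend Tea"],
}


def extract_keywords(text: str) -> list:
    # Step 2: lower-case, tokenize, and keep only words the table knows.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in EMOTION_KEYWORDS]


def identify_emotion(keywords: list) -> str:
    # Step 3: majority vote over keyword labels; ties resolve alphabetically.
    if not keywords:
        return "neutral"
    labels = [EMOTION_KEYWORDS[k] for k in keywords]
    return max(sorted(set(labels)), key=labels.count)


def generate_candidates(emotion: str) -> list:
    # Step 4: look up beverages for the emotion; empty list if none are known.
    return BEVERAGE_DB.get(emotion, [])


keywords = extract_keywords("I want to lift my spirits a little")
emotion = identify_emotion(keywords)
print(emotion, generate_candidates(emotion))
```

A trained model would normally replace the keyword table; the sketch only makes the input and output of each step explicit.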
[0783] (Application Example 2)
[0784] Next, Application Example 2 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0785] Providing users with the most suitable beverages quickly and accurately, tailored to their emotions and preferences, is crucial for increasing user satisfaction. However, conventional technology has struggled to accurately analyze user emotions and provide appropriate beverages based on that analysis. Furthermore, features for updating selection history and trend information in real time and sharing information with other users were limited, resulting in a lack of a sense of unity and shared experience in the user experience.
[0786] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0787] In this invention, the server includes a natural language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests appropriate beverages based on the identified preferences and emotions; a beverage production means that works in conjunction with a home appliance to prepare and serve the beverage selected by the user from the suggested candidates; and a history management means that records and analyzes the user's selection history to update popular trend information in real time and visually display it on the user interface. This makes it possible to suggest and serve the optimal beverage based on the user's emotions and to promote communication through selection history and trend information.
[0788] "Natural language processing means that analyzes user input information to identify preferences and emotions" refers to data processing technology that analyzes voice or text data provided by the user to identify individual preferences and moods.
[0789] "Candidate generation means that suggests appropriate beverages based on the identified preferences and emotions" refers to a technology that generates and presents beverage options that match the user's mood at the time, based on an analysis of the user's emotions.
[0790] "Beverage production means that works in conjunction with a home appliance" refers to a means for preparing and serving the beverage selected by the user using automated household equipment.
[0791] "History management means that records and analyzes the user's selection history to update popular trend information in real time and visually display it on the user interface" refers to a technology for accumulating and analyzing the user's past selection data and generating and displaying the latest trend information.
[0792] "Information exchange and provision means" refers to a means of exchanging evaluations and opinions about beverages with other users and communicating with them.
[0793] "Voice analysis means" refers to a technology that instantly converts a user's voice input into text data and analyzes their emotional state.
[0794] The system for realizing this invention has a structure in which the user, server, and terminal communicate. A specific embodiment thereof is described below.
[0795] The server receives voice or text data sent by the user and analyzes the input information using natural language processing techniques. This process utilizes the Google Cloud Natural Language API to identify the user's preferences and emotions.
[0796] Next, the server uses a candidate generation mechanism to suggest appropriate beverages based on the identified sentiment data. In this process, the sentiment analysis results from Amazon Comprehend are utilized to generate beverage options, which are then presented to the user.
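One way the sentiment analysis result could drive candidate generation is sketched below. The input dictionary follows the general shape of a sentiment result with a `Sentiment` label and per-label `SentimentScore` values; the beverage table, the confidence threshold, and the function name `suggest_from_sentiment` are assumptions for illustration, and no call to an external service is made.

```python
# Hypothetical mapping from a sentiment label to beverage candidates.
CANDIDATES_BY_SENTIMENT = {
    "POSITIVE": ["lemon mint sparkle", "ginger blend tea"],
    "NEGATIVE": ["chamomile tea", "lavender latte"],
    "NEUTRAL": ["green tea"],
    "MIXED": ["chamomile tea", "ginger blend tea"],
}


def suggest_from_sentiment(result: dict, min_confidence: float = 0.5) -> list:
    """Pick beverage candidates from an already obtained sentiment result."""
    label = result["Sentiment"]
    # Score keys are capitalized versions of the label, e.g. "Positive".
    confidence = result["SentimentScore"][label.capitalize()]
    if confidence < min_confidence:
        label = "NEUTRAL"  # fall back when the analysis is not confident
    return CANDIDATES_BY_SENTIMENT[label]


sample = {
    "Sentiment": "NEGATIVE",
    "SentimentScore": {"Positive": 0.05, "Negative": 0.80, "Neutral": 0.10, "Mixed": 0.05},
}
print(suggest_from_sentiment(sample))
```

The fallback to a neutral list is a design choice of this sketch so that a low-confidence analysis does not produce a strongly mood-specific suggestion.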
[0797] When a user selects a beverage, the server stores the user's selection information in Firebase and coordinates with home appliances to produce the beverage. Specifically, it controls home robots and smart appliances to prepare and serve the desired beverage.
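The control instruction sent to a home appliance might, for example, be encoded as a small JSON message, as in the following sketch. The dataclass `BrewCommand`, its field names, the `prepare_beverage` action string, and the JSON encoding are all hypothetical; the disclosure does not specify a command format, and no real appliance protocol is implied. Storage of the selection is omitted.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BrewCommand:
    """Hypothetical control command for a home robot or smart appliance."""
    device_id: str
    beverage: str
    servings: int = 1


def encode_command(cmd: BrewCommand) -> str:
    # Reject obviously invalid requests before anything is sent to a device.
    if cmd.servings < 1:
        raise ValueError("servings must be at least 1")
    payload = {"action": "prepare_beverage", **asdict(cmd)}
    return json.dumps(payload, sort_keys=True)


message = encode_command(BrewCommand(device_id="kitchen-robot-1", beverage="chamomile tea"))
print(message)
```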
[0798] Furthermore, the server analyzes the user's selection history and updates popular trend information in real time. This information is visually displayed in the user interface, and users can share their evaluations and opinions with other users through information exchange and provision mechanisms.
[0799] For example, if a user tells the device by voice, "I want to relax a little today," the system converts the voice data into text and analyzes the user's desire to relax using Amazon Comprehend. The server then suggests beverages such as "chamomile tea" or "lavender latte," which the user selects. After the selection, a robot automatically prepares the beverage and serves it to the user. In addition, trending information for the day is displayed in real time, and users can also view ratings and opinions from other users.
[0800] An example of a prompt is, "What beverage would you recommend drinking when you feel like this: when you want to relax?"
[0801] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0802] Step 1:
[0803] The user enters their beverage request into the device via voice or text. The entered voice data is converted into text data within the device using the Google Cloud Speech-to-Text service. The output of this conversion process is the text data.
[0804] Step 2:
[0805] The device sends text data to the server. The server uses this text data as input and performs natural language processing using the Google Cloud Natural Language API to identify the user's preferences and emotions. This step generates sentiment analysis results and user preference data.
[0806] Step 3:
[0807] The server generates beverage suggestions that match the user's mood based on the sentiment analysis results, including the sentiment analysis results from Amazon Comprehend. These suggestions are sent to the device and displayed as beverage options on the user interface. This step outputs a list of suggestions.
[0808] Step 4:
[0809] The user selects a beverage from the displayed options. The selected beverage data is sent from the terminal to the server. At this time, the user's selection information is recorded.
[0810] Step 5:
[0811] The server receives the user's selection information and stores that data in Firebase. The stored data is used for history management and generating trend information. In this step, the selection history data is updated.
[0812] Step 6:
[0813] The server sends control commands to home appliances to prepare the selected beverage. Home robots and smart appliances work together to prepare the selected beverage and serve it to the user. At this point, the finished beverage is dispensed and served.
[0814] Step 7:
[0815] The server analyzes trend information in real time based on selection history and trend data, and displays it visually in the user interface. Furthermore, it facilitates the sharing of evaluations and opinions about beverages to enable information exchange with other users. In this step, updated trend information and a sharing interface are provided.
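The real-time trend update in Step 7 can be illustrated with a sliding time window over recent selections. The class `TrendTracker`, the window length, and the assumption that timestamps arrive in non-decreasing order are choices made for this sketch only.

```python
from collections import Counter, deque


class TrendTracker:
    """Counts selections inside a sliding time window (hypothetical design)."""

    def __init__(self, window_seconds: float = 3600.0):
        self.window_seconds = window_seconds
        self._events = deque()    # (timestamp, beverage), oldest first
        self._counts = Counter()  # beverage -> selections inside the window

    def add(self, timestamp: float, beverage: str) -> None:
        self._events.append((timestamp, beverage))
        self._counts[beverage] += 1
        self._expire(timestamp)

    def _expire(self, now: float) -> None:
        # Drop events that fell out of the window and decrement their counts.
        while self._events and now - self._events[0][0] > self.window_seconds:
            _, old = self._events.popleft()
            self._counts[old] -= 1
            if self._counts[old] == 0:
                del self._counts[old]

    def top(self, now: float, n: int = 3) -> list:
        self._expire(now)
        return sorted(self._counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


tracker = TrendTracker(window_seconds=60)
tracker.add(0, "chamomile tea")
tracker.add(30, "lavender latte")
tracker.add(50, "lavender latte")
print(tracker.top(now=55))
print(tracker.top(now=70))
```

Because old events are expired on every update and query, the displayed trend reflects only recent selections rather than the full history.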
[0816] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0817] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0818] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0819] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0820] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0821] These emotions are distributed at the 3 o'clock position on the emotion map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0822] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the closer an emotion lies to the outer edge of the emotion map 400, the more visible it becomes, that is, the more it is expressed in actions.
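The layout of the emotion map described above can be made concrete with a small sketch that interprets a position on the map. The Cartesian coordinates (x to the right, y upward, center at the origin), the `inner_radius` threshold, and the function name `describe_position` are assumptions; the disclosure describes the layout only qualitatively.

```python
import math


def describe_position(x: float, y: float, inner_radius: float = 0.5) -> dict:
    """Interpret a point on a hypothetical unit-scale emotion map."""
    # Left half: reactions occurring in the brain; right half: situational judgment.
    if x > 0:
        origin = "situational judgment"
    elif x < 0:
        origin = "brain reaction"
    else:
        origin = "both"
    # Upper side: pleasure; lower side: displeasure.
    if y > 0:
        valence = "pleasure"
    elif y < 0:
        valence = "displeasure"
    else:
        valence = "neutral"
    # Inside: inner thoughts; outside: emotions expressed in actions.
    layer = "inner thought" if math.hypot(x, y) <= inner_radius else "action"
    return {"origin": origin, "valence": valence, "layer": layer}


# A point at the 3 o'clock position near the outer edge of the map.
print(describe_position(0.8, 0.0))
```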
[0823] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
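The idea that pleasure and discomfort follow from how far a balance deviates from its ideal can be expressed numerically, for example as below. The linear scoring rule, the `tolerance` parameter, and the function name `balance_valence` are assumptions for this sketch; the disclosure does not give a formula.

```python
def balance_valence(value: float, ideal: float, tolerance: float) -> float:
    """Score a balance in [-1, 1]: positive is pleasure, negative is discomfort.

    The score is 1 at the ideal, 0 at a deviation equal to `tolerance`,
    and is clipped to -1 at a deviation of twice `tolerance` or more.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    deviation = abs(value - ideal) / tolerance
    return max(-1.0, min(1.0, 1.0 - deviation))


# Battery level of a robot in percent, with a hypothetical ideal of 100.
for level in (100, 60, 10):
    print(level, balance_valence(level, ideal=100, tolerance=40))
```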
[0824] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0825] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
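The disclosure does not state how the training makes nearby emotions take similar values. One possible approach, shown here purely as an assumption, is to build soft training targets with a Gaussian kernel over distance on the map, so that emotions close to the labeled emotion receive values near 1 and distant emotions receive values near 0. The coordinates in `POSITIONS`, the kernel width `sigma`, and the function name `soft_targets` are hypothetical.

```python
import math

# Hypothetical 2-D coordinates for a few emotions on the map.
POSITIONS = {
    "reassured": (0.60, 0.10),
    "calm": (0.65, 0.00),
    "confident": (0.60, 0.20),
    "anxious": (0.60, -0.60),
}


def soft_targets(labeled: str, sigma: float = 0.2) -> dict:
    """Return a target value per emotion that decays with map distance."""
    cx, cy = POSITIONS[labeled]
    return {
        name: math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
        for name, (x, y) in POSITIONS.items()
    }


targets = soft_targets("reassured")
for name, value in targets.items():
    print(f"{name}: {value:.3f}")
```

With these targets, "reassured," "calm," and "confident" end up with similar values while "anxious" stays close to zero, which matches the behavior described for the trained network.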
[0826] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0827] In the above embodiment, an example was given in which the specific processing is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and the specific processing may be distributed across multiple computers, including the computer 22. For example, the data generation model 58 may be provided in a device external to the data processing device 12, and the external device may generate data according to the input data.
[0828] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.
[0829] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0830] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0831] The following types of processors can be used as hardware resources to perform the specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource performing the specific processing by executing software, i.e., a program. Other examples include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), and ASICs (Application Specific Integrated Circuits), which are processors with circuit configurations designed specifically to perform the specific processing. Each of these processors has built-in or connected memory and performs the specific processing by using that memory.
[0832] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0833] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0834] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0835] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as this does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0836] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0837] The following is further disclosed regarding the embodiments described above.
[0838] (Claim 1)
[0839] A natural language processing means that analyzes user input information to identify preferences and emotions,
[0840] a candidate generation means that suggests an appropriate beverage based on the identified preferences and emotions,
[0841] a beverage production means that prepares and provides a beverage selected by the user from among the suggested candidates, and
[0842] a history management means that records and analyzes the user's selection history and updates the popularity ranking:
[0843] A system comprising the above means.
[0844] (Claim 2)
[0845] The system according to claim 1, further comprising means for providing a community where users can interact with other users and share evaluations and comments for motivational purposes.
[0846] (Claim 3)
[0847] The system according to claim 1, further comprising a function to convert voice input into text data in real time, and a voice analysis means for analyzing the user's intent from their voice.
[0848] "Example 1"
[0849] (Claim 1)
[0850] A language processing means that analyzes user input information to identify preferences and emotions,
[0851] a candidate generation means that suggests an appropriate beverage based on the identified preferences and emotions,
[0852] a beverage production means that prepares and provides a beverage selected by the user from among the suggested candidates,
[0853] a history management means that records and analyzes the user's selection history and updates the popularity ranking, and
[0854] a display means that presents information to the user visually:
[0855] A system comprising the above means and an input assistance means that supports voice or text input.
[0856] (Claim 2)
[0857] The system according to claim 1, which enables users to interact with other users and share evaluations and comments for motivational purposes.
[0858] (Claim 3)
[0859] The system according to claim 1, which has a function to convert voice input into text data in real time and to analyze the user's intent from their voice.
[0860] "Application Example 1"
[0861] (Claim 1)
[0862] A natural language processing means that analyzes user input information to identify preferences and emotions,
[0863] a candidate generation means that suggests an appropriate beverage based on the identified preferences and emotions,
[0864] a beverage production means that prepares and provides a beverage selected by the user from among the suggested candidates,
[0865] a history management means that records and analyzes the user's selection history and updates the popularity ranking, and
[0866] a communication means that works in conjunction with beverage supply facilities within a smart city environment to provide beverages to users in real time:
[0867] A system comprising the above means.
[0868] (Claim 2)
[0869] The system according to claim 1, further comprising means for providing a community where users can interact with other users and share evaluations and comments for motivational purposes.
[0870] (Claim 3)
[0871] The system according to claim 1, further comprising a function to convert voice input into text data in real time, and a voice analysis means for analyzing the user's intent from their voice.
[0872] "Example 2 of combining an emotion engine"
[0873] (Claim 1)
[0874] A text analysis means that analyzes user input information to identify a state and preferences,
[0875] a candidate generation means that suggests an appropriate beverage based on the identified state and preferences,
[0876] a beverage production means that prepares and provides a beverage selected by the user from among the suggested candidates,
[0877] a history management means that records and analyzes the user's selection history and updates the popularity rating, and
[0878] an emotion recognition means that analyzes the emotional state from the user's input:
[0879] A system comprising the above means.
[0880] (Claim 2)
[0881] The system according to claim 1, further comprising means for providing a community where users can interact with other users and share evaluations and comments for motivational purposes.
[0882] (Claim 3)
[0883] The system according to claim 1, further comprising a function to convert voice input into text data in real time, and a voice analysis means for analyzing the user's intent from their voice.
[0884] "Application example 2 when combining with an emotional engine"
[0885] (Claim 1)
[0886] A natural language processing means that analyzes user input information to identify preferences and emotions,
[0887] a candidate generation means that suggests an appropriate beverage based on the identified preferences and emotions,
[0888] a beverage production means, linked to a home appliance, that prepares and provides a beverage selected by the user from among the suggested candidates, and
[0889] a history management means that records and analyzes the user's selection history, updates popular trend information in real time, and displays it visually on the user interface:
[0890] A system comprising the above means.
[0891] (Claim 2)
[0892] The system according to claim 1, further comprising means for exchanging and providing information that enables users to interact with other users and share evaluations and opinions about selected beverages.
[0893] (Claim 3)
[0894] The system according to claim 1, further comprising a voice analysis means for converting voice input into text data in real time and analyzing emotional data from the user's voice, wherein a beverage is automatically prepared based on the analyzed emotional data.
[Explanation of Symbols]
[0895]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A system comprising: a natural language processing means that analyzes user input information to identify preferences and emotions; a candidate generation means that suggests an appropriate beverage based on the identified preferences and emotions; a beverage production means that prepares and provides a beverage selected by the user from among the suggested candidates; a history management means that records and analyzes the user's selection history and updates the popularity ranking; and a communication means that works in conjunction with beverage supply facilities within a smart city environment to provide beverages to users in real time.
2. The system according to claim 1, further comprising means for providing a community where users can interact with other users and share evaluations and comments for motivational purposes.
3. The system according to claim 1, further comprising a function to convert voice input into text data in real time, and a voice analysis means for analyzing the user's intent from their voice.