system
The system addresses language barriers and travel complexities by translating user input, generating personalized plans, and responding to emergencies, providing a stress-free and efficient travel experience.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-16
- Publication Date
- 2026-06-26
AI Technical Summary
Travelers face language barriers, communication obstacles, and difficulties in managing complex travel plans and responding to sudden emergencies, leading to stress during their journeys.
A system that uses a generative model-based translation engine on a server to translate user input into another language, generates personalized travel plans, and provides real-time responses to unexpected situations by detecting abnormalities and suggesting alternatives.
Enables seamless communication, tailored travel experiences, and quick responses to emergencies, enhancing the overall travel experience by overcoming language barriers and managing unexpected events.
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, the method including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a character of the chatbot, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] There are language barriers and communication obstacles during travel, the complexity of travel plans, and further difficulties in quickly responding to sudden emergencies. Since these problems can cause stress to travelers, means for effectively solving these problems are required to make travel comfortable.
Means for Solving the Problems
[0005] This invention provides a system that receives user input data via a terminal, translates it into another language using a generative model-based translation engine on a server, and presents the resulting translation to the user via the terminal. The system also generates and presents an optimal travel plan based on the user's travel history and preferences. Furthermore, it responds quickly to unexpected situations during travel by detecting abnormal conditions in real time and automatically generating and presenting alternative solutions.
[0006] A "terminal device" refers to a device that receives input data from the user, transmits it to a server, and then presents the translation results to the user.
[0007] A "server system" refers to a computer system that has the function of processing received input data and translating it into another language using a generative model.
[0008] A "generative model" is an algorithm that uses artificial intelligence technology to analyze input data and automatically translate it into the appropriate language.
[0009] "Display means" refers to an interface for presenting translated data to the user visually or audibly.
[0010] A "travel plan" is a set of information that makes up the itinerary and destinations related to a user's trip.
[0011] An "abnormal situation" refers to a situation where an unexpected event occurs during travel, such as a flight delay or cancellation.
[0012] An "alternative" is a new option or solution generated to address an abnormal situation.
Brief Description of the Drawings
[0013]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] It is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] It is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] It is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] It is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] It is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] It is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 10] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.
Mode for Carrying Out the Invention
[0014] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0015] First, the terms used in the following description will be explained.
[0016] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.
[0017] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.
[0018] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tape.
[0019] In the following embodiments, a communication I/F (Interface) denoted by a reference numeral is an interface including a communication processor, an antenna, and the like. The communication I/F controls communication between multiple computers. Examples of communication standards applied to the communication I/F include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0020] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."
[0021] [First Embodiment]
[0022] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0023] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0024] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0025] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0026] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0027] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0028] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0029] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0030] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0031] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0032] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0033] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0034] This invention is a system designed to help travelers overcome language barriers and enable smooth communication and quick situational responses during their travels. This system operates using the user's smartphone or tablet and an external server.
[0035] First, the user uses the terminal to input the information they need during their trip. For example, if the user wants to make a reservation at a restaurant at their destination, they input this instruction into the terminal by voice or text. The terminal then transmits this input data to the server in real time.
[0036] The server uses a powerful generative model to translate incoming data into other languages. In doing so, the server considers the context to produce accurate translations. The generated translations are sent back to the terminal and presented to the user. This process allows users to communicate seamlessly with locals even in foreign countries.
[0037] The terminal also records the user's past travel history and preferences and sends them to the server. Based on this data, the server generates personalized travel plans tailored to the user's interests. For example, a user who enjoys visiting art museums will be offered a plan that includes information on special exhibitions being held in the city they are visiting.
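The personalization described in [0037] can be pictured as a scoring step over candidate destinations. The following Python sketch is only an illustration: the `Attraction` type, the tag vocabulary, and the count-based score are hypothetical, and the disclosure itself does not specify how the server ranks candidates.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Attraction:
    # Hypothetical record type; tags are free-form labels such as "art".
    name: str
    tags: frozenset


def build_preference_profile(history):
    """Count how often each tag appears in the user's past visits."""
    profile = Counter()
    for visited in history:
        profile.update(visited.tags)
    return profile


def rank_candidates(profile, candidates, top_n=3):
    """Order candidates by tag overlap with the profile, highest first."""
    def score(attraction):
        # Tags never seen in the history contribute 0 (Counter default).
        return sum(profile[tag] for tag in attraction.tags)
    return sorted(candidates, key=score, reverse=True)[:top_n]


history = [
    Attraction("Museum A", frozenset({"art", "museum"})),
    Attraction("Gallery B", frozenset({"art"})),
]
candidates = [
    Attraction("Night market", frozenset({"food"})),
    Attraction("Special exhibition", frozenset({"art", "museum"})),
]
plan = rank_candidates(build_preference_profile(history), candidates)
```

With this history, the special exhibition is ranked ahead of the night market, which mirrors the art-museum example in the paragraph above.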
[0038] Furthermore, the terminal continuously transmits flight information, location data, and other information collected during the trip to the server. The server analyzes this data and automatically generates alternatives if an abnormal situation occurs. For example, if a flight is canceled, the server presents alternative flights or modes of transportation to the user so that the flow of travel is not interrupted.
[0039] As a practical example, if a user is traveling in Paris, they can book a restaurant near the Eiffel Tower, receive a plan of museums they can visit, and get immediate information on new flights if there are flight delays. In this way, the present invention functions as a system that comprehensively provides support to travelers in various situations.
[0040] The following describes the processing flow.
[0041] Step 1:
[0042] The user inputs information by voice or text by operating the terminal. This information is necessary to support the user's communication and procedures on-site.
[0043] Step 2:
[0044] The terminal sends user input data to the server. During this process, the data is converted to an appropriate format before being sent.
[0045] Step 3:
[0046] The server analyzes the received data using a generative model and translates it into the specified language. This model enables highly accurate translation that takes context into account.
[0047] Step 4:
[0048] The server sends the translation results back to the terminal. The translated data is returned immediately, supporting multilingual communication.
[0049] Step 5:
[0050] The terminal displays the received translation results to the user and plays them back as audio if necessary. This allows the user to communicate smoothly with local people.
[0051] Step 6:
[0052] The server creates personalized travel plans based on the user's travel history and preferences. Machine learning is used in this process to provide the user with the most suitable plan.
[0053] Step 7:
[0054] The terminal displays travel plans received from the server to the user. The user can then review the proposed plans and create a trip that suits their preferences.
[0055] Step 8:
[0056] During the user's trip, the terminal periodically sends flight information and location data to the server. This ensures that the server always receives the latest travel information.
[0057] Step 9:
[0058] The server monitors this data and, upon detecting an abnormal condition (e.g., flight delays or cancellations), immediately generates a response plan.
[0059] Step 10:
[0060] The server sends a list of possible solutions, including alternatives, to the terminal and notifies the user. The user can then review the notification and decide on their next course of action from the presented options.
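Steps 8 to 10 above can be sketched as a small detection routine on the server side. This is a minimal Python illustration under stated assumptions: the `FlightStatus` fields, the 60-minute delay threshold, and the message format are all hypothetical, since the disclosure gives no concrete criteria for what counts as an abnormal condition.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed cut-off for treating a delay as abnormal; not from the source.
DELAY_THRESHOLD_MIN = 60


@dataclass
class FlightStatus:
    # Hypothetical shape of the flight data sent by the terminal (Step 8).
    flight_no: str
    delay_minutes: int = 0
    cancelled: bool = False


def detect_abnormal_situation(status: FlightStatus) -> Optional[str]:
    """Step 9: return a label for the abnormal situation, or None."""
    if status.cancelled:
        return "cancellation"
    if status.delay_minutes >= DELAY_THRESHOLD_MIN:
        return "delay"
    return None


def propose_alternatives(status: FlightStatus, options: List[str]) -> List[str]:
    """Step 10: build the list of suggestions sent to the terminal."""
    situation = detect_abnormal_situation(status)
    if situation is None:
        return []  # nothing abnormal, so nothing to notify
    return [f"{situation} on {status.flight_no}: consider {o}" for o in options]
```

In a fuller system the `options` list would presumably come from the generative model or a booking service; here it is passed in so the sketch stays self-contained.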
[0061] (Example 1)
[0062] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0063] Modern travelers face numerous challenges, including language barriers in foreign lands, the need for appropriate information tailored to their individual travel needs, and the need for quick responses to unexpected problems. These challenges hinder the smooth progress of travel and significantly degrade the user experience. The present invention aims to comprehensively solve these problems and improve the convenience and satisfaction of travelers.
[0064] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0065] In this invention, the server includes an information processing device means for acquiring information from a user, an information processing device means for using a generative artificial intelligence model to convert the information into another language, and a means for detecting abnormal conditions based on external information collected during travel and automatically creating alternative solutions. This enables smooth communication that transcends language barriers, the provision of personalized information, and rapid problem solving.
[0066] "Information processing device means" refers to a device or system that has the function of acquiring and processing input information from a user, and further transmitting and receiving data with an external system.
[0067] A "generative artificial intelligence model" is a model that uses advanced artificial intelligence technology to translate user input data into other languages or to generate recommended plans based on user behavior patterns.
[0068] "Output device means" refers to a processing device that displays or outputs audio to present information transmitted from the server to the user visually or audibly.
[0069] "Personalized suggestions" refer to suggestions that aim to provide optimal information and recommended plans tailored to the individual user's needs, based on the user's past behavior history and preferences.
[0070] An "abnormal condition" refers to a situation that deviates from the planned schedule or normal conditions, and includes, in particular, flight delays or cancellations during travel, as well as other unexpected problems.
[0071] This invention provides a system that enables travelers to overcome language barriers abroad, facilitating smooth communication and rapid situational response. The system consists of a terminal such as a smartphone or tablet for receiving user input and an external server for data processing. The system is realized by combining an information processing device, a generative AI model, and an output device.
[0072] First, the user inputs the information they need during their trip through the terminal. For example, they might use voice or text input to make a reservation at a restaurant they want to visit. This operation is performed using an application on the terminal. The terminal acquires data using voice recognition software or a text input system and sends that information to the server.
[0073] The server uses a generative AI model to translate received information into other languages. Leveraging natural language processing technology, it performs highly accurate, context-aware translations, allowing users to obtain accurate information even if they are unfamiliar with the foreign language. The translated information is then sent back to the terminal and displayed on the terminal's screen. Furthermore, the user can visually confirm the outputted translation results, enabling them to communicate effectively with local people.
[0074] Furthermore, the server suggests personalized itineraries based on the user's past travel history and preferences. The generative AI model creates the optimal travel plan for the user based on the tourist destinations the user has visited and the activities they have enjoyed in the past. For example, for a user who enjoys visiting art museums, it can suggest an itinerary that includes information on special exhibitions at the places they are visiting.
[0075] Furthermore, to address emergencies during travel, the terminal continuously transmits flight and location information to the server in real time. The server analyzes this information and monitors for abnormal conditions. If one is detected, for example a flight delay, the server automatically suggests alternative options. Users can then quickly decide on their next course of action by checking this information on their terminal.
[0076] As a concrete example, a prompt might read, "I want to make a reservation at a popular restaurant near the Eiffel Tower. Please translate this into the local language." The AI model then provides an appropriate translation. This system allows users to enjoy a high-quality travel experience.
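The prompt in [0076] follows a simple pattern: the user's request followed by a translation instruction. A minimal Python sketch of that pattern is given below; the function name and the `target_language` parameter are hypothetical, and the actual prompt template used by the system is not disclosed beyond the single example.

```python
def build_translation_prompt(user_text: str,
                             target_language: str = "the local language") -> str:
    """Append a translation instruction to the user's request."""
    # strip() avoids a doubled space if the input has trailing whitespace.
    return f"{user_text.strip()} Please translate this into {target_language}."


prompt = build_translation_prompt(
    "I want to make a reservation at a popular restaurant near the Eiffel Tower."
)
```

Called with the default target, this reproduces the example prompt quoted above; passing a language name instead (for instance "French") would make the instruction explicit.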
[0077] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0078] Step 1:
[0079] The user inputs information using the terminal. The user enters the tasks they need done at their travel destination (e.g., restaurant reservations) into the terminal by voice or text. The input data is processed by an application installed on the terminal and prepared as reservation information or text for translation.
[0080] Step 2:
[0081] The terminal sends the information entered by the user to the server. The transmitted data is converted into a format (e.g., JSON format) that includes the user's instructions and context. This allows the server to accurately interpret the data after receiving it.
[0082] Step 3:
[0083] The server uses a generative artificial intelligence model to translate received input information into another language. The input here is the transmitted text data. The server's process involves contextual analysis using natural language processing techniques to generate an accurate translation output.
[0084] Step 4:
[0085] The server sends the translated data to the terminal. The translated output is then formatted again and sent to the terminal. This allows the terminal to receive the translation accurately.
[0086] Step 5:
[0087] The terminal displays the translation results received from the server to the user. The translated text is displayed on the terminal's screen. The user can visually confirm this and communicate as needed in the local language.
[0088] Step 6:
[0089] The terminal sends data about the user's past travel history and preferences to the server. The server analyzes this data and creates a travel plan optimized for the user. The plan includes information on preferred tourist destinations and events.
[0090] Step 7:
[0091] Even while traveling, the terminal continues to transmit flight and location information to the server in real time. The server uses this information to monitor for any abnormal situations and generates alternative plans if necessary. For example, if a flight is canceled, the server will suggest alternative flights or modes of transportation.
[0092] Step 8:
[0093] Users can review suggested alternatives through their devices and choose the appropriate course of action. This process minimizes travel interruptions and provides a smoother experience.
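Step 2 above mentions converting the input into a format such as JSON that carries the user's instruction and context. The following Python sketch shows one possible shape for that payload. The field names (`instruction`, `source_lang`, `target_lang`, `context`) are assumptions made for illustration; the disclosure does not define a schema.

```python
import json


def encode_request(text: str, source_lang: str, target_lang: str, context: str) -> str:
    """Terminal side (Step 2): serialize the instruction and context as JSON."""
    payload = {
        "instruction": text,
        "source_lang": source_lang,
        "target_lang": target_lang,
        "context": context,
    }
    # ensure_ascii=False keeps non-Latin text readable in the payload.
    return json.dumps(payload, ensure_ascii=False)


def decode_request(raw: str) -> dict:
    """Server side (Step 3): parse the payload and check required fields."""
    payload = json.loads(raw)
    missing = {"instruction", "target_lang"} - set(payload)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return payload


raw = encode_request("Table for two at 7 pm, please.", "en", "fr", "restaurant reservation")
request = decode_request(raw)
```

Validating the fields on receipt is one way the server could "accurately interpret the data", as Step 2 puts it; the same encoding could be reused for the response in Step 4.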
[0094] (Application Example 1)
[0095] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0096] A system is needed to enable travelers to maintain smooth communication in multicultural and multilingual environments and to respond quickly to unexpected situations during their trip. Furthermore, support is required to flexibly cope with unforeseen changes in circumstances during their travels. To achieve this, personalized information provision that takes into account the traveler's past history and preferences is essential.
[0097] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0098] In this invention, the server includes an information processing device means for receiving input information from a user, a server device means that uses a generative model to translate the input information into another language and enable context-based notifications, a visualization means for presenting and visually displaying the translated information to the user, and a robot means for responding to user requests through interactive communication. This enables the user to travel with peace of mind in a foreign culture without experiencing language barriers and while dynamically optimizing their travel plan.
[0099] An "information processing device" is a device that receives input information from a user and has the function of processing voice and text as data and transmitting it to a server.
[0100] A "generative model" is a technology that analyzes input information and translates it into a target language; it is an algorithm that generates appropriate translation results based on context.
[0101] A "server device means" is a central network device designed to translate and appropriately process information received from users, utilizing a generative model.
[0102] A "visualization means" is a device or interface that displays translated information to the user, allowing them to visually confirm the information.
[0103] A "robot means" is an autonomous machine that enables interactive communication with users, responds to user requests, and provides information.
[0104] To realize this invention, a system is constructed by coordinating an information processing device, a server device, a visualization device, and a robot.
[0105] The server implements a generative model and converts user input into text using speech recognition technology (such as Google® Speech-to-Text API). After the input information is converted to text, it is translated into the appropriate language using a generative AI model (such as DeepL API). The translated information is displayed to the user in real time via a visualization device. This could be a display or smart device that shows the translation results.
[0106] The robot acts as an interface for interacting with users, collecting information through user conversations and providing responses while communicating with a server. This allows travelers to communicate smoothly in unfamiliar cultures.
[0107] As an example of this system, if a user traveling in London asks the robot, "Can you tell me about some popular tourist attractions nearby?", the robot could respond, "There is a history museum nearby, and it is very popular." A specific example of a prompt to achieve this would be: "Translate user voice data and process it to respond in the user's native language. Example: 'What are some recommended tourist attractions nearby?'"
[0108] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0109] Step 1:
[0110] The user inputs their voice into the information processing device. The received voice data is converted into text data using speech recognition technology.
[0111] Step 2:
[0112] The text data is sent to the server. The server uses a generative model to translate the text data into the appropriate language. Contextual analysis is also performed to ensure context-aware translation. The output is the translated text.
[0113] Step 3:
[0114] The translated text is sent to a visualization device and displayed to the user. This allows the user to check the translation results on their own device.
[0115] Step 4:
[0116] If the user has additional questions or responses for the robot, they provide voice input as in Step 1 and communicate with the server again. The new input is received via the information processing device and processed by the server using a generative model. As a result, translated responses to the user's questions are returned in real time.
[0117] This process enables users to communicate smoothly even in multilingual environments while traveling.
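The Step 1 to Step 3 flow of Application Example 1 can be sketched as a pipeline of three interchangeable stages. In the Python illustration below, `recognize`, `translate`, and `display` are hypothetical callables standing in for the speech recognition service, the generative model, and the visualization device; no real API is called, and the stub behavior is invented for the example.

```python
from typing import Callable


def handle_utterance(
    audio: bytes,
    recognize: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    display: Callable[[str], None],
    target_lang: str,
) -> str:
    """Run speech recognition, translation, and display in order."""
    text = recognize(audio)                    # Step 1: audio to text
    translated = translate(text, target_lang)  # Step 2: server-side translation
    display(translated)                        # Step 3: show the result to the user
    return translated


# Stub stages for illustration only.
shown = []
result = handle_utterance(
    b"\x00\x01",  # placeholder audio bytes
    recognize=lambda audio: "Where is the station?",
    translate=lambda text, lang: f"[{lang}] {text}",
    display=shown.append,
    target_lang="fr",
)
```

Passing the stages in as functions keeps the flow independent of any particular vendor, so a follow-up question (Step 4) is simply another call to `handle_utterance`.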
[0118] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0119] This invention is an advanced system designed to address various situations users face during their travels and enhance their travel experience. This system not only utilizes user input data to provide translations and travel plans, but also recognizes the user's emotions and adapts its service delivery accordingly.
[0120] First, the user inputs their intentions and necessary information via voice or text into the device they will use for their trip. This input data is then transmitted to the server in real time by the device.
[0121] The server uses a generative model for translation and simultaneously analyzes the emotions contained in the user's input using an emotion engine. Based on this analysis, the translation and the information presented are adjusted to match the user's emotions. For example, if it is determined that the user is stressed, softer language can be used in the translation.
[0122] Furthermore, the server combines the user's past travel history with emotional data to generate a travel plan that is more closely personalized to the user. This travel plan is tailored not only to the user's preferences but also to their expected emotional state. For example, if the user is seeking relaxation, a plan that includes visits to quiet places such as museums and parks will be suggested.
[0123] During travel, the device continuously transmits flight information and environmental data to the server. This allows the server to detect abnormal conditions in real time and generate alternatives as needed. When suggesting alternatives, the emotion engine takes the user's emotional state into consideration, providing more appropriate options.
[0124] For example, if a user is experiencing discomfort due to a long wait at a foreign airport, the server can sense this emotion and provide information such as lounge access to help them spend their waiting time more comfortably. Thus, the present invention is a complex system that not only handles language and abnormal conditions but also provides adaptive support tailored to the user's emotions.
[0125] The following describes the processing flow.
[0126] Step 1:
[0127] Users enter necessary travel information into their device via voice or text. This data is collected based on the user's current requests and circumstances.
[0128] Step 2:
[0129] The device transmits user input data to the server in real time. This data is used for translation and sentiment analysis.
[0130] Step 3:
[0131] The server uses a generative model based on the received data to translate user input into the target language. This process takes context into account to ensure high translation accuracy.
[0132] Step 4:
[0133] The server simultaneously uses an emotion engine to recognize the emotions contained in the user's input data. Emotion analysis allows the user's emotional state to be understood.
[0134] Step 5:
[0135] The server adjusts the translation based on the emotion analysis results to provide the user with the most appropriate translation and information. For example, for a user experiencing stress, words with a calming tone will be selected.
[0136] Step 6:
[0137] The device presents the user with the adjusted translation results, supporting them in taking appropriate action in their local area.
[0138] Step 7:
[0139] The server generates personalized travel plans based on the user's emotional state, using their historical and emotional data.
[0140] Step 8:
[0141] The device presents the generated travel plan to the user and provides information to help the user achieve their desired travel experience.
[0142] Step 9:
[0143] During travel, the device periodically sends flight and environmental information to the server, continuously updating its status.
[0144] Step 10:
[0145] The server monitors this data, and if it detects an abnormal condition, it generates and quickly provides alternative solutions that take the user's feelings into consideration.
[0146] Step 11:
[0147] The device notifies the user of status updates, including alternatives, and assists the user in choosing their next course of action.
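As a non-limiting illustration, Steps 3 to 5 above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: `translate` and `detect_emotion` are hypothetical stand-ins for the generative model and the emotion engine, and the calming prefix is an assumed example of tone adjustment.

```python
from dataclasses import dataclass


@dataclass
class AdjustedResult:
    translation: str
    emotion: str
    message: str


def translate(text: str, target_lang: str) -> str:
    # Hypothetical stand-in for the generative-model translation (Step 3).
    return f"[{target_lang}] {text}"


def detect_emotion(text: str) -> str:
    # Hypothetical stand-in for the emotion engine (Step 4).
    stressed_words = ("worried", "stressed", "anxious", "tired")
    lowered = text.lower()
    return "stress" if any(w in lowered for w in stressed_words) else "neutral"


def adjust(text: str, target_lang: str) -> AdjustedResult:
    # Step 5: combine both results and add a calmer opening under stress.
    translation = translate(text, target_lang)
    emotion = detect_emotion(text)
    prefix = "No need to hurry. " if emotion == "stress" else ""
    return AdjustedResult(translation, emotion, prefix + translation)
```

In this sketch the emotion is detected from the original input rather than from the translation, so the tone adjustment does not depend on the target language.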
[0148] (Example 2)
[0149] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0150] There is a need to solve problems such as language barriers and stress caused by unexpected situations during modern travel, and to provide users with personalized travel experiences. However, conventional travel support systems have struggled to provide adaptive services that take into account the user's current emotional state and past preferences.
[0151] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0152] In this invention, the server includes information terminal means, data processing means, display means, plan generation means, and anomaly detection means. This enables real-time analysis of the user's intentions and emotions, provides adapted information and travel plans, and allows for a rich travel experience tailored to individual needs.
[0153] "Information terminal means" refers to a device that allows users to input information about their intentions and feelings during their travels, either by voice or text.
[0154] "Data processing means" refers to a process that utilizes an artificial intelligence engine to perform translation and sentiment analysis based on input user data.
[0155] A "display means" refers to a device or interface that visually presents the analyzed information in a way that is adapted to the user's emotional state.
[0156] The "plan generation means" is a function that generates a travel plan based on the user's past usage information and current emotional state.
[0157] "Anomaly detection means" refers to a function that monitors environmental information in real time and generates alternative solutions to respond to unexpected situations.
[0158] This invention is an advanced system for enhancing the user experience during travel in a more personalized way. This system is comprised of a combination of information terminal means, data processing means, display means, plan generation means, and anomaly detection means.
[0159] First, the user inputs their travel requirements and feelings into an information terminal via voice or text. The terminal includes voice recognition and a keyboard input device, and common devices such as smartphones and tablets are used.
[0160] The terminal sends this input data to the server, which processes the data using a generative AI model. Specifically, it uses natural language processing technology to perform translation and support communication between different languages. Furthermore, it uses an emotion analysis engine to detect the user's emotional state. This utilizes AI technology powered by cloud services.
[0161] Based on the analysis results, the server adjusts the information and returns it to the terminal in a way that is appropriate to the user's emotions. For example, if the user is feeling stressed, it will present a message that includes gentle language. LCD displays and voice guidance are used as means of display.
[0162] Furthermore, the plan generation means integrates the user's past travel history and emotional data to create a personalized travel plan. For a user who needs relaxation, for example, this plan includes quiet places such as museums and parks.
[0163] Environmental information is monitored in real time, and the anomaly detection means uses machine learning algorithms so that the server can detect changes in flight information and traffic conditions. This enables the rapid provision of alternative solutions even when users encounter unexpected situations.
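As a non-limiting illustration, the detection of a change in flight information can be sketched as follows. The disclosure refers to machine learning algorithms; this sketch substitutes a fixed delay threshold as a simplified stand-in. The `FlightStatus` structure, the `detect_anomaly` function, and the 30-minute default are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FlightStatus:
    flight_no: str
    scheduled: datetime
    estimated: Optional[datetime]  # None means the flight was cancelled


def detect_anomaly(status: FlightStatus,
                   max_delay: timedelta = timedelta(minutes=30)) -> Optional[str]:
    """Return 'cancelled', 'delayed', or None when the flight is on schedule."""
    if status.estimated is None:
        return "cancelled"
    # Assumed rule: a delay longer than max_delay counts as an anomaly.
    if status.estimated - status.scheduled > max_delay:
        return "delayed"
    return None
```

A learned model could replace the threshold rule without changing the interface, since callers only depend on the returned label.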
[0164] For example, if a user inputs "I want to go to a shopping mall, but I'm feeling stressed right now," the server will suggest a shopping mall with a relaxation space. An example of a prompt might be, "I'm in Paris. I'm feeling down. Where can I find a place to relax?"
[0165] These features enable the system to respond to users' emotions and needs in various situations during travel, providing a safe and comfortable travel experience.
[0166] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0167] Step 1:
[0168] Users input necessary information and emotions during their trip into the device via voice or text. This input data includes desired destinations, mood, and language translation preferences. The device receives this data, converts it into a structured format, and prepares to send it to the server.
[0169] Step 2:
[0170] The terminal sends user input data to the server. This data is securely transferred using the HTTPS protocol. The input includes user requests and emotions, which form the basis for subsequent data processing steps.
[0171] Step 3:
[0172] The server first processes the received data using a generative AI model. It analyzes the meaning of the data using natural language processing techniques and translates it into other languages. At this time, it uses open-source natural language processing libraries to analyze the input data and generate text translated into other languages.
[0173] Step 4:
[0174] The server simultaneously uses an emotion analysis engine to identify emotions from the user's input data. The analysis employs machine learning models to extract emotional indicators from the text. The output is generated as data indicating the user's emotional state, such as joy, stress, or relaxation.
[0175] Step 5:
[0176] The server generates messages and information tailored to the user's emotional state based on translation results and sentiment analysis. For example, if stress is detected, it creates a translation that incorporates gentler, more reassuring language. This data is then sent to the display device as the final response to the user.
[0177] Step 6:
[0178] The server uses a travel plan generation system to create personalized travel plans based on analysis results and the user's history information. The generated plans are constructed by referencing a database of past visited locations and preferences, and are then provided to the user.
[0179] Step 7:
[0180] The terminal continuously transmits real-time flight information, weather, and traffic conditions to the server. If an anomaly occurs, the server uses anomaly detection methods to create alternative plans. If flight delays or other issues are detected, the server suggests nearby, comfortable waiting locations to the user.
[0181] (Application Example 2)
[0182] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".
[0183] The challenge is that there is no effective system for alleviating the stress that travelers face in a foreign country, such as language barriers, anxiety, and changes in plans, and for thereby providing a more comfortable and personalized travel experience.
[0184] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0185] In this invention, the server includes information device means for receiving input data from a user, information processing means using a generative AI model to translate the input data into another natural language, information presentation means for presenting the translated information to the user, and emotion analysis means for analyzing the user's emotions. This makes it possible to provide information and propose plans that are tailored to the emotional state of the traveler.
[0186] "Information device means" refers to a device for receiving input data from a user.
[0187] A "generative AI model" is a system that includes algorithms for translating natural language and performing other information processing.
[0188] "Information processing means" refers to a process for translating input data into another language and generating information that is useful to the user.
[0189] "Information presentation means" refers to a device or method for displaying translated content or other information to a user.
[0190] "Emotion analysis means" refers to technology that analyzes the emotional state from user input data and generates appropriate information based on that analysis.
[0191] The system that implements this application consists primarily of a user, a terminal, and a server. When the user inputs data using voice or text via an information device, the terminal transmits that data to the server in real time. This is done using a communication module.
[0192] The server uses a generative AI model to translate input data into other natural languages, while simultaneously analyzing the user's emotions using emotion analysis tools. Emotion analysis is the process of identifying an emotional state based on the context and wording of the input data. Based on this information, translated information and additional suggestions are generated by information processing tools. The generated information is provided to the user via information presentation tools according to the user's emotional state. For example, if the system determines that the user is feeling anxious, it can suggest a plan to "visit small museums and parks that will allow you to enjoy your trip with peace of mind."
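As a non-limiting illustration, identifying an emotional state from the wording of the input can be sketched with a small keyword lexicon. This is a simplified stand-in and not the disclosed emotion analysis; the `LEXICON` contents and the `estimate_emotion` function are assumptions, and context-based analysis is outside the scope of this sketch.

```python
import re
from collections import Counter

# Assumed, illustrative lexicon; a real system would likely use a trained model.
LEXICON = {
    "anxiety": {"worried", "anxious", "nervous", "afraid"},
    "stress": {"stressed", "tired", "exhausted", "overwhelmed"},
    "joy": {"happy", "excited", "glad", "wonderful"},
}


def estimate_emotion(text: str) -> str:
    """Return the emotion whose cue words occur most often, else 'neutral'."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for emotion, cues in LEXICON.items():
        counts[emotion] = sum(1 for w in words if w in cues)
    emotion, hits = counts.most_common(1)[0]
    return emotion if hits > 0 else "neutral"
```

The returned label could then select the style of the suggestion, for example a reassuring plan when "anxiety" is returned.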
[0193] As a concrete example, consider a situation where a user is feeling anxious about a hotel reservation. When the user speaks to their information device, the device senses their anxiety and sends data to the server. The server generates a translated message and emotionally sensitive information such as, "Don't worry, your reservation has been confirmed. We'll guide you on transportation so you can check in on time."
[0194] Generative AI model prompt example:
[0195] "If there are flight delays or cancellations, please provide information about relaxing spas."
[0196] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0197] Step 1:
[0198] The user inputs voice or text data through an information device. This data often contains information and questions related to the user's travel. The terminal collects this input data and prepares it for the next processing.
[0199] Step 2:
[0200] The terminal uses a communication module to send input data to the server. The input data is converted to text by a speech recognition engine and sent to the server. The output here is the data obtained by converting speech to text.
[0201] Step 3:
[0202] The server uses a generative AI model to translate text data into other natural languages. In this process, the conditions for translating the input data are determined based on prompt statements. The output is the translated text.
[0203] Step 4:
[0204] The server uses sentiment analysis tools to analyze the user's emotional state from the translated data. This analysis extracts specific emotional signals from the input data. The output provides information about the user's emotional state.
[0205] Step 5:
[0206] The server combines the translation results and sentiment analysis results to generate information or suggestions tailored to the user's emotional state. During this process, the information is presented in a way that flexibly responds to the user's emotional state. The output is customized information or suggestions provided to the user.
[0207] Step 6:
[0208] The terminal displays customized information received from the server to the user using an information presentation mechanism. This allows the user to directly receive appropriate and personalized information. The output is the information ultimately presented to the user.
[0209] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0210] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.
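As a non-limiting illustration, pairing a prompt containing an instruction with text inference data can be sketched as follows. The `InferenceRequest` structure, the `build_prompt` function, and the prompt layout are assumptions made for illustration; they do not describe the input format of any particular generative AI service.

```python
from dataclasses import dataclass


@dataclass
class InferenceRequest:
    instruction: str        # what the model is asked to do
    data: str               # inference data, limited to text in this sketch
    data_type: str = "text"


def build_prompt(req: InferenceRequest) -> str:
    """Join the instruction and the inference data into one prompt string."""
    if req.data_type != "text":
        raise ValueError("this sketch only handles text inference data")
    return f"Instruction: {req.instruction}\nInput ({req.data_type}): {req.data}"
```

Audio or image inference data would need a different encoding, which is why the sketch rejects them instead of guessing at a format.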
[0211] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0212] [Second Embodiment]
[0213] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0214] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0215] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0216] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0217] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0218] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0219] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0220] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0221] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0222] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0223] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0224] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0225] This invention is a system designed to help travelers overcome language barriers and enable smooth communication and quick situational responses during their travels. This system operates using the user's smartphone or tablet and an external server.
[0226] First, the user uses the device to input the information they need during their trip. For example, if the user wants to make a reservation at a restaurant in their destination, they input this instruction into the device via voice or text. The device then transmits this input data to the server in real time.
[0227] The server uses a powerful generative model to translate incoming data into other languages. In doing so, the server considers the context to produce accurate translations. The generated translations are sent back to the terminal and presented to the user. This process allows users to communicate seamlessly with locals even in foreign countries.
[0228] The device also records the user's past travel history and preferences and sends them to the server. Based on this data, the server generates personalized travel plans tailored to the user's interests. For example, a user who enjoys visiting art museums will be offered a plan that includes information on special exhibitions being held in the city they are visiting.
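As a non-limiting illustration, matching candidate plans to the user's interests can be sketched as a tag-overlap ranking. This is an assumed, simplified scoring rule and not the disclosed plan generation; the `rank_plans` function, the tag sets, and the tie-breaking rule are hypothetical.

```python
from typing import Dict, List, Set


def rank_plans(plans: Dict[str, Set[str]], interests: Set[str],
               top_n: int = 2) -> List[str]:
    """Order plans by how many of their tags overlap the user's interests."""
    scored = [(len(tags & interests), name) for name, tags in plans.items()]
    # Highest overlap first; ties are broken alphabetically by plan name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    # Plans with no overlap at all are dropped rather than padded in.
    return [name for score, name in scored if score > 0][:top_n]
```

For a user whose recorded interests are art and museums, a plan tagged with both would be ranked ahead of a plan tagged with only one of them.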
[0229] Furthermore, the device continuously transmits flight information, location data, and other information collected during the trip to the server. The server analyzes this data and automatically generates alternatives if an abnormal situation occurs. For example, if a flight is canceled, the server suggests and presents alternative flights or modes of transportation to the user, ensuring that the flow of travel is not interrupted.
[0230] As a practical example, if a user is traveling in Paris, they can book a restaurant near the Eiffel Tower, receive a plan of museums they can visit, and get immediate information on new flights if there are flight delays. In this way, the present invention functions as a system that comprehensively provides support to travelers in various situations.
[0231] The following describes the processing flow.
[0232] Step 1:
[0233] Users input information via voice or text by operating a device. This information is necessary to support the user's communication and procedures on-site.
[0234] Step 2:
[0235] The terminal sends user input data to the server. During this process, the data is converted to an appropriate format before being sent.
[0236] Step 3:
[0237] The server analyzes the received data using a generative model and translates it into the specified language. This model enables highly accurate translation that takes context into account.
[0238] Step 4:
[0239] The server sends the translation results back to the terminal. The translated data is returned immediately, supporting multilingual communication.
[0240] Step 5:
[0241] The device displays the received translation results to the user and plays them back as audio if necessary. This allows the user to communicate smoothly with local people.
[0242] Step 6:
[0243] The server creates personalized travel plans based on the user's travel history and preferences. Machine learning is used in this process to provide the user with the most suitable plan.
[0244] Step 7:
[0245] The terminal displays travel plans received from the server to the user. The user can then review the proposed plans and create a trip that suits their preferences.
[0246] Step 8:
[0247] During the user's trip, the device periodically sends flight information and location data to the server. This ensures that the server always receives the latest travel information.
[0248] Step 9:
[0249] The server monitors this data and, upon detecting an abnormal condition (e.g., flight delays or cancellations), immediately generates a response plan.
[0250] Step 10:
[0251] The server sends a list of possible solutions, including alternatives, to the terminal and notifies the user. The user can then review the notification and decide on their next course of action from the presented options.
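As a non-limiting illustration, Steps 9 and 10 above can be sketched for the case of a cancelled flight. The `Flight` structure, the `propose_alternatives` function, and the selection rule (same destination, later departure, seats available) are assumptions made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Flight:
    flight_no: str
    destination: str
    departure: datetime
    seats_left: int


def propose_alternatives(cancelled: Flight, schedule: List[Flight],
                         limit: int = 3) -> List[Flight]:
    """Pick later flights to the same destination that still have seats."""
    candidates = [
        f for f in schedule
        if f.destination == cancelled.destination
        and f.departure > cancelled.departure
        and f.seats_left > 0
    ]
    # Earliest departure first, so the user loses as little time as possible.
    return sorted(candidates, key=lambda f: f.departure)[:limit]
```

The resulting list corresponds to the options sent to the terminal in Step 10, from which the user chooses the next course of action.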
[0252] (Example 1)
[0253] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0254] Modern travelers face numerous challenges, including language barriers in foreign lands, the need for appropriate information tailored to their individual travel needs, and the need for quick responses to unexpected problems. These challenges hinder the smooth progress of travel and significantly degrade the user experience. The present invention aims to comprehensively solve these problems and improve the convenience and satisfaction of travelers.
[0255] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0256] In this invention, the server includes an information processing device means for acquiring information from a user, an information processing device means for using a generative artificial intelligence model to convert the information into another language, and a means for detecting abnormal conditions based on external information collected during travel and automatically creating alternative solutions. This enables smooth communication that transcends language barriers, the provision of personalized information, and rapid problem solving.
[0257] "Information processing device means" refers to a device or system that has the function of acquiring and processing input information from a user, and further transmitting and receiving data with an external system.
[0258] A "generative artificial intelligence model" is a model that uses advanced artificial intelligence technology to translate user input data into other languages or to generate recommended plans based on user behavior patterns.
[0259] "Output device means" refers to a processing device that displays or outputs audio to present information transmitted from the server to the user visually or audibly.
[0260] "Personalized suggestions" refer to suggestions that aim to provide optimal information and recommended plans tailored to the individual user's needs, based on the user's past behavior history and preferences.
[0261] An "abnormal condition" refers to a situation that deviates from the planned schedule or normal conditions, and includes, in particular, flight delays or cancellations during travel, as well as other unexpected problems.
[0262] This invention provides a system that enables travelers to overcome language barriers abroad, facilitating smooth communication and rapid situational response. The system consists of a terminal such as a smartphone or tablet for receiving user input and an external server for data processing. The system is realized by combining an information processing device, a generative AI model, and an output device.
[0263] First, the user inputs the information they need during their trip through the device. For example, they might use voice or text input to make a reservation at a restaurant they want to visit. This operation is performed using an application on the device. The device acquires data using voice recognition software or a text input system and sends that information to the server.
[0264] The server uses a generative AI model to translate received information into other languages. Leveraging natural language processing technology, it performs highly accurate, context-aware translations, allowing users to obtain accurate information even if they are unfamiliar with the foreign language. The translated information is then sent back to the terminal and displayed on the terminal's screen. Furthermore, the user can visually confirm the outputted translation results, enabling them to communicate effectively with local people.
[0265] Furthermore, the server suggests personalized itineraries based on the user's past travel history and preferences. The generative AI model creates the optimal travel plan for the user based on the tourist destinations the user has visited and the activities they have enjoyed in the past. For example, for a user who enjoys visiting art museums, it can suggest an itinerary that includes information on special exhibitions at the places they are visiting.
[0266] Furthermore, to address emergencies during travel, the device continuously transmits flight and location information to the server in real time, regardless of the situation. The server analyzes this information and monitors for any abnormalities. If an anomaly is detected, for example, if a flight delay occurs, the server automatically suggests alternative options. Users can then quickly decide on their next course of action by checking this information on their device.
[0267] As a concrete example, a prompt might read, "I want to make a reservation at a popular restaurant near the Eiffel Tower. Please translate this into the local language." The AI model then provides an appropriate translation. This system allows users to enjoy a high-quality travel experience.
[0268] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0269] Step 1:
[0270] The user inputs information using a device. The user enters necessary tasks at their travel destination (e.g., restaurant reservations) into the device via voice or text. The input data is processed by an application installed on the device and prepared as reservation information or text for translation.
[0271] Step 2:
[0272] The terminal sends the information entered by the user to the server. The transmitted data is converted into a format (e.g., JSON format) that includes the user's instructions and context. This allows the server to accurately interpret the data after receiving it.
[0273] Step 3:
[0274] The server uses a generative artificial intelligence model to translate received input information into another language. The input here is the transmitted text data. The server's process involves contextual analysis using natural language processing techniques to generate an accurate translation output.
[0275] Step 4:
[0276] The server sends the translated data to the terminal. The translated output is then formatted again and sent to the terminal. This allows the terminal to receive the translation accurately.
[0277] Step 5:
[0278] The terminal displays the translation results received from the server to the user. The translated text is displayed on the terminal's screen. The user can visually confirm this and communicate as needed in the local language.
[0279] Step 6:
[0280] The terminal sends data related to the user's past travel history and preferences to the server. The server analyzes this data and creates an optimal travel plan for the user. The plan includes information on preferred tourist destinations and events.
[0281] Step 7:
[0282] Even during the trip, the terminal continues to send flight information and location information to the server in real time. The server monitors for abnormal situations based on this information and generates alternative plans if necessary. For example, when a flight is cancelled, the server proposes alternative flights or transportation means.
[0283] Step 8:
[0284] The user can confirm the proposed alternatives through the terminal and select appropriate actions. This process makes it possible to minimize travel disruptions and provide a smooth experience.
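As a non-limiting illustration, the JSON conversion mentioned in Step 2 above can be sketched as follows. The field names (`instruction`, `context`, `task`, `source_lang`, `target_lang`) and the `build_payload` function are assumptions made for illustration; the disclosure states only that the format includes the user's instructions and context.

```python
import json
from typing import Any, Dict


def build_payload(text: str, source_lang: str, target_lang: str, task: str) -> str:
    """Serialize the user's instruction and its context as a JSON string."""
    payload: Dict[str, Any] = {
        "instruction": text,
        "context": {
            "task": task,
            "source_lang": source_lang,
            "target_lang": target_lang,
        },
    }
    # ensure_ascii=False keeps non-ASCII user input readable in the output.
    return json.dumps(payload, ensure_ascii=False)
```

Keeping the context separate from the instruction lets the server decide how to translate without parsing the user's free text.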
[0285] (Application Example 1)
[0286] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".
[0287] There is a need for a system that enables travelers to maintain smooth communication in an environment of different cultures and languages and to quickly respond to abnormal situations during the trip. Also, support that can flexibly handle unexpected changes in the situation during the trip is required. To achieve this, personalized information provision considering the traveler's own past history and preferences is essential.
[0288] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0289] In this invention, the server includes an information processing device means for receiving input information from a user, a server device means that uses a generative model to translate the input information into another language and enable context-based notifications, a visualization means for presenting and visually displaying the translated information to the user, and a robot means for responding to user requests through interactive communication. This enables the user to travel with peace of mind in a foreign culture without experiencing language barriers and while dynamically optimizing their travel plan.
[0290] An "information processing device" is a device that receives input information from a user and has the function of processing voice and text as data and transmitting it to a server.
[0291] A "generative model" is a technology that analyzes input information and translates it into a target language; it is an algorithm that generates appropriate translation results based on context.
[0292] A "server device means" is a central network device designed to translate and appropriately process information received from users, utilizing a generative model.
[0293] A "visualization means" is a device or interface that displays translated information to the user, allowing them to visually confirm the information.
[0294] A "robot means" is an autonomous machine that enables interactive communication with users, responds to user requests, and provides information.
[0295] To realize this invention, a system is constructed by coordinating an information processing device, a server device, a visualization device, and a robot.
[0296] The server implements a generative model, converting user input into text using speech recognition technology (such as the Google Speech-to-Text API). After the input is converted to text, it is translated into the appropriate language using a generative AI model (such as the DeepL API). The translated information is displayed to the user in real time via a visualization device, such as a display or smart device that shows the translation results.
[0297] The robot acts as an interface for interacting with users, collecting information through user conversations and providing responses while communicating with a server. This allows travelers to communicate smoothly in unfamiliar cultures.
[0298] As an example of this system, if a user traveling in London asks the robot, "Can you tell me about some popular tourist attractions nearby?", the robot could respond, "There is a history museum nearby, and it is very popular." A specific example of a prompt to achieve this would be: "Translate user voice data and process it to respond in the user's native language. Example: 'What are some recommended tourist attractions nearby?'"
[0299] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0300] Step 1:
[0301] The user inputs their voice into the information processing device. The received voice data is converted into text data using speech recognition technology.
[0302] Step 2:
[0303] The text data is sent to the server. The server uses a generative model to translate the text data into the appropriate language. Contextual analysis is also performed to ensure context-aware translation. The output is the translated text.
[0304] Step 3:
[0305] The translated text is sent to the visualization device and displayed to the user. This allows the user to check the translation result on their terminal.
[0306] Step 4:
[0307] When the user asks the robot additional questions or responds to it, the user inputs voice as in the previous steps and communicates with the server again. The latest information is input via the information processing device, and the server processes it using the generative model. As a result, a translated response to the user's question is returned in real time.
[0308] This process enables a traveling user to communicate smoothly even in a foreign language environment.
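Steps 1 to 4 above can be sketched as a small pipeline. This is only a minimal illustration under stated assumptions, not the implementation of the disclosure: the names `TranslationPipeline`, `recognize`, `translate`, and `display` are hypothetical, and the stub functions stand in for a real speech recognition service and generative model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TranslationPipeline:
    """Hypothetical wiring of Steps 1-4: recognize -> translate -> display."""
    recognize: Callable[[bytes], str]            # Step 1: voice data -> text
    translate: Callable[[str, List[str]], str]   # Step 2: text + context -> translation
    display: Callable[[str], None]               # Step 3: show the result on the terminal
    history: List[str] = field(default_factory=list)  # earlier turns, used as context

    def handle_utterance(self, voice_data: bytes) -> str:
        text = self.recognize(voice_data)
        translated = self.translate(text, list(self.history))
        self.display(translated)
        # Step 4: keep the turn so follow-up questions are translated in context.
        self.history.append(text)
        return translated


# Stubs standing in for real services (assumptions for illustration only).
def fake_recognize(voice_data: bytes) -> str:
    return voice_data.decode("utf-8")


def fake_translate(text: str, context: List[str]) -> str:
    return f"[en, {len(context)} prior turns] {text}"


shown: List[str] = []
pipeline = TranslationPipeline(fake_recognize, fake_translate, shown.append)
first = pipeline.handle_utterance("近くの観光地を教えてください".encode("utf-8"))
second = pipeline.handle_utterance("そこは何時まで開いていますか".encode("utf-8"))
```

Passing the stages in as callables keeps the sketch independent of any particular speech recognition or translation service. A real deployment would replace the stubs with network calls to the server.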
[0309] Furthermore, an emotion engine for estimating the user's emotion may be combined. That is, the specific processing unit 290 may estimate the user's emotion using the emotion identification model 59 and perform the specific processing using the user's emotion.
[0310] The present invention is an advanced system for dealing with various situations faced by users during travel and improving the travel experience. This system not only utilizes the user's input data for translation and travel plan provision, but also recognizes the user's emotion and adapts service provision based on it.
[0311] First, the user inputs their intentions and necessary information in voice or text to the terminal used during travel. This input data is sent to the server in real time by the terminal.
[0312] The server uses a generative model for translation and simultaneously analyzes the emotions contained in the user's input using an emotion engine. Based on this analysis, the translation and the information presented are adjusted to match the user's emotions. For example, if it is determined that the user is stressed, softer language can be used in the translation.
[0313] Furthermore, the server combines the user's past travel history with emotional data to generate a travel plan that further personalizes the user's travel experience. This travel plan is tailored not only to the user's preferences but also to their expected emotional state. For example, if the user is seeking relaxation, a plan that includes visits to quiet places such as museums and parks will be suggested.
[0314] During travel, the device continuously transmits flight information and environmental data to the server. This allows the server to detect abnormal conditions in real time and generate alternatives as needed. When suggesting alternatives, the emotion engine takes the user's emotional state into consideration, providing more appropriate options.
[0315] For example, if a user is experiencing discomfort due to a long wait at a foreign airport, the server can sense this emotion and provide information such as lounge access to help them spend their waiting time more comfortably. Thus, the present invention is a complex system that not only handles language and abnormal conditions but also provides adaptive support tailored to the user's emotions.
[0316] The following describes the processing flow.
[0317] Step 1:
[0318] Users enter necessary travel information into their device via voice or text. This data is collected based on the user's current requests and circumstances.
[0319] Step 2:
[0320] The device transmits user input data to the server in real time. This data is used for translation and sentiment analysis.
[0321] Step 3:
[0322] The server uses a generative model based on the received data to translate user input into the target language. This process takes context into account to ensure high translation accuracy.
[0323] Step 4:
[0324] The server simultaneously uses an emotion engine to recognize the emotions contained in the user's input data. Emotion analysis allows the user's emotional state to be understood.
[0325] Step 5:
[0326] The server adjusts the translation based on the sentiment analysis results to provide the user with the most appropriate translation and information. For example, for a user experiencing stress, words with a calming tone will be selected.
[0327] Step 6:
[0328] The device presents the user with the adjusted translation results, supporting them in taking appropriate action in their local area.
[0329] Step 7:
[0330] The server generates personalized travel plans based on the user's emotional state, using their historical and emotional data.
[0331] Step 8:
[0332] The device presents the generated travel plan to the user and provides information to help the user achieve their desired travel experience.
[0333] Step 9:
[0334] During travel, the device periodically sends flight and environmental information to the server, continuously updating its status.
[0335] Step 10:
[0336] The server monitors this data, and if it detects an abnormal condition, it generates and quickly provides alternative solutions that take the user's feelings into consideration.
[0337] Step 11:
[0338] The device notifies the user of status updates, including alternatives, and assists the user in choosing their next course of action.
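The emotion-aware part of this flow (Steps 4 and 10) can be illustrated as follows. This is a sketch under loud assumptions: the keyword lexicon is a toy stand-in for the emotion engine, and `Alternative`, `estimate_emotion`, `rank_alternatives`, and the weighting values are hypothetical names and numbers chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

STRESS_WORDS = {"tired", "worried", "stressed", "anxious"}  # toy lexicon (assumption)


def estimate_emotion(text: str) -> str:
    """Toy stand-in for the emotion engine: returns 'stressed' or 'neutral'."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return "stressed" if words & STRESS_WORDS else "neutral"


@dataclass
class Alternative:
    description: str
    minutes_lost: int  # extra travel time compared with the original plan
    comfort: int       # 0 (poor) to 5 (very comfortable)


def rank_alternatives(options: List[Alternative], emotion: str) -> List[Alternative]:
    """Step 10: order alternatives, weighting comfort more for a stressed user."""
    comfort_weight = 30 if emotion == "stressed" else 5
    return sorted(options, key=lambda o: o.minutes_lost - comfort_weight * o.comfort)


options = [
    Alternative("Next direct flight, tight connection", minutes_lost=60, comfort=1),
    Alternative("Later flight with lounge access", minutes_lost=150, comfort=5),
]
emotion = estimate_emotion("I'm so tired and worried about this delay.")
best = rank_alternatives(options, emotion)[0]
```

With these illustrative weights, a stressed user is offered the more comfortable option first, while a neutral user is offered the faster one.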
[0339] (Example 2)
[0340] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0341] There is a need to solve problems such as language barriers and stress caused by unexpected situations during modern travel, and to provide users with personalized travel experiences. However, conventional travel support systems have struggled to provide adaptive services that take into account the user's current emotional state and past preferences.
[0342] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0343] In this invention, the server includes information terminal means, data processing means, display means, plan generation means, and anomaly detection means. This enables real-time analysis of the user's intentions and emotions, provides adapted information and travel plans, and allows for a rich travel experience tailored to individual needs.
[0344] An "information terminal means" is a device that allows users to input information about their intentions and feelings during their travels, either by voice or text.
[0345] "Data processing means" refers to a process that utilizes an artificial intelligence engine to perform translation and sentiment analysis based on input user data.
[0346] A "display means" refers to a device or interface that visually presents the analyzed information in a way that is adapted to the user's emotional state.
[0347] The "plan generation means" is a function that generates a travel plan based on the user's past usage information and current emotional state.
[0348] An "anomaly detection means" is a function that monitors environmental information in real time and generates alternative solutions to respond to unexpected situations.
[0349] This invention is an advanced system for enhancing the user experience during travel in a more personalized way. This system comprises a combination of information terminal means, data processing means, display means, plan generation means, and anomaly detection means.
[0350] First, the user inputs their travel requirements and feelings into an information terminal via voice or text. The terminal includes voice recognition and a keyboard input device, and common devices such as smartphones and tablets are used.
[0351] The terminal sends this input data to the server, which processes the data using a generative AI model. Specifically, it uses natural language processing technology to perform translation and support communication between different languages. Furthermore, it uses an emotion analysis engine to detect the user's emotional state. This utilizes AI technology powered by cloud services.
[0352] Based on the analysis results, the server adjusts the information and returns it to the terminal in a way that is appropriate to the user's emotions. For example, if the user is feeling stressed, it will present a message that includes gentle language. LCD displays and voice guidance are used as means of display.
[0353] Furthermore, the plan generation means integrates the user's past travel history and emotional data to create a personalized travel plan. For example, the plan may include quiet places such as museums and parks that help the user relax.
[0354] Environmental information is monitored in real time, and the anomaly detection means uses machine learning algorithms so that the server can detect changes in flight information and traffic conditions. This enables the rapid provision of alternative solutions even when users encounter unexpected situations.
[0355] For example, if a user inputs "I want to go to a shopping mall, but I'm feeling stressed right now," the server will suggest a shopping mall with a relaxation space. An example of a prompt might be, "I'm in Paris. I'm feeling down. Where can I find a place to relax?"
[0356] These features enable the system to respond to users' emotions and needs in various situations during travel, providing a safe and comfortable travel experience.
[0357] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0358] Step 1:
[0359] Users input necessary information and emotions during their trip into the device via voice or text. This input data includes desired destinations, mood, and language translation preferences. The device receives this data, converts it into a structured format, and prepares to send it to the server.
[0360] Step 2:
[0361] The terminal sends user input data to the server. This data is securely transferred using the HTTPS protocol. The input includes user requests and emotions, which form the basis for subsequent data processing steps.
[0362] Step 3:
[0363] The server first processes the received data using a generative AI model. It analyzes the meaning of the data using natural language processing techniques and translates it into other languages. At this time, it uses open-source natural language processing libraries to analyze the input data and generate text translated into other languages.
[0364] Step 4:
[0365] The server simultaneously uses an emotion analysis engine to identify emotions from the user's input data. The analysis employs machine learning models to extract emotional indicators from the text. The output is generated as data indicating the user's emotional state, such as joy, stress, or relaxation.
[0366] Step 5:
[0367] The server generates messages and information tailored to the user's emotional state based on translation results and sentiment analysis. For example, if stress is detected, it creates a translation that incorporates gentler, more reassuring language. This data is then sent to the display device as the final response to the user.
[0368] Step 6:
[0369] The server uses the plan generation means to create personalized travel plans based on the analysis results and the user's history information. The generated plans are constructed by referencing a database of previously visited locations and preferences, and are then provided to the user.
[0370] Step 7:
[0371] The terminal continuously transmits real-time flight information, weather, and traffic conditions to the server. If an anomaly occurs, the server uses the anomaly detection means to create alternative plans. For example, if a flight delay is detected, the server suggests nearby, comfortable waiting locations to the user.
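Two parts of this flow lend themselves to a short sketch: packing the input into a structured format (Steps 1 and 2) and reacting to a flight delay (Step 7). The field names `text`, `mood`, and `target_lang`, the functions `build_payload` and `detect_delay`, and the 30-minute threshold are all assumptions for illustration. A fixed threshold rule is used here in place of the machine learning algorithms mentioned above, purely to keep the example small.

```python
import json
from datetime import datetime, timedelta
from typing import Optional


def build_payload(text: str, mood: str, target_lang: str) -> str:
    """Steps 1-2: pack user input into JSON before it is sent over HTTPS."""
    return json.dumps(
        {"text": text, "mood": mood, "target_lang": target_lang},
        ensure_ascii=False,  # keep non-ASCII input readable in the payload
    )


def detect_delay(scheduled: datetime, estimated: datetime,
                 threshold: timedelta = timedelta(minutes=30)) -> Optional[str]:
    """Step 7: return a suggestion if the delay exceeds the threshold, else None."""
    delay = estimated - scheduled
    if delay > threshold:
        minutes = int(delay.total_seconds() // 60)
        return f"Flight delayed by {minutes} min; suggesting a quiet waiting area nearby."
    return None


payload = build_payload("ショッピングモールに行きたい", "stressed", "fr")
scheduled = datetime(2025, 1, 10, 9, 0)
suggestion = detect_delay(scheduled, scheduled + timedelta(minutes=95))
```

The transport itself (an HTTPS request from the terminal to the server) is omitted, since the endpoint and authentication scheme are not specified in the description.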
[0372] (Application Example 2)
[0373] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0374] The challenge lies in the lack of effective systems to alleviate the stress of language barriers, anxiety, and changes in plans that travelers face when encountering various situations in a foreign country, thereby providing a more comfortable and personalized travel experience.
[0375] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0376] In this invention, the server includes information device means for receiving input data from a user, information processing means using a generative AI model to translate the input data into another natural language, information presentation means for presenting the translated information to the user, and emotion analysis means for analyzing the user's emotions. This makes it possible to provide information and propose plans that are tailored to the emotional state of the traveler.
[0377] "Information device means" refers to a device for receiving input data from a user.
[0378] A "generative AI model" is a system that includes algorithms for translating natural language and performing other information processing.
[0379] "Information processing means" refers to a process for translating input data into another language and generating information that is useful to the user.
[0380] "Information presentation means" refers to a device or method for displaying translated content or other information to a user.
[0381] "Emotion analysis means" refers to technology that analyzes the emotional state from user input data and generates appropriate information based on that analysis.
[0382] The system that implements this application consists primarily of a user, a terminal, and a server. When the user inputs data using voice or text via an information device, the terminal transmits that data to the server in real time. This is done using a communication module.
[0383] The server uses a generative AI model to translate input data into other natural languages, while simultaneously analyzing the user's emotions using the emotion analysis means. Emotion analysis is the process of identifying an emotional state based on the context and wording of the input data. Based on this information, the information processing means generates translated information and additional suggestions. The generated information is provided to the user via the information presentation means in accordance with the user's emotional state. For example, if the system determines that the user is feeling anxious, it can suggest a plan to "visit small museums and parks that will allow you to enjoy your trip with peace of mind."
[0384] As a concrete example, consider a situation where a user is feeling anxious about a hotel reservation. When the user speaks to their information device, the device senses their anxiety and sends data to the server. The server generates a translated message and emotionally sensitive information such as, "Don't worry, your reservation has been confirmed. We'll guide you on transportation so you can check in on time."
[0385] Generative AI model prompt example:
[0386] "If there are flight delays or cancellations, please provide information about relaxing spas."
[0387] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0388] Step 1:
[0389] The user inputs voice or text data through an information device. This data often contains information and questions related to the user's travel. The terminal collects this input data and prepares it for the next processing.
[0390] Step 2:
[0391] The terminal uses a communication module to send input data to the server. The input data is converted to text by a speech recognition engine and sent to the server. The output here is the data obtained by converting speech to text.
[0392] Step 3:
[0393] The server uses a generative AI model to translate text data into other natural languages. In this process, the conditions for translating the input data are determined based on prompt statements. The output is the translated text.
[0394] Step 4:
[0395] The server uses the emotion analysis means to analyze the user's emotional state from the translated data. This analysis extracts specific emotional signals from the input data. The output is information about the user's emotional state.
[0396] Step 5:
[0397] The server combines the translation results and sentiment analysis results to generate information or suggestions tailored to the user's emotional state. During this process, the information is presented in a way that flexibly responds to the user's emotional state. The output is customized information or suggestions provided to the user.
[0398] Step 6:
[0399] The terminal displays customized information received from the server to the user using an information presentation mechanism. This allows the user to directly receive appropriate and personalized information. The output is the information ultimately presented to the user.
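Step 5 above, in which the translation result and the emotion analysis result are combined, could be realized by assembling a prompt statement for the generative AI model. The following is a hedged sketch only: `TONE_BY_EMOTION`, `build_suggestion_prompt`, and the wording of the instructions are hypothetical and are not taken from the disclosure.

```python
# Hypothetical mapping from detected emotion to a tone instruction (assumption).
TONE_BY_EMOTION = {
    "anxious": "Use calm, reassuring wording.",
    "stressed": "Use gentle, simple wording.",
}


def build_suggestion_prompt(translated_text: str, emotion: str) -> str:
    """Step 5: combine the translated text and the emotion into one prompt."""
    lines = ["Respond to the user's request below with helpful travel information."]
    tone = TONE_BY_EMOTION.get(emotion)
    if tone:
        lines.append(tone)  # only added when the emotion calls for a tone change
    lines.append(f"User input: {translated_text}")
    return "\n".join(lines)


prompt = build_suggestion_prompt("Is my hotel reservation confirmed?", "anxious")
```

Keeping the tone instructions in a lookup table makes it straightforward to add further emotional states without changing the prompt-building logic.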
[0400] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0401] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0402] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0403] [Third Embodiment]
[0404] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0405] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0406] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0407] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0408] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0409] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0410] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0411] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0412] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0413] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0414] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0415] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0416] This invention is a system designed to help travelers overcome language barriers and enable smooth communication and quick situational responses during their travels. This system operates using the user's smartphone or tablet and an external server.
[0417] First, the user uses the device to input the information they need during their trip. For example, if the user wants to make a reservation at a restaurant in their destination, they input this instruction into the device via voice or text. The device then transmits this input data to the server in real time.
[0418] The server uses a powerful generative model to translate incoming data into other languages. In doing so, the server considers the context to produce accurate translations. The generated translations are sent back to the terminal and presented to the user. This process allows users to communicate seamlessly with locals even in foreign countries.
[0419] The device also records the user's past travel history and preferences and sends them to the server. Based on this data, the server generates personalized travel plans tailored to the user's interests. For example, a user who enjoys visiting art museums will be offered a plan that includes information on special exhibitions being held in the city they are visiting.
[0420] Furthermore, the device continuously transmits flight information, location data, and other information collected during the trip to the server. The server analyzes this data and automatically generates alternatives if an abnormal situation occurs. For example, if a flight is canceled, the server suggests and presents alternative flights or modes of transportation to the user, ensuring that the flow of travel is not interrupted.
[0421] As a practical example, if a user is traveling in Paris, they can book a restaurant near the Eiffel Tower, receive a plan of museums they can visit, and get immediate information on new flights if there are flight delays. In this way, the present invention functions as a system that comprehensively provides support to travelers in various situations.
[0422] The following describes the processing flow.
[0423] Step 1:
[0424] Users input information via voice or text by operating a device. This information is necessary to support the user's communication and procedures on-site.
[0425] Step 2:
[0426] The terminal sends user input data to the server. During this process, the data is converted to an appropriate format before being sent.
[0427] Step 3:
[0428] The server analyzes the received data using a generative model and translates it into the specified language. This model enables highly accurate translation that takes context into account.
[0429] Step 4:
[0430] The server sends the translation results back to the terminal. The translated data is returned immediately, supporting multilingual communication.
[0431] Step 5:
[0432] The device displays the received translation results to the user and plays them back as audio if necessary. This allows the user to communicate smoothly with local people.
[0433] Step 6:
[0434] The server creates personalized travel plans based on the user's travel history and preferences. Machine learning is used in this process to provide the user with the most suitable plan.
[0435] Step 7:
[0436] The terminal displays travel plans received from the server to the user. The user can then review the proposed plans and create a trip that suits their preferences.
[0437] Step 8:
[0438] During the user's trip, the device periodically sends flight information and location data to the server. This ensures that the server always receives the latest travel information.
[0439] Step 9:
[0440] The server monitors this data and, upon detecting an abnormal condition (e.g., flight delays or cancellations), immediately generates a response plan.
[0441] Step 10:
[0442] The server sends a list of possible solutions, including alternatives, to the terminal and notifies the user. The user can then review the notification and decide on their next course of action from the presented options.
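Steps 9 and 10 above can be sketched as a function that turns incoming flight updates into notifications carrying a list of options. This is an illustration under assumptions: the status strings, the flight identifiers, the `RESPONSE_OPTIONS` table, and `build_notifications` are hypothetical, and a fixed lookup table stands in for the response plan that the server would generate.

```python
from typing import Dict, List

# Hypothetical lookup of response options per abnormal status (assumption).
RESPONSE_OPTIONS: Dict[str, List[str]] = {
    "cancelled": ["Rebook on the next available flight", "Travel by rail instead"],
    "delayed": ["Wait in a nearby lounge", "Rebook on a later connection"],
}


def build_notifications(updates: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Steps 9-10: turn abnormal flight updates into notifications with options."""
    notifications: List[Dict[str, object]] = []
    for update in updates:
        options = RESPONSE_OPTIONS.get(update["status"])
        if options is None:
            continue  # status is normal, nothing to notify
        notifications.append(
            {"flight": update["flight"], "status": update["status"], "options": options}
        )
    return notifications


updates = [
    {"flight": "XX100", "status": "on_time"},
    {"flight": "XX200", "status": "cancelled"},
]
notifications = build_notifications(updates)
```

Only the cancelled flight produces a notification here. The terminal would then present the listed options so that the user can choose the next course of action.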
[0443] (Example 1)
[0444] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0445] Modern travelers face numerous challenges, including language barriers in foreign lands, the need for appropriate information tailored to their individual travel needs, and the need for quick responses to unexpected problems. These challenges hinder the smooth progress of travel and significantly degrade the user experience. The present invention aims to comprehensively solve these problems and improve the convenience and satisfaction of travelers.
[0446] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0447] In this invention, the server includes an information processing device means for acquiring information from a user, an information processing device means for using a generative artificial intelligence model to convert the information into another language, and a means for detecting abnormal conditions based on external information collected during travel and automatically creating alternative solutions. This enables smooth communication that transcends language barriers, the provision of personalized information, and rapid problem solving.
[0448] "Information processing device means" refers to a device or system that has the function of acquiring and processing input information from a user, and further transmitting and receiving data with an external system.
[0449] A "generative artificial intelligence model" is a model that uses advanced artificial intelligence technology to translate user input data into other languages or to generate recommended plans based on user behavior patterns.
[0450] "Output device means" refers to a processing device that displays or outputs audio to present information transmitted from the server to the user visually or audibly.
[0451] "Personalized suggestions" refer to suggestions that aim to provide optimal information and recommended plans tailored to the individual user's needs, based on the user's past behavior history and preferences.
[0452] An "abnormal condition" refers to a situation that deviates from the planned schedule or normal conditions, and includes, in particular, flight delays or cancellations during travel, as well as other unexpected problems.
[0453] This invention provides a system that enables travelers to overcome language barriers abroad, facilitating smooth communication and rapid situational response. The system consists of a terminal such as a smartphone or tablet for receiving user input and an external server for data processing. The system is realized by combining an information processing device, a generative AI model, and an output device.
[0454] First, the user inputs the information they need during their trip through the device. For example, they might use voice or text input to make a reservation at a restaurant they want to visit. This operation is performed using an application on the device. The device acquires data using voice recognition software or a text input system and sends that information to the server.
[0455] The server uses a generative AI model to translate received information into other languages. Leveraging natural language processing technology, it performs highly accurate, context-aware translations, allowing users to obtain accurate information even if they are unfamiliar with the foreign language. The translated information is then sent back to the terminal and displayed on the terminal's screen. Furthermore, the user can visually confirm the outputted translation results, enabling them to communicate effectively with local people.
[0456] Furthermore, the server suggests personalized itineraries based on the user's past travel history and preferences. The generative AI model creates the optimal travel plan for the user based on the tourist destinations the user has visited and the activities they have enjoyed in the past. For example, for a user who enjoys visiting art museums, it can suggest an itinerary that includes information on special exhibitions at the places they are visiting.
[0457] Furthermore, to address emergencies during travel, the device continuously transmits flight and location information to the server in real time. The server analyzes this information and monitors it for abnormalities. If an abnormality such as a flight delay is detected, the server automatically suggests alternative options. Users can then quickly decide on their next course of action by checking this information on their device.
[0458] As a concrete example, a prompt might read, "I want to make a reservation at a popular restaurant near the Eiffel Tower. Please translate this into the local language." The AI model then provides an appropriate translation. This system allows users to enjoy a high-quality travel experience.
[0459] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0460] Step 1:
[0461] The user inputs information using a device. The user enters necessary tasks at their travel destination (e.g., restaurant reservations) into the device via voice or text. The input data is processed by an application installed on the device and prepared as reservation information or text for translation.
[0462] Step 2:
[0463] The terminal sends the information entered by the user to the server. The transmitted data is converted into a format (e.g., JSON format) that includes the user's instructions and context. This allows the server to accurately interpret the data after receiving it.
[0464] Step 3:
[0465] The server uses a generative artificial intelligence model to translate received input information into another language. The input here is the transmitted text data. The server's process involves contextual analysis using natural language processing techniques to generate an accurate translation output.
[0466] Step 4:
[0467] The server sends the translated data to the terminal. The translated output is then formatted again and sent to the terminal. This allows the terminal to receive the translation accurately.
[0468] Step 5:
[0469] The terminal displays the translation results received from the server to the user. The translated text is displayed on the terminal's screen. The user can visually confirm this and communicate as needed in the local language.
[0470] Step 6:
[0471] The device sends data about the user's past travel history and preferences to the server. The server analyzes this data and creates a travel plan optimized for the user. The plan includes information on preferred tourist destinations and events.
[0472] Step 7:
[0473] Even while traveling, the device continues to transmit flight and location information to the server in real time. The server uses this information to monitor for any abnormal situations and generates alternative plans if necessary. For example, if a flight is canceled, the server will suggest alternative flights or modes of transportation.
[0474] Step 8:
[0475] Users can review suggested alternatives through their devices and choose the appropriate course of action. This process minimizes travel interruptions and provides a smoother experience.
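As a non-limiting illustration, Steps 2 and 7 above may be sketched as follows. The class name `TravelRequest`, the field names, the 30-minute delay threshold, and the flight and train identifiers are assumptions introduced only for this sketch and do not appear in the disclosure.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List, Optional


@dataclass
class TravelRequest:
    """Hypothetical payload carrying the user's instruction and context (Step 2)."""
    instruction: str
    text: str
    target_language: str
    context: dict = field(default_factory=dict)


def to_json(request: TravelRequest) -> str:
    # ensure_ascii=False keeps non-Latin characters readable in the payload.
    return json.dumps(asdict(request), ensure_ascii=False)


def detect_anomaly(flight: dict, delay_threshold_min: int = 30) -> Optional[str]:
    """Return "cancelled", "delayed", or None (Step 7). The threshold is an assumption."""
    if flight.get("cancelled", False):
        return "cancelled"
    if flight.get("delay_min", 0) >= delay_threshold_min:
        return "delayed"
    return None


def suggest_alternatives(anomaly: Optional[str], options: List[dict]) -> List[str]:
    """List alternative transport options, earliest departure first."""
    if anomaly is None:
        return []
    ranked = sorted(options, key=lambda o: o["departs"])  # "HH:MM" strings sort correctly
    return [f'{o["mode"]} {o["id"]} at {o["departs"]}' for o in ranked]


payload = to_json(TravelRequest(
    instruction="translate",
    text="I want to make a reservation at a popular restaurant near the Eiffel Tower.",
    target_language="fr",
    context={"location": "Paris"},
))
anomaly = detect_anomaly({"id": "XX100", "cancelled": True})
alternatives = suggest_alternatives(anomaly, [
    {"mode": "train", "id": "T9", "departs": "15:10"},
    {"mode": "flight", "id": "XX200", "departs": "13:40"},
])
print(payload)
print(anomaly, alternatives)
```

In this sketch the anomaly check is a simple rule. The disclosure leaves the detection method open, so a learned model could be substituted behind the same `detect_anomaly` interface.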
[0476] (Application Example 1)
[0477] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0478] A system is needed to enable travelers to maintain smooth communication in multicultural and multilingual environments and to respond quickly to unexpected situations during their trip. Furthermore, support is required to flexibly cope with unforeseen changes in circumstances during their travels. To achieve this, personalized information provision that takes into account the traveler's past history and preferences is essential.
[0479] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0480] In this invention, the server includes an information processing device means for receiving input information from a user, a server device means that uses a generative model to translate the input information into another language and enable context-based notifications, a visualization means for presenting and visually displaying the translated information to the user, and a robot means for responding to user requests through interactive communication. This enables the user to travel with peace of mind in a foreign culture without experiencing language barriers and while dynamically optimizing their travel plan.
[0481] An "information processing device" is a device that receives input information from a user and has the function of processing voice and text as data and transmitting it to a server.
[0482] A "generative model" is a technology that analyzes input information and translates it into a target language; it is an algorithm that generates appropriate translation results based on context.
[0483] A "server device means" is a central network device designed to translate and appropriately process information received from users, utilizing a generative model.
[0484] A "visualization means" is a device or interface that displays translated information to the user, allowing them to visually confirm the information.
[0485] A "robot means" is an autonomous machine that enables interactive communication with users, responds to user requests, and provides information.
[0486] To realize this invention, a system is constructed by coordinating an information processing device, a server device, a visualization device, and a robot.
[0487] The server implements a generative model. The user's voice input is first converted into text using speech recognition technology (such as the Google Speech-to-Text API). The text is then translated into the appropriate language using a generative AI model (such as the DeepL API). The translated information is displayed to the user in real time via a visualization device, such as a display or smart device that shows the translation results.
[0488] The robot acts as an interface for interacting with users, collecting information through user conversations and providing responses while communicating with a server. This allows travelers to communicate smoothly in unfamiliar cultures.
[0489] As an example of this system, if a user traveling in London asks the robot, "Can you tell me about some popular tourist attractions nearby?", the robot could respond, "There is a history museum nearby, and it is very popular." A specific example of a prompt to achieve this would be: "Translate user voice data and process it to respond in the user's native language. Example: 'What are some recommended tourist attractions nearby?'"
[0490] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0491] Step 1:
[0492] The user inputs their voice into the information processing device. The received voice data is converted into text data using speech recognition technology.
[0493] Step 2:
[0494] The text data is sent to the server. The server uses a generative model to translate the text data into the appropriate language. Contextual analysis is also performed to ensure context-aware translation. The output is the translated text.
[0495] Step 3:
[0496] The translated text is sent to a visualization device and displayed to the user. This allows the user to check the translation results on their own device.
[0497] Step 4:
[0498] If the user has additional questions or responses to the robot, they input voice commands as in the previous step and communicate with the server again. The latest information is input via the information processing device and processed by the server using a generative model. As a result, translated responses to the user's questions are returned in real time.
[0499] This process enables users to communicate smoothly even in multilingual environments while traveling.
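As a non-limiting illustration, one turn of the flow in Steps 1 to 3 may be sketched as follows. The speech recognizer and the translator are replaced by hypothetical stand-ins (`fake_recognize`, `fake_translate`, and the `PHRASEBOOK` table) so that the sketch runs without any external service. An actual system would call a speech recognition engine and a generative model at these points.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in data: a real system would query a generative model.
PHRASEBOOK: Dict[str, Dict[str, str]] = {
    "de": {"Where is the station?": "Wo ist der Bahnhof?"},
}


def fake_recognize(audio: bytes) -> str:
    """Stand-in for speech recognition: treats the bytes as UTF-8 text."""
    return audio.decode("utf-8")


def fake_translate(text: str, target_lang: str) -> str:
    """Stand-in for the generative model: looks the phrase up, else echoes it."""
    return PHRASEBOOK.get(target_lang, {}).get(text, text)


def run_turn(
    audio: bytes,
    recognize: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    display: Callable[[str], None],
    target_lang: str,
) -> str:
    text = recognize(audio)                    # Step 1: voice to text
    translated = translate(text, target_lang)  # Step 2: translation on the server
    display(translated)                        # Step 3: show result to the user
    return translated


screen: List[str] = []  # stands in for the visualization device
result = run_turn(b"Where is the station?", fake_recognize, fake_translate,
                  screen.append, "de")
print(result)
```

Step 4 corresponds to calling `run_turn` again for each follow-up utterance. Passing the recognizer, translator, and display as parameters keeps the sketch independent of any particular service.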
[0500] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform specific processing that uses the estimated emotions.
[0501] This invention is an advanced system designed to address various situations users face during their travels and enhance their travel experience. This system not only utilizes user input data to provide translations and travel plans, but also recognizes the user's emotions and adapts its service delivery accordingly.
[0502] First, the user inputs their intentions and necessary information via voice or text into the device they will use for their trip. This input data is then transmitted to the server in real time by the device.
[0503] The server uses a generative model for translation and simultaneously analyzes the emotions contained in the user's input using an emotion engine. Based on this analysis, the translation and the information presented are adjusted to match the user's emotions. For example, if it is determined that the user is stressed, softer language can be used in the translation.
[0504] Furthermore, the server combines the user's past travel history with emotional data to generate a travel plan that further personalizes the user's travel experience. This travel plan is tailored not only to the user's preferences but also to their expected emotional state. For example, if the user is seeking relaxation, a plan that includes visits to quiet places such as museums and parks will be suggested.
[0505] During travel, the device continuously transmits flight information and environmental data to the server. This allows the server to detect abnormal conditions in real time and generate alternatives as needed. When suggesting alternatives, the emotion engine takes the user's emotional state into consideration, providing more appropriate options.
[0506] For example, if a user is experiencing discomfort due to a long wait at a foreign airport, the server can sense this emotion and provide information such as lounge access to help them spend their waiting time more comfortably. Thus, the present invention is a complex system that not only handles language and abnormal conditions but also provides adaptive support tailored to the user's emotions.
[0507] The following describes the processing flow.
[0508] Step 1:
[0509] Users enter necessary travel information into their device via voice or text. This data is collected based on the user's current requests and circumstances.
[0510] Step 2:
[0511] The device transmits user input data to the server in real time. This data is used for translation and sentiment analysis.
[0512] Step 3:
[0513] The server uses a generative model based on the received data to translate user input into the target language. This process takes context into account to ensure high translation accuracy.
[0514] Step 4:
[0515] The server simultaneously uses an emotion engine to recognize the emotions contained in the user's input data. Emotion analysis allows the user's emotional state to be understood.
[0516] Step 5:
[0517] The server adjusts the translation based on the emotion analysis results so that the user receives the most appropriate translation and information. For example, for a user experiencing stress, words with a calming tone are selected.
[0518] Step 6:
[0519] The device presents the user with the adjusted translation results, supporting them in taking appropriate action in their local area.
[0520] Step 7:
[0521] The server generates personalized travel plans based on the user's emotional state, using their historical and emotional data.
[0522] Step 8:
[0523] The device presents the generated travel plan to the user and provides information to help the user achieve their desired travel experience.
[0524] Step 9:
[0525] During travel, the device periodically sends flight and environmental information to the server, continuously updating its status.
[0526] Step 10:
[0527] The server monitors this data, and if it detects an abnormal condition, it generates and quickly provides alternative solutions that take the user's feelings into consideration.
[0528] Step 11:
[0529] The device notifies the user of status updates, including alternatives, and assists the user in choosing their next course of action.
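As a non-limiting illustration, Steps 4 and 5 above may be sketched as follows. The keyword lexicon `STRESS_WORDS` is a toy stand-in for the emotion engine, and the reassuring prefix is an assumed example of "calming" wording. Neither is taken from the disclosure, which leaves the emotion estimation method to the emotion identification model 59.

```python
import re
from typing import Set

# Toy lexicon standing in for the emotion engine; purely illustrative.
STRESS_WORDS: Set[str] = {"stressed", "worried", "anxious", "exhausted"}


def estimate_emotion(text: str) -> str:
    """Step 4: classify the input as "stressed" or "neutral" by keyword match."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return "stressed" if words & STRESS_WORDS else "neutral"


def adjust_tone(message: str, emotion: str) -> str:
    """Step 5: prepend calming wording when stress was detected."""
    if emotion == "stressed":
        return "Please don't worry. " + message
    return message


user_input = "I'm so worried about my connection."
emotion = estimate_emotion(user_input)
message = adjust_tone("Your next flight departs from gate 12.", emotion)
print(emotion, "->", message)
```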
[0530] (Example 2)
[0531] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0532] There is a need to solve problems such as language barriers and stress caused by unexpected situations during modern travel, and to provide users with personalized travel experiences. However, conventional travel support systems have struggled to provide adaptive services that take into account the user's current emotional state and past preferences.
[0533] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0534] In this invention, the server includes information terminal means, data processing means, display means, plan generation means, and anomaly detection means. This enables real-time analysis of the user's intentions and emotions, provides adapted information and travel plans, and allows for a rich travel experience tailored to individual needs.
[0535] An "information terminal means" is a device that allows users to input information about their intentions and feelings during their travels, either by voice or text.
[0536] "Data processing means" refers to a process that utilizes an artificial intelligence engine to perform translation and sentiment analysis based on input user data.
[0537] A "display means" refers to a device or interface that visually presents the analyzed information in a way that is adapted to the user's emotional state.
[0538] The "plan generation means" is a function that generates a travel plan based on the user's past usage information and current emotional state.
[0539] An "anomaly detection means" is a function that monitors environmental information in real time and generates alternative solutions to respond to unexpected situations.
[0540] This invention is an advanced system for enhancing the user experience during travel in a more personalized way. The system combines information terminal means, data processing means, display means, plan generation means, and anomaly detection means.
[0541] First, the user inputs their travel requirements and feelings into the information terminal via voice or text. A common device such as a smartphone or tablet, equipped with voice recognition and a keyboard input device, is used as the terminal.
[0542] The terminal sends this input data to the server, which processes the data using a generative AI model. Specifically, it uses natural language processing technology to perform translation and support communication between different languages. Furthermore, it uses an emotion analysis engine to detect the user's emotional state. This utilizes AI technology powered by cloud services.
[0543] Based on the analysis results, the server adjusts the information and returns it to the terminal in a way that is appropriate to the user's emotions. For example, if the user is feeling stressed, it will present a message that includes gentle language. LCD displays and voice guidance are used as means of display.
[0544] Furthermore, the plan generation system integrates the user's past travel history and emotional data to create a personalized travel plan. This plan includes quiet places such as museums and parks, designed to promote relaxation for the user.
[0545] The anomaly detection means monitors environmental information in real time and uses machine learning algorithms so that the server can detect changes in flight information and traffic conditions. This enables alternative solutions to be provided quickly even when the user encounters an unexpected situation.
[0546] For example, if a user inputs "I want to go to a shopping mall, but I'm feeling stressed right now," the server will suggest a shopping mall with a relaxation space. An example of a prompt might be, "I'm in Paris. I'm feeling down. Where can I find a place to relax?"
[0547] These features enable the system to respond to users' emotions and needs in various situations during travel, providing a safe and comfortable travel experience.
[0548] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0549] Step 1:
[0550] Users input necessary information and emotions during their trip into the device via voice or text. This input data includes desired destinations, mood, and language translation preferences. The device receives this data, converts it into a structured format, and prepares to send it to the server.
[0551] Step 2:
[0552] The terminal sends user input data to the server. This data is securely transferred using the HTTPS protocol. The input includes user requests and emotions, which form the basis for subsequent data processing steps.
[0553] Step 3:
[0554] The server first processes the received data using a generative AI model. It analyzes the meaning of the data using natural language processing techniques and translates it into other languages. At this time, it uses open-source natural language processing libraries to analyze the input data and generate text translated into other languages.
[0555] Step 4:
[0556] The server simultaneously uses an emotion analysis engine to identify emotions from the user's input data. The analysis employs machine learning models to extract emotional indicators from the text. The output is generated as data indicating the user's emotional state, such as joy, stress, or relaxation.
[0557] Step 5:
[0558] The server generates messages and information tailored to the user's emotional state based on translation results and sentiment analysis. For example, if stress is detected, it creates a translation that incorporates gentler, more reassuring language. This data is then sent to the display device as the final response to the user.
[0559] Step 6:
[0560] The server uses a travel plan generation system to create personalized travel plans based on analysis results and the user's history information. The generated plans are constructed by referencing a database of past visited locations and preferences, and are then provided to the user.
[0561] Step 7:
[0562] The terminal continuously transmits real-time flight information, weather, and traffic conditions to the server. If an anomaly occurs, the server uses the anomaly detection means to create alternative plans. If a flight delay or other issue is detected, the server suggests nearby, comfortable waiting locations to the user.
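As a non-limiting illustration, the plan generation of Step 6 may be sketched as a simple scoring of candidate places. The `Place` record, the scoring weights, and the sample places are assumptions introduced for this sketch. The disclosure does not specify how the plan generation means ranks destinations.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Place:
    name: str
    category: str
    quiet: bool


def generate_plan(candidates: List[Place], preferred: Set[str],
                  emotion: str, limit: int = 3) -> List[str]:
    """Rank places by past preference and by fit with the emotional state."""

    def score(place: Place) -> int:
        points = 0
        if place.category in preferred:        # matches past travel history
            points += 2
        if emotion == "stressed" and place.quiet:  # favor quiet places under stress
            points += 1
        return points

    ranked = sorted(candidates, key=score, reverse=True)  # stable sort keeps ties in order
    return [p.name for p in ranked[:limit] if score(p) > 0]


candidates = [
    Place("Shopping Mall", "shopping", False),
    Place("City Art Museum", "museum", True),
    Place("Riverside Park", "park", True),
    Place("Night Club", "nightlife", False),
]
plan = generate_plan(candidates, preferred={"museum"}, emotion="stressed")
print(plan)
```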
[0563] (Application Example 2)
[0564] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0565] Travelers in a foreign country face stress from language barriers, anxiety, and changes of plan, and effective systems that alleviate this stress and provide a more comfortable, personalized travel experience are lacking.
[0566] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0567] In this invention, the server includes information device means for receiving input data from a user, information processing means using a generative AI model to translate the input data into another natural language, information presentation means for presenting the translated information to the user, and emotion analysis means for analyzing the user's emotions. This makes it possible to provide information and propose plans that are tailored to the emotional state of the traveler.
[0568] "Information device means" refers to a device for receiving input data from a user.
[0569] A "generative AI model" is a system that includes algorithms for translating natural language and performing other information processing.
[0570] "Information processing means" refers to a process for translating input data into another language and generating information that is useful to the user.
[0571] "Information presentation means" refers to a device or method for displaying translated content or other information to a user.
[0572] "Emotion analysis means" refers to technology that analyzes the emotional state from user input data and generates appropriate information based on that analysis.
[0573] The system that implements this application consists primarily of a user, a terminal, and a server. When the user inputs data using voice or text via an information device, the terminal transmits that data to the server in real time. This is done using a communication module.
[0574] The server uses a generative AI model to translate the input data into another natural language, while simultaneously analyzing the user's emotions using the emotion analysis means. Emotion analysis is the process of identifying an emotional state based on the context and wording of the input data. Based on this information, the information processing means generates translated information and additional suggestions. The generated information is provided to the user via the information presentation means in a manner suited to the user's emotional state. For example, if the system determines that the user is feeling anxious, it can suggest a plan to "visit small museums and parks that will allow you to enjoy your trip with peace of mind."
[0575] As a concrete example, consider a situation where a user is feeling anxious about a hotel reservation. When the user speaks to their information device, the device senses their anxiety and sends data to the server. The server generates a translated message and emotionally sensitive information such as, "Don't worry, your reservation has been confirmed. We'll guide you on transportation so you can check in on time."
[0576] Generative AI model prompt example:
[0577] "If there are flight delays or cancellations, please provide information about relaxing spas."
[0578] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0579] Step 1:
[0580] The user inputs voice or text data through an information device. This data often contains information and questions related to the user's travel. The terminal collects this input data and prepares it for the next processing.
[0581] Step 2:
[0582] The terminal uses a communication module to send input data to the server. The input data is converted to text by a speech recognition engine and sent to the server. The output here is the data obtained by converting speech to text.
[0583] Step 3:
[0584] The server uses a generative AI model to translate text data into other natural languages. In this process, the conditions for translating the input data are determined based on prompt statements. The output is the translated text.
[0585] Step 4:
[0586] The server uses the emotion analysis means to analyze the user's emotional state from the translated text data. This analysis extracts specific emotional signals from the user's input. The output is information about the user's emotional state.
[0587] Step 5:
[0588] The server combines the translation results and sentiment analysis results to generate information or suggestions tailored to the user's emotional state. During this process, the information is presented in a way that flexibly responds to the user's emotional state. The output is customized information or suggestions provided to the user.
[0589] Step 6:
[0590] The terminal displays customized information received from the server to the user using an information presentation mechanism. This allows the user to directly receive appropriate and personalized information. The output is the information ultimately presented to the user.
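As a non-limiting illustration of Step 3, in which the translation conditions are determined by prompt statements, a prompt may be composed from the user's text and the current conditions as follows. The function `compose_prompt`, its parameters, and the wording of the first two instructions are assumptions for this sketch. Only the flight-disruption sentence is taken from the prompt example given above.

```python
from typing import Optional


def compose_prompt(user_text: str, target_language: str,
                   emotion: Optional[str] = None,
                   flight_disrupted: bool = False) -> str:
    """Build a prompt whose instructions depend on emotion and travel status."""
    lines = [f"Translate the following text into {target_language}."]
    if emotion == "anxious":
        # Assumed wording for emotion-sensitive output.
        lines.append("Use reassuring wording in any information returned to the user.")
    if flight_disrupted:
        # Sentence taken from the prompt example in the description.
        lines.append("If there are flight delays or cancellations, "
                     "please provide information about relaxing spas.")
    lines.append(f"Text: {user_text}")
    return "\n".join(lines)


prompt = compose_prompt("Is my hotel reservation confirmed?", "French",
                        emotion="anxious", flight_disrupted=True)
print(prompt)
```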
[0591] The specific processing unit 290 transmits the result of the specific processing to the headset-type terminal 314. In the headset-type terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0592] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
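As a non-limiting illustration, the input and output relationship described above (a prompt plus inference data in, an inference result out) may be expressed as a small interface. The names `InferenceRequest`, `DataGenerationModel`, and `FirstSentenceStub` are hypothetical. The stub merely returns the first sentence of a text as a crude stand-in for summarization and is not a generative model.

```python
from dataclasses import dataclass
from typing import Protocol, Union


@dataclass
class InferenceRequest:
    prompt: str              # instruction for the model
    data: Union[str, bytes]  # text, or raw audio/image bytes
    modality: str            # "text", "audio", or "image"


class DataGenerationModel(Protocol):
    """Anything with an infer() method of this shape can be plugged in."""

    def infer(self, request: InferenceRequest) -> Union[str, bytes]: ...


class FirstSentenceStub:
    """Stand-in used only to exercise the interface; handles text input only."""

    def infer(self, request: InferenceRequest) -> Union[str, bytes]:
        if request.modality != "text" or not isinstance(request.data, str):
            raise ValueError("stub handles text only")
        return request.data.split(".")[0].strip() + "."


def run_inference(model: DataGenerationModel,
                  request: InferenceRequest) -> Union[str, bytes]:
    return model.infer(request)


summary = run_inference(
    FirstSentenceStub(),
    InferenceRequest(prompt="Summarize the notice.",
                     data="Flight AB123 is delayed. Gate changed.",
                     modality="text"),
)
print(summary)
```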
[0593] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0594] [Fourth Embodiment]
[0595] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0596] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0597] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0598] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0599] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0600] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0601] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0602] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
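As a non-limiting illustration, the expression of emotions through the eye LEDs and arm motors may be sketched as a lookup table from an emotion label to actuator settings. The emotion labels, colors, and angles below are assumed values chosen for this sketch and are not specified in the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Expression:
    eye_led_rgb: Tuple[int, int, int]  # color of the eye LEDs
    eye_led_on: bool                   # illumination state
    arm_angle_deg: int                 # target angle for the arm motors


# Hypothetical mapping; real values would depend on the robot hardware.
EXPRESSIONS: Dict[str, Expression] = {
    "neutral": Expression((255, 255, 255), True, 0),
    "joy": Expression((255, 200, 0), True, 60),
    "sadness": Expression((0, 0, 255), True, -20),
}


def expression_for(emotion: str) -> Expression:
    """Fall back to the neutral expression for unknown emotion labels."""
    return EXPRESSIONS.get(emotion, EXPRESSIONS["neutral"])


def to_commands(expr: Expression) -> List[tuple]:
    """Translate an expression into (actuator, value) commands."""
    led_value = expr.eye_led_rgb if expr.eye_led_on else (0, 0, 0)
    return [("eye_led", led_value), ("arm_motor", expr.arm_angle_deg)]


commands = to_commands(expression_for("joy"))
print(commands)
```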
[0603] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0604] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0605] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0606] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0607] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0608] This invention is a system designed to help travelers overcome language barriers and enable smooth communication and quick situational responses during their travels. This system operates using the user's smartphone or tablet and an external server.
[0609] First, the user uses the device to input the information they need during their trip. For example, if the user wants to make a reservation at a restaurant in their destination, they input this instruction into the device via voice or text. The device then transmits this input data to the server in real time.
[0610] The server uses a powerful generative model to translate incoming data into other languages. In doing so, the server considers the context to produce accurate translations. The generated translations are sent back to the terminal and presented to the user. This process allows users to communicate seamlessly with locals even in foreign countries.
[0611] The device also records the user's past travel history and preferences and sends them to the server. Based on this data, the server generates personalized travel plans tailored to the user's interests. For example, a user who enjoys visiting art museums will be offered a plan that includes information on special exhibitions being held in the city they are visiting.
[0612] Furthermore, the device continuously transmits flight information, location data, and other information collected during the trip to the server. The server analyzes this data and automatically generates alternatives if an abnormal situation occurs. For example, if a flight is canceled, the server suggests and presents alternative flights or modes of transportation to the user, ensuring that the flow of travel is not interrupted.
[0613] As a practical example, if a user is traveling in Paris, they can book a restaurant near the Eiffel Tower, receive a plan of museums they can visit, and get immediate information on new flights if there are flight delays. In this way, the present invention functions as a system that comprehensively provides support to travelers in various situations.
[0614] The following describes the processing flow.
[0615] Step 1:
[0616] Users input information via voice or text by operating a device. This information is necessary to support the user's communication and procedures on-site.
[0617] Step 2:
[0618] The terminal sends user input data to the server. During this process, the data is converted to an appropriate format before being sent.
[0619] Step 3:
[0620] The server analyzes the received data using a generative model and translates it into the specified language. This model enables highly accurate translation that takes context into account.
[0621] Step 4:
[0622] The server sends the translation results back to the terminal. The translated data is returned immediately, supporting multilingual communication.
[0623] Step 5:
[0624] The device displays the received translation results to the user and plays them back as audio if necessary. This allows the user to communicate smoothly with local people.
[0625] Step 6:
[0626] The server creates personalized travel plans based on the user's travel history and preferences. Machine learning is used in this process to provide the user with the most suitable plan.
[0627] Step 7:
[0628] The terminal displays travel plans received from the server to the user. The user can then review the proposed plans and create a trip that suits their preferences.
[0629] Step 8:
[0630] During the user's trip, the device periodically sends flight information and location data to the server. This ensures that the server always receives the latest travel information.
[0631] Step 9:
[0632] The server monitors this data and, upon detecting an abnormal condition (e.g., flight delays or cancellations), immediately generates a response plan.
[0633] Step 10:
[0634] The server sends a list of possible solutions, including alternatives, to the terminal and notifies the user. The user can then review the notification and decide on their next course of action from the presented options.
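Steps 8 to 10 above can be illustrated with a minimal Python sketch. The names FlightStatus, detect_abnormality, and suggest_alternatives, the 30-minute delay threshold, and the sample alternatives are all assumptions introduced for illustration; the disclosure does not specify how an abnormal condition is detected or how alternatives are selected.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold: the disclosure does not define when a delay counts as abnormal.
DELAY_THRESHOLD_MIN = 30


@dataclass
class FlightStatus:
    flight_no: str
    delay_minutes: int = 0
    cancelled: bool = False


def detect_abnormality(status: FlightStatus) -> Optional[str]:
    """Return a label for the abnormal condition, or None if the flight looks normal (Step 9)."""
    if status.cancelled:
        return "cancelled"
    if status.delay_minutes >= DELAY_THRESHOLD_MIN:
        return "delayed"
    return None


def suggest_alternatives(status: FlightStatus, candidates: List[str]) -> List[str]:
    """Build the list of options that the server would send to the terminal (Step 10)."""
    condition = detect_abnormality(status)
    if condition is None:
        return []
    if condition == "cancelled":
        # A cancelled flight cannot be waited for, so only replacements are offered.
        return list(candidates)
    # For a delay, waiting for the original flight remains an option.
    return [f"Wait for {status.flight_no} (+{status.delay_minutes} min)"] + list(candidates)


if __name__ == "__main__":
    status = FlightStatus("XX123", cancelled=True)
    print(suggest_alternatives(status, ["Later flight XX125", "Train"]))
```

In this sketch the candidate alternatives are passed in by the caller; in the described system they would presumably be produced by the server, for example with the generative model.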
[0635] (Example 1)
[0636] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0637] Modern travelers face numerous challenges, including language barriers in foreign lands, the need for appropriate information tailored to their individual travel needs, and the need for quick responses to unexpected problems. These challenges hinder the smooth progress of travel and significantly degrade the user experience. The present invention aims to comprehensively solve these problems and improve the convenience and satisfaction of travelers.
[0638] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0639] In this invention, the server includes an information processing device means for acquiring information from a user, an information processing device means for using a generative artificial intelligence model to convert the information into another language, and a means for detecting abnormal conditions based on external information collected during travel and automatically creating alternative solutions. This enables smooth communication that transcends language barriers, the provision of personalized information, and rapid problem solving.
[0640] "Information processing device means" refers to a device or system that has the function of acquiring and processing input information from a user, and further transmitting and receiving data with an external system.
[0641] A "generative artificial intelligence model" is a model that uses advanced artificial intelligence technology to translate user input data into other languages or to generate recommended plans based on user behavior patterns.
[0642] "Output device means" refers to a device that presents information transmitted from the server to the user, either visually on a display or audibly as audio output.
[0643] "Personalized suggestions" refer to suggestions that aim to provide optimal information and recommended plans tailored to the individual user's needs, based on the user's past behavior history and preferences.
[0644] An "abnormal condition" refers to a situation that deviates from the planned schedule or normal conditions, and includes, in particular, flight delays or cancellations during travel, as well as other unexpected problems.
[0645] This invention provides a system that enables travelers to overcome language barriers abroad, facilitating smooth communication and rapid situational response. The system consists of a terminal such as a smartphone or tablet for receiving user input and an external server for data processing. The system is realized by combining an information processing device, a generative AI model, and an output device.
[0646] First, the user inputs the information they need during their trip through the device. For example, they might use voice or text input to make a reservation at a restaurant they want to visit. This operation is performed using an application on the device. The device acquires data using voice recognition software or a text input system and sends that information to the server.
[0647] The server uses a generative AI model to translate received information into other languages. Leveraging natural language processing technology, it performs highly accurate, context-aware translations, allowing users to obtain accurate information even if they are unfamiliar with the foreign language. The translated information is then sent back to the terminal and displayed on the terminal's screen. Furthermore, the user can visually confirm the outputted translation results, enabling them to communicate effectively with local people.
[0648] Furthermore, the server suggests personalized itineraries based on the user's past travel history and preferences. The generative AI model creates the optimal travel plan for the user based on the tourist destinations the user has visited and the activities they have enjoyed in the past. For example, for a user who enjoys visiting art museums, it can suggest an itinerary that includes information on special exhibitions at the places they are visiting.
[0649] Furthermore, to address emergencies during travel, the device continuously transmits flight and location information to the server in real time, regardless of the situation. The server analyzes this information and monitors for any abnormalities. If an anomaly is detected, for example, if a flight delay occurs, the server automatically suggests alternative options. Users can then quickly decide on their next course of action by checking this information on their device.
[0650] As a concrete example, a prompt might read, "I want to make a reservation at a popular restaurant near the Eiffel Tower. Please translate this into the local language." The AI model then provides an appropriate translation. This system allows users to enjoy a high-quality travel experience.
[0651] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0652] Step 1:
[0653] The user inputs information using a device. The user enters necessary tasks at their travel destination (e.g., restaurant reservations) into the device via voice or text. The input data is processed by an application installed on the device and prepared as reservation information or text for translation.
[0654] Step 2:
[0655] The terminal sends the information entered by the user to the server. The transmitted data is converted into a format (e.g., JSON format) that includes the user's instructions and context. This allows the server to accurately interpret the data after receiving it.
[0656] Step 3:
[0657] The server uses a generative artificial intelligence model to translate received input information into another language. The input here is the transmitted text data. The server's process involves contextual analysis using natural language processing techniques to generate an accurate translation output.
[0658] Step 4:
[0659] The server sends the translated data to the terminal. The translated output is then formatted again and sent to the terminal. This allows the terminal to receive the translation accurately.
[0660] Step 5:
[0661] The terminal displays the translation results received from the server to the user. The translated text is displayed on the terminal's screen. The user can visually confirm this and communicate as needed in the local language.
[0662] Step 6:
[0663] The device sends data about the user's past travel history and preferences to the server. The server analyzes this data and creates a travel plan optimized for the user. The plan includes information on preferred tourist destinations and events.
[0664] Step 7:
[0665] Even while traveling, the device continues to transmit flight and location information to the server in real time. The server uses this information to monitor for any abnormal situations and generates alternative plans if necessary. For example, if a flight is canceled, the server will suggest alternative flights or modes of transportation.
[0666] Step 8:
[0667] Users can review suggested alternatives through their devices and choose the appropriate course of action. This process minimizes travel interruptions and provides a smoother experience.
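Steps 2 and 3 of this flow (conversion to a JSON format that includes the user's instruction and context, followed by translation) can be sketched as follows. The field names instruction, target_language, and context, and the prompt wording, are assumptions made for illustration; the disclosure states only that a format such as JSON is used.

```python
import json


def build_request(user_text: str, target_language: str, context: dict) -> str:
    """Serialize the user's instruction and context as JSON (Step 2)."""
    payload = {
        "instruction": user_text,
        "target_language": target_language,
        "context": context,
    }
    # ensure_ascii=False keeps non-English input readable in the payload.
    return json.dumps(payload, ensure_ascii=False)


def build_translation_prompt(request_json: str) -> str:
    """Turn the received JSON into a prompt for the generative model (Step 3)."""
    req = json.loads(request_json)
    context_lines = "\n".join(f"- {k}: {v}" for k, v in sorted(req["context"].items()))
    return (
        f"Translate the following text into {req['target_language']}, "
        "taking the context into account.\n"
        f"Context:\n{context_lines}\n"
        f"Text: {req['instruction']}"
    )


if __name__ == "__main__":
    request = build_request(
        "I want to make a reservation at a popular restaurant near the Eiffel Tower.",
        "French",
        {"city": "Paris", "task": "restaurant reservation"},
    )
    print(build_translation_prompt(request))
```

The resulting prompt string would then be passed to whichever generative model the server uses; that call is deliberately left out because the disclosure does not fix a particular model interface.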
[0668] (Application Example 1)
[0669] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0670] A system is needed to enable travelers to maintain smooth communication in multicultural and multilingual environments and to respond quickly to unexpected situations during their trip. Furthermore, support is required to flexibly cope with unforeseen changes in circumstances during their travels. To achieve this, personalized information provision that takes into account the traveler's past history and preferences is essential.
[0671] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0672] In this invention, the server includes an information processing device means for receiving input information from a user, a server device means that uses a generative model to translate the input information into another language and enable context-based notifications, a visualization means for presenting and visually displaying the translated information to the user, and a robot means for responding to user requests through interactive communication. This enables the user to travel with peace of mind in a foreign culture without experiencing language barriers and while dynamically optimizing their travel plan.
[0673] An "information processing device" is a device that receives input information from a user and has the function of processing voice and text as data and transmitting it to a server.
[0674] A "generative model" is a technology that analyzes input information and translates it into a target language; it is an algorithm that generates appropriate translation results based on context.
[0675] A "server device means" is a central network device designed to translate and appropriately process information received from users, utilizing a generative model.
[0676] A "visualization means" is a device or interface that displays translated information to the user, allowing them to visually confirm the information.
[0677] A "robot means" is an autonomous machine that enables interactive communication with users, responds to user requests, and provides information.
[0678] To realize this invention, a system is constructed by coordinating an information processing device, a server device, a visualization device, and a robot.
[0679] The server implements a generative model, converting user input into text using speech recognition technology (such as the Google Speech-to-Text API). After the input is converted to text, it is translated into the appropriate language using a generative AI model (such as the DeepL API). The translated information is displayed to the user in real time via a visualization device, such as a display or smart device that shows the translation results.
[0680] The robot acts as an interface for interacting with users, collecting information through user conversations and providing responses while communicating with a server. This allows travelers to communicate smoothly in unfamiliar cultures.
[0681] As an example of this system, if a user traveling in London asks the robot, "Can you tell me about some popular tourist attractions nearby?", the robot could respond, "There is a history museum nearby, and it is very popular." A specific example of a prompt to achieve this would be: "Translate user voice data and process it to respond in the user's native language. Example: 'What are some recommended tourist attractions nearby?'"
[0682] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0683] Step 1:
[0684] The user inputs their voice into the information processing device. The received voice data is converted into text data using speech recognition technology.
[0685] Step 2:
[0686] The text data is sent to the server. The server uses a generative model to translate the text data into the appropriate language. Contextual analysis is also performed to ensure context-aware translation. The output is the translated text.
[0687] Step 3:
[0688] The translated text is sent to a visualization device and displayed to the user. This allows the user to check the translation results on their own device.
[0689] Step 4:
[0690] If the user has additional questions or responses to the robot, they input voice commands as in the previous step and communicate with the server again. The latest information is input via the information processing device and processed by the server using a generative model. As a result, translated responses to the user's questions are returned in real time.
[0691] This process enables users to communicate smoothly even in multilingual environments while traveling.
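The loop in Steps 1 to 4 (voice input, conversion to text, translation, display, and repetition for follow-up questions) can be sketched as below. ConversationSession, fake_recognize, and fake_translate are hypothetical names; the two stand-in functions replace the speech recognition and translation services so that the sketch runs without any external API.

```python
from typing import Callable, List

# Signatures for the pluggable stages. A real speech recognition or translation
# service would be wrapped to match these signatures.
Recognizer = Callable[[bytes], str]
Translator = Callable[[str, str], str]


class ConversationSession:
    """Runs the voice -> text -> translation loop for one user."""

    def __init__(self, recognize: Recognizer, translate: Translator, target_language: str):
        self.recognize = recognize
        self.translate = translate
        self.target_language = target_language
        # Earlier utterances; a real translator could use these as context.
        self.history: List[str] = []

    def handle_utterance(self, audio: bytes) -> str:
        text = self.recognize(audio)  # Step 1: speech to text
        translated = self.translate(text, self.target_language)  # Step 2: translation
        self.history.append(text)
        return translated  # Step 3: the caller displays this to the user


# Stand-in stages (assumptions, not real services).
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, target_language: str) -> str:
    return f"[{target_language}] {text}"


if __name__ == "__main__":
    session = ConversationSession(fake_recognize, fake_translate, "fr")
    print(session.handle_utterance(b"What are some recommended tourist attractions nearby?"))
```

Passing the stages in as callables keeps the loop independent of any particular recognition or translation provider, which matches the way the description names such services only as examples.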
[0692] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0693] This invention is an advanced system designed to address various situations users face during their travels and enhance their travel experience. This system not only utilizes user input data to provide translations and travel plans, but also recognizes the user's emotions and adapts its service delivery accordingly.
[0694] First, the user inputs their intentions and necessary information via voice or text into the device they will use for their trip. This input data is then transmitted to the server in real time by the device.
[0695] The server uses a generative model for translation and simultaneously analyzes the emotions contained in the user's input using an emotion engine. Based on this analysis, the translation and the information presented are adjusted to match the user's emotions. For example, if it is determined that the user is stressed, softer language can be used in the translation.
[0696] Furthermore, the server combines the user's past travel history with emotional data to generate a more personalized travel plan. This travel plan is tailored not only to the user's preferences but also to their expected emotional state. For example, if the user is seeking relaxation, a plan that includes visits to quiet places such as museums and parks will be suggested.
[0697] During travel, the device continuously transmits flight information and environmental data to the server. This allows the server to detect abnormal conditions in real time and generate alternatives as needed. When suggesting alternatives, the emotion engine takes the user's emotional state into consideration, providing more appropriate options.
[0698] For example, if a user is experiencing discomfort due to a long wait at a foreign airport, the server can sense this emotion and provide information such as lounge access to help them spend their waiting time more comfortably. Thus, the present invention is a complex system that not only handles language and abnormal conditions but also provides adaptive support tailored to the user's emotions.
[0699] The following describes the processing flow.
[0700] Step 1:
[0701] Users enter necessary travel information into their device via voice or text. This data is collected based on the user's current requests and circumstances.
[0702] Step 2:
[0703] The device transmits user input data to the server in real time. This data is used for translation and sentiment analysis.
[0704] Step 3:
[0705] The server uses a generative model based on the received data to translate user input into the target language. This process takes context into account to ensure high translation accuracy.
[0706] Step 4:
[0707] The server simultaneously uses an emotion engine to recognize the emotions contained in the user's input data. Emotion analysis allows the user's emotional state to be understood.
[0708] Step 5:
[0709] The server uses the sentiment analysis results to adjust the translation and provide the user with the most appropriate translation and information. For example, for a user experiencing stress, words with a calming tone will be selected.
[0710] Step 6:
[0711] The device presents the user with the adjusted translation results, supporting them in taking appropriate action in their local area.
[0712] Step 7:
[0713] The server generates personalized travel plans based on the user's emotional state, using their historical and emotional data.
[0714] Step 8:
[0715] The device presents the generated travel plan to the user and provides information to help the user achieve their desired travel experience.
[0716] Step 9:
[0717] During travel, the device periodically sends flight and environmental information to the server, continuously updating its status.
[0718] Step 10:
[0719] The server monitors this data, and if it detects an abnormal condition, it generates and quickly provides alternative solutions that take the user's feelings into consideration.
[0720] Step 11:
[0721] The device notifies the user of status updates, including alternatives, and assists the user in choosing their next course of action.
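Steps 4 and 5 of this flow (recognizing the user's emotion and adjusting the translation accordingly) can be sketched as follows. The keyword lists, the emotion labels, and the tone instructions are all hypothetical; a simple keyword match stands in for the emotion engine, whose internals the disclosure does not specify.

```python
from typing import Dict, Tuple

# Hypothetical keyword lists standing in for the emotion engine.
EMOTION_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "stress": ("stressed", "worried", "anxious", "tired"),
    "joy": ("happy", "excited", "glad"),
}

# Hypothetical tone instructions added to the translation prompt.
TONE_BY_EMOTION: Dict[str, str] = {
    "stress": "Use soft, calming wording.",
    "joy": "Use warm, upbeat wording.",
    "neutral": "Use plain, polite wording.",
}


def estimate_emotion(text: str) -> str:
    """Very rough substring-based estimate; returns 'neutral' when nothing matches."""
    lowered = text.lower()
    for emotion, keywords in EMOTION_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return emotion
    return "neutral"


def build_adjusted_prompt(text: str, target_language: str) -> str:
    """Combine the translation request with a tone instruction chosen from the emotion."""
    emotion = estimate_emotion(text)
    return (
        f"Translate into {target_language}. {TONE_BY_EMOTION[emotion]}\n"
        f"Text: {text}"
    )


if __name__ == "__main__":
    print(build_adjusted_prompt("I am worried about my flight", "French"))
```

Here the emotion influences the translation only through the instruction given to the generative model, which is one possible reading of "softer language can be used in the translation".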
[0722] (Example 2)
[0723] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0724] There is a need to solve problems such as language barriers and stress caused by unexpected situations during modern travel, and to provide users with personalized travel experiences. However, conventional travel support systems have struggled to provide adaptive services that take into account the user's current emotional state and past preferences.
[0725] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0726] In this invention, the server includes information terminal means, data processing means, display means, plan generation means, and anomaly detection means. This enables real-time analysis of the user's intentions and emotions, provides adapted information and travel plans, and allows for a rich travel experience tailored to individual needs.
[0727] An "information terminal means" is a device that allows users to input information about their intentions and feelings during their travels, either by voice or text.
[0728] "Data processing means" refers to a process that utilizes an artificial intelligence engine to perform translation and sentiment analysis based on input user data.
[0729] A "display means" refers to a device or interface that visually presents the analyzed information in a way that is adapted to the user's emotional state.
[0730] The "plan generation means" is a function that generates a travel plan based on the user's past usage information and current emotional state.
[0731] An "anomaly detection means" is a function that monitors environmental information in real time and generates alternative solutions to respond to unexpected situations.
[0732] This invention is an advanced system for enhancing the user experience during travel in a more personalized way. This system is comprised of a combination of information terminal means, data processing means, display means, plan generation means, and anomaly detection means.
[0733] First, the user inputs their travel requirements and feelings into an information terminal via voice or text. The terminal includes voice recognition and a keyboard input device, and common devices such as smartphones and tablets are used.
[0734] The terminal sends this input data to the server, which processes the data using a generative AI model. Specifically, it uses natural language processing technology to perform translation and support communication between different languages. Furthermore, it uses an emotion analysis engine to detect the user's emotional state. This utilizes AI technology powered by cloud services.
[0735] Based on the analysis results, the server adjusts the information and returns it to the terminal in a way that is appropriate to the user's emotions. For example, if the user is feeling stressed, it will present a message that includes gentle language. LCD displays and voice guidance are used as means of display.
[0736] Furthermore, the plan generation system integrates the user's past travel history and emotional data to create a personalized travel plan. This plan includes quiet places such as museums and parks, designed to promote relaxation for the user.
[0737] Environmental information is monitored in real time, and the anomaly detection means uses machine learning algorithms so that the server can detect changes in flight information and traffic conditions. This enables the rapid provision of alternative solutions even when users encounter unexpected situations.
[0738] For example, if a user inputs "I want to go to a shopping mall, but I'm feeling stressed right now," the server will suggest a shopping mall with a relaxation space. An example of a prompt might be, "I'm in Paris. I'm feeling down. Where can I find a place to relax?"
[0739] These features enable the system to respond to users' emotions and needs in various situations during travel, providing a safe and comfortable travel experience.
[0740] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0741] Step 1:
[0742] Users input necessary information and emotions during their trip into the device via voice or text. This input data includes desired destinations, mood, and language translation preferences. The device receives this data, converts it into a structured format, and prepares to send it to the server.
[0743] Step 2:
[0744] The terminal sends user input data to the server. This data is securely transferred using the HTTPS protocol. The input includes user requests and emotions, which form the basis for subsequent data processing steps.
[0745] Step 3:
[0746] The server first processes the received data using a generative AI model. It analyzes the meaning of the data using natural language processing techniques and translates it into other languages. At this time, it uses open-source natural language processing libraries to analyze the input data and generate text translated into other languages.
[0747] Step 4:
[0748] The server simultaneously uses an emotion analysis engine to identify emotions from the user's input data. The analysis employs machine learning models to extract emotional indicators from the text. The output is generated as data indicating the user's emotional state, such as joy, stress, or relaxation.
[0749] Step 5:
[0750] The server generates messages and information tailored to the user's emotional state based on translation results and sentiment analysis. For example, if stress is detected, it creates a translation that incorporates gentler, more reassuring language. This data is then sent to the display device as the final response to the user.
[0751] Step 6:
[0752] The server uses a travel plan generation system to create personalized travel plans based on analysis results and the user's history information. The generated plans are constructed by referencing a database of past visited locations and preferences, and are then provided to the user.
[0753] Step 7:
[0754] The terminal continuously transmits real-time flight information, weather, and traffic conditions to the server. If an anomaly occurs, the server uses the anomaly detection means to create alternative plans. If flight delays or other issues are detected, the server suggests nearby, comfortable waiting locations to the user.
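Step 6 of this flow (generating a plan from the user's history and emotional state) can be sketched as a simple scoring function. The Place type, the tag vocabulary, and the bonus of two points for quiet places when stress is detected are assumptions for illustration; the disclosure does not describe how the plan generation means weighs preferences against emotion.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Place:
    name: str
    tags: FrozenSet[str]


def score(place: Place, preferences: FrozenSet[str], emotion: str) -> int:
    """One point per tag matching past preferences, plus a bonus for quiet places under stress."""
    points = len(place.tags & preferences)
    if emotion == "stress" and "quiet" in place.tags:
        points += 2  # hypothetical weight
    return points


def generate_plan(
    places: List[Place], preferences: FrozenSet[str], emotion: str, size: int = 2
) -> List[str]:
    """Return the names of the highest-scoring places; ties keep their input order."""
    ranked = sorted(places, key=lambda p: score(p, preferences, emotion), reverse=True)
    return [p.name for p in ranked[:size]]


if __name__ == "__main__":
    candidates = [
        Place("Museum", frozenset({"art", "quiet"})),
        Place("Mall", frozenset({"shopping"})),
        Place("Park", frozenset({"quiet", "nature"})),
    ]
    print(generate_plan(candidates, frozenset({"shopping"}), "stress"))
```

With these hypothetical weights, a stressed user who normally prefers shopping is offered the museum and the park first, in line with the example of quiet places given in the description.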
[0755] (Application Example 2)
[0756] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0757] The problem is the lack of an effective system that alleviates the stress travelers face in a foreign country from language barriers, anxiety, and changes in plans, and that thereby provides a more comfortable and personalized travel experience.
[0758] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0759] In this invention, the server includes information device means for receiving input data from a user, information processing means using a generative AI model to translate the input data into another natural language, information presentation means for presenting the translated information to the user, and emotion analysis means for analyzing the user's emotions. This makes it possible to provide information and propose plans that are tailored to the emotional state of the traveler.
[0760] "Information device means" refers to a device for receiving input data from a user.
[0761] A "generative AI model" is a system that includes algorithms for translating natural language and performing other information processing.
[0762] "Information processing means" refers to a process for translating input data into another language and generating information that is useful to the user.
[0763] "Information presentation means" refers to a device or method for displaying translated content or other information to a user.
[0764] "Emotion analysis means" refers to technology that analyzes the emotional state from user input data and generates appropriate information based on that analysis.
[0765] The system that implements this application consists primarily of a user, a terminal, and a server. When the user inputs data using voice or text via an information device, the terminal transmits that data to the server in real time. This is done using a communication module.
[0766] The server uses a generative AI model to translate input data into other natural languages, while simultaneously analyzing the user's emotions using the emotion analysis means. Emotion analysis is the process of identifying an emotional state based on the context and wording of the input data. Based on this information, the information processing means generates translated information and additional suggestions. The generated information is provided to the user via the information presentation means according to the user's emotional state. For example, if the system determines that the user is feeling anxious, it can suggest a plan to "visit small museums and parks that will allow you to enjoy your trip with peace of mind."
[0767] As a concrete example, consider a situation where a user is feeling anxious about a hotel reservation. When the user speaks to their information device, the device senses their anxiety and sends data to the server. The server generates a translated message and emotionally sensitive information such as, "Don't worry, your reservation has been confirmed. We'll guide you on transportation so you can check in on time."
[0768] Generative AI model prompt example:
[0769] "If there are flight delays or cancellations, please provide information about relaxing spas."
[0770] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0771] Step 1:
[0772] The user inputs voice or text data through an information device. This data often contains information and questions related to the user's travel. The terminal collects this input data and prepares it for the next processing.
[0773] Step 2:
[0774] The terminal uses a communication module to send input data to the server. The input data is converted to text by a speech recognition engine and sent to the server. The output here is the data obtained by converting speech to text.
[0775] Step 3:
[0776] The server uses a generative AI model to translate text data into other natural languages. In this process, the conditions for translating the input data are determined based on prompt statements. The output is the translated text.
[0777] Step 4:
[0778] The server uses sentiment analysis tools to analyze the user's emotional state based on the translated data. This analysis extracts specific emotional signals from the input data. The output is information about the user's emotional state.
[0779] Step 5:
[0780] The server combines the translation results and sentiment analysis results to generate information or suggestions tailored to the user's emotional state. During this process, the information is presented in a way that flexibly responds to the user's emotional state. The output is customized information or suggestions provided to the user.
[0781] Step 6:
[0782] The terminal displays customized information received from the server to the user using an information presentation mechanism. This allows the user to directly receive appropriate and personalized information. The output is the information ultimately presented to the user.
[0783] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0784] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in a data format such as audio data or text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
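The input to the data generation model 58 described above, namely a prompt plus inference data of several possible modalities, could be represented roughly as follows. This is a sketch only: the class names (InferenceData, ModelRequest), the helper build_translation_request, and the prompt wording are assumptions introduced for illustration and are not an interface defined in this disclosure or by any particular generative AI service.

```python
from dataclasses import dataclass, field
from typing import List, Literal

Modality = Literal["audio", "text", "image"]


@dataclass
class InferenceData:
    modality: Modality  # kind of data passed to the model
    payload: bytes      # raw content (encoded text, audio samples, image bytes)


@dataclass
class ModelRequest:
    prompt: str                                    # instructions for the model
    data: List[InferenceData] = field(default_factory=list)
    output_format: Literal["audio", "text"] = "text"


def build_translation_request(text: str, target_language: str) -> ModelRequest:
    # The prompt carries the instruction; the text to translate travels as inference data.
    prompt = (
        f"Translate the input text into {target_language} "
        "and output only the translation."
    )
    return ModelRequest(
        prompt=prompt,
        data=[InferenceData("text", text.encode("utf-8"))],
    )


req = build_translation_request("Where is the station?", "Japanese")
print(req.prompt)
```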
[0785] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0786] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0787] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0788] These emotions are distributed around the 3 o'clock position on the emotion map 400 and usually fluctuate between security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, giving a calm impression.
[0789] The inside of the emotion map 400 represents the state of mind, while the outside represents behavior. Therefore, the further outward an emotion lies on the emotion map 400, the more visible it becomes (the more it manifests in behavior).
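The geometry of the emotion map 400 described above can be illustrated with polar coordinates. The sketch below is an assumption-laden illustration: the normalization of the radius to the range 0 to 1, the angle convention (counterclockwise from the 3 o'clock position), the 0.5 boundary between inner and outer layers, and the labels "both" and "neutral" are all hypothetical choices and are not specified in this disclosure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MapPosition:
    radius: float     # 0.0 = center, 1.0 = outer edge (assumed normalization)
    angle_deg: float  # counterclockwise from the 3 o'clock position (assumed)


def describe(pos: MapPosition) -> dict:
    x = pos.radius * math.cos(math.radians(pos.angle_deg))
    y = pos.radius * math.sin(math.radians(pos.angle_deg))
    eps = 1e-9  # tolerance for positions lying on an axis
    # Right side: induced by situational judgment; left side: reactions in the brain.
    side = "situation" if x > eps else "reaction" if x < -eps else "both"
    # Upper side: pleasure; lower side: displeasure.
    valence = "pleasure" if y > eps else "displeasure" if y < -eps else "neutral"
    # Inner: state of mind; outer: visible in behavior (0.5 threshold is assumed).
    layer = "inner" if pos.radius < 0.5 else "outer"
    return {"side": side, "valence": valence, "layer": layer}


three_oclock = describe(MapPosition(0.8, 0.0))
top_inner = describe(MapPosition(0.3, 90.0))
lower_left = describe(MapPosition(0.9, 225.0))
print(three_oclock, top_inner, lower_left)
```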
[0790] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
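The idea that pleasure and discomfort arise from how far balances such as posture and battery level deviate from their ideal values can be sketched as a simple score. The scoring formula, the function name balance_score, the tolerance values, and the numeric ranges below are assumptions made purely for illustration; this disclosure does not specify how the balances are quantified.

```python
from typing import Dict


def balance_score(
    current: Dict[str, float],
    ideal: Dict[str, float],
    tolerance: Dict[str, float],
) -> float:
    """Return a value in [-1, 1]: positive means pleasure, negative means discomfort."""
    scores = []
    for name, target in ideal.items():
        # Deviation from the ideal, scaled by how much deviation is tolerated.
        deviation = abs(current[name] - target) / tolerance[name]
        # 0 deviation -> +1 (pleasure); deviation >= tolerance -> -1 (discomfort).
        scores.append(1.0 - 2.0 * min(deviation, 1.0))
    return sum(scores) / len(scores)


# Hypothetical robot balances: battery level (0 to 1) and body tilt in degrees.
ideal = {"battery": 1.0, "tilt_deg": 0.0}
tolerance = {"battery": 0.75, "tilt_deg": 32.0}

at_ideal = balance_score({"battery": 1.0, "tilt_deg": 0.0}, ideal, tolerance)
far_off = balance_score({"battery": 0.25, "tilt_deg": 32.0}, ideal, tolerance)
halfway = balance_score({"battery": 0.625, "tilt_deg": 16.0}, ideal, tolerance)
print(at_ideal, far_off, halfway)
```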
[0791] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0792] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
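How the emotion values output by the neural network might be used can be sketched as follows. This is not the emotion identification model 59 itself: the emotion values are hand-written placeholders rather than network outputs, choosing the largest value as the user's emotion is an assumed decision rule, and the squared-difference penalty is only one conceivable way to encourage similar values for emotions located close together, since this disclosure does not state how that property is enforced during training.

```python
from typing import Dict, List, Tuple


def dominant_emotion(values: Dict[str, float]) -> str:
    # Assumed decision rule: the emotion with the largest value is the user's emotion.
    return max(values, key=values.get)


def neighbor_penalty(
    values: Dict[str, float],
    neighbor_pairs: List[Tuple[str, str]],
) -> float:
    # Sum of squared differences between emotions that are adjacent on the map.
    # A small penalty means neighboring emotions have similar values.
    return sum((values[a] - values[b]) ** 2 for a, b in neighbor_pairs)


# Placeholder emotion values (hypothetical, not real model output).
values = {"reassured": 0.75, "calm": 0.5, "confident": 0.5, "anxious": 0.0}
# Hypothetical adjacency between emotions on the map.
pairs = [("reassured", "calm"), ("calm", "confident")]

print(dominant_emotion(values))
print(neighbor_penalty(values, pairs))
```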
[0793] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0794] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.
[0795] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.
[0796] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0797] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0798] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.
[0799] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0800] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0801] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0802] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0803] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0804] The following is further disclosed regarding the embodiments described above.
[0805] (Claim 1)
[0806] A terminal means for receiving input data from the user,
[0807] A server means that uses a generative model for translating the input data into another language,
[0808] A display means for presenting the translated data to the user,
[0809] A system comprising the above means.
[0810] (Claim 2)
[0811] The system according to claim 1, comprising a generative model that generates a travel plan based on the user's history and preferences.
[0812] (Claim 3)
[0813] The system according to claim 1, comprising means for detecting abnormal conditions during travel in real time and generating alternative plans.
[0814] "Example 1"
[0815] (Claim 1)
[0816] Information processing device means for acquiring information from the user,
[0817] Information processing device means that uses a generative artificial intelligence model for converting the information into another language,
[0818] Output device means for presenting the translated information,
[0819] A means for generating personalized suggestions based on the user's past behavior patterns and preference information,
[0820] A means of detecting abnormal conditions based on external information collected during travel and automatically creating alternative plans,
[0821] A system comprising the above means.
[0822] (Claim 2)
[0823] The system according to claim 1, which uses a generative artificial intelligence model that creates travel plans based on the user's behavioral history and preferences.
[0824] (Claim 3)
[0825] The system according to claim 1, comprising means for continuously monitoring abnormalities during travel and proposing alternatives as necessary.
[0826] "Application Example 1"
[0827] (Claim 1)
[0828] Information processing device means for receiving input information from the user,
[0829] A server device means that uses a generative model to translate the input information into another language and enable context-based notifications,
[0830] A visualization means for presenting and visually displaying the translated information to the user,
[0831] A robotic means that responds to user requests through interactive communication,
[0832] A system comprising the above means.
[0833] (Claim 2)
[0834] The system according to claim 1, comprising a generative model that creates a visit plan based on the user's past history and preferences.
[0835] (Claim 3)
[0836] The system according to claim 1, further comprising means for sequentially detecting abnormal conditions during a visit and creating and instructing on alternative measures.
[0837] "Example 2 of combining an emotion engine"
[0838] (Claim 1)
[0839] An information terminal means for receiving input regarding the user's intentions and emotions,
[0840] A data processing means that uses an artificial intelligence engine to translate the input data into another language and simultaneously perform sentiment analysis,
[0841] A display means that generates the optimal information representation based on the analysis results and presents it to the user,
[0842] A travel plan generation means that generates a travel plan based on past usage information and emotional state,
[0843] An anomaly detection means that collects environmental information in real time and provides alternative solutions according to the situation,
[0844] A system comprising the above means.
[0845] (Claim 2)
[0846] The system according to claim 1, comprising a process for adaptively adjusting information according to the user's current emotional state.
[0847] (Claim 3)
[0848] The system according to claim 1, comprising means for generating and presenting alternatives while taking into account the emotional state of the user during travel.
[0849] "Application example 2 when combining with an emotional engine"
[0850] (Claim 1)
[0851] Information equipment means for receiving input data from the user,
[0852] Information processing means that uses a generative AI model to translate the input data into another natural language,
[0853] Information presentation means for presenting the translated information to the user,
[0854] A system that includes emotion analysis tools for analyzing user emotions.
[0855] (Claim 2)
[0856] The system according to claim 1, comprising a generative model that generates a travel plan based on the user's history information, preferences, and emotional information.
[0857] (Claim 3)
[0858] The system according to claim 1, comprising information processing means for detecting abnormal conditions during travel in real time and generating alternative solutions corresponding to the emotional state.
Explanation of Symbols
[0859]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A system comprising: an information processing device means for receiving input information from a user; a server device means that uses a generative model to translate the input information into another language and enable context-based notifications; a visualization means for presenting and visually displaying the translated information to the user; and a robotic means that responds to user requests through interactive communication.
2. The system according to claim 1, comprising a generative model that creates a visit plan based on the user's past history and preferences.
3. The system according to claim 1, further comprising means for sequentially detecting abnormal conditions during a visit and creating and instructing on alternative measures.