system

The system addresses the challenge of standardized guide services by generating customized content based on user location, interests, and emotional state, delivering personalized and engaging information to enhance the tourist experience.

JP2026104549A · Pending · Publication Date: 2026-06-25 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: SOFTBANK GROUP CORP
Filing Date: 2024-12-13
Publication Date: 2026-06-25

AI Technical Summary

Technical Problem

Conventional guide services in museums and historical buildings struggle to provide personalized information based on individual user interests, knowledge levels, and language preferences, often offering standardized content that fails to engage users effectively.

Method used

A system that utilizes user location and interest information to retrieve relevant data from a database, generates customized guide content using a generative model, and delivers it in the user's preferred language, optimizing for their interests and emotional state.

Benefits of technology

Enriches the user experience by providing personalized, emotionally tailored, and linguistically appropriate information, enhancing engagement and knowledge acquisition at tourist destinations.

✦ Generated by Eureka AI based on patent content.


Abstract

A system is provided. [Solution] The system includes: a means for acquiring location information and interest information through a user interface that accepts user input information; a means for obtaining relevant information from a database based on the acquired location information and interest information; a means for generating customized guide content based on the relevant information using a generative model; a means for distributing the generated guide content to the user terminal; a means for collecting interest information from visitors via voice and processing location information in real time; and a means for communicating with a server in the cloud and generating content suitable for visitors using a generative AI model.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] Conventional guide services in museums and historical buildings have been limited to providing general information and have had difficulty providing information suited to each user's interests and knowledge level. Existing guide services also often follow a predetermined tour format and content, so they cannot fully draw out users' interests. Furthermore, problems with language and device compatibility make it difficult to accommodate a diverse user base.

Means for Solving the Problems

[0005] This invention is a system that provides customized guide content based on the user's location and interest information. Specifically, it acquires location and interest information entered from the user's terminal and retrieves relevant information from a database based on this information. Furthermore, a generative model is used to generate user-specific guide content based on the acquired information and deliver it in the optimal format, thereby responding to individual interests and needs. In addition, by optimizing the guide content based on language settings, it can accommodate diverse language environments. As a result, users can have a richer travel experience.

[0006] A "user interface" is an interface that allows a user to interact with a system, and is a means of assisting in the input and output of information.

[0007] "Location information" refers to digital data about a user's current location, including geographical coordinates expressed as latitude and longitude.

[0008] "Interest information" refers to categories and keywords that indicate a user's interests, and is data used to specialize the guide content provided by the system.

[0009] A "database" is a structured digital system for storing and managing information, a mechanism that allows for the rapid retrieval and searching of specific information as needed.

[0010] "Relevant information" refers to content retrieved from a database based on the user's location and interests, and includes details about specific facilities or exhibits.

[0011] A "generative model" is an algorithm or AI model that processes and generates data based on input data, and is a means of generating customized content.

[0012] "Guide content" refers to a collection of information provided to users, including customized content such as explanations and descriptions of tourist facilities and exhibits.

[0013] "Distribution" refers to the process of transferring generated content to a user's device and making it playable as audio or text.

[0014] "Language settings" refer to the device's settings information related to the language the user uses, and affect the selection of display language and sound during interactions with the system.

Brief Description of the Drawings

[0015]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] This is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] This shows an emotion map where multiple emotions are mapped.
[Figure 10] This shows an emotion map where multiple emotions are mapped.
[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Embodiment 1.
[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Embodiment 2 when combined with an emotion engine.
[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when combined with an emotion engine.

Mode for Carrying Out the Invention

[0016] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0017] First, the terms used in the following description will be explained.

[0018] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.

[0019] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.

[0020] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs and various parameters. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0021] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0022] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."

[0023] [First Embodiment]

[0024] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0025] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0026] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0027] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0028] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0029] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0030] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0031] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0032] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0033] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0034] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0035] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0036] This invention relates to an embodiment of a system that personalizes a user's travel experience and efficiently provides information. This system consists of a user terminal, a server, and a database.

[0037] First, the user's device displays an interface for collecting location and interest information from the user. The user can select their current GPS location and categories of interest, such as "history" or "art." Once the user has made their selections, the device sends this information to the server.

[0038] The server queries an internal database based on location and interest information received from the user. This database stores detailed information about each tourist destination and exhibit, and the server quickly searches and retrieves relevant information. The retrieved information is categorized according to what the user is likely to be interested in.
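The lookup described in [0038] could be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Exhibit` record, the 500 m search radius, the sample coordinates, and the in-memory list standing in for the database 24 are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Exhibit:
    """Hypothetical record shape; the source does not define a schema."""
    name: str
    category: str
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_relevant(exhibits, lat, lon, interests, radius_m=500.0):
    """Return exhibits within radius_m, interest matches first, then nearest first."""
    scored = [(haversine_m(lat, lon, e.lat, e.lon), e) for e in exhibits]
    nearby = [(d, e) for d, e in scored if d <= radius_m]
    # False sorts before True, so exhibits in the user's categories come first.
    nearby.sort(key=lambda de: (de[1].category not in interests, de[0]))
    return [e for _, e in nearby]


# Sample data for illustration only.
EXHIBITS = [
    Exhibit("Main keep", "history", 34.8394, 134.6939),
    Exhibit("Garden sculpture", "art", 34.8395, 134.6940),
    Exhibit("Distant museum", "history", 35.0000, 135.0000),
]
```

Ranking interest matches ahead of merely nearby items is one way to read "categorized according to what the user is likely to be interested in"; other orderings would fit the description equally well.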

[0039] Subsequently, the server utilizes a generative model to generate customized guide content based on the acquired information. This generative model generates detailed explanations and stories tailored to the user's specific interests, smoothly structuring personalized content.

[0040] The generated guide content is sent to the user's device and can be received by the user in audio or text format. The user's device plays this content according to the user's language settings, so users can enjoy explanations and guides in their native language.

[0041] As a concrete example, consider a user who has a particular interest in history. If this user visits an old castle, the system will generate in-depth information about the castle's history, the context of its construction, and important historical events. This allows the user to gain detailed knowledge that would not be available through a typical visit, resulting in a more enriching sightseeing experience.

[0042] Thus, the system of the present invention can provide information tailored to the individual interests and backgrounds of users, making their experiences at tourist destinations more personal and enriching.

[0043] The following describes the processing flow.

[0044] Step 1:

[0045] The user's device requests permission from the user through a user interface to obtain their current location. This interface also displays an option to select interest categories related to their destination (e.g., history, art, architecture, etc.). The user enters and submits their location information and interest categories according to the on-screen instructions.

[0046] Step 2:

[0047] The user's device sends its acquired location information (latitude and longitude) and selected interest categories to the server. This data serves as the basis for generating uniquely customized guide content for each user.

[0048] Step 3:

[0049] The server queries its internal database based on the received location information and interest categories. This query extracts detailed information about tourist attractions and exhibits related to the user's current location, prioritizing information that matches the user's interests.

[0050] Step 4:

[0051] The server uses the acquired information to launch a generative model. This model is designed to generate content tailored to the user's interests, customizing and creating guide content that includes in-depth knowledge of historical context and art.

[0052] Step 5:

[0053] The server sends the generated guide content to the user interface on the user's terminal. This content is translated or converted to speech, taking into account the user's language settings, and provided in the form of audio guidance or text display.

[0054] Step 6:

[0055] The user's device analyzes the received guide content and plays it back in the format selected by the user (audio or text). The user views tourist attractions through the provided guide and enriches their visit experience with individually customized information.

[0056] This processing flow enables the system to provide users with timely information based on their individual needs and interests.
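Steps 1 to 6 above can be sketched end to end as a single server-side handler. The sketch assumes hypothetical names throughout (`GuideRequest`, `handle_request`, and the two stand-in callables); the retrieval and generation steps are injected as functions because the source does not specify a particular database or generative model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GuideRequest:
    """Steps 1-2: what the terminal sends (field names are assumptions)."""
    lat: float
    lon: float
    interests: list[str]
    language: str = "en"
    output_format: str = "text"  # "text" or "audio"


def handle_request(
    req: GuideRequest,
    retrieve: Callable[[float, float, list[str]], list[str]],
    generate: Callable[[str], str],
) -> dict:
    # Step 3: look up facts for the location and interest categories.
    facts = retrieve(req.lat, req.lon, req.interests)
    # Step 4: hand the facts to the generative model as a prompt.
    prompt = (
        f"Interests: {', '.join(req.interests)}. Language: {req.language}.\n"
        "Write guide content from these facts:\n" + "\n".join(facts)
    )
    content = generate(prompt)
    # Step 5: return content tagged with language and format for the terminal (Step 6).
    return {"language": req.language, "format": req.output_format, "content": content}


# Stand-ins so the sketch runs without a real database or model.
def demo_retrieve(lat, lon, interests):
    return ["Sample fact about the nearest exhibit."]


def demo_generate(prompt):
    return f"(generated from {len(prompt.splitlines())} prompt lines)"
```

For example, `handle_request(GuideRequest(35.0, 135.0, ["history"]), demo_retrieve, demo_generate)` returns a dictionary whose `content` field is produced by the stand-in generator.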

[0057] (Example 1)

[0058] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0059] Traditional tourist guide systems have struggled to provide detailed, real-time information based on individual user interests and location. Furthermore, they often lacked the ability to generate flexible guide content tailored to users' language and preferences, resulting in a lack of user satisfaction.

[0060] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0061] In this invention, the server includes means for acquiring location information and preference information via a user interface, means for acquiring relevant content from information resources, and means for generating personalized guidance content using a model for generation. This makes it possible to provide detailed and personalized guidance information in real time according to the user's individual interests and language settings.

[0062] A "user interface" is a means of interaction used to receive input information from the user and to obtain location information and preference information.

[0063] "Location information" refers to data that indicates the geographical location where a user's terminal or device is currently located.

[0064] "Preference information" refers to data related to categories and themes that users are interested in.

[0065] "Information resources" refer to databases and information systems that store and provide accessible information about tourist destinations and other related topics.

[0066] A "model for generation" refers to an algorithm or artificial intelligence model that creates personalized guidance content suitable for the user based on acquired information.

[0067] "Guidance content" refers to guide data generated by individually tailoring tourism-related information to the user's interests.

[0068] "User's device" refers to a terminal or device used by the user, which is hardware used to receive, display, or play guidance content.

[0069] A "prompt" is an instruction or question that is input into a model to elicit a specific output.

[0070] This invention is a system that provides information to users in a more personalized and efficient way, enhancing their tourism experience. The system consists of a user terminal, a server, and a database as an information resource.

[0071] The user selects their current GPS location and categories of interest, such as "history" or "art," through the user interface on their device. This information is collected by the user's device and sent to the server.

[0072] The server queries a database within its information resources based on location and preference information obtained through the user interface. These resources contain detailed information about tourist attractions and exhibits. The server uses this data to quickly search for and retrieve information relevant to the user's interests.

[0073] The acquired information is input into the generative AI model, which generates personalized guidance content. A prompt is used to give specific instructions to the generative AI model. For example, a prompt such as "The user is interested in history. Please generate detailed historical explanations related to the following castles" can be used.
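A prompt of the kind quoted in [0073] could be assembled from the user's interest and the retrieved items as sketched below. The function name `build_prompt`, its parameters, and the slightly generalized wording of the template are assumptions for illustration; the source gives only the single example sentence.

```python
def build_prompt(interest: str, subject: str, items: list[str]) -> str:
    """Fill a prompt template with the user's interest and the retrieved items."""
    lines = [
        f"The user is interested in {interest}.",
        f"Please generate detailed explanations focused on {interest} "
        f"for the following {subject}:",
    ]
    # One bullet per retrieved item so the model sees each subject separately.
    lines += [f"- {item}" for item in items]
    return "\n".join(lines)
```

Calling `build_prompt("history", "castles", ["Castle A", "Castle B"])` yields a four-line prompt that opens with "The user is interested in history."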

[0074] The generated guidance content is sent to the user's device. The user terminal displays the guidance content in audio or text format, allowing the user to receive interesting and detailed information in their native language. For example, if a user with a particular interest in history visits an old castle, this system can provide in-depth information about the castle's construction background and important historical events.

[0075] In this way, the present invention makes it possible to provide users with enriching experiences at tourist destinations that are tailored to their individual interests.

[0076] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0077] Step 1:

[0078] The user enters their current GPS location and categories of interest using the user interface on their device. The entered information is compiled into a data structure on the device and prepared for transmission to the server. For example, if the user selects the "History" category, that selection and their GPS location are combined.

[0079] Step 2:

[0080] The device transmits location and interest information obtained from the user to the server. The transmitted data is structured as packets and securely transferred to the server using security protocols.

[0081] Step 3:

[0082] The server searches the database for relevant information based on location and interest information received from the terminal. Here, it accesses the database using search queries such as SQL to retrieve information about tourist destinations and events relevant to the user.

[0083] Step 4:

[0084] The acquired information is set as input data for the generative AI model. The server generates a prompt instructing the AI model to generate customized guidance based on the user's interests. Specifically, a prompt such as "The user is interested in history. Please generate detailed historical explanations related to the following castles" is created.

[0085] Step 5:

[0086] The generative AI model generates personalized guidance content tailored to the user based on the prompt text. The generated guidance content is stored on the server in text format, ready for subsequent processing.

[0087] Step 6:

[0088] The server sends the generated guidance content to the user's terminal. The transmission is done in real time, ensuring the user can access the information immediately.

[0089] Step 7:

[0090] The user terminal provides the received guidance to the user in either audio or text format. If speech synthesis is used, the terminal plays the content aloud, allowing the user to receive the information aurally. If a native language setting is configured, the guidance content is automatically adapted to the selected language.

[0091] Through this series of steps, the system provides users with a fulfilling travel experience tailored to their individual interests.
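Step 3 of this flow names SQL as one way to search the database. A minimal sketch using Python's standard `sqlite3` module follows; the `spots` table, its columns, the sample rows, and the bounding-box width `box_deg` are assumptions, since the source does not describe a schema.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical 'spots' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spots (name TEXT, category TEXT, lat REAL, lon REAL, summary TEXT)"
    )
    conn.executemany(
        "INSERT INTO spots VALUES (?, ?, ?, ?, ?)",
        [
            ("Castle keep", "history", 35.001, 135.001, "Sample history text."),
            ("Castle gallery", "art", 35.001, 135.002, "Sample art text."),
            ("Far shrine", "history", 36.500, 136.500, "Sample text, out of range."),
        ],
    )
    return conn


def query_spots(conn, lat, lon, category, box_deg=0.01):
    """Select spots of one category inside a lat/lon box around the user."""
    sql = """
        SELECT name, summary FROM spots
        WHERE category = ?
          AND lat BETWEEN ? AND ?
          AND lon BETWEEN ? AND ?
        ORDER BY name
    """
    # Parameters are bound by the driver rather than formatted into the SQL string.
    params = (category, lat - box_deg, lat + box_deg, lon - box_deg, lon + box_deg)
    return conn.execute(sql, params).fetchall()
```

Binding user-supplied values as parameters keeps the category text out of the SQL string itself, which matters here because the interest category originates from the terminal.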

[0092] (Application Example 1)

[0093] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0094] Modern tourist destinations require information that caters to the diverse interests and languages of visitors. However, current systems can only provide uniform information, making it difficult to offer a personalized tourist experience for each visitor. Therefore, the challenge lies in providing more individualized guide content in real time, based on each visitor's interests and location.

[0095] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0096] In this invention, the server includes means for acquiring location information and interest information via a user interface that accepts user input information; means for acquiring relevant information from a database based on the acquired location information and interest information; means for generating customized guide content based on the relevant information using a generative model; means for collecting interest information from visitors via voice and processing location information in real time; and means for communicating with a server on the cloud and generating content suitable for visitors using a generative AI model. This makes it possible to provide real-time tourist guide content based on the individual interests and location information of each visitor.

[0097] A "user interface" is an interface used by users to input information into a system, and it is responsible for receiving location information and interest information.

[0098] "Location information" refers to information that indicates the current location of a user or visitor, and is obtained using technologies such as GPS.

[0099] "Interest information" refers to information indicating categories or themes that users or visitors are particularly interested in, and is used to customize tourist guide content.

[0100] A "database" is a source of information where data is stored, and it is accessed to retrieve relevant information based on the user's location and interests.

[0101] A "generative model" is an algorithm or program that generates customized guide content based on acquired information, creating content that is suitable for the user.

[0102] "Guide content" refers to a collection of tourist information provided to users or visitors, which can be delivered in audio or text format.

[0103] "Speech synthesis" is a technology that converts generated text information into speech, providing information to visitors audibly.

[0104] A "cloud server" is a remote server accessed via the internet, and it serves as infrastructure for managing and processing the large amounts of data that a system handles.

[0105] This invention is a system that provides personalized tourist experiences to users visiting tourist destinations. Based on the user's location and interests, this system generates and delivers customized guide content in real time to the user's device.

[0106] First, the user inputs their location and interests using a user interface via a smartphone or a dedicated tourist guide robot. This interface includes GPS to confirm the user's current location and options to select categories of interest.

[0107] User input information is sent to a cloud server via the internet connection. The server uses this information to issue queries to its internally stored database and collect relevant tourist information. This information includes details about the historical background and works of art related to the visited location.
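The transmission in [0107] could look like the sketch below, which uses JSON because the processing flow for this example specifies that format. The payload keys (`location`, `interests`, `language`) and the range check on the server side are assumptions; the source states only that location and interest information are sent together.

```python
import json


def encode_request(lat, lon, interests, language="en") -> bytes:
    """Terminal side: serialize the user's input as UTF-8 encoded JSON."""
    payload = {
        "location": {"lat": lat, "lon": lon},
        "interests": list(interests),
        "language": language,
    }
    # ensure_ascii=False keeps non-ASCII category names readable in the payload.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def decode_request(raw: bytes) -> dict:
    """Server side: parse the payload and reject impossible coordinates."""
    data = json.loads(raw.decode("utf-8"))
    loc = data["location"]
    if not (-90 <= loc["lat"] <= 90 and -180 <= loc["lon"] <= 180):
        raise ValueError("latitude/longitude out of range")
    return data
```

A round trip such as `decode_request(encode_request(35.0, 135.0, ["歴史", "art"], "ja"))` returns the original fields unchanged.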

[0108] The collected information is transformed into stories and explanations tailored to the user's interests using a generative AI model. The generated guide content is then delivered to the user's device either as text or as audio produced by speech synthesis. A Python library is used for speech synthesis, allowing users to receive the content in their preferred language.
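The choice between text and synthesized audio in [0108] might be structured as below. The source does not name the speech synthesis library, so the synthesizer is passed in as a callable; `package_for_delivery` and `fake_synthesize` are hypothetical names, and the fake synthesizer returns tagged bytes rather than real audio so that the sketch runs without any external dependency.

```python
from typing import Callable


def package_for_delivery(
    text: str,
    language: str,
    want_audio: bool,
    synthesize: Callable[[str, str], bytes],
) -> dict:
    """Wrap guide content as text, or as audio bytes when audio is requested."""
    if want_audio:
        return {"format": "audio", "language": language, "body": synthesize(text, language)}
    return {"format": "text", "language": language, "body": text}


def fake_synthesize(text: str, language: str) -> bytes:
    """Placeholder for a real text-to-speech call; produces no actual audio."""
    return f"[{language} audio] {text}".encode("utf-8")
```

Keeping the synthesizer behind a plain function signature means a real text-to-speech library can be swapped in without changing the delivery logic.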

[0109] For example, if a visitor is interested in "medieval history," they will be provided with a detailed, customized audio guide about the history of the castle they are visiting and the important events of that era. Visitors can also request information from the robot using prompts such as, "Tell me more about the medieval history of this castle."

[0110] This allows users to enjoy a fulfilling travel experience tailored to their interests, gaining deeper knowledge and making discoveries that differ from traditional, standardized tourist guides.

[0111] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0112] Step 1:

[0113] Users input their current location and categories of interest using a smartphone or a tourist guide robot. The user interface obtains this input through a GPS module and category selection options. As part of the input data processing, the location information is converted to latitude and longitude format in preparation for transmission to the server.

[0114] Step 2:

[0115] The device sends user input information to the cloud server. In this communication, location information and interest information are sent together to the server in JSON format. During data transmission, encoding is performed to prevent data corruption and ensure maximum communication efficiency.

[0116] Step 3:

[0117] The server queries an internal database based on the received location and interest information. The database stores tourist destination information, and the query extracts only the data relevant to the user's interests. As part of data processing, an SQL statement is constructed to fetch the necessary information.

[0118] Step 4:

[0119] The server uses a generative AI model to convert the acquired data into customized guide content. The acquired tourist information is fed to the AI model as input, and the model generates a story optimized for the user's interests. The computation at this step is natural language generation over the text data.

[0120] Step 5:

[0121] The server delivers the generated guide content to the user's terminal. Using a speech synthesis library, the text content is converted into audio format, and the user's terminal prepares to play it. The output of this step is an audio file, which is sent to the terminal via the network.

[0122] Step 6:

[0123] The user terminal saves received audio files locally and plays them back according to user input. In particular, when the user issues a prompt, the terminal plays the information that matches the requested interest. Users can start, stop, and skip playback, efficiently obtaining tourist information through the audio guide.
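The start, stop, and skip controls in Step 6 can be modelled as a small state holder on the terminal. This is a sketch under assumptions: the class name `GuidePlayer`, the track list of file names, and the rule that skipping past the last track stops playback are all illustrative, and no actual audio output is performed.

```python
class GuidePlayer:
    """Tracks which saved audio file is current and whether playback is active."""

    def __init__(self, tracks):
        self.tracks = list(tracks)
        self.index = 0
        self.playing = False

    def current(self):
        """Return the current track name, or None when nothing is loaded."""
        return self.tracks[self.index] if self.tracks else None

    def start(self):
        # Playback can only begin when at least one track has been saved.
        if self.tracks:
            self.playing = True

    def stop(self):
        self.playing = False

    def skip(self):
        """Advance to the next track; stop when already on the last one."""
        if self.index < len(self.tracks) - 1:
            self.index += 1
        else:
            self.playing = False
```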

[0124] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0125] This invention provides a system that personalizes users' travel experiences and enables the provision of detailed information based on their emotional state. The system includes a user terminal, a server, a database, and an emotion engine.

[0126] The user terminal is equipped with an interface for collecting location and interest information. Through this interface, the user provides the location of their destination and selects categories of interest. With the user's permission, the terminal also prepares to acquire emotional information from voice and camera input. Emotional information is obtained from subtle changes in the user's facial expressions and vocal intonation.

[0127] Next, this information is sent to the server in real time. The server queries the database based on the received location and interest information to retrieve information related to the visited location. In this process, the type and amount of data required are dynamically changed according to the user's emotional state.

[0128] The server then uses an emotion engine to analyze emotional information and identify the user's current emotional state. This emotional state plays a crucial role in customizing the guide content; for example, if the user is excited, a more detailed and engaging story is prepared, while if they are calm, a more relaxing explanation is provided.

[0129] The generative model generates customized guide content based on collected data. This guide content is optimized based on the user's emotional state, interests, and language settings, enabling a personalized experience. The generated content is delivered to the user's device for playback in the appropriate format.

[0130] For example, in the case of a user visiting an art museum, the system reads their emotions and, if it determines that they are interested in or amazed by the artwork, provides more detailed information about the historical background of the work and anecdotes about the artist. This allows the user to receive information that perfectly matches their emotions at that moment.

[0131] Thus, by incorporating an emotion engine, the present invention can achieve a higher level of personalization than traditional guide systems, enriching the visitor experience.

[0132] The following describes the processing flow.

[0133] Step 1:

[0134] The user device displays a user interface, prompting the user to input location information and categories of interest. The user can also use the device's GPS to automatically set their location. Furthermore, the user grants access to the device's microphone and camera, enabling the capture of emotional information.

[0135] Step 2:

[0136] The user's device transmits acquired location information, interest information, and emotional data (voice and facial expression data) to the server. Emotional data is collected from the user's facial expressions and tone of voice in real time.

[0137] Step 3:

[0138] The server queries its internal database based on the transmitted location and interest information. This query extracts relevant information about the visited location. This information includes the history of the facility, details about the exhibits, and anecdotes about the artists and buildings.

[0139] Step 4:

[0140] The server uses an emotion engine to analyze the transmitted emotion data. From the analyzed data, it determines the user's current emotional state and provides the generative model with the content adjustment instructions most appropriate for that emotion.
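
One possible way to turn an identified emotional state into a content adjustment instruction is a simple lookup, sketched below. The pairing of "excited" with detailed, engaging content and "calm" with a relaxing explanation follows the description; the "brief" detail level for a calm user, the neutral fallback, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentAdjustment:
    detail: str  # how much detail the generated guide should contain
    tone: str    # narrative tone requested from the generative model


# Assumption: only the two states named in the description are mapped here.
ADJUSTMENTS = {
    "excited": ContentAdjustment(detail="detailed", tone="dynamic and engaging"),
    "calm": ContentAdjustment(detail="brief", tone="relaxing"),
}
# Hypothetical fallback for states the table does not cover.
DEFAULT_ADJUSTMENT = ContentAdjustment(detail="standard", tone="neutral")


def adjustment_for(emotion: str) -> ContentAdjustment:
    """Look up the adjustment for an emotional state, ignoring case and spaces."""
    return ADJUSTMENTS.get(emotion.strip().lower(), DEFAULT_ADJUSTMENT)


def as_instruction(adjustment: ContentAdjustment) -> str:
    """Render the adjustment as one instruction sentence for the model."""
    return f"Write a {adjustment.detail} explanation in a {adjustment.tone} tone."


instruction = as_instruction(adjustment_for("excited"))
```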

[0141] Step 5:

[0142] The server activates a generative model based on the acquired relevant information and emotional state data to generate customized guide content. For example, if the user is excited, the content includes dynamic narratives and detailed information that match that emotion.

[0143] Step 6:

[0144] The generated guide content is optimized based on the user's language settings and delivered from the server to the user's device. The guide content is provided to the user in the selected format (audio or text).

[0145] Step 7:

[0146] The user's device will play the received guide content as audio or display it as text as needed, allowing the user to receive rich, individually customized information. This process ensures that the user has an optimal experience tailored to their emotions at that moment.

[0147] (Example 2)

[0148] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0149] Conventional guide systems standardize the content provided based on the user's location and interest information, and do not offer the detailed personalization that takes into account the emotional state of individual users. As a result, it has been difficult for users to receive information that matches their emotions and interests throughout their sightseeing experience. This invention aims to improve the quality of the experience by analyzing the user's emotional information and dynamically adjusting the information provided based on that information.

[0150] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0151] In this invention, the server includes means for acquiring geographical and interest information via a display device that accepts user input information, means for analyzing the user's emotional information, and means for inputting prompts to a generative AI model to generate customized guide content. This makes it possible to provide information according to the user's emotional state, thereby enabling a more personalized travel experience.

[0152] "User input information" refers to geographical and interest information provided by the user through the interface.

[0153] A "display device" refers to a device equipped with an interface for users to input information and receive information from a system.

[0154] "Geographic information" refers to information that indicates the user's current location or the location of their destination.

[0155] "Interest information" refers to information related to the user's specific interests or selected categories.

[0156] A "storage device" refers to a device that includes a database that holds large amounts of data and provides information as needed.

[0157] "Emotional information" refers to data that indicates the user's emotional state, and is obtained through the analysis of voice and facial expressions.

[0158] "Means for analyzing emotional information" refers to methods or devices for processing a user's emotional information and identifying their current emotional state.

[0159] A "generative AI model" refers to a technology that utilizes artificial intelligence to generate optimized content based on user input and emotional information.

[0160] A "prompt" refers to an instruction or input sentence that a generative AI model uses to generate specific content.

[0161] "Customized guide content" refers to guidance information provided in a format that is most suitable for the user's experience, based on their individual data.

[0162] "User device" refers to a device used by a user to receive information.

[0163] This invention is a system that provides more individually customized information to users when they visit tourist destinations. The system utilizes user input information and emotional information to generate customized guide content using a generative AI model. Specifically, it uses the following hardware and software.

[0164] System Configuration

[0165] 1. User terminal

[0166] Users input geographical information about tourist destinations (such as their current location) and categories of interest (such as art or history) using devices such as smartphones or tablets. These devices are equipped with display devices and have the functionality to retrieve information through their interfaces.

[0167] 2. Emotion acquisition device

[0168] The device utilizes its built-in camera and microphone to detect the user's facial expressions and voice tone, collecting emotional information. This information is obtained only with the user's permission.

[0169] 3. Servers and Databases

[0170] The server retrieves relevant information from the database based on the geographical and interest information entered by the user. The database contains detailed information related to tourist destinations.

[0171] 4. Emotion Analysis Engine

[0172] The server uses an emotion analysis engine to analyze the acquired emotion information and identify the user's current emotional state. This enables the provision of information tailored to the user's emotional state.

[0173] 5. Generative AI Models

[0174] The server generates customized guide content using a generative AI model based on the analyzed data. This model operates based on pre-configured prompts.

[0175] Specific examples and prompt statements

[0176] For example, when a user visits a museum and views artwork, if they feel intrigued or surprised, the system generates a guide that explains the historical background of the artwork and anecdotes about the artist in detail. An example of a prompt that might be used is, "Based on the user's current emotional state, please generate detailed information about the artwork in the museum that interests them."

[0177] This makes it possible to provide information optimized according to the user's emotions and interests, thereby enriching the tourism experience.

[0178] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0179] Step 1:

[0180] The user inputs geographical information about the tourist destination and categories of interest through the terminal's display device. The terminal acquires this input information and collects the latitude and longitude of the current location as geographical information and the selected categories (e.g., culture, nature, history, etc.) as interest information. The terminal's operation in this step is to collect information through the input interface.

[0181] Step 2:

[0182] The device uses voice recognition and camera functions to acquire the user's emotional information. This information includes emotional data derived from voice intonation and facial expressions. The acquired emotional information is transmitted to the server in real time by the device. In this process, the input is emotional information from facial expressions and voice, and the output is emotional data.

[0183] Step 3:

[0184] The server receives the geographical information, interest information, and emotional information sent from the terminal. The server uses this information to issue queries to its storage device (database) and retrieve information related to tourist destinations (e.g., details of exhibits, facility descriptions, etc.). The specific data manipulation in the database is the extraction of records that match specified criteria. The input is location information and interest information, and the output is related information.
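
As one illustration of extracting records that match criteria derived from latitude, longitude, and interest categories, the sketch below keeps only records of a wanted category within a given radius of the user. The haversine distance, the 1 km default radius, and the record layout are assumptions of this sketch; the description does not specify how matching is performed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def match_records(records, lat, lon, categories, radius_km=1.0):
    """Return records in the wanted categories within radius_km, nearest first."""
    wanted = set(categories)
    scored = [
        (haversine_km(lat, lon, r["lat"], r["lon"]), r)
        for r in records
        if r["category"] in wanted
    ]
    nearby = [(d, r) for d, r in scored if d <= radius_km]
    nearby.sort(key=lambda pair: pair[0])
    return [r for _, r in nearby]


# Hypothetical records; names and coordinates are placeholders.
sample = [
    {"name": "Exhibit A", "category": "history", "lat": 35.0, "lon": 135.0},
    {"name": "Exhibit B", "category": "art", "lat": 35.0, "lon": 135.0},
]
hits = match_records(sample, 35.0, 135.0, ["history"])
```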

[0185] Step 4:

[0186] The server uses an emotion analysis engine to analyze the received emotion information and identify the user's emotional state. This analysis determines the type of emotion (joy, surprise, calmness, etc.), and the result is used for subsequent processing. The input is emotion data, and the output is an evaluation result of the emotional state.
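
The evaluation in this step, from emotion data to a single emotion type, can be sketched as follows under the assumption that voice analysis and facial expression analysis each yield a score per emotion label. The equal weighting of the two sources and the function name are hypothetical; the description states only that the engine determines the type of emotion (joy, surprise, calmness, etc.).

```python
def identify_emotion(voice_scores, face_scores, voice_weight=0.5):
    """Fuse per-label scores from voice and face and return (label, score)."""
    labels = set(voice_scores) | set(face_scores)
    if not labels:
        raise ValueError("no emotion scores supplied")
    # Assumption: a weighted average is used to combine the two sources.
    fused = {
        label: voice_weight * voice_scores.get(label, 0.0)
        + (1.0 - voice_weight) * face_scores.get(label, 0.0)
        for label in labels
    }
    # Iterating over sorted labels makes ties resolve deterministically.
    best = max(sorted(fused), key=fused.get)
    return best, fused[best]


state, score = identify_emotion(
    {"joy": 0.2, "surprise": 0.7, "calmness": 0.1},
    {"joy": 0.3, "surprise": 0.6, "calmness": 0.1},
)
```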

[0187] Step 5:

[0188] The server inputs a prompt to the generative AI model based on the analysis results to generate customized guide content. A prompt such as "Create information tailored to the current emotional state" is used to create content optimized for the user experience. The input consists of the emotional state and the prompt, and the output is the customized content.

[0189] Step 6:

[0190] The generated guide content is delivered from the server to the user's device. The device receives this content and provides it to the user in the form of audio guides or text displays. The input is the generated content, and the output is the information the user receives visually or aurally.

[0191] (Application Example 2)

[0192] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0193] Traditional navigation systems provide uniform information without considering the user's emotional state, making it difficult to maximize the individual user experience. Furthermore, they lack mechanisms to dynamically optimize information based on the user's real-time emotions, meaning the information users receive may not align with their expectations or interests.

[0194] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0195] In this invention, the server includes means for acquiring location data and information on areas of interest via an interface for receiving user input information; means for acquiring relevant data from information sources based on the acquired location data and information on areas of interest; means for analyzing the user's emotional state based on the acquired emotional information; means for generating personalized guide content based on the relevant data using a generative model with the results of the emotional analysis; and means for transmitting the generated guide content to the user terminal. This makes it possible to provide personalized guide information that is appropriate to the user's emotional state in real time.

[0196] "User input information" refers to location data and information related to areas of interest that users provide to the system.

[0197] An "interface" is a means of exchanging information between a user and a system.

[0198] "Location data" refers to information about the user's current location.

[0199] "Areas of interest" refers to information about categories and topics that the user is interested in.

[0200] An "information source" is a collection of searchable data that exists in databases or on the internet.

[0201] "Emotional information" refers to data about emotions that can be gleaned from a user's facial expressions and tone of voice.

[0202] "Emotional state" refers to the state of a user's emotions at a particular point in time.

[0203] A "generative model" is an algorithm used to generate content based on input information.

[0204] "Personalized guide content" refers to guidance information that is customized to the user's emotional state and interests.

[0205] A "user terminal" refers to a device that a user uses to receive information.

[0206] As an embodiment of this invention, a user terminal such as a smartphone or smart glasses is used. The user terminal is equipped with an interface that receives location data and information on areas of interest from the user. Furthermore, it is equipped with a camera and microphone to acquire emotional information by detecting subtle changes in the user's facial expressions and voice. This emotional information is analyzed using software such as OpenCV.

[0207] The server receives the location data and emotional information transmitted from the user's device. Next, it retrieves relevant data from the information sources and uses an emotion engine to analyze the user's emotional state. Based on the result of this emotion analysis, it generates personalized guide content using cloud services such as Google Cloud Platform. The generated content is then delivered to the user's device in an appropriate format.

[0208] For example, if a tourist visiting a historical building is detected to be emotionally excited, the server will provide the user with detailed history and interesting anecdotes about that building. In this process, the generative AI model can generate customized content by receiving instructions such as, "Input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, include detailed history and anecdotes."

[0209] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0210] Step 1:

[0211] The user terminal receives the user's location data and information about their areas of interest. This information is input as GPS data provided by the user and the categories of interests they have selected. This allows the system to obtain the user's interest information, which forms the basis for subsequent processing.

[0212] Step 2:

[0213] The user terminal collects user facial expression data and voice data through its camera and microphone. This data is input as emotional information, and the user's emotional state is analyzed in real time using OpenCV. The output here is the analyzed emotional state data.

[0214] Step 3:

[0215] The server receives location data, information on areas of interest, and emotional information from the user's terminal. It then uses this data as input to query information sources and retrieve relevant data. This allows the server to prepare specific information to provide to the user.

[0216] Step 4:

[0217] The server uses an emotion engine to analyze the user's emotional state in detail based on the received emotion analysis data. This process helps determine the appropriate content direction based on the user's emotions. The output identifies the main emotional states expressed by the user.

[0218] Step 5:

[0219] The server uses a generative AI model to generate personalized guide content based on all input data. Prompts are used to provide specific instructions, such as, "Please input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, please include detailed history and anecdotes." The content generated during this process is then output.

[0220] Step 6:

[0221] The server sends the generated personalized guide content to the user's terminal. On the user's terminal, this content is played back in the appropriate format and provided to the user. The final output is personalized information in a format the user can receive.

[0222] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0223] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions and inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0224] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0225] [Second Embodiment]

[0226] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0227] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0228] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0229] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0230] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.

[0231] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0232] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0233] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0234] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0235] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0236] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0237] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0238] This invention relates to an embodiment of a system that personalizes a user's travel experience and efficiently provides information. This system consists of a user terminal, a server, and a database.

[0239] First, the user's device displays an interface for collecting location and interest information from the user. The user can select their current GPS location and categories of interest, such as "history" or "art." Once the user has made their selections, the device sends this information to the server.

[0240] The server queries an internal database based on location and interest information received from the user. This database stores detailed information about each tourist destination and exhibit, and the server quickly searches and retrieves relevant information. The retrieved information is categorized according to what the user is likely to be interested in.

[0241] Subsequently, the server utilizes a generative model to generate customized guide content based on the acquired information. This generative model generates detailed explanations and stories tailored to the user's specific interests and structures them into coherent personalized content.

[0242] The generated guide content is sent to the user's device and can be received by the user in audio or text format. The user's device plays this content according to the user's language settings, so users can enjoy explanations and guides in their native language.

[0243] As a concrete example, consider a user who has a particular interest in history. If this user visits an old castle, the system will generate in-depth information about the castle's history, the context of its construction, and important historical events. This allows the user to gain detailed knowledge that would not be available through a typical visit, resulting in a more enriching sightseeing experience.

[0244] Thus, the system of the present invention can provide information tailored to the individual interests and backgrounds of users, making their experiences at tourist destinations more personal and enriching.

[0245] The following describes the processing flow.

[0246] Step 1:

[0247] The user's device requests permission from the user through a user interface to obtain their current location. This interface also displays an option to select interest categories related to their destination (e.g., history, art, architecture, etc.). The user enters and submits their location information and interest categories according to the on-screen instructions.

[0248] Step 2:

[0249] The user's device sends its acquired location information (latitude and longitude) and selected interest categories to the server. This data serves as the basis for generating uniquely customized guide content for each user.

[0250] Step 3:

[0251] The server queries its internal database based on the received location information and interest categories. This query extracts detailed information about tourist attractions and exhibits related to the user's current location, prioritizing information that matches the user's interests.

[0252] Step 4:

[0253] The server uses the acquired information to launch a generative model. This model is designed to generate content tailored to the user's interests, customizing and creating guide content that includes in-depth knowledge of historical context and art.

[0254] Step 5:

[0255] The server sends the generated guide content to the user interface on the user's terminal. This content is translated or converted to speech, taking into account the user's language settings, and provided in the form of audio guidance or text display.

[0256] Step 6:

[0257] The user's device analyzes the received guide content and presents it in the format selected by the user (audio or text). The user views tourist attractions with the aid of the provided guide, and the visit is enriched by individually customized information.

[0258] This processing flow enables the system to provide users with timely information based on their individual needs and interests.

[0259] (Example 1)

[0260] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0261] Traditional tourist guide systems have struggled to provide detailed, real-time information based on individual user interests and location. Furthermore, they often lacked the ability to generate flexible guide content tailored to users' language and preferences, resulting in a lack of user satisfaction.

[0262] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0263] In this invention, the server includes means for acquiring location information and preference information via a user interface, means for acquiring relevant content from information resources, and means for generating personalized guidance content using a model for generation. This makes it possible to provide detailed and personalized guidance information in real time according to the user's individual interests and language settings.

[0264] A "user interface" is a means of interaction used to receive input information from the user and to obtain location information and preference information.

[0265] "Location information" refers to data that indicates the geographical location where a user's terminal or device is currently located.

[0266] "Preference information" refers to data related to categories and themes that users are interested in.

[0267] "Information resources" refer to databases and information systems that store and provide accessible information about tourist destinations and other related topics.

[0268] A "model for generation" refers to an algorithm or artificial intelligence model that creates personalized guidance content suitable for the user based on acquired information.

[0269] "Guidance content" refers to guide data generated by individually tailoring tourism-related information to the user's interests.

[0270] "User's device" refers to a terminal or device used by the user, which is hardware used to receive, display, or play guidance content.

[0271] A "prompt" is an instruction or question that is input into a model to elicit a specific output.

[0272] This invention is a system that provides information to users in a more personalized and efficient way, enhancing their tourism experience. The system consists of a user terminal, a server, and a database as an information resource.

[0273] The user selects their current GPS location and categories of interest, such as "history" or "art," through the user interface on their device. This information is collected by the user's device and sent to the server.

[0274] The server queries a database within its information resources based on location and preference information obtained through the user interface. These resources contain detailed information about tourist attractions and exhibits. The server uses this data to quickly search for and retrieve information relevant to the user's interests.

[0275] The acquired information is input into the model for generation, and this generative AI model generates personalized guidance content. A prompt is used to give specific instructions to the generative AI model. For example, a prompt such as "The user is interested in history. Please generate detailed historical explanations related to the following castles" can be used.

[0276] The generated guidance content is sent to the user's device. The user terminal presents the guidance content in audio or text format, allowing the user to receive interesting and detailed information in their native language. For example, if a user with a particular interest in history visits an old castle, this system can provide in-depth information about the castle's construction background and important historical events.

[0277] In this way, the present invention makes it possible to provide users with enriching experiences at tourist destinations that are tailored to their individual interests.

[0278] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0279] Step 1:

[0280] The user enters their current GPS location and categories of interest using the user interface on their device. The entered information is compiled into a data structure on the device and prepared for transmission to the server. For example, if the user selects the "History" category, that selection and their GPS location are combined.
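
The data structure compiled in Step 1 could look roughly like the following. This is a minimal sketch, not the structure used by the invention: the class name `GuideRequest`, its field names, and the coordinate values are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GuideRequest:
    """Hypothetical container for the values the terminal collects in Step 1."""
    latitude: float
    longitude: float
    interests: tuple[str, ...]

    def __post_init__(self):
        # Reject values that cannot be a valid GPS position or selection.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")
        if not self.interests:
            raise ValueError("at least one interest category is required")


# Combine the selected "history" category with an example GPS position.
request = GuideRequest(latitude=35.0, longitude=135.0, interests=("history",))
payload = asdict(request)  # plain dict, ready to be serialized for transmission
```

Validating on the terminal keeps malformed coordinates from reaching the server, though where validation happens is a design choice the text does not specify.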

[0281] Step 2:

[0282] The terminal sends the location information and interest information obtained from the user to the server. The transmitted data is structured as packets and securely transferred to the server using a security protocol.

[0283] Step 3:

[0284] Based on the location information and interest information received from the terminal, the server searches for relevant information in the database. Here, a search query such as SQL is used to access the database and obtain information about tourist attractions and events related to the user.
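
As an illustration of the database search in Step 3, the sketch below runs a parameterized SQL query against an in-memory SQLite database. The table name `attractions`, its columns, the sample rows, and the bounding-box radius are all assumptions; the text only states that a query language such as SQL is used.

```python
import sqlite3


def find_attractions(conn, lat, lon, category, radius_deg=0.01):
    """Return (name, description) rows for `category` inside a simple
    latitude/longitude bounding box around the user's position."""
    return conn.execute(
        """
        SELECT name, description FROM attractions
        WHERE category = ?
          AND latitude BETWEEN ? AND ?
          AND longitude BETWEEN ? AND ?
        ORDER BY name
        """,
        (category, lat - radius_deg, lat + radius_deg,
         lon - radius_deg, lon + radius_deg),
    ).fetchall()


# Hypothetical schema and placeholder rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attractions "
    "(name TEXT, category TEXT, latitude REAL, longitude REAL, description TEXT)"
)
conn.executemany(
    "INSERT INTO attractions VALUES (?, ?, ?, ?, ?)",
    [
        ("Old Castle", "history", 35.000, 135.000, "A castle keep."),
        ("City Gallery", "art", 35.001, 135.001, "A gallery."),
        ("Far Fort", "history", 36.000, 136.000, "A distant fort."),
    ],
)
results = find_attractions(conn, 35.0005, 135.0005, "history")
```

Placeholders (`?`) are used instead of string formatting so that user-supplied categories cannot alter the query. A bounding box is only a rough proximity filter; a production system might use a spatial index instead.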

[0285] Step 4:

[0286] The obtained information is set as input data for the generative AI model. The server generates a prompt sentence to instruct the AI model to generate customized guidance content based on the user's interests. Specifically, a prompt such as "The user is interested in history. Please generate a detailed historical explanation related to the following castles" is created.
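
One possible way to assemble such a prompt from the retrieved records is sketched below. The function name `build_prompt` and the record keys `name` and `description` are assumptions; the wording follows the example prompt quoted above but generalizes "castles" to "sites".

```python
def build_prompt(interest: str, attractions: list[dict]) -> str:
    """Assemble a prompt from the user's interest and the retrieved records.

    Each record is assumed to be a dict with 'name' and 'description' keys.
    """
    lines = [
        f"The user is interested in {interest}.",
        f"Please generate a detailed explanation focused on {interest} "
        "for the following sites:",
    ]
    for item in attractions:
        lines.append(f"- {item['name']}: {item['description']}")
    return "\n".join(lines)


prompt = build_prompt(
    "history",
    [{"name": "Old Castle", "description": "A castle keep."}],  # placeholder record
)
```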

[0287] Step 5:

[0288] The generative AI model generates individualized guidance content suitable for the user based on the prompt sentence. The generated guidance content is stored in the server in text format for subsequent processing.

[0289] Step 6:

[0290] The server sends the generated guidance content to the user terminal. The transmission is carried out in real time so that the user can use the information immediately.

[0291] Step 7:

[0292] The user terminal provides the received guidance to the user in either audio or text format. If speech synthesis is used, the terminal plays the content aloud, allowing the user to receive the information aurally. If a native language setting is configured, the guidance content is automatically adapted to the selected language.

[0293] Through this series of steps, the system provides users with a fulfilling travel experience tailored to their individual interests.

[0294] (Application Example 1)

[0295] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0296] Modern tourist destinations require information that caters to the diverse interests and languages of visitors. However, current systems can only provide uniform information, making it difficult to offer a personalized tourist experience for each visitor. Therefore, the challenge lies in providing more individualized guide content in real time, based on each visitor's interests and location.

[0297] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0298] In this invention, the server includes means for acquiring location information and interest information via a user interface that accepts user input information; means for acquiring relevant information from a database based on the acquired location information and interest information; means for generating customized guide content based on the relevant information using a generative model; means for collecting interest information from visitors via voice and processing location information in real time; and means for communicating with a server on the cloud and generating content suitable for visitors using a generative AI model. This makes it possible to provide real-time tourist guide content based on the individual interests and location information of each visitor.

[0299] A "user interface" is an interface used by users to input information into a system, and it is responsible for receiving location information and interest information.

[0300] "Location information" refers to information that indicates the current location of a user or visitor, and is obtained using technologies such as GPS.

[0301] "Interest information" refers to information indicating categories or themes that users or visitors are particularly interested in, and is used to customize tourist guide content.

[0302] A "database" is a source of information where data is stored, and it is accessed to retrieve relevant information based on the user's location and interests.

[0303] A "generative model" is an algorithm or program that generates customized guide content based on acquired information, creating content that is suitable for the user.

[0304] "Guide content" refers to a collection of tourist information provided to users or visitors, which can be delivered in audio or text format.

[0305] "Speech synthesis" is a technology that converts generated text information into speech, providing information to visitors audibly.

[0306] A "cloud server" is a remote server accessed via the internet, and it serves as infrastructure for managing and processing the large amounts of data that a system handles.

[0307] This invention is a system that provides a personalized tourism experience for users visiting tourist destinations. This system generates customized guide content in real time based on the user's location information and interest information, and distributes it to the user terminal.

[0308] First, the user inputs location information and interest information through the user interface of a smartphone or a dedicated tourism guide robot. This interface provides a GPS function with which the user confirms their current location, and options for selecting categories of interest.

[0309] The input information of the user is transmitted to the cloud server via an Internet connection. The server utilizes this information to issue a query to the database stored internally and collect relevant tourism information. This information includes historical backgrounds related to the visited location and details of artworks.

[0310] The collected information is converted into stories and explanations tailored to the user's interests using a generative AI model. The generated guide content is distributed to the user terminal as text, or is converted into audio using speech synthesis technology and then distributed. A Python library is used for speech synthesis, and the user can receive the content in their preferred language.

[0311] As a specific example, when a certain visitor is interested in "medieval history", a highly customized audio guide about the history of the castle they visit and important events of that era is provided. The visitor can also request information from the robot using a prompt sentence such as "Please tell me in detail about the medieval history of this castle."

[0312] As a result, the user can enjoy a fulfilling tourism experience that suits their own interests, and can obtain profound knowledge and discoveries, which is different from the conventional uniform tourism guides.

[0313] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0314] Step 1:

[0315] Users input their current location and categories of interest using a smartphone or a tourist guide robot. The user interface obtains this input through a GPS module and category selection options, which keeps the input simple. As part of the input data processing, the location information is prepared for transmission to the server in latitude and longitude format.

[0316] Step 2:

[0317] The device sends user input information to the cloud server. In this communication, location information and interest information are sent together to the server in JSON format. During data transmission, encoding is performed to prevent data corruption and ensure maximum communication efficiency.
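
The JSON transmission in Step 2 can be sketched as follows. The field names `location`, `lat`, `lon`, and `interests` are hypothetical, and UTF-8 is assumed as the encoding; the text specifies only that the data is sent as JSON and encoded to prevent corruption.

```python
import json


def encode_request(lat: float, lon: float, interests: list[str]) -> bytes:
    """Serialize location and interest information into compact UTF-8 JSON."""
    body = {"location": {"lat": lat, "lon": lon}, "interests": list(interests)}
    # Compact separators reduce payload size; ensure_ascii=False keeps
    # non-English category names readable instead of escaping them.
    text = json.dumps(body, ensure_ascii=False, separators=(",", ":"))
    return text.encode("utf-8")


def decode_request(raw: bytes) -> dict:
    """Server-side counterpart: decode the bytes back into a dict."""
    return json.loads(raw.decode("utf-8"))


raw = encode_request(35.0, 135.0, ["history", "美術"])
decoded = decode_request(raw)
```

Fixing the character encoding on both sides is what prevents non-ASCII category names from being corrupted in transit.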

[0318] Step 3:

[0319] The server queries an internal database based on the received location and interest information. The database stores tourist destination information, and the query extracts only the data relevant to the user's interests. As part of data processing, an SQL statement is constructed to fetch the necessary information.

[0320] Step 4:

[0321] The server uses a generative AI model to convert the acquired data into customized guide content. The AI model receives the acquired tourist information as input and generates a story optimized for the user's interests. As a data calculation, natural language generation is performed on the text data.

[0322] Step 5:

[0323] The server delivers the generated guide content to the user's terminal. Using a speech synthesis library, the text content is converted into audio format, and the user's terminal prepares to play it. An audio file is generated as data output and sent to the terminal via the network.

[0324] Step 6:

[0325] The user terminal saves received audio files locally and plays them back according to user input. In particular, the system plays information of the user's interest based on prompts. Users can start, stop, and skip playback, efficiently obtaining tourist information through the audio guide.
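
The start, stop, and skip operations in Step 6 amount to a small playback state machine. The sketch below models only that state, under the assumption that the guide is stored as an ordered list of audio files; the class name `GuidePlayer` and the file names are hypothetical, and no actual audio output is performed.

```python
class GuidePlayer:
    """Tracks which locally saved audio file is selected and whether it plays."""

    def __init__(self, tracks):
        self.tracks = list(tracks)
        self.index = 0
        self.playing = False

    def start(self):
        # Playback can only begin if at least one file has been saved.
        if self.tracks:
            self.playing = True

    def stop(self):
        self.playing = False

    def skip(self):
        # Move to the next file; stop when the last file is skipped.
        if self.index < len(self.tracks) - 1:
            self.index += 1
        else:
            self.playing = False

    def current(self):
        return self.tracks[self.index] if self.tracks else None


player = GuidePlayer(["intro.wav", "history.wav"])  # placeholder file names
player.start()
player.skip()
```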

[0326] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform specific processing using the user's emotions.

[0327] This invention provides a system that personalizes users' travel experiences and enables the provision of detailed information based on their emotional state. The system includes a user terminal, a server, a database, and an emotion engine.

[0328] The user terminal is equipped with an interface for collecting location and interest information. Through this, the user provides the location of their destination and selects categories of interest. The terminal also prepares to acquire emotional information using voice and camera, with the user's permission. Emotional information is obtained from subtle changes in the user's facial expressions and vocal intonation.

[0329] Next, this information is sent to the server in real time. The server queries the database based on the received location and interest information to retrieve information related to the visited location. In this process, the type and amount of data required are dynamically changed according to the user's emotional state.

[0330] The server then uses an emotion engine to analyze emotional information and identify the user's current emotional state. This emotional state plays a crucial role in customizing the guide content; for example, if the user is excited, a more detailed and engaging story is prepared, while if they are calm, a more relaxing explanation is provided.

[0331] The generative model generates customized guide content based on collected data. This guide content is optimized based on the user's emotional state, interests, and language settings, enabling a personalized experience. The generated content is delivered to the user's device for playback in the appropriate format.

[0332] For example, in the case of a user visiting an art museum, the system reads their emotions and, if it determines that they are interested in or amazed by the artwork, provides more detailed information about the historical background of the work and anecdotes about the artist. This allows the user to receive information that perfectly matches their emotions at that moment.

[0333] Thus, by combining an emotional engine, the present invention can achieve a higher level of personalization than traditional guide systems, enriching the visitor experience.

[0334] The following describes the processing flow.

[0335] Step 1:

[0336] The user device displays a user interface, prompting the user to input location information and categories of interest. The user can also use the device's GPS to automatically set their location. Furthermore, the user grants access to the device's microphone and camera, enabling the capture of emotional information.

[0337] Step 2:

[0338] The user's device transmits acquired location information, interest information, and emotional data (voice and facial expression data) to the server. Emotional data is collected from the user's facial expressions and tone of voice in real time.

[0339] Step 3:

[0340] The server queries its internal database based on the transmitted location and interest information. This query extracts relevant information about the visited location. This information includes the history of the facility, details about the exhibits, and anecdotes about the artists and buildings.

[0341] Step 4:

[0342] The server uses an emotion engine to analyze the transmitted emotion data. From the analyzed data, it determines the user's current emotional state and provides the generation model with content adjustment instructions that are most appropriate for that emotion.
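
The content adjustment instruction in Step 4 can be pictured as a lookup from the identified emotional state to a style instruction for the generative model. The table below is a sketch: the labels `excited` and `calm` follow the examples in the text, while the instruction wording, the default entry, and the function name are assumptions.

```python
# Hypothetical mapping from emotional state to a style instruction.
STYLE_BY_EMOTION = {
    "excited": "Use a dynamic narrative and include detailed background and anecdotes.",
    "calm": "Use a relaxed tone and keep the explanation gentle and unhurried.",
}

# Fallback used when the emotion engine returns a label not listed above.
DEFAULT_STYLE = "Use a neutral tone and a moderate level of detail."


def adjustment_instruction(emotion: str) -> str:
    """Return the style instruction to pass to the generative model."""
    return STYLE_BY_EMOTION.get(emotion.lower(), DEFAULT_STYLE)


instruction = adjustment_instruction("Excited")
```

A default entry keeps the pipeline working when the emotion engine is uncertain or reports an unlisted state.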

[0343] Step 5:

[0344] The server activates a generative model based on the acquired relevant information and emotional state data to generate customized guide content. For example, if the user is in an excited state, it will include dynamic narratives and detailed information that match their emotions.

[0345] Step 6:

[0346] The generated guide content is optimized based on the user's language settings and delivered from the server to the user's device. The guide content is provided to the user in the selected format (audio or text).

[0347] Step 7:

[0348] The user's device will play the received guide content as audio or display it as text as needed, allowing the user to receive rich, individually customized information. This process ensures that the user has an optimal experience tailored to their emotions at that moment.

[0349] (Example 2)

[0350] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0351] Conventional guide systems standardize the content provided based on the user's location and interest information, and do not offer the detailed personalization that takes into account the emotional state of individual users. As a result, it has been difficult for users to receive information that matches their emotions and interests throughout their sightseeing experience. This invention aims to improve the quality of the experience by analyzing the user's emotional information and dynamically adjusting the information provided based on that information.

[0352] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0353] In this invention, the server includes means for acquiring geographical and interest information via a display device that accepts user input information, means for analyzing the user's emotional information, and means for inputting prompts to a generating AI model to generate customized guide content. This makes it possible to provide information according to the user's emotional state, thereby enabling a more personalized travel experience.

[0354] "User input information" refers to geographical and interest information provided by the user through the interface.

[0355] A "display device" refers to a device equipped with an interface for users to input information and receive information from a system.

[0356] "Geographic information" refers to information that indicates the user's current location or the location of their destination.

[0357] "Interest information" refers to information related to the user's specific interests or selected categories.

[0358] A "storage device" refers to a device that includes a database that holds large amounts of data and provides information as needed.

[0359] "Emotional information" refers to data that indicates the user's emotional state, and is obtained through the analysis of voice and facial expressions.

[0360] "Means for analyzing emotional information" refers to methods or devices for processing a user's emotional information and identifying their current emotional state.

[0361] A "generative AI model" refers to a technology that utilizes artificial intelligence to generate optimized content based on user input and emotional information.

[0362] A "prompt" refers to an instruction or input sentence that a generative AI model uses to generate specific content.

[0363] "Customized guide content" refers to guidance information provided in a format that is most suitable for the user's experience, based on their individual data.

[0364] "User device" refers to a device used by a user to receive information.

[0365] This invention is a system that provides more individually customized information to users when they visit tourist destinations. The system utilizes user input information and emotional information to generate customized guide content using a generative AI model. Specifically, it uses the following hardware and software.

[0366] System Configuration

[0367] 1. User terminal

[0368] Users input geographical information about tourist destinations (such as their current location) and categories of interest (such as art or history) using devices such as smartphones or tablets. These devices are equipped with display devices and have the functionality to retrieve information through their interfaces.

[0369] 2. Emotion acquisition device

[0370] The device utilizes its built-in camera and microphone to detect the user's facial expressions and voice tone, collecting emotional information. This information is obtained only with the user's permission.

[0371] 3. Servers and Databases

[0372] The server retrieves relevant information from the database based on the geographical and interest information entered by the user. The database contains detailed information related to tourist destinations.

[0373] 4. Emotion Analysis Engine

[0374] The server uses an emotion analysis engine to analyze the acquired emotion information and identify the user's current emotional state. This enables the provision of information tailored to the user's emotional state.

[0375] 5. Generative AI Models

[0376] The server generates customized guide content using a generative AI model based on the analyzed data. This model operates based on pre-configured prompts.

[0377] Specific examples and prompt statements

[0378] For example, when a user visits a museum and views artwork, if they feel intrigued or surprised, the system generates a guide that explains the historical background of the artwork and anecdotes about the artist in detail. Another example of a prompt that might be used is, "Based on the user's current emotional state, please generate detailed information about the artwork in the museum that interests them."

[0379] This makes it possible to provide information optimized according to the user's emotions and interests, thereby enriching the tourism experience.

[0380] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0381] Step 1:

[0382] The user inputs geographical information about the tourist destination and categories of interest through the terminal's display device. The terminal acquires this input information and collects the current location's latitude and longitude data as geographical information and the selected categories (e.g., culture, nature, history, etc.) as interest information. The terminal's operation is the collection of information using the input interface.

[0383] Step 2:

[0384] The device uses voice recognition and camera functions to acquire the user's emotional information. This information includes emotional data derived from voice intonation and facial expressions. The acquired emotional information is transmitted to the server in real time by the device. In this process, the input is emotional information from facial expressions and voice, and the output is emotional data.

[0385] Step 3:

[0386] The server receives geographical information, interest information, and emotional information sent from the terminal. The server uses this information to issue queries to its storage device (database) to retrieve information related to tourist destinations (e.g., details of exhibits, facility descriptions, etc.). Specific data manipulation in the database involves extracting records that match specific criteria. The input is location information and interest information, and the output is related information.

[0387] Step 4:

[0388] The server uses an emotion analysis engine to analyze the received emotion information and identify the user's emotional state. This analysis determines the type of emotion (joy, surprise, calmness, etc.), and the result is used for subsequent processing. The input is emotion data, and the output is an evaluation result of the emotional state.
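
If the emotion analysis engine outputs a score per emotion type, the evaluation result in Step 4 could be obtained by picking the highest-scoring type. This is a sketch under that assumption; the score format, the threshold value, and the `unknown` fallback label are not taken from the text.

```python
def dominant_emotion(scores: dict[str, float], threshold: float = 0.5) -> str:
    """Return the highest-scoring emotion label, or 'unknown' if no score
    reaches the confidence threshold."""
    if not scores:
        return "unknown"
    label, score = max(scores.items(), key=lambda kv: kv[1])
    return label if score >= threshold else "unknown"


# Placeholder scores for the emotion types named in the text.
state = dominant_emotion({"joy": 0.2, "surprise": 0.7, "calmness": 0.1})
```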

[0389] Step 5:

[0390] The server inputs a prompt to the AI model based on the analysis results to generate customized guide content. Here, a prompt such as "Create information tailored to the current emotional state" is used to create content optimized for the user experience. The input consists of the emotional state and the prompt, while the output is the customized content.

[0391] Step 6:

[0392] The generated guide content is delivered from the server to the user's device. The device receives this content and provides it to the user in the form of audio guides or text displays. The input is the generated content, and the output is the information the user receives visually or aurally.

[0393] (Application Example 2)

[0394] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the smart glasses 214 as the "terminal".

[0395] Traditional navigation systems provide uniform information without considering the user's emotional state, making it difficult to maximize the individual user experience. Furthermore, they lack mechanisms to dynamically optimize information based on the user's real-time emotions, meaning the information users receive may not align with their expectations or interests.

[0396] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0397] In this invention, the server includes means for acquiring location data and information on areas of interest via an interface for receiving user input information; means for acquiring relevant data from information sources based on the acquired location data and information on areas of interest; means for analyzing the user's emotional state based on the acquired emotional information; means for generating personalized guide content based on the relevant data using a generative model with the results of the emotional analysis; and means for transmitting the generated guide content to the user terminal. This makes it possible to provide personalized guide information that is appropriate to the user's emotional state in real time.

[0398] "User input information" refers to location data and information related to areas of interest that users provide to the system.

[0399] An "interface" is a means of exchanging information between a user and a system.

[0400] "Location data" refers to information about the user's current location.

[0401] "Areas of interest" refers to information about categories and topics that the user is interested in.

[0402] An "information source" is a collection of searchable data that exists in databases or on the internet.

[0403] "Emotional information" refers to data about emotions that can be gleaned from a user's facial expressions and tone of voice.

[0404] "Emotional state" refers to the state of a user's emotions at a particular point in time.

[0405] A "generative model" is an algorithm used to generate content based on input information.

[0406] "Personalized guide content" refers to guidance information that is customized to the user's emotional state and interests.

[0407] A "user terminal" refers to a device that a user uses to receive information.

[0408] As an embodiment of this invention, a user terminal such as a smartphone or smart glasses is used. The user terminal is equipped with an interface that receives location data and information on areas of interest from the user. Furthermore, it is equipped with a camera and microphone to acquire emotional information by detecting subtle changes in the user's facial expressions and voice. This emotional information is analyzed using software such as OpenCV.

[0409] The server receives location data and emotional information transmitted from the user's device. Next, it retrieves relevant data from the information sources and uses an emotion engine to analyze the user's emotional state. Based on the result of the emotion analysis, it generates personalized guide content using cloud services such as Google Cloud Platform. The generated content is then delivered to the user's device in an appropriate format.

[0410] For example, if a tourist visiting a historical building is detected to be emotionally excited, the server will provide the user with detailed history and interesting anecdotes about that building. In this process, the generative AI model can generate customized content by receiving instructions such as, "Input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, include detailed history and anecdotes."

[0411] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0412] Step 1:

[0413] The user terminal receives the user's location data and information about their areas of interest. This information is input as GPS data provided by the user and the categories of interests they have selected. This allows the system to obtain the user's interest information, which forms the basis for subsequent processing.

[0414] Step 2:

[0415] The user terminal collects user facial expression data and voice data through its camera and microphone. This data is input as emotional information, and the user's emotional state is analyzed in real time using OpenCV. The output here is the analyzed emotional state data.

[0416] Step 3:

[0417] The server receives location data, information on areas of interest, and emotional information from the user's terminal. It then uses this data as input to query information sources and retrieve relevant data. This allows the server to prepare specific information to provide to the user.

[0418] Step 4:

[0419] The server uses an emotion engine to analyze the user's emotional state in detail based on the received emotion analysis data. This process helps determine the appropriate content direction based on the user's emotions. The output identifies the main emotional states expressed by the user.

[0420] Step 5:

[0421] The server uses a generative AI model to generate personalized guide content based on all input data. Prompts are used to provide specific instructions, such as, "Please input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, please include detailed history and anecdotes." The content generated during this process is then output.

[0422] Step 6:

[0423] The server sends the generated personalized guide content to the user's terminal. On the user's terminal, this content is played back in the appropriate format and provided to the user. The final output is personalized information in a format the user can receive.
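
Steps 3 to 6 can be tied together as in the sketch below. It is an outline under stated assumptions, not the claimed implementation: `lookup` and `generate` stand in for the information-source query and the generative AI model, both stubs return placeholder data, and all function names are hypothetical.

```python
def build_emotion_prompt(interests, emotion, facts):
    """Combine interests, emotional state, and retrieved data into one prompt."""
    lines = [
        f"User interests: {', '.join(interests)}.",
        f"Detected emotional state: {emotion}.",
        "Generate emotion-based tourist information.",
    ]
    if emotion == "excited":
        # Mirrors the example instruction given for excited users.
        lines.append("Include detailed history and anecdotes.")
    lines.append("Source material:")
    lines.extend(f"- {fact}" for fact in facts)
    return "\n".join(lines)


def run_pipeline(location, interests, emotion, lookup, generate):
    """Retrieve relevant data, build the prompt, and call the model."""
    facts = lookup(location, interests)
    return generate(build_emotion_prompt(interests, emotion, facts))


def fake_lookup(location, interests):
    # Stub for the information-source query; returns placeholder text.
    return ["Placeholder fact about the building."]


def echo_model(prompt):
    # Stub for the generative AI model; simply echoes the prompt.
    return "GUIDE:\n" + prompt


guide = run_pipeline((35.0, 135.0), ["history"], "excited", fake_lookup, echo_model)
```

Passing the lookup and the model in as callables keeps the flow testable without a database or a cloud service.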

[0424] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0425] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0426] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0427] [Third Embodiment]

[0428] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0429] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0430] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0431] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0432] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0433] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0434] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0435] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0436] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0437] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0438] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0439] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0440] This invention relates to an embodiment of a system that personalizes a user's travel experience and efficiently provides information. This system consists of a user terminal, a server, and a database.

[0441] First, the user's device displays an interface for collecting location and interest information from the user. The user can select their current GPS location and categories of interest, such as "history" or "art." Once the user has made their selections, the device sends this information to the server.

[0442] The server queries an internal database based on location and interest information received from the user. This database stores detailed information about each tourist destination and exhibit, and the server quickly searches and retrieves relevant information. The retrieved information is categorized according to what the user is likely to be interested in.

[0443] Subsequently, the server utilizes a generative model to generate customized guide content based on the acquired information. This generative model generates detailed explanations and stories tailored to the user's specific interests, smoothly structuring personalized content.

[0444] The generated guide content is sent to the user's device and can be received by the user in audio or text format. The user's device plays this content according to the user's language settings, so users can enjoy explanations and guides in their native language.

[0445] As a concrete example, consider a user who has a particular interest in history. If this user visits an old castle, the system will generate in-depth information about the castle's history, the context of its construction, and important historical events. This allows the user to gain detailed knowledge that would not be available through a typical visit, resulting in a more enriching sightseeing experience.

[0446] Thus, the system of the present invention can provide information tailored to the individual interests and backgrounds of users, making their experiences at tourist destinations more personal and enriching.

[0447] The following describes the processing flow.

[0448] Step 1:

[0449] The user's device requests permission from the user through a user interface to obtain their current location. This interface also displays an option to select interest categories related to their destination (e.g., history, art, architecture, etc.). The user enters and submits their location information and interest categories according to the on-screen instructions.

[0450] Step 2:

[0451] The user's device sends its acquired location information (latitude and longitude) and selected interest categories to the server. This data serves as the basis for generating uniquely customized guide content for each user.

[0452] Step 3:

[0453] The server queries its internal database based on the received location information and interest categories. This query extracts detailed information about tourist attractions and exhibits related to the user's current location, prioritizing information that matches the user's interests.

[0454] Step 4:

[0455] The server uses the acquired information to launch a generative model. This model is designed to generate content tailored to the user's interests, customizing and creating guide content that includes in-depth knowledge of historical context and art.

[0456] Step 5:

[0457] The server sends the generated guide content to the user interface on the user's terminal. This content is translated or converted to speech, taking into account the user's language settings, and provided in the form of audio guidance or text display.

[0458] Step 6:

[0459] The user's device analyzes the received guide content and plays it back in the format selected by the user (audio or text). The user views tourist attractions through the provided guide and enriches their visit experience with individually customized information.

[0460] This processing flow enables the system to provide users with timely information based on their individual needs and interests.
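The data sent in Step 2 can be illustrated with a minimal sketch. The class name `GuideRequest`, the `language` field, and the set of allowed categories are assumptions made for illustration; the text only states that latitude, longitude, and selected interest categories are sent to the server.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical category list; the text gives history, art, and architecture as examples.
ALLOWED_CATEGORIES = {"history", "art", "architecture"}


@dataclass(frozen=True)
class GuideRequest:
    """Location and interest data sent from the terminal to the server (Step 2)."""

    latitude: float
    longitude: float
    interests: Tuple[str, ...]
    language: str = "en"  # assumed field carrying the user's language setting

    def __post_init__(self) -> None:
        # Reject coordinates outside the valid latitude/longitude ranges.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")
        # Reject categories the interface does not offer.
        unknown = set(self.interests) - ALLOWED_CATEGORIES
        if unknown:
            raise ValueError(f"unknown interest categories: {sorted(unknown)}")


# Example values only; the coordinates are arbitrary sample data.
request = GuideRequest(latitude=35.0, longitude=135.0, interests=("history",), language="ja")
```

Validating on the terminal before transmission is one possible design choice; the same checks could equally be performed on the server.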

[0461] (Example 1)

[0462] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0463] Traditional tourist guide systems have struggled to provide detailed, real-time information based on individual user interests and location. Furthermore, they often lacked the ability to generate flexible guide content tailored to users' language and preferences, resulting in a lack of user satisfaction.

[0464] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0465] In this invention, the server includes means for acquiring location information and preference information via a user interface, means for acquiring relevant content from information resources, and means for generating personalized guidance content using a model for generation. This makes it possible to provide detailed and personalized guidance information in real time according to the user's individual interests and language settings.

[0466] A "user interface" is a means of interaction used to receive input information from the user and to obtain location information and preference information.

[0467] "Location information" refers to data that indicates the geographical location where a user's terminal or device is currently located.

[0468] "Preference information" refers to data related to categories and themes that users are interested in.

[0469] "Information resources" refer to databases and information systems that store and provide accessible information about tourist destinations and other related topics.

[0470] A "model for generation" refers to an algorithm or artificial intelligence model that creates personalized guidance content suitable for the user based on acquired information.

[0471] "Guidance content" refers to guide data generated by individually tailoring tourism-related information to the user's interests.

[0472] "User's device" refers to a terminal or device used by the user, which is hardware used to receive, display, or play guidance content.

[0473] A "prompt" is an instruction or question that is input into a model to elicit a specific output.

[0474] This invention is a system that provides information to users in a more personalized and efficient way, enhancing their tourism experience. The system consists of a user terminal, a server, and a database as an information resource.

[0475] The user selects their current GPS location and categories of interest, such as "history" or "art," through the user interface on their device. This information is collected by the user's device and sent to the server.

[0476] The server queries a database within its information resources based on location and preference information obtained through the user interface. These resources contain detailed information about tourist attractions and exhibits. The server uses this data to quickly search for and retrieve information relevant to the user's interests.

[0477] The acquired information is input into the generative AI model together with a prompt, and the model generates personalized guidance content. The prompt gives the generative AI model specific instructions. For example, a prompt such as "The user is interested in history. Please generate detailed historical explanations related to the following castles" can be used.
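One way such a prompt could be assembled from the retrieved information is sketched here. The function name `build_prompt` and the record fields `name` and `summary` are hypothetical; only the opening sentence of the prompt follows the example wording given in the text.

```python
from typing import Dict, List


def build_prompt(interest: str, records: List[Dict[str, str]]) -> str:
    """Combine the user's interest and the retrieved records into one prompt string."""
    lines = [
        f"The user is interested in {interest}.",
        "Please generate detailed explanations related to the following sites:",
    ]
    # Append one bullet per retrieved record (field names are assumptions).
    for record in records:
        lines.append(f"- {record['name']}: {record['summary']}")
    return "\n".join(lines)


# Sample record for illustration only.
prompt = build_prompt(
    "history",
    [{"name": "Old Castle", "summary": "Hilltop castle with a surviving main keep."}],
)
print(prompt)
```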

[0478] The generated guidance content is sent to the user's device. The user terminal displays the guidance content in audio or text format, allowing the user to receive interesting and detailed information in their native language. For example, if a user with a particular interest in history visits an old castle, this system can provide in-depth information about the castle's construction background and important historical events.

[0479] In this way, the present invention makes it possible to provide users with enriching experiences at tourist destinations that are tailored to their individual interests.

[0480] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0481] Step 1:

[0482] The user enters their current GPS location and categories of interest using the user interface on their device. The entered information is compiled into a data structure on the device and prepared for transmission to the server. For example, if the user selects the "History" category, that selection and their GPS location are combined.

[0483] Step 2:

[0484] The device transmits location and interest information obtained from the user to the server. The transmitted data is structured as packets and securely transferred to the server using security protocols.

[0485] Step 3:

[0486] The server searches the database for relevant information based on location and interest information received from the terminal. Here, it accesses the database using search queries such as SQL to retrieve information about tourist destinations and events relevant to the user.

[0487] Step 4:

[0488] The acquired information is set as input data for the generative AI model. The server generates a prompt instructing the AI model to generate customized guidance based on the user's interests. Specifically, a prompt such as "The user is interested in history. Please generate detailed historical explanations related to the following castles" is created.

[0489] Step 5:

[0490] The generative AI model generates personalized guidance content tailored to the user based on the prompt text. The generated guidance content is stored on the server in text format, ready for subsequent processing.

[0491] Step 6:

[0492] The server sends the generated guidance content to the user's terminal. The transmission is done in real time, ensuring the user can access the information immediately.

[0493] Step 7:

[0494] The user terminal provides the received guidance to the user in either audio or text format. If speech synthesis is used, the terminal plays the content aloud, allowing the user to receive the information aurally. If a native language setting is configured, the guidance content is automatically adapted to the selected language.

[0495] Through this series of steps, the system provides users with a fulfilling travel experience tailored to their individual interests.
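The database search in Step 3 can be sketched with an SQL query, as the text suggests. The table name `attractions`, its columns, and the bounding-box search with `radius_deg` are assumptions for illustration; the text does not specify a schema or a distance criterion. The sketch uses Python's built-in `sqlite3` module with an in-memory database.

```python
import sqlite3
from typing import List, Tuple


def find_attractions(
    conn: sqlite3.Connection,
    latitude: float,
    longitude: float,
    category: str,
    radius_deg: float = 0.01,  # assumed search window in degrees
) -> List[Tuple[str, str]]:
    """Return (name, description) rows near the location that match the category."""
    return conn.execute(
        "SELECT name, description FROM attractions "
        "WHERE category = ? "
        "AND latitude BETWEEN ? AND ? "
        "AND longitude BETWEEN ? AND ? "
        "ORDER BY name",
        (
            category,
            latitude - radius_deg,
            latitude + radius_deg,
            longitude - radius_deg,
            longitude + radius_deg,
        ),
    ).fetchall()


# Build a small in-memory database with sample rows (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attractions "
    "(name TEXT, category TEXT, latitude REAL, longitude REAL, description TEXT)"
)
conn.executemany(
    "INSERT INTO attractions VALUES (?, ?, ?, ?, ?)",
    [
        ("Old Castle Keep", "history", 35.000, 135.000, "Main keep of the castle."),
        ("Castle Gallery", "art", 35.001, 135.001, "Paintings displayed in the west wing."),
        ("Distant Shrine", "history", 36.000, 136.000, "A shrine far from the user."),
    ],
)

results = find_attractions(conn, 35.0005, 135.0005, "history")
print(results)
```

Parameterized placeholders (`?`) keep user-supplied values out of the SQL string itself, which is a common precaution when query inputs originate from a terminal.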

[0496] (Application Example 1)

[0497] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0498] Modern tourist destinations require information that caters to the diverse interests and languages of visitors. However, current systems can only provide uniform information, making it difficult to offer a personalized tourist experience for each visitor. Therefore, the challenge lies in providing more individualized guide content in real time, based on each visitor's interests and location.

[0499] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0500] In this invention, the server includes means for acquiring location information and interest information via a user interface that accepts user input information; means for acquiring relevant information from a database based on the acquired location information and interest information; means for generating customized guide content based on the relevant information using a generative model; means for collecting interest information from visitors via voice and processing location information in real time; and means for communicating with a server on the cloud and generating content suitable for visitors using a generative AI model. This makes it possible to provide real-time tourist guide content based on the individual interests and location information of each visitor.

[0501] A "user interface" is an interface used by users to input information into a system, and it is responsible for receiving location information and interest information.

[0502] "Location information" refers to information that indicates the current location of a user or visitor, and is obtained using technologies such as GPS.

[0503] "Interest information" refers to information indicating categories or themes that users or visitors are particularly interested in, and is used to customize tourist guide content.

[0504] A "database" is a source of information where data is stored, and it is accessed to retrieve relevant information based on the user's location and interests.

[0505] A "generative model" is an algorithm or program that generates customized guide content based on acquired information, creating content that is suitable for the user.

[0506] "Guide content" refers to a collection of tourist information provided to users or visitors, which can be delivered in audio or text format.

[0507] "Speech synthesis" is a technology that converts generated text information into speech, providing information to visitors audibly.

[0508] A "cloud server" is a remote server accessed via the internet, and it serves as infrastructure for managing and processing the large amounts of data that a system handles.

[0509] This invention is a system that provides personalized tourist experiences to users visiting tourist destinations. Based on the user's location and interests, this system generates and delivers customized guide content in real time to the user's device.

[0510] First, the user inputs their location and interests using a user interface via a smartphone or a dedicated tourist guide robot. This interface includes GPS to confirm the user's current location and options to select categories of interest.

[0511] User input information is sent to a cloud server via an internet connection. The server uses this information to query its internally stored database and collect relevant tourist information. This information includes details about the historical background and works of art related to the visited location.

[0512] The collected information is transformed into stories and explanations tailored to the user's interests using a generative AI model. The generated guide content is then delivered to the user's device either as text or as audio produced by speech synthesis technology. A Python library is used for speech synthesis, allowing users to receive the content in their preferred language.

[0513] For example, if a visitor is interested in "medieval history," they will be provided with a detailed, customized audio guide about the history of the castle they are visiting and the important events of that era. Visitors can also request information from the robot using prompts such as, "Tell me more about the medieval history of this castle."

[0514] This allows users to enjoy a fulfilling travel experience tailored to their interests, gaining deeper knowledge and making discoveries that differ from traditional, standardized tourist guides.

[0515] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0516] Step 1:

[0517] Users input their current location and categories of interest using a smartphone or a tourist guide robot. The user interface obtains the location from a GPS module and the interests from category selection options, so that input is easy for the user. As part of the input data processing, the location information is converted to latitude and longitude format in preparation for transmission to the server.

[0518] Step 2:

[0519] The device sends user input information to the cloud server. In this communication, location information and interest information are sent together to the server in JSON format. During data transmission, encoding is performed to prevent data corruption and ensure maximum communication efficiency.

[0520] Step 3:

[0521] The server queries an internal database based on the received location and interest information. The database stores tourist destination information, and the query extracts only the data relevant to the user's interests. As part of data processing, an SQL statement is constructed to fetch the necessary information.

[0522] Step 4:

[0523] The server uses a generative AI model to convert the acquired data into customized guide content. The acquired tourist information is fed to the AI model as input, and the model generates a story optimized for the user's interests. The data calculation in this step is natural language generation performed on the text data.

[0524] Step 5:

[0525] The server delivers the generated guide content to the user's terminal. Using a speech synthesis library, the text content is converted into audio format, and the user's terminal prepares to play it. An audio file is generated as data output and sent to the terminal via the network.

[0526] Step 6:

[0527] The user terminal saves received audio files locally and plays them back according to user input. In particular, the system plays information of the user's interest based on prompts. Users can start, stop, and skip playback, efficiently obtaining tourist information through the audio guide.
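The JSON transmission described in Step 2 of this flow can be sketched as follows. The key names `location`, `latitude`, `longitude`, and `interests` are assumptions; the text states only that location information and interest information are sent together in JSON format. UTF-8 is used here as one possible encoding for the step the text describes as preventing data corruption.

```python
import json
from typing import Any, Dict, Sequence


def encode_request(latitude: float, longitude: float, interests: Sequence[str]) -> bytes:
    """Terminal side: serialize location and interests into UTF-8 encoded JSON."""
    payload = {
        "location": {"latitude": latitude, "longitude": longitude},
        "interests": list(interests),
    }
    # ensure_ascii=False keeps non-ASCII category names readable in the payload.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def decode_request(body: bytes) -> Dict[str, Any]:
    """Server side: parse the payload and apply minimal sanity checks."""
    payload = json.loads(body.decode("utf-8"))
    location = payload["location"]
    for key in ("latitude", "longitude"):
        if not isinstance(location[key], (int, float)):
            raise ValueError(f"{key} must be numeric")
    if not payload["interests"]:
        raise ValueError("at least one interest category is required")
    return payload


body = encode_request(35.0, 135.0, ["medieval history"])
decoded = decode_request(body)
print(decoded)
```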

[0528] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0529] This invention provides a system that personalizes users' travel experiences and enables the provision of detailed information based on their emotional state. The system includes a user terminal, a server, a database, and an emotion engine.

[0530] The user terminal is equipped with an interface for collecting location and interest information. Through this, the user provides the location of their destination and selects categories of interest. The terminal also prepares to acquire emotional information using voice and camera, with the user's permission. Emotional information is obtained from subtle changes in the user's facial expressions and vocal intonation.

[0531] Next, this information is sent to the server in real time. The server queries the database based on the received location and interest information to retrieve information related to the visited location. In this process, the type and amount of data required are dynamically changed according to the user's emotional state.

[0532] The server then uses an emotion engine to analyze emotional information and identify the user's current emotional state. This emotional state plays a crucial role in customizing the guide content; for example, if the user is excited, a more detailed and engaging story is prepared, while if they are calm, a more relaxing explanation is provided.
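The mapping from emotional state to content style can be sketched as a simple lookup. The labels `excited` and `calm` come from the example above; the wording of each instruction, the fallback style, and the function name `style_instruction` are assumptions for illustration.

```python
# Style instructions keyed by emotion label (instruction wording is an assumption).
STYLE_BY_EMOTION = {
    "excited": "Tell a more detailed and engaging story.",
    "calm": "Give a more relaxing explanation.",
}

# Assumed fallback for emotion labels that have no dedicated style.
DEFAULT_STYLE = "Give a balanced explanation."


def style_instruction(emotion: str) -> str:
    """Return the content adjustment instruction for the given emotion label."""
    return STYLE_BY_EMOTION.get(emotion.strip().lower(), DEFAULT_STYLE)


print(style_instruction("Excited"))
print(style_instruction("bored"))
```

The returned instruction could be appended to the prompt given to the generative model, so that the same retrieved information yields differently styled guide content.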

[0533] The generative model generates customized guide content based on collected data. This guide content is optimized based on the user's emotional state, interests, and language settings, enabling a personalized experience. The generated content is delivered to the user's device for playback in the appropriate format.

[0534] For example, in the case of a user visiting an art museum, the system reads their emotions and, if it determines that they are interested in or amazed by the artwork, provides more detailed information about the historical background of the work and anecdotes about the artist. This allows the user to receive information that perfectly matches their emotions at that moment.

[0535] Thus, by incorporating an emotion engine, the present invention can achieve a higher level of personalization than traditional guide systems, enriching the visitor experience.

[0536] The following describes the processing flow.

[0537] Step 1:

[0538] The user device displays a user interface, prompting the user to input location information and categories of interest. The user can also use the device's GPS to automatically set their location. Furthermore, the user grants access to the device's microphone and camera, enabling the capture of emotional information.

[0539] Step 2:

[0540] The user's device transmits acquired location information, interest information, and emotional data (voice and facial expression data) to the server. Emotional data is collected from the user's facial expressions and tone of voice in real time.

[0541] Step 3:

[0542] The server queries its internal database based on the transmitted location and interest information. This query extracts relevant information about the visited location. This information includes the history of the facility, details about the exhibits, and anecdotes about the artists and buildings.

[0543] Step 4:

[0544] The server uses an emotion engine to analyze the transmitted emotion data. From the analyzed data, it determines the user's current emotional state and provides the generation model with content adjustment instructions that are most appropriate for that emotion.

[0545] Step 5:

[0546] The server activates a generative model based on the acquired relevant information and emotional state data to generate customized guide content. For example, if the user is in an excited state, it will include dynamic narratives and detailed information that match their emotions.

[0547] Step 6:

[0548] The generated guide content is optimized based on the user's language settings and delivered from the server to the user's device. The guide content is provided to the user in the selected format (audio or text).

[0549] Step 7:

[0550] The user's device will play the received guide content as audio or display it as text as needed, allowing the user to receive rich, individually customized information. This process ensures that the user has an optimal experience tailored to their emotions at that moment.

[0551] (Example 2)

[0552] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0553] Conventional guide systems standardize the content provided based on the user's location and interest information, and do not offer the detailed personalization that takes into account the emotional state of individual users. As a result, it has been difficult for users to receive information that matches their emotions and interests throughout their sightseeing experience. This invention aims to improve the quality of the experience by analyzing the user's emotional information and dynamically adjusting the information provided based on that information.

[0554] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0555] In this invention, the server includes means for acquiring geographical and interest information via a display device that accepts user input information, means for analyzing the user's emotional information, and means for inputting prompts to a generative AI model to generate customized guide content. This makes it possible to provide information according to the user's emotional state, thereby enabling a more personalized travel experience.

[0556] "User input information" refers to geographical and interest information provided by the user through the interface.

[0557] A "display device" refers to a device equipped with an interface for users to input information and receive information from a system.

[0558] "Geographic information" refers to information that indicates the user's current location or the location of their destination.

[0559] "Interest information" refers to information related to the user's specific interests or selected categories.

[0560] A "storage device" refers to a device that includes a database that holds large amounts of data and provides information as needed.

[0561] "Emotional information" refers to data that indicates the user's emotional state, and is obtained through the analysis of voice and facial expressions.

[0562] "Means for analyzing emotional information" refers to methods or devices for processing a user's emotional information and identifying their current emotional state.

[0563] A "generative AI model" refers to a technology that utilizes artificial intelligence to generate optimized content based on user input and emotional information.

[0564] A "prompt" refers to an instruction or input sentence that a generative AI model uses to generate specific content.

[0565] "Customized guide content" refers to guidance information provided in a format that is most suitable for the user's experience, based on their individual data.

[0566] "User device" refers to a device used by a user to receive information.

[0567] This invention is a system that provides more individually customized information to users when they visit tourist destinations. The system utilizes user input information and emotional information to generate customized guide content using a generative AI model. Specifically, it uses the following hardware and software.

[0568] System Configuration

[0569] 1. User terminal

[0570] Users input geographical information about tourist destinations (such as their current location) and categories of interest (such as art or history) using devices such as smartphones or tablets. These devices are equipped with display devices and have the functionality to retrieve information through their interfaces.

[0571] 2. Emotion acquisition device

[0572] The device utilizes its built-in camera and microphone to detect the user's facial expressions and voice tone, collecting emotional information. This information is obtained only with the user's permission.

[0573] 3. Servers and Databases

[0574] The server retrieves relevant information from the database based on the geographical and interest information entered by the user. The database contains detailed information related to tourist destinations.

[0575] 4. Emotion Analysis Engine

[0576] The server uses an emotion analysis engine to analyze the acquired emotion information and identify the user's current emotional state. This enables the provision of information tailored to the user's emotional state.

[0577] 5. Generative AI Models

[0578] The server generates customized guide content using a generative AI model based on the analyzed data. This model operates based on pre-configured prompts.

[0579] Specific examples and prompt statements

[0580] For example, when a user visits a museum and views artwork, if they feel intrigued or surprised, the system generates a guide that explains the historical background of the artwork and anecdotes about the artist in detail. Another example of a prompt that might be used is, "Based on the user's current emotional state, please generate detailed information about the artwork in the museum that interests them."

[0581] This makes it possible to provide information optimized according to the user's emotions and interests, thereby enriching the tourism experience.

[0582] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0583] Step 1:

[0584] The user inputs geographical information about the tourist destination and categories of interest through the terminal's display device. The terminal acquires this input information and collects the current location's latitude and longitude data as geographical information and the selected categories (e.g., culture, nature, history, etc.) as interest information. In this step, the terminal collects information through the input interface.

[0585] Step 2:

[0586] The device uses voice recognition and camera functions to acquire the user's emotional information. This information includes emotional data derived from voice intonation and facial expressions. The acquired emotional information is transmitted to the server in real time by the device. In this process, the input is emotional information from facial expressions and voice, and the output is emotional data.

[0587] Step 3:

[0588] The server receives the geographical information, interest information, and emotional information sent from the terminal. The server uses this information to query its storage device (database) and retrieve information related to tourist destinations (e.g., details of exhibits, facility descriptions, etc.). The data manipulation in the database consists of extracting records that match the query criteria. The input is location information and interest information, and the output is related information.

[0589] Step 4:

[0590] The server uses an emotion analysis engine to analyze the received emotion information and identify the user's emotional state. This analysis determines the type of emotion (joy, surprise, calmness, etc.), and the result is used for subsequent processing. The input is emotion data, and the output is an evaluation result of the emotional state.

[0591] Step 5:

[0592] The server inputs a prompt to the AI model based on the analysis results to generate customized guide content. A prompt such as "Create information tailored to the current emotional state" is used to create content optimized for the user experience. The input consists of the emotional state and the prompt, while the output is the customized content.

[0593] Step 6:

[0594] The generated guide content is delivered from the server to the user's device. The device receives this content and provides it to the user in the form of audio guides or text displays. The input is the generated content, and the output is the information the user receives visually or aurally.
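Steps 4 and 5 of this flow can be sketched as a small pipeline under stated assumptions. The score-averaging rule in `identify_emotion` is a hypothetical stand-in for the emotion analysis engine, whose method the text does not specify, and `echo_model` is a placeholder for the generative AI model. Only the emotion labels and the sentence "Create information tailored to the current emotional state" are taken from the text.

```python
from typing import Callable, Dict, List


def identify_emotion(voice_scores: Dict[str, float], face_scores: Dict[str, float]) -> str:
    """Average per-label scores from voice and face and return the top label (assumed rule)."""
    labels = sorted(set(voice_scores) | set(face_scores))
    combined = {
        label: (voice_scores.get(label, 0.0) + face_scores.get(label, 0.0)) / 2
        for label in labels
    }
    return max(labels, key=lambda label: combined[label])


def build_emotion_prompt(emotion: str, records: List[str]) -> str:
    """Build the Step 5 prompt from the emotional state and the retrieved records."""
    facts = "\n".join(f"- {record}" for record in records)
    return (
        f"The user's current emotional state is: {emotion}.\n"
        "Create information tailored to the current emotional state.\n"
        f"Relevant information:\n{facts}"
    )


def run_pipeline(
    records: List[str],
    voice_scores: Dict[str, float],
    face_scores: Dict[str, float],
    generate: Callable[[str], str],
) -> str:
    """Step 4 (emotion analysis) followed by Step 5 (prompting the model)."""
    emotion = identify_emotion(voice_scores, face_scores)
    return generate(build_emotion_prompt(emotion, records))


def echo_model(prompt: str) -> str:
    """Placeholder for the generative AI model: returns the prompt with a marker line."""
    return "GUIDE DRAFT\n" + prompt


# Sample scores and record for illustration only.
voice = {"joy": 0.2, "surprise": 0.7, "calmness": 0.1}
face = {"joy": 0.3, "surprise": 0.5, "calmness": 0.2}
content = run_pipeline(["Exhibit A is a landscape painting in the main hall."], voice, face, echo_model)
print(content)
```

Passing the model in as a callable keeps the sketch independent of any particular generative AI service; a real deployment would substitute its own client for `echo_model`.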

[0595] (Application Example 2)

[0596] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0597] Traditional navigation systems provide uniform information without considering the user's emotional state, making it difficult to maximize the individual user experience. Furthermore, they lack mechanisms to dynamically optimize information based on the user's real-time emotions, meaning the information users receive may not align with their expectations or interests.

[0598] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0599] In this invention, the server includes means for acquiring location data and information on areas of interest via an interface for receiving user input information; means for acquiring relevant data from information sources based on the acquired location data and information on areas of interest; means for analyzing the user's emotional state based on the acquired emotional information; means for generating personalized guide content based on the relevant data using a generative model with the results of the emotional analysis; and means for transmitting the generated guide content to the user terminal. This makes it possible to provide personalized guide information that is appropriate to the user's emotional state in real time.

[0600] "User input information" refers to location data and information related to areas of interest that users provide to the system.

[0601] An "interface" is a means of exchanging information between a user and a system.

[0602] "Location data" refers to information about the user's current location.

[0603] "Areas of interest" refers to information about categories and topics that the user is interested in.

[0604] An "information source" is a collection of searchable data that exists in databases or on the internet.

[0605] "Emotional information" refers to data about emotions that can be gleaned from a user's facial expressions and tone of voice.

[0606] "Emotional state" refers to the state of a user's emotions at a particular point in time.

[0607] A "generative model" is an algorithm used to generate content based on input information.

[0608] "Personalized guide content" refers to guidance information that is customized to the user's emotional state and interests.

[0609] A "user terminal" refers to a device that a user uses to receive information.

[0610] As an embodiment of this invention, a user terminal such as a smartphone or smart glasses is used. The user terminal is equipped with an interface that receives location data and information on areas of interest from the user. Furthermore, it is equipped with a camera and microphone to acquire emotional information that detects subtle changes in the user's facial expressions and voice. This emotional information is analyzed using software such as OpenCV.

[0611] The server receives the location data and emotional information transmitted from the user's terminal. Next, it retrieves relevant data from the information sources and uses an emotion engine to analyze the user's emotional state. Based on the results of this emotional analysis, it generates personalized guide content using a cloud service such as Google Cloud Platform. The generated content is then delivered to the user's terminal in an appropriate format.

[0612] For example, if a tourist visiting a historical building is detected to be emotionally excited, the server will provide the user with detailed history and interesting anecdotes about that building. In this process, the generative AI model can generate customized content by receiving instructions such as, "Input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, include detailed history and anecdotes."

[0613] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0614] Step 1:

[0615] The user terminal receives the user's location data and information about their areas of interest. This information is input as GPS data provided by the user and the categories of interests they have selected. This allows the system to obtain the user's interest information, which forms the basis for subsequent processing.

[0616] Step 2:

[0617] The user terminal collects user facial expression data and voice data through its camera and microphone. This data is input as emotional information, and the user's emotional state is analyzed in real time using OpenCV. The output here is the analyzed emotional state data.

[0618] Step 3:

[0619] The server receives location data, information on areas of interest, and sentiment information from the user's terminal. It then uses this data as input to query information sources and retrieve relevant data. This allows the server to prepare specific information to provide to the user.

[0620] Step 4:

[0621] The server uses an emotion engine to analyze the user's emotional state in detail based on the received emotion analysis data. This process helps determine the appropriate content direction based on the user's emotions. The output identifies the main emotional states expressed by the user.
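As a rough sketch of how Step 4 might identify the main emotional state, the following assumes that the terminal's analysis produces a score per emotion label. The score format, the `threshold` value, and the `"neutral"` fallback are assumptions for illustration and do not appear in this disclosure.

```python
def dominant_emotion(scores: dict[str, float], threshold: float = 0.5) -> str:
    """Return the highest-scoring emotion label, or "neutral" if none is strong enough.

    `scores` maps an emotion label to a confidence in [0, 1] (assumed format).
    """
    if not scores:
        return "neutral"
    label, score = max(scores.items(), key=lambda kv: kv[1])
    return label if score >= threshold else "neutral"


# Hypothetical usage with scores derived from facial expression and voice data.
state = dominant_emotion({"joy": 0.7, "surprise": 0.2, "calmness": 0.1})
```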

[0622] Step 5:

[0623] The server uses a generative AI model to generate personalized guide content based on all input data. Prompts are used to provide specific instructions, such as, "Please input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, please include detailed history and anecdotes." The content generated during this process is then output.

[0624] Step 6:

[0625] The server sends the generated personalized guide content to the user's terminal. On the user's terminal, this content is played back in the appropriate format and provided to the user. The final output is personalized information in a format the user can receive.

[0626] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0627] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0628] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0629] [Fourth Embodiment]

[0630] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0631] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0632] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0633] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0634] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0635] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0636] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0637] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0638] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0639] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0640] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0641] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0642] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0643] This invention relates to an embodiment of a system that personalizes a user's travel experience and efficiently provides information. This system consists of a user terminal, a server, and a database.

[0644] First, the user's device displays an interface for collecting location and interest information from the user. The user can select their current GPS location and categories of interest, such as "history" or "art." Once the user has made their selections, the device sends this information to the server.

[0645] The server queries an internal database based on location and interest information received from the user. This database stores detailed information about each tourist destination and exhibit, and the server quickly searches and retrieves relevant information. The retrieved information is categorized according to what the user is likely to be interested in.

[0646] Subsequently, the server utilizes a generative model to generate customized guide content based on the acquired information. This generative model generates detailed explanations and stories tailored to the user's specific interests, smoothly structuring personalized content.

[0647] The generated guide content is sent to the user's device and can be received by the user in audio or text format. The user's device plays this content according to the user's language settings, so users can enjoy explanations and guides in their native language.

[0648] As a concrete example, consider a user who has a particular interest in history. If this user visits an old castle, the system will generate in-depth information about the castle's history, the context of its construction, and important historical events. This allows the user to gain detailed knowledge that would not be available through a typical visit, resulting in a more enriching sightseeing experience.

[0649] Thus, the system of the present invention can provide information tailored to the individual interests and backgrounds of users, making their experiences at tourist destinations more personal and enriching.

[0650] The following describes the processing flow.

[0651] Step 1:

[0652] The user's device requests permission from the user through a user interface to obtain their current location. This interface also displays an option to select interest categories related to their destination (e.g., history, art, architecture, etc.). The user enters and submits their location information and interest categories according to the on-screen instructions.

[0653] Step 2:

[0654] The user's device sends its acquired location information (latitude and longitude) and selected interest categories to the server. This data serves as the basis for generating uniquely customized guide content for each user.

[0655] Step 3:

[0656] The server queries its internal database based on the received location information and interest categories. This query extracts detailed information about tourist attractions and exhibits related to the user's current location, prioritizing information that matches the user's interests.
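The prioritization described in Step 3 could be sketched as follows. This is only one possible reading: attractions within a search radius are kept, those matching an interest category are placed first, and ties are ordered by great-circle distance. The `Attraction` record, `rank_attractions`, and the 2 km radius are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Attraction:
    name: str
    lat: float
    lon: float
    category: str


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def rank_attractions(items, lat, lon, interests, radius_km=2.0):
    """Keep attractions within radius_km; interest matches first, then nearest first."""
    nearby = [(haversine_km(lat, lon, a.lat, a.lon), a) for a in items]
    nearby = [(d, a) for d, a in nearby if d <= radius_km]
    nearby.sort(key=lambda pair: (pair[1].category not in interests, pair[0]))
    return [a for _, a in nearby]


# Hypothetical data: coordinates and names are illustrative only.
items = [
    Attraction("Gallery", 35.0005, 135.0, "art"),
    Attraction("Castle keep", 35.001, 135.0, "history"),
    Attraction("Far museum", 36.0, 135.0, "history"),
]
ranked = rank_attractions(items, 35.0, 135.0, {"history"})
```

Here the gallery is closer, but the castle keep is listed first because it matches the selected interest category; the distant museum is dropped.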

[0657] Step 4:

[0658] The server uses the acquired information to launch a generative model. This model is designed to generate content tailored to the user's interests, customizing and creating guide content that includes in-depth knowledge of historical context and art.

[0659] Step 5:

[0660] The server sends the generated guide content to the user interface on the user's terminal. This content is translated or converted to speech, taking into account the user's language settings, and provided in the form of audio guidance or text display.

[0661] Step 6:

[0662] The user's device analyzes the received guide content and plays it back in the format selected by the user (audio or text). The user views tourist attractions through the provided guide and enriches their visit experience with individually customized information.

[0663] This processing flow enables the system to provide users with timely information based on their individual needs and interests.

[0664] (Example 1)

[0665] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0666] Traditional tourist guide systems have struggled to provide detailed, real-time information based on individual user interests and location. Furthermore, they often lacked the ability to generate flexible guide content tailored to users' language and preferences, resulting in a lack of user satisfaction.

[0667] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0668] In this invention, the server includes means for acquiring location information and preference information via a user interface, means for acquiring relevant content from information resources, and means for generating personalized guidance content using a model for generation. This makes it possible to provide detailed and personalized guidance information in real time according to the user's individual interests and language settings.

[0669] A "user interface" is a means of interaction used to receive input information from the user and to obtain location information and preference information.

[0670] "Location information" refers to data that indicates the geographical location where a user's terminal or device is currently located.

[0671] "Preference information" refers to data related to categories and themes that users are interested in.

[0672] "Information resources" refer to databases and information systems that store and provide accessible information about tourist destinations and other related topics.

[0673] A "model for generation" refers to an algorithm or artificial intelligence model that creates personalized guidance content suitable for the user based on acquired information.

[0674] "Guidance content" refers to guide data generated by individually tailoring tourism-related information to the user's interests.

[0675] "User's device" refers to a terminal or device used by the user, which is hardware used to receive, display, or play guidance content.

[0676] A "prompt" is an instruction or question that is input into a model to elicit a specific output.

[0677] This invention is a system that provides information to users in a more personalized and efficient way, enhancing their tourism experience. The system consists of a user terminal, a server, and a database as an information resource.

[0678] The user selects their current GPS location and categories of interest, such as "history" or "art," through the user interface on their device. This information is collected by the user's device and sent to the server.

[0679] The server queries a database within its information resources based on location and preference information obtained through the user interface. These resources contain detailed information about tourist attractions and exhibits. The server uses this data to quickly search for and retrieve information relevant to the user's interests.

[0680] The acquired information is input into the model for generation, and this generative AI model generates personalized guidance content. A prompt is used to give specific instructions to the generative AI model. For example, a prompt such as "The user is interested in history. Please generate detailed historical explanations related to the following castles" can be used.

[0681] The generated guidance content is sent to the user's device. The user terminal displays the guidance content in audio or text format, allowing the user to receive interesting and detailed information in their native language. For example, if a user with a particular interest in history visits an old castle, this system can provide in-depth information about the castle's construction background and important historical events.

[0682] In this way, the present invention makes it possible to provide users with enriching experiences at tourist destinations that are tailored to their individual interests.

[0683] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0684] Step 1:

[0685] The user enters their current GPS location and categories of interest using the user interface on their device. The entered information is compiled into a data structure on the device and prepared for transmission to the server. For example, if the user selects the "History" category, that selection and their GPS location are combined.

[0686] Step 2:

[0687] The device transmits location and interest information obtained from the user to the server. The transmitted data is structured as packets and securely transferred to the server using security protocols.

[0688] Step 3:

[0689] The server searches the database for relevant information based on location and interest information received from the terminal. Here, it accesses the database using search queries such as SQL to retrieve information about tourist destinations and events relevant to the user.
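The SQL-based search in Step 3 might look like the sketch below, which uses Python's standard `sqlite3` module with an in-memory database. The table name `attractions`, its columns, the bounding-box width `box_deg`, and the sample rows are assumptions; the disclosure does not describe the schema.

```python
import sqlite3


def find_related(conn, lat, lon, category, box_deg=0.01):
    """Fetch attractions of one category inside a small latitude/longitude box."""
    sql = (
        "SELECT name, description FROM attractions "
        "WHERE category = ? "
        "AND lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? "
        "ORDER BY name"
    )
    # Parameters are bound with placeholders rather than string formatting.
    params = (category, lat - box_deg, lat + box_deg, lon - box_deg, lon + box_deg)
    return conn.execute(sql, params).fetchall()


# Hypothetical schema and rows, created in memory for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attractions "
    "(name TEXT, description TEXT, category TEXT, lat REAL, lon REAL)"
)
conn.executemany(
    "INSERT INTO attractions VALUES (?, ?, ?, ?, ?)",
    [
        ("Old castle", "Keep built on a hilltop", "history", 35.001, 135.001),
        ("Modern gallery", "Contemporary paintings", "art", 35.002, 135.000),
        ("Distant shrine", "Shrine in another city", "history", 36.500, 136.000),
    ],
)
rows = find_related(conn, 35.0, 135.0, "history")
```

Binding the user-supplied category and coordinates as parameters keeps the query safe from injection, which matters because these values originate on the terminal.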

[0690] Step 4:

[0691] The acquired information is set as input data for the generative AI model. The server generates a prompt instructing the AI model to generate customized guidance based on the user's interests. Specifically, a prompt such as "The user is interested in history. Please generate detailed historical explanations related to the following castles" is created.

[0692] Step 5:

[0693] The generative AI model generates personalized guidance content tailored to the user based on the prompt text. The generated guidance content is stored on the server in text format, ready for subsequent processing.

[0694] Step 6:

[0695] The server sends the generated guidance content to the user's terminal. The transmission is done in real time, ensuring the user can access the information immediately.

[0696] Step 7:

[0697] The user terminal provides the received guidance to the user in either audio or text format. If speech synthesis is used, the terminal plays the content aloud, allowing the user to receive the information aurally. If a native language setting is configured, the guidance content is automatically adapted to the selected language.

[0698] Through this series of steps, the system provides users with a fulfilling travel experience tailored to their individual interests.

[0699] (Application Example 1)

[0700] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0701] Modern tourist destinations require information that caters to the diverse interests and languages of visitors. However, current systems can only provide uniform information, making it difficult to offer a personalized tourist experience for each visitor. Therefore, the challenge lies in providing more individualized guide content in real time, based on each visitor's interests and location.

[0702] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0703] In this invention, the server includes means for acquiring location information and interest information via a user interface that accepts user input information; means for acquiring relevant information from a database based on the acquired location information and interest information; means for generating customized guide content based on the relevant information using a generative model; means for collecting interest information from visitors via voice and processing location information in real time; and means for communicating with a server on the cloud and generating content suitable for visitors using a generative AI model. This makes it possible to provide real-time tourist guide content based on the individual interests and location information of each visitor.

[0704] A "user interface" is an interface used by users to input information into a system, and it is responsible for receiving location information and interest information.

[0705] "Location information" refers to information that indicates the current location of a user or visitor, and is obtained using technologies such as GPS.

[0706] "Interest information" refers to information indicating categories or themes that users or visitors are particularly interested in, and is used to customize tourist guide content.

[0707] A "database" is a source of information where data is stored, and it is accessed to retrieve relevant information based on the user's location and interests.

[0708] A "generative model" is an algorithm or program that generates customized guide content based on acquired information, creating content that is suitable for the user.

[0709] "Guide content" refers to a collection of tourist information provided to users or visitors, which can be delivered in audio or text format.

[0710] "Speech synthesis" is a technology that converts generated text information into speech, providing information to visitors audibly.

[0711] A "cloud server" is a remote server accessed via the internet, and it serves as infrastructure for managing and processing the large amounts of data that a system handles.

[0712] This invention is a system that provides personalized tourist experiences to users visiting tourist destinations. Based on the user's location and interests, this system generates and delivers customized guide content in real time to the user's device.

[0713] First, the user inputs their location and interests using a user interface via a smartphone or a dedicated tourist guide robot. This interface includes GPS to confirm the user's current location and options to select categories of interest.

[0714] User input information is sent to a cloud server via the internet connection. The server uses this information to issue queries to its internally stored database and collect relevant tourist information. This information includes details about the historical background and works of art related to the visited location.

[0715] The collected information is transformed into stories and explanations tailored to the user's interests using a generative AI model. The generated guide content is then delivered to the user's device as text, or converted to audio using speech synthesis technology. A Python library is used for speech synthesis, allowing users to receive the content in their preferred language.

[0716] For example, if a visitor is interested in "medieval history," they will be provided with a detailed, customized audio guide about the history of the castle they are visiting and the important events of that era. Visitors can also request information from the robot using prompts such as, "Tell me more about the medieval history of this castle."

[0717] This allows users to enjoy a fulfilling travel experience tailored to their interests, gaining deeper knowledge and making discoveries that differ from traditional, standardized tourist guides.

[0718] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0719] Step 1:

[0720] Users input their current location and categories of interest using a smartphone or a tourist guide robot. The user interface obtains this input through a GPS module and category selection options, which keep the operation simple. As part of the input data processing, the location information is converted to latitude and longitude format in preparation for transmission to the server.

[0721] Step 2:

[0722] The device sends user input information to the cloud server. In this communication, location information and interest information are sent together to the server in JSON format. During data transmission, encoding is performed to prevent data corruption and ensure maximum communication efficiency.
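A minimal sketch of the JSON transmission in Step 2 is shown below. The field names `location`, `lat`, `lon`, and `interests` are hypothetical, and UTF-8 is assumed as the encoding mentioned in the text; the disclosure does not fix either choice.

```python
import json


def encode_request(lat: float, lon: float, interests: list[str]) -> bytes:
    """Serialize location and interest information to UTF-8 encoded JSON."""
    payload = {"location": {"lat": lat, "lon": lon}, "interests": interests}
    # ensure_ascii=False keeps non-ASCII category names readable in the payload.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def decode_request(body: bytes) -> dict:
    """Parse the request body on the server side."""
    return json.loads(body.decode("utf-8"))


# Hypothetical round trip between terminal and server.
body = encode_request(35.0, 135.0, ["history", "美術"])
request = decode_request(body)
```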

[0723] Step 3:

[0724] The server queries an internal database based on the received location and interest information. The database stores tourist destination information, and the query extracts only the data relevant to the user's interests. As part of data processing, an SQL statement is constructed to fetch the necessary information.

[0725] Step 4:

[0726] The server uses a generative AI model to convert the acquired data into customized guide content. The acquired tourist information is fed to the AI model as input, and the model generates a story optimized for the user's interests. The data computation at this step is natural language generation on the text data.

[0727] Step 5:

[0728] The server delivers the generated guide content to the user's terminal. Using a speech synthesis library, the text content is converted into audio format, and the user's terminal prepares to play it. An audio file is generated as data output and sent to the terminal via the network.

[0729] Step 6:

[0730] The user terminal saves received audio files locally and plays them back according to user input. In particular, the system plays information of the user's interest based on prompts. Users can start, stop, and skip playback, efficiently obtaining tourist information through the audio guide.
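The start, stop, and skip operations in Step 6 can be modeled as a small state holder. The sketch below tracks playback state only and does not produce sound; the class name `AudioGuidePlayer` and the file names are hypothetical.

```python
from typing import Optional


class AudioGuidePlayer:
    """Tracks which locally saved audio file is current and whether it is playing."""

    def __init__(self, tracks: list[str]) -> None:
        self.tracks = tracks
        self.index = 0
        self.playing = False

    @property
    def current(self) -> Optional[str]:
        return self.tracks[self.index] if self.tracks else None

    def start(self) -> None:
        # Playback can only start if at least one file has been saved.
        self.playing = bool(self.tracks)

    def stop(self) -> None:
        self.playing = False

    def skip(self) -> None:
        # Advance to the next file, or stop when the last file is skipped.
        if self.index < len(self.tracks) - 1:
            self.index += 1
        else:
            self.playing = False


# Hypothetical usage with two saved audio files.
player = AudioGuidePlayer(["intro.wav", "keep.wav"])
player.start()
first = player.current
player.skip()
second = player.current
```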

[0731] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0732] This invention provides a system that personalizes users' travel experiences and enables the provision of detailed information based on their emotional state. The system includes a user terminal, a server, a database, and an emotion engine.

[0733] The user terminal is equipped with an interface for collecting location and interest information. Through this, the user provides the location of their destination and selects categories of interest. The terminal also prepares to acquire emotional information using voice and camera, with the user's permission. Emotional information is obtained from subtle changes in the user's facial expressions and vocal intonation.

[0734] Next, this information is sent to the server in real time. The server queries the database based on the received location and interest information to retrieve information related to the visited location. In this process, the type and amount of data required are dynamically changed according to the user's emotional state.

[0735] The server then uses an emotion engine to analyze emotional information and identify the user's current emotional state. This emotional state plays a crucial role in customizing the guide content; for example, if the user is excited, a more detailed and engaging story is prepared, while if they are calm, a more relaxing explanation is provided.
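One way to picture this customization is a lookup from emotional state to a content style. In the sketch below, the tone strings follow the two examples in the text, while `ContentStyle`, the `max_records` values, and the default style are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentStyle:
    tone: str          # instruction passed on to the generative model
    max_records: int   # how many retrieved records to include (assumed knob)


# The two entries mirror the excited and calm cases described in the text.
STYLES = {
    "excited": ContentStyle(tone="detailed and engaging story", max_records=5),
    "calm": ContentStyle(tone="relaxing explanation", max_records=2),
}
DEFAULT_STYLE = ContentStyle(tone="neutral explanation", max_records=3)


def style_for(emotional_state: str) -> ContentStyle:
    """Pick a content style for the emotional state, falling back to a default."""
    return STYLES.get(emotional_state.lower(), DEFAULT_STYLE)


style = style_for("Excited")
```

Varying `max_records` is one possible way to change the amount of retrieved data according to the user's emotional state.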

[0736] The generative model generates customized guide content based on collected data. This guide content is optimized based on the user's emotional state, interests, and language settings, enabling a personalized experience. The generated content is delivered to the user's device for playback in the appropriate format.

[0737] For example, in the case of a user visiting an art museum, the system reads their emotions and, if it determines that they are interested in or amazed by the artwork, provides more detailed information about the historical background of the work and anecdotes about the artist. This allows the user to receive information that perfectly matches their emotions at that moment.

[0738] Thus, by combining an emotional engine, the present invention can achieve a higher level of personalization than traditional guide systems, enriching the visitor experience.

[0739] The following describes the processing flow.

[0740] Step 1:

[0741] The user device displays a user interface, prompting the user to input location information and categories of interest. The user can also use the device's GPS to automatically set their location. Furthermore, the user grants access to the device's microphone and camera, enabling the capture of emotional information.

[0742] Step 2:

[0743] The user's device transmits acquired location information, interest information, and emotional data (voice and facial expression data) to the server. Emotional data is collected from the user's facial expressions and tone of voice in real time.

[0744] Step 3:

[0745] The server queries its internal database based on the transmitted location and interest information. This query extracts relevant information about the visited location. This information includes the history of the facility, details about the exhibits, and anecdotes about the artists and buildings.

[0746] Step 4:

[0747] The server uses an emotion engine to analyze the transmitted emotional data. From the analyzed data, it determines the user's current emotional state and provides the generative model with the content adjustment instructions most appropriate for that emotion.

[0748] Step 5:

[0749] The server activates a generative model based on the acquired relevant information and emotional state data to generate customized guide content. For example, if the user is in an excited state, it will include dynamic narratives and detailed information that match their emotions.

[0750] Step 6:

[0751] The generated guide content is optimized based on the user's language settings and delivered from the server to the user's device. The guide content is provided to the user in the selected format (audio or text).

[0752] Step 7:

[0753] The user's device will play the received guide content as audio or display it as text as needed, allowing the user to receive rich, individually customized information. This process ensures that the user has an optimal experience tailored to their emotions at that moment.
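
Steps 1 to 7 above can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the in-memory `DATABASE`, the `arousal` feature with its 0.6 threshold, and the string-building `generate_guide` stub that stands in for the generative model are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class GuideRequest:
    # Steps 1-2: data collected on the user's device and sent to the server.
    location: str
    interests: list[str]
    emotion_signal: dict[str, float]  # hypothetical features from voice/face
    language: str = "en"
    output_format: str = "text"       # "audio" or "text"


# Toy stand-in for the server's internal database (assumed contents).
DATABASE = [
    {"location": "city museum", "category": "art",
     "text": "The main hall holds 19th-century landscapes."},
    {"location": "city museum", "category": "history",
     "text": "The building opened as a town hall."},
    {"location": "old castle", "category": "history",
     "text": "The keep was rebuilt after a fire."},
]


def query_database(location: str, interests: list[str]) -> list[str]:
    """Step 3: extract records matching the location and interests."""
    return [r["text"] for r in DATABASE
            if r["location"] == location and r["category"] in interests]


def analyze_emotion(signal: dict[str, float]) -> str:
    """Step 4: stand-in emotion engine using an assumed arousal threshold."""
    return "excited" if signal.get("arousal", 0.0) >= 0.6 else "calm"


def generate_guide(facts: list[str], emotion: str, language: str) -> str:
    """Step 5: stub for the generative model; a real system would call one."""
    style = ("detailed, dynamic narrative" if emotion == "excited"
             else "short, relaxed explanation")
    return f"[{language}] ({style}) " + " ".join(facts)


def handle_request(req: GuideRequest) -> dict[str, str]:
    """Steps 3-6 on the server; the device plays or displays the result."""
    facts = query_database(req.location, req.interests)
    emotion = analyze_emotion(req.emotion_signal)
    content = generate_guide(facts, emotion, req.language)
    return {"format": req.output_format, "emotion": emotion, "content": content}
```

For instance, `handle_request(GuideRequest("city museum", ["art"], {"arousal": 0.9}))` takes the "excited" branch and returns the art record wrapped in the detailed narrative style.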

[0754] (Example 2)

[0755] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0756] Conventional guide systems standardize the content provided based on the user's location and interest information, and do not offer the detailed personalization that takes into account the emotional state of individual users. As a result, it has been difficult for users to receive information that matches their emotions and interests throughout their sightseeing experience. This invention aims to improve the quality of the experience by analyzing the user's emotional information and dynamically adjusting the information provided based on that information.

[0757] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0758] In this invention, the server includes means for acquiring geographical and interest information via a display device that accepts user input information, means for analyzing the user's emotional information, and means for inputting prompts to a generative AI model to generate customized guide content. This makes it possible to provide information according to the user's emotional state, thereby enabling a more personalized travel experience.

[0759] "User input information" refers to geographical and interest information provided by the user through the interface.

[0760] A "display device" refers to a device equipped with an interface for users to input information and receive information from a system.

[0761] "Geographic information" refers to information that indicates the user's current location or the location of their destination.

[0762] "Interest information" refers to information related to the user's specific interests or selected categories.

[0763] A "storage device" refers to a device that includes a database that holds large amounts of data and provides information as needed.

[0764] "Emotional information" refers to data that indicates the user's emotional state, and is obtained through the analysis of voice and facial expressions.

[0765] "Means for analyzing emotional information" refers to methods or devices for processing a user's emotional information and identifying their current emotional state.

[0766] A "generative AI model" refers to a technology that utilizes artificial intelligence to generate optimized content based on user input and emotional information.

[0767] A "prompt" refers to an instruction or input sentence that a generative AI model uses to generate specific content.

[0768] "Customized guide content" refers to guidance information provided in a format that is most suitable for the user's experience, based on their individual data.

[0769] "User device" refers to a device used by a user to receive information.

[0770] This invention is a system that provides more individually customized information to users when they visit tourist destinations. The system utilizes user input information and emotional information to generate customized guide content using a generative AI model. Specifically, it uses the following hardware and software.

[0771] System Configuration

[0772] 1. User terminal

[0773] Users input geographical information about tourist destinations (such as their current location) and categories of interest (such as art or history) using devices such as smartphones or tablets. These devices are equipped with display devices and have the functionality to retrieve information through their interfaces.

[0774] 2. Emotion acquisition device

[0775] The device utilizes its built-in camera and microphone to detect the user's facial expressions and voice tone, collecting emotional information. This information is obtained only with the user's permission.

[0776] 3. Servers and Databases

[0777] The server retrieves relevant information from the database based on the geographical and interest information entered by the user. The database contains detailed information related to tourist destinations.

[0778] 4. Emotion Analysis Engine

[0779] The server uses an emotion analysis engine to analyze the acquired emotion information and identify the user's current emotional state. This enables the provision of information tailored to the user's emotional state.

[0780] 5. Generative AI Models

[0781] The server generates customized guide content using a generative AI model based on the analyzed data. This model operates based on pre-configured prompts.

[0782] Specific examples and prompt statements

[0783] For example, when a user visits a museum and views artwork, if they feel intrigued or surprised, the system generates a guide that explains the historical background of the artwork and anecdotes about the artist in detail. Another example of a prompt that might be used is, "Based on the user's current emotional state, please generate detailed information about the artwork in the museum that interests them."
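
One way such a prompt could be assembled from the analyzed data is sketched below. The function name `build_prompt`, its parameters, and the surrounding template lines are assumptions for illustration; only the quoted instruction sentence comes from the text above.

```python
def build_prompt(emotional_state: str, interest: str,
                 related_info: list[str], language: str = "English") -> str:
    """Combine the emotional state, interest, and retrieved facts into a prompt."""
    facts = "\n".join(f"- {item}" for item in related_info)
    return (
        f"The user's current emotional state is: {emotional_state}.\n"
        f"The user's selected category of interest is: {interest}.\n"
        f"Reference information retrieved from the database:\n{facts}\n"
        "Based on the user's current emotional state, please generate detailed "
        "information about the artwork in the museum that interests them.\n"
        f"Respond in {language}."
    )
```

Placing the retrieved facts in the prompt, rather than leaving the model to recall them, keeps the generated guide tied to the database content.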

[0784] This makes it possible to provide information optimized according to the user's emotions and interests, thereby enriching the tourism experience.

[0785] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0786] Step 1:

[0787] The user inputs geographical information about the tourist destination and categories of interest through the terminal's display device. The terminal acquires this input and collects the latitude and longitude of the current location as geographical information and the selected categories (e.g., culture, nature, history) as interest information. In this step, the terminal's role is to collect information through the input interface.

[0788] Step 2:

[0789] The device uses voice recognition and camera functions to acquire the user's emotional information. This information includes emotional data derived from voice intonation and facial expressions. The acquired emotional information is transmitted to the server in real time by the device. In this process, the input is emotional information from facial expressions and voice, and the output is emotional data.

[0790] Step 3:

[0791] The server receives the geographical information, interest information, and emotional information sent from the terminal. The server uses this information to query its storage device (database) and retrieve information related to the tourist destination (e.g., details of exhibits, facility descriptions). The data operation performed on the database is the extraction of records that match specific criteria. The input is location information and interest information, and the output is related information.

[0792] Step 4:

[0793] The server uses an emotion analysis engine to analyze the received emotion information and identify the user's emotional state. This analysis determines the type of emotion (joy, surprise, calmness, etc.), and the result is used for subsequent processing. The input is emotion data, and the output is an evaluation result of the emotional state.

[0794] Step 5:

[0795] The server inputs a prompt into the generative AI model based on the analysis results to generate customized guide content. Here, a prompt such as "Create information tailored to the current emotional state" is used to create content optimized for the user experience. The input consists of the emotional state and the prompt, and the output is the customized content.

[0796] Step 6:

[0797] The generated guide content is delivered from the server to the user's device. The device receives this content and provides it to the user in the form of audio guides or text displays. The input is the generated content, and the output is the information the user receives visually or aurally.
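
The record extraction in Step 3 of the flow above can be sketched with Python's standard `sqlite3` module. The `exhibits` table, its columns, the sample rows, and the bounding-box radius `radius_deg` are all assumptions for illustration; the disclosure says only that records matching specific criteria are extracted from the storage device.

```python
import sqlite3


def find_related_info(conn, lat, lon, categories, radius_deg=0.01):
    """Return (title, description) rows near (lat, lon) in the given categories."""
    if not categories:
        return []
    placeholders = ",".join("?" for _ in categories)
    sql = (
        "SELECT title, description FROM exhibits "
        "WHERE latitude BETWEEN ? AND ? AND longitude BETWEEN ? AND ? "
        f"AND category IN ({placeholders}) ORDER BY title"
    )
    params = [lat - radius_deg, lat + radius_deg,
              lon - radius_deg, lon + radius_deg, *categories]
    return conn.execute(sql, params).fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exhibits (title TEXT, description TEXT, category TEXT, "
    "latitude REAL, longitude REAL)"
)
conn.executemany("INSERT INTO exhibits VALUES (?, ?, ?, ?, ?)", [
    ("Garden", "A strolling garden.", "nature", 35.000, 135.000),
    ("Main Gate", "Built in the Edo period.", "history", 35.001, 135.001),
    ("Harbor", "A modern port.", "history", 36.000, 136.000),
])
```

The query is parameterized, so user-selected categories are never spliced into the SQL text; only the number of `?` placeholders depends on the input.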

[0798] (Application Example 2)

[0799] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0800] Traditional navigation systems provide uniform information without considering the user's emotional state, making it difficult to maximize the individual user experience. Furthermore, they lack mechanisms to dynamically optimize information based on the user's real-time emotions, meaning the information users receive may not align with their expectations or interests.

[0801] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0802] In this invention, the server includes means for acquiring location data and information on areas of interest via an interface for receiving user input information; means for acquiring relevant data from information sources based on the acquired location data and information on areas of interest; means for analyzing the user's emotional state based on the acquired emotional information; means for generating personalized guide content based on the relevant data using a generative model with the results of the emotional analysis; and means for transmitting the generated guide content to the user terminal. This makes it possible to provide personalized guide information that is appropriate to the user's emotional state in real time.

[0803] "User input information" refers to location data and information related to areas of interest that users provide to the system.

[0804] An "interface" is a means of exchanging information between a user and a system.

[0805] "Location data" refers to information about the user's current location.

[0806] "Areas of interest" refers to information about categories and topics that the user is interested in.

[0807] An "information source" is a collection of searchable data that exists in databases or on the internet.

[0808] "Emotional information" refers to data about emotions that can be gleaned from a user's facial expressions and tone of voice.

[0809] "Emotional state" refers to the state of a user's emotions at a particular point in time.

[0810] A "generative model" is an algorithm used to generate content based on input information.

[0811] "Personalized guide content" refers to guidance information that is customized to the user's emotional state and interests.

[0812] A "user terminal" refers to a device that a user uses to receive information.

[0813] As an embodiment of this invention, a user terminal such as a smartphone or smart glasses is used. The user terminal is equipped with an interface that receives location data and information on areas of interest from the user. Furthermore, it is equipped with a camera and microphone to acquire emotional information that detects subtle changes in the user's facial expressions and voice. This emotional information is analyzed using software such as OpenCV.

[0814] The server receives the location data and emotional information transmitted from the user's device. Next, it retrieves relevant data from the information sources and uses an emotion engine to analyze the user's emotional state. Based on the result of this emotional analysis, it generates personalized guide content using cloud services such as Google Cloud Platform. The generated content is then delivered to the user's device in an appropriate format.

[0815] For example, if a tourist visiting a historical building is detected to be emotionally excited, the server will provide the user with detailed history and interesting anecdotes about that building. In this process, the generative AI model can generate customized content by receiving instructions such as, "Input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, include detailed history and anecdotes."

[0816] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0817] Step 1:

[0818] The user terminal receives the user's location data and information about their areas of interest. This information is input as GPS data provided by the user and the categories of interests they have selected. This allows the system to obtain the user's interest information, which forms the basis for subsequent processing.

[0819] Step 2:

[0820] The user terminal collects user facial expression data and voice data through its camera and microphone. This data is input as emotional information, and the user's emotional state is analyzed in real time using OpenCV. The output here is the analyzed emotional state data.

[0821] Step 3:

[0822] The server receives location data, information on areas of interest, and emotional information from the user's terminal. It then uses this data as input to query information sources and retrieve relevant data. This allows the server to prepare specific information to provide to the user.

[0823] Step 4:

[0824] The server uses an emotion engine to analyze the user's emotional state in detail based on the received emotion analysis data. This process helps determine the appropriate content direction based on the user's emotions. The output identifies the main emotional states expressed by the user.

[0825] Step 5:

[0826] The server uses a generative AI model to generate personalized guide content based on all input data. Prompts are used to provide specific instructions, such as, "Please input the user's facial expression data and location information to generate emotion-based tourist information. If the user is excited, please include detailed history and anecdotes." The content generated during this process is then output.

[0827] Step 6:

[0828] The server sends the generated personalized guide content to the user's terminal. On the user's terminal, this content is played back in the appropriate format and provided to the user. The final output is personalized information in a format the user can receive.
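
The data handed from the user terminal to the server in Steps 1 to 3 of this flow could be carried as a JSON message, as in the following sketch. The field names (`location`, `areas_of_interest`, `emotion`) and both function names are hypothetical; the disclosure does not specify a wire format.

```python
import json

# Fields the server expects in every message (assumed schema).
REQUIRED_FIELDS = ("location", "areas_of_interest", "emotion")


def build_payload(latitude, longitude, areas_of_interest, emotion_scores):
    """Terminal side: serialize location, interests, and emotion data."""
    return json.dumps({
        "location": {"lat": latitude, "lon": longitude},
        "areas_of_interest": list(areas_of_interest),
        "emotion": emotion_scores,
    })


def parse_payload(raw):
    """Server side: decode the message and reject it if a field is missing."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data
```

Validating the message on arrival lets the server fail early, before querying information sources or calling the generative model with incomplete input.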

[0829] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input in response to the result of the specific processing. The control unit 46A transmits the audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0830] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0831] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0832] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping. Specifically, this mapping may be an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0833] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. In the upper and lower directions of the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. Also, the upper side of the concentric circles is where "pleasant" emotions are located, and the lower side is where "unpleasant" emotions are located. In this way, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0834] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0835] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion lies, the more visible it becomes (that is, the more it is expressed in actions).
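
The layout of the emotion map 400 described above can be read as a coordinate system. The sketch below assumes a hypothetical convention with the origin at the center of the map, x increasing to the right, and y increasing upward; the function name and the returned labels are illustrative and not part of the disclosure.

```python
import math


def describe_position(x: float, y: float) -> dict:
    """Classify a point on the emotion map by side, valence, and outwardness."""
    return {
        # Left half: reactions occurring in the brain; right half: situational
        # judgment; the vertical axis holds emotions arising from both.
        "origin": "situation" if x > 0 else "reaction" if x < 0 else "mixed",
        # Upper side: pleasant emotions; lower side: unpleasant emotions.
        "valence": "pleasant" if y > 0 else "unpleasant" if y < 0 else "neutral",
        # Distance from the center: larger values lie nearer the outside of
        # the map, where emotions are expressed in actions.
        "outwardness": math.hypot(x, y),
    }
```

Under this convention the 3 o'clock position is a point such as `(1.0, 0.0)`, which lies on the situation side and is neither clearly pleasant nor unpleasant.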

[0836] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0837] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0838] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
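
The text does not state how a single emotion is chosen from the emotion values. One simple possibility, shown here purely as an assumption, is to take the emotion with the largest value. The function name and the numeric values in the usage note are invented for illustration.

```python
def determine_emotion(emotion_values: dict[str, float]) -> str:
    """Pick the emotion with the highest value (assumed selection rule)."""
    if not emotion_values:
        raise ValueError("no emotion values were provided")
    return max(emotion_values, key=emotion_values.get)
```

For example, with made-up values `{"reassured": 0.70, "calm": 0.65, "confident": 0.62, "anxious": 0.10}`, where the three neighboring emotions have similar values as in Figure 10, the function returns `"reassured"`.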

[0839] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0840] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0841] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.

[0842] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0843] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0844] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0845] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0846] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0847] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0848] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0849] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0850] The following is further disclosed regarding the embodiments described above.

[0851] (Claim 1)

[0852] A means for acquiring location information and interest information through a user interface that accepts user input information,

[0853] A means for obtaining relevant information from a database based on the acquired location information and interest information,

[0854] A means for generating customized guide content based on the aforementioned related information using a generative model,

[0855] A means for distributing the generated guide content to the user terminal,

[0856] A system that includes this.

[0857] (Claim 2)

[0858] The system according to claim 1, further comprising means for delivering the guide content in an audio or text format playable on a user terminal.

[0859] (Claim 3)

[0860] The system according to claim 1, further comprising means for optimizing the language based on the user's language settings in generating the guide content.

[0861] "Example 1"

[0862] (Claim 1)

[0863] A means for acquiring location information and preference information through a user interface that accepts user input information,

[0864] A means for obtaining relevant content from information resources based on the acquired location information and preference information,

[0865] A means for generating personalized guidance content based on the aforementioned related content using a model for generation,

[0866] Means for providing the generated guidance content to the user's device,

[0867] A means of dynamically inputting prompts into the generative model based on the acquired information,

[0868] A system that includes this.

[0869] (Claim 2)

[0870] The system according to claim 1, further comprising means for providing the aforementioned guidance content in an audio or printable format playable on the user's device.

[0871] (Claim 3)

[0872] The system according to claim 1, further comprising means for adapting the language based on the user's selected language settings when generating the aforementioned guidance content.

[0873] "Application Example 1"

[0874] (Claim 1)

[0875] A means for acquiring location information and interest information through a user interface that accepts user input information,

[0876] A means for obtaining relevant information from a database based on the acquired location information and interest information,

[0877] A means for generating customized guide content based on the aforementioned related information using a generative model,

[0878] A means for distributing the generated guide content to the user terminal,

[0879] A means of collecting interest information from visitors via voice and processing location information in real time,

[0880] A means of communicating with a server in the cloud and generating content suitable for visitors using a generative AI model,

[0881] A system that includes this.

[0882] (Claim 2)

[0883] The system according to claim 1, further comprising means for delivering the guide content in audio or text format playable on a user's terminal, and for providing information to visitors using speech synthesis.

[0884] (Claim 3)

[0885] The system according to claim 1, further comprising means for optimizing the language based on the user's language settings and providing customized information via a cloud server in generating the guide content.

[0886] "Example 2 of combining an emotion engine"

[0887] (Claim 1)

[0888] A means for acquiring geographical information and interest information via a display device that accepts user input information,

[0889] Means for obtaining relevant information from a storage device based on the acquired geographical information and interest information,

[0890] A means for analyzing the user's emotional information,

[0891] Means for dynamically adjusting the type and amount of relevant information based on the aforementioned emotional information,

[0892] A means for inputting prompts into a generative AI model to generate customized guide content based on the aforementioned related information,

[0893] Means for distributing the generated guide content to a user device,

[0894] A system that includes this.

[0895] (Claim 2)

[0896] The system according to claim 1, further comprising means for delivering the guide content in an audio or text format playable on a user device.

[0897] (Claim 3)

[0898] The system according to claim 1, further comprising means for optimizing the language based on the user's language settings in generating the guide content.

[0899] "Application example 2 when combining with an emotion engine"

[0900] (Claim 1)

[0901] A means for acquiring location data and information on areas of interest via an interface that accepts user input information,

[0902] A means for obtaining relevant data from an information source based on the acquired location data and information on areas of interest,

[0903] A means for analyzing the user's emotional state based on acquired emotional information,

[0904] A means for generating personalized guide content based on the aforementioned relevant data using a generative model, in accordance with the results of the emotion analysis,

[0905] Means for transmitting the generated guide content to the user terminal,

[0906] A system comprising the above means.
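In this variant the result of the emotion analysis conditions the generation step itself. As a rough illustration, the sketch below uses a toy keyword matcher in place of the emotion engine and maps its result to a tone instruction in the prompt. The keyword sets, the tone table, and the names `analyze_emotion` and `personalized_prompt` are assumptions made for this example; the specification does not describe how the emotion engine classifies its input.

```python
import re

# Toy stand-in for the emotion engine: label -> trigger words (assumed).
EMOTION_KEYWORDS = {
    "excited": {"wow", "amazing", "great"},
    "bored": {"boring", "tired", "long"},
}

# Hypothetical mapping from analysis result to a tone instruction.
TONE = {
    "excited": "enthusiastic, with one extra anecdote",
    "bored": "brief and lively",
    "neutral": "calm and informative",
}


def analyze_emotion(utterance: str) -> str:
    """Return the label whose keywords occur most often, or 'neutral'."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    scores = {label: len(words & kws) for label, kws in EMOTION_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


def personalized_prompt(related_data: str, utterance: str) -> str:
    """Build a prompt whose tone depends on the emotion analysis result."""
    emotion = analyze_emotion(utterance)
    return f"Tone: {TONE[emotion]}. Facts: {related_data}"


print(personalized_prompt("The keep has five floors.", "Wow, this is amazing!"))
```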

[0907] (Claim 2)

[0908] The system according to claim 1, wherein the personalized guide content is delivered as audio or text playable on the user's terminal.

[0909] (Claim 3)

[0910] The system according to claim 1, wherein the language is adjusted based on the user's language selection when generating the personalized guide content.

[Explanation of Symbols]

[0911]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. A system comprising: a means for acquiring location information and interest information through a user interface that accepts user input information; a means for obtaining related information from a database based on the acquired location information and interest information; a means for generating customized guide content based on the related information using a generative model; a means for distributing the generated guide content to a user terminal; a means for collecting interest information from visitors via voice and processing location information in real time; and a means for communicating with a server in the cloud and generating content suitable for visitors using a generative AI model.

2. The system according to claim 1, further comprising means for delivering the guide content in audio or text format playable on a user terminal, and for providing information to visitors using speech synthesis.

3. The system according to claim 1, further comprising means for optimizing the language based on the user's language settings and providing customized information via a cloud server in generating the guide content.