system

The system addresses the challenge of accessing tailored financial advice by using a generative AI model to analyze user data, including emotions, providing accurate and confidential financial guidance.

JP2026101295A · Pending · Publication Date: 2026-06-22 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-10
Publication Date
2026-06-22

AI Technical Summary

Technical Problem

Individuals face challenges in accessing appropriate and reliable financial advice tailored to their specific needs and emotional states, leading to a lack of confidence in financial planning and potential delays in improving financial knowledge, which can affect societal economic stability.

Method used

A system utilizing a generative artificial intelligence model that analyzes user input data, including emotional data, to provide personalized financial advice through speech recognition and synthesis, considering behavioral history and ensuring data privacy with encryption.

Benefits of technology

Enables users to receive highly accurate, emotionally sensitive financial advice in real-time, improving financial knowledge and planning with confidence, while protecting privacy and enhancing advice accuracy over time.

✦ Generated by Eureka AI based on patent content.


Abstract

A system is provided. [Solution] The system includes: means for analyzing input information using a generative artificial intelligence model and generating personalized financial advice for the user; means for analyzing emotional information from the user and deriving a response appropriate to the user's emotional state; speech recognition means for processing voice input and converting it into text information; speech synthesis means for providing the generated financial advice as audio output; and means for analyzing the user's spending information in real time and providing advice on reducing spending based on the user's individual financial situation.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] Many people are worried about their future finances, but there are situations where they cannot access appropriate and reliable advisors. Also, due to the lack of financial advice according to individual needs and emotions, the environment for users to plan with confidence is not well established. Such problems lead to a delay in the improvement of individual financial knowledge and may affect the economic stability of the whole society.

Means for Solving the Problems

[0005] This invention provides personalized financial advice to users in real time by using a generative artificial intelligence model. The AI analyzes the user's input data and also considers emotional data, enabling responses suited to the user's emotional state. Furthermore, by incorporating speech recognition means, users can input information by voice and receive advice by voice via speech synthesis means. The system also takes the user's behavioral history into account, providing highly accurate advice based on past situations, and protects user privacy with data encryption means. In this way, the system enables individual users to improve their financial knowledge and plan for the future with confidence.

[0006] A "generative artificial intelligence model" is a form of artificial intelligence technology that can analyze input data and dynamically generate output based on the results.

[0007] A "user" refers to an entity that utilizes the system and receives personalized financial advice.

[0008] "Financial advice" refers to information and instructions that suggest the optimal financial actions based on the user's financial situation and wishes.

[0009] "Emotional data" refers to information that indicates the user's current psychological and emotional state, which AI analyzes and uses to adjust its responses.

[0010] "Speech recognition means" refers to technology that analyzes speech input and converts it into corresponding text data.

[0011] "Speech synthesis means" refers to technology that outputs text data as speech, enabling users to receive information in audio format.

[0012] "Behavioral history" refers to data that records a user's past actions and inputs, and based on this, AI models can make more sophisticated suggestions.

[0013] "Data encryption means" refers to technologies that encrypt electronic data to protect it from unauthorized external access and privacy violations.

Brief Description of the Drawings

[0014]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map on which multiple emotions are mapped.
[Figure 10] An emotion map on which multiple emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of the data processing system in Example 2, which incorporates an emotion engine.
[Figure 14] A sequence diagram showing the processing flow of the data processing system in Application Example 2, which incorporates an emotion engine.

Embodiments for Carrying Out the Invention

[0015] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0016] First, the terms used in the following description will be explained.

[0017] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.

[0018] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.

[0019] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tape.

[0020] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface handles communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0021] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept applies when three or more items are linked by "and/or."

[0022] [First Embodiment]

[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0031] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0035] This invention describes a system that provides personalized financial advice to users. The following explains how the server, terminal, and user interact to realize this system.

[0036] First, the user accesses the application using their device via voice or text. When the user enters a question, the device sends that input data to the server. If voice input is used, the device utilizes speech recognition to convert it into text data and sends this data to the server.

[0037] The server analyzes the received data. A generative artificial intelligence model is used to understand the user's input and extract the information necessary to generate appropriate financial advice. This model can provide highly accurate advice by referencing a large amount of pre-stored financial data and the user's behavioral history.

[0038] Furthermore, the server uses an emotional intelligence module to determine the user's emotional state from their input data. For example, if the server determines that a user's question is emotionally tense, it will construct advice in a reassuring tone and prepare an emotionally sensitive response.
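The tone adjustment described in [0038] can be illustrated with a minimal sketch. The disclosure does not specify how the emotional intelligence module works, so the keyword list, the `ToneDecision` type, and the function names below are all hypothetical stand-ins; a real module would presumably use a trained model rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical cue words; a trained emotion model would replace this list.
TENSION_CUES = ("worried", "anxious", "afraid", "stressed", "panic")


@dataclass
class ToneDecision:
    tense: bool
    tone: str


def choose_tone(user_text: str) -> ToneDecision:
    """Pick a response tone from a crude keyword check of the user's text."""
    lowered = user_text.lower()
    tense = any(cue in lowered for cue in TENSION_CUES)
    return ToneDecision(tense=tense, tone="reassuring" if tense else "neutral")


def frame_advice(advice: str, decision: ToneDecision) -> str:
    """Prepend a reassuring sentence when the user appears tense."""
    if decision.tone == "reassuring":
        return "It is understandable to feel uneasy. " + advice
    return advice
```

For example, `frame_advice("Review this month's expenses.", choose_tone("I'm worried about my savings"))` returns the advice with the reassuring sentence in front, while a neutral question leaves the advice unchanged.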

[0039] The advice generated by the server is returned to the terminal as voice output using speech synthesis technology. It is also displayed on the screen as text. For example, specific suggestions such as, "Let's review this month's expenses and try reducing eating out to save money," may be given.

[0040] Users can act on the advice provided. Because the advice is also delivered as audio, the system is particularly easy to use for senior citizens and visually impaired users.

[0041] This system protects user privacy and ensures data security by encrypting user data. Furthermore, by accumulating behavioral history, it can continuously improve the accuracy of its advice. This helps users build reliable long-term financial plans.

[0042] The following describes the processing flow.

[0043] Step 1:

[0044] The user activates the device and enters a question into the application via voice input or text input. If voice input is used, the device converts the voice data into text data using speech recognition technology.

[0045] Step 2:

[0046] The terminal packetizes the text data entered by the user and sends it to the server via the internet. The user ID and session information are sent along with the packet.

[0047] Step 3:

[0048] The server receives data packets from the terminal. The received data first undergoes a security check to ensure its safety. Then, a natural language processing module is used to analyze the data and identify the intent and content of the question.

[0049] Step 4:

[0050] Based on the analysis results, the server generates personalized financial advice for the user using a generative artificial intelligence model. This process references the user's behavioral history and previously learned financial data.

[0051] Step 5:

[0052] The server uses an emotional intelligence module to analyze the user's emotional data. It identifies emotional tension and stress from the questions asked and, if necessary, constructs empathetic advice that addresses those emotions.

[0053] Step 6:

[0054] The advice generated on the server is formatted into a user-friendly format and sent to the terminal as encrypted data.

[0055] Step 7:

[0056] The terminal receives the response from the server, decrypts the data, and displays it in the application's user interface. Simultaneously, it outputs the advice as voice using speech synthesis technology.

[0057] Step 8:

[0058] Users read the advice displayed on the screen or listen to it aloud, and take action as needed. During this process, the device updates the user's behavior history, which is used to improve the accuracy of advice for future interactions.
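Steps 2 and 3 of the flow above (packaging the question with the user ID and session information, then checking it on the server) might be sketched as follows. The JSON wire format, the field names, the `MAX_QUESTION_CHARS` limit, and the specific checks are assumptions made for illustration; the disclosure does not define the packet layout or what the security check covers. Encryption of the transport is omitted here.

```python
import json
from dataclasses import dataclass, asdict

MAX_QUESTION_CHARS = 2000  # hypothetical limit used by the security check


@dataclass
class AdviceRequest:
    user_id: str
    session_id: str
    question: str


def build_packet(request: AdviceRequest) -> bytes:
    """Terminal side (Step 2): serialize the question with user ID and session info."""
    return json.dumps(asdict(request)).encode("utf-8")


def parse_and_check(packet: bytes) -> AdviceRequest:
    """Server side (Step 3): decode the packet and reject malformed input."""
    data = json.loads(packet.decode("utf-8"))
    request = AdviceRequest(**data)
    if not request.user_id or not request.session_id:
        raise ValueError("missing user or session identifier")
    if not request.question.strip() or len(request.question) > MAX_QUESTION_CHARS:
        raise ValueError("question is empty or too long")
    return request
```

A request that survives `parse_and_check(build_packet(...))` would then be handed to the natural language processing module; an empty or oversized question raises `ValueError` before any analysis takes place.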

[0059] (Example 1)

[0060] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0061] Financial management and asset building in modern society vary greatly depending on individual circumstances and emotions, thus requiring personalized advice. However, conventional systems have problems in that they have difficulty taking into account the emotional state of users and do not adequately protect user privacy or utilize their behavioral history. This invention aims to address these issues and realize a system that provides highly accurate, personalized advice.

[0062] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0063] In this invention, the server includes means for analyzing information using a generative artificial intelligence model and generating personalized advice, means for analyzing emotional information from the user and deriving a response corresponding to the emotional state, and acoustic recognition means for processing audio input and converting it into textual information. This makes it possible to provide accurate advice that is appropriate to the user's situation and emotions.

[0064] A "generative artificial intelligence model" is a machine learning algorithm that uses a large amount of pre-trained data to produce appropriate outputs for new input data.

[0065] "User" refers to an individual or legal entity that uses the system to input information and receive advice.

[0066] "Emotional information" refers to data that expresses the user's emotions and psychological state.

[0067] "Audio input" refers to audio signals that are input into the system via a microphone or similar device.

[0068] "Textual information" refers to a format in which analyzed data is represented using text.

[0069] An "emotional intelligence module" is a software component that analyzes the user's emotional state and generates appropriate responses based on that analysis.

[0070] "Sound generation means" refers to the technology and process for converting text data into speech and transmitting it to the user.

[0071] "Behavioral history" refers to records of actions and inputs made by users in the past, and is data used to improve the accuracy of future suggestions.

[0072] "Digitized information encryption" refers to a technology that protects data to safeguard user privacy, and is a method of transforming information using a specific algorithm.

[0073] This invention describes a specific embodiment of a system that provides personalized advice to users. The user accesses the system via voice or text using their own terminal and inputs a question or request for advice. The terminal transmits this information to a server via an internet connection. If voice input is provided, the terminal utilizes acoustic recognition capabilities to convert the voice signal into text information.

[0074] The server uses a generative artificial intelligence model to analyze the input text information. Based on a large amount of pre-trained data, the generative AI model understands the user's question, extracts relevant information, and generates appropriate advice. In this process, the user's emotional information is also analyzed through an emotional intelligence module, and a response appropriate to their emotional state is derived.

[0075] The generated advice is returned to the terminal as audible output via the sound generation means. It is also displayed on the terminal screen as text information. This system is designed to efficiently process the user's audio input and provide optimal responses tailored to the user's language habits and emotions. The user's behavioral history is also accumulated and considered when providing personalized advice during subsequent consultations.

[0076] As a concrete example, consider a scenario where a user enters the text prompt "How can I increase my savings?" The system uses a generative artificial intelligence model to analyze this prompt and generate specific advice, such as "Try setting up an automatic savings plan that sets aside a certain percentage of your monthly income," which is then communicated to the user via both voice and text. In this way, the system gives the user guidance toward achieving their financial goals.

[0077] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0078] Step 1:

[0079] The user inputs questions using their device via voice or text. Specific user actions include typing prompts such as "How should I manage my expenses this month?" or asking questions verbally. If the question is spoken, the device converts the voice input into text using its acoustic recognition function.

[0080] Step 2:

[0081] The terminal sends the converted text information to the server over the internet connection. This text information, provided by the user, is what the server analyzes.

[0082] Step 3:

[0083] The server analyzes this textual information using a generative artificial intelligence model. The server extracts important keywords and related information from the received text and generates appropriate advice. This data processing results in optimal financial advice based on the user's question.

[0084] Step 4:

[0085] The server analyzes the user's emotional information through an emotional intelligence module. This analysis evaluates the emotional state and generates a response with a corresponding tone and content. Specifically, if the user's input indicates tension or anxiety, the server generates reassuring language.

[0086] Step 5:

[0087] The server converts the generated advice into audio data using the sound generation means and sends it to the terminal. Specifically, advice such as "First, I recommend listing your expenses and reviewing them" is returned in audio format.

[0088] Step 6:

[0089] The terminal plays audio data provided by the server and simultaneously displays it as text. The user can then take action based on this. Through the outputted audio and text information, the user receives personalized advice and appropriate guidance for action.
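The keyword extraction mentioned in Step 3 of Example 1 could look roughly like the sketch below. The keyword-to-topic table and the function name are hypothetical; in the described system this step is carried out by the generative artificial intelligence model itself, so this is only a simplified stand-in showing what "extracting important keywords" might yield.

```python
import re

# Hypothetical mapping from keywords to financial topics.
TOPIC_KEYWORDS = {
    "savings": "saving",
    "save": "saving",
    "expenses": "spending",
    "spending": "spending",
    "invest": "investing",
    "investment": "investing",
}


def extract_topics(question: str) -> list[str]:
    """Return the financial topics mentioned in the question, in first-seen order."""
    topics: list[str] = []
    for word in re.findall(r"[a-z]+", question.lower()):
        topic = TOPIC_KEYWORDS.get(word)
        if topic and topic not in topics:
            topics.append(topic)
    return topics
```

With the prompts quoted in this example, "How can I increase my savings?" maps to the topic `saving`, and "How should I manage my expenses this month?" maps to `spending`.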

[0090] (Application Example 1)

[0091] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0092] In today's consumer society, many individuals are required to efficiently manage their spending and improve their financial situation. However, traditional methods make it difficult to obtain specific, real-time advice tailored to individual financial circumstances. Furthermore, there is a lack of systems that provide advice that takes into account specific emotional states. Therefore, there is a need for a system that allows users to manage their spending in a sustainable way without undue burden.

[0093] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0094] In this invention, the server includes means for analyzing input information using a generative artificial intelligence model and generating personalized financial advice for the user, means for analyzing emotional information from the user and deriving a response corresponding to the emotional state, and speech recognition means for processing voice input and converting it into text information. As a result, the user can receive real-time spending management advice tailored to their individual financial situation, as well as appropriate advice that takes their emotions into consideration.

[0095] A "generative artificial intelligence model" is an advanced computational model used to learn from large amounts of data and perform analysis and prediction of new information.

[0096] "Input information" refers to data provided by the user, which is incorporated into the system as audio or text.

[0097] "Personalized financial advice" refers to specific financial management advice created individually based on each user's specific circumstances and behavioral history.

[0098] "Emotional information" refers to data that indicates a user's emotional state, and includes information that reflects emotional elements in questions and statements.

[0099] "Speech recognition means" refers to technology for analyzing speech data and converting it into text information.

[0100] "Speech synthesis means" refers to a technology that converts text information into speech data and provides the information to the user as speech.

[0101] "Real-time spending data analysis" is a process that instantly analyzes a user's latest spending data and generates advice based on that data.

[0102] "Information encryption methods" are technologies used to protect data from third parties and are techniques for securely protecting information.

[0103] The system that realizes this invention operates in cooperation with a user, a terminal, and a server. The user accesses the application using voice or text via a terminal such as a smartphone. First, when the user enters a financial question, the terminal sends the input data to the server. In the case of voice input, the terminal uses speech recognition technology to convert the voice data into text information and sends it to the server. The speech recognition technology used in this process is a commonly used cloud-based speech recognition service.

[0104] The server utilizes a generative artificial intelligence model to analyze the user input information it receives. This model references a vast amount of historically accumulated financial data and, while also considering the user's unique behavioral history, generates personalized financial advice. For example, it can analyze this month's spending and provide advice such as, "Your spending is high this month due to frequent eating out. If you limit eating out to two times until your next payday, your savings will increase."
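As a rough illustration of the spending analysis behind advice like the eating-out example in [0104], the following sketch totals expenses per category and flags the largest one. The `Expense` type, the 30% share threshold, and the wording of the message are assumptions for illustration only; the disclosure leaves the analysis to the generative model and does not give a formula.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Expense:
    category: str
    amount: float


def spending_by_category(expenses: list[Expense]) -> dict[str, float]:
    """Sum the amounts spent in each category."""
    totals: dict[str, float] = defaultdict(float)
    for expense in expenses:
        totals[expense.category] += expense.amount
    return dict(totals)


def suggest_reduction(expenses: list[Expense], share_threshold: float = 0.3) -> Optional[str]:
    """Suggest cutting the largest category if it exceeds the (hypothetical) share threshold."""
    totals = spending_by_category(expenses)
    grand_total = sum(totals.values())
    if grand_total <= 0:
        return None
    category, amount = max(totals.items(), key=lambda item: item[1])
    share = amount / grand_total
    if share < share_threshold:
        return None
    return f"Spending on {category} makes up {share:.0%} of this month's total; consider cutting back there."
```

In the described system, a summary like this would more plausibly be passed to the generative model as context than shown to the user directly.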

[0105] Furthermore, the server analyzes the user's emotional information and uses emotional intelligence to understand their emotional state. If it determines that the user is in an emotionally distressed state, it will construct advice in a tone that is sensitive to their feelings and strive to reassure the user.

[0106] The generated advice is sent to the device as audio using speech synthesis technology and played back to the user. It is also displayed as text on the device screen, making it easy for users to understand the advice in any situation. A general-purpose speech synthesis engine is used for the speech synthesis technology.

[0107] To protect user privacy, user data is securely encrypted on the server to prevent unauthorized access by third parties. This allows users to use the system with peace of mind.

[0108] Examples of prompts include specific questions based on the user's financial situation and goals, such as "Please analyze my spending trends over the past three months and tell me where I can save money," or "Where can I cut back this month to reach my current savings goal?"

[0109] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0110] Step 1:

[0111] The user enters financial-related questions into the terminal via voice or text. If voice input is used, the terminal uses speech recognition to convert the voice data into text data. In this process, the voice data is input into the speech recognition engine, and the resulting text data is output.

[0112] Step 2:

[0113] The terminal sends the converted text data to the server. The server analyzes the received input data and performs the preprocessing required for analysis. Specifically, it removes unnecessary spaces and special characters and formats the prompt text so that it is suitable for the generative AI model.

[0114] Step 3:

[0115] The server utilizes a generative artificial intelligence model to generate personalized financial advice based on analyzed input data. The model receives formatted data as prompts and outputs personalized financial advice text. This output is based on pre-trained financial data and behavioral history.

[0116] Step 4:

[0117] The server uses emotional intelligence technology to analyze emotional information from user input. The emotion analysis engine receives the user's text data and outputs data indicating the emotional state. Based on this data, it constructs advice with a tone and content that takes the user's emotional state into consideration.

[0118] Step 5:

[0119] The generated financial advice is converted into audio data using speech synthesis technology. The server inputs the advice's text data into the speech synthesis engine and outputs it as audio data. This process converts the advice into a format that is easy for the user to understand.

[0120] Step 6:

[0121] The server sends the generated audio and text advice to the terminal. The terminal receives this and plays it back to the user as audio and displays the text on the screen. This allows the user to confirm the advice visually and audibly.

[0122] Step 7:

[0123] The server updates the user's behavior history and stores it in the database to improve the accuracy of future advice. The update process tracks the advice received by the user and their responses, contributing to continuous system improvement.
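The preprocessing in Step 2 of this flow (removing unnecessary spaces and special characters, then formatting the prompt) can be sketched as below. Which characters count as "special", the instruction sentence, and the prompt layout are all assumptions; the disclosure does not specify them.

```python
import re


def preprocess(text: str) -> str:
    """Drop special characters and collapse runs of whitespace."""
    # Keep letters, digits, whitespace, and basic punctuation; this allowed set is an assumption.
    cleaned = re.sub(r"[^\w\s.,?!'%-]", "", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def build_prompt(question: str, recent_history: list[str]) -> str:
    """Format the cleaned question, plus optional history, as a prompt for the model."""
    lines = ["You are a financial advisor. Give specific, personalized advice."]
    if recent_history:
        lines.append("Recent user activity: " + "; ".join(recent_history))
    lines.append("Question: " + preprocess(question))
    return "\n".join(lines)
```

For instance, `preprocess("  Where can I   cut back\tthis month? ###")` yields `Where can I cut back this month?`, which `build_prompt` then places on the final `Question:` line.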

[0124] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0125] This invention illustrates a specific embodiment of a system that provides personalized financial advice to users using a generative artificial intelligence model and an emotion engine. This system analyzes the user's input, recognizes their emotional state, and then generates and provides optimal advice.

[0126] First, the user accesses the application using their device via voice or text. When the user enters a question or inquiry, the input data is sent from the device to the server. In the case of voice input, the device converts the voice data into text using its speech recognition function and sends it to the server.

[0127] The server analyzes the received text data. A generative artificial intelligence model processes this data to create personalized financial advice based on the user's input. Simultaneously, an emotion engine analyzes the user's text data and voice characteristics to assess the user's emotional state. Based on this assessment, the content and tone of the advice are adjusted. For example, if the server determines that the user is stressed, it will be configured to provide gentle and encouraging advice.

[0128] The generated advice is sent to the terminal as voice output via speech synthesis and simultaneously displayed on the screen as text. Specifically, if the user inputs "I feel anxious about future investments," the emotion engine detects the anxiety, and the server provides reassuring advice such as "Don't worry. Considering this investment strategy, the risks are managed."
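The tone adjustment described above can be illustrated with a deliberately simple sketch. It assumes the emotion engine is a keyword lookup, whereas an actual emotion engine would more likely be a trained model that also uses voice characteristics. The cue words, tone prefixes, and function names are hypothetical.

```python
# Hypothetical cue words; a real emotion engine would use a trained model.
EMOTION_CUES = {
    "anxiety": {"anxious", "worried", "afraid", "nervous"},
    "stress": {"stressed", "overwhelmed", "pressure"},
}

# Hypothetical opening phrases used to soften the advice.
TONE_PREFIX = {
    "anxiety": "Don't worry. ",
    "stress": "Let's take this one step at a time. ",
}


def detect_emotion(text):
    """Return the first emotion whose cue words appear in the text, else 'neutral'."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    for emotion, cues in EMOTION_CUES.items():
        if words & cues:
            return emotion
    return "neutral"


def adjust_tone(advice, emotion):
    """Prepend a reassuring phrase when the detected emotion calls for one."""
    return TONE_PREFIX.get(emotion, "") + advice
```

For the input "I feel anxious about future investments," `detect_emotion` returns `"anxiety"`, and `adjust_tone` prepends "Don't worry. " to the generated advice.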

[0129] Furthermore, this system saves users' behavioral history and uses it in future interactions to continuously optimize the advice it provides. It also protects user privacy and ensures secure data transmission through data encryption.

[0130] In this way, it becomes possible to provide optimal financial advice tailored to the user's situation and emotions in real time, thereby realizing more reliable personalized support.

[0131] The following describes the processing flow.

[0132] Step 1:

[0133] Users access the application through their device and input financial questions or concerns via voice or text. In the case of voice input, the device's voice recognition function converts the voice into text data.

[0134] Step 2:

[0135] The terminal bundles the user's input data into packets and sends them to the server using a secure protocol. User ID and session information are also transmitted during this process.

[0136] Step 3:

[0137] The server first performs a security check on the data received from the terminal to confirm its safety. Next, it uses a natural language processing module to analyze the input data and extract the information it contains.

[0138] Step 4:

[0139] The server uses a generative artificial intelligence model to generate personalized financial advice based on the extracted information. This process also takes into account the user's behavioral history and past advice.

[0140] Step 5:

[0141] The server uses an emotion engine to analyze the user's emotional state from their input. It detects emotions such as stress, relief, and anxiety from the user's vocabulary and voice characteristics, and adjusts the tone and content of advice based on the results.

[0142] Step 6:

[0143] The tailored advice is converted to a speech format via a speech synthesis module and sent to the device along with the text. Here too, the data is encrypted to ensure privacy.

[0144] Step 7:

[0145] The terminal decrypts the data received from the server and displays the advice as text on the user interface, while also providing a voice-over function.

[0146] Step 8:

[0147] Users review the advice provided and incorporate it into their future actions. At this time, their reactions and new inputs are recorded as part of their behavioral history and used to improve the accuracy of future advice.
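Steps 2 and 3 of the flow above can be sketched as follows. The sketch assumes that the packet is a JSON body and that the security check is an HMAC-SHA256 integrity check with a key shared between terminal and server. The disclosure specifies neither, and the function names are hypothetical. The sketch signs the packet but does not encrypt it, so confidentiality is assumed to come from the secure transport protocol.

```python
import hashlib
import hmac
import json


def build_packet(user_id, session_id, text, key):
    """Terminal side: bundle the input with user ID and session info, then sign it."""
    body = json.dumps(
        {"user_id": user_id, "session_id": session_id, "text": text},
        sort_keys=True,
    ).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return {"body": body.decode("utf-8"), "tag": tag}


def check_packet(packet, key):
    """Server side: verify the tag, then return the parsed body or raise ValueError."""
    body = packet["body"].encode("utf-8")
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information during comparison.
    if not hmac.compare_digest(expected, packet["tag"]):
        raise ValueError("packet failed integrity check")
    return json.loads(body)
```

A packet whose body is altered after signing fails `check_packet`, so the server can reject it before any natural language processing takes place.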

[0148] (Example 2)

[0149] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0150] Traditional financial advice systems often fail to adequately consider a user's individual emotional state or past behavioral history, potentially resulting in inadequate advice. Furthermore, they lack sufficient user privacy protection, raising concerns about data security. These issues lead to users lacking reliable support.

[0151] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0152] In this invention, the server includes means for analyzing input information using a data processing device and generating personalized financial advice for the user, means for evaluating emotional information from the user and adjusting the response based on that evaluation, and a speech recognition device for processing voice input and converting it into text information. This makes it possible to provide optimized financial advice based on the user's individual emotional state and past behavioral history.

[0153] A "data processing device" is a device used to analyze input information and generate specific results based on that data.

[0154] "Users" refers to individuals or organizations that use this system, and in particular, to those seeking financial advice.

[0155] "Financial advice" refers to information that provides guidance and suggestions regarding a user's financial situation and investment strategy.

[0156] "Emotional information" refers to data that indicates the emotional state expressed by a user, and is derived from information analyzed from text and audio.

[0157] A "speech recognition device" is a device or software equipped with technology to analyze speech signals and convert them into text information.

[0158] A "speech generation device" is a device or software equipped with the technology to analyze textual information and output it as an audio signal.

[0159] "Past behavioral history" refers to a collection of data that records actions and history that a user has performed in the past.

[0160] "Privacy protection" refers to the means and technologies used to protect personal information from unauthorized access and misuse.

[0161] This invention is a system that provides personalized financial advice to users in real time. The system uses a generative AI model and an emotion engine to generate optimal advice tailored to the user's needs.

[0162] The user first accesses the application using a device. The user then inputs their questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. This process utilizes standard speech recognition software.

[0163] The converted text data is securely transmitted to the server using encryption technology. The server uses a data processing device to analyze the received data. Here, a generative AI model (e.g., a large-scale language model) is used to generate personalized financial advice based on the user's input.

[0164] Simultaneously, the server uses an emotion engine to evaluate the user's emotional information. This emotional information is determined from the content of the text data and the expressions it contains. Based on this evaluation, the server adjusts the content and tone of the generated advice.

[0165] Finally, the adjusted advice is converted into audio data using a speech generator and sent to the terminal. The terminal plays the advice in audio format and simultaneously displays it in text format.

[0166] For example, if a user enters "I'm worried about how I should save for my children's education," the server uses a generative AI model to create personalized advice for this question and delivers it to the user in a reassuring tone. This process allows users to receive optimized advice based on their emotional state and past behavioral history.

[0167] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0168] Step 1:

[0169] Users access the application using their device and input questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. In this process, it receives a voice signal as input and obtains output converted into text using speech recognition technology.

[0170] Step 2:

[0171] The terminal sends the text data converted by speech recognition to the server. During this process, data encryption technology is used to securely transmit the text data to the server. The input is encrypted text data, and the output is received in a format that the server can analyze.

[0172] Step 3:

[0173] The server analyzes the received text data using a data processing device and generates personalized financial advice using a generative AI model. The input is text data, and the generative AI model processes this data to produce advice as output.

[0174] Step 4:

[0175] The server uses an emotion engine to evaluate user emotional information from text data. It receives text data as input, analyzes its content to identify the user's emotional state, and outputs the evaluation result.

[0176] Step 5:

[0177] The tone and content of the generated advice are adjusted based on the emotion evaluation results. This results in optimized advice that is tailored to the user's emotional state. The input consists of the advice from the generative AI model and the results of the emotion evaluation, and the output is the optimized advice.

[0178] Step 6:

[0179] The server converts the optimized advice into audio data using a speech generator and sends it to the terminal. The input is the optimized advice, and the output is audio data, which is sent to the terminal.

[0180] Step 7:

[0181] The terminal plays audio data sent from the server and displays text data on the screen. Users can listen to the provided advice in audio and confirm it in text. The input is data sent from the server, and this is used to produce output in the form of audio playback and text display.
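The input and output relationships of Steps 3 to 6 can be sketched as a chain of stages. The stage functions below are stand-ins so that the sketch runs without any external service. They are not the generative AI model, emotion engine, or speech generator of this disclosure, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdviceResult:
    text: str      # optimized advice shown on screen
    audio: bytes   # synthesized speech played by the terminal
    emotion: str   # emotion evaluation result


def run_pipeline(user_text, generate, evaluate_emotion, adjust, synthesize):
    """Run Steps 3 to 6: generate, evaluate emotion, adjust, synthesize."""
    draft = generate(user_text)             # Step 3: text in, advice out
    emotion = evaluate_emotion(user_text)   # Step 4: text in, evaluation out
    optimized = adjust(draft, emotion)      # Step 5: advice + evaluation in
    audio = synthesize(optimized)           # Step 6: optimized advice in, audio out
    return AdviceResult(text=optimized, audio=audio, emotion=emotion)


# Stand-in stages (assumptions for illustration only).
def fake_generate(text):
    return "Consider setting aside a fixed amount each month."


def fake_emotion(text):
    return "anxiety" if "worried" in text.lower() else "neutral"


def fake_adjust(advice, emotion):
    if emotion == "anxiety":
        return "It's understandable to feel this way. " + advice
    return advice


def fake_synthesize(text):
    return text.encode("utf-8")  # placeholder for real audio data
```

Passing the stages as arguments keeps each one replaceable, for example when swapping the stand-in `fake_generate` for a call to an actual model.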

[0182] (Application Example 2)

[0183] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0184] Modern asset management requires the provision of appropriate financial advice in real time, tailored to the user's emotional state. However, conventional systems have faced challenges in accurately understanding a user's specific emotional state and generating personalized advice accordingly. Furthermore, protecting user privacy while providing advice in real time via voice input and output is also a crucial challenge.

[0185] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0186] In this invention, the server includes means for analyzing input information using a generative artificial intelligence system and generating personalized asset management suggestions for the user; means for evaluating emotional information from the user and deriving a response corresponding to the emotional state; means for processing voice input and converting it into text information; and means for providing emotionally appropriate financial advice when conducting electronic transactions. This makes it possible to achieve more appropriate asset management while simultaneously providing real-time financial advice tailored to the user's emotions and protecting their privacy.

[0187] A "generative artificial intelligence system" is an artificial intelligence technology that analyzes user input information and generates personalized asset management proposals.

[0188] "Emotional information" refers to data that indicates the emotional state of a user, and this data is used to analyze that state and derive personalized responses.

[0189] An "asset management proposal" is specific advice that presents the optimal management method based on the user's financial situation.

[0190] "Speech recognition means" refers to a technical means for receiving speech input and converting it into text information.

[0191] "Electronic transactions" refer to all acts of buying, selling, or paying for goods and services conducted through online platforms or similar means.

[0192] "Speech synthesis means" refers to a technical means for outputting the generated asset management proposal as speech.

[0193] "Privacy protection" refers to information security measures that protect users' personal information and transaction data from third parties and manage them securely.

[0194] In this invention, the user accesses the system from a terminal via voice or text input. The terminal uses speech recognition software to convert the voice-input information into text information. During this process, the voice-input information is analyzed by natural language processing software. Specifically, a speech recognition engine installed in the terminal is used for speech recognition. The converted text information is then transmitted to a server.

[0195] The server utilizes a generative artificial intelligence system to generate personalized asset management suggestions from received text information. Here, the generative AI model plays a crucial role in generating advice based on user input. The server also uses an emotion engine to assess the user's emotional state and adjust the suggestions accordingly. For example, if the server determines that the user is feeling anxious about cost reduction, it will adjust its advice to be delivered in a gentler tone.

[0196] The generated asset management proposals are sent from the server to the terminal. The system utilizes speech synthesis technology to provide the proposals as voice output. The proposals are also displayed in text format, allowing users to visually review them.

[0197] As a concrete example of this system, if a user inputs "I'm worried about how to save money on my next trip," the emotion engine detects anxiety about saving money. In response, the server generates advice such as, "Your trip is definitely worth enjoying. However, if you want to save money, consider the following approaches." This advice is provided in both voice and text formats, delivering information in a way that suits the user's needs.

[0198] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0199] Step 1:

[0200] Users input questions and inquiries via voice or text through their device. This input is provided as voice in the case of voice input, and as text data in the case of text input. Voice input is converted into text data by the device's speech recognition engine. The speech recognition engine analyzes the voice signal and generates a corresponding string, which is then output as text data.

[0201] Step 2:

[0202] The terminal sends the text data obtained as a result of speech recognition to the server. Here, the converted text data is transferred to the server side. In particular, the user's consultation content is transmitted as data for processing by the generative AI model.

[0203] Step 3:

[0204] The server analyzes the received text data and uses a generative artificial intelligence system to generate personalized asset management suggestions. In this process, the generative AI model generates prompts based on the user's questions and analyzes them to determine the best advice. The generated prompts are tailored to the user's specific needs and are used to generate subsequent responses.

[0205] Step 4:

[0206] Simultaneously, the server uses an emotion engine to evaluate emotional information from the user's text data. This involves analyzing words and phrases contained in the text to identify emotional states. For example, if negative emotions are expressed, the system may estimate the likelihood of anxiety or stress.

[0207] Step 5:

[0208] The server adjusts the tone and content of the suggestions generated based on emotional information to form the final asset management recommendations. This adjustment includes refining the suggestions to correspond to the emotional state. As output, personalized advice is provided as text data.

[0209] Step 6:

[0210] The generated suggestions are sent from the server to the terminal and provided to the user as voice output using speech synthesis technology. Furthermore, they are also displayed on the screen in text format. In this final step, the advice is output to the user's terminal in a way that is clearly visible as both voice and text.
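The likelihood estimate mentioned in Step 4 can be illustrated with a crude lexicon-based score. This is an assumption for illustration, not the emotion engine of this disclosure: each matched word contributes a weight, and each emotion's score is its share of the total matched weight. The lexicon and its weights are hypothetical.

```python
import re

# Hypothetical lexicon: word -> (emotion, weight).
LEXICON = {
    "worried": ("anxiety", 1.0),
    "anxious": ("anxiety", 1.0),
    "afraid": ("anxiety", 0.8),
    "stressed": ("stress", 1.0),
    "overwhelmed": ("stress", 0.8),
}


def emotion_likelihoods(text):
    """Return emotion -> share of matched weight (shares sum to 1.0), or {} if no match."""
    totals = {}
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in LEXICON:
            emotion, weight = LEXICON[word]
            totals[emotion] = totals.get(emotion, 0.0) + weight
    total = sum(totals.values())
    if not total:
        return {}
    return {emotion: weight / total for emotion, weight in totals.items()}
```

The resulting shares are relative scores rather than calibrated probabilities. They are intended only to show how word-level evidence might be turned into a ranking of candidate emotions.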

[0211] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0212] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
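As a minimal sketch of how a prompt containing an instruction might be combined with text inference data, the following assembles a single text prompt. The field labels and the function name `build_prompt` are hypothetical. Audio and image inference data are outside the scope of the sketch.

```python
def build_prompt(instruction, user_text, emotion=None, history=()):
    """Combine an instruction with text inference data into one prompt string."""
    parts = [f"Instruction: {instruction}"]
    if emotion:
        # Optional context from an emotion estimate, if one is available.
        parts.append(f"User emotion: {emotion}")
    for past in history:
        # Optional context from earlier advice, oldest first.
        parts.append(f"Past advice: {past}")
    parts.append(f"User input: {user_text}")
    return "\n".join(parts)
```

The returned string would then be passed to whichever generative model is in use. The call to the model itself is deliberately omitted because it depends on the chosen service.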

[0213] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0214] [Second Embodiment]

[0215] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0216] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0217] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0218] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0219] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0220] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0221] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0222] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0223] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0224] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0225] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0226] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0227] This invention describes a system that provides personalized financial advice to users. The following explains how the server, terminal, and user interact to realize this system.

[0228] First, the user accesses the application using their device via voice or text. When the user enters a question, the device sends that input data to the server. If voice input is used, the device utilizes speech recognition to convert it into text data and sends this data to the server.

[0229] The server analyzes the received data. A generative artificial intelligence model is used to understand the user's input and extract the information necessary to generate appropriate financial advice. This model can provide highly accurate advice by referencing a large amount of pre-stored financial data and the user's behavioral history.

[0230] Furthermore, the server uses an emotional intelligence module to determine the user's emotional state from their input data. For example, if the server determines that a user's question is emotionally tense, it will construct advice in a reassuring tone and prepare an emotionally sensitive response.

[0231] The advice generated by the server is returned to the terminal as voice output using speech synthesis technology. It is also displayed on the screen as text. For example, specific suggestions such as, "Let's review this month's expenses and try reducing eating out to save money," may be given.
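A suggestion such as reducing eating out presupposes some analysis of the user's spending. As a hedged illustration, the following assumes that expenses arrive as (category, amount) pairs and that certain categories are treated as discretionary. The category names and the function name `suggest_cut` are hypothetical.

```python
from collections import defaultdict

# Hypothetical set of categories treated as discretionary.
DISCRETIONARY = {"eating out", "entertainment", "shopping"}


def suggest_cut(expenses):
    """Return (category, total) for the largest discretionary category, or None."""
    totals = defaultdict(int)
    for category, amount in expenses:
        totals[category] += amount
    candidates = {c: t for c, t in totals.items() if c in DISCRETIONARY}
    if not candidates:
        return None
    top = max(candidates, key=candidates.get)
    return top, candidates[top]
```

The category returned by `suggest_cut` could be inserted into the advice text, while fixed costs such as rent are left out of consideration.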

[0232] Users can act based on the advice provided. Because the advice is also delivered as audio, the system is particularly easy to use for senior citizens and visually impaired users.

[0233] This system protects user privacy and ensures data security by encrypting user data. Furthermore, by accumulating behavioral history, it can continuously improve the accuracy of its advice. This helps users build reliable long-term financial plans.

[0234] The following describes the processing flow.

[0235] Step 1:

[0236] The user activates the device and enters a question into the application via voice input or text input. The device then converts the voice-input data into text data using speech recognition technology.

[0237] Step 2:

[0238] The terminal packetizes the text data entered by the user and sends it to the server via the internet. The user ID and session information are also sent along with the packet.

[0239] Step 3:

[0240] The server receives data packets from the terminal. The received data first undergoes a security check to ensure its safety. Then, a natural language processing module is used to analyze the data and identify the intent and content of the question.

[0241] Step 4:

[0242] Based on the analysis results, the server generates personalized financial advice for the user using a generative artificial intelligence model. This process references the user's behavioral history and previously learned financial data.

[0243] Step 5:

[0244] The server uses an emotional intelligence module to analyze the user's emotional data. It identifies emotional tension and stress from the questions asked and, if necessary, constructs empathetic advice that addresses those emotions.

[0245] Step 6:

[0246] The advice generated on the server is converted into a user-friendly format and sent to the terminal as encrypted data.

[0247] Step 7:

[0248] The terminal receives a response from the server, decodes the data, and displays it in the application's user interface. Simultaneously, it outputs advice in voice using speech synthesis technology.

[0249] Step 8:

[0250] Users read the advice displayed on the screen or listen to it aloud, and take action as needed. During this process, the device updates the user's behavior history, which is used to improve the accuracy of advice for future interactions.
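The intent identification in Step 3 of this flow can be sketched with a simple keyword count. This stands in for the natural language processing module, which would normally be far more capable. The intent labels, keyword sets, and function name are hypothetical.

```python
import re

# Hypothetical intents and the words that signal them.
INTENT_KEYWORDS = {
    "saving": {"save", "saving", "savings"},
    "investing": {"invest", "investment", "investments", "stocks"},
    "spending": {"expenses", "spending", "budget"},
}


def identify_intent(question):
    """Return the intent with the most keyword hits, or 'general' if none match."""
    words = re.findall(r"[a-z]+", question.lower())
    hits = {
        intent: sum(word in keywords for word in words)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else "general"
```

The identified intent could be used in Step 4 to select which part of the user's behavioral history is referenced during advice generation.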

[0251] (Example 1)

[0252] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0253] Financial management and asset building in modern society vary greatly depending on individual circumstances and emotions, thus requiring personalized advice. However, conventional systems have problems in that they have difficulty taking into account the emotional state of users and do not adequately protect user privacy or utilize their behavioral history. This invention aims to address these issues and realize a system that provides highly accurate, personalized advice.

[0254] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0255] In this invention, the server includes means for analyzing information using a generative artificial intelligence model and generating personalized advice, means for analyzing emotional information from the user and deriving a response corresponding to the emotional state, and acoustic recognition means for processing acoustic input and converting it into textual information. This makes it possible to provide accurate advice that is appropriate to the user's situation and emotions.

[0256] A "generative artificial intelligence model" is a machine learning algorithm that uses a large amount of pre-trained data to produce appropriate outputs for new input data.

[0257] "User" refers to an individual or legal entity that uses the system to input information and receive advice.

[0258] "Emotional information" refers to data that expresses the user's emotions and psychological state.

[0259] "Acoustic input" refers to audio signals that are input into the system via a microphone or similar device.

[0260] "Textual information" refers to a format in which analyzed data is represented using text.

[0261] An "emotional intelligence module" is a software component that analyzes the user's emotional state and generates appropriate responses based on that analysis.

[0262] "Acoustic generation means" refers to the technology and process for converting text data into speech and transmitting it to the user.

[0263] "Behavioral history" refers to records of actions and inputs made by users in the past, and is data used to improve the accuracy of future suggestions.

[0264] "Digitized information encryption" refers to a technology that protects data to safeguard user privacy, and is a method of transforming information using a specific algorithm.

[0265] This invention describes a specific embodiment of a system that provides personalized advice to users. The user accesses the system via voice or text using their own terminal and inputs a question or request for advice. The terminal transmits this information to a server via an internet connection. If voice input is provided, the terminal utilizes acoustic recognition capabilities to convert the voice signal into text information.

[0266] The server uses a generative artificial intelligence model to analyze the input text information. Based on a large amount of pre-trained data, the generative AI model understands the user's question, extracts relevant information, and generates appropriate advice. In this process, the user's emotional information is also analyzed through an emotional intelligence module, and a response appropriate to their emotional state is derived.

[0267] The generated advice is returned to the terminal as an audible output via an acoustic generation device. It is also displayed on the terminal screen as text information. This system is designed to efficiently process the user's acoustic input and provide optimal responses tailored to the user's language habits and emotions. The user's behavioral history is also accumulated and considered when providing personalized advice during subsequent consultations.

[0268] As a concrete example, consider a scenario where a user enters a text prompt asking, "How can I increase my savings?" The system then uses a generative artificial intelligence model to analyze this prompt and generate specific advice, such as, "Try setting up an automatic savings plan that sets aside a certain percentage of your monthly income," which is then communicated to the user via both voice and text. This system provides the user with guidance to achieve their financial goals.
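The arithmetic behind an automatic savings plan of this kind is straightforward. The following sketch assumes a fixed monthly income, a fixed savings rate, and no interest. The function names are hypothetical, and the figures in the usage note are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP


def monthly_saving(income, rate):
    """Amount set aside each month: income * rate, rounded to whole currency units."""
    amount = Decimal(str(income)) * Decimal(str(rate))
    return amount.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


def projected_total(income, rate, months):
    """Total saved after the given number of months, ignoring interest."""
    return monthly_saving(income, rate) * months
```

For example, with a monthly income of 300,000 and a rate of 10%, `monthly_saving(300000, "0.10")` gives 30,000 per month and `projected_total(300000, "0.10", 12)` gives 360,000 after one year.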

[0269] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0270] Step 1:

[0271] The user inputs questions using their device via voice or text. Specific user actions include typing prompts such as "How should I manage my expenses this month?" or asking questions verbally. Based on this input, the device converts the voice input into text using its acoustic recognition function.

[0272] Step 2:

[0273] The terminal sends the converted text information to the server. Specifically, this data is transferred to the server via the internet connection. The input data is the text information provided by the user, and this information is used for analysis on the server.

[0274] Step 3:

[0275] The server analyzes this textual information using a generative artificial intelligence model. The server extracts important keywords and related information from the received text and generates appropriate advice. This data processing results in optimal financial advice based on the user's question.

[0276] Step 4:

[0277] The server analyzes the user's emotional information through an emotional intelligence module. This analysis evaluates the emotional state and generates a response with a corresponding tone and content. Specifically, if the user's input indicates tension or anxiety, the server generates reassuring language.

[0278] Step 5:

[0279] The server converts the generated advice into audio data using an acoustic generation device and sends it to the terminal. The output data is provided as audio output. Specifically, advice such as "First, I recommend listing your expenses and reviewing them" is returned in audio format.

[0280] Step 6:

[0281] The terminal plays audio data provided by the server and simultaneously displays it as text. The user can then take action based on this. Through the outputted audio and text information, the user receives personalized advice and appropriate guidance for action.
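The keyword extraction mentioned in Step 3 can be illustrated with a frequency count over non-stopwords. This is a simplification for illustration. In the disclosure the extraction is performed by the generative artificial intelligence model, and the stopword list and function name below are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a practical list would be much longer.
STOPWORDS = {
    "how", "should", "i", "my", "this", "the", "a", "an",
    "to", "can", "do", "is", "of", "and",
}


def extract_keywords(text, top_n=3):
    """Return up to top_n of the most frequent non-stopword words in the text."""
    words = [
        w for w in re.findall(r"[a-z]+", text.lower())
        if w not in STOPWORDS
    ]
    return [word for word, _ in Counter(words).most_common(top_n)]
```

For the prompt "How should I manage my expenses this month?", the extracted keywords are "manage", "expenses", and "month", which could then guide the advice generated in Step 3.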

[0282] (Application Example 1)

[0283] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as a "server", and the smart glasses 214 are referred to as a "terminal".

[0284] In modern consumer society, many individuals are required to efficiently manage their expenditures and improve their financial situations. However, with conventional methods, it is difficult to obtain specific advice tailored to individual financial situations in real time. Additionally, since there is also a lack of a mechanism for providing advice that takes into account specific emotional states, there is a need for a system that enables users to manage their expenditures sustainably without difficulty.

[0285] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following respective means.

[0286] In this invention, the server includes means for analyzing input information using a generative artificial intelligence model and generating individualized financial advice for the user, means for analyzing emotional information from the user and deriving a response according to the emotional state, and speech recognition means for processing voice input and converting it into text information. As a result, the user can receive real-time expenditure management advice according to their individual financial situation and can obtain appropriate advice that takes emotions into account.

[0287] The "generative artificial intelligence model" is an advanced computational model that learns from a large amount of data and is used for analyzing and predicting new information.

[0288] The "input information" is data provided by the user and is information that is captured by the system as voice or characters.

[0289] The "individualized financial advice" is specific financial management advice created individually based on the specific situations and behavioral histories of each user.

[0290] The "emotional information" is data indicating the emotional state of the user and is information including emotional elements reflected in questions and statements.

[0291] "Speech recognition means" refers to technology for analyzing speech data and converting it into text information.

[0292] "Speech synthesis means" refers to a technology that converts text information into speech data and provides the information to the user as speech.

[0293] "Real-time spending data analysis" is a process that instantly analyzes a user's latest spending data and generates advice based on that data.

[0294] "Information encryption methods" are technologies used to protect data from third parties and are techniques for securely protecting information.

[0295] The system that realizes this invention operates in cooperation with a user, a terminal, and a server. The user accesses the application using voice or text via a terminal such as a smartphone. First, when the user enters a financial question, the terminal sends the input data to the server. In the case of voice input, the terminal uses speech recognition technology to convert the voice data into text information and sends it to the server. The speech recognition technology used in this process is a commonly used cloud-based speech recognition service.

[0296] The server utilizes a generative artificial intelligence model to analyze the user input information it receives. This model references a vast amount of historically accumulated financial data and, while also considering the user's unique behavioral history, generates personalized financial advice. For example, it can analyze this month's spending and provide advice such as, "Your spending is high this month due to frequent eating out. If you limit eating out to two times until your next payday, your savings will increase."
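
As a non-limiting illustration of the spending analysis described above, the following Python sketch totals a user's transactions by category and reports the category that exceeds its budget by the largest amount. The function names, the `(category, amount)` transaction format, and the per-category budget table are assumptions made for this sketch. The specification itself leaves the analysis to the generative artificial intelligence model.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def top_overspend(
    transactions: List[Tuple[str, float]],
    budgets: Dict[str, float],
) -> Optional[Tuple[str, float]]:
    """Return (category, amount over budget) for the largest overrun, or None."""
    totals: Dict[str, float] = defaultdict(float)
    for category, amount in transactions:
        totals[category] += amount
    # Keep only the categories whose total exceeds the budgeted limit.
    overruns = {
        cat: totals[cat] - limit
        for cat, limit in budgets.items()
        if totals[cat] > limit
    }
    if not overruns:
        return None
    worst = max(overruns, key=overruns.get)
    return worst, overruns[worst]


def spending_advice(
    transactions: List[Tuple[str, float]],
    budgets: Dict[str, float],
) -> str:
    """Turn the analysis result into a short advice sentence."""
    result = top_overspend(transactions, budgets)
    if result is None:
        return "Your spending is within budget this month."
    category, excess = result
    return (
        f"Your spending on {category} is {excess:.0f} over budget this month. "
        "Cutting back there until your next payday will increase your savings."
    )
```

The advice sentence produced here could also be supplied to the generative model as context rather than shown to the user directly.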

[0297] Furthermore, the server analyzes the user's emotional information and uses emotional intelligence to understand their emotional state. If it determines that the user is in an emotionally distressed state, it will construct advice in a tone that is sensitive to their feelings and strive to reassure the user.

[0298] The generated advice is sent to the device as audio using speech synthesis technology and played back to the user. It is also displayed as text on the device screen, making it easy for users to understand the advice in any situation. A general-purpose speech synthesis engine is used for the speech synthesis technology.

[0299] To protect the user's privacy, user data is securely encrypted on the server to prevent unauthorized access by third parties. This allows the user to use the system with peace of mind.

[0300] Examples of prompts include specific questions based on the user's financial situation and goals, such as, "Please analyze my spending trends over the past three months and tell me where I can save money," or "Where can I cut back this month to reach my current savings goal?"

[0301] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0302] Step 1:

[0303] The user enters financial-related questions into the terminal via voice or text. If voice input is used, the terminal uses speech recognition to convert the voice data into text data. In this process, the voice data is input into the speech recognition engine, and the resulting text data is output.

[0304] Step 2:

[0305] The terminal sends the converted character data to the server. The server analyzes the received input data and performs the necessary preprocessing. Specifically, it removes unnecessary spaces and special characters and formats the prompt text into a form suitable for the generative AI model.

[0306] Step 3:

[0307] The server utilizes a generative artificial intelligence model to generate personalized financial advice based on the analyzed input data. The model takes the data formatted as a prompt text as input and obtains, as its output, the text of financial advice suitable for the user. This output is based on pre-trained financial data and behavioral history.

[0308] Step 4:

[0309] The server uses emotional intelligence technology to analyze emotional information from the user's input. The text data of the user input is input into the emotion analysis engine, which outputs data indicating the emotional state. Based on this data, advice is constructed with a tone and content that takes the emotional state into consideration.

[0310] Step 5:

[0311] The generated financial advice is converted into audio data using text-to-speech technology. The server inputs the text data of the advice into the text-to-speech engine and outputs it as audio data. Through this process, the advice is converted into a form that is easy for the user to listen to.

[0312] Step 6:

[0313] The server sends the generated audio and text advice to the terminal. The terminal receives this, plays it as audio for the user, and displays the text on the screen. This enables the user to visually and auditorily confirm the advice.

[0314] Step 7:

[0315] The server updates the user's behavior history and stores it in the database to improve the accuracy of future advice. The update process tracks the advice received by the user and their responses, contributing to continuous system improvement.
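
As a non-limiting illustration, the preprocessing in Step 2 above (removing unnecessary spaces and special characters and formatting the prompt text) can be sketched in Python as follows. The set of characters treated as allowed, the function names, and the instruction wording in `build_prompt` are assumptions made for this sketch.

```python
import re
import unicodedata

# Assumption: word characters, whitespace, and basic punctuation are kept;
# every other character is treated as an unnecessary special character.
_DISALLOWED = re.compile(r"[^\w\s.,?!'%$-]")
_WHITESPACE = re.compile(r"\s+")


def clean_input(raw: str) -> str:
    """Normalize the text, drop special characters, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw)  # e.g. full-width digits to ASCII
    text = _DISALLOWED.sub(" ", text)
    return _WHITESPACE.sub(" ", text).strip()


def build_prompt(raw: str) -> str:
    """Format the cleaned question as prompt text for the generative AI model."""
    return (
        "You are a financial advisor. Answer the user's question.\n"
        f"Question: {clean_input(raw)}"
    )
```

For example, `clean_input("  How   can I\tsave ### money?? \n")` yields `How can I save money??`.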

[0316] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0317] This invention illustrates a specific embodiment of a system that provides personalized financial advice to users using a generative artificial intelligence model and an emotion engine. This system analyzes the user's input, recognizes their emotional state, and then generates and provides optimal advice.

[0318] First, the user accesses the application using their device via voice or text. When the user enters a question or inquiry, the input data is sent from the device to the server. In the case of voice input, the device converts the voice data into text using its speech recognition function and sends it to the server.

[0319] The server analyzes the received text data. A generative artificial intelligence model processes this data to create personalized financial advice based on the user's input. Simultaneously, an emotion engine analyzes the user's text data and voice characteristics to assess the user's emotional state. Based on this assessment, the content and tone of the advice are adjusted. For example, if the server determines that the user is stressed, it will be configured to provide gentle and encouraging advice.

[0320] The generated advice is sent to the terminal as voice output via speech synthesis and simultaneously displayed on the screen as text. Specifically, if the user inputs "I feel anxious about future investments," the emotion engine detects the anxiety, and the server provides reassuring advice such as "Don't worry. Considering this investment strategy, the risks are managed."

[0321] Furthermore, this system saves the user's behavioral history and uses it to continuously optimize the advice provided in future interactions. It also protects user privacy and ensures secure data transmission through data encryption.

[0322] In this way, it becomes possible to provide optimal financial advice tailored to the user's situation and emotions in real time, thereby realizing more reliable personalized support.

[0323] The following describes the processing flow.

[0324] Step 1:

[0325] Users access the application through their device and input financial questions or concerns via voice or text. In the case of voice input, the device's voice recognition function converts the voice into text data.

[0326] Step 2:

[0327] The terminal bundles the user's input data into packets and sends them to the server using a secure protocol. User ID and session information are also transmitted during this process.

[0328] Step 3:

[0329] The server first performs a security check on the data received from the terminal to confirm its safety. Next, it uses a natural language processing module to analyze the input data and extract the information it contains.

[0330] Step 4:

[0331] The server uses a generative artificial intelligence model to generate personalized financial advice based on the extracted information. This process also takes into account the user's behavioral history and past advice.

[0332] Step 5:

[0333] The server uses an emotion engine to analyze the user's emotional state from their input. It detects emotions such as stress, relief, and anxiety from the user's vocabulary and voice characteristics, and adjusts the tone and content of advice based on the results.

[0334] Step 6:

[0335] The tailored advice is converted to a speech format via a speech synthesis module and sent to the device along with the text. Here too, the data is encrypted to ensure privacy.

[0336] Step 7:

[0337] The terminal decrypts the data received from the server and displays the advice as text on the user interface, while also playing it back as audio.

[0338] Step 8:

[0339] Users review the advice provided and incorporate it into their future actions. At this time, their reactions and new inputs are recorded as part of their behavioral history and used to improve the accuracy of future advice.
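
As a non-limiting illustration of the behavioral history recorded in Step 8, the following Python sketch stores each question, the advice given, and the user's reaction in an SQLite database using the standard `sqlite3` module. The table layout and function names are assumptions made for this sketch. The specification states only that the history is recorded and reused.

```python
import sqlite3
from typing import List, Optional, Tuple


def open_history(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the history database; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS history (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               user_id TEXT NOT NULL,
               question TEXT NOT NULL,
               advice TEXT NOT NULL,
               reaction TEXT
           )"""
    )
    return conn


def record_interaction(
    conn: sqlite3.Connection,
    user_id: str,
    question: str,
    advice: str,
    reaction: Optional[str] = None,
) -> None:
    """Append one interaction; the transaction is committed on success."""
    with conn:
        conn.execute(
            "INSERT INTO history (user_id, question, advice, reaction) "
            "VALUES (?, ?, ?, ?)",
            (user_id, question, advice, reaction),
        )


def recent_history(
    conn: sqlite3.Connection, user_id: str, limit: int = 5
) -> List[Tuple[str, str, Optional[str]]]:
    """Return the user's most recent interactions, newest first."""
    rows = conn.execute(
        "SELECT question, advice, reaction FROM history "
        "WHERE user_id = ? ORDER BY id DESC LIMIT ?",
        (user_id, limit),
    )
    return rows.fetchall()
```

The rows returned by `recent_history` could be included in the prompt for the next request so that the generated advice reflects earlier advice and reactions.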

[0340] (Example 2)

[0341] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0342] Traditional financial advice systems often fail to adequately consider a user's individual emotional state or past behavioral history, potentially resulting in inadequate advice. Furthermore, they lack sufficient user privacy protection, raising concerns about data security. These issues lead to users lacking reliable support.

[0343] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0344] In this invention, the server includes means for analyzing input information using a data processing device and generating personalized economic advice for the user, means for evaluating emotional information from the user and adjusting the response based on that evaluation, and a speech recognition device for processing voice input and converting it into text information. This makes it possible to provide optimized economic advice based on the user's individual emotional state and past behavioral history.

[0345] A "data processing device" is a device used to analyze input information and generate specific results based on that data.

[0346] "Users" refers to individuals or organizations that use this system, and in particular, to those seeking financial advice.

[0347] "Financial advice" refers to information that provides guidance and suggestions regarding a user's financial situation and investment strategy.

[0348] "Emotional information" refers to data that indicates the emotional state expressed by a user, and is derived from information analyzed from text and audio.

[0349] A "speech recognition device" is a device or software equipped with technology to analyze speech signals and convert them into text information.

[0350] A "speech generation device" is a device or software equipped with the technology to analyze textual information and output it as an audio signal.

[0351] "Past behavioral history" refers to a collection of data that records actions and history that a user has performed in the past.

[0352] "Privacy protection" refers to the means and technologies used to protect personal information from unauthorized access and misuse.

[0353] This invention is a system that provides personalized financial advice to users in real time. The system uses a generative AI model and an emotion engine to generate optimal advice tailored to the user's needs.

[0354] The user first accesses the application using a device. The user then inputs their questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. This process utilizes standard speech recognition software.

[0355] The converted text data is securely transmitted to the server using encryption technology. The server uses data processing equipment to analyze the received data. Here, a generative AI model (e.g., a large-scale language model) is used to generate personalized economic advice based on the user's input.

[0356] Simultaneously, the server uses an emotion engine to evaluate the user's emotional information. This emotional information is determined from the content of the text data and the expressions it contains. Based on this evaluation, the server adjusts the content and tone of the generated advice.

[0357] Finally, the adjusted advice is converted into audio data using a speech generator and sent to the terminal. The terminal plays the advice in audio format and simultaneously displays it in text format.

[0358] For example, if a user enters "I'm worried about how I should save for my children's education," the server uses a generative AI model to create personalized advice for this question and delivers it to the user in a reassuring tone. This process allows users to receive optimized advice based on their emotional state and past behavioral history.

[0359] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0360] Step 1:

[0361] Users access the application using their device and input questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. In this process, it receives a voice signal as input and obtains output converted into text using speech recognition technology.

[0362] Step 2:

[0363] The terminal sends the text data converted by speech recognition to the server. During this process, data encryption technology is used to securely transmit the text data to the server. The input is encrypted text data, and the output is received in a format that the server can analyze.

[0364] Step 3:

[0365] The server analyzes the received text data using a data processing device and generates personalized economic advice using a generative AI model. The input is text data, and the generative AI model performs data calculations based on this data to produce advice as output.

[0366] Step 4:

[0367] The server uses an emotion engine to evaluate user emotional information from text data. It receives text data as input, analyzes its content to identify the user's emotional state, and outputs the evaluation result.

[0368] Step 5:

[0369] The tone and content of the generated advice are adjusted based on the emotion evaluation results. This results in optimized advice that is tailored to the user's emotional state. The input consists of the advice from the generative AI model and the results of the emotion evaluation, and the output is the optimized advice.

[0370] Step 6:

[0371] The server converts the optimized advice into audio data using a speech generator and sends it to the terminal. The input is the optimized advice, and the output is audio data that is sent to the terminal.

[0372] Step 7:

[0373] The terminal plays audio data sent from the server and displays text data on the screen. Users can listen to the provided advice in audio and confirm it in text. The input is data sent from the server, and this is used to produce output in the form of audio playback and text display.
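
As a non-limiting illustration of Step 5 above, the following Python sketch takes the advice from the generative AI model and the emotion evaluation result as inputs and outputs advice with an adjusted tone. The `EmotionResult` structure, the emotion labels, the confidence threshold, and the tone prefixes are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionResult:
    label: str         # e.g. "anxious", "stressed", "neutral"
    confidence: float  # 0.0 to 1.0, as reported by the emotion engine


# Hypothetical tone prefixes keyed by emotion label.
TONE_PREFIX = {
    "anxious": "It is natural to feel uneasy about this. ",
    "stressed": "Let's take this one step at a time. ",
}


def adjust_advice(
    advice: str, emotion: EmotionResult, threshold: float = 0.6
) -> str:
    """Prepend a reassuring opening when the emotion is detected confidently."""
    if emotion.confidence < threshold:
        return advice  # low-confidence evaluations leave the advice unchanged
    return TONE_PREFIX.get(emotion.label, "") + advice
```

A prefix is the simplest form of adjustment. The same two inputs could instead be passed back to the generative AI model with an instruction to rewrite the advice in the desired tone.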

[0374] (Application Example 2)

[0375] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0376] Modern asset management requires the provision of appropriate financial advice in real time, tailored to the user's emotional state. However, conventional systems have faced challenges in accurately understanding a user's specific emotional state and generating personalized advice accordingly. Furthermore, protecting user privacy while providing advice in real time via voice input and output is also a crucial challenge.

[0377] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0378] In this invention, the server includes means for analyzing input information using a generative artificial intelligence system and generating personalized asset management suggestions for the user; means for evaluating emotional information from the user and deriving a response corresponding to the emotional state; means for processing voice input and converting it into text information; and means for providing emotionally appropriate financial advice when conducting electronic transactions. This makes it possible to achieve more appropriate asset management while simultaneously providing real-time financial advice tailored to the user's emotions and protecting their privacy.

[0379] A "generative artificial intelligence system" is an artificial intelligence technology that analyzes user input information and generates personalized asset management proposals.

[0380] "Emotional information" refers to data that indicates the emotional state of a user, and this data is used to analyze that state and derive personalized responses.

[0381] An "asset management proposal" is specific advice that presents the optimal management method based on the user's financial situation.

[0382] "Speech recognition means" refers to a technical means for receiving speech input and converting it into text information.

[0383] "Electronic transactions" refer to all acts of buying, selling, or paying for goods and services conducted through online platforms or similar means.

[0384] "Speech synthesis means" refers to a technical means for outputting the generated asset management proposal as speech.

[0385] "Privacy protection" refers to information security measures that protect users' personal information and transaction data from third parties and manage them securely.

[0386] In this invention, the user accesses the system from a terminal via voice or text input. The terminal uses speech recognition software to convert the voice-input information into text information. During this process, the voice-input information is analyzed by natural language processing software. Specifically, a speech recognition engine installed in the terminal is used for speech recognition. The converted text information is then transmitted to a server.

[0387] The server utilizes a generative artificial intelligence system to generate personalized asset management suggestions from received text information. Here, the generative AI model plays a crucial role in generating advice based on user input. The server also uses an emotion engine to assess the user's emotional state and adjust the suggestions accordingly. For example, if the server determines that the user is feeling anxious about cost reduction, it will adjust its advice to be delivered in a gentler tone.

[0388] The generated asset management proposals are sent from the server to the terminal. The system utilizes speech synthesis technology to provide the proposals as voice output. The proposals are also displayed in text format, allowing users to visually review them.

[0389] As a concrete example of this system, if a user inputs "I'm worried about how to save money on my next trip," the emotion engine detects anxiety about saving money. In response, the server generates advice such as, "Your trip is definitely worth enjoying. However, if you want to save money, consider the following approaches." This advice is provided in both voice and text formats, delivering information in a way that suits the user's needs.

[0390] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0391] Step 1:

[0392] Users input questions and inquiries via voice or text through their device. This input is provided as voice in the case of voice input, and as text data in the case of text input. Voice input is converted into text data by the device's speech recognition engine. The speech recognition engine analyzes the voice signal and generates a corresponding string, which is then output as text data.

[0393] Step 2:

[0394] The terminal sends the text data obtained as a result of speech recognition to the server. Here, the converted text data is transferred to the server side. In particular, the user's consultation content is transmitted as data for processing by the generative AI model.

[0395] Step 3:

[0396] The server analyzes the received text data and uses a generative artificial intelligence system to generate personalized asset management suggestions. In this process, the generative AI model generates prompts based on the user's questions and analyzes them to determine the best advice. The generated prompts are tailored to the user's specific needs and are used to generate subsequent responses.

[0397] Step 4:

[0398] Simultaneously, the server uses an emotion engine to evaluate emotional information from the user's text data. This involves analyzing words and phrases contained in the text to identify emotional states. For example, if negative emotions are expressed, the system may estimate the likelihood of anxiety or stress.

[0399] Step 5:

[0400] The server adjusts the tone and content of the suggestions generated based on emotional information to form the final asset management recommendations. This adjustment includes refining the suggestions to correspond to the emotional state. As output, personalized advice is provided as text data.

[0401] Step 6:

[0402] The generated suggestions are sent from the server to the terminal and provided to the user as voice output using speech synthesis technology. Furthermore, they are also displayed on the screen in text format. In this final step, the advice is output to the user's terminal in a way that is clearly visible as both voice and text.
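
As a non-limiting illustration of the emotion evaluation in Step 4 above, the following Python sketch analyzes the words contained in the text and returns a rough score for anxiety and for stress. The word lists and the scoring formula are assumptions made for this sketch. The emotion engine of the specification may instead use a trained model such as the emotion identification model 59.

```python
import re
from typing import Dict, Set

# Hypothetical lexicon: words taken as evidence of each emotional state.
LEXICON: Dict[str, Set[str]] = {
    "anxiety": {"worried", "anxious", "afraid", "nervous", "uneasy"},
    "stress": {"stressed", "overwhelmed", "pressure", "exhausted"},
}


def emotion_likelihoods(text: str) -> Dict[str, float]:
    """Return a score in [0, 1) per emotion based on matching words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores: Dict[str, float] = {}
    for emotion, words in LEXICON.items():
        hits = sum(1 for token in tokens if token in words)
        # Saturating score: no hits gives 0.0, and each extra hit adds less.
        scores[emotion] = hits / (hits + 1)
    return scores
```

For the input "I'm worried about how to save money on my next trip," the sketch returns `{"anxiety": 0.5, "stress": 0.0}`. The score is a heuristic, not a calibrated probability.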

[0403] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0404] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
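
As a non-limiting illustration of the input to the data generation model 58, the following Python sketch packages a prompt containing instructions together with optional text, audio, and image inference data into a single JSON request. The field names and the use of Base64 for binary data are assumptions made for this sketch and do not correspond to the interface of any particular generative AI service.

```python
import base64
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class InferenceRequest:
    instruction: str               # the prompt containing instructions
    text: Optional[str] = None     # text data representing text
    audio: Optional[bytes] = None  # audio data representing speech
    image: Optional[bytes] = None  # image data representing images

    def to_json(self) -> str:
        """Serialize the request; binary inference data is Base64-encoded."""
        if self.text is None and self.audio is None and self.image is None:
            raise ValueError("at least one kind of inference data is required")
        payload = {"instruction": self.instruction}
        if self.text is not None:
            payload["text"] = self.text
        for name in ("audio", "image"):
            data = getattr(self, name)
            if data is not None:
                payload[name] = base64.b64encode(data).decode("ascii")
        return json.dumps(payload, ensure_ascii=False)
```

A request for a summary of a spoken question would, for example, set `instruction` to the summarization instruction and `audio` to the recorded speech.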

[0405] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0406] [Third Embodiment]

[0407] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0408] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0409] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0410] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0411] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0412] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0413] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0414] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0415] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0416] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0417] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0418] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0419] This invention describes a system that provides personalized financial advice to users. The following explains how the server, terminal, and user interact to realize this system.

[0420] First, the user accesses the application using their device via voice or text. When the user enters a question, the device sends that input data to the server. If voice input is used, the device utilizes speech recognition to convert it into text data and sends this data to the server.

[0421] The server analyzes the received data. A generative artificial intelligence model is used to understand the user's input and extract the information necessary to generate appropriate financial advice. This model can provide highly accurate advice by referencing a large amount of pre-stored financial data and the user's behavioral history.

[0422] Furthermore, the server uses an emotional intelligence module to determine the user's emotional state from their input data. For example, if the server determines that a user's question is emotionally tense, it will construct advice in a reassuring tone and prepare an emotionally sensitive response.

[0423] The advice generated by the server is returned to the terminal as voice output using speech synthesis technology. It is also displayed on the screen as text. For example, specific suggestions such as, "Let's review this month's expenses and try reducing eating out to save money," may be given.

[0424] Users can act based on the advice provided. Because the advice is also delivered as audio, the system is particularly easy to use for senior citizens and visually impaired users.

[0425] This system protects user privacy and ensures data security by encrypting user data. Furthermore, by accumulating behavioral history, it can continuously improve the accuracy of its advice. This helps users build reliable long-term financial plans.

[0426] The following describes the processing flow.

[0427] Step 1:

[0428] The user activates the device and enters a question into the application via voice input or text input. The device then converts the voice-input data into text data using speech recognition technology.

[0429] Step 2:

[0430] The terminal packetizes the text data entered by the user and sends it to the server via the internet. The user ID and session information are sent along with the packet.

[0431] Step 3:

[0432] The server receives data packets from the terminal. The received data first undergoes a security check to ensure its safety. Then, a natural language processing module is used to analyze the data and identify the intent and content of the question.

[0433] Step 4:

[0434] Based on the analysis results, the server generates personalized financial advice for the user using a generative artificial intelligence model. This process references the user's behavioral history and previously learned financial data.

[0435] Step 5:

[0436] The server uses an emotional intelligence module to analyze the user's emotional data. It identifies emotional tension and stress from the questions asked and, if necessary, constructs empathetic advice that addresses those emotions.

[0437] Step 6:

[0438] The advice generated on the server is formatted into a user-friendly format and sent to the terminal as encrypted data.

[0439] Step 7:

[0440] The terminal receives the response from the server, decrypts the data, and displays it in the application's user interface. Simultaneously, it outputs the advice as voice using speech synthesis technology.

[0441] Step 8:

[0442] Users read the advice displayed on the screen or listen to it aloud, and take action as needed. During this process, the device updates the user's behavior history, which is used to improve the accuracy of advice for future interactions.
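
Steps 2 and 3 above can be sketched as follows. This is a minimal illustration only: the source does not specify a packet layout or what the security check involves, so the JSON format, the field names (user_id, session_id, text), the length limit, and the functions build_packet and check_packet are all assumptions introduced here.

```python
import json
from dataclasses import dataclass, asdict

MAX_TEXT_LENGTH = 2000  # assumed limit, not taken from the source


@dataclass
class RequestPacket:
    """Hypothetical packet layout; the field names are illustrative."""
    user_id: str
    session_id: str
    text: str


def build_packet(user_id: str, session_id: str, text: str) -> bytes:
    """Terminal side (Step 2): bundle the text with user ID and session info."""
    packet = RequestPacket(user_id=user_id, session_id=session_id, text=text)
    return json.dumps(asdict(packet)).encode("utf-8")


def check_packet(raw: bytes) -> RequestPacket:
    """Server side (Step 3): a stand-in security check before analysis.

    Rejects malformed JSON, unexpected or non-string fields, and text that
    is empty or longer than MAX_TEXT_LENGTH.
    """
    try:
        data = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("malformed packet") from exc
    if not isinstance(data, dict):
        raise ValueError("packet must be a JSON object")
    expected = {"user_id", "session_id", "text"}
    if set(data) != expected:
        raise ValueError("unexpected packet fields")
    if not all(isinstance(data[key], str) for key in expected):
        raise ValueError("packet fields must be strings")
    if not data["text"].strip() or len(data["text"]) > MAX_TEXT_LENGTH:
        raise ValueError("text is empty or too long")
    return RequestPacket(**data)
```

In this sketch the server would pass the text of the validated packet on to the natural language processing module only after check_packet returns without raising.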

[0443] (Example 1)

[0444] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0445] Financial management and asset building in modern society vary greatly depending on individual circumstances and emotions, thus requiring personalized advice. However, conventional systems have problems in that they have difficulty taking into account the emotional state of users and do not adequately protect user privacy or utilize their behavioral history. This invention aims to address these issues and realize a system that provides highly accurate, personalized advice.

[0446] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0447] In this invention, the server includes means for analyzing information using a generative artificial intelligence model and generating personalized advice, means for analyzing emotional information from the user and deriving a response corresponding to the emotional state, and acoustic recognition means for processing acoustic input and converting it into textual information. This makes it possible to provide accurate advice that is appropriate to the user's situation and emotions.

[0448] A "generative artificial intelligence model" is a machine learning algorithm that uses a large amount of pre-trained data to produce appropriate outputs for new input data.

[0449] "User" refers to an individual or legal entity that uses the system to input information and receive advice.

[0450] "Emotional information" refers to data that expresses the user's emotions and psychological state.

[0451] "Audio input" refers to audio signals that are input into the system via a microphone or similar device.

[0452] "Textual information" refers to a format in which analyzed data is represented using text.

[0453] An "emotional intelligence module" is a software component that analyzes the user's emotional state and generates appropriate responses based on that analysis.

[0454] "Sound generation means" refers to the technology and process for converting text data into speech and transmitting it to the user.

[0455] "Behavioral history" refers to records of actions and inputs made by users in the past, and is data used to improve the accuracy of future suggestions.

[0456] "Digitized information encryption" refers to a technology that protects data to safeguard user privacy, and is a method of transforming information using a specific algorithm.

[0457] This invention describes a specific embodiment of a system that provides personalized advice to users. The user accesses the system via voice or text using their own terminal and inputs a question or request for advice. The terminal transmits this information to a server via an internet connection. If voice input is provided, the terminal utilizes acoustic recognition capabilities to convert the voice signal into text information.

[0458] The server uses a generative artificial intelligence model to analyze the input text information. Based on a large amount of pre-trained data, the generative AI model understands the user's question, extracts relevant information, and generates appropriate advice. In this process, the user's emotional information is also analyzed through an emotional intelligence module, and a response appropriate to their emotional state is derived.

[0459] The generated advice is returned to the terminal as an audible output via an acoustic generation device. It is also displayed on the terminal screen as text information. This system is designed to efficiently process the user's acoustic input and provide optimal responses tailored to the user's language habits and emotions. The user's behavioral history is also accumulated and considered when providing personalized advice during subsequent consultations.

[0460] As a concrete example, consider a scenario in which a user enters a text prompt asking, "How can I increase my savings?" The system uses a generative artificial intelligence model to analyze this prompt and generate specific advice, such as, "Try setting up an automatic savings plan that sets aside a certain percentage of your monthly income," which is then communicated to the user via both voice and text. In this way, the system provides the user with guidance for achieving their financial goals.

[0461] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0462] Step 1:

[0463] The user inputs questions using their device via voice or text. Specific user actions include typing prompts such as "How should I manage my expenses this month?" or asking questions verbally. If the input is spoken, the device converts it into text using its acoustic recognition function.

[0464] Step 2:

[0465] The terminal sends the converted character information to the server. Specifically, this data is transferred to the server via the internet connection. The input data is character information provided by the user, and this information is used for analysis on the server.

[0466] Step 3:

[0467] The server analyzes this textual information using a generative artificial intelligence model. The server extracts important keywords and related information from the received text and generates appropriate advice. This data processing results in optimal financial advice based on the user's question.

[0468] Step 4:

[0469] The server analyzes the user's emotional information through an emotional intelligence module. This analysis evaluates the emotional state and generates a response with a corresponding tone and content. Specifically, if the user's input indicates tension or anxiety, the server generates reassuring language.

[0470] Step 5:

[0471] The server converts the generated advice into audio data using an audio generation device and sends it to the terminal. The output data is provided as audio output. Specifically, advice such as "First, I recommend listing your expenses and reviewing them" is returned in audio format.

[0472] Step 6:

[0473] The terminal plays audio data provided by the server and simultaneously displays it as text. The user can then take action based on this. Through the outputted audio and text information, the user receives personalized advice and appropriate guidance for action.
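
The keyword extraction mentioned in Step 3 could look roughly like the sketch below. The source assigns this work to a generative artificial intelligence model and gives no algorithm, so the frequency count used here is a deliberately simple stand-in, and the stopword list and the function name extract_keywords are assumptions made for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption); a real system would
# use a fuller list or leave this step to the model itself.
STOPWORDS = {
    "a", "an", "the", "i", "my", "me", "how", "should", "can", "do",
    "this", "to", "of", "for", "in", "on", "is", "are", "what",
}


def extract_keywords(text: str, top_n: int = 3) -> list[str]:
    """Return up to top_n non-stopword tokens, most frequent first.

    Tokens with equal counts keep their order of first appearance.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(token for token in tokens if token not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    print(extract_keywords("How should I manage my expenses this month?"))
```

The extracted keywords would then be one possible input to the advice generation described in Step 3.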

[0474] (Application Example 1)

[0475] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0476] In today's consumer society, many individuals are required to efficiently manage their spending and improve their financial situation. However, traditional methods make it difficult to obtain specific, real-time advice tailored to individual financial circumstances. Furthermore, there is a lack of systems that provide advice that takes into account specific emotional states. Therefore, there is a need for a system that allows users to manage their spending in a sustainable way without undue burden.

[0477] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0478] In this invention, the server includes means for analyzing input information using a generative artificial intelligence model and generating personalized financial advice for the user, means for analyzing emotional information from the user and deriving a response corresponding to the emotional state, and speech recognition means for processing voice input and converting it into text information. As a result, the user can receive real-time spending management advice tailored to their individual financial situation, as well as appropriate advice that takes their emotions into consideration.

[0479] A "generative artificial intelligence model" is an advanced computational model used to learn from large amounts of data and perform analysis and prediction of new information.

[0480] "Input information" refers to data provided by the user, which is incorporated into the system as audio or text.

[0481] "Personalized financial advice" refers to specific financial management advice created individually based on each user's specific circumstances and behavioral history.

[0482] "Emotional information" refers to data that indicates a user's emotional state, and includes information that reflects emotional elements in questions and statements.

[0483] "Speech recognition means" refers to technology for analyzing speech data and converting it into text information.

[0484] "Speech synthesis means" refers to a technology that converts text information into speech data and provides the information to the user as speech.

[0485] "Real-time spending data analysis" is a process that instantly analyzes a user's latest spending data and generates advice based on that data.

[0486] "Information encryption methods" are technologies used to protect data from third parties and are techniques for securely protecting information.

[0487] The system that realizes this invention operates in cooperation with a user, a terminal, and a server. The user accesses the application using voice or text via a terminal such as a smartphone. First, when the user enters a financial question, the terminal sends the input data to the server. In the case of voice input, the terminal uses speech recognition technology to convert the voice data into text information and sends it to the server. The speech recognition technology used in this process is a commonly used cloud-based speech recognition service.

[0488] The server uses a generative artificial intelligence model to analyze the user input it receives. This model references a large body of accumulated financial data and, taking the user's own behavioral history into account, generates personalized financial advice. For example, it can analyze this month's spending and provide advice such as, "Your spending is high this month due to frequent eating out. If you limit eating out to twice before your next payday, your savings will increase."
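
The kind of spending analysis described above can be illustrated with a small sketch. It is not the method of the source, which relies on a generative artificial intelligence model; it only shows, under assumed names (Expense, total_by_category, suggest_reduction) and an assumed 30% threshold, how a dominant spending category might be identified before advice is worded.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Expense:
    category: str
    amount: float


def total_by_category(expenses: list[Expense]) -> dict[str, float]:
    """Sum the amounts spent in each category."""
    totals: dict[str, float] = defaultdict(float)
    for expense in expenses:
        totals[expense.category] += expense.amount
    return dict(totals)


def suggest_reduction(expenses: list[Expense],
                      threshold_share: float = 0.3) -> Optional[str]:
    """Return a suggestion if one category dominates this month's spending.

    threshold_share is an assumed cut-off, not a value from the source.
    """
    totals = total_by_category(expenses)
    grand_total = sum(totals.values())
    if grand_total <= 0:
        return None
    category, amount = max(totals.items(), key=lambda item: item[1])
    share = amount / grand_total
    if share < threshold_share:
        return None
    return (f"Spending on {category} makes up {share:.0%} of this month's "
            f"total. Consider cutting back there first.")
```

A generative model could then rephrase such a finding into advice like the eating-out example quoted above.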

[0489] Furthermore, the server analyzes the user's emotional information and uses emotional intelligence to understand their emotional state. If it determines that the user is in an emotionally distressed state, it will construct advice in a tone that is sensitive to their feelings and strive to reassure the user.

[0490] The generated advice is sent to the device as audio using speech synthesis technology and played back to the user. It is also displayed as text on the device screen, making it easy for users to understand the advice in any situation. A general-purpose speech synthesis engine is used for the speech synthesis technology.

[0491] To protect the user's privacy, user data is securely encrypted on the server to prevent unauthorized access by third parties. This allows the user to use the system with peace of mind.

[0492] Examples of prompts include specific questions based on the user's financial situation and goals, such as, "Please analyze my spending trends over the past three months and tell me where I can save money," or "Where can I cut back this month to reach my current savings goal?"

[0493] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0494] Step 1:

[0495] The user enters financial-related questions into the terminal via voice or text. If voice input is used, the terminal uses speech recognition to convert the voice data into text data. In this process, the voice data is input into the speech recognition engine, and the resulting text data is output.

[0496] Step 2:

[0497] The terminal sends the converted text data to the server. The server analyzes the received input data and performs the preprocessing needed for analysis. Specifically, it removes unnecessary spaces and special characters and formats the prompt text so that it is suitable for the generative AI model.

[0498] Step 3:

[0499] The server utilizes a generative artificial intelligence model to generate personalized financial advice based on analyzed input data. The model receives formatted data as prompts and outputs personalized financial advice text. This output is based on pre-trained financial data and behavioral history.

[0500] Step 4:

[0501] The server uses emotional intelligence technology to analyze emotional information from user input. The emotion analysis engine receives the user's text data and outputs data indicating the emotional state. Based on this data, it constructs advice with a tone and content that takes the user's emotional state into consideration.

[0502] Step 5:

[0503] The generated financial advice is converted into audio data using speech synthesis technology. The server inputs the advice's text data into the speech synthesis engine and outputs it as audio data. This process converts the advice into a format that is easy for the user to understand.

[0504] Step 6:

[0505] The server sends the generated audio and text advice to the terminal. The terminal receives this and plays it back to the user as audio and displays the text on the screen. This allows the user to confirm the advice visually and audibly.

[0506] Step 7:

[0507] The server updates the user's behavior history and stores it in the database to improve the accuracy of future advice. The update process tracks the advice received by the user and their responses, contributing to continuous system improvement.
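
The preprocessing in Step 2 (removing unnecessary spaces and special characters, then formatting the prompt) might be sketched as follows. The set of characters kept, the prompt template, and the function names preprocess and build_prompt are assumptions; the source does not define them.

```python
import re


def preprocess(text: str) -> str:
    """Drop characters outside a small allowed set, then collapse whitespace.

    Allowed here (an assumption): word characters, whitespace, and a few
    punctuation marks that commonly appear in financial questions.
    """
    cleaned = re.sub(r"[^\w\s.,?!'%$-]", "", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def build_prompt(question: str, history: list[str]) -> str:
    """Format a prompt for a generative model from the cleaned question.

    The template is hypothetical and only illustrates the idea of combining
    the question with the user's behavioral history.
    """
    lines = ["You are a financial advice assistant.", "Behavioral history:"]
    if history:
        lines.extend(f"- {item}" for item in history)
    else:
        lines.append("- (none)")
    lines.append(f"Question: {preprocess(question)}")
    return "\n".join(lines)
```

Keeping the cleaning step separate from the template makes it possible to change either one without touching the other.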

[0508] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0509] This invention illustrates a specific embodiment of a system that provides personalized financial advice to users using a generative artificial intelligence model and an emotion engine. This system analyzes the user's input, recognizes their emotional state, and then generates and provides optimal advice.

[0510] First, the user accesses the application using their device via voice or text. When the user enters a question or inquiry, the input data is sent from the device to the server. In the case of voice input, the device converts the voice data into text using its speech recognition function and sends it to the server.

[0511] The server analyzes the received text data. A generative artificial intelligence model processes this data to create personalized financial advice based on the user's input. Simultaneously, an emotion engine analyzes the user's text data and voice characteristics to assess the user's emotional state. Based on this assessment, the content and tone of the advice are adjusted. For example, if the server determines that the user is stressed, it will be configured to provide gentle and encouraging advice.

[0512] The generated advice is sent to the terminal as voice output via speech synthesis and simultaneously displayed on the screen as text. Specifically, if the user inputs "I feel anxious about future investments," the emotion engine detects the anxiety, and the server provides reassuring advice such as "Don't worry. Considering this investment strategy, the risks are managed."

[0513] Furthermore, this system saves the user's behavioral history and draws on it when providing advice in future interactions, so that its responses to the user are continuously optimized. It also protects user privacy and ensures secure data transmission through data encryption.

[0514] In this way, it becomes possible to provide optimal financial advice tailored to the user's situation and emotions in real time, thereby realizing more reliable personalized support.

[0515] The following describes the processing flow.

[0516] Step 1:

[0517] Users access the application through their device and input financial questions or concerns via voice or text. In the case of voice input, the device's voice recognition function converts the voice into text data.

[0518] Step 2:

[0519] The terminal bundles the user's input data into packets and sends them to the server using a secure protocol. User ID and session information are also transmitted during this process.

[0520] Step 3:

[0521] The server first performs a security check on the data received from the terminal to confirm its safety. Next, it uses a natural language processing module to analyze the input data and extract the information it contains.

[0522] Step 4:

[0523] The server uses a generative artificial intelligence model to generate personalized financial advice based on the extracted information. This process also takes into account the user's behavioral history and past advice.

[0524] Step 5:

[0525] The server uses an emotion engine to analyze the user's emotional state from their input. It detects emotions such as stress, relief, and anxiety from the user's vocabulary and voice characteristics, and adjusts the tone and content of advice based on the results.

[0526] Step 6:

[0527] The tailored advice is converted to a speech format via a speech synthesis module and sent to the device along with the text. Here too, the data is encrypted to ensure privacy.

[0528] Step 7:

[0529] The terminal decrypts the data received from the server and displays the advice as text on the user interface, while also providing a voice-over function.

[0530] Step 8:

[0531] Users review the advice provided and incorporate it into their future actions. At this time, their reactions and new inputs are recorded as part of their behavioral history and used to improve the accuracy of future advice.
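
The behavioral history recorded in Step 8 can be pictured with the sketch below. The source does not describe how the history is stored, so the in-memory store, the Interaction fields, and the class name BehaviorHistory are assumptions; a real server would presumably use a database and encrypt the stored records.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Interaction:
    """One recorded exchange; the fields are illustrative assumptions."""
    question: str
    advice: str
    reaction: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class BehaviorHistory:
    """In-memory stand-in for the server-side history store."""

    def __init__(self) -> None:
        self._records: dict[str, list[Interaction]] = defaultdict(list)

    def record(self, user_id: str, question: str, advice: str,
               reaction: str) -> None:
        """Append one interaction to the user's history."""
        self._records[user_id].append(Interaction(question, advice, reaction))

    def recent(self, user_id: str, limit: int = 5) -> list[Interaction]:
        """Return up to `limit` of the user's latest interactions, oldest first."""
        if limit <= 0:
            return []
        return list(self._records.get(user_id, [])[-limit:])
```

The result of recent could be supplied to the generative model as context when the same user asks a later question.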

[0532] (Example 2)

[0533] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0534] Traditional financial advice systems often fail to adequately consider a user's individual emotional state or past behavioral history, potentially resulting in inadequate advice. Furthermore, they lack sufficient user privacy protection, raising concerns about data security. These issues lead to users lacking reliable support.

[0535] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0536] In this invention, the server includes means for analyzing input information using a data processing device and generating personalized economic advice for the user, means for evaluating emotional information from the user and adjusting the response based on that evaluation, and a speech recognition device for processing voice input and converting it into text information. This makes it possible to provide optimized economic advice based on the user's individual emotional state and past behavioral history.

[0537] A "data processing device" is a device used to analyze input information and generate specific results based on that data.

[0538] "Users" refers to individuals or organizations that use this system, and in particular, to those seeking financial advice.

[0539] "Financial advice" refers to information that provides guidance and suggestions regarding a user's financial situation and investment strategy.

[0540] "Emotional information" refers to data that indicates the emotional state expressed by a user, and is derived from information analyzed from text and audio.

[0541] A "speech recognition device" is a device or software equipped with technology to analyze speech signals and convert them into text information.

[0542] A "speech generation device" is a device or software equipped with the technology to analyze textual information and output it as an audio signal.

[0543] "Past behavioral history" refers to a collection of data that records actions and history that a user has performed in the past.

[0544] "Privacy protection" refers to the means and technologies used to protect personal information from unauthorized access and misuse.

[0545] This invention is a system that provides personalized financial advice to users in real time. The system uses a generative AI model and an emotion engine to generate optimal advice tailored to the user's needs.

[0546] The user first accesses the application using a device. The user then inputs their questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. This process utilizes standard speech recognition software.

[0547] The converted text data is securely transmitted to the server using encryption technology. The server uses data processing equipment to analyze the received data. Here, a generative AI model (e.g., a large-scale language model) is used to generate personalized economic advice based on the user's input.

[0548] Simultaneously, the server uses an emotion engine to evaluate the user's emotional information. This emotional information is determined from the content of the text data and the expressions it contains. Based on this evaluation, the server adjusts the content and tone of the generated advice.

[0549] Finally, the adjusted advice is converted into audio data using a speech generator and sent to the terminal. The terminal plays the advice in audio format and simultaneously displays it in text format.

[0550] For example, if a user enters "I'm worried about how I should save for my children's education," the server uses a generative AI model to create personalized advice for this question and delivers it to the user in a reassuring tone. This process allows users to receive optimized advice based on their emotional state and past behavioral history.

[0551] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0552] Step 1:

[0553] Users access the application using their device and input questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. In this process, it receives a voice signal as input and obtains output converted into text using speech recognition technology.

[0554] Step 2:

[0555] The terminal sends the text data converted by speech recognition to the server. During this process, data encryption technology is used to securely transmit the text data to the server. The input is encrypted text data, and the output is received in a format that the server can analyze.

[0556] Step 3:

[0557] The server analyzes the received text data using a data processing device and generates personalized economic advice using a generative AI model. The input is text data, and the generative AI model performs data calculations based on this data to produce advice as output.

[0558] Step 4:

[0559] The server uses an emotion engine to evaluate user emotional information from text data. It receives text data as input, analyzes its content to identify the user's emotional state, and outputs the evaluation result.

[0560] Step 5:

[0561] The tone and content of the generated advice are adjusted based on the emotion evaluation results. This results in optimized advice that is tailored to the user's emotional state. The input consists of the advice from the generative AI model and the results of the emotion evaluation, and the output is the optimized advice.

[0562] Step 6:

[0563] The server converts the optimized advice into audio data using a speech generator and sends it to the terminal. The input is the optimized advice, and the output is the audio data sent to the terminal.

[0564] Step 7:

[0565] The terminal plays audio data sent from the server and displays text data on the screen. Users can listen to the provided advice in audio and confirm it in text. The input is data sent from the server, and this is used to produce output in the form of audio playback and text display.
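
Step 5 takes two inputs, the advice and the emotion evaluation result, and produces adjusted advice. A minimal sketch of that interface is shown below. The emotion labels, the reassuring phrases, and the function name adjust_advice are assumptions; in the system described, the adjustment would more plausibly be carried out by the generative AI model than by fixed phrases.

```python
from enum import Enum


class Emotion(Enum):
    """Illustrative emotion labels; the source does not enumerate them."""
    NEUTRAL = "neutral"
    ANXIOUS = "anxious"
    STRESSED = "stressed"


# Assumed reassuring openers keyed by the evaluated emotion.
PREFIXES = {
    Emotion.NEUTRAL: "",
    Emotion.ANXIOUS: "It is understandable to feel uneasy about this. ",
    Emotion.STRESSED: "There is no need to rush. ",
}


def adjust_advice(advice: str, emotion: Emotion) -> str:
    """Combine generated advice with a tone that matches the emotion."""
    return PREFIXES[emotion] + advice


if __name__ == "__main__":
    print(adjust_advice("Start by listing your monthly expenses.",
                        Emotion.ANXIOUS))
```

Only the tone changes in this sketch; the substance of the advice is passed through unmodified.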

[0566] (Application Example 2)

[0567] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0568] Modern asset management requires the provision of appropriate financial advice in real time, tailored to the user's emotional state. However, conventional systems have faced challenges in accurately understanding a user's specific emotional state and generating personalized advice accordingly. Furthermore, protecting user privacy while providing advice in real time via voice input and output is also a crucial challenge.

[0569] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0570] In this invention, the server includes means for analyzing input information using a generative artificial intelligence system and generating personalized asset management suggestions for the user; means for evaluating emotional information from the user and deriving a response corresponding to the emotional state; means for processing voice input and converting it into text information; and means for providing emotionally appropriate financial advice when conducting electronic transactions. This makes it possible to achieve more appropriate asset management while simultaneously providing real-time financial advice tailored to the user's emotions and protecting their privacy.

[0571] A "generative artificial intelligence system" is an artificial intelligence technology that analyzes user input information and generates personalized asset management proposals.

[0572] "Emotional information" refers to data that indicates the emotional state of a user, and this data is used to analyze that state and derive personalized responses.

[0573] An "asset management proposal" is specific advice that presents the optimal management method based on the user's financial situation.

[0574] "Speech recognition means" refers to a technical means for receiving speech input and converting it into text information.

[0575] "Electronic transactions" refer to all acts of buying, selling, or paying for goods and services conducted through online platforms or similar means.

[0576] "Speech synthesis means" refers to a technical means for outputting the generated asset management proposal as speech.

[0577] "Privacy protection" refers to information security measures that protect users' personal information and transaction data from third parties and manage them securely.

[0578] In this invention, the user accesses the system from a terminal via voice or text input. The terminal uses speech recognition software to convert the voice-input information into text information. During this process, the voice-input information is analyzed by natural language processing software. Specifically, a speech recognition engine installed in the terminal is used for speech recognition. The converted text information is then transmitted to a server.

[0579] The server utilizes a generative artificial intelligence system to generate personalized asset management suggestions from received text information. Here, the generative AI model plays a crucial role in generating advice based on user input. The server also uses an emotion engine to assess the user's emotional state and adjust the suggestions accordingly. For example, if the server determines that the user is feeling anxious about cost reduction, it will adjust its advice to be delivered in a gentler tone.

[0580] The generated asset management proposals are sent from the server to the terminal. The system utilizes speech synthesis technology to provide the proposals as voice output. The proposals are also displayed in text format, allowing users to visually review them.

[0581] As a concrete example of this system, if a user inputs "I'm worried about how to save money on my next trip," the emotion engine detects anxiety about saving money. In response, the server generates advice such as, "Your trip is definitely worth enjoying. However, if you want to save money, consider the following approaches." This advice is provided in both voice and text formats, delivering information in a way that suits the user's needs.

[0582] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0583] Step 1:

[0584] Users input questions and inquiries via voice or text through their device. This input is provided as voice in the case of voice input, and as text data in the case of text input. Voice input is converted into text data by the device's speech recognition engine. The speech recognition engine analyzes the voice signal and generates a corresponding string, which is then output as text data.

[0585] Step 2:

[0586] The terminal sends the text data obtained as a result of speech recognition to the server. Here, the converted text data is transferred to the server side. In particular, the user's consultation content is transmitted as data for processing by the generative AI model.

[0587] Step 3:

[0588] The server analyzes the received text data and uses a generative artificial intelligence system to generate personalized asset management suggestions. In this process, the generative AI model generates prompts based on the user's questions and analyzes them to determine the best advice. The generated prompts are tailored to the user's specific needs and are used to generate subsequent responses.

[0589] Step 4:

[0590] Simultaneously, the server uses an emotion engine to evaluate emotional information from the user's text data. This involves analyzing words and phrases contained in the text to identify emotional states. For example, if negative emotions are expressed, the system may estimate the likelihood of anxiety or stress.

[0591] Step 5:

[0592] The server adjusts the tone and content of the suggestions generated based on emotional information to form the final asset management recommendations. This adjustment includes refining the suggestions to correspond to the emotional state. As output, personalized advice is provided as text data.

[0593] Step 6:

[0594] The generated suggestions are sent from the server to the terminal and provided to the user as voice output using speech synthesis technology. Furthermore, they are also displayed on the screen in text format. In this final step, the advice is output to the user's terminal in a way that is clearly visible as both voice and text.
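Steps 4 and 5 above can be illustrated with a minimal sketch. The specification does not disclose how the emotion engine works, so the cue-word lexicon, the function names `estimate_emotion` and `adjust_tone`, and the 0.5 threshold below are all assumptions made purely for illustration.

```python
import re

# Hypothetical cue-word lexicon; the specification does not define one.
NEGATIVE_CUES = {
    "worried": "anxiety",
    "anxious": "anxiety",
    "afraid": "anxiety",
    "stressed": "stress",
    "overwhelmed": "stress",
}


def estimate_emotion(text: str) -> dict:
    """Return a rough share per emotion label based on cue-word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = {}
    for word in words:
        label = NEGATIVE_CUES.get(word)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def adjust_tone(advice: str, emotions: dict, threshold: float = 0.5) -> str:
    """Prepend a reassuring sentence when anxiety or stress dominates."""
    if max(emotions.get("anxiety", 0.0), emotions.get("stress", 0.0)) >= threshold:
        return "It is understandable to feel uneasy. " + advice
    return advice
```

A real emotion engine would more plausibly be a trained classifier; the lexicon approach is used here only because it keeps the data flow between the two steps easy to follow.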

[0595] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0596] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
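As a rough illustration of the input described in this paragraph, the following sketch pairs an instruction with optional inference data of each kind. The class name `InferenceInput` and its fields are hypothetical and do not correspond to any interface disclosed in the specification or to the API of any particular generative AI service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InferenceInput:
    """Hypothetical container: a prompt instruction plus inference data."""
    instruction: str                 # what the model is asked to do
    text: Optional[str] = None       # text data representing text
    audio: Optional[bytes] = None    # audio data representing speech
    image: Optional[bytes] = None    # image data representing images

    def modalities(self) -> List[str]:
        """List which kinds of inference data are present."""
        present = []
        if self.text is not None:
            present.append("text")
        if self.audio is not None:
            present.append("audio")
        if self.image is not None:
            present.append("image")
        return present


request = InferenceInput(
    instruction="Summarize the user's financial question.",
    text="How can I increase my savings?",
)
```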

[0597] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0598] [Fourth Embodiment]

[0599] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0600] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0601] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0602] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0603] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0604] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0605] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0606] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
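One way to picture this control is a lookup from an emotion label to LED and motor settings. The sketch below is only an assumption: the specification does not give emotion labels, LED states, or motor angles, so `Expression`, `EXPRESSIONS`, and every value in them are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    eye_led: str        # hypothetical illumination state of the eye LEDs
    arm_angle_deg: int  # hypothetical target angle for the arm motors


# Hypothetical mapping from emotion label to physical expression.
EXPRESSIONS = {
    "joy": Expression(eye_led="bright", arm_angle_deg=60),
    "concern": Expression(eye_led="dim", arm_angle_deg=15),
}
NEUTRAL = Expression(eye_led="steady", arm_angle_deg=0)


def express(emotion: str) -> Expression:
    """Return the settings for an emotion, falling back to a neutral pose."""
    return EXPRESSIONS.get(emotion, NEUTRAL)
```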

[0607] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0608] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0609] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0610] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0611] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0612] This invention describes a system that provides personalized financial advice to users. The following explains how the server, terminal, and user interact to realize this system.

[0613] First, the user accesses the application using their device via voice or text. When the user enters a question, the device sends that input data to the server. If voice input is used, the device utilizes speech recognition to convert it into text data and sends this data to the server.

[0614] The server analyzes the received data. A generative artificial intelligence model is used to understand the user's input and extract the information necessary to generate appropriate financial advice. This model can provide highly accurate advice by referencing a large amount of pre-stored financial data and the user's behavioral history.

[0615] Furthermore, the server uses an emotional intelligence module to determine the user's emotional state from their input data. For example, if the server determines that a user's question is emotionally tense, it will construct advice in a reassuring tone and prepare an emotionally sensitive response.

[0616] The advice generated by the server is returned to the terminal as voice output using speech synthesis technology. It is also displayed on the screen as text. For example, specific suggestions such as, "Let's review this month's expenses and try reducing eating out to save money," may be given.

[0617] Users can act on the advice provided. Because the advice is also delivered as audio, the system is particularly easy to use for senior citizens and visually impaired users.

[0618] This system protects user privacy and ensures data security by encrypting user data. Furthermore, by accumulating behavioral history, it can continuously improve the accuracy of its advice. This helps users build reliable long-term financial plans.

[0619] The following describes the processing flow.

[0620] Step 1:

[0621] The user activates the device and enters a question into the application via voice input or text input. The device then converts the voice-input data into text data using speech recognition technology.

[0622] Step 2:

[0623] The terminal packetizes the text data entered by the user and sends it to the server via the internet. The user ID and session information are also sent along with the packet.

[0624] Step 3:

[0625] The server receives data packets from the terminal. The received data first undergoes a security check to ensure its safety. Then, a natural language processing module is used to analyze the data and identify the intent and content of the question.

[0626] Step 4:

[0627] Based on the analysis results, the server generates personalized financial advice for the user using a generative artificial intelligence model. This process references the user's behavioral history and previously learned financial data.

[0628] Step 5:

[0629] The server uses an emotional intelligence module to analyze the user's emotional data. It identifies emotional tension and stress from the questions asked and, if necessary, constructs empathetic advice that addresses those emotions.

[0630] Step 6:

[0631] The advice generated on the server is formatted into a user-friendly format and sent to the terminal as encrypted data.

[0632] Step 7:

[0633] The terminal receives a response from the server, decrypts the data, and displays it in the application's user interface. Simultaneously, it outputs the advice as voice using speech synthesis technology.

[0634] Step 8:

[0635] Users read the advice displayed on the screen or listen to it aloud, and take action as needed. During this process, the device updates the user's behavior history, which is used to improve the accuracy of advice for future interactions.
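Steps 2 and 3 of this flow can be sketched as follows. The JSON layout, the field names, the length limit, and the functions `build_packet` and `security_check` are assumptions for illustration; the specification does not define a packet format or say what the security check verifies. Transport encryption is deliberately omitted and would in practice be handled by a secure protocol.

```python
import json

# Hypothetical required fields, following the user ID and session
# information mentioned in Step 2.
REQUIRED_FIELDS = ("user_id", "session_id", "text")


def build_packet(user_id: str, session_id: str, text: str) -> bytes:
    """Terminal side: bundle the question with user ID and session info."""
    payload = {"user_id": user_id, "session_id": session_id, "text": text}
    return json.dumps(payload).encode("utf-8")


def security_check(raw: bytes, max_len: int = 2000) -> dict:
    """Server side: reject malformed, incomplete, or oversized packets."""
    try:
        payload = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("malformed packet") from exc
    if not isinstance(payload, dict):
        raise ValueError("packet must be a JSON object")
    for name in REQUIRED_FIELDS:
        value = payload.get(name)
        if not isinstance(value, str) or not value:
            raise ValueError(f"missing or invalid field: {name}")
    if len(payload["text"]) > max_len:
        raise ValueError("text too long")
    return payload
```

Only after `security_check` succeeds would the text be handed to the natural language processing module described in Step 3.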

[0636] (Example 1)

[0637] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0638] Financial management and asset building in modern society vary greatly depending on individual circumstances and emotions, thus requiring personalized advice. However, conventional systems have problems in that they have difficulty taking into account the emotional state of users and do not adequately protect user privacy or utilize their behavioral history. This invention aims to address these issues and realize a system that provides highly accurate, personalized advice.

[0639] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0640] In this invention, the server includes means for analyzing information using a generative artificial intelligence model and generating personalized advice, means for analyzing emotional information from the user and deriving a response corresponding to the emotional state, and acoustic recognition means for processing acoustic input and converting it into textual information. This makes it possible to provide accurate advice that is appropriate to the user's situation and emotions.

[0641] A "generative artificial intelligence model" is a machine learning algorithm that uses a large amount of pre-trained data to produce appropriate outputs for new input data.

[0642] "User" refers to an individual or legal entity that uses the system to input information and receive advice.

[0643] "Emotional information" refers to data that expresses the user's emotions and psychological state.

[0644] "Audio input" refers to audio signals that are input into the system via a microphone or similar device.

[0645] "Textual information" refers to a format in which analyzed data is represented using text.

[0646] An "emotional intelligence module" is a software component that analyzes the user's emotional state and generates appropriate responses based on that analysis.

[0647] "Sound generation means" refers to the technology and process for converting text data into speech and transmitting it to the user.

[0648] "Behavioral history" refers to records of actions and inputs made by users in the past, and is data used to improve the accuracy of future suggestions.

[0649] "Digitized information encryption" refers to a technology that protects data to safeguard user privacy, and is a method of transforming information using a specific algorithm.

[0650] This invention describes a specific embodiment of a system that provides personalized advice to users. The user accesses the system via voice or text using their own terminal and inputs a question or request for advice. The terminal transmits this information to a server via an internet connection. If voice input is provided, the terminal utilizes acoustic recognition capabilities to convert the voice signal into text information.

[0651] The server uses a generative artificial intelligence model to analyze the input text information. Based on a large amount of pre-trained data, the generative AI model understands the user's question, extracts relevant information, and generates appropriate advice. In this process, the user's emotional information is also analyzed through an emotional intelligence module, and a response appropriate to their emotional state is derived.

[0652] The generated advice is returned to the terminal as an audible output via an acoustic generation device. It is also displayed on the terminal screen as text information. This system is designed to efficiently process the user's acoustic input and provide optimal responses tailored to the user's language habits and emotions. The user's behavioral history is also accumulated and considered when providing personalized advice during subsequent consultations.

[0653] As a concrete example, consider a scenario where a user enters a text prompt asking, "How can I increase my savings?" The system uses a generative artificial intelligence model to analyze this prompt and generate specific advice, such as, "Try setting up an automatic savings plan that sets aside a certain percentage of your monthly income," which is then communicated to the user via both voice and text. In this way, the system provides the user with guidance for achieving their financial goals.

[0654] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0655] Step 1:

[0656] The user inputs questions using their device via voice or text. Specific user actions include typing prompts such as "How should I manage my expenses this month?" or asking questions verbally. Based on this input, the device converts the voice input into text using its acoustic recognition function.

[0657] Step 2:

[0658] The terminal sends the converted textual information to the server. Specifically, this data is transferred to the server via the internet connection. The input data is the textual information provided by the user, and this information is used for analysis on the server.

[0659] Step 3:

[0660] The server analyzes this textual information using a generative artificial intelligence model. The server extracts important keywords and related information from the received text and generates appropriate advice. This data processing results in optimal financial advice based on the user's question.

[0661] Step 4:

[0662] The server analyzes the user's emotional information through an emotional intelligence module. This analysis evaluates the emotional state and generates a response with a corresponding tone and content. Specifically, if the user's input indicates tension or anxiety, the server generates reassuring language.

[0663] Step 5:

[0664] The server converts the generated advice into audio data using an audio generation device and sends it to the terminal. The output data is provided as audio output. Specifically, advice such as "First, I recommend listing your expenses and reviewing them" is returned in audio format.

[0665] Step 6:

[0666] The terminal plays audio data provided by the server and simultaneously displays it as text. The user can then take action based on this. Through the outputted audio and text information, the user receives personalized advice and appropriate guidance for action.
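The keyword extraction mentioned in Step 3 can be approximated with a simple stopword filter. This is a sketch under stated assumptions: the specification attributes the extraction to the generative AI model and does not describe the method, so the stopword list and the function `extract_keywords` are hypothetical stand-ins.

```python
import re

# Hypothetical stopword list; a real system would use a fuller one
# or rely on the language model itself.
STOPWORDS = {
    "a", "an", "the", "i", "my", "me", "to", "this", "that",
    "how", "what", "should", "can", "do", "is", "are",
}


def extract_keywords(text: str, limit: int = 5) -> list:
    """Return up to `limit` distinct non-stopword terms in order of appearance."""
    keywords = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word not in STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords[:limit]


print(extract_keywords("How should I manage my expenses this month?"))
# prints ['manage', 'expenses', 'month']
```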

[0667] (Application Example 1)

[0668] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0669] In today's consumer society, many individuals are required to efficiently manage their spending and improve their financial situation. However, traditional methods make it difficult to obtain specific, real-time advice tailored to individual financial circumstances. Furthermore, there is a lack of systems that provide advice that takes into account specific emotional states. Therefore, there is a need for a system that allows users to manage their spending in a sustainable way without undue burden.

[0670] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0671] In this invention, the server includes means for analyzing input information using a generative artificial intelligence model and generating personalized financial advice for the user, means for analyzing emotional information from the user and deriving a response corresponding to the emotional state, and speech recognition means for processing voice input and converting it into text information. As a result, the user can receive real-time spending management advice tailored to their individual financial situation, as well as appropriate advice that takes their emotions into consideration.

[0672] A "generative artificial intelligence model" is an advanced computational model used to learn from large amounts of data and perform analysis and prediction of new information.

[0673] "Input information" refers to data provided by the user, which is incorporated into the system as audio or text.

[0674] "Personalized financial advice" refers to specific financial management advice created individually based on each user's specific circumstances and behavioral history.

[0675] "Emotional information" refers to data that indicates a user's emotional state, and includes information that reflects emotional elements in questions and statements.

[0676] "Speech recognition means" refers to technology for analyzing speech data and converting it into text information.

[0677] "Speech synthesis means" refers to a technology that converts text information into speech data and provides the information to the user as speech.

[0678] "Real-time spending data analysis" is a process that instantly analyzes a user's latest spending data and generates advice based on that data.

[0679] "Information encryption methods" are technologies used to protect data from third parties and are techniques for securely protecting information.

[0680] The system that realizes this invention operates in cooperation with a user, a terminal, and a server. The user accesses the application using voice or text via a terminal such as a smartphone. First, when the user enters a financial question, the terminal sends the input data to the server. In the case of voice input, the terminal uses speech recognition technology to convert the voice data into text information and sends it to the server. The speech recognition technology used in this process is a commonly used cloud-based speech recognition service.

[0681] The server utilizes a generative artificial intelligence model to analyze the user input information it receives. This model references a vast amount of historically accumulated financial data and, while also considering the user's unique behavioral history, generates personalized financial advice. For example, it can analyze this month's spending and provide advice such as, "Your spending is high this month due to frequent eating out. If you limit eating out to two times until your next payday, your savings will increase."
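The kind of spending analysis behind such advice can be sketched as a per-category total. The transaction format (category, amount) and the functions `top_spending_category` and `spending_hint` are assumptions for illustration; the specification leaves the analysis to the generative AI model and does not describe the data layout.

```python
from collections import defaultdict


def top_spending_category(transactions):
    """Sum amounts per category and return (category, total) for the largest."""
    totals = defaultdict(float)
    for category, amount in transactions:
        totals[category] += amount
    if not totals:
        return None
    category = max(totals, key=totals.get)
    return category, totals[category]


def spending_hint(transactions) -> str:
    """Turn the largest category into a short, hypothetical hint."""
    top = top_spending_category(transactions)
    if top is None:
        return "No spending recorded this month."
    category, total = top
    return f"Your largest expense this month is {category} ({total:.2f})."


# Hypothetical sample data for one month.
month = [("eating out", 120.0), ("groceries", 80.0), ("eating out", 60.0)]
```

In a full system the hint would more likely be passed to the model as context than shown to the user directly.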

[0682] Furthermore, the server analyzes the user's emotional information and uses emotional intelligence to understand their emotional state. If it determines that the user is in an emotionally distressed state, it will construct advice in a tone that is sensitive to their feelings and strive to reassure the user.

[0683] The generated advice is sent to the device as audio using speech synthesis technology and played back to the user. It is also displayed as text on the device screen, making it easy for users to understand the advice in any situation. A general-purpose speech synthesis engine is used for the speech synthesis technology.

[0684] To protect user privacy, user data is securely encrypted on the server to prevent unauthorized access by third parties. This allows the user to use the system with peace of mind.

[0685] Examples of prompts include specific questions based on the user's financial situation and goals, such as, "Please analyze my spending trends over the past three months and tell me where I can save money," or "Where can I cut back this month to reach my current savings goal?"

[0686] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0687] Step 1:

[0688] The user enters financial-related questions into the terminal via voice or text. If voice input is used, the terminal uses speech recognition to convert the voice data into text data. In this process, the voice data is input into the speech recognition engine, and the resulting text data is output.

[0689] Step 2:

[0690] The terminal sends the converted text data to the server. The server analyzes the received input data and performs the preprocessing necessary for analysis. Specifically, it removes unnecessary spaces and special characters and formats the prompt text so that it is suitable for the generative AI model.

[0691] Step 3:

[0692] The server utilizes a generative artificial intelligence model to generate personalized financial advice based on analyzed input data. The model receives formatted data as prompts and outputs personalized financial advice text. This output is based on pre-trained financial data and behavioral history.

[0693] Step 4:

[0694] The server uses emotional intelligence technology to analyze emotional information from user input. The emotion analysis engine receives the user's text data and outputs data indicating the emotional state. Based on this data, it constructs advice with a tone and content that takes the user's emotional state into consideration.

[0695] Step 5:

[0696] The generated financial advice is converted into audio data using speech synthesis technology. The server inputs the advice's text data into the speech synthesis engine and outputs it as audio data. This process converts the advice into a format that is easy for the user to understand.

[0697] Step 6:

[0698] The server sends the generated audio and text advice to the terminal. The terminal receives this and plays it back to the user as audio and displays the text on the screen. This allows the user to confirm the advice visually and audibly.

[0699] Step 7:

[0700] The server updates the user's behavior history and stores it in the database to improve the accuracy of future advice. The update process tracks the advice received by the user and their responses, contributing to continuous system improvement.
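The preprocessing described in Step 2 above can be sketched with two regular expressions. Which characters count as "special" is not stated in the specification, so the character class below, the prompt wording, and the functions `preprocess` and `format_prompt` are assumptions.

```python
import re


def preprocess(text: str) -> str:
    """Drop special characters, then collapse runs of whitespace."""
    # Keep word characters, whitespace, and basic punctuation (an assumption).
    cleaned = re.sub(r"[^\w\s.,?!'%-]", "", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def format_prompt(question: str) -> str:
    """Wrap the cleaned question in a hypothetical prompt template."""
    return (
        f"User question: {preprocess(question)}\n"
        "Respond with personalized financial advice."
    )


print(preprocess("  Where can I   cut back ### this month? "))
# prints Where can I cut back this month?
```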

[0701] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0702] This invention illustrates a specific embodiment of a system that provides personalized financial advice to users using a generative artificial intelligence model and an emotion engine. This system analyzes the user's input, recognizes their emotional state, and then generates and provides optimal advice.

[0703] First, the user accesses the application using their device via voice or text. When the user enters a question or inquiry, the input data is sent from the device to the server. In the case of voice input, the device converts the voice data into text using its speech recognition function and sends it to the server.

[0704] The server analyzes the received text data. A generative artificial intelligence model processes this data to create personalized financial advice based on the user's input. Simultaneously, an emotion engine analyzes the user's text data and voice characteristics to assess the user's emotional state. Based on this assessment, the content and tone of the advice are adjusted. For example, if the server determines that the user is stressed, it will be configured to provide gentle and encouraging advice.

[0705] The generated advice is sent to the terminal as voice output via speech synthesis and simultaneously displayed on the screen as text. Specifically, if the user inputs "I feel anxious about future investments," the emotion engine detects the anxiety, and the server provides reassuring advice such as "Don't worry. Considering this investment strategy, the risks are managed."

[0706] Furthermore, this system saves users' behavioral history and draws on it when providing advice in future interactions, thereby continuously optimizing its responses to users. It also protects user privacy and ensures secure data transmission through data encryption.

[0707] In this way, it becomes possible to provide optimal financial advice tailored to the user's situation and emotions in real time, thereby realizing more reliable personalized support.

[0708] The following describes the processing flow.

[0709] Step 1:

[0710] Users access the application through their device and input financial questions or concerns via voice or text. In the case of voice input, the device's voice recognition function converts the voice into text data.

[0711] Step 2:

[0712] The terminal bundles the user's input data into packets and sends them to the server using a secure protocol. User ID and session information are also transmitted during this process.

[0713] Step 3:

[0714] The server first performs a security check on the data received from the terminal to confirm its safety. Next, it uses a natural language processing module to analyze the input data and extract the information it contains.

[0715] Step 4:

[0716] The server uses a generative artificial intelligence model to generate personalized financial advice based on the extracted information. This process also takes into account the user's behavioral history and past advice.

[0717] Step 5:

[0718] The server uses an emotion engine to analyze the user's emotional state from their input. It detects emotions such as stress, relief, and anxiety from the user's vocabulary and voice characteristics, and adjusts the tone and content of advice based on the results.

[0719] Step 6:

[0720] The tailored advice is converted to a speech format via a speech synthesis module and sent to the device along with the text. Here too, the data is encrypted to ensure privacy.

[0721] Step 7:

[0722] The terminal decrypts the data received from the server and displays the advice as text on the user interface, while also providing a voice-over function.

[0723] Step 8:

[0724] Users review the advice provided and incorporate it into their future actions. At this time, their reactions and new inputs are recorded as part of their behavioral history and used to improve the accuracy of future advice.
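The history recording in Step 8 can be pictured as an append-only log that later requests consult. The class `BehaviorHistory`, its fields, and the choice to keep entries in memory are assumptions; the specification says only that reactions and new inputs are recorded and used to improve future advice.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BehaviorHistory:
    """Hypothetical in-memory log of questions, advice, and reactions."""
    entries: List[dict] = field(default_factory=list)

    def record(self, question: str, advice: str,
               reaction: Optional[str] = None) -> None:
        """Append one interaction to the history."""
        self.entries.append(
            {"question": question, "advice": advice, "reaction": reaction}
        )

    def recent(self, n: int = 3) -> List[dict]:
        """Return the last n interactions, oldest first."""
        return self.entries[-n:] if n > 0 else []


history = BehaviorHistory()
history.record("How can I save more?", "Review monthly expenses.", "accepted")
history.record("Should I invest now?", "Start with a small amount.")
```

The entries returned by `recent` could, for example, be added to the next prompt so that the model sees past advice; a production system would persist them in the database rather than in memory.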

[0725] (Example 2)

[0726] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0727] Traditional financial advice systems often fail to adequately consider a user's individual emotional state or past behavioral history, potentially resulting in inadequate advice. Furthermore, they lack sufficient user privacy protection, raising concerns about data security. These issues lead to users lacking reliable support.

[0728] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0729] In this invention, the server includes means for analyzing input information using a data processing device and generating personalized economic advice for the user, means for evaluating emotional information from the user and adjusting the response based on that evaluation, and a speech recognition device for processing voice input and converting it into text information. This makes it possible to provide optimized economic advice based on the user's individual emotional state and past behavioral history.

[0730] A "data processing device" is a device used to analyze input information and generate specific results based on that data.

[0731] "Users" refers to individuals or organizations that use this system, and in particular, to those seeking financial advice.

[0732] "Financial advice" refers to information that provides guidance and suggestions regarding a user's financial situation and investment strategy.

[0733] "Emotional information" refers to data that indicates the emotional state expressed by a user, and is derived from information analyzed from text and audio.

[0734] A "speech recognition device" is a device or software equipped with technology to analyze speech signals and convert them into text information.

[0735] A "speech generation device" is a device or software equipped with the technology to analyze textual information and output it as an audio signal.

[0736] "Past behavioral history" refers to a collection of data that records actions and history that a user has performed in the past.

[0737] "Privacy protection" refers to the means and technologies used to protect personal information from unauthorized access and misuse.

[0738] This invention is a system that provides personalized financial advice to users in real time. The system uses a generative AI model and an emotion engine to generate optimal advice tailored to the user's needs.

[0739] The user first accesses the application using a device. The user then inputs their questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. This process utilizes standard speech recognition software.

[0740] The converted text data is securely transmitted to the server using encryption technology. The server uses a data processing device to analyze the received data. Here, a generative AI model (e.g., a large-scale language model) is used to generate personalized economic advice based on the user's input.

[0741] Simultaneously, the server uses an emotion engine to evaluate the user's emotional information. This emotional information is determined from the content of the text data and the expressions it contains. Based on this evaluation, the server adjusts the content and tone of the generated advice.

[0742] Finally, the adjusted advice is converted into audio data using a speech generator and sent to the terminal. The terminal plays the advice in audio format and simultaneously displays it in text format.

[0743] For example, if a user enters "I'm worried about how I should save for my children's education," the server uses a generative AI model to create personalized advice for this question and delivers it to the user in a reassuring tone. This process allows users to receive optimized advice based on their emotional state and past behavioral history.
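As an illustration of [0740] to [0743], the following sketch runs advice generation and emotion evaluation concurrently and then adjusts the tone. The functions `generate_advice`, `evaluate_emotion`, `adjust_tone`, and `respond` are hypothetical stand-ins; the source does not specify how the generative AI model or the emotion engine is invoked, and the keyword check is only a placeholder for a real emotion engine.

```python
import asyncio


async def generate_advice(text: str) -> str:
    """Hypothetical stand-in for the generative AI model call."""
    await asyncio.sleep(0)  # placeholder for model latency
    return "Consider setting aside a fixed amount each month."


async def evaluate_emotion(text: str) -> str:
    """Hypothetical stand-in for the emotion engine (keyword placeholder)."""
    await asyncio.sleep(0)
    return "anxious" if "worried" in text.lower() else "neutral"


def adjust_tone(advice: str, emotion: str) -> str:
    """Prepend a reassuring opener when the user appears anxious."""
    if emotion == "anxious":
        return "It is understandable to feel uneasy. " + advice
    return advice


async def respond(text: str) -> str:
    # Advice generation and emotion evaluation run concurrently.
    advice, emotion = await asyncio.gather(
        generate_advice(text), evaluate_emotion(text)
    )
    return adjust_tone(advice, emotion)


reply = asyncio.run(
    respond("I'm worried about how I should save for my children's education.")
)
```

Keeping the tone adjustment as a separate step after both results are available mirrors the order described above: the advice and the emotion evaluation are produced in parallel and combined afterwards.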

[0744] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0745] Step 1:

[0746] Users access the application using their device and input questions or inquiries via voice or text. In the case of voice input, the device uses a speech recognition device to convert the voice data into text data. In this process, it receives a voice signal as input and obtains output converted into text using speech recognition technology.

[0747] Step 2:

[0748] The terminal sends the text data converted by speech recognition to the server. During this process, data encryption technology is used to securely transmit the text data to the server. The input is encrypted text data, and the output is received in a format that the server can analyze.

[0749] Step 3:

[0750] The server analyzes the received text data using a data processing device and generates personalized economic advice using a generative AI model. The input is text data, and the generative AI model performs data calculations based on this data to produce advice as output.

[0751] Step 4:

[0752] The server uses an emotion engine to evaluate user emotional information from text data. It receives text data as input, analyzes its content to identify the user's emotional state, and outputs the evaluation result.

[0753] Step 5:

[0754] The tone and content of the generated advice are adjusted based on the emotion evaluation results. This results in optimized advice that is tailored to the user's emotional state. The input consists of the advice from the generative AI model and the results of the emotion evaluation, and the output is the optimized advice.

[0755] Step 6:

[0756] The server converts the optimized advice into audio data using a speech generator and sends it to the terminal. That is, the advice received as input is output as audio data and sent to the terminal.

[0757] Step 7:

[0758] The terminal plays audio data sent from the server and displays text data on the screen. Users can listen to the provided advice in audio and confirm it in text. The input is data sent from the server, and this is used to produce output in the form of audio playback and text display.
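Steps 1 to 7 above can be summarized in a minimal sketch. Every component here is a hypothetical placeholder supplied only so that the flow runs end to end: the class `AdvicePipeline`, its field names, and the toy lambdas are assumptions, and the byte-reversal "encryption" is explicitly not real encryption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TerminalOutput:
    text: str     # displayed on screen (Step 7)
    audio: bytes  # played back as audio (Step 7)


@dataclass
class AdvicePipeline:
    # Each field is a hypothetical pluggable component, not an API from the source.
    recognize_speech: Callable[[bytes], str]   # Step 1
    encrypt: Callable[[str], bytes]            # Step 2, terminal side
    decrypt: Callable[[bytes], str]            # Step 2, server side
    generate_advice: Callable[[str], str]      # Step 3
    evaluate_emotion: Callable[[str], str]     # Step 4
    adjust_advice: Callable[[str, str], str]   # Step 5
    synthesize_speech: Callable[[str], bytes]  # Step 6

    def run(self, text: Optional[str] = None,
            voice: Optional[bytes] = None) -> TerminalOutput:
        if text is None:
            if voice is None:
                raise ValueError("either text or voice input is required")
            text = self.recognize_speech(voice)
        received = self.decrypt(self.encrypt(text))
        advice = self.generate_advice(received)
        emotion = self.evaluate_emotion(received)
        optimized = self.adjust_advice(advice, emotion)
        return TerminalOutput(optimized, self.synthesize_speech(optimized))


# Toy components; none of them is secure or realistic.
pipeline = AdvicePipeline(
    recognize_speech=lambda voice: voice.decode("utf-8"),
    encrypt=lambda s: s.encode("utf-8")[::-1],  # placeholder, NOT encryption
    decrypt=lambda b: b[::-1].decode("utf-8"),
    generate_advice=lambda q: "Start with a monthly savings target.",
    evaluate_emotion=lambda q: "anxious" if "worried" in q.lower() else "neutral",
    adjust_advice=lambda a, e: ("You are not alone in this. " + a)
    if e == "anxious" else a,
    synthesize_speech=lambda s: s.encode("utf-8"),
)

result = pipeline.run(voice=b"I'm worried about saving for education.")
```

Passing the components in as callables keeps the sketch independent of any particular speech recognition, model, or speech generation product, none of which the source names for this flow.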

[0759] (Application Example 2)

[0760] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0761] Modern asset management requires the provision of appropriate financial advice in real time, tailored to the user's emotional state. However, conventional systems have faced challenges in accurately understanding a user's specific emotional state and generating personalized advice accordingly. Furthermore, protecting user privacy while providing advice in real time via voice input and output is also a crucial challenge.

[0762] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0763] In this invention, the server includes means for analyzing input information using a generative artificial intelligence system and generating personalized asset management suggestions for the user; means for evaluating emotional information from the user and deriving a response corresponding to the emotional state; means for processing voice input and converting it into text information; and means for providing emotionally appropriate financial advice when conducting electronic transactions. This makes it possible to achieve more appropriate asset management while simultaneously providing real-time financial advice tailored to the user's emotions and protecting their privacy.

[0764] A "generative artificial intelligence system" is an artificial intelligence technology that analyzes user input information and generates personalized asset management proposals.

[0765] "Emotional information" refers to data that indicates the emotional state of a user, and this data is used to analyze that state and derive personalized responses.

[0766] An "asset management proposal" is specific advice that presents the optimal management method based on the user's financial situation.

[0767] "Speech recognition means" refers to a technical means for receiving speech input and converting it into text information.

[0768] "Electronic transactions" refer to all acts of buying, selling, or paying for goods and services conducted through online platforms or similar means.

[0769] "Speech synthesis means" refers to a technical means for outputting the generated asset management proposal as speech.

[0770] "Privacy protection" refers to information security measures that protect users' personal information and transaction data from third parties and manage them securely.

[0771] In this invention, the user accesses the system from a terminal via voice or text input. The terminal uses speech recognition software to convert the voice-input information into text information. During this process, the voice-input information is analyzed by natural language processing software. Specifically, a speech recognition engine installed in the terminal is used for speech recognition. The converted text information is then transmitted to a server.

[0772] The server utilizes a generative artificial intelligence system to generate personalized asset management suggestions from received text information. Here, the generative AI model plays a crucial role in generating advice based on user input. The server also uses an emotion engine to assess the user's emotional state and adjust the suggestions accordingly. For example, if the server determines that the user is feeling anxious about cost reduction, it will adjust its advice to be delivered in a gentler tone.

[0773] The generated asset management proposals are sent from the server to the terminal. The system utilizes speech synthesis technology to provide the proposals as voice output. The proposals are also displayed in text format, allowing users to visually review them.

[0774] As a concrete example of this system, if a user inputs "I'm worried about how to save money on my next trip," the emotion engine detects anxiety about saving money. In response, the server generates advice such as, "Your trip is definitely worth enjoying. However, if you want to save money, consider the following approaches." This advice is provided in both voice and text formats, delivering information in a way that suits the user's needs.
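One way to carry the detected emotion into the request to the generative AI system is to include a tone instruction in the prompt. The sketch below is an assumption made for illustration: the function `build_prompt`, the tone labels, and the prompt wording do not appear in the source.

```python
def build_prompt(user_text: str, emotion: str) -> str:
    """Assemble a prompt holding the user's question and a tone instruction."""
    # Hypothetical mapping from detected emotion to requested tone.
    tone_by_emotion = {
        "anxious": "gentle and reassuring",
        "neutral": "clear and matter-of-fact",
    }
    tone = tone_by_emotion.get(emotion, "clear and matter-of-fact")
    return (
        "You are an asset management assistant.\n"
        f"Respond in a {tone} tone.\n"
        f"User question: {user_text}"
    )


prompt = build_prompt(
    "I'm worried about how to save money on my next trip.", "anxious"
)
```

Unknown emotion labels fall back to the neutral tone, so the prompt is always well formed even when the emotion engine returns a label outside the mapping.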

[0775] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0776] Step 1:

[0777] Users input questions and inquiries via voice or text through their device. This input is provided as voice in the case of voice input, and as text data in the case of text input. Voice input is converted into text data by the device's speech recognition engine. The speech recognition engine analyzes the voice signal and generates a corresponding string, which is then output as text data.

[0778] Step 2:

[0779] The terminal sends the text data obtained as a result of speech recognition to the server. Here, the converted text data is transferred to the server side. In particular, the user's consultation content is transmitted as data for processing by the generative AI model.

[0780] Step 3:

[0781] The server analyzes the received text data and uses a generative artificial intelligence system to generate personalized asset management suggestions. In this process, the generative AI model generates prompts based on the user's questions and analyzes them to determine the best advice. The generated prompts are tailored to the user's specific needs and are used to generate subsequent responses.

[0782] Step 4:

[0783] Simultaneously, the server uses an emotion engine to evaluate emotional information from the user's text data. This involves analyzing words and phrases contained in the text to identify emotional states. For example, if negative emotions are expressed, the system may estimate the likelihood of anxiety or stress.

[0784] Step 5:

[0785] The server adjusts the tone and content of the suggestions generated based on emotional information to form the final asset management recommendations. This adjustment includes refining the suggestions to correspond to the emotional state. As output, personalized advice is provided as text data.

[0786] Step 6:

[0787] The generated suggestions are sent from the server to the terminal and provided to the user as voice output using speech synthesis technology. Furthermore, they are also displayed on the screen in text format. In this final step, the advice is output to the user's terminal in a way that is clearly visible as both voice and text.
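The word-and-phrase analysis of Step 4 can be sketched with a small lexicon. This is a deliberately simple illustration: the lexicon `NEGATIVE_CUES`, the function `estimate_emotions`, and the scoring rule (share of words that are cues) are assumptions, and an actual emotion engine would likely use a trained model instead.

```python
import re

# Hypothetical mini-lexicon; a real emotion engine would be far richer.
NEGATIVE_CUES = {
    "worried": "anxiety",
    "anxious": "anxiety",
    "afraid": "anxiety",
    "stressed": "stress",
    "overwhelmed": "stress",
}


def estimate_emotions(text: str) -> dict[str, float]:
    """Return, per emotion, the share of words in the text that are cues for it."""
    words = re.findall(r"[a-z']+", text.lower())
    counts: dict[str, float] = {}
    for word in words:
        emotion = NEGATIVE_CUES.get(word)
        if emotion is not None:
            counts[emotion] = counts.get(emotion, 0.0) + 1.0
    total = len(words) or 1  # avoid division by zero for empty input
    return {emotion: count / total for emotion, count in counts.items()}


scores = estimate_emotions("I'm worried and stressed about my savings.")
```

The returned scores are rough likelihood indicators rather than calibrated probabilities; an empty result means that no negative cue was found.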

[0788] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input in response to the result of the specific processing. The control unit 46A transmits the audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0789] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0790] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0791] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0792] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0793] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0794] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the closer an emotion is to the outer edge of the emotion map 400, the more visible it becomes (that is, the more it is expressed in actions).
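The geometry of the emotion map described in [0792] to [0794] can be modeled with polar coordinates. The sketch below is only an illustrative encoding: the class `MapPoint`, the normalized radius, and the clock-style angle convention are assumptions, and no positions of individual emotions are taken from Figure 9.

```python
import math
from dataclasses import dataclass

EPS = 1e-9  # tolerance for points lying exactly on an axis


@dataclass(frozen=True)
class MapPoint:
    radius: float     # 0.0 = center (primitive), 1.0 = outer edge (expressed in action)
    clock_deg: float  # 0 = 12 o'clock, 90 = 3 o'clock, measured clockwise

    def xy(self) -> tuple[float, float]:
        theta = math.radians(self.clock_deg)
        return (self.radius * math.sin(theta), self.radius * math.cos(theta))

    def side(self) -> str:
        """Left half: reaction-driven; right half: situation-driven."""
        x, _ = self.xy()
        if abs(x) < EPS:
            return "both"
        return "situation" if x > 0 else "reaction"

    def valence(self) -> str:
        """Upper half: pleasure; lower half: displeasure."""
        _, y = self.xy()
        if abs(y) < EPS:
            return "neutral"
        return "pleasure" if y > 0 else "displeasure"

    def visibility(self) -> float:
        """Further out means more visibly expressed in actions."""
        return max(0.0, min(1.0, self.radius))


three_oclock = MapPoint(radius=0.5, clock_deg=90)
```

A point at the 3 o'clock position falls on the situation side and between pleasure and displeasure, which is consistent with the fluctuation between security and anxiety mentioned in [0793].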

[0795] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
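The balance-based rule in [0795] can be expressed as a short sketch. The function `balance_valence` and the use of a single scalar balance (for example, a battery level between 0 and 1) are assumptions made for illustration; the source does not define how the balances are measured or combined.

```python
def balance_valence(previous: float, current: float, ideal: float) -> str:
    """Classify a change in one balance as pleasure or discomfort.

    Moving toward the ideal value yields pleasure; moving away yields discomfort.
    """
    before = abs(previous - ideal)
    after = abs(current - ideal)
    if after < before:
        return "pleasure"
    if after > before:
        return "discomfort"
    return "neutral"


# Hypothetical battery level rising from 40% to 60% with an ideal of 100%.
charging = balance_valence(previous=0.4, current=0.6, ideal=1.0)
```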

[0796] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0797] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
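One possible way to obtain similar values for neighboring emotions is to smooth the training targets over the map before training. This is an assumption, not the method stated in the source: the coordinates in `POSITIONS`, the Gaussian kernel, and the function `soft_targets` are all hypothetical.

```python
import math

# Hypothetical 2-D positions on the emotion map; the source gives no coordinates.
POSITIONS = {
    "reassured": (0.50, 0.10),
    "calm": (0.55, 0.05),
    "confident": (0.60, 0.15),
    "anxious": (0.50, -0.60),
}


def soft_targets(labelled: str, bandwidth: float = 0.2) -> dict[str, float]:
    """Spread a single emotion label over its neighbors on the map.

    Emotions close to the labelled one receive similar target values;
    distant emotions receive values near zero. The values sum to 1.
    """
    cx, cy = POSITIONS[labelled]
    weights = {
        name: math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * bandwidth ** 2))
        for name, (x, y) in POSITIONS.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


targets = soft_targets("calm")
```

With these assumed positions, "reassured" and "confident" receive target values close to that of "calm", while "anxious" receives a much smaller one. A network trained on such targets would tend to output similar emotion values for emotions that lie close together.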

[0798] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0799] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0800] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.

[0801] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0802] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0803] The following types of processors can be used as hardware resources to perform specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource performing specific processing by executing software, i.e., a program. Other examples include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. Each of these processors has built-in or connected memory and performs specific processing by using that memory.

[0804] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0805] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0806] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0807] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0808] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0809] The following is further disclosed regarding the embodiments described above.

[0810] (Claim 1)

[0811] A means of analyzing input data using a generative artificial intelligence model and generating personalized financial advice for the user,

[0812] A means of analyzing emotional data from users and deriving responses appropriate to their emotional state,

[0813] A speech recognition means that processes voice input and converts it into text data,

[0814] A speech synthesis means that provides the generated financial advice as an audio output,

[0815] A system that includes this.

[0816] (Claim 2)

[0817] The system according to claim 1, comprising means for accumulating a user's behavioral history and taking that history into consideration when providing personalized financial advice.

[0818] (Claim 3)

[0819] The system according to claim 1, comprising data encryption means for protecting user privacy.

[0820] "Example 1"

[0821] (Claim 1)

[0822] A means of analyzing information using a generative artificial intelligence model and generating personalized advice for users,

[0823] A means of analyzing emotional information from users and deriving responses appropriate to their emotional state,

[0824] A sound recognition means that processes acoustic input and converts it into textual information,

[0825] Sound generation means that provides the generated advice as an acoustic output,

[0826] A means of collecting information from a terminal and transferring it to a computer for analysis,

[0827] A means for evaluating a user's emotions using an emotional intelligence module and generating an appropriate response based on that state,

[0828] A system that includes this.

[0829] (Claim 2)

[0830] The system according to claim 1, comprising means for accumulating user behavior history and taking that history into consideration when providing personalized advice.

[0831] (Claim 3)

[0832] The system according to claim 1, comprising a means for encrypting digitized information to protect user privacy.

[0833] "Application Example 1"

[0834] (Claim 1)

[0835] A means of analyzing input information using a generative artificial intelligence model and generating personalized financial advice for the user,

[0836] A means of analyzing emotional information from users and deriving responses appropriate to their emotional state,

[0837] A speech recognition means that processes voice input and converts it into text information,

[0838] A speech synthesis means that provides the generated financial advice as an audio output,

[0839] A means of analyzing users' spending information in real time and providing advice on reducing spending based on their individual financial situation,

[0840] A system that includes this.

[0841] (Claim 2)

[0842] The system according to claim 1, comprising means for accumulating a user's behavioral history and taking that history into consideration when providing personalized financial advice.

[0843] (Claim 3)

[0844] The system according to claim 1, comprising means for encrypting information to protect user privacy.

[0845] "Example 2 of combining an emotion engine"

[0846] (Claim 1)

[0847] A means for analyzing input information using a data processing device and generating personalized economic advice for the user,

[0848] A means of evaluating emotional information from users and adjusting responses based on that evaluation,

[0849] A speech recognition device that processes voice input and converts it into text information,

[0850] A voice generation device that provides generated economic advice as voice output,

[0851] A means of optimizing advice based on the user's emotional state and past behavioral history,

[0852] A system that includes this.

[0853] (Claim 2)

[0854] The system according to claim 1, comprising means for saving a user's past behavioral history and utilizing it when providing personalized financial advice.

[0855] (Claim 3)

[0856] The system according to claim 1, comprising means for encrypting data to protect user privacy.

[0857] "Application example 2 when combining with an emotional engine"

[0858] (Claim 1)

[0859] A means for analyzing input information using a generative artificial intelligence system and generating personalized asset management proposals for users,

[0860] A means of evaluating emotional information from users and deriving responses appropriate to their emotional state,

[0861] A speech recognition means that processes voice input and converts it into text information,

[0862] A speech synthesis means that provides the generated asset management proposal as an audio output,

[0863] A means of providing emotionally appropriate financial advice when conducting electronic transactions,

[0864] A system that includes this.

[0865] (Claim 2)

[0866] The system according to claim 1, which includes means for accumulating user behavior history and taking that history into consideration when providing personalized asset management proposals, and for adjusting the content of proposals based on emotional information.

[0867] (Claim 3)

[0868] The system according to claim 1, which includes data encryption means to protect user privacy and securely manages suggestions using emotional information.

[Explanation of Symbols]

[0869]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. A means of analyzing input information using a generative artificial intelligence model and generating personalized financial advice for the user, A means of analyzing emotional information from users and deriving responses appropriate to their emotional state, A speech recognition means that processes voice input and converts it into text information, A speech synthesis means that provides the generated financial advice as an audio output, A means of analyzing users' spending information in real time and providing advice on reducing spending based on their individual financial situation, A system that includes this.

2. The system according to claim 1, comprising means for accumulating a user's behavioral history and taking that history into consideration when providing personalized financial advice.

3. The system according to claim 1, comprising information encryption means for protecting user privacy.