system
An AI system efficiently manages and personalizes information by acquiring, analyzing, and summarizing data from communication devices, suggesting tasks, and adjusting environments based on user emotions, addressing information overload and improving user experience.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-09
- Publication Date
- 2026-06-19
AI Technical Summary
The increasing volume of information from communication devices overwhelms users, leading to inefficiencies in managing time and prioritizing important tasks, causing stress and difficulty in balancing work and personal life.
An AI-powered system that acquires information from multiple devices, analyzes it using natural language processing, generates summaries, suggests tasks based on user behavior, and adjusts notifications and environmental settings based on emotional state, allowing users to manage information efficiently and personalize their experience.
The system effectively manages vast amounts of information, prioritizes important tasks, and provides personalized, emotionally responsive assistance, reducing stress and enhancing user productivity and comfort.
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, the method including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a character of the chatbot, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance as a response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] In recent years, with the development of information and communication technologies, the amount of information obtained from emails and various communication devices has become enormous, and managing this information has become extremely complicated. As a result, users often overlook important information, find it difficult to manage their time efficiently in business and private life, and frequently feel stressed. The present invention aims to efficiently manage a vast amount of information from multiple communication devices and optimize the user's life and business.
Means for Solving the Problems
[0005] The present invention solves the above problems by providing means for acquiring information from multiple communication devices, means for analyzing the acquired information using natural language processing and generating a summary, means for analyzing the user's behavior history and suggesting future tasks, means for notifying the user's terminal of the information and suggestions, and means for collecting user feedback and updating the analysis results. As a result, users will be able to efficiently extract important information from a vast amount of information and appropriately manage the balance between their lives and work.
[0006] "Communication devices" is a general term for electronic devices used to send and receive information, and examples include smartphones and computers.
[0007] "Means of acquiring information" refers to the processes and technologies used to collect necessary information from communication devices.
[0008] "Natural language processing" is the technology that enables computers to understand human language, and includes the analysis, understanding, and generation of text data.
[0009] "Methods for generating summaries" refer to techniques and algorithms for extracting key points from acquired information and summarizing them concisely.
[0010] "Methods for analyzing behavioral history" refer to methods for analyzing and understanding a user's past behavioral patterns and data.
[0011] "Means of proposing tasks" refers to methods for suggesting future actions and activities based on the user's behavioral history.
[0012] "Means of notifying the device" refers to the processes and functions used to send information to the user's device.
[0013] "Means of collecting feedback" refers to the processes and methods used to collect evaluations and opinions from users.
[0014] "Means for updating analysis results" refers to methods for improving the system's analytical capabilities and recommendation accuracy based on collected feedback.
Brief Description of the Drawings
[0015]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map on which multiple emotions are mapped.
[Figure 10] An emotion map on which multiple emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of the data processing system in Example 2, which incorporates an emotion engine.
[Figure 14] A sequence diagram showing the processing flow of the data processing system in Application Example 2, which incorporates an emotion engine.
Embodiment for Implementing the Invention
[0016] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0017] First, the terms used in the following description will be explained.
[0018] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).
[0019] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.
[0020] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0021] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0022] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."
[0023] [First Embodiment]
[0024] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0025] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0026] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0027] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0028] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0029] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0030] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0031] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0032] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0033] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0034] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0035] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0036] This invention is an AI-powered system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play different roles and work together to achieve its functions.
[0037] The server first obtains permission from the user to access the communication device and retrieve information such as emails and messages. In this information gathering process, the server uses the API of each communication device to extract the necessary data and periodically collects updated information.
[0038] Next, the server processes the acquired data through a natural language processing engine to analyze the text. It extracts important topics and keywords and summarizes the data based on them. For example, emails containing meeting schedules or important deadlines are summarized as high-priority information for the user.
[0039] The server then analyzes the user's behavioral history and schedule data to predict future tasks and events. Based on these predictions, the server proposes recommended actions to the user and summarizes the content. The server sends the generated information and suggestions to the terminal, allowing the user to receive notifications in real time.
[0040] The terminal, which may be a mobile device or a wearable device, lets users receive notifications and access information through various interfaces. The terminal incorporates voice recognition capabilities, enabling it to accept user voice commands. This allows users to perform tasks such as checking schedules and adding new tasks by voice.
[0041] Users manage their daily tasks and lives based on the information and suggestions provided by the system. They can also provide feedback, which is sent to the server and used to update the system's analysis model and improve its accuracy.
[0042] This AI-powered information management system is a powerful tool for efficiently managing daily life and work, ensuring that important information is not overlooked amidst a vast amount of data. For example, to prevent users from missing an important meeting on Monday, the server can send a notification the night before and set a reminder in the morning, ensuring the user is properly prepared.
[0043] The following describes the processing flow.
[0044] Step 1:
[0045] The server retrieves email and message data via the communication device's API, based on the access permissions provided by the user. During this process, the server periodically updates the data and accumulates new information.
[0046] Step 2:
[0047] The server analyzes the acquired data by passing it through a natural language processing engine. This extracts important topics and keywords and summarizes the data. For example, it prioritizes extracting tasks with deadlines and meeting dates.
[0048] Step 3:
[0049] Based on the analyzed information, the server analyzes the user's behavioral history and calendar data, and uses a predictive model to suggest future tasks and events. This allows notifications to include things the user should prepare in advance.
[0050] Step 4:
[0051] The server prepares to push the generated important information and task suggestions to the user's terminal. This information is organized so that the user can access it immediately.
[0052] Step 5:
[0053] The terminal receives push notifications sent from the server and displays them to the user. The terminal can also accept voice commands from the user via its voice interface and send those commands to the server.
[0054] Step 6:
[0055] Users can check notifications displayed on their devices and adjust their actions and schedules as needed. They can also improve the accuracy of system recommendations by sending feedback to the server via their devices.
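The prioritization described in Step 2 can be illustrated with a minimal sketch. It is not the disclosed implementation: the disclosure names a natural language processing engine without specifying its algorithm, so a simple keyword-and-date heuristic is substituted here. The names Message, PRIORITY_KEYWORDS, priority_score, and summarize are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword list; the disclosure does not specify one.
PRIORITY_KEYWORDS = ("deadline", "due", "meeting", "urgent")
# Only ISO-style dates (YYYY-MM-DD) are recognized, for brevity.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


@dataclass
class Message:
    sender: str
    subject: str
    body: str


def priority_score(msg: Message) -> int:
    """Count distinct keyword hits plus explicit dates as a rough priority signal."""
    text = f"{msg.subject} {msg.body}".lower()
    keyword_hits = sum(1 for kw in PRIORITY_KEYWORDS if kw in text)
    date_hits = len(DATE_PATTERN.findall(text))
    return keyword_hits + date_hits


def summarize(messages: list[Message], top_n: int = 3) -> list[str]:
    """Return one-line summaries of the highest-scoring messages, best first."""
    ranked = sorted(messages, key=priority_score, reverse=True)
    return [f"{m.sender}: {m.subject}" for m in ranked[:top_n] if priority_score(m) > 0]
```

In a real deployment the scoring function would presumably be replaced by the output of the natural language processing engine; the ranking and truncation logic around it would stay the same.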
[0056] (Example 1)
[0057] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0058] Modern users are constantly exposed to vast amounts of information from numerous sources, making it difficult to efficiently extract and manage important information. In particular, prioritizing information generated in daily life and work, and managing future schedules, has become increasingly complex. In this context, there is a growing need for systems that provide appropriate information in real time and support future activities.
[0059] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0060] In this invention, the server includes means for acquiring information from a communication terminal, means for analyzing the acquired information using a processing device and generating a summary, and means for analyzing the user's history data and suggesting future tasks. This enables users to efficiently extract important information from diverse sources, predict important tasks in their daily lives and work, and act in a planned manner.
[0061] A "communication terminal" is an electronic device that has the function of receiving or transmitting information.
[0062] "Information" refers to a collection of data or knowledge that is acquired or provided, including emails and messages.
[0063] A "processing device" is a device that performs computational processing, such as analyzing data and summarizing information.
[0064] "Analysis" is the process of breaking down information into its individual elements and converting them into a form that is easy to understand.
[0065] A "summary" is a concise compilation of information, resulting from the extraction of key points.
[0066] A "user" is a person who operates or uses the system.
[0067] "History data" refers to records of activities and actions that a user has performed in the past.
[0068] Proposing a "task" means recommending future activities or actions to the user.
[0069] An "apparatus" is a machine or device that has a specific function.
[0070] "Evaluation" refers to a user's judgment or opinion regarding the information provided by the system.
[0071] An "analytical model" is a mathematical or statistical method used to understand data and make predictions.
[0072] A "portable terminal" is an electronic device with communication capabilities that can be carried around.
[0073] This invention is a system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play their respective roles in acquiring, analyzing, suggesting, and notifying information.
[0074] The server first acquires information from multiple communication terminals. It does so by obtaining data from various email services and messaging applications through standardized APIs. Using a natural language processing engine such as the Google® Cloud Natural Language API, the server analyzes the acquired information, extracts important topics and keywords, and generates a summary.
[0075] The server also analyzes user history data and uses a generative AI model to suggest future tasks. Based on past behavioral patterns and schedules, it can recommend the next steps to take and create efficient plans.
[0076] Next, the generated information and suggestions are sent to the terminal in real time. The terminal functions as a mobile device or wearable device and accepts the user's voice commands through voice recognition. This allows users to check information and add new tasks hands-free.
[0077] Furthermore, users can provide feedback on the system's suggestions, which is then sent to the server. The server updates the analysis model based on the collected feedback, improving its accuracy.
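One way the feedback loop could influence later suggestions is sketched below. The disclosure does not state how the analysis model is updated, so this example assumes a simple per-category acceptance rate with Laplace smoothing; the class FeedbackModel and its methods are hypothetical.

```python
from collections import defaultdict


class FeedbackModel:
    """Tracks how often the user accepts each suggestion category."""

    def __init__(self) -> None:
        self._accepted: dict[str, int] = defaultdict(int)
        self._total: dict[str, int] = defaultdict(int)

    def record(self, category: str, accepted: bool) -> None:
        """Store one piece of user feedback for a suggestion category."""
        self._total[category] += 1
        if accepted:
            self._accepted[category] += 1

    def acceptance_rate(self, category: str) -> float:
        """Smoothed acceptance rate; categories with no feedback start at 0.5."""
        return (self._accepted[category] + 1) / (self._total[category] + 2)

    def rank(self, categories: list[str]) -> list[str]:
        """Order categories so the most frequently accepted come first."""
        return sorted(categories, key=self.acceptance_rate, reverse=True)
```

The smoothing keeps a single rejection from permanently suppressing a category, which matters when feedback is sparse.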
[0078] As a concrete example, to ensure users don't miss important emails each day, the server sends a reminder the night before, prompting them to check those emails the following morning, thus supporting thorough preparation. An example of a prompt message for the generative AI model might be, "Please review this week's schedule and summarize important appointments. Also, please suggest the next necessary actions."
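As an illustration of how such a prompt might be assembled before it is passed to a generative AI model, the following sketch appends schedule entries to the example instruction text. The function build_prompt and the layout of the schedule section are assumptions; the disclosure gives only the instruction sentence itself.

```python
from datetime import date

# Instruction text taken from the example prompt in the description.
PROMPT_INSTRUCTION = (
    "Please review this week's schedule and summarize important appointments. "
    "Also, please suggest the next necessary actions."
)


def build_prompt(schedule: list[tuple[date, str]]) -> str:
    """Combine the instruction with schedule entries sorted by date."""
    lines = [f"- {day.isoformat()}: {title}" for day, title in sorted(schedule)]
    return PROMPT_INSTRUCTION + "\n\nSchedule:\n" + "\n".join(lines)
```

The returned string would then be sent to whichever model the server uses; that call is omitted because the disclosure does not name a specific model interface.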
[0079] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0080] Step 1:
[0081] The server retrieves information from the communication terminal. The input is API access granted by user permission, through which email and message data are retrieved. The output is the retrieved raw data, which is used in the subsequent analysis steps.
[0082] Step 2:
[0083] The server processes the acquired information through a natural language processing engine to analyze the data. The input is the raw data acquired in step 1, which is then processed for keyword extraction and topic recognition. The output consists of summarized text information and a list of high-priority information. Specifically, meetings and deadlines are highlighted in the summary.
[0084] Step 3:
[0085] The server analyzes the user's historical data and suggests future tasks. This step involves inputting schedule data from the calendar API, as well as past behavioral data. A generative AI model analyzes this data and provides recommended tasks and appointments as output. Weekly reports and other similar reports are generated based on the configured routines.
[0086] Step 4:
[0087] The server sends the generated information and suggestions to the terminal. The input is the summary information and suggestions generated in steps 2 and 3, which are sent to the user's device in notification format. The output is a visual and audible alert displayed on the terminal. Pop-up notifications and reminders are displayed on the device.
[0088] Step 5:
[0089] The terminal receives voice commands from the user and sends them to the server. The input is the user's voice command, which is converted into text format by speech recognition. The output is instruction data sent to the server. This allows the user to operate the device hands-free.
[0090] Step 6:
[0091] Users provide feedback on the system's suggestions. This feedback is entered into the server and used to improve the accuracy of the analysis model. The output is the feedback data accumulated in the system, contributing to the continuous improvement of the model. Customization is performed to reflect user feedback.
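Step 5 converts recognized speech into instruction data for the server. A minimal sketch of that conversion follows, assuming the speech recognizer has already produced a text transcript. The Instruction type, the action names, and the phrase patterns are all hypothetical; the disclosure mentions only checking schedules and adding tasks as example commands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    action: str          # "add_task", "check_schedule", or "unknown"
    argument: str = ""   # task text for "add_task", raw transcript for "unknown"


ADD_TASK_PREFIX = "add task "


def parse_command(transcript: str) -> Instruction:
    """Map a speech-recognition transcript to structured instruction data."""
    cleaned = transcript.strip()
    lowered = cleaned.lower()
    if lowered.startswith(ADD_TASK_PREFIX):
        # Preserve the user's original casing for the task text.
        return Instruction("add_task", cleaned[len(ADD_TASK_PREFIX):])
    if "schedule" in lowered:
        return Instruction("check_schedule")
    return Instruction("unknown", cleaned)
```

Unrecognized commands are passed through with their transcript so that the server can decide how to handle them.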
[0092] (Application Example 1)
[0093] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0094] In modern urban life, efficiently and accurately providing users with important information from a vast amount of data is a major challenge. In particular, there is a need to streamline daily life by processing information on traffic conditions, weather, and public events in a timely manner and suggesting optimal actions for users. Furthermore, improving usability through the effective use of user voice interaction is essential.
[0095] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0096] In this invention, the server includes means for collecting information from multiple communication devices, means for analyzing the acquired information using natural language processing technology and constructing a summary, means for collecting traffic conditions, weather information, and public event data and making suggestions suitable for the user, and means for processing the user's voice instructions using speech recognition functionality. This enables the user to make optimal decisions without missing important information in their daily life.
[0097] A "communication device" is an online or offline device or equipment that sends and receives information.
[0098] "Means of information gathering" refer to the functions and processes for collecting necessary information from external data sources.
[0099] "Natural language processing technology" is a computational technique for converting human language into a format that computers can easily understand and then analyzing it.
[0100] "Methods for constructing summaries" refer to methods for compactly organizing acquired information and data, extracting important content, and presenting it in a concise form.
[0101] "Activity history" refers to a record of activities and actions a user has performed in the past.
[0102] "Means of suggesting activities" refers to a function that recommends beneficial actions for users based on analyzed data.
[0103] A "terminal device" refers to a device that a user can directly operate, such as a smartphone or tablet.
[0104] A "voice instruction processing function" refers to a program or device that recognizes voice input from a user and performs the necessary operations based on that input.
[0105] "Traffic conditions" refers to information about congestion and traffic conditions on roads and public transportation in a specific area.
[0106] "Weather information" refers to data about local weather conditions, such as temperature, probability of precipitation, and wind speed.
[0107] "Public event data" refers to information about events that are open to the general public.
[0108] This technology is a personal assistant system that efficiently supports urban life. The following describes an embodiment of the system.
[0109] The server collects information from multiple communication devices and utilizes online APIs to obtain data in real time. The hardware used includes cloud servers. For software, the Google Cloud Natural Language API is used for natural language processing. Through this API, it is possible to analyze the acquired information and generate summaries. Furthermore, traffic conditions, weather information, and public event data are continuously collected from relevant organizations and open data sources.
[0110] The server further analyzes the user's behavioral history data and uses an AI model to suggest future activities. For example, if there is a forecast for worsening weather the next day, it will notify the user to bring an umbrella.
[0111] Smartphones and wearable devices are used as terminal devices, and users receive information through them. Amazon Alexa Voice Service is used for voice recognition, processing user voice commands and performing necessary actions.
[0112] Users can interact with the system through their devices and provide feedback through voice and text instructions. This feedback helps update the AI model and improve its accuracy.
[0113] As a concrete example, if a user enters the prompt "What will the weather be like tomorrow?", the system retrieves the next day's weather information from the server and notifies the user's smartphone of a summary processed through natural language processing. In this way, the user can efficiently plan their actions.
[0114] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0115] Step 1:
[0116] The server collects information from multiple communication devices. It uses APIs to obtain real-time traffic conditions, weather forecasts, and event information. The input consists of raw data provided by various data providers, and the output is an internal data structure that unifies the format of this data.
[0117] Step 2:
[0118] The server analyzes the collected data using natural language processing techniques and constructs a summary. It extracts key information and keywords from each data point via the Google Cloud Natural Language API. The input is the internal data structure generated in the previous step, and the output is summarized text data. Specifically, it extracts important topics and summarizes based on them.
[0119] Step 3:
[0120] The server uses an AI model to analyze user behavior history data. Using collected schedule information and past behavior patterns as input, it generates suggestions for future activities based on that information. The output consists of suggestions for actions that are beneficial to the user. Specific actions include everyday suggestions such as bringing an umbrella.
[0121] Step 4:
[0122] The server notifies the user's device of the generated information and suggestions. Push notifications are sent to smartphones and wearable devices. The input consists of summarized information and suggestions, and the output is a notification sent to the user's device.
[0123] Step 5:
[0124] Users provide feedback via voice or text through their terminal device. Voice commands are processed using the Amazon Alexa Voice Service. Input is the user's voice or text command, and output is the system performing an action tailored to that input.
[0125] Step 6:
[0126] The server updates the analysis model based on user feedback. The input is feedback data, and the output is an improved analysis model. Specifically, the feedback is added to the model's training data and used for the next run.
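Steps 1 and 3 above can be sketched together: heterogeneous provider data is first normalized into one internal structure, and suggestions are then derived from it. The field names (precipitation_probability, delay_minutes, route), the severity scale, and the threshold are assumptions made for illustration, and a fixed rule stands in for the AI model named in the description.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    source: str      # "weather" or "traffic"
    summary: str
    severity: float  # 0.0 (benign) to 1.0 (disruptive); scale is an assumption


def normalize_weather(raw: dict) -> Observation:
    """Convert a hypothetical weather record (percent chance of rain) to an Observation."""
    pop = raw["precipitation_probability"]
    return Observation("weather", f"{pop}% chance of rain", pop / 100)


def normalize_traffic(raw: dict) -> Observation:
    """Convert a hypothetical traffic record; a delay of 60 minutes or more is capped at 1.0."""
    delay = raw["delay_minutes"]
    return Observation("traffic", f"{delay} min delay on {raw['route']}", min(delay / 60, 1.0))


def suggest(observations: list[Observation], threshold: float = 0.5) -> list[str]:
    """Emit a suggestion for each observation at or above the severity threshold."""
    suggestions = []
    for obs in observations:
        if obs.severity < threshold:
            continue
        if obs.source == "weather":
            suggestions.append("Bring an umbrella tomorrow.")
        elif obs.source == "traffic":
            suggestions.append(f"Leave early: {obs.summary}.")
    return suggestions
```

Keeping the normalizers separate from the suggestion logic means a new data provider only requires a new normalizer, not changes to the suggestion step.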
[0127] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0128] This invention is a system that more effectively supports users' lives and work by incorporating an emotion engine that recognizes and utilizes user emotions, in addition to a technology that collects information from communication devices and summarizes the data using natural language processing. In this system, the server, terminal, and user collaborate to manage information and provide user assistance.
[0129] The server connects to communication devices authorized by the user and retrieves email and message data. This retrieval is performed periodically, and information across various platforms is centrally managed. The retrieved data is analyzed by a natural language processing engine to identify important topics, and then saved as a summary.
[0130] Next, using its emotion engine, one of the system's key features, the server analyzes the user's emotions from their voice and text. This emotion data is used to adjust information notifications and task suggestions. For example, if the emotion engine detects that the user is experiencing stress, the server can reduce the number of low-priority notifications or offer suggestions for relaxation.
[0131] The terminal's role is to present information and suggestions sent from the server to the user. The terminal is equipped with voice recognition capabilities, allowing the user to operate it using voice commands. In addition, the terminal senses the user's facial expressions and tone of voice and provides information to the emotion engine.
[0132] Users can manage their daily tasks based on information and suggestions provided by their devices. Furthermore, they receive emotionally responsive suggestions and feedback, leading to a high level of satisfaction with the system. Personalized support, provided through emotion engine analysis, enhances the user experience.
[0133] For example, if a user shows signs of fatigue during work, the emotion engine detects this state. Based on this emotion data, the server sends a notification to the terminal suggesting a short exercise or a meditation app that can help refresh the user, thereby supporting their work efficiency. In this way, the system integrates emotion recognition technology to achieve more effective information presentation and user support.
[0134] The following describes the processing flow.
[0135] Step 1:
[0136] The server uses the communication device's API to periodically retrieve email and message data within the scope permitted by the user. This process prioritizes the collection of new and unread messages.
[0137] Step 2:
[0138] The server feeds the acquired data into a natural language processing engine to analyze the information. As a result of the analysis, important keywords and topics are extracted, and a summary is generated based on them.
[0139] Step 3:
[0140] The server uses an emotion engine to recognize the user's emotional state based on their voice commands and text data. This emotional data is then analyzed to determine the user's current psychological state.
[0141] Step 4:
[0142] The server adjusts the content of information notifications and task suggestions based on the analyzed emotion data. For example, if it detects that the user is in a high-stress state, it adjusts the settings to withhold low-priority notifications.
[0143] Step 5:
[0144] The terminal receives notifications from the server and displays information to the user. At the same time, it can receive voice commands from the user via speech recognition and send those commands to the server.
[0145] Step 6:
[0146] The user reviews the notifications and suggestions displayed on the terminal and adjusts their schedule as needed. Emotion-based feedback is sent back to the server, improving the system's adaptability.
[0147] Step 7:
[0148] The terminal sends the user's feedback to the server, which then updates the emotion engine and recommendation algorithms based on that feedback. This continuous updating enables more personalized information delivery.
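As a non-limiting illustration, the notification adjustment of Step 4 may be sketched as follows. The priority levels, the stress threshold of 0.7, the wording of the relaxation suggestion, and the names `Notification` and `filter_notifications` are assumptions introduced for this sketch and are not specified in this disclosure.

```python
from dataclasses import dataclass

# Hypothetical priority levels and stress threshold (not given in the disclosure).
LOW, NORMAL, HIGH = 0, 1, 2
STRESS_THRESHOLD = 0.7


@dataclass
class Notification:
    title: str
    priority: int


def filter_notifications(notifications, stress_level):
    """Return the notifications to deliver for a stress level in [0.0, 1.0].

    At or above STRESS_THRESHOLD, low-priority items are withheld and a
    relaxation suggestion is appended.
    """
    if stress_level < STRESS_THRESHOLD:
        return list(notifications)
    kept = [n for n in notifications if n.priority > LOW]
    kept.append(Notification("Suggestion: take a short break", NORMAL))
    return kept


inbox = [
    Notification("Newsletter digest", LOW),
    Notification("Project deadline tomorrow", HIGH),
]
calm = filter_notifications(inbox, stress_level=0.2)
stressed = filter_notifications(inbox, stress_level=0.9)
print([n.title for n in stressed])
```

In this sketch the stress level is assumed to be supplied by the emotion engine as a single number; an actual emotion engine may output richer emotion data.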
[0149] (Example 2)
[0150] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0151] In modern society, information overload is a problem, and users may miss important information. Furthermore, conventional systems fail to provide information and suggestions that take into account the user's emotional state, resulting in an unoptimized user experience. Therefore, there is a need for technologies that provide personalized information management and suggestions that consider user emotions.
[0152] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0153] In this invention, the server includes means for acquiring data from multiple information devices, means for analyzing the acquired data using natural language processing and generating summaries, and means for analyzing the user's behavioral history and emotional data and making future work suggestions. This makes it possible to provide personalized information according to the user's emotional state, efficiently manage the most useful information for the user from among excessive information, and provide appropriate work suggestions.
[0154] "Information devices" refers to all electronic devices used for generating, transmitting, receiving, and processing data, including computers, smartphones, and other communication devices.
[0155] "Data" refers to all types of information obtained through information devices, including text messages, audio data, emails, and image data.
[0156] "Natural language processing" is a technology that enables computers to understand, interpret, and process human language, and is used for text analysis and summarization.
[0157] A "summary" refers to a concise compilation of the main points of information extracted through natural language processing, enabling users to efficiently understand important information.
[0158] "Behavioral history" refers to a collection of data that records a user's past activities and is used to analyze the user's behavioral patterns and preferences.
[0159] "Emotional data" refers to data that indicates a user's emotional state, and includes information obtained from voice tone, facial expressions, emotional expressions in text, and other sources.
[0160] "Work suggestions" refer to recommended actions that are provided based on the user's current situation and emotional state, and include specific suggestions for performing daily activities more efficiently.
[0161] "Portable devices" refer to electronic devices that users can easily carry with them, and include smartphones and tablet devices.
[0162] "Feedback" refers to reactions and opinions obtained from users, and includes information collected and analyzed to improve system performance and optimize the user experience.
[0163] "Information processing equipment" refers to devices used to electronically process data and perform calculations and analyses, and includes servers and computer systems.
[0164] "Voice commands" refer to a method of giving instructions to a system using voice, including commands processed via speech recognition technology.
[0165] "Personalized notifications" refer to informational notifications that are customized based on the user's individual circumstances and preferences, and include means of providing a personalized experience.
[0166] In this invention, the server acquires data from multiple information devices and analyzes that data using natural language processing. The hardware used includes cloud servers and database servers, and the software utilizes open-source libraries and machine learning frameworks for natural language processing, specifically the Python library spaCy and the machine learning framework TENSORFLOW®. This allows diverse information to be summarized and managed.
[0167] The terminal is responsible for presenting the user with summary data and suggestions received from the server. The terminal has a voice recognition function and communicates with the server to accurately process the user's voice instructions. Voice data and the user's facial expressions are also collected by the terminal, and this emotional data is sent back to the server.
[0168] The user makes decisions based on information received from the terminal. For example, if the user gives a voice command indicating stress, the server analyzes that emotional data and suggests specific actions to help the user relax. Furthermore, the system updates its analysis results based on user feedback, enabling it to provide even more accurate and personalized services.
[0169] For example, a user can send a prompt to the system such as, "Summarize my emails and tell me if I'm stressed," and the server can perform the analysis and send the results to the terminal. In this way, the present invention integrates natural language processing and emotion recognition technology to realize information management and adaptive suggestions in the user's life.
[0170] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0171] Step 1:
[0172] The server periodically acquires data from multiple information devices, including mail servers, messaging applications, and other communication media. The input is raw data from each device, and the output is a centrally collected dataset. This data is stored in a database, serving as a foundation for subsequent analysis.
[0173] Step 2:
[0174] The server analyzes data stored in the database using a natural language processing engine. The input is the dataset collected in step 1, and the output is information in a summarized format highlighting important topics. Specifically, the spaCy library in Python is used to extract topics and perform keyword analysis, concisely summarizing information useful to the user.
[0175] Step 3:
[0176] The server collects voice and text data and analyzes the user's emotions using an emotion engine. Input is voice recordings and text content, and output is a dataset showing the user's emotional state. Machine learning models and emotion analysis algorithms are applied to this emotion analysis. This allows the server to determine if the user is experiencing stress and prepare appropriate responses.
[0177] Step 4:
[0178] The server generates informational notifications and work suggestions based on the data obtained in steps 2 and 3. The input consists of summary information and sentiment data, while the output is user-oriented suggestions and notifications. Specifically, it employs a personalized approach, such as suggesting short relaxation exercises when the user's stress level is high.
[0179] Step 5:
[0180] The terminal displays notifications and suggestions sent from the server to the user. Input is suggestion data from the server, and output is information received by the user visually or audibly. The terminal utilizes voice recognition capabilities, allowing the user to provide voice instructions in response to these notifications.
[0181] Step 6:
[0182] The user decides whether to accept the information and suggestions provided by the terminal and returns feedback through the terminal. The input is the notification presented to the user, and the output is the user's chosen action and subsequent feedback. This feedback is reflected in subsequent analyses and suggestions, contributing to improvements in the overall accuracy of the system and the user experience.
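As a non-limiting illustration, the keyword analysis and summarization of Step 2 may be sketched as follows. This sketch does not use spaCy; it substitutes a simple word-frequency method built on the Python standard library, and the stop-word list, the function names, and the sample messages are assumptions introduced for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real engine would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "at", "to", "of", "and", "for",
              "on", "in", "please", "your", "by", "was"}


def tokenize(text):
    """Lowercase the text and drop stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extract_keywords(messages, top_n=3):
    """Return the top_n most frequent non-stop words across all messages."""
    counts = Counter(w for m in messages for w in tokenize(m))
    return [w for w, _ in counts.most_common(top_n)]


def summarize(messages, top_n=3, max_messages=2):
    """Keep the messages that mention the most keywords, in original order."""
    keywords = set(extract_keywords(messages, top_n))
    scored = [(sum(1 for w in tokenize(m) if w in keywords), i, m)
              for i, m in enumerate(messages)]
    best = sorted(scored, key=lambda t: (-t[0], t[1]))[:max_messages]
    return [m for _, _, m in sorted(best, key=lambda t: t[1])]


messages = [
    "Budget review meeting moved to Friday.",
    "Lunch menu for the cafeteria was updated.",
    "Please send the budget figures before the review meeting.",
]
print(extract_keywords(messages))
print(summarize(messages))
```

The extractive approach is chosen here only because it is easy to verify; the summarization described in the steps above may equally be abstractive.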
[0183] (Application Example 2)
[0184] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".
[0185] In today's information society, users are often overwhelmed by the sheer volume of information and notifications, which frequently contributes to stress. Furthermore, the difficulty in providing flexible responses tailored to individual emotional states makes it challenging for users to live comfortably. Additionally, the lack of well-established methods for optimizing the environment based on user emotions is another significant issue.
[0186] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0187] In this invention, the server includes means for acquiring information from multiple communication devices, means for analyzing the acquired information using natural language processing and generating a summary, means for analyzing the user's behavioral tendencies and making future work suggestions, means for notifying the user's terminal of the information and suggestions, means for collecting user responses and updating the analysis results, and means for analyzing the user's emotions and adjusting the physical environment based on those emotions. This makes it possible to provide appropriate information and adjust the environment in accordance with the user's emotions, thereby providing a comfortable living environment.
[0188] A "communication device" is an electronic device that has the function of sending and receiving information, such as a smartphone or a computer.
[0189] "Natural language processing" is a technology that enables computers to understand, analyze, and generate language that humans use in everyday life.
[0190] A "summary" is a concise compilation of information, extracting the most important parts from the acquired data.
[0191] "Behavioral tendencies" refer to information that shows patterns in a user's past actions and choices, and are used to predict future behavior.
[0192] A "work suggestion" is a suggestion made to a user outlining specific tasks or activities that should be undertaken in the future.
[0193] A "terminal" is a device that connects to a server and has the role of displaying information and accepting user input.
[0194] "Response" refers to the user's reaction or feedback to information or suggestions from the system.
[0195] "Emotions" refer to information that indicates a user's psychological or emotional state, and serve as a basis for the system to analyze and reflect in its actions.
[0196] "Environmental adjustment" refers to the act of changing settings to optimize the physical or digital environment in accordance with the user's emotions and state.
[0197] To implement this invention, a communication device, a server for information processing, and a terminal providing a user interface are required. The server connects to a communication device such as a smartphone or computer and periodically acquires information. The acquired information is analyzed by a natural language processing engine, and a summary is generated. Here, the Python library TextBlob handles the natural language processing. In addition, a predictive model based on past data is used to analyze the user's behavioral tendencies and make suggestions for future work.
[0198] Voice and text input are used to analyze the user's emotions. The terminal is equipped with a microphone and a speaker and uses the SpeechRecognition library for speech recognition. The NaiveBayesAnalyzer from the TextBlob library is used for the emotion analysis. Based on the user's emotions, the terminal notifies the user of the most suitable suggestions provided by the server. Furthermore, the terminal can adjust the physical environment based on the result of the emotion analysis, using LED lighting systems and music playback devices for this adjustment.
[0199] As a concrete example, if a user who feels unwell after waking up says, "I'm awake but still sleepy," the terminal analyzes the user's voice and suggests playing relaxing music and dimming the lighting to a softer glow. An example of a prompt might be: "Design an assistant robot system that analyzes a user's emotions from their voice and optimizes their home environment. If the user indicates stress, play relaxing music and adjust the lighting to a warm color."
[0200] In this way, a system that provides a comfortable environment for users is realized.
[0201] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0202] Step 1:
[0203] The server retrieves information from communication devices such as smartphones and computers at regular intervals. This information includes user messages, schedules, and activity history. The retrieved data is stored in intermediate data storage on the server and prepared as input for the next processing step.
[0204] Step 2:
[0205] The server analyzes the acquired information using a natural language processing engine, extracts important topics, and generates a summary. Here, the TextBlob library is used to analyze the text data and compile the main information into a summary. This summary becomes the input to the emotion analysis in the next step.
[0206] Step 3:
[0207] The server activates the emotion engine using summary data and user voice input to analyze the user's emotional state. Voice input is received through the terminal and converted to text using the SpeechRecognition library. The obtained text data is then analyzed for emotion using TextBlob's NaiveBayesAnalyzer. The analysis results are output as the user's emotional state (e.g., positive, negative, neutral).
[0208] Step 4:
[0209] The terminal receives suggestions generated by the server based on the user's emotional state and notifies the user. These suggestions include specific actions such as playing music to help the user relax or adjusting the lighting. The terminal then operates its built-in speaker and intelligent lighting system to perform the suggested actions. User feedback is also received and sent to the server for analysis.
[0210] Step 5:
[0211] The user can adjust their environment based on information and suggestions from the terminal, allowing for a more comfortable experience. When the user inputs how they felt about the suggestions, the server processes this data to improve the accuracy of future analyses. This enables the server to continuously improve system performance and optimize the user experience.
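As a non-limiting illustration, the mapping from an analyzed emotional state to an environmental adjustment in Steps 3 and 4 may be sketched as follows. The word lists stand in for TextBlob's NaiveBayesAnalyzer, which is not called here, and the lighting values, playlist names, and function names are assumptions introduced for this sketch.

```python
import re
from dataclasses import dataclass

# Hypothetical word lists used in place of a trained classifier.
NEGATIVE_WORDS = {"sleepy", "tired", "stressed", "anxious"}
POSITIVE_WORDS = {"great", "happy", "refreshed"}


@dataclass(frozen=True)
class EnvironmentAction:
    light_color: str
    brightness: int  # percent
    playlist: str


# Hypothetical settings for each emotional state.
ACTIONS = {
    "negative": EnvironmentAction("warm", 40, "relaxing"),
    "neutral": EnvironmentAction("neutral", 70, "none"),
    "positive": EnvironmentAction("daylight", 90, "upbeat"),
}


def classify_emotion(text):
    """Label text as positive, negative, or neutral by counting cue words."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    negative = len(words & NEGATIVE_WORDS)
    positive = len(words & POSITIVE_WORDS)
    if negative > positive:
        return "negative"
    if positive > negative:
        return "positive"
    return "neutral"


def plan_environment(emotion):
    """Return the adjustment for an emotion label, defaulting to neutral."""
    return ACTIONS.get(emotion, ACTIONS["neutral"])


emotion = classify_emotion("I'm awake but still sleepy")
print(emotion, plan_environment(emotion))
```

Keeping the classification and the adjustment table separate allows the classifier to be replaced without changing how the lighting and music are controlled.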
[0212] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0213] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
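As a non-limiting illustration, the combination of a prompt containing instructions with inference data may be sketched as follows. Only text data is handled; audio data and image data are omitted. The class name `InferenceRequest`, its fields, and the numbered layout of the prompt are assumptions introduced for this sketch and do not correspond to the interface of any particular generative AI.

```python
from dataclasses import dataclass, field


@dataclass
class InferenceRequest:
    """A prompt instruction plus the text data to run inference on."""
    instruction: str
    text_items: list = field(default_factory=list)

    def to_prompt(self):
        """Render the instruction followed by numbered text items."""
        lines = [self.instruction, ""]
        lines += [f"[{i}] {t}" for i, t in enumerate(self.text_items, start=1)]
        return "\n".join(lines)


request = InferenceRequest(
    "Summarize my emails and tell me if I'm stressed.",
    ["Meeting at 10:00 on Monday.", "Report due Friday."],
)
prompt = request.to_prompt()
print(prompt)
```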
[0214] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0215] [Second Embodiment]
[0216] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0217] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0218] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0219] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0220] The microphone 238 accepts instructions from the user 20 by receiving the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0221] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0222] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0223] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0224] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0225] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0226] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0227] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0228] This invention is an AI-powered system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play different roles and work together to achieve its functions.
[0229] The server first obtains permission from the user to access the communication device and retrieve information such as emails and messages. In this information gathering process, the server uses the API of each communication device to extract the necessary data and periodically collects updated information.
[0230] Next, the server processes the acquired data through a natural language processing engine to analyze the text. It extracts important topics and keywords and summarizes the data based on them. For example, emails containing meeting schedules or important deadlines are summarized as high-priority information for the user.
[0231] The server then analyzes the user's behavioral history and schedule data to predict future tasks and events. Based on these predictions, the server proposes recommended actions to the user and summarizes the content. The server sends the generated information and suggestions to the terminal, allowing the user to receive notifications in real time.
[0232] The terminal, which may be a mobile device or a wearable device, lets the user receive notifications and access information through various interfaces. The terminal incorporates voice recognition capabilities, enabling it to accept the user's voice commands. This allows the user to perform tasks such as checking schedules and adding new tasks by voice.
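As a non-limiting illustration, the handling of a recognized voice command such as checking the schedule or adding a task may be sketched as follows. The sketch starts from text that has already been produced by voice recognition; the command phrases, the action names, and the function `parse_command` are assumptions introduced for illustration.

```python
import re

# Hypothetical command grammar: (pattern, action name) pairs.
COMMANDS = [
    (re.compile(r"^add (?:a )?task (?P<text>.+)$", re.IGNORECASE), "add_task"),
    (re.compile(r"^(?:check|show) (?:my )?schedule$", re.IGNORECASE), "check_schedule"),
]


def parse_command(transcript):
    """Convert recognized speech text into instruction data for the server."""
    cleaned = transcript.strip().rstrip(".!?")
    for pattern, action in COMMANDS:
        match = pattern.match(cleaned)
        if match:
            return {"action": action, **match.groupdict()}
    return {"action": "unknown", "text": cleaned}


print(parse_command("Add task buy milk."))
print(parse_command("check my schedule"))
```

Unrecognized commands are returned with the action "unknown" so that the server, rather than the terminal, can decide how to respond.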
[0233] Users manage their daily tasks and lives based on the information and suggestions provided by the system. They can also provide feedback, which is sent to the server and used to update the system's analysis model and improve its accuracy.
[0234] This AI-powered information management system is a powerful tool for efficiently managing daily life and work, ensuring that important information is not overlooked amidst a vast amount of data. For example, to prevent users from missing an important meeting on Monday, the server can send a notification the night before and set a reminder in the morning, ensuring the user is properly prepared.
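As a non-limiting illustration, the timing of the notification on the night before and the reminder on the morning of a meeting may be computed as follows. The default times of 20:00 and 07:30 and the function name `reminder_times` are assumptions introduced for this sketch.

```python
from datetime import datetime, time, timedelta


def reminder_times(event_start, evening=time(20, 0), morning=time(7, 30)):
    """Return reminder datetimes: the evening before and the morning of the event.

    A reminder that would fall at or after the event start is dropped.
    """
    day_before = event_start.date() - timedelta(days=1)
    candidates = [
        datetime.combine(day_before, evening),
        datetime.combine(event_start.date(), morning),
    ]
    return [t for t in candidates if t < event_start]


meeting = datetime(2025, 1, 6, 10, 0)  # a Monday
print(reminder_times(meeting))
```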
[0235] The following describes the processing flow.
[0236] Step 1:
[0237] The server retrieves email and message data via the communication device's API, based on the access permissions provided by the user. During this process, the server periodically updates the data and accumulates new information.
[0238] Step 2:
[0239] The server analyzes the acquired data by passing it through a natural language processing engine. This extracts important topics and keywords and summarizes the data. For example, it prioritizes extracting tasks with deadlines and meeting dates.
[0240] Step 3:
[0241] Based on the analyzed information, the server analyzes the user's behavioral history and calendar data, and uses a predictive model to suggest future tasks and events. This allows notifications to include things the user should prepare in advance.
[0242] Step 4:
[0243] The server prepares to push the generated important information and task suggestions to the user's terminal. This information is organized so that the user can access it immediately.
[0244] Step 5:
[0245] The terminal receives push notifications sent from the server and displays them to the user. The terminal can also accept voice commands from the user via its voice interface and send those commands to the server.
[0246] Step 6:
[0247] The user can check the notifications displayed on the terminal and adjust their actions and schedule as needed. The user can also improve the accuracy of the system's recommendations by sending feedback to the server via the terminal.
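As a non-limiting illustration, the use of feedback in Step 6 to improve later recommendations may be sketched as follows. The sketch keeps one score per suggestion category and moves it toward 1.0 when a suggestion is accepted and toward 0.0 when it is rejected. This update rule, the learning rate of 0.2, and the class name `SuggestionRanker` are assumptions; the disclosure does not specify how the analysis model is updated.

```python
class SuggestionRanker:
    """Rank suggestion categories by a running acceptance score."""

    def __init__(self, learning_rate=0.2, initial_score=0.5):
        self.learning_rate = learning_rate
        self.initial_score = initial_score
        self.scores = {}

    def record_feedback(self, category, accepted):
        """Move the category score toward 1.0 (accepted) or 0.0 (rejected)."""
        old = self.scores.get(category, self.initial_score)
        target = 1.0 if accepted else 0.0
        self.scores[category] = old + self.learning_rate * (target - old)

    def rank(self, categories):
        """Return categories ordered from highest to lowest score."""
        return sorted(categories, key=lambda c: -self.scores.get(c, self.initial_score))


ranker = SuggestionRanker()
ranker.record_feedback("exercise", accepted=True)
ranker.record_feedback("meditation", accepted=False)
print(ranker.rank(["meditation", "reading", "exercise"]))
```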
[0248] (Example 1)
[0249] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0250] Modern users are constantly exposed to vast amounts of information from numerous sources, making it difficult to efficiently extract and manage important information. In particular, prioritizing information generated in daily life and work, and managing future schedules, has become increasingly complex. In this context, there is a growing need for systems that provide appropriate information in real time and support future activities.
[0251] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0252] In this invention, the server includes means for acquiring information from a communication terminal, means for analyzing the acquired information using a processing device and generating a summary, and means for analyzing the user's history data and suggesting future tasks. This enables users to efficiently extract important information from diverse sources, predict important tasks in their daily lives and work, and act in a planned manner.
[0253] A "communication terminal" is an electronic device that has the function of receiving or transmitting information.
[0254] "Information" refers to a collection of data or knowledge that is acquired or provided, including emails and messages.
[0255] A "processing device" is a device that performs computational processing, such as analyzing data and summarizing information.
[0256] "Analysis" is the process of breaking down information into its individual elements and converting them into a form that is easy to understand.
[0257] A "summary" is a concise compilation of information, resulting from the extraction of key points.
[0258] A "user" is a person who operates or uses the system.
[0259] "History data" refers to records of activities and actions that a user has performed in the past.
[0260] Proposing a "task" means recommending future activities or actions to the user.
[0261] An "apparatus" is a machine or device that has a specific function.
[0262] "Evaluation" refers to a user's judgment or opinion regarding the information provided by the system.
[0263] An "analytical model" is a mathematical or statistical method used to understand data and make predictions.
[0264] A "portable terminal" is an electronic device with communication capabilities that can be carried around.
[0265] This invention is a system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play their respective roles in acquiring, analyzing, suggesting, and notifying information.
[0266] The server first acquires information from multiple communication terminals. This involves using standardized APIs to retrieve data from various email services and messaging applications. Using a natural language processing engine such as the Google Cloud Natural Language API, the server analyzes the acquired information, extracts important topics and keywords, and generates a summary.
[0267] The server also analyzes user history data and uses a generative AI model to suggest future tasks. Based on past behavioral patterns and schedules, it can recommend the next steps to take and create efficient plans.
[0268] Next, the generated information and suggestions are sent to the terminal in real time. The terminal functions as a mobile device or wearable device and accepts the user's voice commands through voice recognition. This allows users to check information and add new tasks hands-free.
[0269] Furthermore, users can provide feedback on the system's suggestions, which is then sent to the server. The server updates the analysis model based on the collected feedback, improving its accuracy.
[0270] As a concrete example, to ensure that the user does not miss important emails each day, the server sends a reminder the night before, prompting the user to check them the following morning, thus supporting thorough preparation. An example of a prompt for the generative AI model might be, "Please review this week's schedule and summarize important appointments. Also, please suggest the next necessary actions."
[0271] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0272] Step 1:
[0273] The server retrieves information from the communication terminal. Input is API access based on user permission, retrieving email and message data. Output is the retrieved raw data. This information is used in subsequent analysis steps.
[0274] Step 2:
[0275] The server processes the acquired information through a natural language processing engine to analyze the data. The input is the raw data acquired in step 1, which is then processed for keyword extraction and topic recognition. The output consists of summarized text information and a list of high-priority information. Specifically, meetings and deadlines are highlighted in the summary.
[0276] Step 3:
[0277] The server analyzes the user's historical data and proposes future tasks. In this step, in addition to the schedule data from the Calendar API, past behavioral data is input. The generative AI model analyzes this data and provides recommended tasks and schedules as output. Based on the set routines, weekly reports and the like are generated.
[0278] Step 4:
[0279] The server sends the generated information and proposals to the terminal. The input is the summary information and proposals generated in Steps 2 and 3, which are sent to the terminal in a notification format. The output is the visual and audible alerts presented on the terminal, such as pop-up notifications and reminders.
[0280] Step 5:
[0281] The terminal receives voice instructions from the user and sends them to the server. The input is the user's voice command, which is converted into text format by the voice recognition function. The output is instruction data to the server. This allows the user to operate without using their hands.
[0282] Step 6:
[0283] The user provides feedback on the system's proposals. This feedback is input to the server and used to improve the accuracy of the analysis model. The output is the feedback data accumulated in the system, which contributes to the continuous improvement of the model. Customization reflecting the user's opinions is performed.
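The analysis in Step 2 can be illustrated with a minimal sketch. It assumes a simple keyword match in place of the natural language processing engine; the `Message` class, the `PRIORITY_KEYWORDS` set, and the sample inbox are hypothetical and are not taken from the disclosure.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword set; the description only states that
# meetings and deadlines are highlighted in the summary.
PRIORITY_KEYWORDS = {"meeting", "deadline"}


@dataclass
class Message:
    sender: str
    subject: str
    body: str


def is_high_priority(msg: Message) -> bool:
    """Return True if the subject or body mentions a priority keyword."""
    words = set(re.findall(r"[a-z]+", f"{msg.subject} {msg.body}".lower()))
    return bool(words & PRIORITY_KEYWORDS)


def build_digest(messages: list[Message]) -> dict:
    """Step 2 output: a short summary line plus the high-priority subjects."""
    high = [m.subject for m in messages if is_high_priority(m)]
    return {
        "summary": f"{len(messages)} messages, {len(high)} high priority",
        "high_priority": high,
    }


# Sample raw data standing in for the output of Step 1.
inbox = [
    Message("alice@example.com", "Project meeting moved to 10:00", "See you there."),
    Message("bob@example.com", "Lunch?", "Pizza today?"),
    Message("carol@example.com", "Report", "The deadline is Friday."),
]
digest = build_digest(inbox)
print(digest["summary"])
```

In an actual implementation the keyword match would presumably be replaced by the keyword extraction and topic recognition of the natural language processing engine; the shape of the output (summary text plus a high-priority list) follows Step 2.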
[0284] (Application Example 1)
[0285] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".
[0286] In modern urban life, efficiently and accurately providing users with important information out of a vast amount of information is a major challenge. In particular, daily life needs to be streamlined by processing information on traffic conditions, weather conditions, and public events in a timely manner and proposing optimal actions to users. It is also essential to improve operability by making effective use of voice interaction with the user.
[0287] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0288] In this invention, the server includes means for collecting information from a plurality of communication devices, means for analyzing the acquired information by natural language processing technology and constructing a summary, means for collecting traffic conditions, weather information, and public event data and making proposals suitable for users, and means for utilizing a voice recognition function and processing user voice instructions. As a result, users can make optimal judgments without missing important information in their daily lives.
[0289] A "communication device" is an online or offline device or apparatus that transmits and receives information.
[0290] "Means for collecting information" is a function or process for collecting necessary information from external data sources.
[0291] "Natural language processing technology" is a computational technology for converting human language into a form that is easy for a computer to understand and analyzing it.
[0292] "Means for constructing a summary" is a method for compactly summarizing acquired information and data, extracting important content, and presenting it in a concise form.
[0293] An "action history" is a record of activities and operations that a user has performed in the past.
[0294] "Means of suggesting activities" refers to a function that recommends beneficial actions for users based on analyzed data.
[0295] A "terminal device" refers to a device that a user can directly operate, such as a smartphone or tablet.
[0296] A "voice instruction processing function" refers to a program or device that recognizes voice input from a user and performs the necessary operations based on that input.
[0297] "Traffic conditions" refers to information about congestion and traffic conditions on roads and public transportation in a specific area.
[0298] "Weather information" refers to data about local weather conditions, such as temperature, probability of precipitation, and wind speed.
[0299] "Public event data" refers to information about events that are open to the general public.
[0300] This technology is a personal assistant system that efficiently supports urban life. The following describes an embodiment of the system.
[0301] The server collects information from multiple communication devices and utilizes online APIs to obtain data in real time. The hardware used includes cloud servers. For software, the Google Cloud Natural Language API is used for natural language processing. Through this API, it is possible to analyze the acquired information and generate summaries. Furthermore, traffic conditions, weather information, and public event data are continuously collected from relevant organizations and open data sources.
[0302] The server further analyzes the user's behavioral history data and uses an AI model to suggest future activities. For example, if there is a forecast for worsening weather the next day, it will notify the user to bring an umbrella.
[0303] As terminal devices, smartphones and wearable devices are used, and users receive information through these. For the voice recognition function, Amazon Alexa Voice Service is utilized to process the user's voice instructions and execute necessary actions.
[0304] Users can interact with the system through the device and provide instructions in the form of voice or text as feedback. This feedback can be used to update the AI model and improve its accuracy.
[0305] As a specific example, when a user inputs a prompt sentence such as "What will the weather be like tomorrow?", the system obtains the weather information for the next day from the server and notifies the user's smartphone of a summary through natural language processing. In this way, users can efficiently make action plans.
[0306] The flow of the specific processing in Application Example 1 will be explained using Figure 12.
[0307] Step 1:
[0308] The server collects information from multiple communication devices. It uses APIs to obtain traffic conditions, weather forecasts, and event information in real time. The input is raw data provided by various data providers, and the output is an internal data structure in which these data are converted to a unified format.
[0309] Step 2:
[0310] The server analyzes the collected data using natural language processing technology and constructs a summary. Through the Google Cloud Natural Language API, important information and keywords are extracted from each data item. The input is the internal data structure generated in the previous step, and the output is summarized text data. Specifically, important topics are extracted and the summary is built from them.
[0311] Step 3:
[0312] The server uses an AI model to analyze user behavior history data. Using collected schedule information and past behavior patterns as input, it generates suggestions for future activities based on that information. The output consists of suggestions for actions that are beneficial to the user. Specific actions include everyday suggestions such as bringing an umbrella.
[0313] Step 4:
[0314] The server notifies the user's device of the generated information and suggestions. Push notifications are sent to smartphones and wearable devices. The input consists of summarized information and suggestions, and the output is a notification sent to the user's device.
[0315] Step 5:
[0316] Users provide feedback via voice or text through their terminal device. Voice commands are processed using the Amazon Alexa Voice Service. Input is the user's voice or text command, and output is the system performing an action tailored to that input.
[0317] Step 6:
[0318] The server updates the analysis model based on user feedback. The input is feedback data, and the output is an improved analysis model. Specifically, the feedback is added to the model's training data and used for the next run.
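Steps 1 and 3 above can be sketched as follows. This is an illustration under stated assumptions only: the raw record shapes, the `Record` class, the 0.0 to 1.0 severity scale, and the threshold are hypothetical, and a fixed rule stands in for the AI model that generates suggestions.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Unified internal format (Step 1 output). Field names are hypothetical."""
    kind: str        # "weather" or "traffic"
    summary: str
    severity: float  # assumed scale: 0.0 (none) to 1.0 (severe)


def normalize_weather(raw: dict) -> Record:
    # Assumed raw shape: {"condition": "rain", "precip_probability": 0.8}
    p = float(raw["precip_probability"])
    return Record("weather", f"{raw['condition']} ({p:.0%} chance of precipitation)", p)


def normalize_traffic(raw: dict) -> Record:
    # Assumed raw shape: {"route": "Line 1", "delay_minutes": 15}
    delay = int(raw["delay_minutes"])
    return Record("traffic", f"{raw['route']}: {delay} min delay", min(delay / 60, 1.0))


def suggest(records: list[Record], threshold: float = 0.5) -> list[str]:
    """Rule-based stand-in for the suggestion step (Step 3)."""
    suggestions = []
    for r in records:
        if r.kind == "weather" and r.severity >= threshold:
            suggestions.append("Bring an umbrella tomorrow.")
        elif r.kind == "traffic" and r.severity >= threshold:
            suggestions.append("Leave earlier or choose another route.")
    return suggestions


records = [
    normalize_weather({"condition": "rain", "precip_probability": 0.8}),
    normalize_traffic({"route": "Line 1", "delay_minutes": 15}),
]
print(suggest(records))
```

Converting each provider's data into one record type before analysis keeps the later steps independent of any particular data source, which is one plausible reading of the "unified format" in Step 1.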
[0319] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0320] This invention is a system that more effectively supports users' lives and work by incorporating an emotion engine that recognizes and utilizes user emotions, in addition to a technology that collects information from communication devices and summarizes the data using natural language processing. In this system, the server, terminal, and user collaborate to manage information and provide user assistance.
[0321] The server connects to communication devices authorized by the user and retrieves email and message data. This retrieval is performed periodically, and information across various platforms is centrally managed. The retrieved data is analyzed by a natural language processing engine to identify important topics, and then saved as a summary.
[0322] Next, using its emotion engine, one of the system's key features, the server analyzes the user's emotions from their voice and text. This emotion data is used to adjust information notifications and task suggestions. For example, if the emotion engine detects that the user is experiencing stress, the server can reduce the number of low-priority notifications or offer suggestions for relaxation.
[0323] The terminal's role is to present information and suggestions sent from the server to the user. The terminal is equipped with voice recognition capabilities, allowing the user to operate it using voice commands. In addition, the terminal senses the user's facial expressions and tone of voice and provides information to the emotion engine.
[0324] Users can manage their daily tasks based on information and suggestions provided by their devices. Furthermore, they receive emotionally responsive suggestions and feedback, leading to a high level of satisfaction with the system. Personalized support, provided through emotion engine analysis, enhances the user experience.
[0325] For example, if a user shows signs of fatigue during work, the emotion engine detects this emotion. Based on this emotion data, the server then sends a notification to the device suggesting short exercise or meditation apps that can help refresh the user, thereby supporting their work efficiency. In this way, the system integrates emotion recognition technology to achieve more effective information presentation and user support.
[0326] The following describes the processing flow.
[0327] Step 1:
[0328] The server uses the communication device's API to periodically retrieve email and message data within the user's permission. This process prioritizes the collection of new and unread messages.
[0329] Step 2:
[0330] The server feeds the acquired data into a natural language processing engine to analyze the information. As a result of the analysis, important keywords and topics are extracted, and a summary is generated based on them.
[0331] Step 3:
[0332] The server uses an emotion engine to recognize the user's emotional state based on their voice commands and text data. This emotional data is then analyzed to determine the user's current psychological state.
[0333] Step 4:
[0334] The server adjusts the content of information notifications and task suggestions based on the analyzed emotion data. For example, if it detects that the user is in a high-stress state, it adjusts the settings so that low-priority notifications are not sent.
[0335] Step 5:
[0336] The terminal receives notifications from the server and displays information to the user. At the same time, it can receive voice commands from the user via speech recognition and send those commands to the server.
[0337] Step 6:
[0338] Users review notifications and suggestions displayed on their devices and adjust their schedules as needed. Emotion-based feedback is sent back to the server, improving the system's adaptability.
[0339] Step 7:
[0340] The terminal sends user feedback to the server, which then updates the emotion engine and recommendation algorithms based on that feedback. This continuous updating enables more personalized information delivery.
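The adjustment in Step 4 of this flow can be sketched as a filter over pending notifications. The sketch assumes that the emotion engine reports stress as a value between 0.0 and 1.0 and that each notification carries an integer priority; both conventions, as well as the thresholds and the `Notification` class, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    text: str
    priority: int  # assumed scale: 1 (low) to 3 (high)


def adjust_notifications(
    notifications: list[Notification],
    stress_level: float,
    stress_threshold: float = 0.7,
    min_priority_when_stressed: int = 2,
) -> list[Notification]:
    """Hold back low-priority notifications while the user is under stress."""
    if stress_level < stress_threshold:
        return list(notifications)
    kept = [n for n in notifications if n.priority >= min_priority_when_stressed]
    # Add a relaxation suggestion, as in the example given for the emotion engine.
    kept.append(Notification("Consider a short break or a breathing exercise.", 1))
    return kept


pending = [
    Notification("Board meeting at 9:00", 3),
    Notification("Weekly newsletter", 1),
]
calm = adjust_notifications(pending, stress_level=0.2)
stressed = adjust_notifications(pending, stress_level=0.9)
print([n.text for n in stressed])
```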
[0341] (Example 2)
[0342] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0343] In modern society, information overload is a problem, and users may miss important information. Furthermore, conventional systems fail to provide information and suggestions that take into account the user's emotional state, resulting in an unoptimized user experience. Therefore, there is a need for technologies that provide personalized information management and suggestions that consider user emotions.
[0344] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0345] In this invention, the server includes means for acquiring data from multiple information devices, means for analyzing the acquired data using natural language processing and generating summaries, and means for analyzing the user's behavioral history and emotional data and making future work suggestions. This makes it possible to provide personalized information according to the user's emotional state, efficiently manage the most useful information for the user from among excessive information, and provide appropriate work suggestions.
[0346] "Information equipment" refers to all electronic devices used for generating, transmitting, receiving, and processing data, including computers, smartphones, and other communication devices.
[0347] "Data" refers to all types of information obtained through information devices, including text messages, audio data, emails, and image data.
[0348] "Natural language processing" is a technology that enables computers to understand, interpret, and process human language, and is used for text analysis and summarization.
[0349] A "summary" refers to a concise compilation of the main points of information extracted through natural language processing, enabling users to efficiently understand important information.
[0350] "Behavioral history" refers to a collection of data that records a user's past activities and is used to analyze the user's behavioral patterns and preferences.
[0351] "Emotional data" refers to data that indicates a user's emotional state, and includes information obtained from voice tone, facial expressions, emotional expressions in text, and other sources.
[0352] "Work suggestions" refer to recommended actions that are provided based on the user's current situation and emotional state, and include specific suggestions for performing daily activities more efficiently.
[0353] "Portable devices" refer to electronic devices that users can easily carry with them, and include smartphones and tablet devices.
[0354] "Feedback" refers to reactions and opinions obtained from users, and includes information collected and analyzed to improve system performance and optimize the user experience.
[0355] "Information processing equipment" refers to devices used to electronically process data and perform calculations and analyses, and includes servers and computer systems.
[0356] "Voice commands" refer to a method of giving instructions to a system using voice, including commands processed via speech recognition technology.
[0357] "Personalized notifications" refer to informational notifications that are customized based on the user's individual circumstances and preferences, and include means of providing a personalized experience.
[0358] In this invention, the server acquires data from multiple information devices and analyzes that data using natural language processing. The hardware used includes cloud servers and database servers, and the software utilizes open-source libraries and machine learning frameworks for natural language processing, specifically the Python library spaCy and the machine learning framework TensorFlow. This allows diverse information to be summarized and managed.
[0359] The terminal is responsible for presenting the user with summary data and suggestions received from the server. The terminal has a voice recognition function and communicates with the server to accurately process the user's voice instructions. Voice data and the user's facial expressions are also collected by the terminal, and this emotional data is sent back to the server.
[0360] Users make decisions based on information received from their devices. For example, if a user gives a voice command indicating stress, the server analyzes that emotional data and suggests specific actions to help them relax. Furthermore, the system updates its analysis results based on user feedback, enabling it to provide even more accurate and personalized services.
[0361] For example, a user can send a prompt to the system such as, "Summarize my emails and tell me if I'm stressed," and the server can perform the analysis and send the results to the terminal. In this way, the present invention integrates natural language processing and emotion recognition technology to realize information management and adaptive suggestions in the user's life.
[0362] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0363] Step 1:
[0364] The server periodically acquires data from multiple information devices, including mail servers, messaging applications, and other communication media. The input is raw data from each device, and the output is a centrally collected dataset. This data is stored in a database, serving as a foundation for subsequent analysis.
[0365] Step 2:
[0366] The server analyzes data stored in the database using a natural language processing engine. The input is the dataset collected in step 1, and the output is information in a summarized format highlighting important topics. Specifically, the spaCy library in Python is used to extract topics and perform keyword analysis, concisely summarizing information useful to the user.
[0367] Step 3:
[0368] The server collects voice and text data and analyzes the user's emotions using an emotion engine. Input is voice recordings and text content, and output is a dataset showing the user's emotional state. Machine learning models and emotion analysis algorithms are applied to this emotion analysis. This allows the server to determine if the user is experiencing stress and prepare appropriate responses.
[0369] Step 4:
[0370] The server generates informational notifications and work suggestions based on the data obtained in steps 2 and 3. The input consists of the summary information and the emotion data, while the output is user-oriented suggestions and notifications. Specifically, it employs a personalized approach, such as suggesting short relaxation exercises when the user's stress level is high.
[0371] Step 5:
[0372] The terminal displays notifications and suggestions sent from the server to the user. Input is suggestion data from the server, and output is information received by the user visually or audibly. The terminal utilizes voice recognition capabilities, allowing the user to provide voice instructions in response to these notifications.
[0373] Step 6:
[0374] The user decides whether to accept the information and suggestions provided by the device and sends feedback to the device as an action. The input is a notification to the user, and the output is the user's action choice and subsequent feedback. This feedback is reflected in subsequent analyses and suggestions, contributing to improvements in the overall accuracy of the system and the user experience.
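The keyword analysis and summarization of Step 2 can be illustrated with a small frequency-based sketch. It uses only the Python standard library as a stand-in for spaCy, so it does not show spaCy's API; the stopword list, the scoring rule, and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list.
STOPWORDS = {"the", "and", "for", "was", "are", "with", "that", "this"}


def tokens(text: str) -> list[str]:
    """Lowercase words longer than two letters, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if len(w) > 2 and w not in STOPWORDS]


def top_keywords(text: str, n: int = 3) -> list[str]:
    """Most frequent content words in the text."""
    return [w for w, _ in Counter(tokens(text)).most_common(n)]


def summarize(text: str, max_sentences: int = 1) -> str:
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokens(text))
    ranked = sorted(sentences,
                    key=lambda s: sum(freq[w] for w in tokens(s)),
                    reverse=True)
    keep = set(ranked[:max_sentences])
    # Emit the kept sentences in their original order.
    return " ".join(s for s in sentences if s in keep)


text = ("The budget review meeting is on Friday. "
        "Please send the budget draft before the meeting. "
        "Lunch was nice.")
print(top_keywords(text, 2))
print(summarize(text))
```

This scoring favors longer sentences and ignores grammar; a library such as spaCy would allow the same selection to be driven by lemmas, named entities, or noun phrases instead of raw word counts.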
[0375] (Application Example 2)
[0376] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0377] In today's information society, users are often overwhelmed by the sheer volume of information and notifications, which frequently contributes to stress. Furthermore, the difficulty in providing flexible responses tailored to individual emotional states makes it challenging for users to live comfortably. Additionally, the lack of well-established methods for optimizing the environment based on user emotions is another significant issue.
[0378] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0379] In this invention, the server includes means for acquiring information from multiple communication devices, means for analyzing the acquired information using natural language processing and generating a summary, means for analyzing the user's behavioral tendencies and making future work suggestions, means for notifying the user's terminal of the information and suggestions, means for collecting user responses and updating the analysis results, and means for analyzing the user's emotions and adjusting the physical environment based on those emotions. This makes it possible to provide appropriate information and adjust the environment in accordance with the user's emotions, thereby providing a comfortable living environment.
[0380] A "communication device" is an electronic device that has the function of sending and receiving information, such as a smartphone or a computer.
[0381] "Natural language processing" is a technology that enables computers to understand, analyze, and generate language that humans use in everyday life.
[0382] A "summary" is a concise compilation of information, extracting the most important parts from the acquired data.
[0383] "Behavioral tendencies" refer to information that shows patterns in a user's past actions and choices, and are used to predict future behavior.
[0384] A "task proposal" is a suggestion made to a user outlining specific tasks or activities that should be undertaken in the future.
[0385] A "terminal" is a device that connects to a server and has the role of displaying information and accepting user input.
[0386] "Response" refers to the user's reaction or feedback to information or suggestions from the system.
[0387] "Emotions" refer to information that indicates a user's psychological or emotional state, and serve as a basis for the system to analyze and reflect in its actions.
[0388] "Environmental adjustment" refers to the act of changing settings to optimize the physical or digital environment in accordance with the user's emotions and state.
[0389] To implement this invention, a communication device, a server for information processing, and a terminal providing a user interface are required. The server is a device that connects to a communication device such as a smartphone or computer and periodically acquires information. The acquired information is analyzed by a natural language processing engine and a summary is generated. Here, the TextBlob library using Python is responsible for natural language processing. In addition, a predictive model based on past data is used to analyze user behavior trends and make suggestions for future tasks.
[0390] Voice and text input are used for analyzing user sentiment. The device is equipped with a microphone and speaker, and utilizes the SpeechRecognition library for speech recognition. Based on the user's sentiment, the device notifies the user of the most suitable suggestions provided by the server. Furthermore, the device can adjust the physical environment based on the user's sentiment analysis, with LED lighting systems and music playback devices being used for this adjustment. The NaiveBayesAnalyzer from the TextBlob library is used for sentiment analysis.
[0391] As a concrete example, if a user who feels unwell after waking up says, "I'm awake but still sleepy," the device will analyze their voice and suggest playing relaxing music and adjusting the lighting to a softer glow. An example of a prompt might be: "Design an assistant robot system that analyzes a user's emotions from their voice and optimizes their home environment. If the user indicates stress, play relaxing music and adjust the lighting to a warm color."
[0392] In this way, a system that provides a comfortable environment for users is realized.
[0393] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0394] Step 1:
[0395] The server retrieves information from communication devices such as smartphones and computers at regular intervals. This information includes user messages, schedules, and activity history. The retrieved data is stored in intermediate data storage on the server and prepared as input for the next processing step.
[0396] Step 2:
[0397] The server analyzes the acquired information using a natural language processing engine, extracts important topics, and generates a summary. Here, the TextBlob library is used to analyze the text data and compile the main information into a summary. This summary is the output of this step and serves as the input to the emotion analysis in the next step.
[0398] Step 3:
[0399] The server activates the emotion engine using summary data and user voice input to analyze the user's emotional state. Voice input is received through the terminal and converted to text using the SpeechRecognition library. The obtained text data is then analyzed for emotion using TextBlob's NaiveBayesAnalyzer. The analysis results are output as the user's emotional state (e.g., positive, negative, neutral).
[0400] Step 4:
[0401] The device receives suggestions generated by the server based on the user's emotional state and notifies the user. These suggestions include specific actions such as playing music to help the user relax or adjusting the lighting. The device then operates its built-in speaker and intelligent lighting system to perform the suggested actions. User feedback is also received and sent to the server for analysis.
[0402] Step 5:
[0403] Users can adjust their environment based on information and suggestions from their devices, allowing them to have a more comfortable experience. When users input how they felt about the suggestions, the server processes this data to improve the accuracy of future analyses. This enables the server to continuously improve system performance and optimize the user experience.
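Steps 3 and 4 can be sketched as a mapping from an emotion label to environment settings. TextBlob's NaiveBayesAnalyzer reports a classification together with the probabilities `p_pos` and `p_neg`; the sketch takes those two probabilities as plain numbers so that it runs without TextBlob installed. The neutral band (`margin`), the `EnvironmentSettings` fields, and the concrete brightness values and playlist names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnvironmentSettings:
    light_color: str
    brightness: int          # percent
    playlist: Optional[str]  # None means no music is started


def label_emotion(p_pos: float, p_neg: float, margin: float = 0.2) -> str:
    """Map analyzer probabilities to positive / negative / neutral.

    The neutral band is an assumption; it is not part of the analyzer output.
    """
    if p_pos - p_neg > margin:
        return "positive"
    if p_neg - p_pos > margin:
        return "negative"
    return "neutral"


def plan_environment(label: str) -> EnvironmentSettings:
    """Choose lighting and music for the estimated emotional state."""
    if label == "negative":
        # Warm, dimmed light and relaxing music, as in the stress example.
        return EnvironmentSettings("warm", 40, "relaxing")
    if label == "positive":
        return EnvironmentSettings("neutral", 80, None)
    return EnvironmentSettings("neutral", 60, None)


# Example probabilities, as might be reported for "I'm awake but still sleepy".
label = label_emotion(p_pos=0.2, p_neg=0.8)
settings = plan_environment(label)
print(label, settings)
```

Keeping the label-to-settings mapping separate from the analyzer makes it possible to change the emotion engine, or to refine the mapping from user feedback as in Step 5, without touching the device control code.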
[0404] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0405] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives, as input, prompts containing instructions and inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0406] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0407] [Third Embodiment]
[0408] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0409] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0410] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0411] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0412] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0413] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0414] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0415] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0416] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0417] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0418] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0419] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0420] This invention is an AI-powered system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play different roles and work together to achieve its functions.
[0421] The server first obtains permission from the user to access the communication device and retrieve information such as emails and messages. In this information gathering process, the server uses the API of each communication device to extract the necessary data and periodically collects updated information.
[0422] Next, the server processes the acquired data through a natural language processing engine to analyze the text. It extracts important topics and keywords and summarizes the data based on them. For example, emails containing meeting schedules or important deadlines are summarized as high-priority information for the user.
[0423] The server then analyzes the user's behavioral history and schedule data to predict future tasks and events. Based on these predictions, the server proposes recommended actions to the user and summarizes the content. The server sends the generated information and suggestions to the terminal, allowing the user to receive notifications in real time.
[0424] The terminal allows users to receive notifications and access information through various interfaces on mobile and wearable devices. The terminal incorporates voice recognition capabilities, enabling it to accept the user's voice commands. This allows users to perform tasks such as checking schedules and adding new tasks by voice.
[0425] Users manage their daily tasks and lives based on the information and suggestions provided by the system. They can also provide feedback, which is sent to the server and used to update the system's analysis model and improve its accuracy.
[0426] This AI-powered information management system is a powerful tool for efficiently managing daily life and work, ensuring that important information is not overlooked amidst a vast amount of data. For example, to prevent users from missing an important meeting on Monday, the server can send a notification the night before and set a reminder in the morning, ensuring the user is properly prepared.
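As a minimal sketch of the reminder example above, the following computes a night-before notification time and a morning-of reminder time for a meeting. The function name `reminder_times`, the 21:00 evening hour, and the two-hour morning lead are illustrative assumptions; the disclosure does not specify them.

```python
from datetime import datetime, time, timedelta


def reminder_times(meeting_start, evening_hour=21, morning_lead=timedelta(hours=2)):
    """Return [night-before notification time, morning-of reminder time].

    evening_hour and morning_lead are assumed defaults, not values from the text.
    """
    day_before = meeting_start.date() - timedelta(days=1)
    night_before = datetime.combine(day_before, time(evening_hour, 0))
    morning_of = meeting_start - morning_lead
    return [night_before, morning_of]


meeting = datetime(2025, 1, 6, 10, 0)  # a Monday, 10:00
for t in reminder_times(meeting):
    print(t.isoformat())
```

In a deployed system these times would presumably be handed to the push-notification step rather than printed.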
[0427] The following describes the processing flow.
[0428] Step 1:
[0429] The server retrieves email and message data via the communication device's API, based on the access permissions provided by the user. During this process, the server periodically updates the data and accumulates new information.
[0430] Step 2:
[0431] The server analyzes the acquired data by passing it through a natural language processing engine. This extracts important topics and keywords and summarizes the data. For example, it prioritizes extracting tasks with deadlines and meeting dates.
[0432] Step 3:
[0433] Based on the analyzed information, the server analyzes the user's behavioral history and calendar data, and uses a predictive model to suggest future tasks and events. This allows notifications to include things the user should prepare in advance.
[0434] Step 4:
[0435] The server prepares to push the generated important information and task suggestions to the user's terminal. This information is organized so that the user can access it immediately.
[0436] Step 5:
[0437] The terminal receives push notifications sent from the server and displays them to the user. The terminal can also accept voice commands from the user via its voice interface and send those commands to the server.
[0438] Step 6:
[0439] Users check the notifications displayed on their terminals and adjust their actions and schedules as needed. They can also improve the accuracy of the system's recommendations by sending feedback to the server via their terminals.
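The keyword extraction and prioritization of Step 2 can be illustrated with a small rule-based sketch. This is a stand-in for the natural language processing engine, not the engine itself: the keyword list `HIGH_PRIORITY_TERMS`, the `Summary` structure, and the function names are all hypothetical.

```python
import re
from dataclasses import dataclass

# Assumed keyword list; a real NLP engine would extract topics statistically.
HIGH_PRIORITY_TERMS = ("deadline", "due", "meeting")


@dataclass
class Summary:
    subject: str
    keywords: list
    high_priority: bool


def summarize(subject, body):
    """Flag a message as high priority if it mentions a deadline or meeting term."""
    words = re.findall(r"[a-z]+", f"{subject} {body}".lower())
    found = [term for term in HIGH_PRIORITY_TERMS if term in words]
    return Summary(subject=subject, keywords=found, high_priority=bool(found))


def prioritize(messages):
    """Summarize (subject, body) pairs and list high-priority items first."""
    summaries = [summarize(subject, body) for subject, body in messages]
    # sorted() is stable, so messages keep their order within each group.
    return sorted(summaries, key=lambda s: not s.high_priority)


inbox = [
    ("Lunch?", "Want to grab lunch"),
    ("Project review", "The meeting is on Monday and the report is due Friday"),
]
for item in prioritize(inbox):
    print(item.high_priority, item.subject, item.keywords)
```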
[0440] (Example 1)
[0441] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0442] Modern users are constantly exposed to vast amounts of information from numerous sources, making it difficult to efficiently extract and manage important information. In particular, prioritizing information generated in daily life and work, and managing future schedules, has become increasingly complex. In this context, there is a growing need for systems that provide appropriate information in real time and support future activities.
[0443] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0444] In this invention, the server includes means for acquiring information from a communication terminal, means for analyzing the acquired information using a processing device and generating a summary, and means for analyzing the user's history data and suggesting future tasks. This enables users to efficiently extract important information from diverse sources, predict important tasks in their daily lives and work, and act in a planned manner.
[0445] A "communication terminal" is an electronic device that has the function of receiving or transmitting information.
[0446] "Information" refers to a collection of data or knowledge that is acquired or provided, including emails and messages.
[0447] A "processing device" is a device that performs computational processing, such as analyzing data and summarizing information.
[0448] "Analysis" is the process of breaking down information into its individual elements and converting them into a form that is easy to understand.
[0449] A "summary" is a concise compilation of information, resulting from the extraction of key points.
[0450] A "user" is a person who operates or uses the system.
[0451] "History data" refers to records of activities and actions that a user has performed in the past.
[0452] Proposing a "task" means recommending future activities or actions to the user.
[0453] An "apparatus" is a machine or device that has a specific function.
[0454] "Evaluation" refers to a user's judgment or opinion regarding the information provided by the system.
[0455] An "analytical model" is a mathematical or statistical method used to understand data and make predictions.
[0456] A "portable terminal" is an electronic device with communication capabilities that can be carried around.
[0457] This invention is a system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play their respective roles in acquiring, analyzing, suggesting, and notifying information.
[0458] The server first acquires information from multiple communication terminals. This involves using standardized APIs to retrieve data from various email services and messaging applications. Using a natural language processing engine such as the Google Cloud Natural Language API, the server analyzes the acquired information, extracts important topics and keywords, and generates a summary.
[0459] The server also analyzes user history data and uses a generative AI model to suggest future tasks. Based on past behavioral patterns and schedules, it can recommend the next steps to take and create efficient plans.
[0460] Next, the generated information and suggestions are sent to the terminal in real time. The terminal functions as a mobile device or wearable device and accepts the user's voice commands through voice recognition. This allows users to check information and add new tasks hands-free.
[0461] Furthermore, users can provide feedback on the system's suggestions, which is then sent to the server. The server updates the analysis model based on the collected feedback, improving its accuracy.
[0462] As a concrete example, to ensure that users do not miss important emails, the server sends a reminder the night before prompting them to check those emails the following morning, thus supporting thorough preparation. An example of a prompt for the generative AI model might be, "Please review this week's schedule and summarize important appointments. Also, please suggest the next necessary actions."
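One way such a prompt might be assembled from schedule data is sketched below. The function name `build_prompt` and the `(when, what)` schedule format are assumptions made for illustration, and no particular generative AI model or API is implied.

```python
INSTRUCTION = (
    "Please review this week's schedule and summarize important appointments. "
    "Also, please suggest the next necessary actions."
)


def build_prompt(appointments):
    """Append the week's appointments, one per line, to the fixed instruction."""
    lines = [f"- {when}: {what}" for when, what in appointments]
    return INSTRUCTION + "\n\nSchedule:\n" + "\n".join(lines)


print(build_prompt([("Mon 10:00", "Budget meeting"), ("Fri 17:00", "Report deadline")]))
```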
[0463] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0464] Step 1:
[0465] The server retrieves information from the communication terminal. The input is API access granted by the user's permission, through which email and message data are retrieved. The output is the retrieved raw data, which is used in the subsequent analysis steps.
[0466] Step 2:
[0467] The server processes the acquired information through a natural language processing engine to analyze the data. The input is the raw data acquired in step 1, which is then processed for keyword extraction and topic recognition. The output consists of summarized text information and a list of high-priority information. Specifically, meetings and deadlines are highlighted in the summary.
[0468] Step 3:
[0469] The server analyzes the user's historical data and suggests future tasks. This step involves inputting schedule data from the calendar API, as well as past behavioral data. A generative AI model analyzes this data and provides recommended tasks and appointments as output. Weekly reports and other similar reports are generated based on the configured routines.
[0470] Step 4:
[0471] The server sends the generated information and suggestions to the terminal. The input is the summary information and suggestions generated in steps 2 and 3, which are sent to the user's terminal in notification format. The output is a visual and audible alert presented on the terminal, such as pop-up notifications and reminders.
[0472] Step 5:
[0473] The terminal receives voice commands from the user and sends them to the server. The input is the user's voice command, which is converted into text format by speech recognition. The output is instruction data sent to the server. This allows the user to operate the device hands-free.
[0474] Step 6:
[0475] Users provide feedback on the system's suggestions. This feedback is entered into the server and used to improve the accuracy of the analysis model. The output is the feedback data accumulated in the system, contributing to the continuous improvement of the model. Customization is performed to reflect user feedback.
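The interplay between task suggestion (Step 3) and feedback (Step 6) can be sketched as follows. Note the substitution: the text describes a generative AI model, whereas this sketch uses a simple frequency count over past behavior with a per-task weight adjusted by feedback. The class `TaskSuggester`, its method names, and the update rule are hypothetical.

```python
from collections import Counter, defaultdict


class TaskSuggester:
    """Suggest tasks the user often performed on a given weekday."""

    def __init__(self):
        # Per-task weight, raised or lowered by user feedback (assumed scheme).
        self.weights = defaultdict(lambda: 1.0)

    def suggest(self, history, weekday, top_n=3):
        """history is a list of (weekday, task) records of past behavior."""
        counts = Counter(task for day, task in history if day == weekday)
        scored = {task: n * self.weights[task] for task, n in counts.items()}
        # Highest score first; ties are broken alphabetically for determinism.
        return sorted(scored, key=lambda t: (-scored[t], t))[:top_n]

    def feedback(self, task, accepted, step=0.5):
        """Scale the task's weight up when accepted and down when rejected."""
        self.weights[task] *= (1 + step) if accepted else (1 - step)


history = [
    ("Mon", "weekly report"),
    ("Mon", "weekly report"),
    ("Mon", "team sync"),
    ("Tue", "gym"),
]
suggester = TaskSuggester()
print(suggester.suggest(history, "Mon"))
```

Rejecting a suggestion repeatedly lowers its rank in later calls, which is one simple way the accumulated feedback could customize the output.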
[0476] (Application Example 1)
[0477] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0478] In modern urban life, efficiently and accurately providing users with important information from a vast amount of data is a major challenge. In particular, there is a need to streamline daily life by processing information on traffic conditions, weather, and public events in a timely manner and suggesting optimal actions for users. Furthermore, improving usability through the effective use of user voice interaction is essential.
[0479] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0480] In this invention, the server includes means for collecting information from multiple communication devices, means for analyzing the acquired information using natural language processing technology and constructing a summary, means for collecting traffic conditions, weather information, and public event data and making suggestions suitable for the user, and means for processing the user's voice instructions using speech recognition functionality. This enables the user to make optimal decisions without missing important information in their daily life.
[0481] A "communication device" is an online or offline device or equipment that sends and receives information.
[0482] "Means of information gathering" refers to the functions and processes for collecting necessary information from external data sources.
[0483] "Natural language processing technology" is a computational technique for converting human language into a format that computers can easily understand and then analyzing it.
[0484] "Methods for constructing summaries" refers to methods for compactly organizing acquired information and data, extracting important content, and presenting it in a concise form.
[0485] "Activity history" refers to a record of activities and actions a user has performed in the past.
[0486] "Means of suggesting activities" refers to a function that recommends beneficial actions for users based on analyzed data.
[0487] A "terminal device" refers to a device that a user can directly operate, such as a smartphone or tablet.
[0488] A "voice instruction processing function" refers to a program or device that recognizes voice input from a user and performs the necessary operations based on that input.
[0489] "Traffic conditions" refers to information about congestion and traffic conditions on roads and public transportation in a specific area.
[0490] "Weather information" refers to data about local weather conditions, such as temperature, probability of precipitation, and wind speed.
[0491] "Public event data" refers to information about events that are open to the general public.
[0492] This technology is a personal assistant system that efficiently supports urban life. The following describes an embodiment of the system.
[0493] The server collects information from multiple communication devices and utilizes online APIs to obtain data in real time. The hardware used includes cloud servers. For software, the Google Cloud Natural Language API is used for natural language processing. Through this API, it is possible to analyze the acquired information and generate summaries. Furthermore, traffic conditions, weather information, and public event data are continuously collected from relevant organizations and open data sources.
[0494] The server further analyzes the user's behavioral history data and uses an AI model to suggest future activities. For example, if there is a forecast for worsening weather the next day, it will notify the user to bring an umbrella.
[0495] Smartphones and wearable devices are used as terminal devices, and users receive information through them. Amazon Alexa Voice Service is used for voice recognition, processing user voice commands and performing necessary actions.
[0496] Users can interact with the system through their devices and provide feedback through voice and text instructions. This feedback helps update the AI model and improve its accuracy.
[0497] As a concrete example, if a user enters the prompt "What will the weather be like tomorrow?", the system retrieves the next day's weather information from the server and notifies the user's smartphone of a summary processed through natural language processing. In this way, the user can efficiently plan their actions.
[0498] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0499] Step 1:
[0500] The server collects information from multiple communication devices. It uses APIs to obtain real-time traffic conditions, weather forecasts, and event information. The input consists of raw data provided by various data providers, and the output is an internal data structure that unifies the format of this data.
[0501] Step 2:
[0502] The server analyzes the collected data using natural language processing techniques and constructs a summary. It extracts key information and keywords from each data point via the Google Cloud Natural Language API. The input is the internal data structure generated in the previous step, and the output is summarized text data. Specifically, it extracts important topics and summarizes based on them.
[0503] Step 3:
[0504] The server uses an AI model to analyze user behavior history data. Using collected schedule information and past behavior patterns as input, it generates suggestions for future activities based on that information. The output consists of suggestions for actions that are beneficial to the user. Specific actions include everyday suggestions such as bringing an umbrella.
[0505] Step 4:
[0506] The server notifies the user's device of the generated information and suggestions. Push notifications are sent to smartphones and wearable devices. The input consists of summarized information and suggestions, and the output is a notification sent to the user's device.
[0507] Step 5:
[0508] Users provide feedback via voice or text through their terminal device. Voice commands are processed using the Amazon Alexa Voice Service. Input is the user's voice or text command, and output is the system performing an action tailored to that input.
[0509] Step 6:
[0510] The server updates the analysis model based on user feedback. The input is feedback data, and the output is an improved analysis model. Specifically, the feedback is added to the model's training data and used for the next run.
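Steps 1 and 3 of this flow, unifying provider data into one internal structure and deriving everyday suggestions from it, might look like the sketch below. The raw field names (`precip_percent`, `route`, `delay_minutes`), the `Observation` structure, and the 0.5 threshold are assumptions; actual provider formats will differ.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Unified internal record for data from different providers."""
    source: str      # "weather" or "traffic"
    summary: str
    severity: float  # 0.0 (negligible) to 1.0 (severe)


def normalize_weather(raw):
    pct = raw["precip_percent"]  # hypothetical provider field
    return Observation("weather", f"{pct}% chance of rain", pct / 100)


def normalize_traffic(raw):
    delay = raw["delay_minutes"]  # hypothetical provider field
    summary = f"{delay} min delay on {raw['route']}"
    return Observation("traffic", summary, min(delay / 60, 1.0))


def suggest(observations, threshold=0.5):
    """Turn sufficiently severe observations into everyday suggestions."""
    suggestions = []
    for obs in observations:
        if obs.severity < threshold:
            continue
        if obs.source == "weather":
            suggestions.append("Bring an umbrella tomorrow.")
        elif obs.source == "traffic":
            suggestions.append(f"Leave earlier than usual: {obs.summary}.")
    return suggestions


observations = [
    normalize_weather({"precip_percent": 70}),
    normalize_traffic({"route": "Line 1", "delay_minutes": 10}),
]
print(suggest(observations))
```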
[0511] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0512] This invention is a system that more effectively supports users' lives and work by incorporating an emotion engine that recognizes and utilizes user emotions, in addition to a technology that collects information from communication devices and summarizes the data using natural language processing. In this system, the server, terminal, and user collaborate to manage information and provide user assistance.
[0513] The server connects to communication devices authorized by the user and retrieves email and message data. This retrieval is performed periodically, and information across various platforms is centrally managed. The retrieved data is analyzed by a natural language processing engine to identify important topics, and then saved as a summary.
[0514] Next, using its emotion engine, one of the system's key features, the server analyzes the user's emotions from their voice and text. This emotion data is used to adjust information notifications and task suggestions. For example, if the emotion engine detects that the user is experiencing stress, the server can reduce the number of low-priority notifications or offer suggestions for relaxation.
[0515] The terminal's role is to present information and suggestions sent from the server to the user. The terminal is equipped with voice recognition capabilities, allowing the user to operate it using voice commands. In addition, the terminal senses the user's facial expressions and tone of voice and provides information to the emotion engine.
[0516] Users can manage their daily tasks based on information and suggestions provided by their devices. Furthermore, they receive emotionally responsive suggestions and feedback, leading to a high level of satisfaction with the system. Personalized support, provided through emotion engine analysis, enhances the user experience.
[0517] For example, if a user shows signs of fatigue during work, the emotion engine detects this emotion. Based on this emotion data, the server then sends a notification to the device suggesting short exercise or meditation apps that can help refresh the user, thereby supporting their work efficiency. In this way, the system integrates emotion recognition technology to achieve more effective information presentation and user support.
[0518] The following describes the processing flow.
[0519] Step 1:
[0520] The server uses the communication device's API to periodically retrieve email and message data within the user's permission. This process prioritizes the collection of new and unread messages.
[0521] Step 2:
[0522] The server feeds the acquired data into a natural language processing engine to analyze the information. As a result of the analysis, important keywords and topics are extracted, and a summary is generated based on them.
[0523] Step 3:
[0524] The server uses an emotion engine to recognize the user's emotional state based on their voice commands and text data. This emotional data is then analyzed to determine the user's current psychological state.
[0525] Step 4:
[0526] The server adjusts the content of information notifications and task suggestions based on the analyzed emotion data. For example, if it detects that the user is in a high-stress state, it adjusts the settings to withhold low-priority notifications.
[0527] Step 5:
[0528] The terminal receives notifications from the server and displays information to the user. At the same time, it can receive voice commands from the user via speech recognition and send those commands to the server.
[0529] Step 6:
[0530] Users review notifications and suggestions displayed on their devices and adjust their schedules as needed. Emotion-based feedback is sent back to the server, improving the system's adaptability.
[0531] Step 7:
[0532] The terminal sends user feedback to the server, which then updates the emotion engine and recommendation algorithms based on that feedback. This continuous updating enables more personalized information delivery.
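The emotion-based adjustment of Step 4 can be sketched as a filter over pending notifications. The numeric stress level, the 0.7 threshold, the priority scale, and the wording of the relaxation suggestion are illustrative assumptions; the disclosure does not define how the emotion engine reports its result.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    text: str
    priority: int  # 1 = high, 2 = medium, 3 = low (assumed scale)


def filter_notifications(notifications, stress_level, stress_threshold=0.7):
    """Withhold non-high-priority notifications while the user is highly stressed."""
    if stress_level < stress_threshold:
        return list(notifications)
    kept = [n for n in notifications if n.priority == 1]
    # Offer a relaxation suggestion alongside the remaining notifications.
    kept.append(Notification("How about a short break?", priority=1))
    return kept


pending = [
    Notification("Board meeting at 10:00", priority=1),
    Notification("Newsletter available", priority=3),
]
for n in filter_notifications(pending, stress_level=0.9):
    print(n.text)
```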
[0533] (Example 2)
[0534] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0535] In modern society, information overload is a problem, and users may miss important information. Furthermore, conventional systems fail to provide information and suggestions that take into account the user's emotional state, resulting in an unoptimized user experience. Therefore, there is a need for technologies that provide personalized information management and suggestions that consider user emotions.
[0536] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0537] In this invention, the server includes means for acquiring data from multiple information devices, means for analyzing the acquired data using natural language processing and generating summaries, and means for analyzing the user's behavioral history and emotional data and making future work suggestions. This makes it possible to provide personalized information according to the user's emotional state, efficiently manage the most useful information for the user from among excessive information, and provide appropriate work suggestions.
[0538] "Information equipment" refers to all electronic devices used for generating, transmitting, receiving, and processing data, including computers, smartphones, and other communication devices.
[0539] "Data" refers to all types of information obtained through information devices, including text messages, audio data, emails, and image data.
[0540] "Natural language processing" is a technology that enables computers to understand, interpret, and process human language, and is used for text analysis and summarization.
[0541] A "summary" refers to a concise compilation of the main points of information extracted through natural language processing, enabling users to efficiently understand important information.
[0542] "Behavioral history" refers to a collection of data that records a user's past activities and is used to analyze the user's behavioral patterns and preferences.
[0543] "Emotional data" refers to data that indicates a user's emotional state, and includes information obtained from voice tone, facial expressions, emotional expressions in text, and other sources.
[0544] "Work suggestions" refer to recommended actions that are provided based on the user's current situation and emotional state, and include specific suggestions for performing daily activities more efficiently.
[0545] "Portable devices" refer to electronic devices that users can easily carry with them, and include smartphones and tablet devices.
[0546] "Feedback" refers to reactions and opinions obtained from users, and includes information collected and analyzed to improve system performance and optimize the user experience.
[0547] "Information processing equipment" refers to devices used to electronically process data and perform calculations and analyses, and includes servers and computer systems.
[0548] "Voice commands" refer to a method of giving instructions to a system using voice, including commands processed via speech recognition technology.
[0549] "Personalized notifications" refer to informational notifications that are customized based on the user's individual circumstances and preferences, and include means of providing a personalized experience.
[0550] In this invention, the server acquires data from multiple information devices and analyzes that data using natural language processing. The hardware used includes cloud servers and database servers, and the software utilizes open-source libraries and machine learning frameworks for natural language processing, specifically the Python library spaCy and the machine learning framework TensorFlow. This allows diverse information to be summarized and managed.
[0551] The terminal is responsible for presenting the user with summary data and suggestions received from the server. The terminal has a voice recognition function and communicates with the server to accurately process the user's voice instructions. Voice data and the user's facial expressions are also collected by the terminal, and this emotional data is sent back to the server.
[0552] Users make decisions based on information received from their devices. For example, if a user gives a voice command indicating stress, the server analyzes that emotional data and suggests specific actions to help them relax. Furthermore, the system updates its analysis results based on user feedback, enabling it to provide even more accurate and personalized services.
[0553] For example, a user can send a prompt to the system such as, "Summarize my emails and tell me if I'm stressed," and the server can perform the analysis and send the results to the terminal. In this way, the present invention integrates natural language processing and emotion recognition technology to realize information management and adaptive suggestions in the user's life.
[0554] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0555] Step 1:
[0556] The server periodically acquires data from multiple information devices, including mail servers, messaging applications, and other communication media. The input is raw data from each device, and the output is a centrally collected dataset. This data is stored in a database, serving as a foundation for subsequent analysis.
[0557] Step 2:
[0558] The server analyzes data stored in the database using a natural language processing engine. The input is the dataset collected in step 1, and the output is information in a summarized format highlighting important topics. Specifically, the spaCy library in Python is used to extract topics and perform keyword analysis, concisely summarizing information useful to the user.
[0559] Step 3:
[0560] The server collects voice and text data and analyzes the user's emotions using an emotion engine. Input is voice recordings and text content, and output is a dataset showing the user's emotional state. Machine learning models and emotion analysis algorithms are applied to this emotion analysis. This allows the server to determine if the user is experiencing stress and prepare appropriate responses.
[0561] Step 4:
[0562] The server generates informational notifications and work suggestions based on the data obtained in steps 2 and 3. The input consists of summary information and sentiment data, while the output is user-oriented suggestions and notifications. Specifically, it employs a personalized approach, such as suggesting short relaxation exercises when the user's stress level is high.
[0563] Step 5:
[0564] The terminal displays notifications and suggestions sent from the server to the user. Input is suggestion data from the server, and output is information received by the user visually or audibly. The terminal utilizes voice recognition capabilities, allowing the user to provide voice instructions in response to these notifications.
[0565] Step 6:
[0566] The user decides whether to accept the information and suggestions provided by the device and sends feedback to the device as an action. The input is a notification to the user, and the output is the user's action choice and subsequent feedback. This feedback is reflected in subsequent analyses and suggestions, contributing to improvements in the overall accuracy of the system and the user experience.
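Steps 3 and 4 of this flow, estimating the user's emotional state from text and choosing a work suggestion, are sketched below. A small word lexicon stands in for the machine learning models mentioned in the text; the word lists, weights, the 0.6 threshold, and the function names are all hypothetical.

```python
import re

# Assumed lexicons; a trained emotion model would replace these.
STRESS_TERMS = {"stressed": 1.0, "overwhelmed": 1.0, "tired": 0.6, "busy": 0.4}
CALM_TERMS = {"relaxed": 1.0, "great": 0.8, "fine": 0.5}


def stress_score(text):
    """Return a score in [0, 1]; 0.5 means no emotional cue was found."""
    words = re.findall(r"[a-z']+", text.lower())
    stress = sum(STRESS_TERMS.get(w, 0.0) for w in words)
    calm = sum(CALM_TERMS.get(w, 0.0) for w in words)
    total = stress + calm
    return 0.5 if total == 0 else stress / total


def work_suggestion(summary, user_text, threshold=0.6):
    """Combine the summary (Step 2) with the emotion estimate (Step 3)."""
    if stress_score(user_text) > threshold:
        return "Suggested: a short relaxation exercise before continuing."
    return "Suggested: review today's summary: " + summary


print(work_suggestion("3 emails need replies", "I'm so stressed and tired"))
```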
[0567] (Application Example 2)
[0568] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0569] In today's information society, users are often overwhelmed by the sheer volume of information and notifications, which frequently contributes to stress. Furthermore, the difficulty in providing flexible responses tailored to individual emotional states makes it challenging for users to live comfortably. Additionally, the lack of well-established methods for optimizing the environment based on user emotions is another significant issue.
[0570] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0571] In this invention, the server includes means for acquiring information from multiple communication devices, means for analyzing the acquired information using natural language processing and generating a summary, means for analyzing the user's behavioral tendencies and making future work suggestions, means for notifying the user's terminal of the information and suggestions, means for collecting user responses and updating the analysis results, and means for analyzing the user's emotions and adjusting the physical environment based on those emotions. This makes it possible to provide appropriate information and adjust the environment in accordance with the user's emotions, thereby providing a comfortable living environment.
[0572] A "communication device" is an electronic device that has the function of sending and receiving information, such as a smartphone or a computer.
[0573] "Natural language processing" is a technology that enables computers to understand, analyze, and generate language that humans use in everyday life.
[0574] A "summary" is a concise compilation of information, extracting the most important parts from the acquired data.
[0575] "Behavioral tendencies" refer to information that shows patterns in a user's past actions and choices, and are used to predict future behavior.
[0576] A "task proposal" is a suggestion made to a user outlining specific tasks or activities that should be undertaken in the future.
[0577] A "terminal" is a device that connects to a server and has the role of displaying information and accepting user input.
[0578] "Response" refers to the user's reaction or feedback to information or suggestions from the system.
[0579] "Emotions" refer to information that indicates a user's psychological or emotional state, and serve as a basis for the system to analyze and reflect in its actions.
[0580] "Environmental adjustment" refers to the act of changing settings to optimize the physical or digital environment in accordance with the user's emotions and state.
[0581] To implement this invention, a communication device, a server for information processing, and a terminal providing a user interface are required. The server connects to a communication device such as a smartphone or computer and periodically acquires information. The acquired information is analyzed by a natural language processing engine, and a summary is generated. Here, the Python library TextBlob handles the natural language processing. In addition, a predictive model based on past data is used to analyze user behavior trends and make suggestions for future tasks.
[0582] Voice and text input are used to analyze the user's emotions. The terminal is equipped with a microphone and speaker, and uses the SpeechRecognition library for speech recognition. Based on the user's emotions, the terminal notifies the user of the most suitable suggestions provided by the server. Furthermore, the terminal can adjust the physical environment based on the result of the emotion analysis, using LED lighting systems and music playback devices for this adjustment. The NaiveBayesAnalyzer from the TextBlob library, a sentiment analyzer, is used for the emotion analysis.
[0583] As a concrete example, if a user who feels unwell after waking up says, "I'm awake but still sleepy," the terminal analyzes their voice and suggests playing relaxing music and softening the lighting. An example of a prompt might be: "Design an assistant robot system that analyzes a user's emotions from their voice and optimizes their home environment. If the user indicates stress, play relaxing music and adjust the lighting to a warm color."
[0584] In this way, a system that provides a comfortable environment for users is realized.
[0585] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0586] Step 1:
[0587] The server retrieves information from communication devices such as smartphones and computers at regular intervals. This information includes user messages, schedules, and activity history. The retrieved data is stored in intermediate data storage on the server and prepared as input for the next processing step.
[0588] Step 2:
[0589] The server analyzes the acquired information using a natural language processing engine, extracts important topics, and generates a summary. Here, the TextBlob library is used to analyze the text data and compile the main information into a summary. This summary is the output of this step and becomes the input for the subsequent emotion analysis step.
[0590] Step 3:
[0591] The server activates the emotion engine using summary data and user voice input to analyze the user's emotional state. Voice input is received through the terminal and converted to text using the SpeechRecognition library. The obtained text data is then analyzed for emotion using TextBlob's NaiveBayesAnalyzer. The analysis results are output as the user's emotional state (e.g., positive, negative, neutral).
[0592] Step 4:
[0593] The device receives suggestions generated by the server based on the user's emotional state and notifies the user. These suggestions include specific actions such as playing music to help the user relax or adjusting the lighting. The device then operates its built-in speaker and intelligent lighting system to perform the suggested actions. User feedback is also received and sent to the server for analysis.
[0594] Step 5:
[0595] Based on the information and suggestions from the device, the user adjusts their environment for a more comfortable experience. When the user enters how they felt about the suggestions, the server processes this data to improve the accuracy of future analyses. This enables the server to continuously improve system performance and optimize the user experience.
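As a non-limiting illustration, the mapping from emotional state to environmental adjustment in Steps 3 and 4 can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the keyword lists, the function names `classify_emotion` and `plan_environment`, and the action strings are all hypothetical, and a simple keyword count stands in for the TextBlob NaiveBayesAnalyzer named above so that the sketch runs with the Python standard library alone.

```python
from __future__ import annotations

import re

# Hypothetical keyword lists; a stand-in for a trained sentiment classifier.
NEGATIVE_WORDS = {"sleepy", "tired", "stressed", "unwell", "anxious"}
POSITIVE_WORDS = {"great", "refreshed", "happy", "good", "energetic"}

# Hypothetical mapping from emotional state to environment adjustments.
ACTIONS = {
    "negative": ["play relaxing music", "set lighting to a soft warm color"],
    "positive": ["keep current settings"],
    "neutral": ["keep current settings"],
}


def classify_emotion(text: str) -> str:
    """Return 'positive', 'negative', or 'neutral' from keyword counts."""
    words = re.findall(r"[a-z']+", text.lower())
    negative = sum(word in NEGATIVE_WORDS for word in words)
    positive = sum(word in POSITIVE_WORDS for word in words)
    if negative > positive:
        return "negative"
    if positive > negative:
        return "positive"
    return "neutral"


def plan_environment(text: str) -> list[str]:
    """Map recognized utterance text to a list of suggested adjustments."""
    return ACTIONS[classify_emotion(text)]
```

In this sketch the text passed to `plan_environment` is assumed to be the output of the speech recognition step; replacing `classify_emotion` with a trained classifier would leave the action mapping unchanged.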
[0596] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0597] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0598] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0599] [Fourth Embodiment]
[0600] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0601] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0602] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0603] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0604] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0605] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0606] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0607] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
[0608] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0609] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0610] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0611] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0612] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0613] This invention is an AI-powered system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play different roles and work together to achieve its functions.
[0614] The server first obtains permission from the user to access the communication device and retrieve information such as emails and messages. In this information gathering process, the server uses the API of each communication device to extract the necessary data and periodically collects updated information.
[0615] Next, the server processes the acquired data through a natural language processing engine to analyze the text. It extracts important topics and keywords and summarizes the data based on them. For example, emails containing meeting schedules or important deadlines are summarized as high-priority information for the user.
[0616] The server then analyzes the user's behavioral history and schedule data to predict future tasks and events. Based on these predictions, the server proposes recommended actions to the user and summarizes the content. The server sends the generated information and suggestions to the device, allowing the user to receive notifications in real time.
[0617] The device allows users to receive notifications and access information through various interfaces via mobile devices and wearable devices. The device incorporates voice recognition capabilities, enabling it to accept user voice commands. This allows users to perform tasks such as checking schedules and adding new tasks via voice.
[0618] Users manage their daily tasks and lives based on the information and suggestions provided by the system. They can also provide feedback, which is sent to the server and used to update the system's analysis model and improve its accuracy.
[0619] This AI-powered information management system is a powerful tool for efficiently managing daily life and work, ensuring that important information is not overlooked amidst a vast amount of data. For example, to prevent users from missing an important meeting on Monday, the server can send a notification the night before and set a reminder in the morning, ensuring the user is properly prepared.
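The reminder timing in the example above can be illustrated with a short sketch. It is offered only as an assumption-laden illustration: the function name `reminder_times` and the default times of 21:00 and 07:30 are hypothetical and do not appear in the disclosure, and timezone-naive datetimes are assumed throughout.

```python
from __future__ import annotations

from datetime import datetime, time, timedelta


def reminder_times(
    meeting_start: datetime,
    evening: time = time(21, 0),   # hypothetical "night before" time
    morning: time = time(7, 30),   # hypothetical morning reminder time
) -> list[datetime]:
    """Return a night-before notification and a same-day morning reminder."""
    day = meeting_start.date()
    night_before = datetime.combine(day - timedelta(days=1), evening)
    same_morning = datetime.combine(day, morning)

    times = [night_before]
    # Skip the morning reminder if the meeting starts before it would fire.
    if same_morning < meeting_start:
        times.append(same_morning)
    return times
```

For a meeting starting on a Monday at 10:00, this sketch yields one notification on Sunday evening and one reminder on Monday morning.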
[0620] The following describes the processing flow.
[0621] Step 1:
[0622] The server retrieves email and message data via the communication device's API, based on the access permissions provided by the user. During this process, the server periodically updates the data and accumulates new information.
[0623] Step 2:
[0624] The server analyzes the acquired data by passing it through a natural language processing engine. This extracts important topics and keywords and summarizes the data. For example, it prioritizes extracting tasks with deadlines and meeting dates.
[0625] Step 3:
[0626] Based on the analyzed information, the server analyzes the user's behavioral history and calendar data, and uses a predictive model to suggest future tasks and events. This allows notifications to include things the user should prepare in advance.
[0627] Step 4:
[0628] The server prepares to push the generated important information and task suggestions to the user's terminal. This information is organized so that the user can access it immediately.
[0629] Step 5:
[0630] The device receives push notifications sent from the server and displays them to the user. The device can also accept voice commands from the user via its voice interface and send those commands to the server.
[0631] Step 6:
[0632] Users can check notifications displayed on their devices and adjust their actions and schedules as needed. They can also improve the accuracy of system recommendations by sending feedback to the server via their devices.
[0633] (Example 1)
[0634] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0635] Modern users are constantly exposed to vast amounts of information from numerous sources, making it difficult to efficiently extract and manage important information. In particular, prioritizing information generated in daily life and work, and managing future schedules, has become increasingly complex. In this context, there is a growing need for systems that provide appropriate information in real time and support future activities.
[0636] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0637] In this invention, the server includes means for acquiring information from a communication terminal, means for analyzing the acquired information using a processing device and generating a summary, and means for analyzing the user's history data and suggesting future tasks. This enables users to efficiently extract important information from diverse sources, predict important tasks in their daily lives and work, and act in a planned manner.
[0638] A "communication terminal" is an electronic device that has the function of receiving or transmitting information.
[0639] "Information" refers to a collection of data or knowledge that is acquired or provided, including emails and messages.
[0640] A "processing device" is a device that performs computational processing, such as analyzing data and summarizing information.
[0641] "Analysis" is the process of breaking down information into its individual elements and converting them into a form that is easy to understand.
[0642] A "summary" is a concise compilation of information, resulting from the extraction of key points.
[0643] A "user" is a person who operates or uses the system.
[0644] "History data" refers to records of activities and actions that a user has performed in the past.
[0645] Proposing a "task" means recommending future activities or actions to the user.
[0646] An "apparatus" is a machine or device that has a specific function.
[0647] "Evaluation" refers to a user's judgment or opinion regarding the information provided by the system.
[0648] An "analytical model" is a mathematical or statistical method used to understand data and make predictions.
[0649] A "portable terminal" is an electronic device with communication capabilities that can be carried around.
[0650] This invention is a system designed to improve the efficiency of information management and to support users' daily lives and work. In this system, the server, terminals, and users each play their respective roles in acquiring, analyzing, suggesting, and notifying information.
[0651] The server first acquires information from multiple communication terminals. This involves using standardized APIs to retrieve data from various email services and messaging applications. Using a natural language processing engine such as the Google Cloud Natural Language API, the server analyzes the acquired information, extracts important topics and keywords, and generates a summary.
[0652] The server also analyzes user history data and uses a generative AI model to suggest future tasks. Based on past behavioral patterns and schedules, it can recommend the next steps to take and create efficient plans.
[0653] Next, the generated information and suggestions are sent to the terminal in real time. The terminal functions as a mobile device or wearable device and accepts the user's voice commands through voice recognition. This allows users to check information and add new tasks hands-free.
[0654] Furthermore, users can provide feedback on the system's suggestions, which is then sent to the server. The server updates the analysis model based on the collected feedback, improving its accuracy.
[0655] As a concrete example, to ensure that users do not miss important emails each day, the server sends a reminder the night before, prompting them to check the emails the following morning, thus supporting thorough preparation. An example of a prompt for the generative AI model might be, "Please review this week's schedule and summarize important appointments. Also, please suggest the next necessary actions."
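One way the example prompt could be combined with schedule data before being passed to a generative AI model is sketched below. The `Appointment` type, the `build_prompt` function, and the layout of the schedule section are hypothetical; only the instruction sentence is taken from the example above, and the call to the model itself is deliberately omitted.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Appointment:
    """Hypothetical schedule entry obtained from a calendar source."""
    title: str
    start: datetime


# Instruction sentence quoted from the example prompt.
INSTRUCTION = (
    "Please review this week's schedule and summarize important "
    "appointments. Also, please suggest the next necessary actions."
)


def build_prompt(appointments: list[Appointment]) -> str:
    """Assemble the instruction and a chronologically sorted schedule."""
    ordered = sorted(appointments, key=lambda a: a.start)
    lines = [f"- {a.start:%Y-%m-%d %H:%M} {a.title}" for a in ordered]
    schedule = "\n".join(lines) if lines else "- (no appointments)"
    return f"{INSTRUCTION}\n\nSchedule:\n{schedule}"
```

Sorting the appointments before formatting keeps the prompt deterministic, which makes the model input easier to log and compare across runs.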
[0656] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0657] Step 1:
[0658] The server retrieves information from the communication terminal. The input is email and message data retrieved through API access that the user has permitted. The output is the retrieved raw data, which is used in the subsequent analysis steps.
[0659] Step 2:
[0660] The server processes the acquired information through a natural language processing engine to analyze the data. The input is the raw data acquired in step 1, which is then processed for keyword extraction and topic recognition. The output consists of summarized text information and a list of high-priority information. Specifically, meetings and deadlines are highlighted in the summary.
[0661] Step 3:
[0662] The server analyzes the user's historical data and suggests future tasks. This step involves inputting schedule data from the calendar API, as well as past behavioral data. A generative AI model analyzes this data and provides recommended tasks and appointments as output. Weekly reports and other similar reports are generated based on the configured routines.
[0663] Step 4:
[0664] The server sends the generated information and suggestions to the terminal. The input is the summary information and suggestions generated in steps 2 and 3, which are sent to the user's device in notification format. The output is a visual and audible alert displayed on the terminal. Pop-up notifications and reminders are displayed on the device.
[0665] Step 5:
[0666] The terminal receives voice commands from the user and sends them to the server. The input is the user's voice command, which is converted into text format by speech recognition. The output is instruction data sent to the server. This allows the user to operate the device hands-free.
[0667] Step 6:
[0668] Users provide feedback on the system's suggestions. This feedback is entered into the server and used to improve the accuracy of the analysis model. The output is the feedback data accumulated in the system, contributing to the continuous improvement of the model. Customization is performed to reflect user feedback.
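The summarization and prioritization of Step 2 can be illustrated with the following sketch. It is a simplified stand-in, not the natural language processing engine of the disclosure: the `Message` type, the cue-word pattern, the `[!]` marker, and the use of the first sentence as an extractive summary are all assumptions made for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical cue words for high-priority items (meetings and deadlines).
PRIORITY_PATTERN = re.compile(r"\b(meeting|deadline|due)\b", re.IGNORECASE)


@dataclass
class Message:
    """Hypothetical raw item retrieved in Step 1."""
    sender: str
    subject: str
    body: str


def summarize(messages: list[Message]) -> tuple[list[str], list[str]]:
    """Return (summary lines, subjects of high-priority messages)."""
    summary: list[str] = []
    high_priority: list[str] = []
    for message in messages:
        is_high = bool(PRIORITY_PATTERN.search(f"{message.subject} {message.body}"))
        # Crude extractive summary: keep only the first sentence of the body.
        first_sentence = re.split(r"(?<=[.!?])\s+", message.body.strip(), maxsplit=1)[0]
        marker = "[!]" if is_high else "[ ]"
        summary.append(f"{marker} {message.subject}: {first_sentence}")
        if is_high:
            high_priority.append(message.subject)
    return summary, high_priority
```

The two return values correspond to the two outputs named in Step 2, namely the summarized text and the list of high-priority information.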
[0669] (Application Example 1)
[0670] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0671] In modern urban life, efficiently and accurately providing users with important information from a vast amount of data is a major challenge. In particular, there is a need to streamline daily life by processing information on traffic conditions, weather, and public events in a timely manner and suggesting optimal actions for users. Furthermore, improving usability through the effective use of user voice interaction is essential.
[0672] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0673] In this invention, the server includes means for collecting information from multiple communication devices, means for analyzing the acquired information using natural language processing technology and constructing a summary, means for collecting traffic conditions, weather information, and public event data and making suggestions suitable for the user, and means for processing the user's voice instructions using speech recognition functionality. This enables the user to make optimal decisions without missing important information in their daily life.
[0674] A "communication device" is an online or offline device or equipment that sends and receives information.
[0675] "Means of information gathering" refer to the functions and processes for collecting necessary information from external data sources.
[0676] "Natural language processing technology" is a computational technique for converting human language into a format that computers can easily understand and then analyzing it.
[0677] "Methods for constructing summaries" refer to methods for compactly organizing acquired information and data, extracting important content, and presenting it in a concise form.
[0678] "Activity history" refers to a record of activities and actions a user has performed in the past.
[0679] "Means of suggesting activities" refers to a function that recommends beneficial actions for users based on analyzed data.
[0680] A "terminal device" refers to a device that a user can directly operate, such as a smartphone or tablet.
[0681] A "voice instruction processing function" refers to a program or device that recognizes voice input from a user and performs the necessary operations based on that input.
[0682] "Traffic conditions" refers to information about congestion and traffic conditions on roads and public transportation in a specific area.
[0683] "Weather information" refers to data about local weather conditions, such as temperature, probability of precipitation, and wind speed.
[0684] "Public event data" refers to information about events that are open to the general public.
[0685] This technology is a personal assistant system that efficiently supports urban life. The following describes an embodiment of the system.
[0686] The server collects information from multiple communication devices and utilizes online APIs to obtain data in real time. The hardware used includes cloud servers. For software, the Google Cloud Natural Language API is used for natural language processing. Through this API, it is possible to analyze the acquired information and generate summaries. Furthermore, traffic conditions, weather information, and public event data are continuously collected from relevant organizations and open data sources.
[0687] The server further analyzes the user's behavioral history data and uses an AI model to suggest future activities. For example, if there is a forecast for worsening weather the next day, it will notify the user to bring an umbrella.
[0688] Smartphones and wearable devices are used as terminal devices, and users receive information through them. Amazon Alexa Voice Service is used for voice recognition, processing user voice commands and performing necessary actions.
[0689] Users can interact with the system through their devices and provide feedback through voice and text instructions. This feedback helps update the AI model and improve its accuracy.
[0690] As a concrete example, if a user enters the prompt "What will the weather be like tomorrow?", the system retrieves the next day's weather information from the server and sends a summary, produced through natural language processing, to the user's smartphone as a notification. In this way, the user can efficiently plan their actions.
[0691] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0692] Step 1:
[0693] The server collects information from multiple communication devices. It uses APIs to obtain real-time traffic conditions, weather forecasts, and event information. The input consists of raw data provided by various data providers, and the output is an internal data structure that unifies the format of this data.
[0694] Step 2:
[0695] The server analyzes the collected data using natural language processing techniques and constructs a summary. It extracts key information and keywords from each data point via the Google Cloud Natural Language API. The input is the internal data structure generated in the previous step, and the output is summarized text data. Specifically, it extracts important topics and summarizes based on them.
[0696] Step 3:
[0697] The server uses an AI model to analyze user behavior history data. Using collected schedule information and past behavior patterns as input, it generates suggestions for future activities based on that information. The output consists of suggestions for actions that are beneficial to the user. Specific actions include everyday suggestions such as bringing an umbrella.
[0698] Step 4:
[0699] The server notifies the user's device of the generated information and suggestions. Push notifications are sent to smartphones and wearable devices. The input consists of summarized information and suggestions, and the output is a notification sent to the user's device.
[0700] Step 5:
[0701] Users provide feedback via voice or text through their terminal device. Voice commands are processed using the Amazon Alexa Voice Service. Input is the user's voice or text command, and output is the system performing an action tailored to that input.
[0702] Step 6:
[0703] The server updates the analysis model based on user feedback. The input is feedback data, and the output is an improved analysis model. Specifically, the feedback is added to the model's training data and used for the next run.
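Steps 3 and 6 above can be sketched as follows. The sketch assumes a unified internal record (`DailyContext`), fixed thresholds, and a simple list as the feedback store; none of these names or values come from the disclosure, and rule-based logic stands in for the AI model.

```python
from __future__ import annotations

from dataclasses import dataclass, field

RAIN_THRESHOLD = 0.5    # assumed probability above which an umbrella is suggested
DELAY_THRESHOLD = 15    # assumed delay in minutes above which leaving early is suggested


@dataclass
class DailyContext:
    """Hypothetical unified record built from the data collected in Step 1."""
    precipitation_probability: float  # 0.0 to 1.0
    traffic_delay_minutes: int


def suggest(context: DailyContext) -> list[str]:
    """Generate everyday suggestions from weather and traffic data."""
    suggestions: list[str] = []
    if context.precipitation_probability >= RAIN_THRESHOLD:
        suggestions.append("Bring an umbrella.")
    if context.traffic_delay_minutes >= DELAY_THRESHOLD:
        suggestions.append(f"Leave {context.traffic_delay_minutes} minutes earlier.")
    return suggestions


@dataclass
class FeedbackStore:
    """Accumulates (suggestion, accepted) pairs as future training data."""
    examples: list[tuple[str, bool]] = field(default_factory=list)

    def record(self, suggestion: str, accepted: bool) -> None:
        self.examples.append((suggestion, accepted))
```

Recording each suggestion together with the user's reaction mirrors the statement in Step 6 that feedback is added to the training data and used for the next run.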
[0704] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0705] This invention is a system that more effectively supports users' lives and work by incorporating an emotion engine that recognizes and utilizes user emotions, in addition to a technology that collects information from communication devices and summarizes the data using natural language processing. In this system, the server, terminal, and user collaborate to manage information and provide user assistance.
[0706] The server connects to communication devices authorized by the user and retrieves email and message data. This retrieval is performed periodically, and information across various platforms is centrally managed. The retrieved data is analyzed by a natural language processing engine to identify important topics, and then saved as a summary.
[0707] Next, using its emotion engine, one of the system's key features, the server analyzes the user's emotions from their voice and text. This emotion data is used to adjust information notifications and task suggestions. For example, if the emotion engine detects that the user is experiencing stress, the server can reduce the number of low-priority notifications or offer suggestions for relaxation.
[0708] The terminal's role is to present information and suggestions sent from the server to the user. The terminal is equipped with voice recognition capabilities, allowing the user to operate it using voice commands. In addition, the terminal senses the user's facial expressions and tone of voice and provides information to the emotion engine.
[0709] Users can manage their daily tasks based on information and suggestions provided by their devices. Furthermore, they receive emotionally responsive suggestions and feedback, leading to a high level of satisfaction with the system. Personalized support, provided through emotion engine analysis, enhances the user experience.
[0710] For example, if a user shows signs of fatigue during work, the emotion engine detects this emotion. Based on this emotion data, the server then sends a notification to the device suggesting short exercise or meditation apps that can help refresh the user, thereby supporting their work efficiency. In this way, the system integrates emotion recognition technology to achieve more effective information presentation and user support.
[0711] The following describes the processing flow.
[0712] Step 1:
[0713] The server uses the communication device's API to periodically retrieve email and message data within the user's permission. This process prioritizes the collection of new and unread messages.
[0714] Step 2:
[0715] The server feeds the acquired data into a natural language processing engine to analyze the information. As a result of the analysis, important keywords and topics are extracted, and a summary is generated based on them.
[0716] Step 3:
[0717] The server uses an emotion engine to recognize the user's emotional state based on their voice commands and text data. This emotional data is then analyzed to determine the user's current psychological state.
[0718] Step 4:
[0719] The server adjusts the content of information notifications and task suggestions based on the analyzed emotion data. For example, if it detects that a user is in a high-stress state, it will adjust the settings to refrain from sending low-priority notifications.
[0720] Step 5:
[0721] The terminal receives notifications from the server and displays information to the user. At the same time, it can receive voice commands from the user via speech recognition and send those commands to the server.
[0722] Step 6:
[0723] Users review notifications and suggestions displayed on their devices and adjust their schedules as needed. Emotion-based feedback is sent back to the server, improving the system's adaptability.
[0724] Step 7:
[0725] The device sends user feedback to the server, which then updates the emotion engine and recommendation algorithms based on that feedback. This continuous updating enables more personalized information delivery.
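The notification adjustment of Step 4 can be illustrated by the following sketch. The `Notification` type, the numeric priority scale, the emotion label `"stressed"`, and the wording of the relaxation suggestion are hypothetical and are used only to show one possible form of the adjustment.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Notification:
    text: str
    priority: int  # hypothetical scale: larger values are more important


def adjust_notifications(
    notifications: list[Notification],
    emotion: str,
    min_priority_when_stressed: int = 2,
) -> list[Notification]:
    """Withhold low-priority notifications while the user is stressed."""
    if emotion != "stressed":
        return list(notifications)
    kept = [n for n in notifications if n.priority >= min_priority_when_stressed]
    # Add a relaxation suggestion, as described for the high-stress case.
    kept.append(Notification("Suggestion: take a short break.", priority=1))
    return kept
```

In this sketch the withheld notifications are simply dropped from the returned list; an actual system might instead queue them for delivery once the estimated stress level falls.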
[0726] (Example 2)
[0727] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0728] In modern society, information overload is a problem, and users may miss important information. Furthermore, conventional systems fail to provide information and suggestions that take into account the user's emotional state, resulting in an unoptimized user experience. Therefore, there is a need for technologies that provide personalized information management and suggestions that consider user emotions.
[0729] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0730] In this invention, the server includes means for acquiring data from multiple information devices, means for analyzing the acquired data using natural language processing and generating summaries, and means for analyzing the user's behavioral history and emotional data and making future work suggestions. This makes it possible to provide personalized information according to the user's emotional state, efficiently manage the most useful information for the user from among excessive information, and provide appropriate work suggestions.
[0731] "Information equipment" refers to all electronic devices used for generating, transmitting, receiving, and processing data, including computers, smartphones, and other communication devices.
[0732] "Data" refers to all types of information obtained through information devices, including text messages, audio data, emails, and image data.
[0733] "Natural language processing" is a technology that enables computers to understand, interpret, and process human language, and is used for text analysis and summarization.
[0734] A "summary" refers to a concise compilation of the main points of information extracted through natural language processing, enabling users to efficiently understand important information.
[0735] "Behavioral history" refers to a collection of data that records a user's past activities and is used to analyze the user's behavioral patterns and preferences.
[0736] "Emotional data" refers to data that indicates a user's emotional state, and includes information obtained from voice tone, facial expressions, emotional expressions in text, and other sources.
[0737] "Work suggestions" refer to recommended actions that are provided based on the user's current situation and emotional state, and include specific suggestions for performing daily activities more efficiently.
[0738] "Portable devices" refer to electronic devices that users can easily carry with them, and include smartphones and tablet devices.
[0739] "Feedback" refers to reactions and opinions obtained from users, and includes information collected and analyzed to improve system performance and optimize the user experience.
[0740] An "information processing device" refers to a device used to electronically process data and perform calculations and analyses, and includes servers and computer systems.
[0741] "Voice commands" refer to a method of giving instructions to a system using voice, including commands processed via speech recognition technology.
[0742] "Personalized notifications" refer to informational notifications that are customized based on the user's individual circumstances and preferences, and include means of providing a personalized experience.
[0743] In this invention, the server acquires data from multiple information devices and analyzes that data using natural language processing. The hardware used includes cloud servers and database servers, and the software utilizes open-source libraries and machine learning frameworks for natural language processing, specifically the Python library spaCy and the machine learning framework TensorFlow. This allows diverse information to be summarized and managed.
[0744] The terminal is responsible for presenting the user with summary data and suggestions received from the server. The terminal has a voice recognition function and communicates with the server to accurately process the user's voice instructions. Voice data and the user's facial expressions are also collected by the terminal, and this emotional data is sent back to the server.
[0745] Users make decisions based on information received from their devices. For example, if a user gives a voice command indicating stress, the server analyzes that emotional data and suggests specific actions to help them relax. Furthermore, the system updates its analysis results based on user feedback, enabling it to provide even more accurate and personalized services.
[0746] For example, a user can send a prompt to the system such as, "Summarize my emails and tell me if I'm stressed," and the server can perform the analysis and send the results to the terminal. In this way, the present invention integrates natural language processing and emotion recognition technology to realize information management and adaptive suggestions in the user's life.
[0747] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0748] Step 1:
[0749] The server periodically acquires data from multiple information devices, including mail servers, messaging applications, and other communication media. The input is raw data from each device, and the output is a centrally collected dataset. This data is stored in a database, serving as a foundation for subsequent analysis.
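The central collection in Step 1 can be sketched as follows. This is a minimal illustration, assuming each information device can be wrapped as a callable that returns raw text items. The names `Record` and `collect` and the sample sources are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    source: str  # illustrative label such as "mail" or "chat"
    text: str    # raw message body


def collect(sources: Dict[str, Callable[[], Iterable[str]]]) -> List[Record]:
    """Pull raw items from every source and merge them into one dataset."""
    dataset: List[Record] = []
    for name, fetch in sources.items():
        for text in fetch():
            text = text.strip()
            if text:  # skip empty payloads
                dataset.append(Record(source=name, text=text))
    return dataset


# Hypothetical in-memory sources standing in for a mail server and a chat app.
sources = {
    "mail": lambda: ["Budget review moved to Friday.", ""],
    "chat": lambda: ["Can you send the slides today?"],
}
dataset = collect(sources)
```

In a real deployment the merged list would be written to the database mentioned above; the in-memory list here only keeps the sketch self-contained.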
[0750] Step 2:
[0751] The server analyzes data stored in the database using a natural language processing engine. The input is the dataset collected in step 1, and the output is information in a summarized format highlighting important topics. Specifically, the spaCy library in Python is used to extract topics and perform keyword analysis, concisely summarizing information useful to the user.
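The keyword analysis and summarization in Step 2 can be illustrated with a small frequency-based extractive summarizer. This sketch deliberately uses only the Python standard library as a stand-in for the spaCy pipeline named above; the stopword list, the function names, and the scoring rule are assumptions made for illustration, not the method of the disclosure.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "in", "on", "for", "at"}


def _tokens(text):
    """Lowercase word tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def keywords(text, top_n=3):
    """Return the most frequent non-stopword tokens."""
    return [word for word, _ in Counter(_tokens(text)).most_common(top_n)]


def summarize(text, max_sentences=1):
    """Keep the sentences whose tokens have the highest total frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(_tokens(text))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[t] for t in _tokens(sentences[i])),
    )
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)
```

Sentences that repeat the dominant topic words score highest, which approximates "highlighting important topics" without any trained model.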
[0752] Step 3:
[0753] The server collects voice and text data and analyzes the user's emotions using an emotion engine. Input is voice recordings and text content, and output is a dataset showing the user's emotional state. Machine learning models and emotion analysis algorithms are applied to this emotion analysis. This allows the server to determine if the user is experiencing stress and prepare appropriate responses.
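As a rough illustration of the stress determination in Step 3, the sketch below scores text against small cue-word lists. It is a lexicon-based stand-in for the machine learning models and emotion analysis algorithms mentioned above, and it handles text only. The cue words, the threshold, and the function names are hypothetical.

```python
import re

# Hypothetical cue words; a deployed system would use a trained model instead.
STRESS_CUES = {"overwhelmed", "stressed", "exhausted", "deadline", "anxious"}
CALM_CUES = {"relaxed", "fine", "calm", "rested"}


def stress_score(text):
    """Return a score in [-1, 1]; positive values suggest stress."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stress = sum(t in STRESS_CUES for t in tokens)
    calm = sum(t in CALM_CUES for t in tokens)
    total = stress + calm
    return 0.0 if total == 0 else (stress - calm) / total


def is_stressed(text, threshold=0.5):
    """Flag the user as stressed when the score reaches the threshold."""
    return stress_score(text) >= threshold
```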
[0754] Step 4:
[0755] The server generates informational notifications and work suggestions based on the data obtained in steps 2 and 3. The input consists of the summary information and emotional data, while the output is user-oriented suggestions and notifications. Specifically, it employs a personalized approach, such as suggesting short relaxation exercises when the user's stress level is high.
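Step 4 combines the summary with the emotional state to decide what to send. A minimal rule-based sketch follows, assuming the stress level is a number between 0 and 1; the `Notification` type, the `high` threshold, and the message text are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Notification:
    kind: str     # "summary" or "suggestion"
    message: str


def build_notifications(summary: str, stress_level: float, high: float = 0.7) -> List[Notification]:
    """Always send the summary; add a relaxation suggestion under high stress."""
    notes = [Notification("summary", summary)]
    if stress_level >= high:
        notes.append(Notification("suggestion", "Try a short relaxation exercise."))
    return notes
```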
[0756] Step 5:
[0757] The terminal displays notifications and suggestions sent from the server to the user. Input is suggestion data from the server, and output is information received by the user visually or audibly. The terminal utilizes voice recognition capabilities, allowing the user to provide voice instructions in response to these notifications.
[0758] Step 6:
[0759] The user decides whether to accept the information and suggestions provided by the terminal and sends feedback to the terminal as an action. The input is the notification presented to the user, and the output is the user's chosen action and subsequent feedback. This feedback is reflected in subsequent analyses and suggestions, contributing to improvements in the overall accuracy of the system and the user experience.
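One way the feedback in Step 6 could be reflected in later suggestions is to keep a running acceptance score per suggestion type. The sketch below uses an exponential moving average; this update rule, the class name `FeedbackTracker`, and the parameters `alpha` and `prior` are assumptions for illustration, since the disclosure does not specify how the analysis is updated.

```python
from collections import defaultdict


class FeedbackTracker:
    """Track how often each kind of suggestion is accepted."""

    def __init__(self, alpha=0.3, prior=0.5):
        self.alpha = alpha  # weight given to the newest feedback
        self.scores = defaultdict(lambda: prior)

    def record(self, suggestion_kind, accepted):
        """Blend the newest accept/reject signal into the running score."""
        old = self.scores[suggestion_kind]
        target = 1.0 if accepted else 0.0
        self.scores[suggestion_kind] = (1 - self.alpha) * old + self.alpha * target

    def ranked(self):
        """Suggestion kinds ordered from most to least accepted."""
        return sorted(self.scores, key=self.scores.get, reverse=True)
```

Kinds of suggestions that the user keeps rejecting drift toward the bottom of `ranked()`, so the server can favor the ones that are actually accepted.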
[0760] (Application Example 2)
[0761] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0762] In today's information society, users are often overwhelmed by the sheer volume of information and notifications, which frequently contributes to stress. Furthermore, the difficulty in providing flexible responses tailored to individual emotional states makes it challenging for users to live comfortably. Additionally, the lack of well-established methods for optimizing the environment based on user emotions is another significant issue.
[0763] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0764] In this invention, the server includes means for acquiring information from multiple communication devices, means for analyzing the acquired information using natural language processing and generating a summary, means for analyzing the user's behavioral tendencies and making future work suggestions, means for notifying the user's terminal of the information and suggestions, means for collecting user responses and updating the analysis results, and means for analyzing the user's emotions and adjusting the physical environment based on those emotions. This makes it possible to provide appropriate information and adjust the environment in accordance with the user's emotions, thereby providing a comfortable living environment.
[0765] A "communication device" is an electronic device that has the function of sending and receiving information, such as a smartphone or a computer.
[0766] "Natural language processing" is a technology that enables computers to understand, analyze, and generate language that humans use in everyday life.
[0767] A "summary" is a concise compilation of information, extracting the most important parts from the acquired data.
[0768] "Behavioral tendencies" refer to information that shows patterns in a user's past actions and choices, and are used to predict future behavior.
[0769] A "work suggestion" is a suggestion made to a user outlining specific tasks or activities that should be undertaken in the future.
[0770] A "terminal" is a device that connects to a server and has the role of displaying information and accepting user input.
[0771] "Response" refers to the user's reaction or feedback to information or suggestions from the system.
[0772] "Emotions" refer to information that indicates a user's psychological or emotional state, and serve as a basis for the system to analyze and reflect in its actions.
[0773] "Environmental adjustment" refers to the act of changing settings to optimize the physical or digital environment in accordance with the user's emotions and state.
[0774] To implement this invention, a communication device, a server for information processing, and a terminal providing a user interface are required. The server connects to communication devices such as smartphones and computers and periodically acquires information. The acquired information is analyzed by a natural language processing engine and a summary is generated. Here, the Python library TextBlob is responsible for natural language processing. In addition, a predictive model based on past data is used to analyze user behavior trends and make suggestions for future tasks.
[0775] Voice and text input are used to analyze the user's emotions. The terminal is equipped with a microphone and speaker, and uses the SpeechRecognition library for speech recognition. Based on the user's emotions, the terminal notifies the user of the most suitable suggestions provided by the server. Furthermore, the terminal can adjust the physical environment based on the analysis of the user's emotions, using LED lighting systems and music playback devices for this adjustment. The NaiveBayesAnalyzer from the TextBlob library is used for sentiment analysis.
[0776] As a concrete example, if a user who feels unwell after waking up says, "I'm awake but still sleepy," the terminal will analyze their voice and suggest playing relaxing music and adjusting the lighting to a softer glow. An example of a prompt might be: "Design an assistant robot system that analyzes a user's emotions from their voice and optimizes their home environment. If the user indicates stress, play relaxing music and adjust the lighting to a warm color."
[0777] In this way, a system that provides a comfortable environment for users is realized.
[0778] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0779] Step 1:
[0780] The server retrieves information from communication devices such as smartphones and computers at regular intervals. This information includes user messages, schedules, and activity history. The retrieved data is stored in intermediate data storage on the server and prepared as input for the next processing step.
[0781] Step 2:
[0782] The server analyzes the acquired information using a natural language processing engine, extracts important topics, and generates a summary. Here, the TextBlob library is used to analyze the text data and compile the main information into a summary. This summarized information becomes the input for the next step, the emotion analysis.
[0783] Step 3:
[0784] The server activates the emotion engine using summary data and user voice input to analyze the user's emotional state. Voice input is received through the terminal and converted to text using the SpeechRecognition library. The obtained text data is then analyzed for emotion using TextBlob's NaiveBayesAnalyzer. The analysis results are output as the user's emotional state (e.g., positive, negative, neutral).
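TextBlob's NaiveBayesAnalyzer reports a two-class result with positive and negative probabilities, so producing the three labels listed in Step 3 needs an additional rule. The sketch below shows one possible rule: treat probabilities close to 0.5 as neutral. The `margin` parameter and the function name are assumptions; the disclosure does not state how the neutral label is derived.

```python
def label_emotion(p_pos: float, margin: float = 0.1) -> str:
    """Map a positive-class probability to positive / negative / neutral."""
    if not 0.0 <= p_pos <= 1.0:
        raise ValueError("p_pos must be a probability in [0, 1]")
    if p_pos >= 0.5 + margin:
        return "positive"
    if p_pos <= 0.5 - margin:
        return "negative"
    return "neutral"  # too close to call
```

The function takes the probability as a plain number, so it can be exercised without installing TextBlob or its training corpora.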
[0785] Step 4:
[0786] The terminal receives suggestions generated by the server based on the user's emotional state and notifies the user. These suggestions include specific actions such as playing music to help the user relax or adjusting the lighting. The terminal then operates its built-in speaker and intelligent lighting system to perform the suggested actions. User feedback is also received and sent to the server for analysis.
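The environment adjustment in Step 4 amounts to a lookup from emotional state to device settings. A minimal sketch, assuming the three emotion labels from Step 3 follows; the `EnvironmentAction` type and the table entries are hypothetical, and the hardware control calls are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentAction:
    lighting: str
    music: str


# Hypothetical emotion-to-action table; real settings would be device-specific.
ACTIONS = {
    "negative": EnvironmentAction(lighting="warm, dimmed", music="relaxing playlist"),
    "neutral": EnvironmentAction(lighting="unchanged", music="none"),
    "positive": EnvironmentAction(lighting="unchanged", music="none"),
}


def plan_adjustment(emotion: str) -> EnvironmentAction:
    """Choose lighting and music settings; unknown labels fall back to neutral."""
    return ACTIONS.get(emotion, ACTIONS["neutral"])
```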
[0787] Step 5:
[0788] Users can adjust their environment based on information and suggestions from the terminal, allowing them to have a more comfortable experience. When users input how they felt about the suggestions, the server processes this data to improve the accuracy of future analyses. This enables the server to continuously improve system performance and optimize the user experience.
[0789] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0790] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0791] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0792] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0793] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0794] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0795] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion lies, the more visible (expressed in actions) it becomes.
[0796] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
[0797] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0798] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
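The last stage of this determination, going from per-emotion values to a single emotion, can be sketched as follows. The sketch assumes the neural network's output is available as a mapping from emotion name to value; the function names, the `tolerance` parameter, and the sample values are hypothetical, and the network itself is not shown.

```python
from typing import Dict, List


def dominant_emotion(values: Dict[str, float]) -> str:
    """Pick the emotion with the highest value."""
    if not values:
        raise ValueError("no emotion values supplied")
    return max(values, key=values.get)


def co_occurring(values: Dict[str, float], tolerance: float = 0.1) -> List[str]:
    """List emotions whose values lie within `tolerance` of the top value."""
    top = values[dominant_emotion(values)]
    return sorted(name for name, v in values.items() if top - v <= tolerance)


# Hypothetical output for one user input; neighboring emotions score similarly.
sample = {"reassured": 0.82, "calm": 0.80, "confident": 0.78, "anxious": 0.10}
```

Because the network is trained so that emotions located close together on the map receive similar values, the `co_occurring` helper tends to return a cluster of neighboring emotions rather than a single isolated label.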
[0799] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0800] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.
[0801] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.
[0802] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0803] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0804] The following types of processors can be used as hardware resources to perform specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.
[0805] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0806] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0807] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0808] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as this does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0809] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0810] The following is further disclosed regarding the embodiments described above.
[0811] (Claim 1)
[0812] A means of acquiring information from multiple communication devices,
[0813] A means for analyzing acquired information using natural language processing and generating a summary,
[0814] A means of analyzing user behavior history and suggesting future tasks,
[0815] A means of notifying the user's terminal of information and suggestions,
[0816] A means of collecting user feedback and updating analysis results,
[0817] A system comprising the above means.
[0818] (Claim 2)
[0819] The system according to claim 1, further comprising means for receiving a user's voice command and transmitting it to a server.
[0820] (Claim 3)
[0821] The system according to claim 1, further comprising means for providing real-time notifications in conjunction with a wearable device.
[0822] "Example 1"
[0823] (Claim 1)
[0824] Means of obtaining information from a communication terminal,
[0825] A means for analyzing acquired information using a processing device and generating a summary,
[0826] A means of analyzing user history data and proposing future tasks,
[0827] A means for notifying the user's device of the generated information and suggestions,
[0828] A means of collecting user feedback and updating the analysis model,
[0829] A system comprising the above means.
[0830] (Claim 2)
[0831] The system according to claim 1, further comprising means for receiving a user's voice input and transmitting it to an information processing device.
[0832] (Claim 3)
[0833] The system according to claim 1, further comprising means for providing notifications as needed in conjunction with a portable terminal.
[0834] "Application Example 1"
[0835] (Claim 1)
[0836] A means of collecting information from multiple communication devices,
[0837] A means for analyzing acquired information using natural language processing technology and constructing a summary,
[0838] A means of analyzing user behavior history and suggesting future activities,
[0839] Means for notifying the user's terminal device of information and suggestions,
[0840] A means of collecting user feedback and updating analysis results,
[0841] A means of collecting traffic conditions, weather information, and public event data, and providing suggestions tailored to the user,
[0842] A means of processing user voice commands using voice recognition technology,
[0843] A system comprising the above means.
[0844] (Claim 2)
[0845] The system according to claim 1, further comprising means for transmitting voice recognition-based instructions to a user's terminal device.
[0846] (Claim 3)
[0847] The system according to claim 1, further comprising means for performing notifications in real time in conjunction with a wearable device.
[0848] "Example 2 of combining an emotion engine"
[0849] (Claim 1)
[0850] A means of acquiring data from multiple information devices,
[0851] A means for analyzing acquired data using natural language processing and generating a summary,
[0852] A means of analyzing user behavior history and sentiment data to make future work suggestions,
[0853] Means for notifying the user of information and suggestions on their mobile device,
[0854] A means for collecting user feedback and emotional states, and updating the analysis results,
[0855] A system comprising the above means.
[0856] (Claim 2)
[0857] The system according to claim 1, further comprising means for receiving a user's voice command and transmitting it to an information processing device, and means for acquiring and analyzing emotion data from voice and facial expressions.
[0858] (Claim 3)
[0859] The system according to claim 1, further comprising means for providing personalized notifications in real time in conjunction with a portable device.
[0860] "Application example 2 of combining emotional engines"
[0861] (Claim 1)
[0862] A means of acquiring information from multiple communication devices,
[0863] A means for analyzing acquired information using natural language processing and generating a summary,
[0864] A means of analyzing user behavior trends and making future work suggestions,
[0865] A means of notifying the user's terminal of information and suggestions,
[0866] A means of collecting user feedback and updating analysis results,
[0867] A means of analyzing the user's emotions and adjusting the physical environment based on those emotions,
[0868] A system comprising the above means.
[0869] (Claim 2)
[0870] The system according to claim 1, further comprising means for receiving a user's voice command and transmitting it to a server.
[0871] (Claim 3)
[0872] The system according to claim 1, further comprising means for providing real-time notifications in conjunction with a wearable device.
[Explanation of Symbols]
[0873]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A system comprising: means for acquiring information from multiple communication devices; means for analyzing the acquired information using natural language processing and generating a summary; means for analyzing the user's behavior history and suggesting future tasks; means for notifying the user's terminal of the information and suggestions; and means for collecting user feedback and updating the analysis results.
2. The system according to claim 1, further comprising means for receiving a user's voice command and transmitting it to a server.
3. The system according to claim 1, further comprising means for providing real-time notifications in cooperation with a wearable device.