system

A system with a communication device, analysis device, generation device, and learning device addresses the challenge of monitoring elderly health and isolation by providing personalized interaction and timely interventions, ensuring a safer living environment.

JP2026101212A | Pending | Publication Date: 2026-06-22 | SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-10
Publication Date
2026-06-22

AI Technical Summary

Technical Problem

Existing technologies struggle to continuously and individually monitor the isolation situation and health risks of elderly individuals, particularly in an aging society, and often fail to promptly notify medical institutions or family members of abnormalities, leading to potential social isolation and health issues.

Method used

A system comprising a communication device for regular interaction, an analysis device for cognitive and emotional assessment, a generation device for personalized training, and a learning device for improved evaluation accuracy, which automatically connects with elderly individuals, analyzes their health and emotional states, generates tailored training, and notifies relevant parties of anomalies.

Benefits of technology

The system effectively monitors the health status of elderly individuals, prevents social isolation, and provides timely interventions by offering personalized support and early notification of health risks, thereby enhancing their safety and quality of life.

✦ Generated by Eureka AI based on patent content.

Figure 2026101212000001_ABST

Abstract

[Problem] To provide a system. [Solution] The system includes: a means by which a communication unit automatically establishes a communication connection to a task target based on a pre-set schedule; a means that converts the content of a dialogue with the task target into information data and analyzes cognitive function and emotional state using an analysis unit; a means for generating individually optimized ability training using a generation unit based on the results of the analysis and presenting it to the task target; a means of notifying medical institutions and relevant parties when an abnormality is detected; and a means of presenting the optimized training on a mobile device used by the task target.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, and includes steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] With the increase in the number of elderly people living alone in an aging society, the occurrence of solitary death and dementia has become a serious social problem. In the conventional technology, it has been difficult to continuously and individually monitor the isolation situation and health risks of the elderly. Furthermore, since the mechanism for promptly notifying medical institutions and family members even when an abnormality is detected was insufficient, situations where early intervention could not be carried out often occurred. The present invention aims to solve these problems and provide a system that enables the elderly to live a safe and fulfilling life.

Means for Solving the Problems

[0005] This invention enables regular interaction by having a communication device automatically establish a communication connection with a target person based on a pre-set schedule. Furthermore, it precisely monitors the target person's health by converting conversation content into linguistic data and analyzing cognitive function and emotional state using an analysis device. It also aims to maintain and improve cognitive function by generating individually optimized intelligence training based on the analysis using a generation device and presenting it to the target person. In the event of detection of an abnormality, it provides a means to quickly notify medical institutions and relevant parties, thereby facilitating necessary intervention. By improving evaluation accuracy using a learning device based on accumulated conversation data and analysis results, it is possible to prevent social isolation among the elderly and provide a safe living environment.

[0006] A "communication device" is a device that automatically establishes a communication connection with a target person based on a pre-set schedule.

[0007] An "analysis device" is a device that converts conversation content into linguistic data during a conversation with a subject, and analyzes their cognitive function and emotional state.

[0008] A "generation device" is a device that generates individually optimized intelligence training based on analysis results and presents it to the target individual.

[0009] "Means for detecting abnormalities" refers to methods and means for detecting abnormalities related to the health status of a subject from the analysis results and notifying medical institutions and relevant parties.

[0010] A "learning device" is a device that updates a model based on accumulated conversation data and analysis results in order to improve the evaluation accuracy in the next conversation.

[0011] An "intermediate device" is a device designed to facilitate daily interaction between participants and between supervisors.

Brief Description of the Drawings

[0012]

[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.

[Figure 2] This is a conceptual diagram showing an example of the essential functions of a data processing device and a smart device according to the first embodiment.

[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.

[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.

[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.

[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.

[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.

[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.

[Figure 9] This shows an emotion map where multiple emotions are mapped.

[Figure 10] This shows an emotion map where multiple emotions are mapped.

[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Example 1.

[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.

[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Example 2, which incorporates an emotion engine.

[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2, which combines an emotion engine.

Modes for Carrying Out the Invention

[0013] Hereinafter, an example of an embodiment of the system relating to the technology of this disclosure will be described with reference to the attached drawings.

[0014] First, the terms used in the following description will be explained.

[0015] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).

[0016] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.

[0017] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0018] In the following embodiments, a communication I / F (Interface) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication I / F controls communication between multiple computers. Examples of communication standards applicable to the communication I / F include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0019] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."

[0020] [First Embodiment]

[0021] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0022] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0023] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0024] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0025] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0026] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0027] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0028] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0029] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0030] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0031] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0032] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0033] The present invention aims to efficiently monitor the health status of the elderly and prevent social isolation through a system combining a communication device, an analysis device, a generation device, and necessary means. Embodiments of this system are described below.

[0034] The server first uses a communication device to periodically connect with the elderly target person according to a pre-set schedule. This communication is conducted via voice or video call. Once the terminal connects to the target person, it initiates a conversation based on everyday topics and collects information about the target person's health status.

[0035] During a conversation, the audio data acquired by the terminal is continuously converted into language data by the server via an analysis device. This analysis device uses natural language processing technology to analyze the cognitive function and emotional state of the target person from their responses. For example, if the target person's statements show signs of emotional instability, the server reflects this in the analysis results.

[0036] Next, based on the analysis results, the server generates individually optimized intelligence training and brain training content through a generation device. This generated content is presented to the subject via a terminal, making it possible to implement optimal training tailored to the subject's current health condition.

[0037] Furthermore, if an anomaly is detected, the server automatically notifies healthcare facilities and relevant parties. This encourages early response and reduces health risks. For example, if signs of depression are detected during a conversation, the server will issue an emergency alert and coordinate with healthcare facilities to enable early intervention.

[0038] The collected data is securely stored on a server, and a learning device is used for subsequent analyses and model accuracy improvements. This enables more accurate evaluations and allows for the provision of higher-quality support to the target individuals.

[0039] In this way, this system aims to reduce the risk of lonely deaths and health problems by continuously monitoring the health status of elderly people and providing appropriate support.

[0040] The following describes the processing flow.

[0041] Step 1:

[0042] The server checks the communication schedule. The server retrieves the date and time of the next communication connection from each participant's profile stored in the database. This prepares the server for communication to take place according to the predetermined schedule.
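The schedule check in Step 1 can be illustrated with a minimal sketch. It assumes each profile carries a `next_call` timestamp; the `Profile` fields and the `due_profiles` helper are hypothetical names introduced here for illustration and do not appear in the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Profile:
    # Hypothetical profile record; the disclosure only says that the next
    # connection date and time is stored per participant.
    user_id: str
    contact: str
    next_call: datetime


def due_profiles(profiles, now):
    """Return the profiles whose scheduled call time has arrived, earliest first."""
    due = [p for p in profiles if p.next_call <= now]
    return sorted(due, key=lambda p: p.next_call)
```

A server loop could call `due_profiles` at a fixed interval and hand each returned profile to the connection step.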

[0043] Step 2:

[0044] The server automatically initiates a communication connection. At the specified time, the server initiates a voice or video call to the target person's phone number or IP address. Using VoIP technology, the connection is established over the internet in the case of a voice call.

[0045] Step 3:

[0046] The device initiates a conversation with the target person. Once the target person accepts the communication, the device makes an opening greeting and starts a conversation based on everyday topics. Questions are asked based on a pre-prepared script to facilitate a natural conversation.

[0047] Step 4:

[0048] The user responds to questions from the AI agent. The user, as the subject, continues the conversation about everyday events and their own health in response to questions from the device. The device receives these responses in real time.

[0049] Step 5:

[0050] The server analyzes the response. The voice data sent from the terminal is converted into text data by the server using speech recognition technology. Then, cognitive function and emotional state are analyzed through natural language processing.
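As a rough illustration of the text-analysis half of Step 5, the sketch below scans an already transcribed response for a few negative cue words. This is a toy stand-in, not the analysis method of the disclosure: the cue list, the threshold of two cues, and the function name `analyze_transcript` are all assumptions, and speech recognition is taken to have happened upstream.

```python
import re

# Hypothetical cue words chosen for illustration only; not a clinical instrument.
NEGATIVE_CUES = {"tired", "lonely", "sad", "trouble", "worried"}


def analyze_transcript(text):
    """Count words and collect the negative cue words present in a transcript."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = sorted({w for w in words if w in NEGATIVE_CUES})
    return {
        "word_count": len(words),
        "negative_cues": hits,
        # Assumed rule: two or more distinct cues mark the response for review.
        "flag": len(hits) >= 2,
    }
```

A production system would presumably replace the keyword match with a trained language model, as the description suggests.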

[0051] Step 6:

[0052] The server generates training content based on the analysis results. Using the generation device, it automatically generates brain training and intelligence training tailored to the subject's condition and determines what to provide to the user via the terminal.

[0053] Step 7:

[0054] The device provides training to the user. It presents the generated training to the target user and encourages them to complete it. For example, simple memory tests or quiz-style training may be conducted.
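The simple memory test mentioned in Step 7 could be scored as in the following sketch. It assumes a word-recall format in which the terminal presents a list of words and the user repeats them back; `score_recall` is a hypothetical helper, and the disclosure does not specify a scoring rule.

```python
def score_recall(presented, recalled):
    """Return the fraction of presented words that the user recalled.

    Matching ignores case and surrounding whitespace; extra words the user
    adds are not penalized in this simplified rule.
    """
    target = {w.strip().lower() for w in presented}
    answered = {w.strip().lower() for w in recalled}
    if not target:
        return 0.0
    return len(target & answered) / len(target)
```

For example, recalling "Apple" and "river" from the list apple, train, river, clock gives a score of 0.5.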

[0055] Step 8:

[0056] The server monitors for anomalies and sends notifications. If an anomaly is detected during the analysis process, the server immediately initiates communication to notify medical institutions and relevant parties. This communication may utilize email or SMS.
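For the email path of Step 8, a notification message might be assembled as sketched below with Python's standard `email.message.EmailMessage`. The sender address, subject format, and the `build_alert` name are assumptions for illustration; actually sending the message (for example via an SMTP server) and the SMS path are left out.

```python
from email.message import EmailMessage


def build_alert(user_id, finding, recipients):
    """Build (but do not send) an alert email describing a detected anomaly."""
    msg = EmailMessage()
    msg["Subject"] = f"[Alert] Possible anomaly for user {user_id}"
    msg["From"] = "monitor@example.com"  # hypothetical sender address
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"Finding: {finding}\n"
        "Please review the latest conversation record and follow up as needed.\n"
    )
    return msg
```

Keeping message construction separate from delivery makes it straightforward to swap in a different channel for each recipient.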

[0057] Step 9:

[0058] The server stores data and updates the learning device. By storing conversation content and analysis results in a database and continuously updating the AI model using the learning device, the accuracy of subsequent conversations and analyses is improved.
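The storage half of Step 9 can be sketched with an in-memory SQLite database. The table layout and function names are hypothetical, and the per-user average returned by `baseline` is only a simple stand-in for whatever model update the learning device actually performs.

```python
import sqlite3


def open_store():
    """Open an in-memory database with a single table of session records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions (user_id TEXT, transcript TEXT, score REAL)"
    )
    return conn


def save_session(conn, user_id, transcript, score):
    """Store one conversation transcript together with its analysis score."""
    conn.execute(
        "INSERT INTO sessions VALUES (?, ?, ?)", (user_id, transcript, score)
    )
    conn.commit()


def baseline(conn, user_id):
    """Return the user's average score so far, or None if nothing is stored."""
    row = conn.execute(
        "SELECT AVG(score) FROM sessions WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]
```

The accumulated records could then serve as the training data that later analyses are compared against.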

[0059] (Example 1)

[0060] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0061] In modern society, monitoring the health status of the elderly and preventing social isolation are crucial issues. However, conventional technologies have made it difficult to grasp detailed health conditions in real time or to provide personalized support. Therefore, there is a need for early detection of health risks among the elderly and methods for providing appropriate support.

[0062] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0063] In this invention, the server includes means for a communication device to automatically establish a communication connection with a target person based on a pre-set plan; means for using an analysis device that converts voice data into language data and evaluates cognitive function and emotional state; means for generating individually optimized cognitive training using a generation device and presenting it to the target person; means for notifying medical institutions and relevant parties when an abnormality is detected; and means for accumulating collected data and using a learning device to improve the accuracy of the next analysis and evaluation model. This enables real-time monitoring of the health status of elderly people and the provision of appropriate support tailored to their individual conditions.

[0064] "Communication equipment" refers to a device that connects with a target person via voice or video communication and exchanges information with them.

[0065] "Plan" refers to a schedule or procedure set in advance for communication equipment to connect with a target person.

[0066] An "analysis device" is a device that converts audio data into linguistic data and uses it to evaluate cognitive function and emotional state.

[0067] A "generating device" is a device that generates and presents cognitive training optimized for the target individual based on analysis results.

[0068] "Abnormal" refers to dangerous signs that deviate from the norm based on the subject's health condition or conversation content, and indicates a state that requires early intervention.

[0069] A "learning device" is a device that uses collected data to improve the accuracy of subsequent analyses and evaluation models.

[0070] A "model" refers to the algorithms and structures used in data analysis and the generation of cognitive training.

[0071] This invention provides a system in which a server, terminal, and user work together to monitor the health status of elderly people and prevent social isolation.

[0072] The server uses communication equipment to automatically connect with the elderly user's device according to a pre-configured plan. Typically, a device with calling capabilities is used. This communication is conducted by processing voice data in real time, enabling smooth interaction with the user. Examples of prompts include, "What happened today?" and "Have you been feeling tired lately?" These questions initiate a natural conversation with the user.

[0073] The device engages in conversation with the user based on everyday topics and transmits the user's voice input to the server. The server converts the voice data into language data via analysis equipment and then uses a generative AI model to evaluate cognitive function and emotional state. This process utilizes natural language processing (NLP) technology. For example, from a user's statement, "I often have trouble sleeping," the system looks for signs of cognitive decline and emotional instability.

[0074] Based on the analysis results, the server uses a generation device to design individually optimized cognitive training. This training is tailored to the user's health condition and abilities and can be presented via a terminal. A specific example is a quiz-style training aimed at improving memory.

[0075] If an anomaly is detected, the server quickly notifies medical institutions and relevant parties, facilitating early response. This helps mitigate health risks. The collected data is stored in learning devices, and machine learning techniques are used to improve the accuracy of subsequent analyses and evaluation models. This supports long-term health management and forms the basis for further technological improvements.

[0076] The implementation of this system will be an essential means of monitoring the health of the elderly in their daily lives and ensuring they can live with peace of mind.

[0077] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0078] Step 1:

[0079] The server automatically connects to the target person's terminal via communication equipment, based on a pre-configured plan.

[0080] The inputs used are schedule information and contact data for the person to connect, and the output is the establishment of a voice or video call connection. Specifically, the server sends the prompt message "What happened today?" to the person and prepares to start the conversation.

[0081] Step 2:

[0082] The device conducts a conversation with the user and collects audio data.

[0083] The input consists of the user's responses to prompts sent from the server. The output is collected audio data. Specifically, the user speaks about their daily life, and the device records what they say.

[0084] Step 3:

[0085] The terminal sends the collected audio data to the server, which then uses analysis equipment to convert the audio data into language data.

[0086] The collected audio data is used as input, and text-based language data is generated as output. During this process, the server utilizes natural language processing technology. The server analyzes the user's utterance, "I've been having trouble sleeping a lot lately," and evaluates their emotional state.

[0087] Step 4:

[0088] The server uses the analyzed language data to generate personalized cognitive training using a generative AI model.

[0089] Language data and past health data are used as input, and an optimized training program is output. Specifically, a quiz-style training program for improving memory is designed and prepared to be provided to the user.
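One way to picture the quiz design in Step 4 is a word-recall quiz whose length adapts to the user's earlier performance. The sketch below is an assumption-laden illustration: the word pool, the score thresholds, and the `make_recall_quiz` name are hypothetical, and a fixed rule is used here in place of the generative AI model named in the description.

```python
import random

# Hypothetical word pool for a recall quiz.
WORD_POOL = ["apple", "train", "river", "clock", "garden", "letter", "bridge", "candle"]


def make_recall_quiz(prior_score, seed=None):
    """Pick words for a recall quiz, with more words for higher prior scores."""
    if prior_score < 0.5:
        n = 3
    elif prior_score < 0.8:
        n = 5
    else:
        n = 7
    rng = random.Random(seed)  # seedable so a quiz can be reproduced
    return rng.sample(WORD_POOL, n)
```

Using `random.sample` guarantees that no word is repeated within a quiz.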

[0090] Step 5:

[0091] The server sends the generated training content to the terminal, and the terminal presents the training to the user.

[0092] The input is a generated training program, and the output is the user receiving training instructions. Specifically, the device instructs the user to "start the next quiz" and then begins the quiz.

[0093] Step 6:

[0094] The user performs the training, and the device sends the results to the server.

[0095] The input consists of user training data and results, and the output is training result data sent to the server. Specifically, the user answers a quiz, and the device records the answers.

[0096] Step 7:

[0097] The server stores the training results that have been sent and, if necessary, notifies medical institutions and relevant parties if it detects any anomalies.

[0098] The input is training result data, and the output may generate anomaly notification messages. The server automatically sends an alert when an anomaly is detected, prompting intervention.
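Since an abnormality is described as a deviation from the norm for the subject, one plausible check for Step 7 is to compare the latest training score with the user's own history. The sketch below uses a z-score rule; the threshold of 2.0, the minimum history length of three, and the `is_anomalous` name are assumptions, not details from the disclosure.

```python
from statistics import mean, pstdev


def is_anomalous(history, latest, threshold=2.0):
    """Flag a score that deviates strongly from the user's own past scores."""
    if len(history) < 3:
        # Too little history to define a personal norm.
        return False
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # All past scores identical: any change counts as a deviation.
        return latest != mu
    return abs(latest - mu) / sigma > threshold
```

A result of `True` would then trigger the notification to medical institutions and relevant parties.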

[0099] Step 8:

[0100] The server stores the collected data in the learning device and updates the model to improve the accuracy of subsequent analyses.

[0101] The accumulated data is used as input, and an updated evaluation model is generated as output. Specific operations include the application of machine learning algorithms to accelerate the evolution of the evaluation model.

[0102] (Application Example 1)

[0103] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0104] In modern society, the increasing elderly population necessitates continuous monitoring of their health. However, challenges remain, including the risk of social isolation among the elderly and the difficulty in responding quickly to sudden changes in their health. Furthermore, there is a need for a system that collects health information in a way that does not require complex operations from the elderly, and provides appropriate support.

[0105] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0106] In this invention, the server includes means for a communication unit to automatically establish a communication connection with a task target based on a pre-set schedule; means for using an analysis unit to convert the content of the dialogue with the task target into information data and analyze cognitive function and emotional state; means for generating individually optimized ability training using a generation unit based on the results of the analysis and presenting it to the task target; and means for presenting the optimized information training on a mobile terminal used by the task target. This makes it possible for elderly people to continuously monitor their health status and receive prompt support as needed without becoming socially isolated.

[0107] A "communication unit" is a device that automatically establishes a communication connection with the task target and operates according to a pre-set schedule.

[0108] "Task targets" are individuals whose health status is being monitored; the term typically refers to elderly people.

[0109] "Information data" refers to data that records the content of interactions with the task target, including data that has been converted into language data.

[0110] An "analysis unit" is a device that analyzes acquired information data and determines the cognitive function and emotional state of the task target.

[0111] A "generation unit" is a device that creates ability training suitable for the task target based on the analysis results, and plays a role in individual optimization.

[0112] "Mobile devices" refer to information and communication devices that the task target uses on a daily basis, including smartphones and tablets.

[0113] An "intermediate unit" is a device that facilitates interaction between task participants and between supervisors, with the aim of preventing social isolation.

[0114] The system for realizing this invention is comprised of a communication unit, an analysis unit, a generation unit, and a mobile terminal.

[0115] First, the server automatically establishes a communication connection with the user targeted for the task via a communication unit, based on a pre-configured schedule. This provides users with an environment where they can easily monitor their own health status.

[0116] Once a communication connection is established, the content of the conversation with the user is converted into information data and sent to the analysis unit. The analysis unit uses this information data to analyze the user's cognitive function and emotional state step by step. For example, natural language processing is used in the analysis to determine changes in emotion and signs of attention deficit from the user's dialogue. Specifically, detailed language analysis is possible by using speech recognition technologies such as Google® Cloud Speech-to-Text and the Google Cloud Natural Language API.

[0117] Once the analysis results are obtained, the server utilizes a generation unit to generate ability training optimized for the user. The training content is presented on the user's mobile device, enabling more personalized support. This allows the user to take appropriate actions according to their own health condition.

[0118] For example, if an analysis reveals that a user is experiencing stress in their daily life, the generation unit will suggest relaxation exercises. In this case, a specific prompt might be: "A 70-year-old man has recently been having trouble sleeping at night. Perform an emotional analysis and suggest appropriate relaxation methods."
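The example prompt above could be produced from a small template, as in this minimal sketch. The `build_prompt` function and its parameters are hypothetical; only the wording of the resulting prompt is taken from the text.

```python
def build_prompt(age, gender, observation):
    """Compose the prompt sent to the generation unit from user attributes."""
    return (
        f"A {age}-year-old {gender} {observation}. "
        "Perform an emotional analysis and suggest appropriate relaxation methods."
    )


prompt = build_prompt(70, "man", "has recently been having trouble sleeping at night")
```

Filling the template from profile data and the latest analysis result keeps the prompt wording consistent across users.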

[0119] Furthermore, if an anomaly is detected, the server automatically notifies medical institutions and relevant parties, enabling a swift response.

[0120] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0121] Step 1:

[0122] The server automatically establishes a communication connection to the target user based on a pre-configured schedule using a communication unit. The input is schedule data, and the output is the establishment of the communication connection.

[0123] Step 2:

[0124] After a communication connection is established, the terminal records the conversation with the user in real time and acquires audio data. The input is the user's voice, and the output is an audio data file.

[0125] Step 3:

[0126] The server sends the audio data to a speech recognition API such as Google Cloud Speech-to-Text, where it is converted into text data that serves as the information data. The input is audio data, and the output is the converted text data.

[0127] Step 4:

[0128] The analysis unit analyzes the converted text data and performs natural language processing to evaluate the user's cognitive function and emotional state. The input is text data, and the output is the analysis result data.

[0129] Step 5:

[0130] The server generates ability training tailored to the user using a generation unit based on the analysis results. Specifically, it uses a generative AI model to determine appropriate exercises and training content. The input is the analysis results, and the output is data on the training content.

[0131] Step 6:

[0132] The terminal displays the generated training content on the user's mobile device, making it accessible to the user. The input is the training content data, and the output is the user's visual interface.

[0133] Step 7:

[0134] The server automatically notifies medical institutions and relevant parties if an anomaly is detected. The input is anomaly detection data obtained from the analysis results, and the output is the transmission of a notification message.
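The data flow of Steps 2 to 7 above can be sketched as a chain of functions, each taking the output of the previous step as its input. This is only a sketch under stated assumptions: the connection of Step 1 is assumed to be already established, every callable is injected by the caller, and the names `AnalysisResult`, `SessionOutcome`, and `run_session` are hypothetical. No real speech recognition or generative AI service is called.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AnalysisResult:
    cognitive_score: float  # assumed scale: 0.0 (low) to 1.0 (high)
    emotion: str            # assumed labels such as "calm" or "stressed"
    anomaly: bool           # True when the analysis flags an anomaly


@dataclass
class SessionOutcome:
    transcript: str
    analysis: AnalysisResult
    training: str
    notification: Optional[str]


def run_session(
    record_audio: Callable[[], bytes],                   # Step 2
    transcribe: Callable[[bytes], str],                  # Step 3
    analyze: Callable[[str], AnalysisResult],            # Step 4
    generate_training: Callable[[AnalysisResult], str],  # Step 5
    display: Callable[[str], None],                      # Step 6
    notify: Callable[[str], None],                       # Step 7
) -> SessionOutcome:
    """Run one session, passing each step's output to the next step."""
    audio = record_audio()
    transcript = transcribe(audio)
    analysis = analyze(transcript)
    training = generate_training(analysis)
    display(training)
    notification = None
    if analysis.anomaly:
        # The message format is an assumption, not taken from the source.
        notification = f"Anomaly detected: emotion={analysis.emotion}"
        notify(notification)
    return SessionOutcome(transcript, analysis, training, notification)
```

Injecting the steps as callables keeps the sketch independent of any particular service, so each stage can be replaced by a stub during testing.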

[0135] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0136] The present invention aims to prevent social isolation by closely monitoring the health status of elderly individuals through a system combining a communication device, an analysis device, a generation device, an emotion engine, and necessary means. The embodiments of this system are described in detail below.

[0137] Based on a schedule set by the server, the communication device automatically initiates a connection to the target person. Once the device establishes a connection with the target person, interaction begins through casual conversation. This conversation proceeds at a comfortable pace and involves asking questions about the target person's daily life. In this way, the device provides a natural space for communication.
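One way the server might determine the next connection time from a registered schedule is sketched below. The schedule representation (a set of weekdays plus a time of day) and the function name `next_contact` are assumptions made for illustration; the disclosure does not specify how the schedule is stored.

```python
from datetime import datetime, time, timedelta


def next_contact(now: datetime, weekdays: set[int], at: time) -> datetime:
    """Return the first scheduled datetime strictly after `now`.

    `weekdays` uses datetime.weekday() numbering (Monday is 0, Sunday is 6).
    """
    if not weekdays:
        raise ValueError("schedule has no weekdays")
    # Offsets 0..7 cover today through the same weekday next week.
    for offset in range(8):
        day = now.date() + timedelta(days=offset)
        candidate = datetime.combine(day, at)
        if day.weekday() in weekdays and candidate > now:
            return candidate
    raise ValueError("weekdays must contain values between 0 and 6")
```

For example, with calls scheduled on Mondays and Thursdays at 10:00, a check made on a Tuesday at 11:00 yields the following Thursday at 10:00.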

[0138] The audio data acquired by the terminal is sent to a server and converted into text data by an analysis device. The analysis device uses natural language processing technology to precisely analyze the user's cognitive function and emotional state. An emotion engine is integrated into this process, analyzing the user's emotions from their tone of voice and word choice, and determining their state in real time. This emotional state is a crucial element for a deeper understanding of the meaning of the conversation.
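The idea of estimating an emotional state from both word choice and tone of voice can be illustrated with a deliberately simple scoring sketch. Everything here is an assumption: the word list, the voice features (`speech_rate_wpm`, `pause_ratio`), the 90 words-per-minute cutoff, and the weights are placeholders with no clinical validation, and the emotion engine in the disclosure may work quite differently (for example, with a trained model).

```python
from dataclasses import dataclass

# Hypothetical lexicon mapping words to a stress weight (not from the source).
STRESS_WORDS = {"tired": 0.5, "worried": 0.8, "anxious": 0.9, "lonely": 0.7}


@dataclass
class VoiceFeatures:
    speech_rate_wpm: float  # words per minute
    pause_ratio: float      # fraction of the recording that is silence, 0..1


def text_stress(text: str) -> float:
    """Return the highest stress weight of any word in the text (0.0 if none)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return max((STRESS_WORDS.get(w, 0.0) for w in words), default=0.0)


def voice_stress(voice: VoiceFeatures) -> float:
    """Combine slow speech and long pauses into a 0..1 score (placeholder rule)."""
    slow = 1.0 if voice.speech_rate_wpm < 90 else 0.0
    pauses = min(1.0, max(0.0, voice.pause_ratio))
    return 0.5 * slow + 0.5 * pauses


def estimate_stress(text: str, voice: VoiceFeatures, text_weight: float = 0.6) -> float:
    """Weighted blend of the text-based and voice-based scores."""
    return text_weight * text_stress(text) + (1 - text_weight) * voice_stress(voice)
```

A blended score keeps the two signals separate until the last step, which makes it straightforward to swap either heuristic for a learned model later.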

[0139] Based on the analysis results, the server uses a generator to create intelligence training and brain training programs optimized for the individual, providing programs tailored to their specific needs. For example, if the emotion engine determines that the user is experiencing stress, training with a relaxing effect will be provided.

[0140] Furthermore, the server utilizes emotional state information obtained by the emotion engine to notify healthcare institutions and relevant parties if any unusual changes are detected. This notification includes the results of the analyzed emotional state, providing important information for healthcare professionals to take appropriate action.
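Detecting an "unusual change" requires some notion of the user's normal state. One common approach, used here purely as an assumption, is to compare the latest score with the mean and standard deviation of the user's own history. The function name `is_anomalous` and the threshold of two standard deviations are illustrative and not taken from the disclosure.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], latest: float, threshold: float = 2.0) -> bool:
    """Flag `latest` when it deviates from the user's own baseline.

    The baseline is the mean of `history`; the deviation is measured in
    sample standard deviations and compared with `threshold`.
    """
    if len(history) < 2:
        return False  # not enough data to define a baseline
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu  # any change from a perfectly flat baseline
    return abs(latest - mu) / sigma > threshold
```

Using a per-user baseline means the same absolute score can be normal for one person and unusual for another, which matches the personalized monitoring described above.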

[0141] The server stores conversation data and analysis results in a database, and the AI model is updated by a learning device. This improves the accuracy of the analysis, enabling the provision of more appropriate support in subsequent conversations. Through this process, the system constantly monitors the health status of elderly individuals in an optimal manner, thereby reducing the risk of lonely deaths and other health risks.

[0142] The following describes the processing flow.

[0143] Step 1:

[0144] The server checks the communication schedule. The server refers to the registered schedule for each participant and determines the date and time of the next communication. This ensures a timely connection.

[0145] Step 2:

[0146] The server initiates the communication connection. At the scheduled time, the server automatically places a voice or video call to the target person's phone number or IP address. Once the connection is established, the device begins conversing with the target person.

[0147] Step 3:

[0148] The device greets the target person and begins the conversation. Based on a pre-prepared script, the device asks questions about the target person's daily life, communicating in a natural flow.

[0149] Step 4:

[0150] The user answers questions from the device. The person responds to the questions from the device by talking about their daily life and physical condition. The device acquires the audio data and sends it to the server.

[0151] Step 5:

[0152] The server converts the audio data into text. The analysis device uses speech recognition technology to convert the acquired audio data into text data and then performs natural language processing.

[0153] Step 6:

[0154] The server uses an emotion engine to perform emotion analysis. The emotion engine analyzes the converted text data and voice characteristics to evaluate the user's emotional state in real time.

[0155] Step 7:

[0156] The server generates intelligence training based on the analysis results. The generation device creates an individually optimized brain training program based on the analysis results and sends it to the terminal.

[0157] Step 8:

[0158] The device presents training to the user. By having the user complete the generated intelligence training program and providing appropriate feedback, it supports the maintenance and improvement of cognitive function.

[0159] Step 9:

[0160] The server detects and notifies of anomalies. If an anomaly is detected based on analysis results and the evaluation of the emotion engine, the server automatically notifies the relevant medical institutions and family members.

[0161] Step 10:

[0162] The server stores data and updates the model. By accumulating conversation data and analysis results, and updating the AI model using a learning device, the accuracy of subsequent analyses is improved.
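The accumulation of conversation data and analysis results in Step 10 can be sketched with an in-memory SQLite database. The table layout, column names, and function names are assumptions for illustration; the disclosure does not specify a schema, and the model update itself is outside the scope of this sketch.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (hypothetical) sessions table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sessions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               user_id TEXT NOT NULL,
               held_at TEXT NOT NULL,      -- ISO 8601 timestamp
               transcript TEXT NOT NULL,
               stress_score REAL NOT NULL
           )"""
    )
    return conn


def save_session(conn: sqlite3.Connection, user_id: str, held_at: str,
                 transcript: str, stress_score: float) -> None:
    """Insert one conversation and its analysis result in a transaction."""
    with conn:
        conn.execute(
            "INSERT INTO sessions (user_id, held_at, transcript, stress_score) "
            "VALUES (?, ?, ?, ?)",
            (user_id, held_at, transcript, stress_score),
        )


def training_rows(conn: sqlite3.Connection, user_id: str) -> list[tuple[str, float]]:
    """Return (transcript, stress_score) pairs for one user, oldest first."""
    cur = conn.execute(
        "SELECT transcript, stress_score FROM sessions "
        "WHERE user_id = ? ORDER BY held_at",
        (user_id,),
    )
    return cur.fetchall()
```

The rows returned by `training_rows` are the kind of accumulated data that a learning device could consume when updating the model.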

[0163] (Example 2)

[0164] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0165] There is a need to efficiently support the monitoring of the health status of the elderly and the prevention of social isolation. However, conventional methods have limitations in natural dialogue and precise analysis of emotional states, making it difficult to provide individualized intelligence training or to rapidly detect and notify abnormal conditions.

[0166] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0167] In this invention, the server includes means for an information transmission device to automatically connect to the user based on a pre-set plan, means for using an analysis device to convert dialogue content into linguistic information during interaction with the user and to analyze cognitive function and emotional state, and means for generating individually optimized intelligence training using a generation device based on the results of the analysis and presenting it to the user. This enables detailed monitoring of the health status of the elderly, improvement of cognitive function through designed intelligence training, and prevention of social isolation.

[0168] An "information transmission device" is a device that automatically connects to users based on a pre-set plan.

[0169] "User" refers to the individual who uses the system, particularly the elderly.

[0170] "Dialogue content" refers to the entire content of the conversation, including linguistic information, exchanged between the user and the system's terminal.

[0171] "Linguistic information" refers to text-based information converted using speech recognition technology.

[0172] "Cognitive function" refers to analytical methods and systems used to evaluate a user's cognitive abilities.

[0173] "Emotional state" refers to elements that analyze and judge the user's mental and emotional condition in real time.

[0174] "Analysis equipment" refers to all devices that convert dialogue content into linguistic information and evaluate cognitive functions and emotional states.

[0175] A "generating device" refers to a device or system that generates intelligence training optimized for the user based on analysis results.

[0176] "Intelligence training" refers to a set of programs and activities designed to improve, maintain, or enhance a user's cognitive functions and emotional state.

[0177] "Unusual changes" refer to health or emotional changes that deviate significantly from the user's normal state.

[0178] A "healthcare organization" refers to an agency or facility that receives notifications from the system and provides necessary medical responses and care.

[0179] An "information storage device" refers to a storage medium for accumulating generated linguistic information and analysis results.

[0180] A "learning device" refers to a device or system that updates a generative model using accumulated data to improve the accuracy of evaluations in subsequent interactions.

[0181] A "generative model" refers to a machine learning model used by a learning device to improve the accuracy of subsequent interactions.

[0182] This invention is a system for monitoring the health status of elderly people, combining an information transmission device, analysis equipment, generation equipment, an emotion engine, and related systems. In this system, a server acts as the central point, and each device and equipment works in coordination with it.

[0183] First, the server automatically establishes a connection with the user through an information transmission device based on a pre-configured plan. The user then engages in natural, everyday conversation using a terminal. During this dialogue process, the user's voice is captured by the terminal and sent to the server. The server then passes this voice data to an analysis device, which uses speech recognition technology to convert it into linguistic information. This process utilizes natural language processing algorithms to enable highly accurate conversion.

[0184] Next, the server uses the converted language information to analyze cognitive function and emotional state using an analysis device. The emotion engine determines the emotional tendency in real time from the user's voice tone and selected words. For example, if feelings of worry or anxiety are detected, that information is input into the generation device.

[0185] Based on the results of this emotion analysis, the generating device creates optimized intelligence training tailored to the user. For example, it may offer programs that play relaxing music or instructions for light exercise. This may include special programs aimed at stress reduction.

[0186] Furthermore, the server also has a function to notify medical organizations and relevant parties if any unusual changes are detected. This notification enables a swift and appropriate response.

[0187] The server simultaneously saves past dialogue data and analysis results to a data storage device, and updates the generative AI model using a learning device. This update improves the accuracy of evaluations in subsequent dialogues, enabling more personalized support.

[0188] For example, if a user repeatedly mentions "I can't sleep lately," the emotion engine will determine their stress level, and the generating device will provide a corresponding relaxation program. Furthermore, relevant data is fed back to the learning device to improve its response to similar cases.
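Detecting that a user "repeatedly mentions" a phrase can be approximated by counting the sessions in which the phrase occurs. The sketch below uses plain case-insensitive substring matching; the function names and the minimum of three sessions are assumptions, and a real system would likely need more robust matching of paraphrases.

```python
def count_sessions_mentioning(transcripts: list[str], phrase: str) -> int:
    """Count the transcripts that contain `phrase`, ignoring case."""
    needle = phrase.lower()
    return sum(1 for transcript in transcripts if needle in transcript.lower())


def is_recurring(transcripts: list[str], phrase: str, min_sessions: int = 3) -> bool:
    """Return True when `phrase` appears in at least `min_sessions` transcripts."""
    return count_sessions_mentioning(transcripts, phrase) >= min_sessions
```

Counting sessions rather than raw occurrences avoids treating one conversation in which the phrase is said several times as a recurring pattern.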

[0189] An example of a prompt to input into the generative AI model is as follows: "Please explain in detail how to analyze the user's voice data and determine their emotional state."

[0190] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0191] Step 1:

[0192] The server activates the information transmission device based on a pre-configured plan and automatically initiates a connection to the user. The server references connection schedule data as input and sends a connection request as output. Specifically, the server performs network connection control and establishes a session with the user terminal.

[0193] Step 2:

[0194] The device accepts the connection request and initiates a natural, everyday conversation with the user. The device receives connection information as input and generates voice data as output. Specifically, it facilitates smooth dialogue by asking questions such as, "How are you feeling today?"

[0195] Step 3:

[0196] The device sends audio data acquired through conversation to the server. It receives audio data as input and sends high-quality audio files to the server as output. Specifically, the device's built-in microphone records the user's voice and converts it into an appropriate data format.

[0197] Step 4:

[0198] The server uses analysis equipment to convert transmitted audio data into text-based language information. The input is audio data, and the output is text data. Specifically, a natural language processing algorithm performs speech recognition on the audio and outputs the result as text.

[0199] Step 5:

[0200] The server analyzes linguistic information transcribed into text using analytical equipment to determine cognitive function and emotional state. It receives text data as input and generates cognitive and emotional evaluation reports as output. Specifically, the analytical algorithm analyzes the text and determines the user's mental state.

[0201] Step 6:

[0202] The server generates optimized intelligence training based on the analysis results using generation equipment. It uses the analysis and evaluation report as input and provides an intelligence training program as output. Specifically, it selects an adaptive program and presents it to the user.

[0203] Step 7:

[0204] The server sends notifications to medical organizations and relevant parties if it detects unusual changes in the emotion analysis. The input is a detailed analysis report, and the output is a notification message. Specifically, the server sends warnings via email or an application using a communication protocol.

[0205] Step 8:

[0206] The server stores conversation data and analysis results in a data storage device and updates the generative AI model through a learning device. The input is past conversation data and analysis results, and the output is the updated generative AI model. The specific operations involve database management and model update processing using machine learning algorithms.
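Step 7 above mentions sending warnings via email. A notification message of that kind could be composed with Python's standard `email.message.EmailMessage` class, as sketched below. The addresses, the subject format, and the function name `build_alert` are hypothetical; actual delivery is omitted.

```python
from email.message import EmailMessage


def build_alert(sender: str, recipients: list[str], user_id: str, summary: str) -> EmailMessage:
    """Build a plain-text alert addressed to the pre-registered recipients."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"Alert for user {user_id}"  # subject format is an assumption
    msg.set_content(summary)
    return msg


alert = build_alert(
    sender="monitor@example.org",
    recipients=["clinic@example.org", "family@example.org"],
    user_id="u-001",
    summary="Unusual change detected in the emotion analysis.",
)
```

A message built this way could be handed to `smtplib.SMTP.send_message` for delivery; the server address and credentials would depend on the deployment.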

[0207] (Application Example 2)

[0208] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0209] In modern society, monitoring the health status of the elderly and preventing social isolation are crucial issues. Traditional methods make it difficult to ensure regular personal contact and assess health status, potentially causing the elderly to miss opportunities for appropriate care. Solving this problem and providing an environment where the elderly can live with peace of mind is essential.

[0210] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0211] In this invention, the server includes means for a communication device to automatically establish a communication connection with a target person based on a pre-set schedule; means for using an analysis device to convert conversation content into linguistic data and analyze cognitive function and emotional state during a conversation with the target person; and means for generating individually optimized intelligence training and relaxation content using a generation device based on the results of the analysis and presenting it to the target person. This makes it possible to continuously and effectively monitor the health status of elderly people, prevent social isolation, and provide appropriate care.

[0212] A "communication device" is a device that has the function of automatically establishing a communication connection with a target person based on a pre-set schedule.

[0213] An "analysis device" is a device that converts the content of conversations with a subject into linguistic data and analyzes their cognitive function and emotional state.

[0214] A "generation device" is a device that generates and presents individually optimized intelligence training and relaxation content to the target user based on the analysis results.

[0215] "Means for notifying relevant parties via communication means when an anomaly is detected" refers to a function that, when an anomaly is detected during analysis, notifies pre-designated relevant parties using communication means.

[0216] A "communication terminal" is a device that enables the operation of the aforementioned communication devices, analysis devices, and generation devices, and is a terminal for users to operate.

[0217] "Remote computing resources" refer to services and hardware that support the computational processing required by analytical devices, and usually refer to cloud services.

[0218] A "learning device" is a device that stores conversation content and analysis results, and updates the computational model to improve the accuracy of evaluation in subsequent dialogues.

[0219] An "intermediate device" is a device that has the function of promoting daily interaction among the subjects themselves and among their supervisors, and preventing psychological isolation.

[0220] The system for carrying out this invention consists of a communication device, an analysis device, a generation device, a learning device, and an intermediate device. Each of these devices is intended to monitor the health status of the subject and prevent social isolation.

[0221] The server plays a central role, coordinating with communication terminals to establish regular communication with the target individual. The communication terminals, installed on devices such as smartphones, act as a bridge between the target individual and the server. Once a conversation begins, the terminals capture the conversation as audio data and convert it into text data using the Google Cloud Speech-to-Text API. Next, an analysis device uses this text data for natural language processing to evaluate the target individual's cognitive function and emotional state. The analysis utilizes machine learning algorithms based on TensorFlow®.

[0222] Based on the analysis results, the server uses a generation device to create intelligence training and relaxation content optimized for the target individual. This content is presented on the target individual's smartphone. Furthermore, if an anomaly is detected, the server notifies pre-registered stakeholders using the Twilio API, etc. Data is accumulated using a cloud database, and the learning device updates the generative AI model to improve accuracy in subsequent interactions.

[0223] For example, if data suggests that older adults tend to experience stress on weekends, the system will provide guided meditation content for relaxation. An example of a prompt for the generative AI model would be, "If data suggests that the user is particularly anxious on weekends, please suggest appropriate relaxation methods."
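Whether the data "suggests" weekend stress could be checked by comparing average stress scores on weekends with those on weekdays. The sketch below assumes one score per calendar date and an arbitrary margin of 0.2; the function names and the margin are illustrative, not taken from the disclosure.

```python
from datetime import date
from statistics import mean


def weekend_vs_weekday(scores: dict[date, float]) -> tuple[float, float]:
    """Return (mean weekend score, mean weekday score)."""
    weekend = [s for d, s in scores.items() if d.weekday() >= 5]  # Sat=5, Sun=6
    weekday = [s for d, s in scores.items() if d.weekday() < 5]
    if not weekend or not weekday:
        raise ValueError("need at least one weekend and one weekday score")
    return mean(weekend), mean(weekday)


def stressed_on_weekends(scores: dict[date, float], margin: float = 0.2) -> bool:
    """Return True when the weekend mean exceeds the weekday mean by `margin`."""
    weekend_mean, weekday_mean = weekend_vs_weekday(scores)
    return weekend_mean - weekday_mean > margin
```

When the check returns True, the system could select the weekend-oriented prompt quoted above as input to the generative AI model.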

[0224] The introduction of this system will enable elderly people to receive care safely at home while maintaining social connections.

[0225] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0226] Step 1:

[0227] The server automatically establishes a communication connection with the target person via a communication terminal based on a pre-configured schedule. The input is schedule information, and the output is the establishment of the communication connection. In this process, the communication device sends a signal to the target person's smartphone, opening the communication line.

[0228] Step 2:

[0229] The device records the subject's conversation as audio data. The input is voice acquired through the microphone, and the output is digitized audio data. The device uses the microphone to collect voice and performs digital signal processing.

[0230] Step 3:

[0231] The server uses the Google Cloud Speech-to-Text API to convert audio data into text data. The input is audio data, and the output is text data. The server performs speech recognition via the API and saves the results to a database.

[0232] Step 4:

[0233] The server sends text data to the analysis device and performs natural language processing. The input is text data, and the output is the analysis result (cognitive function and emotional state). The analysis device uses TensorFlow to perform the analysis and returns the result.

[0234] Step 5:

[0235] The server generates intelligence training and relaxation content using a generation device based on the analysis results. The input is the analysis results, and the output is customized content. The generation device uses this data to create content suitable for the target audience.

[0236] Step 6:

[0237] The terminal presents the generated content on the user's device and facilitates interaction. The input is the generated content, and the output is what is displayed to the user. The device transmits information through its screen and speakers.

[0238] Step 7:

[0239] The server stores analysis results and user responses in a cloud database, and the learning device updates the generative AI model for the next interaction. The input is the analysis results and user response data, and the output is the updated AI model. The learning device incorporates the new data into the model to improve accuracy.

[0240] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0241] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0242] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0243] [Second Embodiment]

[0244] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0245] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0246] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0247] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0248] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0249] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0250] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0251] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0252] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0253] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0254] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0255] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0256] The present invention aims to efficiently monitor the health status of the elderly and prevent social isolation through a system combining a communication device, an analysis device, a generation device, and necessary means. Embodiments of this system are described below.

[0257] The server first uses a communication device to periodically connect with the elderly target person according to a pre-set schedule. This communication is conducted via voice or video call. Once the terminal connects to the target person, it initiates a conversation based on everyday topics and collects information about the target person's health status.

[0258] During a conversation, the audio data acquired by the device is constantly converted into language data by a server via an analysis device. This analysis device uses natural language processing technology to analyze the cognitive function and emotional state of the subject from their responses. For example, if the subject's statements show signs of emotional instability, the server reflects this in the analysis results.

[0259] Next, based on the analysis results, the server generates individually optimized intelligence training and brain training content through a generation device. This generated content is presented to the subject via a terminal, making it possible to implement optimal training tailored to the subject's current health condition.

[0260] Furthermore, if an anomaly is detected, the server automatically notifies healthcare facilities and relevant parties. This encourages early response and reduces health risks. For example, if signs of depression are detected during a conversation, the server will issue an emergency alert and coordinate with healthcare facilities to enable early intervention.

[0261] The collected data is securely stored on a server, and a learning device is used for subsequent analyses and model accuracy improvements. This enables more accurate evaluations and allows for the provision of higher-quality support to the target individuals.

[0262] In this way, this system aims to reduce the risk of lonely deaths and health problems by continuously monitoring the health status of elderly people and providing appropriate support.

[0263] The following describes the processing flow.

[0264] Step 1:

[0265] The server checks the communication schedule. The server retrieves the date and time of the next communication connection from each participant's profile stored in the database. This prepares the server for communication to take place according to the predetermined schedule.

[0266] Step 2:

[0267] The server automatically initiates a communication connection. At the specified time, the server initiates a voice or video call to the target person's phone number or IP address. Using VoIP technology, the connection is established over the internet in the case of a voice call.

[0268] Step 3:

[0269] The device initiates a conversation with the target person. Once the target person accepts the communication, the device makes an opening greeting and starts a conversation based on everyday topics. Questions are asked based on a pre-prepared script to facilitate a natural conversation.

[0270] Step 4:

[0271] The user responds to questions from the AI agent. The user, as the subject, continues the conversation about everyday events and their own health in response to questions from the device. The device receives these responses in real time.

[0272] Step 5:

[0273] The server analyzes the response. The voice data sent from the terminal is converted into text data by the server using speech recognition technology. Then, cognitive function and emotional state are analyzed through natural language processing.

[0274] Step 6:

[0275] The server generates training content based on the analysis results. Using the generation device, it automatically generates brain training and intelligence training tailored to the subject's condition and determines what to provide to the user via the terminal.

[0276] Step 7:

[0277] The terminal provides training to the user. The terminal presents the generated training to the target person and encourages them to carry it out. For example, simple memory tests or quiz-style training are carried out.

[0278] Step 8:

[0279] The server monitors for abnormalities and sends notifications. If an abnormality is detected during the analysis process, the server immediately notifies medical institutions and relevant parties, using email, SMS, or similar channels.

[0280] Step 9:

[0281] The server accumulates data and updates the learning device. Conversation content and analysis results are accumulated in the database, and the learning device updates the AI model as needed, which improves accuracy in subsequent conversations and analyses.
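The schedule check in Step 1 can be illustrated with a minimal sketch. The specification does not define the profile layout, so the `Profile` class, its field names, and the `due_profiles` function below are hypothetical and serve only to show how connections that have come due might be selected.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Profile:
    # Hypothetical profile record; the field names are illustrative only.
    name: str
    contact: str          # phone number or IP address of the target person
    next_call: datetime   # date and time of the next scheduled connection


def due_profiles(profiles, now):
    """Return the profiles whose scheduled time has arrived, earliest first."""
    due = [p for p in profiles if p.next_call <= now]
    return sorted(due, key=lambda p: p.next_call)


profiles = [
    Profile("A", "+81-00-0000-0001", datetime(2025, 1, 10, 9, 0)),
    Profile("B", "+81-00-0000-0002", datetime(2025, 1, 10, 15, 0)),
    Profile("C", "192.0.2.10", datetime(2025, 1, 10, 8, 30)),
]
now = datetime(2025, 1, 10, 9, 0)
due_names = [p.name for p in due_profiles(profiles, now)]
```

In this sketch the server would then place a call to each selected contact in order (Step 2). How the call itself is placed is outside the scope of the example.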

[0282] (Example 1)

[0283] Next, Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".

[0284] In modern society, monitoring the health status of the elderly and preventing social isolation have become important issues. However, with conventional technologies, it has been difficult to grasp detailed health status in real time and to provide fine-grained individual support. Therefore, there is a need for a method of detecting health risks in the elderly at an early stage and a method of providing appropriate support.

[0285] The specific processing by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following respective means.

[0286] In this invention, the server includes means for a communication device to automatically establish a communication connection with a target person based on a preset plan, means for using an analysis device that converts voice data into language data and evaluates cognitive functions and emotional states, means for generating and presenting individually optimized cognitive training to the target person using a generation device, means for notifying a medical institution or a relevant person when an abnormality is detected, and means for accumulating the collected data and using a learning device to improve the accuracy of the next analysis and evaluation model. This makes it possible to monitor the health status of the elderly in real time and provide appropriate support according to individual conditions.

[0287] The "communication device" is a device that connects to a target person via voice or video communication and exchanges information.

[0288] The "plan" refers to a schedule or procedure preset for the communication device to connect to a target person.

[0289] The "analysis device" is a device that converts voice data into language data and evaluates cognitive functions and emotional states.

[0290] The "generation device" is a device that generates and presents optimized cognitive training for a target person based on analysis results.

[0291] "Abnormality" refers to dangerous signs different from normal ones in the health status or conversation content of a target person, indicating a state that requires early intervention.

[0292] The "learning device" is a device that uses the collected data to improve the accuracy of the next analysis and evaluation model.

[0293] The "model" refers to an algorithm or structure used in data analysis and the generation of cognitive training.

[0294] In this invention, a system is provided in which a server, a terminal, and a user cooperate to monitor the health status of the elderly and prevent social isolation.

[0295] The server uses communication equipment to automatically connect with the elderly user's device according to a pre-configured plan. Typically, a device with calling capabilities is used. This communication is conducted by processing voice data in real time, enabling smooth interaction with the user. Examples of prompts include, "What happened today?" and "Have you been feeling tired lately?" These questions initiate a natural conversation with the user.

[0296] The device engages in conversation with the user based on everyday topics and transmits the user's voice input to the server. The server converts the voice data into language data via analysis equipment and then uses a generative AI model to evaluate cognitive function and emotional state. This process utilizes natural language processing (NLP) technology. For example, from a user's statement such as "I often have trouble sleeping," the system assesses whether there are signs of cognitive decline or emotional instability.

[0297] Based on the analysis results, the server uses a generation device to design individually optimized cognitive training. This training is tailored to the user's health condition and abilities and can be presented via a terminal. A specific example is a quiz-style training aimed at improving memory.

[0298] If an anomaly is detected, the server quickly notifies medical institutions and relevant parties, facilitating early response. This helps mitigate health risks. The collected data is stored in learning devices, and machine learning techniques are used to improve the accuracy of subsequent analyses and evaluation models. This supports long-term health management and forms the basis for further technological improvements.

[0299] The implementation of this system will be an essential means of monitoring the health of the elderly in their daily lives and ensuring they can live with peace of mind.

[0300] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0301] Step 1:

[0302] The server automatically connects to the target person's terminal via a communication device based on a pre-set plan.

[0303] As input, schedule information and contact data of the connection target are used, and as output, a connection for a voice call or a video call is established. As a specific operation, the server sends a prompt sentence "What happened today?" to the target person and prepares to start a conversation.

[0304] Step 2:

[0305] The terminal progresses the conversation with the user and collects voice data.

[0306] As input, there is a response from the user to the prompt sentence sent from the server. As output, the collected voice data is generated. The specific operation is that the user speaks about their daily life and the terminal records it.

[0307] Step 3:

[0308] The terminal sends the collected voice data to the server, and the server uses an analysis device to convert the voice data into language data.

[0309] As input, the collected voice data is used, and as output, language data in text form is generated. During this process, natural language processing technology is utilized on the server. The server analyzes the user's utterance content "I often can't sleep recently" and evaluates the emotional state.

[0310] Step 4:

[0311] The server uses the analyzed language data and uses a generative AI model to generate individualized cognitive training.

[0312] Language data and past health data are used as input, and an optimized training program is output. Specifically, a quiz-style training program for improving memory is designed and prepared to be provided to the user.

[0313] Step 5:

[0314] The server sends the generated training content to the terminal, and the terminal presents the training to the user.

[0315] The input is a generated training program, and the output is the user receiving training instructions. Specifically, the device instructs the user to "start the next quiz" and then begins the quiz.

[0316] Step 6:

[0317] The user performs the training, and the device sends the results to the server.

[0318] The input consists of user training data and results, and the output is training result data sent to the server. Specifically, the user answers a quiz, and the device records the answers.

[0319] Step 7:

[0320] The server stores the training results that have been sent and, if necessary, notifies medical institutions and relevant parties if it detects any anomalies.

[0321] The input is training result data, and the output, where an anomaly is found, is an anomaly notification message. The server automatically sends an alert when an anomaly is detected, prompting intervention.

[0322] Step 8:

[0323] The server stores the collected data in the learning device and updates the model to improve the accuracy of subsequent analyses.

[0324] The accumulated data is used as input, and an updated evaluation model is generated as output. Specific operations include the application of machine learning algorithms to accelerate the evolution of the evaluation model.
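The quiz-style training in Steps 4 to 6 can be sketched as follows. The specification does not describe the quiz format or how answers are scored, so the `QuizItem` class, the sample questions, and the case-insensitive exact-match scoring in `score_quiz` are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class QuizItem:
    # Hypothetical quiz item: one question and its expected answer.
    question: str
    answer: str


def score_quiz(items, responses):
    """Compare recorded responses with expected answers, ignoring case and spaces."""
    correct = sum(
        1
        for item, given in zip(items, responses)
        if given.strip().lower() == item.answer.strip().lower()
    )
    accuracy = correct / len(items) if items else 0.0
    return {"total": len(items), "correct": correct, "accuracy": accuracy}


items = [
    QuizItem("Which word came first: apple, train, or river?", "apple"),
    QuizItem("Which word came last: apple, train, or river?", "river"),
    QuizItem("How many words were shown?", "3"),
]
responses = ["Apple", "train", " 3 "]   # answers recorded by the terminal
result = score_quiz(items, responses)   # training result data sent to the server
```

The resulting dictionary corresponds to the training result data of Step 6. A real system would likely need more tolerant answer matching for spoken responses.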

[0325] (Application Example 1)

[0326] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0327] In modern society, the increasing elderly population necessitates continuous monitoring of their health. However, challenges remain, including the risk of social isolation among the elderly and the difficulty in responding quickly to sudden changes in their health. Furthermore, there is a need for a system that collects health information in a way that does not require complex operations from the elderly, and provides appropriate support.

[0328] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0329] In this invention, the server includes means for a communication unit to automatically establish a communication connection with a task target based on a pre-set schedule; means for using an analysis unit to convert the content of the dialogue with the task target into information data and analyze cognitive function and emotional state; means for generating individually optimized ability training using a generation unit based on the results of the analysis and presenting it to the task target; and means for presenting the optimized information training on a mobile terminal used by the task target. This makes it possible for elderly people to continuously monitor their health status and receive prompt support as needed without becoming socially isolated.

[0330] A "communication unit" is a device that automatically establishes a communication connection with the task target and operates according to a pre-set schedule.

[0331] A "task target" is an individual whose health status is being monitored, typically an elderly person.

[0332] "Information data" refers to data that records the content of interactions with the task target, including data that has been converted into language data.

[0333] An "analysis unit" is a device that analyzes acquired information data and determines the cognitive function and emotional state of the task target.

[0334] A "generation unit" is a device that creates ability training suitable for the task target based on the analysis results, and plays a role in individual optimization.

[0335] "Mobile devices" refer to information and communication devices that the task target uses on a daily basis, including smartphones and tablets.

[0336] An "intermediate unit" is a device that facilitates interaction between task participants and between supervisors, with the aim of preventing social isolation.

[0337] The system for realizing this invention is comprised of a communication unit, an analysis unit, a generation unit, and a mobile terminal.

[0338] First, the server automatically establishes a communication connection with the user targeted for the task via a communication unit, based on a pre-configured schedule. This provides users with an environment where they can easily monitor their own health status.

[0339] Once a communication connection is established, the content of the conversation with the user is converted into information data and sent to the analysis unit. The analysis unit uses this information data to analyze the user's cognitive function and emotional state step by step. For example, natural language processing is used in the analysis to determine changes in emotion and signs of attention deficit from the user's dialogue. Specifically, detailed language analysis is possible by using a speech recognition service such as Google Cloud Speech-to-Text together with a language analysis service such as the Google Cloud Natural Language API.

[0340] Once the analysis results are obtained, the server utilizes a generation unit to generate ability training optimized for the user. The training content is presented on the user's mobile device, enabling more personalized support. This allows the user to take appropriate actions according to their own health condition.

[0341] For example, if an analysis reveals that a user is experiencing stress in their daily life, the generation unit will suggest relaxation exercises. In this case, a specific prompt might be: "A 70-year-old man has recently been having trouble sleeping at night. Perform an emotional analysis and suggest appropriate relaxation methods."
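As a minimal sketch of how such a prompt might be assembled from the analysis results, the following fills a fixed template. The `build_prompt` function and its parameter names are hypothetical; the specification gives only the example prompt text itself, not how it is constructed.

```python
def build_prompt(age, gender, observation, request):
    """Fill a fixed template with findings from the analysis step."""
    return f"A {age}-year-old {gender} {observation}. {request}"


prompt = build_prompt(
    age=70,
    gender="man",
    observation="has recently been having trouble sleeping at night",
    request="Perform an emotional analysis and suggest appropriate relaxation methods.",
)
```

The resulting string reproduces the example prompt above and would be passed to the generation unit.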

[0342] Furthermore, if an anomaly is detected, the server automatically notifies medical institutions and relevant parties, enabling a swift response.

[0343] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0344] Step 1:

[0345] The server automatically establishes a communication connection to the target user based on a pre-configured schedule using a communication unit. The input is schedule data, and the output is the establishment of the communication connection.

[0346] Step 2:

[0347] After a communication connection is established, the terminal records the conversation with the user in real time and acquires audio data. The input is the user's voice, and the output is an audio data file.

[0348] Step 3:

[0349] The server sends the audio data to a speech recognition API such as Google Cloud Speech-to-Text, where it is converted into informational data. The input is audio data, and the output is the converted text data.

[0350] Step 4:

[0351] The analysis unit analyzes the converted text data and performs natural language processing to evaluate the user's cognitive function and emotional state. The input is text data, and the output is the analysis result data.

[0352] Step 5:

[0353] The server generates ability training tailored to the user using a generation unit based on the analysis results. Specifically, it uses a generative AI model to determine appropriate exercises and training content. The input is the analysis results, and the output is data on the training content.

[0354] Step 6:

[0355] The terminal displays the generated training content on the user's mobile device, making it accessible to the user. The input is the training content data, and the output is the user's visual interface.

[0356] Step 7:

[0357] The server automatically notifies medical institutions and relevant parties if an anomaly is detected. The input is anomaly detection data obtained from the analysis results, and the output is the transmission of a notification message.
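The input and output chain of Steps 3 to 7 can be sketched as a single function with replaceable stages. Everything named here (`run_session`, the stage parameters, and the stub lambdas) is hypothetical. The stubs stand in for the speech recognition API, the analysis unit, the generation unit, and the notification channel, none of which are called for real in this example.

```python
from typing import Callable


def run_session(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    analyze: Callable[[str], dict],
    generate: Callable[[dict], str],
    notify: Callable[[dict], None],
) -> dict:
    """Run one recorded conversation through Steps 3 to 7."""
    text = transcribe(audio)          # Step 3: audio data -> text data
    analysis = analyze(text)          # Step 4: text data -> analysis result
    training = generate(analysis)     # Step 5: analysis result -> training content
    if analysis.get("anomaly"):       # Step 7: notify only when an anomaly is found
        notify(analysis)
    # The training content is what the terminal would display in Step 6.
    return {"text": text, "analysis": analysis, "training": training}


# Stub stages used only to exercise the flow.
sent = []
outcome = run_session(
    audio=b"...",
    transcribe=lambda a: "I often can't sleep recently",
    analyze=lambda t: {"stress": "sleep" in t, "anomaly": "sleep" in t},
    generate=lambda r: "relaxation exercise" if r["stress"] else "memory quiz",
    notify=sent.append,
)
```

Passing the stages as parameters keeps the flow independent of any particular service, so each stage can be exchanged or tested separately.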

[0358] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0359] The present invention aims to prevent social isolation by closely monitoring the health status of elderly individuals through a system combining a communication device, an analysis device, a generation device, an emotion engine, and necessary means. The embodiments of this system are described in detail below.

[0360] Based on a schedule set by the server, the communication device automatically initiates a connection to the target person. Once the device establishes a connection with the target person, interaction begins through casual conversation. This conversation proceeds at a comfortable pace and involves asking questions about the target person's daily life. In this way, the device provides a natural space for communication.

[0361] The audio data acquired by the terminal is sent to a server and converted into text data by an analysis device. The analysis device uses natural language processing technology to precisely analyze the user's cognitive function and emotional state. An emotion engine is integrated into this process, analyzing the user's emotions from their tone of voice and word choice, and determining their state in real time. This emotional state is a crucial element for a deeper understanding of the meaning of the conversation.

[0362] Based on the analysis results, the server uses a generator to create intelligence training and brain training programs optimized for the individual, providing programs tailored to their specific needs. For example, if the emotion engine determines that the user is experiencing stress, training with a relaxing effect will be provided.

[0363] Furthermore, the server utilizes emotional state information obtained by the emotion engine to notify healthcare institutions and relevant parties if any unusual changes are detected. This notification includes the results of the analyzed emotional state, providing important information for healthcare professionals to take appropriate action.

[0364] The server stores conversation data and analysis results in a database, and the AI model is updated by a learning device. This improves the accuracy of the analysis, enabling the provision of more appropriate support in subsequent conversations. Through this process, the system constantly monitors the health status of elderly individuals in an optimal manner, thereby reducing the risk of lonely deaths and other health risks.

[0365] The following describes the processing flow.

[0366] Step 1:

[0367] The server checks the communication schedule. The server refers to the registered schedule for each participant and determines the date and time of the next communication. This ensures a timely connection.

[0368] Step 2:

[0369] The server initiates the communication connection. At the scheduled time, the server automatically places a voice or video call to the target person's phone number or IP address. Once the connection is established, the device begins conversing with the target person.

[0370] Step 3:

[0371] The device greets the target person and begins the conversation. Based on a pre-prepared script, the device asks questions about the target person's daily life, communicating in a natural flow.

[0372] Step 4:

[0373] The user answers questions from the device. The person responds to the questions from the device by talking about their daily life and physical condition. The device acquires the audio data and sends it to the server.

[0374] Step 5:

[0375] The server converts the audio data into text. The analysis device uses speech recognition technology to convert the acquired audio data into text data and then performs natural language processing.

[0376] Step 6:

[0377] The server uses an emotion engine to perform emotion analysis. The emotion engine analyzes the converted text data and voice characteristics to evaluate the user's emotional state in real time.

[0378] Step 7:

[0379] The server generates intelligence training based on the analysis results. The generation device creates an individually optimized brain training program based on the analysis results and sends it to the terminal.

[0380] Step 8:

[0381] The device presents training to the user. By having the user complete the generated intelligence training program and providing appropriate feedback, it supports the maintenance and improvement of cognitive function.

[0382] Step 9:

[0383] The server detects and notifies of anomalies. If an anomaly is detected based on analysis results and the evaluation of the emotion engine, the server automatically notifies the relevant medical institutions and family members.

[0384] Step 10:

[0385] The server stores data and updates the model. By accumulating conversation data and analysis results, and updating the AI model using a learning device, the accuracy of subsequent analyses is improved.
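Step 6 states that the emotion engine evaluates both the converted text and the voice characteristics, but does not say how the two are combined. One simple possibility, shown here purely as an assumption, is a weighted average of two stress estimates followed by a threshold. The function names, the weight, and the threshold value are all hypothetical.

```python
def fuse_emotion(text_score, voice_score, text_weight=0.5):
    """Blend two stress estimates in the range [0, 1] into a single score."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be between 0 and 1")
    return text_weight * text_score + (1.0 - text_weight) * voice_score


def label(score, threshold=0.6):
    """Map a fused score to a coarse emotional state (threshold is illustrative)."""
    return "stressed" if score >= threshold else "calm"


combined = fuse_emotion(text_score=0.8, voice_score=0.5, text_weight=0.5)
state = label(combined)
```

With these sample values the fused score is 0.65, which the illustrative threshold classifies as "stressed". In Step 7 such a result could steer the generation device toward training with a relaxing effect.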

[0386] (Example 2)

[0387] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0388] There is a need to efficiently support the monitoring of the health status of the elderly and the prevention of social isolation. However, conventional methods have limitations in natural dialogue and precise analysis of emotional states, making it difficult to provide individualized intelligence training or to rapidly detect and notify abnormal conditions.

[0389] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0390] In this invention, the server includes means for an information transmission device to automatically establish a connection with the user based on a pre-set plan, means for using an analysis device to convert dialogue content into linguistic information during interaction with the user and to analyze cognitive function and emotional state, and means for generating individually optimized intelligence training using a generation device based on the results of the analysis and presenting it to the user. This enables detailed monitoring of the health status of the elderly, improvement of cognitive function through designed intelligence training, and prevention of social isolation.

[0391] An "information transmission device" is a device that automatically connects to users based on a pre-set plan.

[0392] "User" refers to the individual who uses the system, particularly the elderly.

[0393] "Dialogue content" refers to the entire content of the conversation, including linguistic information, exchanged between the user and the system's terminal.

[0394] "Linguistic information" refers to text-based information converted using speech recognition technology.

[0395] "Cognitive function" refers to analytical methods and systems used to evaluate a user's cognitive abilities.

[0396] "Emotional state" refers to elements that analyze and judge the user's mental and emotional condition in real time.

[0397] "Analysis equipment" refers to all devices that convert dialogue content into linguistic information and evaluate cognitive functions and emotional states.

[0398] A "generating device" refers to a device or system that generates intelligence training optimized for the user based on analysis results.

[0399] "Intelligence training" refers to a set of programs and activities designed to improve, maintain, or enhance a user's cognitive functions and emotional state.

[0400] "Unusual changes" refer to health or emotional changes that deviate significantly from the user's normal state.

[0401] A "healthcare organization" refers to an agency or facility that receives notifications from the system and provides necessary medical responses and care.

[0402] An "information storage device" refers to a storage medium for accumulating generated linguistic information and analysis results.

[0403] A "learning device" refers to a device or system that updates a generative model using accumulated data to improve the accuracy of evaluations in subsequent interactions.

[0404] A "generative model" refers to a machine learning model used by a learning device to improve the accuracy of subsequent interactions.

[0405] This invention is a system for monitoring the health status of elderly people, combining an information transmission device, analysis equipment, generation equipment, an emotion engine, and related systems. In this system, a server acts as the central point, and each device and equipment works in coordination with it.

[0406] First, the server automatically establishes a connection with the user through an information transmission device based on a pre-configured plan. The user then engages in natural, everyday conversation using a terminal. During this dialogue process, the user's voice is captured by the terminal and sent to the server. The server then passes this voice data to an analysis device, which uses speech recognition technology to convert it into linguistic information. This process utilizes natural language processing algorithms to enable highly accurate conversion.

[0407] Next, the server uses the converted language information to analyze cognitive function and emotional state using an analysis device. The emotion engine determines the emotional tendency in real time from the user's voice tone and selected words. For example, if feelings of worry or anxiety are detected, that information is input into the generation device.

[0408] Based on the results of this emotion analysis, the generating device creates optimized intelligence training tailored to the user. For example, it may offer programs that play relaxing music or instructions for light exercise. This may include special programs aimed at stress reduction.

[0409] Furthermore, the server also has a function to notify medical organizations and relevant parties if any unusual changes are detected. This notification enables a swift and appropriate response.
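Since "unusual changes" are defined as changes that deviate significantly from the user's normal state, one way to make the check concrete is to compare the latest score with the user's own history. The sketch below uses a z-score test. This technique, the `is_unusual` function, and the threshold of 2.0 are assumptions and are not taken from the specification.

```python
from statistics import mean, stdev


def is_unusual(history, latest, z_threshold=2.0):
    """Flag `latest` when it lies far from the mean of the user's past scores."""
    if len(history) < 2:
        return False              # too little data to define a normal state
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu       # any change from a constant baseline is flagged
    return abs(latest - mu) / sigma > z_threshold


history = [0.30, 0.35, 0.32, 0.28, 0.31]   # past anxiety scores for one user
flag_small = is_unusual(history, 0.33)      # within the usual range
flag_large = is_unusual(history, 0.80)      # far above the usual range
```

Because the baseline comes from the individual's own history, the same score can be ordinary for one user and unusual for another.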

[0410] The server simultaneously saves past dialogue data and analysis results to a data storage device, and updates the generative AI model using a learning device. This update improves the accuracy of evaluations in subsequent dialogues, enabling more personalized support.

[0411] For example, if a user repeatedly mentions "I can't sleep lately," the emotion engine will determine their stress level, and the generating device will provide a corresponding relaxation program. Furthermore, relevant data is fed back to the learning device to improve its response to similar cases.

[0412] An example of a prompt to input into the generative AI model is as follows: "Please explain in detail how to analyze the user's voice data and determine their emotional state."

[0413] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0414] Step 1:

[0415] The server activates the information transmission device based on a pre-configured plan and automatically initiates a connection to the user. The server references connection schedule data as input and sends a connection request as output. Specifically, the server performs network connection control and establishes a session with the user terminal.

[0416] Step 2:

[0417] The device accepts the connection request and initiates a natural, everyday conversation with the user. The device receives connection information as input and generates voice data as output. Specifically, it facilitates smooth dialogue by asking questions such as, "How are you feeling today?"

[0418] Step 3:

[0419] The device sends audio data acquired through conversation to the server. It receives audio data as input and sends high-quality audio files to the server as output. Specifically, the device's built-in microphone records the user's voice and converts it into an appropriate data format.

[0420] Step 4:

[0421] The server uses analysis equipment to convert transmitted audio data into text-based language information. The input is audio data, and the output is text data. Specifically, speech recognition is performed on the audio and the result is converted into text, with natural language processing algorithms applied in the process.

[0422] Step 5:

[0423] The server analyzes linguistic information transcribed into text using analytical equipment to determine cognitive function and emotional state. It receives text data as input and generates cognitive and emotional evaluation reports as output. Specifically, the analytical algorithm analyzes the text and determines the user's mental state.

[0424] Step 6:

[0425] The server generates optimized intelligence training based on analysis results using generation equipment. It uses the analysis and evaluation report as input and provides an intelligence training program as output. Specifically, it selects an adaptive program and presents it to the user.

[0426] Step 7:

[0427] The server sends notifications to medical organizations and relevant parties if it detects unusual changes in the emotion analysis. The input is a detailed analysis report, and the output is a notification message. The specific operation involves sending warnings via email or application using a communication protocol.

[0428] Step 8:

[0429] The server stores conversation data and analysis results in a data storage device and updates the generative AI model through a learning device. The input is past conversation data and analysis results, and the output is the updated generative AI model. The specific operations involve database management and model update processing using machine learning algorithms.
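For the email notification mentioned in Step 7, a minimal sketch using Python's standard `email.message.EmailMessage` class is given below. The addresses, the user identifier, the subject format, and the `build_alert` function are placeholders introduced for illustration. The message is only composed here, not sent.

```python
from email.message import EmailMessage


def build_alert(recipient, user_id, summary):
    """Compose (but do not send) an alert email for a detected unusual change."""
    msg = EmailMessage()
    msg["From"] = "alerts@example.com"    # placeholder sender address
    msg["To"] = recipient
    msg["Subject"] = f"[Alert] Unusual change detected for user {user_id}"
    msg.set_content(summary)              # plain-text body taken from the analysis report
    return msg


alert = build_alert(
    "clinic@example.com",
    "U-001",
    "Emotion analysis flagged elevated anxiety in today's conversation.",
)
```

Actual delivery would require a mail server, for example via the standard `smtplib` module, and is omitted from this sketch.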

[0430] (Application Example 2)

[0431] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0432] In modern society, monitoring the health status of the elderly and preventing social isolation are crucial issues. Traditional methods make it difficult to ensure regular personal contact and assess health status, potentially causing the elderly to miss opportunities for appropriate care. Solving this problem and providing an environment where the elderly can live with peace of mind is essential.

[0433] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0434] In this invention, the server includes means for a communication device to automatically establish a communication connection with a target person based on a pre-set schedule; means for using an analysis device to convert conversation content into linguistic data and analyze cognitive function and emotional state during a conversation with the target person; and means for generating individually optimized intelligence training and relaxation content using a generation device based on the results of the analysis and presenting it to the target person. This makes it possible to continuously and effectively monitor the health status of elderly people, prevent social isolation, and provide appropriate care.

[0435] A "communication device" is a device that has the function of automatically establishing a communication connection with a target person based on a pre-set schedule.

[0436] An "analysis device" is a device that converts the content of conversations with a subject into linguistic data and analyzes their cognitive function and emotional state.

[0437] A "generation device" is a device that generates and presents individually optimized intelligence training and relaxation content to the target user based on the analysis results.

[0438] "Means for notifying relevant parties via communication means when an anomaly is detected" refers to a function that, when an anomaly is detected during analysis, notifies pre-designated relevant parties using communication means.

[0439] A "communication terminal" is a device that enables the operation of the aforementioned communication devices, analysis devices, and generation devices, and is a terminal for users to operate.

[0440] "Remote computing resources" refer to services and hardware that support the computational processing required by analytical devices, and usually refer to cloud services.

[0441] A "learning device" is a device that stores conversation content and analysis results, and updates the computational model to improve the accuracy of evaluation in subsequent dialogues.

[0442] An "intermediate device" is a device that has the function of promoting daily interaction among the subjects themselves and among their supervisors, and preventing psychological isolation.

[0443] The system for carrying out this invention consists of a communication device, an analysis device, a generation device, a learning device, and an intermediate device. Each of these devices is intended to monitor the health status of the subject and prevent social isolation.

[0444] The server plays a central role, coordinating with communication terminals to establish regular communication with the target individual. These communication terminals, installed on devices such as smartphones, act as a bridge between the target individual and the server. Once a conversation begins, the terminal captures the conversation as audio data and converts it into text data using the Google Cloud Speech-to-Text API. Next, an analysis device uses this text data for natural language processing to evaluate the target individual's cognitive function and emotional state. TensorFlow-based machine learning algorithms are used for the analysis.

[0445] Based on the analysis results, the server uses a generation device to create intelligence training and relaxation content optimized for the target individual. This content is presented on the target individual's smartphone. Furthermore, if an anomaly is detected, the server notifies pre-registered stakeholders using, for example, the Twilio API. Data is accumulated in a cloud database, and the learning device updates the generative AI model to improve accuracy in subsequent interactions.

[0446] For example, if data suggests that older adults tend to experience stress on weekends, the system will provide guided meditation content for relaxation. An example of a prompt for the generative AI model would be, "If data suggests that the user is particularly anxious on weekends, please suggest appropriate relaxation methods."

[0447] The introduction of this system will enable elderly people to receive care safely at home while maintaining social connections.

[0448] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0449] Step 1:

[0450] The server automatically establishes a communication connection with the target person via a communication terminal based on a pre-configured schedule. The input is schedule information, and the output is the establishment of the communication connection. In this process, the communication device sends a signal to the target person's smartphone, opening the communication line.

[0451] Step 2:

[0452] The device records the subject's conversation as audio data. The input is voice acquired through the microphone, and the output is digitized audio data. The device uses the microphone to collect voice and performs digital signal processing.

[0453] Step 3:

[0454] The server uses the Google Cloud Speech-to-Text API to convert audio data into text data. The input is audio data, and the output is text data. The server performs speech recognition via the API and saves the results to a database.

[0455] Step 4:

[0456] The server sends text data to the analysis device and performs natural language processing. The input is text data, and the output is the analysis result (cognitive function and emotional state). The analysis device uses TensorFlow to perform the analysis and returns the result.

[0457] Step 5:

[0458] The server generates intelligence training and relaxation content using a generation device based on the analysis results. The input is the analysis results, and the output is customized content. The generation device uses this data to create content suitable for the target audience.

[0459] Step 6:

[0460] The terminal presents the generated content on the user's device and facilitates interaction. The input is the generated content, and the output is what is displayed to the user. The device conveys the information through its screen and speakers.

[0461] Step 7:

[0462] The server stores analysis results and user responses in a cloud database, and the learning device updates the generative AI model for the next interaction. The input is the analysis results and user response data, and the output is the updated AI model. The learning device incorporates the new data into the model to improve accuracy.
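The data flow of Steps 3 to 7 can be sketched as a chain of interchangeable functions. This is an illustration only: the class names, the score scale, and the emotion labels are hypothetical, and the speech recognition, analysis, and generation stages are passed in as plain callables rather than tied to any particular API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Analysis:
    cognitive_score: float  # hypothetical 0..1 scale
    emotion: str            # hypothetical label such as "calm" or "anxious"


@dataclass
class Session:
    transcript: str
    analysis: Analysis
    content: str


@dataclass
class Pipeline:
    transcribe: Callable[[bytes], str]    # Step 3: audio data -> text data
    analyze: Callable[[str], Analysis]    # Step 4: text data -> analysis result
    generate: Callable[[Analysis], str]   # Step 5: analysis result -> content
    history: list[Session] = field(default_factory=list)  # Step 7: stored records

    def run(self, audio: bytes) -> Session:
        """Run one conversation turn through Steps 3 to 5 and store it (Step 7)."""
        text = self.transcribe(audio)
        result = self.analyze(text)
        content = self.generate(result)
        session = Session(text, result, content)
        self.history.append(session)
        return session
```

Passing the stages in as callables keeps the sketch independent of the concrete services named in the description, so a stub can stand in for each stage during testing.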

[0463] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0464] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0465] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0466] [Third Embodiment]

[0467] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0468] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0469] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0470] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0471] The microphone 238 receives instructions from the user 20 by capturing the voice of the user 20. The microphone 238 converts the captured voice into audio data and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0472] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0473] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0474] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0475] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0476] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0477] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0478] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0479] The present invention aims to efficiently monitor the health status of the elderly and prevent social isolation through a system combining a communication device, an analysis device, a generation device, and necessary means. Embodiments of this system are described below.

[0480] The server first uses a communication device to periodically connect with the elderly target person according to a pre-set schedule. This communication is conducted via voice or video call. Once the terminal connects to the target person, it initiates a conversation based on everyday topics and collects information about the target person's health status.

[0481] During a conversation, the audio data acquired by the terminal is continuously converted into language data by the server via an analysis device. This analysis device uses natural language processing technology to analyze the cognitive function and emotional state of the subject from their responses. For example, if the subject's statements show signs of emotional instability, the server reflects this in the analysis results.

[0482] Next, based on the analysis results, the server generates individually optimized intelligence training and brain training content through a generation device. This generated content is presented to the subject via a terminal, making it possible to implement optimal training tailored to the subject's current health condition.

[0483] Furthermore, if an anomaly is detected, the server automatically notifies healthcare facilities and relevant parties. This encourages early response and reduces health risks. For example, if signs of depression are detected during a conversation, the server will issue an emergency alert and coordinate with healthcare facilities to enable early intervention.

[0484] The collected data is securely stored on a server, and a learning device is used for subsequent analyses and model accuracy improvements. This enables more accurate evaluations and allows for the provision of higher-quality support to the target individuals.

[0485] In this way, this system aims to reduce the risk of lonely deaths and health problems by continuously monitoring the health status of elderly people and providing appropriate support.

[0486] The following describes the processing flow.

[0487] Step 1:

[0488] The server checks the communication schedule. The server retrieves the date and time of the next communication connection from each participant's profile stored in the database. This prepares the server for communication to take place according to the predetermined schedule.

[0489] Step 2:

[0490] The server automatically initiates a communication connection. At the specified time, the server initiates a voice or video call to the target person's phone number or IP address. In the case of a voice call, the connection is established over the internet using VoIP technology.

[0491] Step 3:

[0492] The device initiates a conversation with the target person. Once the target person accepts the communication, the device makes an opening greeting and starts a conversation based on everyday topics. Questions are asked based on a pre-prepared script to facilitate a natural conversation.

[0493] Step 4:

[0494] The user responds to questions from the AI agent. The user, as the subject, continues the conversation about everyday events and their own health in response to questions from the device. The device receives these responses in real time.

[0495] Step 5:

[0496] The server analyzes the response. The voice data sent from the terminal is converted into text data by the server using speech recognition technology. Then, cognitive function and emotional state are analyzed through natural language processing.

[0497] Step 6:

[0498] The server generates training content based on the analysis results. Using the generation device, it automatically generates brain training and intelligence training tailored to the subject's condition and determines what to provide to the user via the terminal.

[0499] Step 7:

[0500] The device provides training to the user. It presents the generated training to the target user and encourages them to complete it. For example, simple memory tests or quiz-style training may be conducted.

[0501] Step 8:

[0502] The server monitors for anomalies and sends notifications. If an anomaly is detected during the analysis process, the server immediately initiates communication to notify medical institutions and relevant parties. This communication may utilize email or SMS.

[0503] Step 9:

[0504] The server stores data and updates the learning device. By storing conversation content and analysis results in a database and continuously updating the AI model using the learning device, the accuracy of subsequent conversations and analyses is improved.
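The schedule check in Steps 1 and 2 of the flow above can be sketched as follows. The `Profile` fields and the fixed repeat interval are assumptions made for illustration; this disclosure states only that the date and time of the next connection is retrieved from each participant's profile.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Profile:
    name: str
    contact: str             # phone number or IP address used in Step 2
    next_contact: datetime   # date and time of the next connection (Step 1)
    interval: timedelta      # hypothetical repeat interval between calls


def due_profiles(profiles: list[Profile], now: datetime) -> list[Profile]:
    """Return the participants whose scheduled connection time has arrived."""
    return [p for p in profiles if p.next_contact <= now]


def reschedule(profile: Profile, now: datetime) -> datetime:
    """Advance `next_contact` past `now` in steps of `interval` and return it."""
    if profile.interval <= timedelta(0):
        raise ValueError("interval must be positive")
    while profile.next_contact <= now:
        profile.next_contact += profile.interval
    return profile.next_contact
```

In use, a server loop would call `due_profiles` at regular intervals, place a call to each returned participant, and then call `reschedule` so that the same participant is not contacted again until the next slot.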

[0505] (Example 1)

[0506] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0507] In modern society, monitoring the health status of the elderly and preventing social isolation are crucial issues. However, conventional technologies have made it difficult to grasp detailed health conditions in real time or to provide personalized support. Therefore, there is a need for early detection of health risks among the elderly and methods for providing appropriate support.

[0508] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0509] In this invention, the server includes means for communication devices to automatically establish a communication connection with a target person based on a pre-set plan; means for using an analysis device that converts voice data into language data and evaluates cognitive function and emotional state; means for generating individually optimized cognitive training using a generation device and presenting it to the target person; means for notifying medical institutions and relevant parties when an abnormality is detected; and means for accumulating collected data and using a learning device to improve the accuracy of the next analysis and evaluation model. This enables real-time monitoring of the health status of elderly people and the provision of appropriate support tailored to their individual conditions.

[0510] "Communication equipment" refers to a device that connects with a target person via voice or video communication and exchanges information with them.

[0511] "Plan" refers to a schedule or procedure set in advance for communication equipment to connect with a target person.

[0512] An "analysis device" is a device that converts audio data into linguistic data and uses it to evaluate cognitive function and emotional state.

[0513] A "generating device" is a device that generates and presents cognitive training optimized for the target individual based on analysis results.

[0514] "Abnormal" refers to dangerous signs that deviate from the norm based on the subject's health condition or conversation content, and indicates a state that requires early intervention.

[0515] A "learning device" is a device that uses collected data to improve the accuracy of subsequent analyses and evaluation models.

[0516] A "model" refers to the algorithms and structures used in data analysis and the generation of cognitive training.

[0517] This invention provides a system in which a server, terminal, and user work together to monitor the health status of elderly people and prevent social isolation.

[0518] The server uses communication equipment to automatically connect with the elderly user's device according to a pre-configured plan. Typically, a device with calling capabilities is used. This communication is conducted by processing voice data in real time, enabling smooth interaction with the user. Examples of prompts include, "What happened today?" and "Have you been feeling tired lately?" These questions initiate a natural conversation with the user.

[0519] The device engages in conversation with the user based on everyday topics and transmits the user's voice input to the server. The server converts the voice data into language data via analysis equipment and then uses a generative AI model to evaluate cognitive function and emotional state. This process utilizes natural language processing (NLP) technology. For example, from a user's statement, "I often have trouble sleeping," the system assesses whether there are signs of cognitive decline or emotional instability.

[0520] Based on the analysis results, the server uses a generation device to design individually optimized cognitive training. This training is tailored to the user's health condition and abilities and can be presented via a terminal. A specific example is a quiz-style training aimed at improving memory.

[0521] If an anomaly is detected, the server quickly notifies medical institutions and relevant parties, facilitating early response. This helps mitigate health risks. The collected data is stored in learning devices, and machine learning techniques are used to improve the accuracy of subsequent analyses and evaluation models. This supports long-term health management and forms the basis for further technological improvements.

[0522] The implementation of this system will be an essential means of monitoring the health of the elderly in their daily lives and ensuring they can live with peace of mind.

[0523] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0524] Step 1:

[0525] The server automatically connects to the target person's terminal via communication equipment, based on a pre-configured plan.

[0526] The inputs used are schedule information and contact data for the person to connect, and the output is the establishment of a voice or video call connection. Specifically, the server sends the prompt message "What happened today?" to the person and prepares to start the conversation.

[0527] Step 2:

[0528] The device conducts a conversation with the user and collects audio data.

[0529] The input consists of the user's responses to prompts sent from the server. The output is collected audio data. Specifically, the user speaks about their daily life, and the device records what they say.

[0530] Step 3:

[0531] The terminal sends the collected audio data to the server, which then uses analysis equipment to convert the audio data into language data.

[0532] The collected audio data is used as input, and text-based language data is generated as output. During this process, the server utilizes natural language processing technology. For example, the server analyzes the user's utterance, "I've been having trouble sleeping a lot lately," and evaluates their emotional state.

[0533] Step 4:

[0534] The server uses the analyzed language data to generate personalized cognitive training using a generative AI model.

[0535] Language data and past health data are used as input, and an optimized training program is output. Specifically, a quiz-style training program for improving memory is designed and prepared to be provided to the user.

[0536] Step 5:

[0537] The server sends the generated training content to the terminal, and the terminal presents the training to the user.

[0538] The input is a generated training program, and the output is the user receiving training instructions. Specifically, the device instructs the user to "start the next quiz" and then begins the quiz.

[0539] Step 6:

[0540] The user performs the training, and the device sends the results to the server.

[0541] The input consists of user training data and results, and the output is training result data sent to the server. Specifically, the user answers a quiz, and the device records the answers.

[0542] Step 7:

[0543] The server stores the training results that have been sent and, if necessary, notifies medical institutions and relevant parties if it detects any anomalies.

[0544] The input is training result data, and the output is an anomaly notification message, which is generated only when an anomaly is detected. The server automatically sends an alert when an anomaly is detected, prompting intervention.

[0545] Step 8:

[0546] The server stores the collected data in the learning device and updates the model to improve the accuracy of subsequent analyses.

[0547] The accumulated data is used as input, and an updated evaluation model is generated as output. Specifically, machine learning algorithms are applied to the accumulated data to update the evaluation model.
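Steps 6 and 7 of the flow above, in which quiz results are recorded and checked for anomalies, can be sketched as follows. The scoring rule and the drop threshold of 0.3 are hypothetical; this disclosure does not specify how an anomaly is determined from training results.

```python
from statistics import mean


def score_quiz(answers: list[str], expected: list[str]) -> float:
    """Return the fraction of quiz answers that match, ignoring case and outer spaces."""
    if not expected:
        raise ValueError("expected answers must not be empty")
    correct = sum(
        a.strip().lower() == e.strip().lower() for a, e in zip(answers, expected)
    )
    return correct / len(expected)


def is_anomalous(score: float, past_scores: list[float], drop: float = 0.3) -> bool:
    """Flag a result that falls more than `drop` below the mean of past results."""
    if not past_scores:
        return False  # no baseline yet, so nothing to compare against
    return mean(past_scores) - score > drop
```

Comparing against the user's own past results, rather than a fixed pass mark, is one way to reflect the individual optimization described in this example; it is an assumption of this sketch.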

[0548] (Application Example 1)

[0549] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0550] In modern society, the increasing elderly population necessitates continuous monitoring of their health. However, challenges remain, including the risk of social isolation among the elderly and the difficulty in responding quickly to sudden changes in their health. Furthermore, there is a need for a system that collects health information in a way that does not require complex operations from the elderly, and provides appropriate support.

[0551] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0552] In this invention, the server includes means for a communication unit to automatically establish a communication connection with a task target based on a pre-set schedule; means for using an analysis unit to convert the content of the dialogue with the task target into information data and analyze cognitive function and emotional state; means for generating individually optimized ability training using a generation unit based on the results of the analysis and presenting it to the task target; and means for presenting the optimized information training on a mobile terminal used by the task target. This makes it possible for elderly people to continuously monitor their health status and receive prompt support as needed without becoming socially isolated.

[0553] A "communication unit" is a device that automatically establishes a communication connection with the task target and operates according to a pre-set schedule.

[0554] "Task subjects" are individuals whose health status is being monitored, and these typically refer to elderly people.

[0555] "Information data" refers to data that records the content of interactions with the task target, including data that has been converted into language data.

[0556] An "analysis unit" is a device that analyzes acquired information data and determines the cognitive function and emotional state of the task subject.

[0557] A "generation unit" is a device that creates ability training suitable for the task target based on the analysis results, and plays a role in individual optimization.

[0558] "Mobile devices" refer to information and communication devices that the task target uses on a daily basis, including smartphones and tablets.

[0559] An "intermediate unit" is a device that facilitates interaction between task participants and between supervisors, with the aim of preventing social isolation.

[0560] The system for realizing this invention is comprised of a communication unit, an analysis unit, a generation unit, and a mobile terminal.

[0561] First, the server automatically establishes a communication connection with the user targeted for the task via a communication unit, based on a pre-configured schedule. This provides users with an environment where they can easily monitor their own health status.

[0562] Once a communication connection is established, the content of the conversation with the user is converted into information data and sent to the analysis unit. The analysis unit uses this information data to analyze the user's cognitive function and emotional state step by step. For example, natural language processing is used in the analysis to determine changes in emotion and signs of attention deficit from the user's dialogue. Specifically, detailed language analysis is possible by using speech recognition technologies such as Google Cloud Speech-to-Text and the Google Cloud Natural Language API.

[0563] Once the analysis results are obtained, the server utilizes a generation unit to generate ability training optimized for the user. The training content is presented on the user's mobile device, enabling more personalized support. This allows the user to take appropriate actions according to their own health condition.

[0564] For example, if an analysis reveals that a user is experiencing stress in their daily life, the generation unit will suggest relaxation exercises. In this case, a specific prompt might be: "A 70-year-old man has recently been having trouble sleeping at night. Perform an emotional analysis and suggest appropriate relaxation methods."
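As an illustration, the prompt quoted above could be assembled from profile fields and an observation drawn from the analysis results. The function name and its parameters are hypothetical; only the wording of the resulting prompt comes from the description.

```python
def build_relaxation_prompt(age: int, person: str, observation: str) -> str:
    """Compose a prompt in the form quoted in the description.

    `person` is a noun such as "man" or "woman"; `observation` is a clause
    such as "has recently been having trouble sleeping at night".
    """
    return (
        f"A {age}-year-old {person} {observation}. "
        "Perform an emotional analysis and suggest appropriate relaxation methods."
    )


# Reproduces the example prompt from the description.
prompt = build_relaxation_prompt(
    70, "man", "has recently been having trouble sleeping at night"
)
```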

[0565] Furthermore, if an anomaly is detected, the server automatically notifies medical institutions and relevant parties, enabling a swift response.

[0566] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0567] Step 1:

[0568] The server automatically establishes a communication connection to the target user based on a pre-configured schedule using a communication unit. The input is schedule data, and the output is the establishment of the communication connection.

[0569] Step 2:

[0570] After a communication connection is established, the terminal records the conversation with the user in real time and acquires audio data. The input is the user's voice, and the output is an audio data file.

[0571] Step 3:

[0572] The server sends the audio data to a speech recognition API such as Google Cloud Speech-to-Text, where it is converted into information data. The input is audio data, and the output is the converted text data.

[0573] Step 4:

[0574] The analysis unit analyzes the converted text data and performs natural language processing to evaluate the user's cognitive function and emotional state. The input is text data, and the output is the analysis result data.

[0575] Step 5:

[0576] The server generates ability training tailored to the user using a generation unit based on the analysis results. Specifically, it uses a generative AI model to determine appropriate exercises and training content. The input is the analysis results, and the output is data on the training content.

[0577] Step 6:

[0578] The terminal displays the generated training content on the user's mobile device, making it accessible to the user. The input is the training content data, and the output is the training content shown on the screen of the user's device.

[0579] Step 7:

[0580] The server automatically notifies medical institutions and relevant parties if an anomaly is detected. The input is anomaly detection data obtained from the analysis results, and the output is the transmission of a notification message.
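The notification in Step 7 can be sketched with a small dispatcher. The `Recipient` fields, the channel labels, and the message format are hypothetical, and the actual transmission is delegated to caller-supplied functions so that the sketch does not depend on any particular messaging service.

```python
from dataclasses import dataclass
from typing import Callable

Sender = Callable[[str, str], None]  # (address, message) -> None


@dataclass
class Recipient:
    name: str
    channel: str   # hypothetical channel label such as "email" or "sms"
    address: str


def notify(
    recipients: list[Recipient],
    subject_name: str,
    finding: str,
    senders: dict[str, Sender],
) -> list[str]:
    """Send one alert to every recipient whose channel has a registered sender.

    Returns the addresses that were actually notified.
    """
    message = f"Alert for {subject_name}: {finding}"
    sent = []
    for r in recipients:
        send = senders.get(r.channel)
        if send is None:
            continue  # no sender registered for this channel, so skip it
        send(r.address, message)
        sent.append(r.address)
    return sent
```

Returning the list of notified addresses lets the server record which medical institutions and relevant parties were reached for a given anomaly.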

[0581] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0582] The present invention aims to prevent social isolation by closely monitoring the health status of elderly individuals through a system combining a communication device, an analysis device, a generation device, an emotion engine, and necessary means. The embodiments of this system are described in detail below.

[0583] Based on a schedule set by the server, the communication device automatically initiates a connection to the target person. Once the device establishes a connection with the target person, interaction begins through casual conversation. This conversation proceeds at a comfortable pace and involves asking questions about the target person's daily life. In this way, the device provides a natural space for communication.

[0584] The audio data acquired by the terminal is sent to a server and converted into text data by an analysis device. The analysis device uses natural language processing technology to precisely analyze the user's cognitive function and emotional state. An emotion engine is integrated into this process, analyzing the user's emotions from their tone of voice and word choice, and determining their state in real time. This emotional state is a crucial element for a deeper understanding of the meaning of the conversation.

[0585] Based on the analysis results, the server uses a generation device to create intelligence training and brain training programs optimized for the individual, providing programs tailored to their specific needs. For example, if the emotion engine determines that the user is experiencing stress, training with a relaxing effect will be provided.

[0586] Furthermore, the server utilizes emotional state information obtained by the emotion engine to notify healthcare institutions and relevant parties if any unusual changes are detected. This notification includes the results of the analyzed emotional state, providing important information for healthcare professionals to take appropriate action.

[0587] The server stores conversation data and analysis results in a database, and the AI model is updated by a learning device. This improves the accuracy of the analysis, enabling the provision of more appropriate support in subsequent conversations. Through this process, the system constantly monitors the health status of elderly individuals in an optimal manner, thereby reducing the risk of lonely deaths and other health risks.

[0588] The following describes the processing flow.

[0589] Step 1:

[0590] The server checks the communication schedule. The server refers to the registered schedule for each participant and determines the date and time of the next communication. This ensures a timely connection.

[0591] Step 2:

[0592] The server initiates the communication connection. At the scheduled time, the server automatically places a voice or video call to the target person's phone number or IP address. Once the connection is established, the device begins conversing with the target person.

[0593] Step 3:

[0594] The device greets the target person and begins the conversation. Based on a pre-prepared script, the device asks questions about the target person's daily life, communicating in a natural flow.

[0595] Step 4:

[0596] The user answers questions from the device. The user responds by talking about their daily life and physical condition. The device acquires the audio data and sends it to the server.

[0597] Step 5:

[0598] The server converts the audio data into text. The analysis device uses speech recognition technology to convert the acquired audio data into text data and then performs natural language processing.

[0599] Step 6:

[0600] The server uses an emotion engine to perform emotion analysis. The emotion engine analyzes the converted text data and voice characteristics to evaluate the user's emotional state in real time.

[0601] Step 7:

[0602] The server generates intelligence training based on the analysis results. The generation device creates an individually optimized brain training program based on the analysis results and sends it to the terminal.

[0603] Step 8:

[0604] The device presents training to the user. By having the user complete the generated intelligence training program and providing appropriate feedback, it supports the maintenance and improvement of cognitive function.

[0605] Step 9:

[0606] The server detects and notifies of anomalies. If an anomaly is detected based on analysis results and the evaluation of the emotion engine, the server automatically notifies the relevant medical institutions and family members.

[0607] Step 10:

[0608] The server stores data and updates the model. By accumulating conversation data and analysis results, and updating the AI model using a learning device, the accuracy of subsequent analyses is improved.
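Step 1 of the flow above (determining the date and time of the next communication from a registered schedule) could be sketched as follows. The schedule representation, a set of weekdays plus a time of day, and the function `next_call_time` are assumptions made for illustration; the disclosure does not specify how schedules are stored.

```python
from datetime import datetime, time, timedelta


def next_call_time(now: datetime, weekdays: set[int], call_at: time) -> datetime:
    """Return the first scheduled slot strictly after `now`.

    `weekdays` uses datetime.weekday() numbering (Monday is 0).
    """
    if not weekdays:
        raise ValueError("schedule has no weekdays")
    # Today plus the next seven days covers every weekday at least once.
    for offset in range(8):
        day = now.date() + timedelta(days=offset)
        candidate = datetime.combine(day, call_at)
        if day.weekday() in weekdays and candidate > now:
            return candidate
    raise RuntimeError("no slot found; check that weekdays are in the range 0-6")


# Example: calls on Tuesdays and Fridays at 09:00, checked on a Tuesday at 10:00.
print(next_call_time(datetime(2024, 12, 10, 10, 0), {1, 4}, time(9, 0)))
```

Because the slot must be strictly later than `now`, a call whose time has already passed today rolls over to the next scheduled weekday.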

[0609] (Example 2)

[0610] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0611] There is a need to efficiently support the monitoring of the health status of the elderly and the prevention of social isolation. However, conventional methods have limitations in natural dialogue and precise analysis of emotional states, making it difficult to provide individualized intelligence training or to rapidly detect and notify abnormal conditions.

[0612] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0613] In this invention, the server includes means for an information transmission device to automatically connect to the user based on a pre-set plan, means for using an analysis device to convert dialogue content into linguistic information during interaction with the user and to analyze cognitive function and emotional state, and means for generating individually optimized intelligence training using a generation device based on the results of the analysis and presenting it to the user. This enables detailed monitoring of the health status of the elderly, improvement of cognitive function through designed intelligence training, and prevention of social isolation.

[0614] An "information transmission device" is a device that automatically connects to users based on a pre-set plan.

[0615] "User" refers to the individual who uses the system, particularly the elderly.

[0616] "Dialogue content" refers to the entire content of the conversation, including linguistic information, exchanged between the user and the system's terminal.

[0617] "Linguistic information" refers to text-based information converted using speech recognition technology.

[0618] "Cognitive function" refers to analytical methods and systems used to evaluate a user's cognitive abilities.

[0619] "Emotional state" refers to elements that analyze and judge the user's mental and emotional condition in real time.

[0620] "Analysis equipment" refers to all devices that convert dialogue content into linguistic information and evaluate cognitive functions and emotional states.

[0621] A "generating device" refers to a device or system that generates intelligence training optimized for the user based on analysis results.

[0622] "Intelligence training" refers to a set of programs and activities designed to improve, maintain, or enhance a user's cognitive functions and emotional state.

[0623] "Unusual changes" refer to health or emotional changes that deviate significantly from the user's normal state.

[0624] A "healthcare organization" refers to an agency or facility that receives notifications from the system and provides necessary medical responses and care.

[0625] An "information storage device" refers to a storage medium for accumulating generated linguistic information and analysis results.

[0626] A "learning device" refers to a device or system that updates a generative model using accumulated data to improve the accuracy of evaluations in subsequent interactions.

[0627] A "generative model" refers to a machine learning model used by a learning device to improve the accuracy of subsequent interactions.
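The definition of "unusual changes" above (changes that deviate significantly from the user's normal state) can be made concrete with a simple statistical check. The sketch below flags a score that lies more than a chosen number of standard deviations from the user's own history. The z-score rule, the threshold of 2.0, and the function name `is_unusual_change` are illustrative assumptions; the disclosure does not state how deviation is measured.

```python
from statistics import mean, stdev


def is_unusual_change(history: list[float], latest: float, threshold: float = 2.0) -> bool:
    """Flag `latest` when it lies more than `threshold` standard deviations
    away from the mean of the user's own past scores."""
    if len(history) < 2:
        return False  # too little data to define a "normal state"
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu  # any departure from a perfectly flat history counts
    return abs(latest - mu) / sigma > threshold


baseline = [0.20, 0.30, 0.25, 0.30, 0.20]  # hypothetical daily stress scores
print(is_unusual_change(baseline, 0.90))   # prints True
print(is_unusual_change(baseline, 0.27))   # prints False
```

Comparing against the user's own history, rather than a population norm, matches the definition's reference to the user's normal state.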

[0628] This invention is a system for monitoring the health status of elderly people, combining an information transmission device, analysis equipment, generation equipment, an emotion engine, and related systems. In this system, a server acts as the central point, and each device and equipment works in coordination with it.

[0629] First, the server automatically establishes a connection with the user through an information transmission device based on a pre-configured plan. The user then engages in natural, everyday conversation using a terminal. During this dialogue process, the user's voice is captured by the terminal and sent to the server. The server then passes this voice data to an analysis device, which uses speech recognition technology to convert it into linguistic information. This process utilizes natural language processing algorithms to enable highly accurate conversion.

[0630] Next, the server uses the converted linguistic information to analyze cognitive function and emotional state using an analysis device. The emotion engine determines the emotional tendency in real time from the user's voice tone and selected words. For example, if feelings of worry or anxiety are detected, that information is input into the generation device.

[0631] Based on the results of this emotion analysis, the generating device creates optimized intelligence training tailored to the user. For example, it may offer programs that play relaxing music or instructions for light exercise. This may include special programs aimed at stress reduction.

[0632] Furthermore, the server also has a function to notify medical organizations and relevant parties if any unusual changes are detected. This notification enables a swift and appropriate response.

[0633] The server simultaneously saves past dialogue data and analysis results to a data storage device, and updates the generative AI model using a learning device. This update improves the accuracy of evaluations in subsequent dialogues, enabling more personalized support.

[0634] For example, if a user repeatedly mentions "I can't sleep lately," the emotion engine will determine their stress level, and the generating device will provide a corresponding relaxation program. Furthermore, relevant data is fed back to the learning device to improve its response to similar cases.
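The "repeatedly mentions" condition in [0634] could be approximated as follows. This sketch counts how many stored transcripts contain a phrase and triggers a relaxation program once a minimum number of sessions is reached. Plain substring matching is a stand-in for the emotion engine, and the function names and the two-session threshold are assumptions.

```python
def count_sessions_mentioning(transcripts: list[str], phrase: str) -> int:
    """Count how many conversation transcripts contain the phrase (case-insensitive)."""
    needle = phrase.lower()
    return sum(1 for text in transcripts if needle in text.lower())


def needs_relaxation_program(
    transcripts: list[str], phrase: str = "can't sleep", min_sessions: int = 2
) -> bool:
    """Treat a complaint as repeated once it appears in at least `min_sessions` transcripts."""
    return count_sessions_mentioning(transcripts, phrase) >= min_sessions


sessions = [
    "I can't sleep lately.",
    "The weather was nice today.",
    "Lately I can't sleep at all.",
]
print(needs_relaxation_program(sessions))  # prints True
```

Counting sessions rather than raw occurrences avoids triggering on a single conversation in which the user repeats the same sentence.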

[0635] An example of a prompt to input into the generative AI model is as follows: "Please explain in detail how to analyze the user's voice data and determine their emotional state."

[0636] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0637] Step 1:

[0638] The server activates the information transmission device based on a pre-configured plan and automatically initiates a connection to the user. The server references connection schedule data as input and sends a connection request as output. Specifically, the server performs network connection control and establishes a session with the user terminal.

[0639] Step 2:

[0640] The device accepts the connection request and initiates a natural, everyday conversation with the user. The device receives connection information as input and generates voice data as output. Specifically, it facilitates smooth dialogue by asking questions such as, "How are you feeling today?"

[0641] Step 3:

[0642] The device sends audio data acquired through conversation to the server. It receives audio data as input and sends high-quality audio files to the server as output. Specifically, the device's built-in microphone records the user's voice and converts it into an appropriate data format.

[0643] Step 4:

[0644] The server uses analysis equipment to convert transmitted audio data into text-based language information. The input is audio data, and the output is text data. Specifically, the analysis equipment performs speech recognition on the audio data and converts the recognized speech into text.

[0645] Step 5:

[0646] The server analyzes linguistic information transcribed into text using analytical equipment to determine cognitive function and emotional state. It receives text data as input and generates cognitive and emotional evaluation reports as output. Specifically, the analytical algorithm analyzes the text and determines the user's mental state.

[0647] Step 6:

[0648] The server generates optimized intelligence training based on analysis results using generation equipment. It uses the analysis and evaluation report as input and provides an intelligence training program as output. Specifically, it selects an adaptive program and presents it to the user.

[0649] Step 7:

[0650] The server sends notifications to medical organizations and relevant parties if the emotion analysis reveals unusual changes. The input is a detailed analysis report, and the output is a notification message. The specific operation involves sending warnings via email or an application using a communication protocol.

[0651] Step 8:

[0652] The server stores conversation data and analysis results in a data storage device and updates the generative AI model through a learning device. The input is past conversation data and analysis results, and the output is the updated generative AI model. The specific operations involve database management and model update processing using machine learning algorithms.
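The input and output of Steps 4 through 8 above can be chained as in the following sketch. Every function here is a stub: `transcribe` treats the demo audio as UTF-8 text instead of running speech recognition, and `evaluate` uses a keyword check instead of a real analysis model. All names and labels are hypothetical and serve only to show how each step's output feeds the next.

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationReport:
    text: str
    emotion: str            # hypothetical labels: "anxious" or "calm"
    unusual_change: bool


@dataclass
class SessionResult:
    report: EvaluationReport
    training: str
    notifications: list[str] = field(default_factory=list)


def transcribe(audio: bytes) -> str:
    # Stand-in for speech recognition: the demo "audio" is UTF-8 text.
    return audio.decode("utf-8")


def evaluate(text: str) -> EvaluationReport:
    # Stand-in for the analysis equipment: a keyword check, not a real model.
    anxious = "worried" in text.lower()
    return EvaluationReport(text, "anxious" if anxious else "calm", anxious)


def generate_training(report: EvaluationReport) -> str:
    return "relaxation program" if report.emotion == "anxious" else "memory quiz"


def run_session(audio: bytes, store: list[SessionResult]) -> SessionResult:
    report = evaluate(transcribe(audio))                       # Steps 4 and 5
    result = SessionResult(report, generate_training(report))  # Step 6
    if report.unusual_change:                                  # Step 7
        result.notifications.append("notify medical organization and relevant parties")
    store.append(result)            # Step 8 (storage only; no model update here)
    return result


history: list[SessionResult] = []
print(run_session(b"I am worried about my knee", history).training)
```

Passing the store in as an argument keeps the sketch free of global state, so each step can be replaced by a real component independently.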

[0653] (Application Example 2)

[0654] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0655] In modern society, monitoring the health status of the elderly and preventing social isolation are crucial issues. Traditional methods make it difficult to ensure regular personal contact and assess health status, potentially causing the elderly to miss opportunities for appropriate care. Solving this problem and providing an environment where the elderly can live with peace of mind is essential.

[0656] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0657] In this invention, the server includes means for a communication device to automatically establish a communication connection with a target person based on a pre-set schedule; means for using an analysis device to convert conversation content into linguistic data and analyze cognitive function and emotional state during a conversation with the target person; and means for generating individually optimized intelligence training and relaxation content using a generation device based on the results of the analysis and presenting it to the target person. This makes it possible to continuously and effectively monitor the health status of elderly people, prevent social isolation, and provide appropriate care.

[0658] A "communication device" is a device that has the function of automatically establishing a communication connection with a target person based on a pre-set schedule.

[0659] An "analysis device" is a device that converts the content of conversations with a subject into linguistic data and analyzes their cognitive function and emotional state.

[0660] A "generation device" is a device that generates and presents individually optimized intelligence training and relaxation content to the target user based on the analysis results.

[0661] "Means for notifying relevant parties via communication means when an anomaly is detected" refers to a function that, when an anomaly is detected during analysis, notifies pre-designated relevant parties using communication means.

[0662] A "communication terminal" is a device that enables the operation of the aforementioned communication devices, analysis devices, and generation devices, and is a terminal for users to operate.

[0663] "Remote computing resources" refer to services and hardware that support the computational processing required by analytical devices, and usually refer to cloud services.

[0664] A "learning device" is a device that stores conversation content and analysis results, and updates the computational model to improve the accuracy of evaluation in subsequent dialogues.

[0665] An "intermediate device" is a device that has the function of promoting daily interaction among the subjects themselves and among their supervisors, and preventing psychological isolation.

[0666] The system for carrying out this invention consists of a communication device, an analysis device, a generation device, a learning device, and an intermediate device. Each of these devices is intended to monitor the health status of the subject and prevent social isolation.

[0667] The server plays a central role, coordinating with communication terminals to establish regular communication with the target individual. These communication terminals, installed on devices such as smartphones, act as a bridge between the target individual and the server. Once a conversation begins, the terminal captures the conversation as audio data and converts it into text data using the Google Cloud Speech-to-Text API. Next, an analysis device uses this text data for natural language processing to evaluate the target individual's cognitive function and emotional state. TensorFlow-based machine learning algorithms are used for the analysis.

[0668] Based on the analysis results, the server uses a generation device to create intelligence training and relaxation content optimized for the target individual. This content is presented on the target individual's smartphone. Furthermore, if an anomaly is detected, the server notifies pre-registered stakeholders using the Twilio API or the like. Data is accumulated in a cloud database, and the learning device updates the generative AI model to improve accuracy in subsequent interactions.

[0669] For example, if data suggests that older adults tend to experience stress on weekends, the system will provide guided meditation content for relaxation. An example of a prompt for the generative AI model would be, "If data suggests that the user is particularly anxious on weekends, please suggest appropriate relaxation methods."
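One way to obtain the "stress on weekends" signal mentioned in [0669] is to compare average scores on weekends and weekdays. The sketch below assumes that a per-day stress score is already available; the score scale, the data layout, and the function `weekend_stress_gap` are hypothetical.

```python
from datetime import date
from statistics import mean


def weekend_stress_gap(scores: dict[date, float]) -> float:
    """Return mean weekend stress minus mean weekday stress.

    A positive gap means the weekend scores are higher on average.
    """
    weekend = [s for d, s in scores.items() if d.weekday() >= 5]  # Saturday, Sunday
    weekday = [s for d, s in scores.items() if d.weekday() < 5]
    if not weekend or not weekday:
        raise ValueError("need at least one weekend and one weekday score")
    return mean(weekend) - mean(weekday)


scores = {
    date(2024, 12, 13): 0.2,  # Friday
    date(2024, 12, 14): 0.8,  # Saturday
    date(2024, 12, 15): 0.6,  # Sunday
    date(2024, 12, 16): 0.4,  # Monday
}
if weekend_stress_gap(scores) > 0.2:  # hypothetical threshold
    print("suggest guided meditation content on weekends")
```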

[0670] The introduction of this system will enable elderly people to receive care safely at home while maintaining social connections.

[0671] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0672] Step 1:

[0673] The server automatically establishes a communication connection with the target person via a communication terminal based on a pre-configured schedule. The input is schedule information, and the output is the establishment of the communication connection. In this process, the communication device sends a signal to the target person's smartphone, opening the communication line.

[0674] Step 2:

[0675] The device records the subject's conversation as audio data. The input is voice acquired through the microphone, and the output is digitized audio data. The device uses the microphone to collect voice and performs digital signal processing.

[0676] Step 3:

[0677] The server uses the Google Cloud Speech-to-Text API to convert audio data into text data. The input is audio data, and the output is text data. The server performs speech recognition via the API and saves the results to a database.

[0678] Step 4:

[0679] The server sends text data to the analysis device and performs natural language processing. The input is text data, and the output is the analysis result (cognitive function and emotional state). The analysis device uses TensorFlow to perform the analysis and returns the result.

[0680] Step 5:

[0681] The server generates intelligence training and relaxation content using a generation device based on the analysis results. The input is the analysis results, and the output is customized content. The generation device uses this data to create content suitable for the target audience.

[0682] Step 6:

[0683] The device presents the generated content to the user and facilitates interaction. The input is the generated content, and the output is what is displayed to the user. The device transmits information through its screen and speakers.

[0684] Step 7:

[0685] The server stores analysis results and user responses in a cloud database, and the learning device updates the generative AI model for the next interaction. The input is the analysis results and user response data, and the output is the updated AI model. The learning device incorporates the new data into the model to improve accuracy.
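The idea in Step 7 of incorporating new data into the model can be shown with a deliberately tiny stand-in: a running mean of a user's emotion score that is updated after each interaction. This is not the TensorFlow-based model mentioned in the text; the class `BaselineModel` and its fields are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class BaselineModel:
    """A minimal 'model': the running mean of a user's emotion score."""
    count: int = 0
    mean: float = 0.0

    def update(self, score: float) -> None:
        # Incremental mean: new_mean = old_mean + (score - old_mean) / n
        self.count += 1
        self.mean += (score - self.mean) / self.count


model = BaselineModel()
for score in [0.2, 0.4, 0.6]:  # hypothetical scores from three interactions
    model.update(score)
print(model.count, round(model.mean, 3))
```

The incremental form needs only the previous mean and count, so past conversations do not have to be reloaded for each update.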

[0686] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0687] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0688] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0689] [Fourth Embodiment]

[0690] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0691] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0692] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0693] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0694] The microphone 238 receives instructions and other input from the user 20 by capturing the voice of the user 20. The microphone 238 converts the captured voice into audio data and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0695] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0696] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0697] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
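The relationship in [0697] between an emotion and the controlled object 443 (eye LEDs and limb motors) can be pictured as a lookup from an emotion label to actuator targets. The labels, colors, angles, and the names `Expression` and `express` below are hypothetical; the disclosure does not specify concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    eye_led_rgb: tuple[int, int, int]  # eye LED color
    arm_angle_deg: int                 # target angle for the arm motors


# Hypothetical presets; the disclosure does not specify colors or angles.
EXPRESSIONS = {
    "joy": Expression((255, 200, 0), 60),
    "sadness": Expression((0, 0, 255), -20),
    "neutral": Expression((255, 255, 255), 0),
}


def express(emotion: str) -> Expression:
    """Look up the LED and motor targets for an emotion, defaulting to neutral."""
    return EXPRESSIONS.get(emotion, EXPRESSIONS["neutral"])


print(express("joy"))
```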

[0698] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0699] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0700] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0701] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0702] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0703] The present invention aims to efficiently monitor the health status of the elderly and prevent social isolation through a system combining a communication device, an analysis device, a generation device, and necessary means. Embodiments of this system are described below.

[0704] The server first uses a communication device to periodically connect with the elderly target person according to a pre-set schedule. This communication is conducted via voice or video call. Once the terminal connects to the target person, it initiates a conversation based on everyday topics and collects information about the target person's health status.

[0705] During a conversation, the audio data acquired by the device is constantly converted into language data by a server via an analysis device. This analysis device uses natural language processing technology to analyze the cognitive function and emotional state of the subject from their responses. For example, if the subject's statements show signs of emotional instability, the server reflects this in the analysis results.

[0706] Next, based on the analysis results, the server generates individually optimized intelligence training and brain training content through a generation device. This generated content is presented to the subject via a terminal, making it possible to implement optimal training tailored to the subject's current health condition.

[0707] Furthermore, if an anomaly is detected, the server automatically notifies healthcare facilities and relevant parties. This encourages early response and reduces health risks. For example, if signs of depression are detected during a conversation, the server will issue an emergency alert and coordinate with healthcare facilities to enable early intervention.

[0708] The collected data is securely stored on a server, and a learning device is used for subsequent analyses and model accuracy improvements. This enables more accurate evaluations and allows for the provision of higher-quality support to the target individuals.

[0709] In this way, this system aims to reduce the risk of lonely deaths and health problems by continuously monitoring the health status of elderly people and providing appropriate support.

[0710] The following describes the processing flow.

[0711] Step 1:

[0712] The server checks the communication schedule. The server retrieves the date and time of the next communication connection from each participant's profile stored in the database. This prepares the server for communication to take place according to the predetermined schedule.

[0713] Step 2:

[0714] The server automatically initiates a communication connection. At the specified time, the server initiates a voice or video call to the target person's phone number or IP address. In the case of a voice call, the connection is established over the internet using VoIP technology.

[0715] Step 3:

[0716] The device initiates a conversation with the target person. Once the target person accepts the communication, the device makes an opening greeting and starts a conversation based on everyday topics. Questions are asked based on a pre-prepared script to facilitate a natural conversation.

[0717] Step 4:

[0718] The user responds to questions from the AI agent. The user, as the subject, continues the conversation about everyday events and their own health in response to questions from the device. The device receives these responses in real time.

[0719] Step 5:

[0720] The server analyzes the response. The voice data sent from the terminal is converted into text data by the server using speech recognition technology. Then, cognitive function and emotional state are analyzed through natural language processing.

[0721] Step 6:

[0722] The server generates training content based on the analysis results. Using the generation device, it automatically generates brain training and intelligence training tailored to the subject's condition and determines what to provide to the user via the terminal.

[0723] Step 7:

[0724] The device provides training to the user. It presents the generated training to the target user and encourages them to complete it. For example, simple memory tests or quiz-style training may be conducted.

[0725] Step 8:

[0726] The server monitors for anomalies and sends notifications. If an anomaly is detected during the analysis process, the server immediately initiates communication to notify medical institutions and relevant parties. This communication may utilize email or SMS.

[0727] Step 9:

[0728] The server stores data and updates the learning device. By storing conversation content and analysis results in a database and continuously updating the AI model using the learning device, the accuracy of subsequent conversations and analyses is improved.
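The notification in Step 8 above, which may use email or SMS, could be dispatched per contact as in the following sketch. The sender functions are injected as plain callables so that no real email or SMS service is assumed; the contact layout and the function `notify_contacts` are hypothetical.

```python
from typing import Callable

Sender = Callable[[str, str], None]  # (address, message) -> None


def notify_contacts(
    contacts: list[dict], message: str, senders: dict[str, Sender]
) -> list[str]:
    """Send `message` to each contact over its channel and return the addresses reached."""
    reached = []
    for contact in contacts:
        send = senders.get(contact["channel"])
        if send is None:
            continue  # unknown channel: skip rather than fail the whole notification
        send(contact["address"], message)
        reached.append(contact["address"])
    return reached


# Demo senders that only record what would have been sent.
outbox: list[tuple[str, str, str]] = []
demo_senders = {
    "email": lambda address, text: outbox.append(("email", address, text)),
    "sms": lambda address, text: outbox.append(("sms", address, text)),
}
demo_contacts = [
    {"channel": "email", "address": "clinic@example.com"},
    {"channel": "sms", "address": "+810000000000"},
]
print(notify_contacts(demo_contacts, "Anomaly detected", demo_senders))
```

Injecting the senders keeps the dispatch logic testable and leaves the choice of delivery service open.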

[0729] (Example 1)

[0730] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0731] In modern society, monitoring the health status of the elderly and preventing social isolation are crucial issues. However, conventional technologies have made it difficult to grasp detailed health conditions in real time or to provide personalized support. Therefore, there is a need for early detection of health risks among the elderly and methods for providing appropriate support.

[0732] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0733] In this invention, the server includes means for communication devices to automatically establish a communication connection with a target person based on a pre-set plan; means for using an analysis device that converts voice data into language data and evaluates cognitive function and emotional state; means for generating individually optimized cognitive training using a generation device and presenting it to the target person; means for notifying medical institutions and relevant parties when an abnormality is detected; and means for accumulating collected data and using a learning device to improve the accuracy of the next analysis and evaluation model. This enables real-time monitoring of the health status of elderly people and the provision of appropriate support tailored to their individual conditions.

[0734] "Communication equipment" refers to a device that connects with a target person via voice or video communication and exchanges information with them.

[0735] "Plan" refers to a schedule or procedure set in advance for communication equipment to connect with a target person.

[0736] An "analysis device" is a device that converts audio data into linguistic data and uses it to evaluate cognitive function and emotional state.

[0737] A "generating device" is a device that generates and presents cognitive training optimized for the target individual based on analysis results.

[0738] "Abnormal" refers to dangerous signs that deviate from the norm based on the subject's health condition or conversation content, and indicates a state that requires early intervention.

[0739] A "learning device" is a device that uses collected data to improve the accuracy of subsequent analyses and evaluation models.

[0740] A "model" refers to the algorithms and structures used in data analysis and the generation of cognitive training.

[0741] This invention provides a system in which a server, terminal, and user work together to monitor the health status of elderly people and prevent social isolation.

[0742] The server uses communication equipment to automatically connect with the elderly user's device according to a pre-configured plan. Typically, a device with calling capabilities is used. This communication is conducted by processing voice data in real time, enabling smooth interaction with the user. Examples of prompts include, "What happened today?" and "Have you been feeling tired lately?" These questions initiate a natural conversation with the user.

[0743] The device engages in conversation with the user based on everyday topics and transmits the user's voice input to the server. The server converts the voice data into language data via the analysis device and then uses a generative AI model to evaluate cognitive function and emotional state. This process utilizes natural language processing (NLP) technology. For example, from a user's statement, "I often have trouble sleeping," the system assesses possible signs of cognitive decline and emotional instability.

[0744] Based on the analysis results, the server uses a generation device to design individually optimized cognitive training. This training is tailored to the user's health condition and abilities and can be presented via a terminal. A specific example is a quiz-style training aimed at improving memory.
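As a minimal sketch of the quiz-style memory training mentioned above, the following Python code builds a word-recall quiz whose length depends on the user's previous accuracy, and scores the answers. The word pool, the range of three to six words, and the function names `build_recall_quiz` and `score_recall` are illustrative assumptions. The specification itself describes this step as being performed by a generative AI model and does not disclose a concrete quiz format.

```python
import random
from typing import Optional

# Hypothetical word pool; a real system would presumably draw on generated content.
WORD_POOL = ["apple", "train", "river", "clock", "garden", "letter", "bridge", "candle"]


def build_recall_quiz(last_accuracy: float, seed: Optional[int] = None) -> dict:
    """Build a word-recall quiz with 3 to 6 words, scaled by previous accuracy (0.0 to 1.0)."""
    if not 0.0 <= last_accuracy <= 1.0:
        raise ValueError("last_accuracy must be between 0.0 and 1.0")
    n_words = 3 + round(last_accuracy * 3)
    rng = random.Random(seed)  # seeded generator keeps the quiz reproducible
    words = rng.sample(WORD_POOL, n_words)
    return {
        "instruction": "Please remember these words and repeat them back.",
        "words": words,
    }


def score_recall(quiz: dict, answers: list) -> float:
    """Return the fraction of quiz words that the user recalled (order is ignored)."""
    expected = set(quiz["words"])
    recalled = {a.strip().lower() for a in answers} & expected
    return len(recalled) / len(expected)
```

The returned score could serve as the `last_accuracy` input for the next session, so that the quiz becomes longer as the user improves and shorter when recall drops.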

[0745] If an anomaly is detected, the server quickly notifies medical institutions and relevant parties, facilitating early response. This helps mitigate health risks. The collected data is stored in learning devices, and machine learning techniques are used to improve the accuracy of subsequent analyses and evaluation models. This supports long-term health management and forms the basis for further technological improvements.
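The specification does not state the criterion by which an anomaly is detected. One possible approach, shown here purely as an assumption, is to compare the latest training score with the same user's own history and flag a drop of more than a chosen number of standard deviations. The function name `is_anomalous`, the threshold of 2.0, and the minimum of five sessions are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history: list, latest: float,
                 z_threshold: float = 2.0, min_sessions: int = 5) -> bool:
    """Flag `latest` when it falls well below the user's own past scores.

    Returns False until enough sessions exist to form a baseline.
    """
    if len(history) < min_sessions:
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past scores were identical, so any drop is treated as notable.
        return latest < mu
    return (mu - latest) / sigma >= z_threshold
```

Using a per-user baseline rather than a fixed cutoff reflects the emphasis on individually tailored evaluation, although the actual system may rely on a learned model instead.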

[0746] The implementation of this system will be an essential means of monitoring the health of the elderly in their daily lives and ensuring they can live with peace of mind.

[0747] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0748] Step 1:

[0749] The server automatically connects to the target person's terminal via communication equipment, based on a pre-configured plan.

[0750] The inputs used are schedule information and contact data for the target person, and the output is the establishment of a voice or video call connection. Specifically, the server sends the prompt message "What happened today?" to the person and prepares to start the conversation.

[0751] Step 2:

[0752] The device conducts a conversation with the user and collects audio data.

[0753] The input consists of the user's responses to prompts sent from the server. The output is collected audio data. Specifically, the user speaks about their daily life, and the device records what they say.

[0754] Step 3:

[0755] The terminal sends the collected audio data to the server, which then uses analysis equipment to convert the audio data into language data.

[0756] The collected audio data is used as input, and text-based language data is generated as output. During this process, the server utilizes natural language processing technology. The server analyzes the user's utterance, "I've been having trouble sleeping a lot lately," and evaluates their emotional state.

[0757] Step 4:

[0758] The server uses the analyzed language data to generate personalized cognitive training using a generative AI model.

[0759] Language data and past health data are used as input, and an optimized training program is output. Specifically, a quiz-style training program for improving memory is designed and prepared to be provided to the user.

[0760] Step 5:

[0761] The server sends the generated training content to the terminal, and the terminal presents the training to the user.

[0762] The input is a generated training program, and the output is the user receiving training instructions. Specifically, the device instructs the user to "start the next quiz" and then begins the quiz.

[0763] Step 6:

[0764] The user performs the training, and the device sends the results to the server.

[0765] The input consists of user training data and results, and the output is training result data sent to the server. Specifically, the user answers a quiz, and the device records the answers.

[0766] Step 7:

[0767] The server stores the training results that have been sent and notifies medical institutions and relevant parties if it detects an anomaly.

[0768] The input is training result data, and the output is an anomaly notification message when an anomaly is found. The server automatically sends an alert when an anomaly is detected, prompting intervention.

[0769] Step 8:

[0770] The server stores the collected data in the learning device and updates the model to improve the accuracy of subsequent analyses.

[0771] The accumulated data is used as input, and an updated evaluation model is generated as output. Specific operations include the application of machine learning algorithms to refine the evaluation model.
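Step 8 above does not specify the learning algorithm. As a simple stand-in, the sketch below keeps a per-user running mean and standard deviation of scores using Welford's online update, so that each new session refines the baseline without re-reading stored history. The class name `RunningBaseline` and its fields are hypothetical, and this is far simpler than the model update the specification has in mind.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningBaseline:
    """Incrementally updated mean and variance (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one new score into the baseline."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def stdev(self) -> float:
        """Sample standard deviation; 0.0 until at least two values are seen."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))
```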

[0772] (Application Example 1)

[0773] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0774] In modern society, the increasing elderly population necessitates continuous monitoring of their health. However, challenges remain, including the risk of social isolation among the elderly and the difficulty in responding quickly to sudden changes in their health. Furthermore, there is a need for a system that collects health information in a way that does not require complex operations from the elderly, and provides appropriate support.

[0775] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0776] In this invention, the server includes means for a communication unit to automatically establish a communication connection with a task target based on a pre-set schedule; means for using an analysis unit to convert the content of the dialogue with the task target into information data and analyze cognitive function and emotional state; means for generating individually optimized ability training using a generation unit based on the results of the analysis and presenting it to the task target; and means for presenting the optimized ability training on a mobile terminal used by the task target. This makes it possible for elderly people to continuously monitor their health status and receive prompt support as needed without becoming socially isolated.

[0777] A "communication unit" is a device that automatically establishes a communication connection with the task target and operates according to a pre-set schedule.

[0778] A "task target" is an individual whose health status is being monitored, typically an elderly person.

[0779] "Information data" refers to data that records the content of interactions with the task target, including data that has been converted into language data.

[0780] An "analysis unit" is a device that analyzes acquired information data and determines the cognitive function and emotional state of the task subject.

[0781] A "generation unit" is a device that creates ability training suitable for the task target based on the analysis results, and plays a role in individual optimization.

[0782] "Mobile devices" refer to information and communication devices that the task target uses on a daily basis, including smartphones and tablets.

[0783] An "intermediate unit" is a device that facilitates interaction between task participants and between supervisors, with the aim of preventing social isolation.

[0784] The system for realizing this invention is comprised of a communication unit, an analysis unit, a generation unit, and a mobile terminal.

[0785] First, the server automatically establishes a communication connection with the user targeted for the task via a communication unit, based on a pre-configured schedule. This provides users with an environment where they can easily monitor their own health status.

[0786] Once a communication connection is established, the content of the conversation with the user is converted into information data and sent to the analysis unit. The analysis unit uses this information data to analyze the user's cognitive function and emotional state step by step. For example, natural language processing is used in the analysis to determine changes in emotion and signs of attention deficit from the user's dialogue. Specifically, detailed language analysis is possible by combining a speech recognition service such as Google Cloud Speech-to-Text with a language analysis service such as the Google Cloud Natural Language API.

[0787] Once the analysis results are obtained, the server utilizes a generation unit to generate ability training optimized for the user. The training content is presented on the user's mobile device, enabling more personalized support. This allows the user to take appropriate actions according to their own health condition.

[0788] For example, if an analysis reveals that a user is experiencing stress in their daily life, the generation unit will suggest relaxation exercises. In this case, a specific prompt might be: "A 70-year-old man has recently been having trouble sleeping at night. Perform an emotional analysis and suggest appropriate relaxation methods."
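The example prompt above can be assembled from analysis results with a small template. The following sketch is only an illustration: the `AnalysisSummary` type, its fields, and the function name `build_relaxation_prompt` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class AnalysisSummary:
    """Hypothetical container for the facts that go into the prompt."""
    age: int
    gender: str        # e.g. "man" or "woman"
    observation: str   # e.g. "has recently been having trouble sleeping at night"


def build_relaxation_prompt(summary: AnalysisSummary) -> str:
    """Fill the prompt template with the user's profile and the observed issue."""
    return (
        f"A {summary.age}-year-old {summary.gender} {summary.observation}. "
        "Perform an emotional analysis and suggest appropriate relaxation methods."
    )
```

With `AnalysisSummary(70, "man", "has recently been having trouble sleeping at night")`, the function reproduces the example prompt quoted in the text.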

[0789] Furthermore, if an anomaly is detected, the server automatically notifies medical institutions and relevant parties, enabling a swift response.

[0790] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0791] Step 1:

[0792] The server automatically establishes a communication connection to the target user based on a pre-configured schedule using a communication unit. The input is schedule data, and the output is the establishment of the communication connection.

[0793] Step 2:

[0794] After a communication connection is established, the terminal records the conversation with the user in real time and acquires audio data. The input is the user's voice, and the output is an audio data file.

[0795] Step 3:

[0796] The server sends the audio data to a speech recognition API such as Google Cloud Speech-to-Text, where it is converted into information data. The input is audio data, and the output is the converted text data.

[0797] Step 4:

[0798] The analysis unit analyzes the converted text data and performs natural language processing to evaluate the user's cognitive function and emotional state. The input is text data, and the output is the analysis result data.

[0799] Step 5:

[0800] The server generates ability training tailored to the user using a generation unit based on the analysis results. Specifically, it uses a generative AI model to determine appropriate exercises and training content. The input is the analysis results, and the output is data on the training content.

[0801] Step 6:

[0802] The terminal displays the generated training content on the user's mobile device, making it accessible to the user. The input is the training content data, and the output is the user's visual interface.

[0803] Step 7:

[0804] The server automatically notifies medical institutions and relevant parties if an anomaly is detected. The input is anomaly detection data obtained from the analysis results, and the output is the transmission of a notification message.
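The data flow of Steps 3 to 7 above can be sketched as a single function that chains pluggable components. Everything named here (`Analysis`, `run_session`, and the four callables) is a hypothetical illustration. In a real deployment, `transcribe` might wrap a speech recognition API and `generate` a generative AI model, neither of which is shown.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Analysis:
    """Hypothetical analysis result passed between the units."""
    cognitive_score: float  # assumed scale: 0.0 (low) to 1.0 (high)
    emotion: str
    anomaly: bool


def run_session(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    analyze: Callable[[str], Analysis],
    generate: Callable[[Analysis], str],
    notify: Callable[[Analysis], None],
) -> str:
    """Run one monitoring session and return the training content to display."""
    text = transcribe(audio)        # Step 3: audio data -> text data
    analysis = analyze(text)        # Step 4: text data -> analysis result
    training = generate(analysis)   # Step 5: analysis result -> training content
    if analysis.anomaly:            # Step 7: notify only when an anomaly is found
        notify(analysis)
    return training                 # Step 6: shown on the user's mobile device
```

Passing the units in as callables keeps each one replaceable and lets the flow be exercised with simple stubs before any external service is connected.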

[0805] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0806] The present invention aims to prevent social isolation by closely monitoring the health status of elderly individuals through a system combining a communication device, an analysis device, a generation device, an emotion engine, and necessary means. The embodiments of this system are described in detail below.

[0807] Based on a schedule set by the server, the communication device automatically initiates a connection to the target person. Once the device establishes a connection with the target person, interaction begins through casual conversation. This conversation proceeds at a comfortable pace and involves asking questions about the target person's daily life. In this way, the device provides a natural space for communication.
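How the schedule is represented is not specified. As one hedged illustration, the sketch below assumes a weekly schedule (a set of weekdays plus a fixed time of day) and computes the next moment at which the server should initiate a connection. The function name `next_call_time` and the schedule format are assumptions.

```python
from datetime import datetime, time, timedelta


def next_call_time(now: datetime, weekdays: set, call_at: time) -> datetime:
    """Return the earliest scheduled slot strictly after `now`.

    `weekdays` uses datetime.weekday() numbering (Monday=0 ... Sunday=6).
    """
    if not weekdays:
        raise ValueError("schedule must contain at least one weekday")
    # Today plus the following seven days always contains every weekday once,
    # including today's weekday again if today's slot has already passed.
    for offset in range(8):
        day = now.date() + timedelta(days=offset)
        candidate = datetime.combine(day, call_at)
        if day.weekday() in weekdays and candidate > now:
            return candidate
    raise RuntimeError("no slot found; weekdays must be values from 0 to 6")
```

For a Tuesday and Friday schedule at 10:00, a check made on a Tuesday at 11:00 yields the coming Friday at 10:00, because the Tuesday slot has already passed.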

[0808] The audio data acquired by the terminal is sent to a server and converted into text data by an analysis device. The analysis device uses natural language processing technology to precisely analyze the user's cognitive function and emotional state. An emotion engine is integrated into this process, analyzing the user's emotions from their tone of voice and word choice, and determining their state in real time. This emotional state is a crucial element for a deeper understanding of the meaning of the conversation.
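The internal design of the emotion engine is not disclosed. To illustrate the idea of combining word choice with tone of voice, the toy sketch below fuses a lexical score (share of negative words) with a vocal score (deviation of speaking rate from the user's usual rate). The word list, the use of speaking rate as the vocal feature, the weights, and all function names are assumptions made only for this example.

```python
# Hypothetical lexicon; a real system would likely use a trained model instead.
NEGATIVE_WORDS = {"tired", "lonely", "worried", "anxious", "sad", "pain"}


def lexical_stress(text: str) -> float:
    """Fraction of words in `text` that appear in the negative-word list."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in NEGATIVE_WORDS for w in words) / len(words)


def vocal_stress(words_per_minute: float, baseline_wpm: float) -> float:
    """Relative deviation of speaking rate from the user's baseline, capped at 1.0."""
    if baseline_wpm <= 0:
        raise ValueError("baseline_wpm must be positive")
    return min(1.0, abs(words_per_minute - baseline_wpm) / baseline_wpm)


def fused_stress(text: str, words_per_minute: float, baseline_wpm: float,
                 text_weight: float = 0.6) -> float:
    """Weighted combination of the lexical and vocal scores (0.0 to 1.0)."""
    return (text_weight * lexical_stress(text)
            + (1.0 - text_weight) * vocal_stress(words_per_minute, baseline_wpm))
```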

[0809] Based on the analysis results, the server uses a generation device to create intelligence training and brain training programs optimized for the individual, providing programs tailored to their specific needs. For example, if the emotion engine determines that the user is experiencing stress, training with a relaxing effect will be provided.

[0810] Furthermore, the server utilizes emotional state information obtained by the emotion engine to notify healthcare institutions and relevant parties if any unusual changes are detected. This notification includes the results of the analyzed emotional state, providing important information for healthcare professionals to take appropriate action.

[0811] The server stores conversation data and analysis results in a database, and the AI model is updated by a learning device. This improves the accuracy of the analysis, enabling the provision of more appropriate support in subsequent conversations. Through this process, the system constantly monitors the health status of elderly individuals in an optimal manner, thereby reducing the risk of lonely deaths and other health risks.

[0812] The following describes the processing flow.

[0813] Step 1:

[0814] The server checks the communication schedule. The server refers to the registered schedule for each participant and determines the date and time of the next communication. This ensures a timely connection.

[0815] Step 2:

[0816] The server initiates the communication connection. At the scheduled time, the server automatically places a voice or video call to the target person's phone number or IP address. Once the connection is established, the device begins conversing with the target person.

[0817] Step 3:

[0818] The device greets the target person and begins the conversation. Based on a pre-prepared script, the device asks questions about the target person's daily life, communicating in a natural flow.

[0819] Step 4:

[0820] The user answers questions from the device. The person responds to the questions from the device by talking about their daily life and physical condition. The device acquires the audio data and sends it to the server.

[0821] Step 5:

[0822] The server converts the audio data into text. The analysis device uses speech recognition technology to convert the acquired audio data into text data and then performs natural language processing.

[0823] Step 6:

[0824] The server uses an emotion engine to perform emotion analysis. The emotion engine analyzes the converted text data and voice characteristics to evaluate the user's emotional state in real time.

[0825] Step 7:

[0826] The server generates intelligence training based on the analysis results. The generation device creates an individually optimized brain training program based on the analysis results and sends it to the terminal.

[0827] Step 8:

[0828] The device presents training to the user. By having the user complete the generated intelligence training program and providing appropriate feedback, it supports the maintenance and improvement of cognitive function.

[0829] Step 9:

[0830] The server detects and notifies of anomalies. If an anomaly is detected based on analysis results and the evaluation of the emotion engine, the server automatically notifies the relevant medical institutions and family members.

[0831] Step 10:

[0832] The server stores data and updates the model. By accumulating conversation data and analysis results, and updating the AI model using a learning device, the accuracy of subsequent analyses is improved.

[0833] (Example 2)

[0834] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0835] There is a need to efficiently support the monitoring of the health status of the elderly and the prevention of social isolation. However, conventional methods have limitations in natural dialogue and precise analysis of emotional states, making it difficult to provide individualized intelligence training or to rapidly detect and notify abnormal conditions.

[0836] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0837] In this invention, the server includes means for an information transmission device to automatically connect to the user based on a pre-set plan, means for using an analysis device to convert dialogue content into linguistic information during interaction with the user and to analyze cognitive function and emotional state, and means for generating individually optimized intelligence training using a generation device based on the results of the analysis and presenting it to the user. This enables detailed monitoring of the health status of the elderly, improvement of cognitive function through designed intelligence training, and prevention of social isolation.

[0838] An "information transmission device" is a device that automatically connects to users based on a pre-set plan.

[0839] "User" refers to the individual who uses the system, particularly the elderly.

[0840] "Dialogue content" refers to the entire content of the conversation, including linguistic information, exchanged between the user and the system's terminal.

[0841] "Linguistic information" refers to text-based information converted using speech recognition technology.

[0842] "Cognitive function" refers to analytical methods and systems used to evaluate a user's cognitive abilities.

[0843] "Emotional state" refers to elements that analyze and judge the user's mental and emotional condition in real time.

[0844] An "analysis device" refers to any device that converts dialogue content into linguistic information and evaluates cognitive function and emotional state.

[0845] A "generation device" refers to a device or system that generates intelligence training optimized for the user based on analysis results.

[0846] "Intelligence training" refers to a set of programs and activities designed to improve, maintain, or enhance a user's cognitive functions and emotional state.

[0847] "Unusual changes" refer to health or emotional changes that deviate significantly from the user's normal state.

[0848] A "healthcare organization" refers to an agency or facility that receives notifications from the system and provides necessary medical responses and care.

[0849] An "information storage device" refers to a storage medium for accumulating generated linguistic information and analysis results.

[0850] A "learning device" refers to a device or system that updates a generative model using accumulated data to improve the accuracy of evaluations in subsequent interactions.

[0851] A "generative model" refers to a machine learning model used by a learning device to improve the accuracy of subsequent interactions.

[0852] This invention is a system for monitoring the health status of elderly people, combining an information transmission device, an analysis device, a generation device, an emotion engine, and related systems. In this system, a server acts as the central point, and each device works in coordination with it.

[0853] First, the server automatically establishes a connection with the user through an information transmission device based on a pre-configured plan. The user then engages in natural, everyday conversation using a terminal. During this dialogue process, the user's voice is captured by the terminal and sent to the server. The server then passes this voice data to an analysis device, which uses speech recognition technology to convert it into linguistic information. This process utilizes natural language processing algorithms to enable highly accurate conversion.

[0854] Next, the server uses the converted linguistic information to analyze cognitive function and emotional state using the analysis device. The emotion engine determines the emotional tendency in real time from the user's tone of voice and choice of words. For example, if feelings of worry or anxiety are detected, that information is input into the generation device.

[0855] Based on the results of this emotion analysis, the generation device creates optimized intelligence training tailored to the user. For example, it may offer programs that play relaxing music or give instructions for light exercise. This may include special programs aimed at stress reduction.

[0856] Furthermore, the server also has a function to notify medical organizations and relevant parties if any unusual changes are detected. This notification enables a swift and appropriate response.

[0857] The server simultaneously saves past dialogue data and analysis results to an information storage device, and updates the generative AI model using a learning device. This update improves the accuracy of evaluations in subsequent dialogues, enabling more personalized support.
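The storage side of this step can be illustrated with Python's built-in `sqlite3` module. The table layout, column names, and helper functions below are assumptions chosen for the example. The specification does not describe a schema, and the model update performed by the learning device is outside the scope of this sketch.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the session store; ':memory:' keeps it in RAM for testing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sessions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               user_id TEXT NOT NULL,
               recorded_at TEXT NOT NULL,      -- ISO 8601 timestamp
               transcript TEXT NOT NULL,
               emotion TEXT NOT NULL,
               cognitive_score REAL NOT NULL
           )"""
    )
    return conn


def save_session(conn: sqlite3.Connection, user_id: str, recorded_at: str,
                 transcript: str, emotion: str, cognitive_score: float) -> None:
    """Store one dialogue together with its analysis result."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sessions (user_id, recorded_at, transcript, emotion, cognitive_score) "
            "VALUES (?, ?, ?, ?, ?)",
            (user_id, recorded_at, transcript, emotion, cognitive_score),
        )


def training_rows(conn: sqlite3.Connection, user_id: str) -> list:
    """Return (transcript, emotion, cognitive_score) tuples in chronological order."""
    cur = conn.execute(
        "SELECT transcript, emotion, cognitive_score FROM sessions "
        "WHERE user_id = ? ORDER BY recorded_at",
        (user_id,),
    )
    return cur.fetchall()
```

The rows returned by `training_rows` are the kind of accumulated data that a learning device could consume when updating its model.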

[0858] For example, if a user repeatedly mentions "I can't sleep lately," the emotion engine will determine their stress level, and the generation device will provide a corresponding relaxation program. Furthermore, relevant data is fed back to the learning device to improve its response to similar cases.
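Detecting that a complaint is repeated requires some memory across conversations. The sketch below, offered only as an assumption about how this might be done, tracks whether any of a set of phrases appeared in each of the most recent sessions and reports when the count reaches a threshold. The class name, the substring matching, and the default window and threshold are hypothetical.

```python
from collections import deque
from typing import Iterable


class RepeatedComplaintTracker:
    """Tracks how often a complaint appears across the most recent sessions."""

    def __init__(self, phrases: Iterable, window: int = 7, threshold: int = 3):
        if threshold > window:
            raise ValueError("threshold cannot exceed window")
        self._phrases = tuple(p.lower() for p in phrases)
        self._hits = deque(maxlen=window)  # oldest session drops out automatically
        self._threshold = threshold

    def observe(self, transcript: str) -> bool:
        """Record one session; return True once the complaint is 'repeated'."""
        text = transcript.lower()
        self._hits.append(any(p in text for p in self._phrases))
        return sum(self._hits) >= self._threshold
```

A `True` result could then trigger the stress evaluation and relaxation program described above. Simple substring matching will miss paraphrases, which is one reason a language model may be preferable in practice.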

[0859] An example of a prompt to input into the generative AI model is as follows: "Please explain in detail how to analyze the user's voice data and determine their emotional state."

[0860] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0861] Step 1:

[0862] The server activates the information transmission device based on a pre-configured plan and automatically initiates a connection to the user. The server references connection schedule data as input and sends a connection request as output. Specifically, the server performs network connection control and establishes a session with the user terminal.

[0863] Step 2:

[0864] The device accepts the connection request and initiates a natural, everyday conversation with the user. The device receives connection information as input and generates voice data as output. Specifically, it facilitates smooth dialogue by asking questions such as, "How are you feeling today?"

[0865] Step 3:

[0866] The device sends audio data acquired through conversation to the server. It receives audio data as input and sends high-quality audio files to the server as output. Specifically, the device's built-in microphone records the user's voice and converts it into an appropriate data format.

[0867] Step 4:

[0868] The server uses the analysis device to convert the transmitted audio data into text-based linguistic information. The input is audio data, and the output is text data. Specifically, a speech recognition algorithm transcribes the audio into text.

[0869] Step 5:

[0870] The server uses the analysis device to analyze the linguistic information transcribed into text and to determine cognitive function and emotional state. It receives text data as input and generates cognitive and emotional evaluation reports as output. Specifically, the analytical algorithm analyzes the text and determines the user's mental state.

[0871] Step 6:

[0872] The server uses the generation device to generate optimized intelligence training based on the analysis results. It uses the analysis and evaluation report as input and provides an intelligence training program as output. Specifically, it selects an adaptive program and presents it to the user.

[0873] Step 7:

[0874] The server sends notifications to medical organizations and relevant parties if the emotion analysis reveals unusual changes. The input is a detailed analysis report, and the output is a notification message. The specific operation involves sending warnings via email or application using a communication protocol.

[0875] Step 8:

[0876] The server stores conversation data and analysis results in a data storage device and updates the generative AI model through a learning device. The input is past conversation data and analysis results, and the output is the updated generative AI model. The specific operations involve database management and model update processing using machine learning algorithms.
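For the notification in Step 7 above, which may be sent by email, the message itself can be built with Python's standard `email.message.EmailMessage` class. The subject format, the function name `build_alert_email`, and the addresses used in any call are placeholders assumed for illustration. Actual delivery (for example with `smtplib.SMTP.send_message`) is omitted because it depends on the mail server in use.

```python
from email.message import EmailMessage


def build_alert_email(sender: str, recipients: list,
                      user_label: str, summary: str) -> EmailMessage:
    """Build (but do not send) an alert email describing a detected change."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"[Alert] Unusual change detected for {user_label}"
    msg.set_content(summary)  # plain-text body containing the analysis summary
    return msg
```

Because such a message carries health-related information, a real deployment would also need to consider encryption and access control, which this sketch does not address.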

[0877] (Application Example 2)

[0878] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0879] In modern society, monitoring the health status of the elderly and preventing social isolation are crucial issues. Traditional methods make it difficult to ensure regular personal contact and assess health status, potentially causing the elderly to miss opportunities for appropriate care. Solving this problem and providing an environment where the elderly can live with peace of mind is essential.

[0880] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0881] In this invention, the server includes means for a communication device to automatically establish a communication connection with a target person based on a pre-set schedule; means for using an analysis device to convert conversation content into linguistic data and analyze cognitive function and emotional state during a conversation with the target person; and means for generating individually optimized intelligence training and relaxation content using a generation device based on the results of the analysis and presenting it to the target person. This makes it possible to continuously and effectively monitor the health status of elderly people, prevent social isolation, and provide appropriate care.

[0882] A "communication device" is a device that has the function of automatically establishing a communication connection with a target person based on a pre-set schedule.

[0883] An "analysis device" is a device that converts the content of conversations with a subject into linguistic data and analyzes their cognitive function and emotional state.

[0884] A "generation device" is a device that generates and presents individually optimized intelligence training and relaxation content to the target user based on the analysis results.

[0885] "Means for notifying relevant parties via communication means when an anomaly is detected" refers to a function that, when an anomaly is detected during analysis, notifies pre-designated relevant parties using communication means.

[0886] A "communication terminal" is a device that enables the operation of the aforementioned communication devices, analysis devices, and generation devices, and is a terminal for users to operate.

[0887] "Remote computing resources" refer to services and hardware that support the computational processing required by analytical devices, and usually refer to cloud services.

[0888] A "learning device" is a device that stores conversation content and analysis results, and updates the computational model to improve the accuracy of evaluation in subsequent dialogues.

[0889] An "intermediate device" is a device that has the function of promoting daily interaction among the subjects themselves and among their supervisors, and preventing psychological isolation.

[0890] The system for carrying out this invention consists of a communication device, an analysis device, a generation device, a learning device, and an intermediate device. Each of these devices is intended to monitor the health status of the subject and prevent social isolation.

[0891] The server plays a central role, coordinating with communication terminals to establish regular communication with the target individual. These communication terminals, installed on devices such as smartphones, act as a bridge between the target individual and the server. Once a conversation begins, the terminal captures the conversation as audio data and converts it into text data using the Google Cloud Speech-to-Text API. Next, an analysis device uses this text data for natural language processing to evaluate the target individual's cognitive function and emotional state. TensorFlow-based machine learning algorithms are used for the analysis.

[0892] Based on the analysis results, the server uses a generation device to create intelligence training and relaxation content optimized for the target individual. This content is presented on the target individual's smartphone. Furthermore, if an anomaly is detected, the server notifies pre-registered stakeholders using the Twilio API, etc. Data is accumulated using a cloud database, and the learning device updates the generative AI model to improve accuracy in subsequent interactions.

[0893] For example, if data suggests that older adults tend to experience stress on weekends, the system will provide guided meditation content for relaxation. An example of a prompt for the generative AI model would be, "If data suggests that the user is particularly anxious on weekends, please suggest appropriate relaxation methods."
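One way to check whether the data suggests a weekend pattern, shown here as an assumption rather than the disclosed method, is to compare the mean anxiety score on weekends with the mean on weekdays and issue the prompt only when the gap exceeds a chosen margin. The score scale, the margin of 0.2, and the function names are hypothetical.

```python
from datetime import date
from statistics import mean
from typing import Dict, Optional


def weekend_anxiety_gap(scores: Dict[date, float]) -> float:
    """Mean weekend score minus mean weekday score (positive = worse on weekends)."""
    weekend = [s for d, s in scores.items() if d.weekday() >= 5]  # Sat=5, Sun=6
    weekday = [s for d, s in scores.items() if d.weekday() < 5]
    if not weekend or not weekday:
        raise ValueError("need at least one weekend and one weekday score")
    return mean(weekend) - mean(weekday)


def weekend_prompt(scores: Dict[date, float], min_gap: float = 0.2) -> Optional[str]:
    """Return the prompt for the generative AI model, or None if no pattern is seen."""
    if weekend_anxiety_gap(scores) < min_gap:
        return None
    return ("If data suggests that the user is particularly anxious on weekends, "
            "please suggest appropriate relaxation methods.")
```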

[0894] The introduction of this system will enable elderly people to receive care safely at home while maintaining social connections.

[0895] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0896] Step 1:

[0897] The server automatically establishes a communication connection with the target person via a communication terminal based on a pre-configured schedule. The input is schedule information, and the output is the establishment of the communication connection. In this process, the communication device sends a signal to the target person's smartphone, opening the communication line.

[0898] Step 2:

[0899] The device records the subject's conversation as audio data. The input is voice acquired through the microphone, and the output is digitized audio data. The device uses the microphone to collect voice and performs digital signal processing.

[0900] Step 3:

[0901] The server uses the Google Cloud Speech-to-Text API to convert audio data into text data. The input is audio data, and the output is text data. The server performs speech recognition via the API and saves the results to a database.

[0902] Step 4:

[0903] The server sends text data to the analysis device and performs natural language processing. The input is text data, and the output is the analysis result (cognitive function and emotional state). The analysis device uses TensorFlow to perform the analysis and returns the result.

[0904] Step 5:

[0905] The server generates intelligence training and relaxation content using a generation device based on the analysis results. The input is the analysis results, and the output is customized content. The generation device uses this data to create content suitable for the target audience.

[0906] Step 6:

[0907] The device presents the generated content to the user and facilitates interaction. The input is the generated content, and the output is what is displayed to the user. The device conveys the information through its screen and speakers.

[0908] Step 7:

[0909] The server stores analysis results and user responses in a cloud database, and the learning device updates the generative AI model for the next interaction. The input is the analysis results and user response data, and the output is the updated AI model. The learning device incorporates the new data into the model to improve accuracy.
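Steps 1 through 7 can be read as a linear pipeline. The following is a minimal sketch under stated assumptions: every stage is a caller-supplied function, so no particular speech recognition, analysis, or generation service is implied, and all names (`Pipeline`, `AnalysisResult`, `run_session`, the 0 to 1 score scales, `alert_threshold`) are hypothetical. The notification on an abnormal result reflects the notifying means described elsewhere in this disclosure rather than one of the numbered steps, and the model update of Step 7 is reduced here to storing the data that a learning device would consume.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class AnalysisResult:
    cognitive_score: float  # hypothetical: 0.0 (low) to 1.0 (high)
    emotional_score: float  # hypothetical: 0.0 (distressed) to 1.0 (calm)


@dataclass
class Pipeline:
    connect: Callable[[], None]                # Step 1: open the line
    record: Callable[[], bytes]                # Step 2: capture audio
    transcribe: Callable[[bytes], str]         # Step 3: audio to text
    analyze: Callable[[str], AnalysisResult]   # Step 4: text to scores
    generate: Callable[[AnalysisResult], str]  # Step 5: scores to content
    present: Callable[[str], str]              # Step 6: returns user response
    notify: Callable[[AnalysisResult], None]   # abnormality notification
    alert_threshold: float = 0.3               # assumed cut-off
    # Step 7: stored results and responses for a later model update.
    history: list[tuple[AnalysisResult, str]] = field(default_factory=list)

    def run_session(self) -> AnalysisResult:
        self.connect()
        audio = self.record()
        text = self.transcribe(audio)
        result = self.analyze(text)
        # Notify relevant parties if either score falls below the cut-off.
        if min(result.cognitive_score, result.emotional_score) < self.alert_threshold:
            self.notify(result)
        content = self.generate(result)
        response = self.present(content)
        self.history.append((result, response))
        return result
```

Injecting each stage as a callable lets the same flow run against real services or against simple stubs for testing.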

[0910] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0911] The data generation model 58 is a so-called generative AI (Artificial Intelligence) model. Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search, URL: https://openai.com/blog/chatgpt) and Gemini (Internet search, URL: https://gemini.google.com/?hl=ja). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0912] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0913] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0914] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0915] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0916] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion lies, the more visible it becomes (that is, the more it is expressed in actions).
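The layout of the emotion map described above can be sketched as a polar coordinate system. This is only an illustration: the `MapPosition` type, the clock-style angle convention, and the normalized radius are assumptions introduced here, and the disclosure does not define numeric coordinates for the emotion map 400.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MapPosition:
    radius: float     # assumed: 0.0 = centre (primitive), 1.0 = outer edge (action)
    angle_deg: float  # assumed clock-style: 0 = 12 o'clock, 90 = 3 o'clock

    def xy(self) -> tuple[float, float]:
        """Convert to Cartesian coordinates (x to the right, y upward)."""
        a = math.radians(self.angle_deg)
        return (self.radius * math.sin(a), self.radius * math.cos(a))


def describe(pos: MapPosition, eps: float = 1e-9) -> dict[str, str]:
    """Classify a position by the left/right and upper/lower halves of the map."""
    x, y = pos.xy()
    return {
        # Right half: situational judgment. Left half: reactions in the brain.
        "origin": "situation" if x > eps else "reaction" if x < -eps else "both",
        # Upper half: pleasure. Lower half: displeasure.
        "valence": "pleasure" if y > eps else "displeasure" if y < -eps else "neutral",
    }


def distance(a: MapPosition, b: MapPosition) -> float:
    """Euclidean distance on the map; nearby emotions tend to co-occur."""
    return math.dist(a.xy(), b.xy())
```

Under this convention a point at the 3 o'clock position falls on the situation side with neutral valence, which is consistent with the description of emotions that move between security and anxiety.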

[0917] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
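The balance-based rule above can be illustrated with a small sketch. It assumes that each balance is a single number with a known ideal value and that deviations are combined as an unweighted sum of absolute differences; both choices, along with the function names, are assumptions for illustration and are not specified in the disclosure.

```python
from __future__ import annotations


def total_deviation(state: dict[str, float], ideal: dict[str, float]) -> float:
    """Sum of absolute deviations from the ideal (assumed, unweighted)."""
    # Every key in `ideal` must also be present in `state`.
    return sum(abs(state[key] - ideal[key]) for key in ideal)


def valence(
    previous: dict[str, float],
    current: dict[str, float],
    ideal: dict[str, float],
) -> str:
    """Moving toward the ideal yields pleasure; moving away yields discomfort."""
    before = total_deviation(previous, ideal)
    after = total_deviation(current, ideal)
    if after < before:
        return "pleasure"
    if after > before:
        return "discomfort"
    return "neutral"
```

For a robot, the state might hold values such as battery level and tilt; a rise in battery level toward the ideal would then register as pleasure.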

[0918] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0919] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
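As a sketch of the final step only, the code below takes emotion values that a network has already produced and picks an emotion from them. The rule used here (choose the largest value and report other emotions within a small tolerance of it) is an assumption; the disclosure states only that the user's emotion is determined from the emotion values. The function name and the `tolerance` parameter are hypothetical.

```python
from __future__ import annotations


def identify_emotion(
    values: dict[str, float], tolerance: float = 0.05
) -> tuple[str, list[str]]:
    """Return the dominant emotion and other emotions with similar values."""
    if not values:
        raise ValueError("no emotion values supplied")
    # Assumed rule: the emotion with the largest value is the dominant one.
    dominant = max(values, key=lambda name: values[name])
    top = values[dominant]
    # Neighbouring emotions on the map are expected to score close to the top.
    similar = sorted(
        name
        for name, value in values.items()
        if name != dominant and top - value <= tolerance
    )
    return dominant, similar
```

With values such as reassured 0.82, calm 0.80, confident 0.79, and anxious 0.10, the first three form a cluster of similar values, mirroring the example given for Figure 10.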

[0920] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0921] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0922] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0923] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0924] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0925] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0926] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0927] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0928] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0929] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0930] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0931] The following is further disclosed regarding the embodiments described above.

[0932] (Claim 1)

[0933] A means by which a communication device automatically establishes a communication connection with a target person based on a pre-set schedule,

[0934] A means of using an analytical device that converts conversation content into linguistic data and analyzes cognitive function and emotional state during conversations with subjects,

[0935] A means for generating individually optimized intelligence training using a generation device based on the results of the aforementioned analysis and presenting it to the subject,

[0936] A means of notifying medical institutions and relevant parties when an abnormality is detected,

[0937] A system that includes this.

[0938] (Claim 2)

[0939] The system according to claim 1, characterized in that it stores conversation content and analysis results, and updates the model using a learning device to improve the evaluation accuracy in the next conversation.

[0940] (Claim 3)

[0941] The system according to claim 1, which includes an intermediate device for promoting daily interaction among subjects and among supervisors, with the aim of preventing social isolation.

[0942] "Example 1"

[0943] (Claim 1)

[0944] A means by which communication equipment automatically establishes a communication connection with a target person based on a pre-set plan,

[0945] A means of using an analytical device that converts audio data into linguistic data during conversations with subjects and evaluates cognitive function and emotional state,

[0946] Based on the results of the aforementioned evaluation, a means for generating individually optimized cognitive training using a generation device and presenting it to the subject,

[0947] A means of notifying medical institutions and relevant parties when an abnormality is detected,

[0948] A means of accumulating collected data and using learning equipment to improve the accuracy of the next analysis and evaluation model,

[0949] A system that includes this.

[0950] (Claim 2)

[0951] The system according to claim 1, which includes an intermediary device to facilitate constant interaction between individuals and between administrators, with the aim of preventing social isolation.

[0952] (Claim 3)

[0953] The system according to claim 1, characterized in that it uses examples of instruction sentences generated by a generative AI model to initiate a dialogue with the target person.

[0954] "Application Example 1"

[0955] (Claim 1)

[0956] A means by which a communication unit automatically establishes a communication connection to a task target based on a pre-set schedule,

[0957] In a dialogue with the task target, a means is used that converts the dialogue content into information data and analyzes cognitive function and emotional state using an analysis unit.

[0958] A means for generating individually optimized competency training using a generation unit based on the results of the aforementioned analysis and presenting it to the task target,

[0959] A means of notifying medical institutions and relevant parties when an abnormality is detected,

[0960] A means of presenting optimized information training on a mobile device used by the task target,

[0961] A system that includes this.

[0962] (Claim 2)

[0963] The system according to claim 1, characterized in that it saves the content of the dialogue and the analysis results, and updates the model using a learning unit to improve the evaluation accuracy in the next dialogue.

[0964] (Claim 3)

[0965] The system according to claim 1, comprising an intermediate unit for promoting normal interaction between task subjects and between supervisors, with the aim of preventing social isolation.

[0966] "Example 2 of combining an emotion engine"

[0967] (Claim 1)

[0968] The information transmission device provides a means for automatically connecting information to the user based on a pre-set plan,

[0969] In dialogue with the user, a means of using an analytical device that converts the content of the dialogue into linguistic information and analyzes the cognitive function and emotional state,

[0970] A means for generating individually optimized intelligence training using a generation device based on the results of the aforementioned analysis and presenting it to the user,

[0971] A means of notifying medical organizations and relevant parties when an unusual change is detected,

[0972] A means for storing the generated language information and analysis results in an information storage device, and updating the generative model using a learning device to improve the evaluation accuracy in the next dialogue,

[0973] A system that includes this.

[0974] (Claim 2)

[0975] The system according to claim 1, characterized in that the generated intelligence training is adjusted based on the user's emotional analysis.

[0976] (Claim 3)

[0977] The system according to claim 1, characterized by having an intermediate device to facilitate daily interaction between users and supervisors and to prevent social isolation.

[0978] "Application example 2 when combining with an emotional engine"

[0979] (Claim 1)

[0980] A means by which a communication device automatically establishes a communication connection with a target person based on a pre-set schedule,

[0981] A means of using an analytical device that converts conversation content into linguistic data and analyzes cognitive function and emotional state during conversations with subjects,

[0982] Based on the results of the aforementioned analysis, a means for generating individually optimized intelligence training and relaxation content using a generation device and presenting it to the subject,

[0983] A means of notifying relevant parties via communication means when an anomaly is detected,

[0984] A means for using a communication terminal and remote computing resources to operate the aforementioned system,

[0985] A system that includes this.

[0986] (Claim 2)

[0987] The system according to claim 1, characterized in that it stores conversation content and analysis results, and updates the computational model using a learning device to improve the evaluation accuracy in the next dialogue.

[0988] (Claim 3)

[0989] The system according to claim 1, which includes an intermediate device for promoting daily interaction among subjects and among supervisors, with the aim of preventing psychological isolation.

[Explanation of Symbols]

[0990]
10, 210, 310, 410: Data processing system
12: Data processing device
14: Smart device
214: Smart glasses
314: Headset-type terminal
414: Robot

Claims

1. A system comprising:
a means by which a communication unit automatically establishes a communication connection to a task target based on a pre-set schedule;
a means that, in a dialogue with the task target, converts the dialogue content into information data and analyzes cognitive function and emotional state using an analysis unit;
a means for generating individually optimized competency training using a generation unit based on the results of the aforementioned analysis and presenting it to the task target;
a means of notifying medical institutions and relevant parties when an abnormality is detected; and
a means of presenting optimized information training on a mobile device used by the task target.

2. The system according to claim 1, characterized in that it saves the content of the dialogue and the analysis results, and updates the model using a learning unit to improve the evaluation accuracy in the next dialogue.

3. The system according to claim 1, comprising an intermediate unit for promoting normal interaction between task subjects and between supervisors, with the aim of preventing social isolation.