Monitored terminal, information processing method, program, information processing system, and server
The monitored terminal uses a dialogue control unit to clarify ambiguous inputs through dialogues, improving communication quality for monitored individuals by refining message content and enhancing intent conveyance.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- MIXI INC
- Filing Date
- 2025-05-20
- Publication Date
- 2026-07-01
AI Technical Summary
Existing communication technologies for monitored individuals, particularly children, lack advanced interactive support to accurately convey intentions and emotional nuances due to insufficient intelligent processing of ambiguous voices and fragmented information, limiting the quality of communication support.
A monitored terminal equipped with an operation input unit, voice input unit, location information acquisition unit, and a dialogue control unit that generates questions to clarify intent and performs dialogues to refine message content, transmitting the clarified message to a caregiver's terminal.
Enhances communication accuracy by clarifying ambiguous inputs through dialogues, allowing monitored individuals to convey their intentions more effectively to caregivers.
Abstract
Description
Technical Field
[0001] The present invention relates to a monitored terminal, an information processing method, a program, an information processing system, and a server, and particularly relates to a technology for assisting communication of a monitored person.
Background Art
[0002] Conventionally, various mobile terminals and monitoring services have been proposed for the purpose of safety confirmation and communication support of monitored persons such as children and the elderly. For example, a system is known in which position information and preset stereotypical status information (e.g., "fine", "at home", etc.) are transmitted from a terminal carried by a monitored person to a terminal of a monitor.
[0003] In recent years, monitoring GPS terminals for children have also evolved. In addition to a simple position information notification function, they have functions such as a voice message transmission and reception function and a simple information display function (e.g., a display function for displaying a clock, battery remaining amount, message arrival time, sender icon, etc.) to encourage children's independence. Such terminals aim to support children in being conscious of their actions according to time and managing the charging of the device by themselves, and also to enhance the sense of security of both children and guardians by visually confirming the sender and the incoming message status. In many of these terminals, functions that deviate from the essence of monitoring, such as video viewing, games, and SNS, are intentionally excluded in consideration of the safety of children.
[0004] Furthermore, technologies have been proposed to enable smoother communication between a caregiver's terminal (for example, a parent's smartphone) and a monitored person's terminal. For example, Patent Document 1 discloses an information processing device (server) that transmits first voice data based on first text input into a first terminal (caregiver's terminal) to a second terminal (monitored person's terminal), and transmits second text data based on second voice input into the second terminal to the first terminal. With this technology, caregivers can transmit messages they input as text to the monitored person via voice, and conversely, the monitored person can transmit messages they input as voice to the caregiver via text, enabling communication tailored to each situation.
[0005] However, even with systems that primarily relay the mutual conversion and transmission of voice and text, such as those described in Patent Document 1, and with existing child terminals equipped with the display functions mentioned above, it cannot be said that they have sufficient communication support functions to allow those being monitored, especially children whose language expression abilities are still developing, to freely and accurately convey their intentions, circumstances, and emotional nuances to their guardians. There was insufficient technology for intelligent processing (e.g., AI) to deeply understand the true meaning from the ambiguous voices and fragmented information uttered by children, and to actively support the clarification and improvement of message content. While information display via the display function can assist in communication, it does not allow intelligent processing to support the generation or clarification of the message content itself through dialogue.
[0006] In particular, for dedicated devices for children that lack complex text input interfaces and have simplified operation, the quality of communication via voice input is crucial. However, there has been a lack of advanced interactive support technologies that actively support children's communication abilities through intelligent processing, collaboratively clarifying and refining message content.
Prior Art Documents
Patent Documents
[0007] [Patent Document 1] Japanese Patent No. 7671096
Summary of the Invention
Problems to Be Solved by the Invention
[0008] One of the objectives of the present invention is to provide a monitored terminal, information processing method, program, information processing system, and server capable of supporting the communication of the person being monitored. More specifically, the objective is to provide technology that improves the quality of communication support so that the person being monitored, especially a child with underdeveloped language skills, can communicate their intentions and situation to their caregiver more accurately and easily.
Means for Solving the Problems
[0009] A monitored terminal according to one aspect of the present invention is a monitored terminal carried by a person being monitored, and is characterized by comprising: an operation input receiving unit that receives operation input from the person being monitored; a voice input receiving unit that receives voice input from the person being monitored; a location information acquisition unit that acquires location information; a dialogue control unit that, when the voice input received by the voice input receiving unit satisfies predetermined conditions, generates a question to clarify the intent of the voice input, performs a dialogue with the person being monitored, and acquires a message determined by the dialogue; and a transmission control unit that transmits the message acquired by the dialogue control unit, together with the location information acquired by the location information acquisition unit, to the caregiver's terminal.
Effects of the Invention
[0010] According to one aspect of the present invention, it is possible to support the communication of the person being monitored. Specifically, when predetermined conditions are met, such as when the voice input from the person being monitored is ambiguous, the dialogue control unit asks questions to clarify the intent of the message content and generates a message through dialogue with the person being monitored, so that the person being monitored can send a message to the caregiver that more accurately reflects their own intentions.
Brief Description of the Drawings
[0011]
[Figure 1] Overview diagram of the entire system.
[Figure 2] Hardware configuration diagram of the monitored terminal.
[Figure 3] Hardware configuration diagram of the information processing device (server).
[Figure 4] Functional block diagram of the monitored terminal.
[Figure 5] Functional block diagram of the information processing device (server).
[Figure 6] Flowchart of the message generation and transmission process during normal operation.
[Figure 7] Flowchart of the emergency response process.
[Figure 8] Example of a dialogue sequence.
[Figure 9] Example of an urgency determination rule and weighting table.
[Figure 10] Example of a personalization settings table.
[Figure 11] Example 1 of the display screen of the monitored terminal (at the start of the conversation).
[Figure 12] Example 2 of the display screen of the monitored terminal (question / option presentation).
[Figure 13] Example 3 of the display screen of the monitored terminal (when confirming and sending message candidates).
[Figure 14] Example 4 of the display screen of the monitored terminal (emergency mode).
[Figure 15] Example 5 of the display screen of the monitored terminal (when a new message is received).
[Figure 16] Example 1 of the caregiver terminal (smartphone app) screen (message reception screen).
[Figure 17] Example 2 of the caregiver terminal (smartphone app) screen (conversation log confirmation screen).
[Figure 18] Example 3 of the caregiver terminal (smartphone app) screen (emergency notification reception screen).
[Figure 19] Example 4 of the caregiver terminal (smartphone app) screen (settings screen).
[Figure 20] Example 5 of the caregiver terminal (smartphone app) screen (learning status / conversation tendency report screen).
[Figure 21] Example 6 of the caregiver terminal (smartphone app) screen (server connection status screen).
Embodiments for Carrying Out the Invention
[0012] Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In each figure, the same or corresponding components are denoted by the same reference numerals, and redundant descriptions will be omitted as appropriate.
[0013] (Example of the overall configuration of the system) Figure 1 is a diagram showing an example of the overall configuration of the monitoring system 1 according to the present embodiment. The monitoring system 1 includes a monitored terminal 100 carried by the person being monitored P, a caregiver terminal 200 used by the caregiver G, and an information processing device (server) 500 connected to these terminals via a communication network NW. The communication network NW can include any communication network such as the Internet, a mobile phone network, or a wireless LAN. The caregiver terminal 200 is typically a smartphone or tablet terminal owned by a guardian, and by executing dedicated application software (hereinafter referred to as the "caregiver app"), it receives information from the monitored terminal 100 and performs various settings. The information processing device 500 not only relays the communication between the monitored terminal 100 and the caregiver terminal 200, but also executes part or all of the dialogue generation process and urgency determination process described later, and stores and manages various data.
[0014] In this specification, "dialogue" refers to a process in which information is exchanged at least once between an information transmitter (e.g., a person being monitored or a dialogue control unit) and an information receiver (e.g., a dialogue control unit or a person being monitored), thereby supporting the formation of some kind of common understanding, decision-making, or information generation. This exchange of information may include not only linguistic information (voice, text, etc.) but also non-linguistic information (operation input, gestures, icon selection, etc.). Furthermore, a "question" (inquiry) refers to any interaction aimed at prompting a response, action, or thought from the other party, and is used as a concept that encompasses not only question-form utterances but also the presentation of options, requests for confirmation, and requests for supplementary information. The term "dialogue" in this invention does not necessarily mean that information is transmitted an equal number of times between the two parties. Even if one party (e.g., the dialogue control unit) takes the lead in presenting information and the response from the other party (e.g., the person being monitored) is limited, or even if there is no response at all and the process proceeds through a predetermined confirmation process, the exchange can still be considered a form of "dialogue" as long as the initial objective (in this invention, clarification of the intent of the message content) is achieved over the process as a whole.
[0015] (Example of hardware configuration for a monitored terminal) Figure 2 is a block diagram showing an example of the hardware configuration of the monitored terminal 100 according to this embodiment. The monitored terminal 100 can be configured, for example, as a wristwatch-type wearable device worn on a child's wrist, a pendant-type device worn around the neck, or a small, dedicated terminal that can be attached to a bag or the like. The monitored terminal 100 includes a processor 101, memory 102, location information acquisition unit 103, operation input unit 104, microphone 105, speaker 106, communication unit 107, display unit 108, vibration unit 109 (optional), and battery 110. These components are interconnected via a bus 111.
[0016] The processor 101 controls the overall operation of the monitored terminal 100. By executing programs stored in the memory 102, the processor 101 functions as each of the functional units (operation input reception unit, voice input reception unit, location information acquisition unit, dialogue control unit, message acquisition unit, transmission control unit, urgency determination unit, mode control unit, display control unit, etc.) shown in the functional block diagram (Figure 4) described later. In this embodiment, the processor 101 may be configured to execute only specific pre-installed functions, that is, as a so-called closed system that cannot install and execute application programs developed by third parties afterwards. Some of the processing by the dialogue control unit (for example, initial processing of speech recognition, simple dialogue control, etc.) may be executed on the terminal side, while more advanced processing (natural language understanding, complex dialogue strategies, urgency determination, learning, etc.) may be executed in cooperation with the information processing device 500.
[0017] Memory 102 stores programs executed by processor 101, data being processed, parts of the dialogue model (such as a lightweight version), short-term learning data of the child's voice patterns and behavioral patterns, and dialogue logs.
[0018] The location information acquisition unit 103 acquires location information indicating the current location of the monitored terminal 100. As an intermediate concept, the location information acquisition unit 103 may include means using satellite positioning systems, means using short-range wireless communication, means using mobile phone base station information, or means using a combination of these. As a lower-level concept, examples include satellite positioning means such as GPS (Global Positioning System) receivers, Michibiki (QZSS) receivers, BeiDou receivers, Galileo receivers, and SBAS-compatible receivers, positioning means based on Wi-Fi access point information, positioning means using Bluetooth® Low Energy (BLE) beacons, or positioning means based on base station IDs and radio wave strength of the mobile phone network. Furthermore, motion estimation positioning means that perform pedestrian dead reckoning (PDR) using information from motion sensors such as acceleration sensors and gyroscopes may be used as an auxiliary means or in combination. This makes it possible to acquire or estimate location information with a certain degree of accuracy even indoors or in places where satellite radio waves are difficult to reach.
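As a rough illustration of combining several positioning means, the sketch below picks, from whatever fixes are currently available, the one with the smallest estimated error. The `Fix` type, the `accuracy_m` field, and the 500 m cutoff are assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Fix:
    source: str        # e.g. "gnss", "wifi", "ble", "cell", "pdr"
    lat: float
    lon: float
    accuracy_m: float  # estimated horizontal error radius (hypothetical field)


def best_fix(fixes: Sequence[Fix], max_error_m: float = 500.0) -> Optional[Fix]:
    """Return the usable fix with the smallest estimated error, or None."""
    usable = [f for f in fixes if f.accuracy_m <= max_error_m]
    return min(usable, key=lambda f: f.accuracy_m, default=None)


# Indoors: no satellite fix, so only network-based candidates are present.
indoor = [
    Fix("cell", 35.6600, 139.7000, 800.0),
    Fix("wifi", 35.6581, 139.7017, 40.0),
    Fix("ble", 35.6580, 139.7016, 8.0),
]
chosen = best_fix(indoor)
```

Selecting by estimated error rather than by a fixed source priority is only one possible design; a real terminal might also weigh power consumption or fix age.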
[0019] The operation input unit 104 is an interface for the person being monitored P to input various operation instructions. In one example of this embodiment, the operation input unit 104 can be configured without a free text input interface. In this case, the operation input unit 104 may include, as an intermediate concept, an input means for detecting physical pressure or an input means for detecting touch operation (however, without a keyboard display for free text input). As a lower-level concept, examples include one or more physical push buttons (e.g., a power button, a message send / call start button, an SOS dedicated button, a YES / NO answer button, a function selection button, etc.), a slide switch, a jog dial, etc. As an input means for detecting touch operation, a capacitive or resistive touch sensor can be provided in a specific area to accept gesture input such as tap, swipe, and long press, but these are used for activating or selecting specific functions and are not used for free text input.
[0020] In this context, "lacking a free text input interface" means that the device essentially lacks an interface that allows the user to input any string of characters without restriction, such as a smartphone or PC keyboard (for example, a full QWERTY keyboard display, or an interface that continuously displays all Japanese syllabary characters and allows free combination input). Therefore, for example, an interface that displays a limited number of symbols, icons, or parts of the Japanese syllabary (e.g., representative characters of rows such as "a" and "ka," or candidate combinations of several characters to form a specific word) on the display unit 108 for the purpose of assisting interaction with the dialogue control unit, and allows the person being monitored to select these by tapping, in combination with voice input or as a substitute for voice input, to respond to the dialogue control unit or input message elements, may fall under the category of "operation input unit without a free text input interface." Such limited selection-based input is effective in situations where speaking is difficult or as a means of assisting more accurate information transmission, and is distinct from communication through free text input. Examples of such input without a free text input interface include selecting an icon indicating a response such as "yes" or "no" in response to a question from the dialogue control unit, selecting one of several fixed emoji icons that represent an emotion, or, in response to a question about a place, selecting from several location icons or first characters of location names (e.g., "ga", "ko", "i") displayed as options.
[0021] The aforementioned "operation input unit without a free text input interface" is not primarily intended for the free creation and input of arbitrary new sentences or phrases by the person being monitored. Rather, it is designed to allow the following operations to be performed simply and intuitively:
(a) selection from a limited set of options presented by the dialogue control unit or the system (e.g., icons, canned phrases, message suggestions, emotion expression templates, yes / no responses, etc.);
(b) invocation of pre-registered or preconfigured specific information or functions (e.g., a trigger for emergency contact, an instruction to send a standard message, activation of a specific function); or
(c) selection, in the course of the dialogue with the dialogue control unit, from a very small number of characters, symbols, or words that the dialogue control unit has narrowed down according to the context (e.g., when the dialogue control unit asks, "Who are you talking about? A friend whose name starts with 'Ta'?", it presents a few proper noun candidates such as "Tanaka" and "Takahashi," or the limited character set used to construct them).
Therefore, even if the operation input unit includes a configuration that partially allows the input of a limited range of characters and symbols (e.g., input of vowel-consonant combinations assigned to several buttons to assist in situations where voice input is difficult, or a numeric keypad-style input interface that works in conjunction with predictive candidates of the dialogue control unit), the operation input unit falls under the category of "not having a free text input interface" as long as it is primarily a selective operation based on support or prior limitations from the dialogue control unit, as described in (a) to (c) above, and is clearly distinguishable from interfaces such as QWERTY keyboards or Japanese syllabary flick input, which are primarily intended to allow users to freely construct strings from a wide vocabulary and input new information without active support from the system. The operation input unit in the present invention is intended to reduce the cognitive load on the person being monitored (especially young children) and enable dialogue and communication with the dialogue control unit without requiring complex character input.
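The selection-based input described above can be modeled as a prompt that carries a small, fixed set of options and accepts only an index into that set. This is a minimal sketch; the `ChoicePrompt` class and the limit of four options are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_OPTIONS = 4  # hypothetical limit on how many options fit on a small screen


@dataclass(frozen=True)
class ChoicePrompt:
    question: str
    options: Tuple[str, ...]

    def __post_init__(self) -> None:
        # Reject prompts that would amount to free-form input.
        if not 1 <= len(self.options) <= MAX_OPTIONS:
            raise ValueError("option count out of range")

    def select(self, index: int) -> Optional[str]:
        """Return the tapped option, or None if the index is not on screen."""
        if 0 <= index < len(self.options):
            return self.options[index]
        return None


prompt = ChoicePrompt("Who are you talking about?", ("Tanaka", "Takahashi"))
answer = prompt.select(1)
```

Because the only accepted input is an index, the person being monitored can never enter an arbitrary string, which mirrors the distinction drawn from QWERTY or flick input.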
[0022] The microphone 105 picks up the voice of the person being monitored P and surrounding sounds and converts them into electrical signals.
[0023] Speaker 106 outputs voice guidance, questions, message readings, warning sounds, and voice messages from the caregiver as audio.
[0024] The communication unit 107 communicates data with the information processing device 500 via the communication network NW, using, for example, an LPWA (Low Power Wide Area) communication method such as LTE-M or NB-IoT, or Wi-Fi, Bluetooth®, etc.
[0025] The display unit 108 is configured as a "display function" and displays basic information such as the time, battery level, time of incoming voice messages, and sender's icon, as well as information related to interaction with the dialogue control unit (inquiry keywords, choice icons, message candidate summaries, etc.) and emergency mode warnings. It does not include video playback or other similar functions. The displayed content can be muted.
[0026] The vibration unit 109 transmits incoming call notifications, warnings, etc., to the person being monitored through vibration. The battery 110 supplies power to various parts of the monitored terminal 100.
[0027] (Example of hardware configuration for an information processing device (server)) Figure 3 is a block diagram showing an example of the hardware configuration of the information processing device 500 according to this embodiment. The information processing device 500 includes a communication interface (IF) 501, a storage device 502, a CPU 503, and the like.
[0028] The communication interface (IF) 501 is an interface for communicating with the monitored terminal 100 and the caregiver terminal 200 via the communication network NW.
[0029] The storage device 502 is an HDD or SSD, and stores various data, information processing programs, dialogue models, learning data for each monitored person, account information, and so on.
[0030] The CPU 503 controls the operation of the entire information processing device 500 and executes programs stored in the storage device 502, thereby functioning as each of the functional units (communication control unit, account management unit, dialogue generation unit, data processing unit, database, etc.) shown in the functional block diagram (Figure 5) described later.
[0031] (Example of functional blocks for a monitored terminal) Figure 4 shows the main functional blocks of the monitored terminal 100. The processor 101 executes a program stored in the memory 102, thereby functioning as a device equipped with an operation input receiving unit 401, a voice input receiving unit 402, a location information acquisition unit 403, a dialogue control unit 404, a message acquisition unit 405, a transmission control unit 406, an urgency determination unit 407 (optional), a mode control unit 408 (optional), a display control unit 409 (optional), and a voice output unit 410.
[0032] The operation input receiving unit 401 receives operation input from the person being monitored P via the operation input unit 104.
[0033] The voice input receiving unit 402 receives voice input from the person being monitored P via the microphone 105.
[0034] The location information acquisition unit 403 controls the location information acquisition unit 103 (hardware) and acquires location information.
[0035] The dialogue control unit 404, based on the voice data received by the voice input receiving unit 402 and instructions from the operation input receiving unit 401, performs a dialogue with the person being monitored P in cooperation with the dialogue generation unit (described later) of the information processing device 500, or based on the dialogue logic inside the terminal. Specifically, when it is determined that the voice input from the person being monitored P satisfies "predetermined conditions" (this determination itself may be left to the information processing device 500), it outputs the "question" content received from the information processing device 500 or the "question" content generated inside the terminal via the voice output unit 410 or the display control unit 409, and sends the response from the person being monitored P to the information processing device 500 or processes it inside the terminal.
[0036] The message acquisition unit 405 acquires the message content that has been determined through dialogue with the dialogue control unit 404. This message content may be generated and determined within the terminal by the dialogue control unit 404, or it may be determined in cooperation with the information processing device 500 and received from the information processing device 500.
[0037] The transmission control unit 406 transmits the message acquired by the message acquisition unit 405, along with the location information acquired by the location information acquisition unit 403, to the caregiver's terminal 200 via the communication unit 107.
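The cooperation of the dialogue control unit 404, message acquisition unit 405, and transmission control unit 406 can be sketched as a short loop. Everything here is a simplified stand-in: the `clarify` and `send` functions, the word-count ambiguity check, and the canned reply are assumptions for illustration, not the specification's actual logic.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Outgoing:
    message: str
    location: Tuple[float, float]  # (latitude, longitude)


def clarify(first_utterance: str,
            needs_clarification: Callable[[str], bool],
            ask: Callable[[str], str],
            max_turns: int = 3) -> str:
    """Ask follow-up questions until the draft is clear or turns run out."""
    draft = first_utterance
    for _ in range(max_turns):
        if not needs_clarification(draft):
            break
        reply = ask(f'You said "{draft}". Can you tell me more?')
        draft = reply or draft  # keep the old draft if there was no answer
    return draft


def send(message: str, location: Tuple[float, float],
         outbox: List[Outgoing]) -> None:
    """Stand-in for transmission: the message always travels with a location."""
    outbox.append(Outgoing(message, location))


# Demo with canned stand-ins for the microphone and the ambiguity check.
asked: List[str] = []


def fake_ask(question: str) -> str:
    asked.append(question)
    return "I left my hat at school"


outbox: List[Outgoing] = []
message = clarify("that thing", lambda t: len(t.split()) < 4, fake_ask)
send(message, (35.0, 139.0), outbox)
```

In this sketch the clarified reply simply replaces the draft; merging the original utterance with the answers, as a fuller dialogue model would, is left out for brevity.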
[0038] The urgency determination unit 407 determines the urgency of the situation of the person being monitored based on various sensor data, voice data, and operation patterns acquired by the monitored terminal 100. This determination may be performed in cooperation with the information processing device 500.
[0039] The mode control unit 408 controls the operating mode of the monitored terminal 100 (normal mode, emergency mode, etc.) based on the determination result of the urgency determination unit 407.
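One way to picture the mode control is as a small state transition driven by an urgency value. The sketch below assumes a normalized urgency score and uses two thresholds (hysteresis) so the terminal does not flip between modes on a borderline score; both the score scale and the thresholds are assumptions, as the specification does not state them.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    EMERGENCY = "emergency"


def next_mode(current: Mode, urgency: float,
              enter_at: float = 0.8, leave_at: float = 0.4) -> Mode:
    """Enter emergency mode at enter_at; leave it only below leave_at."""
    if current is Mode.NORMAL and urgency >= enter_at:
        return Mode.EMERGENCY
    if current is Mode.EMERGENCY and urgency < leave_at:
        return Mode.NORMAL
    return current


mode = next_mode(Mode.NORMAL, 0.9)  # a high score switches to emergency mode
```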
[0040] The display control unit 409 controls the display content of the display unit 108. The voice output unit 410 controls the speaker 106 to output sound.
[0041] (Example of a functional block for an information processing device (server)) Figure 5 shows the main functional blocks of the information processing device 500. The CPU 503 executes programs stored in the storage device 502, thereby functioning as a device comprising a communication control unit 511, an account management unit 512, a dialogue generation unit 513, a data processing unit 514, a database 515, a transmission control unit 516, and the like.
[0042] The communication control unit 511 controls the transmission and reception of data to and from the monitored terminal 100 and the caregiver terminal 200 via the communication interface 501.
[0043] The account management unit 512 manages the account information, device information, and settings information of the person being monitored P and the caregiver G.
[0044] The dialogue generation unit 513 generates a question to clarify the intent of the voice input received from the monitored terminal 100 when the voice input meets predetermined conditions. It also executes a dialogue based on the response received from the monitored terminal 100 and determines a message to be conveyed to the caregiver. The dialogue generation unit 513 may include, as an intermediate concept, voice information processing function, natural language processing function, dialogue strategy generation function, situation judgment function, learning control function, etc.
[0045] The voice information processing function, as a sub-concept, includes a voice recognition function that converts voice data received from the monitored terminal 100 into text (e.g., a Hidden Markov Model (HMM) based or Deep Neural Network (DNN) based voice recognition engine), and a voice synthesis function that generates synthesized speech from the text information and transmits it to the monitored terminal 100 (e.g., a statistical parametric speech synthesis or waveform concatenation speech synthesis engine). It also includes an emotion estimation function that analyzes the prosodic information of the speech (pitch, power, speaking speed, voice quality, etc.) and estimates emotions.
[0046] Natural language processing functions, as a sub-concept, perform syntactic analysis, semantic analysis, intent extraction, keyword extraction, and ambiguity determination of transcribed speech content. For example, large-scale transformer-based language models (LLMs) fine-tuned to the language characteristics of children, or lighter recurrent neural network (RNN)-based models may be used.
[0047] The dialogue strategy generation function determines the content and format of the "question" to be sent next to the monitored terminal 100 (for example, an open question, a closed question, or a choice of options) based on the analysis results from the natural language processing function, the dialogue history stored in the database 515, and the child's profile.
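The choice between an open question, a closed question, and a presentation of options might be driven by how many candidate answers the analysis has produced. The following sketch assumes exactly that heuristic; the function names, the `max_choices` limit, and the question wording are hypothetical, and dialogue history and the child's profile are omitted for brevity.

```python
from enum import Enum
from typing import Sequence


class QuestionFormat(Enum):
    OPEN = "open"      # free answer by voice
    CLOSED = "closed"  # yes / no confirmation
    CHOICE = "choice"  # pick one of a few options


def choose_format(candidates: Sequence[str], max_choices: int = 4) -> QuestionFormat:
    """Pick a format from the number of candidate answers (assumed heuristic)."""
    if len(candidates) == 1:
        return QuestionFormat.CLOSED
    if 2 <= len(candidates) <= max_choices:
        return QuestionFormat.CHOICE
    return QuestionFormat.OPEN  # no candidates, or too many to list


def build_question(slot: str, candidates: Sequence[str]) -> str:
    fmt = choose_format(candidates)
    if fmt is QuestionFormat.CLOSED:
        return f"Is the {slot} {candidates[0]}?"
    if fmt is QuestionFormat.CHOICE:
        return f"Which {slot}: " + ", ".join(candidates) + "?"
    return f"Can you tell me the {slot}?"


question = build_question("place", ["park", "school"])
```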
[0048] The situation assessment function comprehensively analyzes various sensor information (location, movement, operation patterns, etc.) and voice information received from the monitored terminal 100, as well as normal behavior and voice patterns stored in the database 515, to determine whether "predetermined conditions" are met and to assess the degree of urgency. Rule-based inference engines and machine learning models (e.g., support vector machines (SVMs), decision trees, random forests, gradient boosting, etc.) may be used for this assessment.
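A rule-based variant of this assessment can be sketched as a weighted sum over detected signals, in the spirit of a rule and weighting table. The signal names, point values, and level thresholds below are invented for illustration and are not taken from the specification or its figures.

```python
from typing import Mapping

# Hypothetical rule table: signal name -> points added when the signal fires.
WEIGHTS = {
    "sos_button": 10,
    "fear_in_voice": 5,
    "outside_usual_area": 3,
    "late_night": 2,
}
MAX_SCORE = 10


def urgency_score(signals: Mapping[str, bool],
                  weights: Mapping[str, int] = WEIGHTS) -> int:
    """Sum the weights of all active signals, capped at MAX_SCORE."""
    total = sum(w for name, w in weights.items() if signals.get(name, False))
    return min(total, MAX_SCORE)


def urgency_level(score: int, high: int = 8, medium: int = 4) -> str:
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


score = urgency_score({"fear_in_voice": True, "outside_usual_area": True})
level = urgency_level(score)
```

Integer points are used instead of fractional weights so that threshold comparisons are exact; a machine learning model such as those listed above would replace the table with learned parameters.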
[0049] The learning control function continuously updates the parameters of various models (language model, dialogue strategy model, urgency determination model, etc.) in the dialogue generation unit 513 based on dialogue history, sensor data, and user feedback stored in the database 515, thereby improving personalization and performance.
[0050] The data processing unit 514 performs tasks such as linking location information with messages, generating notification data, converting audio data to a file format, and adjusting the volume.
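Linking a message to its location and generating notification data could look like the following. The JSON layout and field names are assumptions for this sketch; the specification does not define a wire format.

```python
import json
from datetime import datetime, timezone


def build_notification(message: str, lat: float, lon: float,
                       sent_at: datetime) -> str:
    """Bundle a confirmed message with its location as a JSON string."""
    payload = {
        "message": message,
        "location": {"lat": lat, "lon": lon},
        "sent_at": sent_at.isoformat(),
    }
    # ensure_ascii=False keeps non-ASCII text (e.g. Japanese) readable.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)


raw = build_notification(
    "I left my hat at school", 35.0, 139.0,
    datetime(2025, 5, 20, 8, 30, tzinfo=timezone.utc),
)
decoded = json.loads(raw)
```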
[0051] Database 515 stores and manages account information, learning data for each person being monitored (voice patterns, behavioral patterns, dialogue history, message style, etc.), dialogue models, map information, emergency contact information, and more.
[0052] The transmission control unit 516 transmits the question generated by the dialogue generation unit 513 to the monitored terminal 100. It also transmits the message confirmed based on the dialogue data received from the monitored terminal 100, as well as the location information acquired by the monitored terminal 100, to the caregiver terminal 200.
[0053] (Example of a processing flow) Figure 6 is a flowchart showing an example of the normal message generation and transmission process in the monitored terminal 100 according to this embodiment.
[0054] First, when the person being monitored P indicates their intention to send a message and makes a voice input (step S601), the processor 101 (dialogue control unit 404) of the monitored terminal 100 determines whether the voice input satisfies "predetermined conditions" (for example, whether or not it contains ambiguity) (step S602). This determination may also be performed by the dialogue generation unit 513 of the information processing device 500 after the monitored terminal 100 transmits the voice data to the information processing device 500.
[0055] At an intermediate level of abstraction, the "predetermined conditions" include cases in which the clarity of the voice input is judged to be low, cases in which the specificity of its content is judged to be low, cases in which the emotional state of the monitored person, as estimated from the voice input, is judged to be a specific state, and cases in which the voice input is judged to deviate from past communication patterns.
[0056] At a more specific level, low clarity of the voice input can be determined from physical characteristics such as the volume being below a predetermined threshold, frequent interruptions in speech, high ambient noise, or a speech rate that is too fast or too slow. Low specificity of the content can be determined from content characteristics such as the utterance containing few keywords, a significant lack of the main components of the message (the 5W1H elements: who, what, when, where, why, and how), or an abundance of demonstrative pronouns ("that," "it," etc.) that leaves the referent unclear. The emotional state of the monitored person is judged to be a specific state when, for example, the dialogue control unit 404 or the dialogue generation unit 513 estimates, from the tone, intonation, and tremor of the voice, an emotion that could affect the interpretation and transmission of the message content, such as strong anger, sadness, fear, or excitement. The voice input is judged to deviate from past communication patterns when, for example, the vocabulary differs from what is normally used, the message length differs significantly from usual, or the tone of voice differs from the one normally used toward a particular person, and the difference from the past dialogue history and voice characteristics of the monitored person stored in the database 515 is statistically significant.
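The condition check described above can be illustrated with a minimal sketch. It assumes that speech recognition and emotion estimation have already produced the listed fields; the class name `VoiceInput`, the field names, and both threshold values are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VoiceInput:
    """Features assumed to be extracted upstream (all names are hypothetical)."""
    volume_db: float        # average input level
    interrupted: bool       # True if the utterance broke off midway
    keywords: List[str]     # keywords recognized in the utterance
    negative_emotion: bool  # output of a separate emotion estimator


# Hypothetical thresholds; the specification only says "predetermined threshold".
MIN_VOLUME_DB = -30.0
MIN_KEYWORDS = 2


def triggered_conditions(v: VoiceInput) -> List[str]:
    """Return the names of the predetermined conditions that the input satisfies."""
    reasons = []
    if v.volume_db < MIN_VOLUME_DB:
        reasons.append("low_volume")
    if v.interrupted:
        reasons.append("interrupted")
    if len(v.keywords) < MIN_KEYWORDS:
        reasons.append("low_specificity")
    if v.negative_emotion:
        reasons.append("negative_emotion")
    return reasons


def needs_dialogue(v: VoiceInput) -> bool:
    """A question is generated when at least one condition is satisfied (S602)."""
    return bool(triggered_conditions(v))
```

Returning the list of triggered conditions rather than a single flag lets the question generation step choose a question type that matches the reason for the intervention.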
[0057] If the predetermined conditions are not met (NO in S602), that is, if it is determined that the voice input is clear, a message is generated based on the voice input (step S603) and sent to the monitor terminal 200 along with location information (step S606). This message generation may also be performed by the dialogue generation unit 513 of the information processing device 500, and the generated message data may be sent to the monitor terminal 200.
[0058] On the other hand, if it is determined that the voice input meets predetermined conditions (YES in S602), the dialogue control unit 404 (or the dialogue generation unit 513 of the information processing device 500) generates a "question" to clarify the intent of the message content, outputs it as voice from the voice output unit 410 of the monitored terminal 100, and, if a display unit 108 is provided, displays related information via the display control unit 409 (step S604).
[0059] At an intermediate level of abstraction, this "question" includes questions that prompt the monitored person to supply missing information, questions that present multiple specific options, questions that encourage the refinement of expression, and questions that confirm feelings or reasons.
[0060] More specifically, a question that prompts the monitored person to supply missing information is a concrete question such as "Who are you talking about?", "Where do you want to go?", or "When did this happen?", directed at an element of the main components of the message (5W1H) that the dialogue control unit 404 could not extract from the voice input or judged to be ambiguous. A question that presents multiple specific options is used, for example, when a child says only "sweets": the dialogue control unit 404 presents two or three likely candidates, such as "Are you talking about cookies or chocolate?", and lets the child choose. A question that encourages the refinement of expression is used, for example, once the dialogue control unit 404 has generated a draft message: it confirms the draft with a question such as "Should we send this? Or should we try saying it a little more gently?", or presents an alternative expression such as "You could also say it like this." A question that confirms feelings or reasons is used, for example, when a child makes a request in an angry voice: the dialogue control unit 404 tries to draw out the reason behind the emotion or the child's true feelings with a question such as "You sound angry. Why is that?"
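As one possible illustration of questions that fill in missing 5W1H elements, the sketch below maps each unfilled element to a fixed question. The three question texts are the examples given above; the template table, the slot dictionary, and the function name are hypothetical, and an actual dialogue model would likely generate questions rather than look them up.

```python
from typing import Dict, List, Optional

# Question texts taken from the examples in the description; the table itself
# is a hypothetical simplification of the dialogue model.
QUESTION_TEMPLATES: Dict[str, str] = {
    "who": "Who are you talking about?",
    "where": "Where do you want to go?",
    "when": "When did this happen?",
}


def questions_for_missing(slots: Dict[str, Optional[str]]) -> List[str]:
    """Return one question per 5W1H element that is absent or empty in `slots`."""
    return [
        question
        for element, question in QUESTION_TEMPLATES.items()
        if not slots.get(element)
    ]
```

For an utterance in which only the addressee could be extracted, `questions_for_missing({"who": "Mom"})` yields the "where" and "when" questions in that order.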
[0061] The system then receives a response (voice input or operation input) from the monitored person P and clarifies and refines the message content through dialogue (step S605). The dialogue may be repeated until the intent becomes clear or until a predetermined number of exchanges is reached. This clarification and refinement may also be carried out by exchanging response data and new question or message draft data between the monitored terminal 100 and the information processing device 500. Once the final message is determined through the dialogue, the message acquisition unit 405 acquires it, and the transmission control unit 406 transmits it to the monitor terminal 200 along with the location information (step S606).
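The repetition rule in step S605 (continue until the intent is clear or a predetermined number of exchanges is reached) can be sketched as a bounded loop. The callbacks `is_clear`, `next_question`, and `ask`, and the default of three exchanges, are hypothetical placeholders for the dialogue control unit, the question generator, and the terminal's voice or button input.

```python
from typing import Callable, List, Tuple


def clarify(
    first_input: str,
    is_clear: Callable[[List[str]], bool],
    next_question: Callable[[List[str]], str],
    ask: Callable[[str], str],
    max_turns: int = 3,  # hypothetical "predetermined number" of exchanges
) -> Tuple[List[str], int]:
    """Ask follow-up questions until the intent is clear or max_turns is reached.

    Returns everything the monitored person said and the number of questions asked.
    """
    collected = [first_input]
    turns = 0
    while not is_clear(collected) and turns < max_turns:
        question = next_question(collected)
        collected.append(ask(question))  # response by voice or operation input
        turns += 1
    return collected, turns
```

Bounding the loop keeps the dialogue from continuing indefinitely when the responses never become clear; what happens after the limit is reached (for example, sending the best available draft) is left to the caller.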
[0062] (Specific examples) For example, if the child P, who is being monitored, speaks only "Mom, the park..." into the microphone 105 (S601), the processor 101 (dialogue control unit 404) transmits the voice data to the information processing device 500. The dialogue generation unit 513 of the information processing device 500 determines that this voice input lacks specificity (meets a predetermined condition) (YES in S602). Therefore, the dialogue generation unit 513 generates a question such as "What happened at the park? Can you tell me what you want to tell your mom?" and transmits it to the monitored terminal 100. The monitored terminal 100 outputs the question audibly from the speaker 106. At the same time, the display unit 108 displays, for example, a "park" icon 901 and a "?" mark 902, or a keyword 903 such as "What do you want to tell me?" as shown in Figure 11 (S604). When the child responds, "I fell down while playing," the audio is sent to the information processing device 500, and the dialogue generation unit 513 further asks, "Are you okay? Does it hurt anywhere?" As shown in Figure 12, the display unit 108 displays, for example, an icon 904 of a "bandage," the words "Are you okay?" 905, and "Yes" icons 906 and "No" icons 907 to prompt a response. When the child responds, "My knee hurts a little, but I'm okay," and presses the button corresponding to the "Yes" icon 906, the dialogue generation unit 513 integrates these conversational contents and generates a draft message such as, "To Mom: I fell down while playing in the park. My knee hurts a little, but I'm okay," and sends it to the monitored terminal 100 to ask the child for confirmation. At this time, the display unit 108 may also display a summary of the message (e.g., "I fell down in the park. My knee hurts, but OK" 908) and a transmission confirmation icon (e.g., a paper airplane mark 909) as shown in Figure 13 (S605). 
When the child responds with "Yes, that's fine" (or, for example, presses the YES button on the operation input unit 104), the processor 101 (transmission control unit 406) sends this confirmed message, along with the location information acquired by the location information acquisition unit 403 at that time, to the monitor terminal 200 (S606).
[0063] (Regarding learning) The dialogue generation unit 513 (learning control function) of the information processing device 500 stores the dialogue history with the person being monitored P (such as what questions were asked, how the child responded, and what message was ultimately generated or selected) in the database 515. Based on the stored dialogue history, the dialogue model (language model, dialogue strategy model, etc.) within the dialogue generation unit 513 is adaptively adjusted. For example, it learns the vocabulary and expression patterns frequently used by a particular child, easy-to-understand question formats, and easy-to-respond-to options, and automatically adjusts the content and order of questions, the style of message candidate generation, and the content displayed on the display unit 108 to better suit that child in the next dialogue. As a result, the more it is used, the more natural, easy to understand, and less stressful the dialogue control becomes as a communication partner for the child. The memory 102 of the monitored terminal 100 may temporarily store the most recent dialogue history with the monitored person P, response patterns to specific questions, and recognition results of frequently used words and phrases. The dialogue control unit 404 operating on the processor 101 may perform adaptive adjustments based on this locally stored dialogue history, such as adjusting the order of questions, adjusting the display interface, or fine-tuning the weighting of the local recognition model.
[0064] (Regarding emergency response) Figure 7 is a flowchart showing an example of emergency response processing in the monitored terminal 100 according to this embodiment.
[0065] The processor 101 (emergency determination unit 407) of the monitored terminal 100 collects information regarding the status of the monitored person P (such as acoustic characteristics of voice input, operation patterns to the operation input unit 104, and location and movement patterns from the location information acquisition unit 403) continuously or periodically, and transmits it to the information processing device 500 (step S701).
[0066] The dialogue generation unit 513 (situation judgment function, urgency determination model) of the information processing device 500 compares this information with the normal patterns stored in the database 515 to determine the urgency of the situation (step S702). At an intermediate level of abstraction, this urgency determination includes a process of estimating abnormal emotions (fear, pain, etc.) from the voice of the monitored person, a process of detecting unusual operations on the operation input unit (pressing the SOS button, abnormally rapid repeated pressing, etc.), a process of detecting dangerous situations (entering a dangerous area, a fall or impact, prolonged immobility, etc.) from location information and motion sensors, and a process of evaluating these pieces of information in combination. More specifically, it includes processes such as classifying emotions by inputting acoustic features of speech (pitch, power, formants, mel-frequency cepstral coefficients (MFCCs), etc.) into a machine learning model, detecting the magnitude of a fall or impact from acceleration sensor data patterns and determining whether it exceeds a predetermined threshold, and determining from GPS information whether the current location is within a pre-registered danger area or deviates significantly from the usual range of activity. These individual results are then evaluated together using rule-based or machine learning-based methods to determine the final urgency level (e.g., four levels: low, medium, high, and SOS).
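The rule-based variant of the combined evaluation might look like the following sketch. The four levels are those named above; the counting rule, the impact threshold, and all function and parameter names are assumptions made purely for illustration, and a machine learning-based evaluator could replace the rule entirely.

```python
from enum import IntEnum


class Urgency(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    SOS = 3


# Hypothetical threshold for the acceleration magnitude, in units of g.
IMPACT_THRESHOLD_G = 3.0


def assess_urgency(
    sos_pressed: bool,
    abnormal_emotion: bool,
    impact_g: float,
    in_danger_area: bool,
) -> Urgency:
    """Combine individual judgment results into one urgency level.

    Assumed rule: an SOS button press is always the highest level; otherwise
    the level rises with the number of abnormal signals, capped at HIGH.
    """
    if sos_pressed:
        return Urgency.SOS
    signals = sum([abnormal_emotion, impact_g > IMPACT_THRESHOLD_G, in_danger_area])
    return Urgency(min(signals, int(Urgency.HIGH)))


def should_enter_emergency_mode(level: Urgency, threshold: Urgency = Urgency.MEDIUM) -> bool:
    """Emergency mode is entered when the level exceeds a preset threshold."""
    return level > threshold
```

Using an ordered enumeration makes the later comparison against a preset threshold a plain integer comparison.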
[0067] If the urgency level exceeds a preset threshold (YES in S702), the information processing device 500 notifies the monitored terminal 100 of the determination result. The processor 101 (mode control unit 408) of the monitored terminal 100 receives this notification and automatically switches the operating mode to emergency mode (step S703).
[0068] In emergency mode, the processor 101 simplifies or stops the detailed dialogue processing performed during normal operation (step S704) and executes predetermined emergency contact processing. Specifically, it generates emergency information including the current location information of the monitored person P, the determined urgency level, and, if necessary, voice recordings (step S705), and automatically transmits it via the information processing device 500 to the monitor terminal 200 of the first contact (usually a guardian) (step S706).
[0069] Subsequently, the information processing device 500 or the processor 101 of the monitored terminal 100 monitors whether a response from the first contact can be confirmed within a predetermined time (step S707). If there is no response (NO in S707), it performs an escalation process to send the same emergency information to the second contact (step S708).
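The escalation in steps S706 to S708 can be sketched as follows. The description covers a first and a second contact; generalizing to an ordered list of contacts is an assumption, as are the callback names `send` and `wait_for_ack` and the 60-second default timeout.

```python
from typing import Callable, Optional, Sequence


def notify_with_escalation(
    contacts: Sequence[str],
    send: Callable[[str], None],
    wait_for_ack: Callable[[str, float], bool],
    timeout_s: float = 60.0,  # hypothetical "predetermined time"
) -> Optional[str]:
    """Send emergency information to each contact in priority order.

    Moves on to the next contact only if the current one does not respond
    within timeout_s. Returns the contact who responded, or None.
    """
    for contact in contacts:
        send(contact)
        if wait_for_ack(contact, timeout_s):
            return contact
    return None
```

A `None` result means no registered contact responded; how that case is handled is outside the scope of this sketch.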
[0070] Furthermore, during emergency mode, the processor 101 of the monitored terminal 100 may also execute in parallel processes (step S709) such as generating a warning sound from the speaker 106 or displaying an emergency warning on the display unit 108 (via the display control unit 409) such as a flashing red light as shown in Figure 14, a large SOS icon 910, or the current status (e.g., "Contacting guardian" 911).
[0071] (Regarding UI / UX) If the monitored terminal 100 in this embodiment is equipped with a display unit 108, the dialogue control unit 404, when executing a question, displays an easily understandable icon related to the content of the question (e.g., the park icon 901 and the "?" mark 902 in Figure 11, and the bandage icon 904 in Figure 12) or multiple response options (e.g., the "yes" icon 906 and the "no" icon 907 in Figure 12) on the display unit 108. Furthermore, depending on the content of the question, a UI could be adopted that displays more specific options, such as buttons indicating specific rows of the Japanese syllabary (e.g., "a," "ka," "sa"), icons representing place categories (e.g., "school," "home," "park"), or simple face icons representing emotions (e.g., "happy," "sad," "angry"), allowing the person being monitored to tap these as part of their response. This allows the monitored person P to more easily understand the intent of the conversation and respond appropriately by combining voice instructions with visual information. The display control unit 409 controls the basic display functions such as the clock, battery level, and sender icon so that they are displayed in accordance with, or in conjunction with, the conversation and emergency situation displays. For example, during a conversation, as shown in Figures 11 to 13, the clock display area is temporarily switched to the message display area, or the time and battery level are displayed in small font at the top of the screen while the main area is used for the conversation. Figure 15 shows an example of the display function when a new message is received, showing the sender icon 912, a message 913 such as "You have a message!", and the reception time 914.
[0072] (Modification) Instead of responding verbally to the question "What happened at the park?", the monitored person may tap one of several icons or short phrases displayed on the display unit 108, such as "It was fun," "I fell down," or "I played with friends." Likewise, when asked "Does anything hurt?", the monitored person may tap the "bandage" icon from among the displayed "face (smiling)," "face (crying)," and "bandage" icons, and then add verbally, "My knee hurts a little, but I'm okay." Combining voice input with tap input on a limited set of options displayed on the display unit (icons, symbols, short phrases, etc.) in this way enables smoother and more accurate communication.
[0073] Furthermore, providing a dedicated physical button for sending an SOS signal on the operation input unit 104 is an effective means for children to call for help intuitively and quickly in an emergency. When this SOS button is pressed, the processor 101 (emergency determination unit 407) transmits that information to the information processing device 500, and the information processing device 500 (situation judgment function of the dialogue generation unit 513) immediately determines that the urgency is at the highest level and instructs the monitored terminal 100 to switch to emergency mode and start emergency contact processing.
[0074] (Examples of UI for the monitor terminal) Figure 16 shows an example of the screen displayed on the monitor terminal 200 (smartphone app) used by the monitor G when it receives a message and location information from the monitored terminal 100 during normal operation. The current location icon 1002 of the monitored person P is displayed on the map 1001, and the confirmed message (e.g., "I fell in the park, but I'm okay!") is displayed in a speech bubble 1003. The message sending time 1004 and the battery level 1005 of the monitored terminal 100 are also displayed.
[0075] Figure 17 shows an example of the dialogue log confirmation screen (displayed according to privacy settings). The main exchanges between the person being monitored P and the dialogue control unit (or dialogue generation unit) (e.g., "What's wrong?", child: "I fell down", "Are you okay?", child: "Yes") are displayed in chronological order, allowing the monitor G to gain a deeper understanding of the context in which the messages were generated.
[0076] Figure 18 shows an example of an emergency notification screen. The entire screen is displayed in a warning color (e.g., red), and the type of emergency (e.g., "SOS button pressed!" 1007), the last known location 1008 of the monitored person P, and the time of the emergency 1009 are displayed in large letters. Action buttons that encourage an immediate response, such as the "Audio Confirmation" button 1010 (plays ambient sounds) and the "Contact Police" button 1011, are also placed on the screen.
[0077] Figure 19 shows an example of the settings screen on the monitor terminal 200. The monitor G can freely set the level of intervention for dialogue support 1012 (e.g., "actively assist," "moderately assist," "OFF"), the priority of emergency contacts 1013, safe zones (geofences) 1014, and whether the learning function is ON or OFF 1015. These settings are saved in the information processing device 500. As part of the "actively assist" level, or as a separate setting, a "constant support mode" may be provided that prompts for confirmation or additional information each time voice input is received, regardless of how ambiguous the content is. This is effective, for example, for very young children or when more attentive support is desired. When the monitor G selects this mode, the processor 101 of the monitored terminal 100 cooperates with the information processing device 500 each time voice input is received and issues a question either without performing the formal "predetermined condition" check or by treating the mere receipt of voice input as the "predetermined condition." Furthermore, the settings screen of the monitor terminal 200 may include an item for setting the level of location information attached to messages sent from the monitored terminal 100 (for example, "Always add detailed location information," "Do not add location information except in emergencies," "Notify only the location category," "Do not add location information," etc.). Alternatively, it may be possible to turn on or off a function by which the dialogue control unit or the dialogue generation unit automatically adjusts the level of location information attached according to the situation, and to set the criteria for that adjustment (e.g., omit location information within a specific safe zone). 
Furthermore, the settings screen of the monitor terminal 200 may include an item for permitting the dialogue control unit or the dialogue generation unit to automatically generate and send a message without waiting for a clear response from the monitored person, together with the details of the confirmation process in such cases (e.g., confirmation time, whether or not a cancellation operation is required). For example, options such as "Automatically send a message in case of emergency or if there is no response" and "Always require final confirmation from the person being monitored" may be provided, allowing the monitor G to configure the settings according to the characteristics of the monitored person and the usage scenario.
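The location-level setting described in this paragraph could be represented as in the sketch below. The four option labels are those listed in the description; the enumeration, the payload dictionary layout, and the treatment of the last option as applying even in emergencies are assumptions.

```python
from enum import Enum
from typing import Optional


class LocationLevel(Enum):
    ALWAYS_DETAILED = "Always add detailed location information"
    EMERGENCY_ONLY = "Do not add location information except in emergencies"
    CATEGORY_ONLY = "Notify only the location category"
    NEVER = "Do not add location information"


def location_payload(
    level: LocationLevel,
    lat: float,
    lon: float,
    category: str,
    emergency: bool,
) -> Optional[dict]:
    """Return the location data to attach to a message, or None to attach nothing."""
    detailed = {"lat": lat, "lon": lon}
    if level is LocationLevel.ALWAYS_DETAILED:
        return detailed
    if level is LocationLevel.EMERGENCY_ONLY:
        return detailed if emergency else None
    if level is LocationLevel.CATEGORY_ONLY:
        return {"category": category}
    return None  # NEVER (assumed here to apply in emergencies as well)
```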
[0078] Figure 20 shows the "Learning Status / Dialogue Trend Report Screen," an additional example screen for the monitor terminal 200. On this screen, the communication data of the monitored person P stored in the information processing device 500 (privacy-conscious statistical information, e.g., trends in frequently used words, the number of cases in which the intent became clear through dialogue, message sending frequency, etc.) is presented to the monitor G as graphs and summaries 1016. This allows the monitor G to indirectly understand the child's communication development and the effectiveness of dialogue support. Permission settings 1017 regarding the use of learning data can also be made on this screen.
[0079] Figure 21 shows the "Server Linkage Status Screen," an additional example screen for the monitor terminal 200. It displays the communication status between the monitored terminal 100 and the information processing device 500, the version information of the dialogue model, the data synchronization status, and the like, allowing the health of the system to be confirmed.
[0080] (Regarding improvements to computer functions and user interfaces) This invention contributes to improving the functionality of a coordinated computer system consisting of a monitored terminal 100 and an information processing device 500.
Processing load and efficiency: By performing much of the dialogue processing on the information processing device 500, the processing load on the monitored terminal 100 can be reduced, and battery consumption can be suppressed. The information processing device 500 can utilize more advanced and large-scale dialogue models, and by aggregating and learning from data from multiple monitored terminals, the accuracy of dialogue support can be continuously improved. By optimizing the communication data format between the terminal and the server, the amount of communication can also be reduced.
User interface improvements: The display function (display unit 108) of the monitored terminal 100 is utilized as a dialogue interface, combining voice, visual information (icons, keywords, choices), and physical button operations to enable intuitive and easy communication with dialogue support, even for children who may have difficulty with text input. The UI of the monitor terminal 200 is also improved to allow for a deeper understanding of the child's situation and the function of dialogue support through dialogue logs (Figure 17), learning progress reports (Figure 20), etc.
Data structure: The profile of each person being monitored (voice characteristics, behavioral patterns, communication style, dialogue history, emotional tendencies, etc.) stored in the database 515 of the information processing device 500 is an important data structure that forms the basis for providing personalized dialogue services.
Real-time performance and reliability: The information processing device 500 intervenes in urgency assessment and emergency information transmission, enabling more reliable and rapid processing (e.g., simultaneous notification to multiple contacts, escalation management, and collaboration with related organizations (future expansion)), thereby improving system reliability.
[0081] (Combinations of each embodiment) It should be noted that the present invention is not limited to the embodiments described above, and can be modified as appropriate without departing from the spirit of the invention. For example, each component and processing step described above can be combined arbitrarily as long as they do not contradict each other. Furthermore, elements of each embodiment may be combined as appropriate.
[0082] Appendix (Summary) [General issue] To support the communication of monitored persons. To provide technology that improves the quality of communication support, enabling monitored persons, especially those with limited verbal expression skills, to communicate their intentions and circumstances more accurately and easily to their monitors using a device they carry.
[0083] Issues corresponding to [Appendix 1] To provide technology that improves the quality of communication support, enabling monitored persons, especially those with limited verbal expression skills, to communicate their intentions and circumstances more accurately and easily to their monitors using a device they carry. [Appendix 1] (corresponding to claim 1) A monitored terminal carried by a monitored person, comprising: an operation input receiving unit that receives operation input from the monitored person; a voice input receiving unit that receives voice input from the monitored person; a location information acquisition unit that acquires location information; a dialogue control unit that, when the voice input received by the voice input receiving unit satisfies a predetermined condition, generates a question to clarify the intent of the voice input, conducts a dialogue with the monitored person, and acquires a message determined through the dialogue; and a transmission control unit that transmits the message acquired by the dialogue control unit, together with the location information acquired by the location information acquisition unit, to a monitor terminal. [Effects of Appendix 1] In the monitored terminal, when a predetermined condition is met, such as when voice input from a monitored person with underdeveloped language skills, such as a child, is ambiguous, the dialogue control unit can proactively ask questions and clarify the intent of the message through dialogue, thereby improving the quality of communication support.
[0084] Issues corresponding to [Appendix 2] To provide communication support that is easier to use for specific user groups (e.g., young children or others who have difficulty with complex character input) by specifying the method of operation input. [Appendix 2] (corresponding to claim 2) The monitored terminal according to Appendix 1, wherein the operation input receiving unit does not have a free text input interface. [Effects of Appendix 2] Because the operation input receiving unit has no free text input interface, even young children or monitored persons who are unfamiliar with particular operations can use the communication support function with simple operations, which reduces the possibility of errors and enables more intuitive communication.
[0085] Issues corresponding to [Appendix 3] To improve the accuracy of communication support by specifying the physical voice characteristics on the basis of which the dialogue control unit determines that the voice input of the monitored person is unclear and initiates an active dialogue. [Appendix 3] (corresponding to claim 3) The monitored terminal according to Appendix 1, wherein the predetermined condition includes the volume of the voice input received by the voice input receiving unit being below a predetermined threshold. [Effects of Appendix 3] By intervening when the volume of the voice input is low, the terminal can assist in message generation even for speech that is difficult to hear, thereby increasing the success rate of communication.
[0086] Issues corresponding to [Appendix 4] To improve the accuracy of communication support by specifying the physical voice characteristics on the basis of which the dialogue control unit determines that the voice input of the monitored person is unclear and initiates an active dialogue. [Appendix 4] (corresponding to claim 4) The monitored terminal according to Appendix 1, wherein the predetermined condition includes the voice input received by the voice input receiving unit having been interrupted. [Effects of Appendix 4] By intervening when the voice input is interrupted, the terminal can understand the intent even from incomplete speech and support message generation.
[0087] Issues corresponding to [Appendix 5] To further improve the quality of communication support by specifying the content-based characteristics on the basis of which the dialogue control unit determines that the voice input from the monitored person is not specific enough and initiates an active dialogue. [Appendix 5] (corresponding to claim 5) The monitored terminal according to Appendix 1, wherein the predetermined condition includes the number of keywords included in the voice input received by the voice input receiving unit being less than a predetermined threshold. [Effects of Appendix 5] By intervening when the content of the voice input is not very specific, the terminal can understand the intent even from utterances with insufficient information and support the generation of clearer messages.
[0088] Issues corresponding to [Appendix 6] To provide communication support that is more attentive to the feelings of the monitored person by having the dialogue control unit estimate the emotional state of the monitored person and initiate a dialogue based on that estimate. [Appendix 6] (corresponding to claim 6) The monitored terminal according to Appendix 1, wherein the predetermined condition includes the dialogue control unit estimating, based on the acoustic characteristics of the voice input received by the voice input receiving unit, that the monitored person is in a predetermined negative emotional state. [Effects of Appendix 6] Intervening when the monitored person is estimated to be in a negative emotional state allows for more nuanced communication support, such as softening emotional statements or attending to the feelings of the monitored person.
[0089] Issues corresponding to [Appendix 7] To make the content of the "question" asked by the dialogue control unit more specific and to efficiently clarify the main components of the message. [Appendix 7] (corresponding to claim 7) The monitored terminal according to Appendix 1, wherein the question generated by the dialogue control unit includes a question for identifying a missing element among the main components of the message content. [Effects of Appendix 7] When a message lacks essential basic information (such as the 5W1H elements), asking precise questions to fill that gap improves the logic and clarity of the message.
[0090] Issues corresponding to [Appendix 8] To improve the smoothness of the dialogue and enhance the effectiveness of communication support by having the dialogue control unit ask questions in a format that is easy for the monitored person to respond to. [Appendix 8] (corresponding to claim 8) The monitored terminal according to Appendix 1, wherein the dialogue control unit generates a plurality of message candidates, presents them to the monitored person, and confirms the message based on the selection made by the monitored person. [Effects of Appendix 8] By presenting multiple message candidates, even those who have difficulty expressing themselves verbally can respond easily, allowing the dialogue to proceed smoothly.
[0091] Issues corresponding to [Appendix 9] To further ensure the safety of the monitored person by providing, in addition to the communication support functions, a function that automatically detects and responds to emergencies. [Appendix 9] (corresponding to claim 9) The monitored terminal according to Appendix 1, further comprising an urgency determination unit that determines the urgency of the monitored person's situation based on the voice input received by the voice input receiving unit, the message acquired by the dialogue control unit, or the acoustic characteristics of the voice input, wherein, when the urgency determination unit determines that the urgency is high, the transmission control unit transmits the message together with identifiable priority information or with priority over other messages. [Effects of Appendix 9] The terminal assesses the urgency of the monitored person's situation and, if the situation is urgent, sends the message with high priority. This encourages a swift response by the monitor in the event of an emergency, thereby improving the safety of the monitored person.
[0092] Issues corresponding to [Appendix 10] To improve the accuracy and reliability of the urgency determination by establishing a specific criterion for it. [Note 10] (corresponding to claim 10) The monitored terminal as described in Appendix 9, wherein the urgency determination unit determines that the urgency is high when the emotion estimated from the voice input is negative. [Effects of Appendix 10] By detecting negative emotions in the person being monitored and raising the urgency level, it becomes easier to respond to psychological crises.
[0093] Issues corresponding to [Appendix 11] To improve the accuracy and reliability of the urgency determination by establishing a further specific criterion for it. [Note 11] (corresponding to claim 11) The monitored terminal as described in Appendix 9, wherein the urgency determination unit determines that the urgency is high when the ambient noise level in the voice input received by the voice input receiving unit is above a predetermined threshold. [Effects of Appendix 11] By detecting situations with high ambient noise levels (e.g., accident scenes, incidents in crowded areas) and raising the urgency level, it becomes easier to respond to physically dangerous situations.
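The urgency criteria of Notes 9 to 11 can be sketched as follows. This is a minimal illustration under stated assumptions: the emotion labels, the 80 dB threshold, the `VoiceFeatures` type, and the choice to combine the two criteria with a logical OR are all hypothetical; the source only specifies a negative estimated emotion and a "predetermined threshold" for ambient noise.

```python
from dataclasses import dataclass

# Assumed value; the source only says "predetermined threshold".
NOISE_THRESHOLD_DB = 80.0


@dataclass
class VoiceFeatures:
    emotion: str             # assumed labels: "negative", "neutral", "positive"
    ambient_noise_db: float  # assumed unit: decibels


def is_urgent(features: VoiceFeatures,
              noise_threshold_db: float = NOISE_THRESHOLD_DB) -> bool:
    """High urgency if the emotion is negative or the noise is above the threshold."""
    return (features.emotion == "negative"
            or features.ambient_noise_db > noise_threshold_db)


def attach_priority(message: str, features: VoiceFeatures) -> dict:
    """Attach identifiable priority information to the message before sending."""
    priority = "high" if is_urgent(features) else "normal"
    return {"message": message, "priority": priority}
```

For example, `attach_priority("I fell", VoiceFeatures("negative", 30.0))` yields a payload marked `"high"`, which a transmission control unit could then send ahead of other messages.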
[0094] Issues corresponding to [Appendix 12] To optimize the dialogue function to suit the individual characteristics of the person being monitored, thereby achieving more natural and effective communication support. [Note 12] (corresponding to claim 12) The monitored terminal as described in Appendix 1, further comprising a storage unit that stores a history of dialogues with the monitored person, wherein the dialogue control unit adaptively adjusts the content or order of the questions based on the stored dialogue history. [Effects of Appendix 12] By learning from the dialogue history and adjusting the way questions are asked to suit each child, it becomes possible to provide more personalized, less stressful, and more effective communication support.
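One possible reading of the adaptive adjustment in Note 12 is sketched below: questions that the monitored person has answered more reliably in the past are asked first. The history format (pairs of question identifier and whether it was answered), the neutral score of 0.5 for unseen questions, and the function `order_questions` are assumptions made for illustration; the source does not specify how the history is used.

```python
from collections import Counter


def order_questions(candidates: list, history: list) -> list:
    """Order candidate question ids by past answer rate, highest first.

    history is a list of (question_id, answered) pairs, where answered is a bool.
    Questions never asked before get a neutral score of 0.5 (an assumption).
    """
    asked = Counter(q for q, _ in history)
    answered = Counter(q for q, ok in history if ok)

    def answer_rate(question_id: str) -> float:
        if asked[question_id] == 0:
            return 0.5
        return answered[question_id] / asked[question_id]

    # sorted() is stable, so candidates with equal rates keep their given order.
    return sorted(candidates, key=answer_rate, reverse=True)
```

With a history in which "where" was always answered and "why" never was, `order_questions` moves "where" to the front and "why" to the back, leaving unseen questions in between.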
[0095] Issues corresponding to [Appendix 13] To provide a user interface (UI) that enables smoother and more intuitive communication on devices with limited text input interfaces. [Note 13] (corresponding to claim 13) The monitored terminal as described in Appendix 1, further comprising a display means, wherein, when asking the question, the dialogue control unit causes the display means to display an icon or a plurality of response options related to the content of the question. [Effects of Appendix 13] By displaying icons and options as visual aids, children can more easily understand the intent of the dialogue and respond more readily than with voice-only interactions, thus improving the smoothness of communication.
[0096] Issues corresponding to [Appendix 14] To provide a means for the person being monitored to quickly and reliably request help in an emergency. [Note 14] (corresponding to claim 14) The monitored terminal as described in Appendix 1, wherein the operation input receiving unit includes a dedicated physical button for emitting an SOS signal. [Effects of Appendix 14] With a dedicated physical SOS button, children can intuitively send an emergency signal without hesitation, even in a panic, increasing the likelihood of a swift rescue.
[0097] Issues corresponding to [Appendix 15] To provide an information processing method that supports communication for the person being monitored. [Note 15] (corresponding to claim 15) An information processing method characterized by including the steps of, by a processor: receiving voice input from a person being monitored; generating a question to clarify the intent of the voice input if the received voice input satisfies predetermined conditions, and performing a dialogue with the person being monitored; obtaining a message determined through the dialogue; adding location information to the obtained message; and transmitting the message with the added location information to the monitor's terminal. [Effects of Appendix 15] A communication support method equivalent to that performed by the monitored terminal can also be protected as an invention.
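The steps of Note 15 can be sketched end to end as follows. This is a simplified illustration, not the disclosed implementation: speech is assumed to be already transcribed to text, the predetermined condition is stood in for by a word-count check (loosely modeled on the keyword-count condition of claim 5), and the names `needs_clarification`, `process_voice_input`, `ask`, and `send` as well as the threshold of 2 are hypothetical.

```python
from typing import Callable, Tuple

MIN_KEYWORDS = 2  # assumed threshold; the source says only "predetermined"


def needs_clarification(text: str, min_keywords: int = MIN_KEYWORDS) -> bool:
    """Stand-in condition: treat each whitespace-separated word as a keyword."""
    return len(text.split()) < min_keywords


def process_voice_input(text: str,
                        ask: Callable[[str], str],
                        location: Tuple[float, float],
                        send: Callable[[dict], None]) -> dict:
    """Clarify the input if needed, add location information, and transmit."""
    message = text
    if needs_clarification(text):
        # One round of dialogue; a real dialogue could take several turns.
        answer = ask("What do you want to convey?")
        message = f"{text} {answer}".strip()
    payload = {"message": message, "location": location}
    send(payload)  # transmit to the monitor's terminal
    return payload
```

Passing `ask` and `send` as callables keeps the sketch independent of any particular speech or network stack; for instance, `send` could simply be `list.append` while testing.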
[0098] Issues corresponding to [Appendix 16] To provide a program that runs on the monitored terminal and supports communication for the person being monitored. [Note 16] (corresponding to claim 16) A program characterized in that it causes a processor of a monitored terminal carried by a person being monitored to execute the steps of: receiving operation input from the person being monitored; receiving voice input from the person being monitored; acquiring location information; generating a question to clarify the intent of the voice input if the received voice input satisfies predetermined conditions, performing a dialogue with the person being monitored, and acquiring a message determined by the dialogue; and transmitting the acquired message, along with the acquired location information, to the monitor's terminal. [Effects of Appendix 16] The invention can be protected as a program that operates on a monitored terminal.
[0099] Issues corresponding to [Appendix 17] To provide a program for the monitor's terminal that receives and displays information from the person being monitored and enables responses. [Note 17] (corresponding to claim 17) A program characterized in that it causes a processor of a monitoring terminal to execute the steps of: receiving, from a monitored terminal or a server, a message confirmed through dialogue with the monitored person when voice input from the monitored person meets predetermined conditions, and location information acquired by the monitored terminal; displaying the received message and location information on a display unit; and transmitting response information to the monitored terminal or the server when the response information is received by an operation input unit in connection with the display on the display unit. [Effects of Appendix 17] The invention is protected as a program that operates on the monitor's terminal, enabling smooth communication between the monitor and the person being monitored.
[0100] Issues corresponding to [Appendix 18] To provide an information processing system that supports communication through the cooperation of a monitored terminal and a server. [Note 18] (corresponding to claim 18) An information processing system comprising a monitored terminal carried by a person being monitored, and a server capable of communicating via a communication network with the monitored terminal and with a monitoring terminal used by a monitor, wherein the monitored terminal includes: an operation input receiving means for receiving operation input from the person being monitored; a voice input receiving means for receiving voice input from the person being monitored; a location information acquisition means for acquiring location information; and a first control means that transmits the voice input received by the voice input receiving means to the server, performs a dialogue with the person being monitored based on a question received from the server to clarify the intent of the voice input, receives from the server a message confirmed by the dialogue, and transmits the message to the monitor's terminal along with the location information acquired by the location information acquisition means; and wherein the server includes a second control means that receives the voice input from the monitored terminal, generates a question to clarify the intent of the voice input and transmits it to the monitored terminal if the voice input satisfies predetermined conditions, performs a dialogue with the monitored person based on the response from the monitored terminal to the question, determines through the dialogue a message to be conveyed to the monitor, and transmits the message to the monitored terminal. [Effects of Appendix 18] By having the monitored terminal and the server share roles and work together, a more advanced and efficient communication support system as a whole can be protected.
[0101] Issues corresponding to [Appendix 19] To provide a server that handles the core processing for communication support. [Note 19] (corresponding to claim 19) A server capable of communicating via a communication network with a monitored terminal carried by a person being monitored and a monitor terminal used by a monitor, comprising: a communication interface that receives data including voice input from the monitored terminal and transmits data to the monitor terminal; a dialogue generation unit that generates a question to clarify the intent of the voice input when the voice input received via the communication interface satisfies predetermined conditions; and a transmission control unit that transmits the question generated by the dialogue generation unit to the monitored terminal via the communication interface, receives from the monitored terminal, via the communication interface, a message confirmed based on dialogue data including a response to the question and location information acquired by the monitored terminal, and transmits the received confirmed message and location information to the monitor terminal via the communication interface. [Effects of Appendix 19] This allows the invention to be protected as a server-side device and clarifies the core functions of the communication support system. [Explanation of symbols]
[0102]
1 ... Monitoring system
100 ... Monitored terminal
101 ... Processor
102 ... Memory
103 ... Location information acquisition unit (hardware)
104 ... Operation input unit (hardware)
105 ... Microphone
106 ... Speaker
107 ... Communication unit
108 ... Display unit
109 ... Vibration unit
110 ... Battery
111 ... Bus
200 ... Monitoring terminal
401 ... Operation input receiving unit
402 ... Voice input receiving unit
403 ... Location information acquisition unit (functional unit)
404 ... Dialogue control unit
405 ... Message acquisition unit
406 ... Transmission control unit (monitored terminal)
407 ... Urgency determination unit
408 ... Mode control unit
409 ... Display control unit
410 ... Audio output unit
500 ... Information processing apparatus (server)
501 ... Communication interface (IF) (server)
502 ... Storage device (server)
503 ... CPU (server)
511 ... Communication control unit (server)
512 ... Account management unit (server)
513 ... Dialogue generation unit
514 ... Data processing unit (server)
515 ... Database (server)
516 ... Transmission control unit (server)
900 ... Dialogue sender icon
901 ... Park icon
902 ... "?" mark icon
903 ... "What do you want to convey?" keyword
904 ... Band-aid icon
905 ... "Are you okay?" (text)
906 ... "Yes" icon
907 ... "No" icon
908 ... Message summary
909 ... Send confirmation icon
910 ... SOS icon
911 ... Emergency status display
912 ... Sender icon (new message)
913 ... "Your message has arrived!"
914 ... Received time (new message)
1001 ... Map
1002 ... Location icon of the person being monitored
1003 ... Message bubble
1004 ... Message sent time
1005 ... Battery level indicator (monitoring terminal)
1006 ... Dialogue log
1007 ... Emergency type
1008 ... Last known location of the person being monitored (monitoring terminal)
1009 ... Time of emergency
1010 ... "Voice Confirmation" button
1011 ... "Contact Police" button
1012 ... Dialogue support intervention level setting
1013 ... Emergency contact settings
1014 ... Safety zone setting
1015 ... Learning function ON/OFF setting
1016 ... Learning status and dialogue trend report
1017 ... Learning data usage permission settings
1018 ... Server connection status
NW ... Communication network
P ... Person being monitored
G ... Monitor
S601 to S606 ... Message generation and transmission processing steps (corresponding to Figure 6)
S701 to S709 ... Emergency response processing steps (corresponding to Figure 7)
Claims
1. A monitored terminal carried by a person being monitored, comprising: an operation input receiving unit that receives operation input from the person being monitored; a voice input receiving unit that receives voice input from the person being monitored; a location information acquisition unit that acquires location information; a dialogue control unit that, when the voice input received by the voice input receiving unit satisfies a predetermined condition, generates a question to clarify the intent of the voice input, performs a dialogue with the person being monitored, and obtains a message determined by the dialogue; and a transmission control unit that transmits the message acquired by the dialogue control unit, along with the location information acquired by the location information acquisition unit, to the caregiver's terminal.
2. The monitored terminal according to claim 1, wherein the operation input receiving unit does not have a free text input interface.
3. The monitored terminal according to claim 1, wherein the predetermined condition includes the volume of the voice input received by the voice input receiving unit being less than a predetermined threshold.
4. The monitored terminal according to claim 1, wherein the predetermined condition includes the fact that the voice input received by the voice input receiving unit has been interrupted.
5. The monitored terminal according to claim 1, wherein the predetermined condition includes the number of keywords included in the voice input received by the voice input receiving unit being less than a predetermined threshold.
6. The monitored terminal according to claim 1, wherein the predetermined condition includes the case where the dialogue control unit estimates that the person being monitored is in a predetermined negative emotional state based on the acoustic characteristics of the voice input received by the voice input receiving unit.
7. The monitored terminal according to claim 1, wherein the questioning performed by the dialogue control unit includes a question for identifying a missing element among the main components of the message content.
8. The monitored terminal according to claim 1, wherein the dialogue control unit generates a plurality of message candidates and presents them to the monitored person, and confirms the message based on the selection made by the monitored person.
9. The monitored terminal according to claim 1, further comprising an urgency determination unit that determines the urgency of the monitored person's situation based on the voice input received by the voice input receiving unit, the message acquired by the dialogue control unit, or the acoustic characteristics of the voice input, wherein, when the urgency determination unit determines that the urgency is high, the transmission control unit transmits the message along with identifiable priority information or with priority over other messages.
10. The monitored terminal according to claim 9, wherein the urgency determination unit determines that the urgency is high when the emotion estimated from the voice input is negative.
11. The monitored terminal according to claim 9, wherein the urgency determination unit determines that the urgency is high when the ambient noise level in the voice input received by the voice input receiving unit is above a predetermined threshold.
12. The monitored terminal according to claim 1, further comprising a storage unit that stores a history of dialogues with the monitored person, wherein the dialogue control unit adaptively adjusts the content or order of the questions based on the stored dialogue history.
13. The monitored terminal according to claim 1, further comprising a display means, wherein, when asking the question, the dialogue control unit causes the display means to display an icon or a plurality of response options related to the content of the question.
14. The monitored terminal according to claim 1, wherein the operation input receiving unit includes a dedicated physical button for emitting an SOS signal.
15. An information processing method characterized by including the steps of, by a processor: receiving voice input from a person being monitored; generating a question to clarify the intent of the voice input if the received voice input meets predetermined conditions, and performing a dialogue with the person being monitored; obtaining a message confirmed through the dialogue; adding location information to the obtained message; and transmitting the message with the added location information to the caregiver's terminal.
16. A program characterized by causing a processor of a monitored terminal carried by a person being monitored to execute the steps of: receiving operation input from the person being monitored; receiving voice input from the person being monitored; acquiring location information; generating a question to clarify the intent of the voice input if the received voice input meets predetermined conditions, performing a dialogue with the person being monitored, and obtaining a message determined by the dialogue; and transmitting the acquired message, along with the acquired location information, to the caregiver's terminal.
17. A program characterized by causing a processor of a monitoring terminal to execute the steps of: receiving, from a monitored terminal or a server, a message confirmed through dialogue with the monitored person when voice input from the monitored person meets predetermined conditions, and location information acquired by the monitored terminal; displaying the received message and the location information on a display unit; and transmitting response information to the monitored terminal or the server when the response information is received by an operation input unit in connection with the display on the display unit.
18. An information processing system comprising a monitored terminal carried by a person being monitored, and a server capable of communicating via a communication network with the monitored terminal and with a monitoring terminal used by a monitor, wherein the monitored terminal includes: an operation input receiving means for receiving operation input from the person being monitored; a voice input receiving means for receiving voice input from the person being monitored; a location information acquisition means for acquiring location information; and a first control means that transmits the voice input received by the voice input receiving means to the server, performs a dialogue with the person being monitored based on a question received from the server to clarify the intent of the voice input, receives from the server a message confirmed by the dialogue, and transmits the message to the monitor's terminal along with the location information acquired by the location information acquisition means; and wherein the server includes a second control means that receives the voice input from the monitored terminal, generates a question to clarify the intent of the voice input and transmits it to the monitored terminal if the voice input satisfies predetermined conditions, performs a dialogue with the monitored person based on the response from the monitored terminal to the question, determines through the dialogue a message to be conveyed to the caregiver, and transmits the message to the monitored terminal.
19. A server capable of communicating via a communication network with a monitored terminal carried by a person being monitored and a monitor terminal used by a caregiver, comprising: a communication interface that receives data including voice input from the monitored terminal and transmits data to the monitor terminal; a dialogue generation unit that generates a question to clarify the intent of the voice input when the voice input received via the communication interface satisfies predetermined conditions; and a transmission control unit that transmits the question generated by the dialogue generation unit to the monitored terminal via the communication interface, receives from the monitored terminal, via the communication interface, a message determined based on dialogue data including the response to the question and location information acquired by the monitored terminal, and transmits the received message and location information to the monitor terminal via the communication interface.