A terminal for a person being monitored, an information processing device, a control method for a terminal being monitored, a program, and an information processing system.

The monitoring terminal with dual cameras and sensors provides a comprehensive understanding of the monitored individual's context, enhancing the ability of guardians to respond to their situation and state.

JP2026110468A · Pending · Publication Date: 2026-07-02 · MIXI INC

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
MIXI INC
Filing Date
2025-07-11
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing technologies for monitoring individuals under guardianship lack the ability to provide a comprehensive understanding of their situation and internal state beyond basic location information.

Method used

A monitoring terminal equipped with multiple heterogeneous sensors and dual cameras that determine the user's context, dynamically controlling imaging based on this context to capture detailed information about the user and their environment.

Benefits of technology

Enables a deeper and more intuitive grasp of the monitored individual's situation and internal state, allowing remote guardians to respond effectively to their needs.

✦ Generated by Eureka AI based on patent content.


Abstract

[Problem] To provide, in the field of technology that uses mobile terminals to understand the situation of users in remote locations, a new information provision technology that supports user safety and smooth communication by enabling a deeper and more intuitive understanding of the user's specific situation and inner state, in addition to basic information such as location. [Solution] The processor of the monitored terminal, which is equipped with a first camera on the front of the housing, a second camera on the back, and multiple heterogeneous sensors, determines the user's context based on information acquired from the sensors and, according to the determined context, causes the first camera or the second camera to perform imaging.

Description

Technical Field

[0001] The present disclosure relates to a terminal for a person under guardianship, an information processing device, a method for controlling a terminal for a person under guardianship, a program, and an information processing system.

Background Art

[0002] Conventionally, a technique has been known in which a caregiver at a remote location grasps the position information of a person under guardianship using a mobile terminal (GPS tracker) having a GPS (Global Positioning System) function (see, for example, Patent Document 1).

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] In the technology for monitoring the safety of a person under guardianship at a remote location, it has been difficult to grasp the specific situation and internal state in which the person under guardianship is placed only with basic information such as position information. Therefore, an object of the present disclosure is to provide a new information providing technology that enables a deeper and more intuitive grasp of the situation of a person under guardianship.

Means for Solving the Problems

[0005] To solve the above problems, a monitored terminal according to one aspect of the present disclosure comprises a first camera provided on the front of the housing, a second camera provided on the back of the housing, a plurality of heterogeneous sensors for acquiring information about the user's state and the surrounding environment, and a processor. The processor determines the user's context based on the information acquired from the plurality of heterogeneous sensors, and, according to the determined context, causes at least one of the first camera and the second camera to perform imaging.

Effects of the Invention

[0006] According to one aspect of this disclosure, by performing optimal imaging according to the user's context, it becomes possible for others in a remote location to grasp more deeply and intuitively the specific external situation and internal state of the person being monitored.

Brief Description of the Drawings

[0007]
[Figure 1] An overview diagram showing the overall configuration of an information processing system according to one embodiment of the present disclosure.
[Figure 2] A block diagram showing the hardware configuration of the monitored terminal according to this embodiment.
[Figure 3] A diagram showing the functional configuration of the monitored terminal according to this embodiment.
[Figure 4] A block diagram showing the hardware configuration of the information processing device according to this embodiment.
[Figure 5] A block diagram showing the functional configuration of the information processing device according to this embodiment.
[Figure 6] A sequence diagram showing the flow of context determination and imaging control processing according to this embodiment.
[Figure 7] A flowchart showing the flow of the dual-camera coordinated shooting process according to this embodiment.
[Figure 8] A figure showing an example of the data structure of the context determination rule table used in this embodiment.
[Figure 9] A figure showing the concept of the shooting guide function according to this embodiment.
[Figure 10] A figure showing an example of the home screen of the monitored terminal 100 according to this embodiment.
[Figure 11] A figure showing an example of the UI screen for the navigation function of the monitored terminal 100 according to this embodiment.
[Figure 12] A figure showing an example of the emergency notification confirmation UI screen of the monitored terminal 100 according to this embodiment.
[Figure 13] A functional block diagram showing a more detailed functional configuration of the information processing device 200 according to this embodiment.
[Figure 14] A flowchart showing the data reception and processing flow in the information processing device 200 according to this embodiment.
[Figure 15] A block diagram showing the hardware configuration of the guardian terminal 300 according to this embodiment.
[Figure 16] A functional block diagram showing the functional configuration of the guardian terminal 300 according to this embodiment.
[Figure 17] A flowchart showing the information reception and display flow in the guardian terminal 300 according to this embodiment.
[Figure 18] A figure showing an example of the UI screen for the map display function in the guardian terminal 300 according to this embodiment.
[Figure 19] A figure showing an example of the UI screen for the album function in the guardian terminal 300 according to this embodiment.
[Figure 20] A figure showing an example of the UI screen for the settings function in the guardian terminal 300 according to this embodiment.

Modes for Carrying Out the Invention

[0008] The embodiments of this disclosure will be described in detail below with reference to the drawings. In each drawing, the same components are denoted by the same reference numerals, and redundant explanations are omitted.

[0009] <1. System Configuration and Application Overview>

[0010] Figure 1 is an overview diagram showing the overall configuration of an information processing system 1 according to an embodiment of the present disclosure. The information processing system 1 includes a monitored terminal 100 carried by a user P (for example, a child) who is the person being monitored, a guardian terminal 300 used by others such as a guardian, and an information processing device 200 (server) communicably connected to both. These devices communicate with each other via a network NW (for example, the Internet or a mobile phone network). The information processing device 200 may be a single physical server or may be composed of a plurality of servers built in a cloud computing environment.

[0011] The monitored terminal 100 and the guardian terminal 300 may each be, for example, a smartphone, a tablet terminal, or a dedicated device specialized for this function. In the present embodiment, it is assumed that a dedicated application runs on these terminals.

[0012] This application supports remote monitoring by providing the guardian terminal 300 with the user's location information and the captured images and videos acquired by the monitored terminal 100. The user can operate the operation unit 110 of the monitored terminal 100 to record a voice message or to actively take a photo and transmit it to the guardian terminal 300. In addition, as a major feature of this application, the monitored terminal 100 has a function of autonomously judging a specific situation, automatically performing imaging, and notifying the guardian.

[0013] Figure 6 is a sequence diagram showing the main coordinated operations between the devices in the information processing system 1. As shown in the figure, processing starts from multiple types of heterogeneous information acquired by the monitored terminal 100. This information includes behavioral information from the IMU (Inertial Measurement Unit), location information from the GPS receiver 107, the user's intention inferred from button operations, and the content of images captured by the cameras. The information processing device 200 does not evaluate the received information item by item; instead, it analyzes the items in combination to determine higher-level user contexts (situations) that go beyond mere location and movement, such as "emergency," "walking," and "conversing with friends." Then, according to the determined context, it generates a notification for the guardian (for example, an emergency notification or a message with an attached image that conveys daily activities) and sends it to the guardian terminal 300. The figure thus gives an overview of how diverse information is integrated and how the devices cooperate to achieve the two objectives of this disclosure: "ensuring safety" and "supporting communication."

[0014] <2. Hardware Configuration>

[0015] Figure 2 is a block diagram showing the hardware configuration of the monitored terminal 100 according to this embodiment. The monitored terminal 100 includes a processor 101, memory 102, storage 103, communication interface 111, first camera 104, second camera 105, sensor group 106, GPS receiver 107, audio input / output unit 108, display unit 109, and operation unit 110. The processor 101 comprehensively controls the operation of the entire terminal. The memory 102 functions as a work area for the processor 101, and the storage 103 stores the OS (Operating System), application programs, and various data described later.

[0016] The first camera 104 is located on the front of the housing (user side) and primarily captures the user's facial expressions. The second camera 105 is located on the back of the housing (in the user's line of sight) and primarily captures the environment around the user. The sensor group 106 includes multiple different types of sensors, such as an IMU that detects 3-axis acceleration, angular velocity, and geomagnetic field, as well as an illuminance sensor and a microphone. Although this embodiment shows an example of using multiple different types of sensors, this disclosure is not limited to this. It is also possible to configure the system to perform contextual judgment using only information obtained from a single high-performance sensor (e.g., a high-resolution camera). For example, AI image analysis can be used to comprehensively determine the user's posture, facial expressions, movements, surrounding environment, and the presence of others, and image capture control can be performed based on this. The type and number of sensors may be appropriately selected according to the characteristics of the target user (age, physical ability, cognitive ability, etc.) and the purpose of monitoring. Although this embodiment describes the first camera 104 and second camera 105 fixedly positioned on the front and back of the housing as an example, the camera configuration is not limited to this. For example, a configuration is possible in which a single high-performance camera that can be mechanically rotated is placed on top of the housing, and the imaging direction is dynamically controlled according to the context. Furthermore, a configuration that uses a 360-degree omnidirectional camera to simultaneously image the user and the surrounding environment, and extracts and processes images with an appropriate field of view according to the context, is also included in the technical concept of this disclosure. 
In addition, a configuration in which multiple cameras are placed at different angles and heights and imaging is performed using the optimal combination according to the context is also conceivable. The important thing is to realize an imaging function that appropriately captures both the user's state (including their internal state) and the surrounding environment.

[0017] Furthermore, the first camera 104 and the second camera 105 are not limited to single-function cameras. For example, the imaging unit (corresponding to the first camera) located on the front of the housing may be configured as a composite camera module combining two or more image sensors with different detection mechanisms, such as an RGB sensor for capturing normal color images, a ToF (Time-Of-Flight) sensor for measuring the three-dimensional distance to the subject, and an event vision sensor for detecting motion (events) with low power consumption. In this case, the imaging control unit 122 not only switches between the first and second cameras according to the context determined by the context determination unit 121, but also dynamically controls which sensor to activate within the first camera. For example, in a context where the user is determined to be "having a conversation with a friend," the RGB sensor can be selected to capture facial expressions in detail, and in a context where the user is "walking indoors," the event vision sensor or ToF sensor can be selected to monitor risks such as falls while keeping power consumption down, enabling more detailed imaging control. This allows the device to acquire information on the most suitable modality depending on the situation, dramatically improving the accuracy and efficiency of monitoring.

[0018] Figure 4 is a block diagram showing the hardware configuration of the information processing device 200 according to this embodiment. The information processing device 200 includes a processor 201, memory 202, storage 203, communication interface 204, etc. Since these configurations are similar to those of a typical server computer, a detailed explanation is omitted.

[0019] Figure 6 is a sequence diagram showing the main processing flow in this system. This sequence diagram shows a series of coordinated operations between each device, starting with the monitored person's terminal 100 detecting context, the information being transmitted to the information processing device 200, analyzed, and then, if necessary, a notification being sent to the guardian's terminal 300.

[0020] <3. Functional Configuration>

[0021] Figure 3 is a functional block diagram showing the functional configuration of the monitored terminal 100. The processor 101 of the monitored terminal 100 mainly functions as a context determination unit 121, an imaging control unit 122, a parameter control unit 123, a guidance output unit 124, a UI control unit 125, and a navigation control unit 126 by executing programs stored in the storage 103.

[0022] Furthermore, the processor 101 of the monitored person terminal 100 may function as a resource control unit to optimize the terminal's battery consumption. The resource control unit dynamically controls the operating modes of hardware components such as the sensor group 106 and the GPS receiver 107 according to the context determined by the context determination unit 121. For example, in low-activity contexts such as "stationary at home" or "sleeping," the sampling rate of various sensors is significantly reduced, and the polling interval of the communication module is set to a longer interval. On the other hand, in contexts requiring attention, such as "moving along the school route," high-frequency data acquisition and real-time communication are performed. In this way, by intelligently optimizing resource allocation according to the situation, long operating time for the entire terminal is achieved without sacrificing monitoring accuracy in critical situations.
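The context-dependent resource control in [0022] can be pictured as a lookup from the determined context to an operating profile. The following is a minimal sketch under stated assumptions: the `ResourceProfile` type, the context keys, and every numeric value are hypothetical and do not come from the disclosure, which only states that low-activity contexts use lower sampling rates and longer polling intervals than contexts requiring attention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceProfile:
    """Hypothetical operating profile for the sensor group and communication module."""
    sensor_hz: float        # sensor sampling rate
    poll_interval_s: float  # communication polling interval


# All values below are illustrative assumptions, not figures from the disclosure.
LOW_ACTIVITY = ResourceProfile(sensor_hz=1.0, poll_interval_s=600.0)
HIGH_ATTENTION = ResourceProfile(sensor_hz=50.0, poll_interval_s=5.0)
DEFAULT = ResourceProfile(sensor_hz=10.0, poll_interval_s=60.0)

PROFILE_BY_CONTEXT = {
    "stationary_at_home": LOW_ACTIVITY,
    "sleeping": LOW_ACTIVITY,
    "moving_along_school_route": HIGH_ATTENTION,
}


def profile_for(context: str) -> ResourceProfile:
    """Return the profile for a context, falling back to a middle-ground default."""
    return PROFILE_BY_CONTEXT.get(context, DEFAULT)
```

Keeping the table separate from the lookup function would let the profiles be tuned per device without touching the control logic.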

[0023] The context determination unit 121 determines the user's context based on multiple different types of sensor information acquired from the sensor group 106, GPS receiver 107, audio input / output unit 108, etc. Here, "context" is a higher-level concept that comprehensively represents the situation, actions, intentions, etc., in which the user is placed. Specifically, this determination unit analyzes multiple pieces of information, such as the behavior shown by the acceleration sensor, the movement pattern shown by the GPS, the sound picked up by the microphone, and whether or not buttons were pressed, by fusing them together. This allows it to determine not just a simple state, but also more complex and higher-level contexts, such as "falling (emergency)," "walking in a park (daily activity)," or "talking with a friend (communication)." In this embodiment, an example is shown in which the context determination unit 121 operates within the processor 101 of the monitored person terminal 100, but this disclosure is not limited to this. The technical concept of this disclosure also includes a configuration in which part or all of the context judgment processing is performed by an external information processing device 200 connected via a network, and the judgment result is received by the monitored person terminal 100 to perform imaging control. In this case, the processor 101 of the monitored person terminal 100 realizes the context judgment function in cooperation with the external device, and the physical location of the judgment entity is not an essential difference. What is important is that the series of functions, including context judgment based on sensor information and corresponding imaging control, are realized integrally as a whole system.

[0024] Contexts can be classified into intermediate concepts such as "stationary state," "moving state," "communication state," and "emergency state." "Moving state" includes walking, running, and travel by vehicle, while "communication state" includes face-to-face conversations and recorded voice messages.

[0025] Each context is determined by a specific combination of sensor information defined in the context determination rule table 400, as shown in Figure 8. For example, the "fall" context, a sub-concept belonging to "emergency situation," is determined when the IMU included in the sensor group 106 detects free fall (a state where acceleration is less than the acceleration due to gravity) for a predetermined period of time or longer, followed immediately by an impact exceeding a predetermined threshold. This configuration makes it possible to automatically detect dangerous situations such as a user falling.
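The "fall" rule in [0025] (free fall for at least a predetermined period, followed immediately by an impact above a threshold) can be sketched as a small scan over acceleration samples. This is an illustrative sketch only: the function name `detect_fall` and all threshold and duration defaults are assumptions, since the disclosure speaks only of "predetermined" values.

```python
import math

G = 9.81  # standard gravity in m/s^2


def detect_fall(samples, sample_hz,
                free_fall_max=0.4 * G,   # assumed: below this magnitude counts as free fall
                min_free_fall_s=0.2,     # assumed minimum free-fall duration
                impact_min=2.5 * G,      # assumed impact threshold
                impact_window_s=0.3):    # assumed meaning of "immediately"
    """Return True if a qualifying free fall is followed by an impact.

    samples: iterable of (x, y, z) accelerations in m/s^2.
    """
    min_run = max(1, math.ceil(min_free_fall_s * sample_hz))
    window = max(1, math.ceil(impact_window_s * sample_hz))
    run = 0        # consecutive free-fall samples seen so far
    remaining = 0  # non-free-fall samples left in the impact window
    for x, y, z in samples:
        magnitude = math.sqrt(x * x + y * y + z * z)
        if magnitude < free_fall_max:
            run += 1
            continue
        if run >= min_run:
            remaining = window  # a long enough free fall just ended
        run = 0
        if remaining > 0:
            if magnitude > impact_min:
                return True
            remaining -= 1
    return False
```

For example, at 50 Hz, fifteen near-zero samples followed by a 30 m/s² spike would be reported as a fall, while the same spike without a preceding free fall would not.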

[0026] The sensors used in this disclosure can be configured in various ways depending on the characteristics of the person being monitored. For monitoring children, effective sensors include an IMU for fall detection, GPS for preventing them from getting lost, a microphone for collecting voices in dangerous areas, an illuminance sensor for determining whether the child is indoors or outdoors, and an infrared sensor for detecting abnormal body temperature. On the other hand, for monitoring the elderly, possible sensors include biosensors for monitoring heart rate and blood pressure, accelerometers for detecting wandering, impact sensors for detecting impacts during falls, near-field communication sensors (NFC, etc.) for medication management, and temperature and humidity sensors for monitoring the indoor environment. Furthermore, comprehensive situational judgment is also possible with a single high-performance sensor. For example, by combining a high-resolution camera with AI image analysis technology, it is possible to comprehensively judge the user's posture, movements, facial expressions, dangers in the surrounding environment, and interactions with others, achieving contextual judgment with accuracy comparable to conventional multi-sensor systems. In this case, multidimensional information obtained from a single sensor can be comprehensively utilized as information about the user's state and the surrounding environment.

[0027] Furthermore, the context determination unit 121 may determine the context using a machine learning model instead of, or in combination with, a rule-based method. For example, a set of sensor data (acceleration, angular velocity, GPS trajectory, voice, etc.) and a correct label (context) indicating the situation may be collected in advance from a large number of subjects under various circumstances, and this can be used as training data to construct a classifier that identifies the context (e.g., support vector machine, random forest, neural network, etc.). The context determination unit 121 can probabilistically determine the current context by inputting the sensor data acquired at runtime into this trained model. In this case, features input to the machine learning model may include, for example, the mean, variance, standard deviation, and peak value of each of the three axes of acceleration data obtained from the IMU, the frequency spectrum analysis results, the speed of movement and rate of change of direction of movement obtained from GPS, the time spent in a specific area, the time-series pattern of the sound pressure level obtained from the microphone, the fundamental frequency of the sound waveform, and the brightness change pattern obtained from the illuminance sensor, either individually or in combination.
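Some of the per-axis statistics listed in [0027] (mean, variance, standard deviation, and peak value of each acceleration axis) can be computed with the Python standard library alone. The sketch below is illustrative: the function names, the feature-key naming scheme, the use of population statistics, and the choice of absolute value for the peak are assumptions. The frequency-spectrum, GPS, microphone, and illuminance features mentioned in the paragraph are not covered.

```python
import statistics


def axis_features(values):
    """Summary statistics for one acceleration axis over a time window."""
    return {
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),  # population variance (assumed)
        "std": statistics.pstdev(values),
        "peak": max(abs(v) for v in values),       # peak taken as max absolute value (assumed)
    }


def imu_window_features(window):
    """Flatten per-axis statistics into one feature dict.

    window: non-empty list of (x, y, z) acceleration samples.
    """
    features = {}
    for name, axis in zip("xyz", zip(*window)):
        for key, value in axis_features(axis).items():
            features[f"acc_{name}_{key}"] = value
    return features
```

The resulting dictionary could be converted to a fixed-order vector before being passed to whichever classifier is chosen.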

[0028] The imaging control unit 122 initiates imaging using at least one of the first camera 104 and the second camera 105, depending on the context determined by the context determination unit 121. For example, if the context of "fall" is determined, both the first camera 104 and the second camera 105 are activated to simultaneously grasp the user's safety (facial expression) and the surrounding situation. This allows guardians to confirm not only the fact that a fall occurred, but also the specific circumstances at the time through video, dramatically increasing their sense of security. Furthermore, although this embodiment shows an example where the cameras are activated after context determination, the mode of imaging control is not limited to this. For example, the first camera 104 and the second camera 105 may be operated in a constantly activated state, and depending on the determined context, the system may selectively record, save, and transmit appropriate time intervals or images from appropriate cameras from the captured video data. Furthermore, the technical concept of this disclosure also includes configurations that dynamically change imaging parameters (resolution, frame rate, compression ratio, etc.) according to the context, and configurations that synthesize and edit images from multiple cameras to generate optimal images. What is important is that there is some kind of correspondence between the determined context and the image captured, and that appropriate image information is obtained according to the context.
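The correspondence between a determined context and the cameras to activate can be sketched with a flag enum. Only the "fall" entry (both cameras) is taken from [0028]; the `Camera` type, the other table entries, and the fallback to no camera are hypothetical choices made for illustration.

```python
from enum import Flag, auto


class Camera(Flag):
    NONE = 0
    FRONT = auto()  # first camera 104 (user side)
    REAR = auto()   # second camera 105 (user's line of sight)


CAMERAS_BY_CONTEXT = {
    "fall": Camera.FRONT | Camera.REAR,   # from the text: both cameras on a fall
    "talking_with_friend": Camera.FRONT,  # assumed entry
    "walking_in_park": Camera.REAR,       # assumed entry
}


def cameras_for(context: str) -> Camera:
    """Return the set of cameras to activate; no camera for unknown contexts."""
    return CAMERAS_BY_CONTEXT.get(context, Camera.NONE)
```

A caller could then test membership, for instance `Camera.FRONT in cameras_for("fall")`, before starting each camera.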

[0029] The imaging control unit 122 can perform a variety of imaging controls depending on the physical configuration of the camera. In the case of a single rotatable camera, it mechanically controls the camera's orientation according to the context and performs time-division multiplexing of user facial expression and environment photography. In the case of a 360-degree camera, it identifies the user area and environment area from the omnidirectional image using image recognition and applies appropriate image processing (resolution adjustment, exposure compensation, etc.) to each area. In the case of a multi-camera configuration, it selects the optimal camera combination for the context according to the characteristics of each camera (wide-angle, telephoto, macro, etc.). This enables consistent imaging control functions regardless of the diversity of the camera configuration.

[0030] The parameter control unit 123 controls the imaging parameters of the other camera based on imaging information acquired by one of the first camera 104 and the second camera 105. This allows one camera to function as an advanced ambient light sensor for the other camera, which is one of the technical features of this disclosure.

[0031] Figure 7 is an example of a flowchart showing the flow of this dual-camera coordinated shooting process. In this process, the parameter control unit 123 first acquires imaging information from one camera (e.g., second camera 105) (S701). Next, the parameter control unit 123 analyzes the imaging information to determine features related to ambient light, such as brightness and color information (S702). Then, based on the analysis results, the parameter control unit 123 determines and sets the imaging parameters (exposure, white balance, etc.) of the other camera (e.g., first camera 104) so that the subject is captured optimally (S703).

[0032] For example, the parameter control unit 123 automatically adjusts the exposure value of the first camera 104 based on the average brightness of the image acquired by the second camera 105. Specifically, if the average brightness of the second camera 105 is high, it determines that there is a high possibility that the user's face will be underexposed due to backlighting and enables the HDR function.
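Steps S701 to S703 and the backlight example in [0032] can be sketched as follows: average the brightness of the rear-camera image, then derive settings for the front camera. This is a sketch under stated assumptions. The luma weights are the standard Rec. 709 coefficients, while the threshold of 180, the exposure-compensation value, and the settings dictionary are hypothetical and not specified in the disclosure.

```python
def mean_luma(pixels):
    """Average luma (0-255) of an iterable of (r, g, b) pixels, Rec. 709 weights."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        total += 0.2126 * r + 0.7152 * g + 0.0722 * b
        count += 1
    if count == 0:
        raise ValueError("no pixels to analyze")
    return total / count


def front_camera_settings(rear_pixels, backlight_luma=180.0):
    """Derive front-camera settings from the rear-camera image (S701 to S703).

    backlight_luma and the +1.0 EV compensation are illustrative assumptions.
    """
    luma = mean_luma(rear_pixels)        # S702: analyze ambient light
    backlit = luma >= backlight_luma     # bright scene ahead suggests a backlit face
    return {                             # S703: parameters for the other camera
        "hdr": backlit,
        "exposure_compensation_ev": 1.0 if backlit else 0.0,
    }
```

In practice the analysis would probably run on a downsampled preview frame, since a coarse brightness estimate is all this decision needs.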

[0033] Furthermore, the monitored terminal 100 according to this embodiment may be equipped with a shooting guide function to support the user in taking photographs proactively. This function is intended to enable users, such as children, who have little knowledge of subject composition or shooting environment, to take higher quality photographs. This function is realized, for example, by the UI control unit 125 shown in Figure 3 working in cooperation with the imaging control unit 122 and the parameter control unit 123.

[0034] Figure 9 shows a conceptual diagram of the shooting guide function according to this embodiment. For example, when a user presses the "take a photo" button to switch to shooting mode, the monitored person terminal 100 analyzes the preview image acquired by the first camera 104 or the second camera 105.

[0035] Based on this analysis result, the UI control unit 125 overlays and displays guide information to assist in shooting on the display unit 109. For example, if the monitored terminal 100 detects a person's face in the preview video, it can display a human-shaped guide frame 710 on the display unit so that the face is in the appropriate position. By aligning the subject with this guide frame 710, the user can naturally take a well-balanced composition.

[0036] As another example, if the monitored terminal 100 determines, based on the brightness information measured by the parameter control unit 123 from the preview video, that the shooting environment is too dark, it can display a text message 720 that is easy for children to understand, such as "Take the picture in a brighter place," or a sun icon on its display.

[0037] These shooting guide functions have the effect of allowing users, especially children, to enjoy the process of taking photographs, and as a result, to obtain clear, well-composed images that make it easier for parents to understand the situation. This function may serve as the basis for a dependent claim such as "the processor displays guide information on the display unit to assist in shooting when imaging is performed using the camera."

[0038] Furthermore, the white balance of the first camera 104 can be corrected based on the color temperature information of the image acquired by the second camera 105. Children, for example, often don't pay much attention to the position of the light source when taking pictures, resulting in many photos with unclear facial expressions due to backlighting, etc. This configuration automatically corrects these kinds of shooting failures specific to children, effectively recording the user's facial expressions clearly under any lighting conditions.
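The disclosure does not say how color temperature information from the second camera is turned into a white balance correction. One simple stand-in is the gray-world heuristic, which assumes the scene averages to neutral gray and derives per-channel gains from the channel means. The sketch below uses that substituted technique; the function names are hypothetical.

```python
def gray_world_gains(pixels):
    """Per-channel gains that would make the average color neutral (gray-world)."""
    count = 0
    sum_r = sum_g = sum_b = 0.0
    for r, g, b in pixels:
        sum_r += r
        sum_g += g
        sum_b += b
        count += 1
    if count == 0 or min(sum_r, sum_g, sum_b) <= 0:
        raise ValueError("cannot estimate gains from an empty or zero channel")
    mean_r, mean_g, mean_b = sum_r / count, sum_g / count, sum_b / count
    gray = (mean_r + mean_g + mean_b) / 3
    return gray / mean_r, gray / mean_g, gray / mean_b


def apply_gains(pixel, gains):
    """Apply gains to one (r, g, b) pixel, clamping each channel to 255."""
    return tuple(min(255, round(c * g)) for c, g in zip(pixel, gains))
```

Here the gains would be estimated from the second camera's image and applied to frames from the first camera, so a reddish ambient cast measured in the environment is removed from the user's face image.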

[0039] Figure 5 is a functional block diagram showing an example of the functional configuration of the information processing device 200, and Figure 13 shows a more detailed functional configuration. The processor 201 of the information processing device 200 executes a program stored in the storage 203 to receive and analyze various data transmitted from the monitored person terminal 100, and to process and provide it to the guardian terminal 300 as useful information, realizing a variety of functions. In particular, in this embodiment, the information processing device 200 can also be the main unit of context judgment processing. The information processing device 200 receives raw time-series data (raw data) acquired by the sensor group 106 and the image data itself captured by the camera from the monitored person terminal 100, and the context analysis unit 230 of the information processing device 200 comprehensively analyzes this information and past behavioral history stored in the database 250. Because the information processing device 200 has more abundant computing resources than the terminal, it can apply more complex and large-scale machine learning models (e.g., behavior recognition models using deep learning), enabling the determination of more advanced contexts that are difficult for the terminal alone to judge (e.g., signs of emotional changes such as arguments with friends, signs of poor health based on subtle changes in walking patterns, etc.). In addition, the processor 201, as the image / sound processing unit 240, also performs the functions of the anonymization processing unit 222 and the map display data generation unit 223. The anonymization processing unit 222 detects faces other than the user's from the received image and applies anonymization processing (blurring, etc.) to protect privacy. 
Sensitive information subject to anonymization here includes, for example, the faces of third parties who appear together with the user, nameplates or belongings that show a person's name, car license plates, private homes with distinctive appearances, and school uniforms. Parents may be able to select which information to anonymize from the settings function of the parent terminal 300 (e.g., privacy settings 904 in Figure 20). The map display data generation unit 223 generates data for placing the received image data on a map, linking it with relevant location and time information. After these processes, the notification generation unit 260 generates a notification in a format that is easy for parents to understand and sends it to the parent terminal 300.
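The anonymization step can be illustrated by a function that obscures given rectangular regions of an image. This sketch assumes that detection has already happened elsewhere: the boxes are inputs, the image is a grayscale list of rows, and each region is flattened to its mean value as a crude substitute for blurring. None of these representation choices come from the disclosure.

```python
def anonymize_regions(image, boxes):
    """Return a copy of a grayscale image with each box filled by its mean value.

    image: list of rows of ints (0-255).
    boxes: iterable of (top, left, bottom, right), bottom/right exclusive.
           Boxes are assumed to come from a separate detector (not shown).
    """
    out = [row[:] for row in image]  # leave the input untouched
    height = len(out)
    width = len(out[0]) if out else 0
    for top, left, bottom, right in boxes:
        top, left = max(0, top), max(0, left)
        bottom, right = min(height, bottom), min(width, right)
        if top >= bottom or left >= right:
            continue  # box lies outside the image
        region = [out[y][x] for y in range(top, bottom) for x in range(left, right)]
        mean = sum(region) // len(region)
        for y in range(top, bottom):
            for x in range(left, right):
                out[y][x] = mean
    return out
```

Which categories of boxes are passed in (faces, nameplates, license plates, and so on) could follow the guardian's privacy settings described above.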

[0040] <4. Specific Operating Procedures for the Monitored Terminal>

[0041] This section describes the specific operating procedures for the monitored terminal 100. Since this terminal is primarily intended for use by children, intuitive and simple operation is required.

[0042] (Power On / Off) The terminal can be turned on or off by pressing and holding the power button included in the operation unit 110. When the power is turned on, the terminal can also be set to require a passcode or biometric authentication (fingerprint authentication, facial recognition, etc.) for security verification.

[0043] (Recording and sending voice messages) When the "Voice Message" button on the operation unit 110 is briefly pressed, the device enters recording standby mode. When the button is pressed again, or when silence continues for a predetermined period of time, recording ends and the message is automatically sent to the parent terminal 300. During recording, an icon or message indicating that recording is in progress is displayed on the display unit 109.

[0044] (Active photo capture and transmission) When the "Photo Capture" button on the operation unit 110 is briefly pressed, the second camera 105 (in the user's line of sight) is activated, and a still image is automatically captured with optimal focus and exposure and transmitted to the parent terminal 300. Switching to selfie mode using the first camera 104 (on the user's side) is possible by pressing the "Camera Switch" button. The captured image is previewed on the display unit 109 before transmission, and the user can choose whether or not to send it.

[0045] (Emergency notification via SOS button) Pressing and holding the dedicated SOS button located on the side or back of the monitored person's terminal 100 for more than 3 seconds will trigger an emergency notification. This notification includes the current location information, image data automatically captured by the first camera 104 and the second camera 105, and surrounding audio data, and is simultaneously transmitted to pre-registered guardian terminals 300.

[0046] Figure 12 shows an example of an emergency notification confirmation UI screen 600 of the monitored person terminal 100 according to this embodiment. When an emergency notification is successfully issued, the UI control unit 125 displays this confirmation screen 600 on the display unit 109. This confirmation screen 600 consists of a message display area 601 indicating that a notification has been sent to the guardian, and a confirmation button 602 for the user to close the screen and return to the home screen. This allows the user to visually confirm that an emergency notification has been activated, and to gain a sense of security. In addition, a "cancel" button (not shown) may be provided to cancel the notification if operated within a certain period of time in case of accidental operation.
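The long-press trigger described for the SOS button can be illustrated with a minimal sketch. The class name `SosButton`, the polling approach, and the payload field names are assumptions made for illustration only; the disclosure specifies only the hold time of more than 3 seconds and the items included in the notification.

```python
from typing import Any, Dict, Optional

SOS_HOLD_SECONDS = 3.0  # from the text: press and hold for more than 3 seconds


class SosButton:
    """Hypothetical long-press detector driven by timestamps in seconds."""

    def __init__(self) -> None:
        self.pressed_at: Optional[float] = None
        self.fired = False

    def press(self, now: float) -> None:
        self.pressed_at = now
        self.fired = False

    def release(self) -> None:
        self.pressed_at = None

    def poll(self, now: float) -> bool:
        """Return True exactly once per press, when the hold exceeds the limit."""
        if self.pressed_at is None or self.fired:
            return False
        if now - self.pressed_at > SOS_HOLD_SECONDS:
            self.fired = True
            return True
        return False


def build_emergency_payload(location: Any, first_image: Any,
                            second_image: Any, audio: Any) -> Dict[str, Any]:
    """Bundle the items the text lists for the emergency notification."""
    return {
        "type": "sos",
        "location": location,
        "first_camera_104": first_image,
        "second_camera_105": second_image,
        "audio": audio,
    }
```

Firing only once per press keeps a continued hold from sending repeated notifications; a cancel window such as the one mentioned for Figure 12 would be layered on top of this.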

[0047] <4-2. Example UI for a device used by the person being monitored>

[0048] This section describes a specific example of the user interface (UI) of the monitored terminal 100. The UI of this terminal is designed with intuitive operation and rapid response in mind.

[0049] Figure 10 shows an example of the home screen 400 of the monitored terminal 100 according to this embodiment. This home screen 400 displays basic status information such as the time display 401, date display 402, battery level icon 403, and signal strength icon 404 on the display unit. Furthermore, this screen 400 has a configuration in which the "voice message" button 405, the "photo shoot" button 406, and the "SOS" button 407 are displayed larger than other UI elements, so that even children can perform the desired operation without getting confused. To prevent accidental operation, it is desirable that the "SOS" button 407 be displayed in a different color and shape from other buttons and configured to accept input only by long-press operation.

[0050] Figure 11 shows an example UI screen 500 of the navigation function of the monitored terminal 100 according to this embodiment. This navigation screen 500 is configured to display a map area 501, a large arrow icon 502 indicating the direction to proceed, a "remaining distance display" 503 indicating the distance to the next turn, "destination information" 504 indicating the total distance to the destination and the estimated time of arrival, and a "voice ON / OFF button" 505 to switch voice guidance ON / OFF. It is desirable that the map area 501 rotates automatically according to the user's direction of travel so that the map is always oriented to match the user's field of view.

[0051] <5. Processing Flow of Information Processing Device>

[0052] <5-1. Functional Configuration of Information Processing Device (Details)>

[0053] Figure 13 is a functional block diagram showing a more detailed functional configuration of the information processing device 200 according to this embodiment. The information processing device 200 functions as a communication unit 210, a data receiving unit 220, a context analysis unit 230, an image / audio processing unit 240, a database 250, a notification generation unit 260, and a data transmission unit 270.

[0054] The communication unit 210 has the function of sending and receiving data between the monitored person's terminal 100 and the guardian's terminal 300. The data receiving unit 220 has the function of receiving location information, sensor information, image data, audio data, etc. transmitted from the monitored person's terminal 100.

[0055] The context analysis unit 230 has the function of analyzing in detail the current context of the person being monitored based on the various sensor information transmitted from the monitored person terminal 100 and past behavior history data. The information received here may include not only information that has been given some meaning on the terminal side (e.g., "fall detection flag"), but also raw time-series data (raw data) output from the sensor group 106 and the captured image data itself. The image / audio processing unit 240 has the function of performing various processes on the received image data and audio data, such as anonymization, map display data generation, image recognition, and voice recognition. Furthermore, the context analysis unit 230 may include a predictive model that performs time-series analysis of long-term data (behavior patterns, movement history, vital information, etc.) stored in the database 250. This predictive model can, for example, predict the "risk of getting lost within 30 minutes" based on past travel routes and current travel status, or detect "signs of poor health over the weekend" based on recent changes in activity levels. The predicted future risks and contexts are notified to parents in advance via the notification generation unit 260 in a manner different from normal contextual notifications. This enables not only reactive responses but also preventative monitoring.

[0056] The database 250 has the function of accumulating various data received from the monitored person's terminal 100, context analysis results, and parent settings information in chronological order. The notification generation unit 260 has the function of generating notification content to be sent to the parent terminal 300 based on the analysis results, processing results, and parent settings. The data transmission unit 270 has the function of sending the generated notification data, etc., to the parent terminal 300.

[0057] Furthermore, the information processing system 1 may have a learning loop function that continuously improves the accuracy of context judgments based on feedback from parents. For example, the parent terminal 300 is equipped with a UI for parents to correct the displayed context information (e.g., "walking") to the correct context (e.g., "riding a bicycle") if it differs from the actual situation. The information processing device 200 collects this correction information as training data and periodically or in real time retrains (fine-tunes) the machine learning model used by the context analysis unit 230. As a result, the system is optimized to each user's unique behavior patterns and lifestyle habits, and the accuracy of judgments improves the more it is used, which is a remarkable effect.

[0058] In addition to real-time notifications, the notification generation unit 260 may also have a function to automatically extract noteworthy events from the day's activities using AI and generate a summarized "activity digest." Noteworthy events include, for example, a place visited for the first time, behavior that statistically deviates significantly from the daily behavior pattern, positive emotions such as joy or surprise that can be inferred from captured images, or a time when conversations with friends were particularly active. By viewing this digest, guardians can quickly and efficiently grasp the day's events of the person being monitored, even when busy, and use it as a starting point for communication.

[0059] Next, referring to the flowchart in Figure 14, each step of the main processing flow executed by the information processing device 200 according to this embodiment will be described in detail.

[0061] (S1401: Waiting for data reception from the monitored person's terminal) First, the information processing device 200 is in a state of constant waiting for various data (location information, sensor information, imaging data, voice data, etc.) to be transmitted via its communication unit 210 from one of the monitored person's terminals 100 connected to the network NW.

[0062] (S1402: Data Reception) When data is transmitted from the monitored terminal 100 and received, the data receiving unit 220 takes in the data and stores it in a predetermined area on memory accessible by the subsequent processing unit. At this time, an ID that identifies the sending terminal is also recorded.

[0063] (S1403: Determination of data type) Next, the data receiving unit 220 or a subsequent processing unit analyzes the header information and data structure of the received data and determines whether the data is of one of the following types: "location information," "sensor information," "image data," or "audio data." The subsequent processing branches according to this determination result.

[0064] (S1404: Context Analysis) If the received data is "location information" from the GPS receiver 107 or "sensor information" from the sensor group 106, the processing is passed to the context analysis unit 230. The context analysis unit 230 combines this information with past behavioral history data stored in the database 250 to perform a detailed analysis of the current context of the person being monitored (e.g., moving, stationary, talking, falling, etc.). The analysis results are stored in the database 250 along with a timestamp.

[0065] (S1405: Image / Audio Processing) If the received data is "image data" or "audio data," the processing is passed to the image / audio processing unit 240. The image / audio processing unit 240 performs anonymization processing for privacy protection by the anonymization processing unit 222, generates map display data by the map display data generation unit 223, and performs advanced processing such as image recognition (e.g., detection of surrounding hazards) and audio recognition (e.g., detection of screams) as needed. The processing results are similarly stored in the database 250.

[0066] (S1406: Data storage in database) All data received and generated in each processing step (raw data, processed data, analysis results, etc.) is stored chronologically in database 250, associated with the ID of the person being monitored, timestamp, etc.

[0067] (S1407: Notification Necessity Determination) The notification generation unit 260 determines whether or not a notification is necessary to the parent terminal 300 based on the context analysis results in S1404 and the image / audio processing results in S1405. This determination is made based on whether the current situation matches the notification conditions set in advance by the parent (e.g., when an "emergency context" is detected, when entering or leaving a specific "geofence" area, etc.).

[0068] (S1408: Notification generation) If it is determined in S1407 that a notification is necessary, the notification generation unit 260 generates a notification message to send to the parent terminal 300. This message includes text indicating the situation (e.g., "Fall detected"), thumbnails of related images or videos, and a link to location information.

[0069] (S1409: Notification transmission) The generated notification data is transmitted via the data transmission unit 270 to the corresponding parent terminal 300 in the form of a push notification or the like.

[0070] (S1410: Processing complete or waiting continues) The series of processes is now complete, and the information processing device 200 returns to the state of S1401 and waits for the next data to be received.
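The branching in S1403 to S1406 and the notification check in S1407 can be sketched as follows. This is only an illustration under stated assumptions: the message fields (`type`, `terminal_id`, `timestamp`), the in-memory list standing in for the database 250, and the circular geofence evaluated with the haversine formula are all hypothetical choices, not details given in the disclosure.

```python
import math
from typing import Any, Dict, List, Set, Tuple

CONTEXT_TYPES = {"location", "sensor"}  # handled by context analysis (S1404)
MEDIA_TYPES = {"image", "audio"}        # handled by image / audio processing (S1405)

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def route_message(msg: Dict[str, Any], db: List[Dict[str, Any]]) -> str:
    """S1403-S1406: choose a branch from the data type, then store a record."""
    kind = msg["type"]  # hypothetical header field naming the data type
    if kind in CONTEXT_TYPES:
        branch = "context_analysis"
    elif kind in MEDIA_TYPES:
        branch = "image_audio_processing"
    else:
        raise ValueError(f"unknown data type: {kind!r}")
    db.append({"terminal_id": msg["terminal_id"],
               "timestamp": msg["timestamp"],
               "branch": branch})
    return branch


def distance_m(a: Point, b: Point) -> float:
    """Great-circle distance in metres between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))


def needs_notification(context: str, prev_pos: Point, cur_pos: Point,
                       fence_center: Point, fence_radius_m: float,
                       notify_contexts: Set[str]) -> bool:
    """S1407: notify on a configured context or on crossing the geofence."""
    if context in notify_contexts:
        return True
    was_inside = distance_m(prev_pos, fence_center) <= fence_radius_m
    is_inside = distance_m(cur_pos, fence_center) <= fence_radius_m
    return was_inside != is_inside
```

Comparing the previous and current positions against the fence covers both entering and leaving with a single test.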

[0071] In the above embodiment, the focus was on real-time contextual judgment based on a complex analysis of sensor information, but the method of contextual judgment is not limited to this. For example, simpler judgment methods can also be implemented, such as time-based judgments based on the user's daily behavior patterns (e.g., time to go to school, time to return home), location-based judgments based on GPS location information (e.g., arrival at school, arrival at home), or event-based judgments based on the user's active actions (e.g., SOS operation, pressing the shutter button). These methods are effective when reducing processing load and optimizing battery consumption are required, and can provide a highly reliable monitoring function based on pre-set conditions.
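As a rough illustration of the simpler judgment methods mentioned above, the following sketch combines event-based, location-based, and time-based rules. The time windows, the context labels, and the priority order are assumptions chosen for the example.

```python
from datetime import time

# Hypothetical daily pattern; actual values would come from pre-set conditions.
SCHOOL_COMMUTE = (time(7, 30), time(8, 30))
RETURN_HOME = (time(15, 0), time(16, 30))


def simple_context(now: time, at_school: bool, at_home: bool,
                   sos_pressed: bool) -> str:
    """Rule-based context judgment without integrated sensor analysis."""
    if sos_pressed:                      # event-based judgment
        return "emergency"
    if at_school:                        # location-based judgment
        return "arrived_at_school"
    if at_home:
        return "arrived_at_home"
    if SCHOOL_COMMUTE[0] <= now <= SCHOOL_COMMUTE[1]:  # time-based judgment
        return "going_to_school"
    if RETURN_HOME[0] <= now <= RETURN_HOME[1]:
        return "returning_home"
    return "unknown"
```

Because each rule is a plain comparison, this style of judgment needs no model inference, which matches the stated aim of reducing processing load and battery consumption.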

[0072] <6. Other Application Examples and Modifications>

[0073] This disclosure can be used in a variety of situations, but as one example of its application, we will describe the case where the user P of the monitored terminal 100 is a child in the lower grades of elementary school. In this case, it is desirable that the terminal has a configuration that takes into account the physical and cognitive characteristics of the child.

[0074] Specifically, in terms of physical features, it may be made smaller and lighter so that it is easy to hold even in small hands, and may have a robust casing that can withstand impacts such as drops. Also, considering usability, the on-screen buttons and physical operation buttons 110 may be designed to be larger than those on terminals used by adults, or the total number of buttons may be intentionally reduced.

[0075] Furthermore, a configuration including a dedicated SOS button to immediately notify parents in emergencies is also conceivable. While such a configuration may not be directly related to the core constituent elements of this disclosure, it demonstrates that the problem this disclosure seeks to solve is not merely a general technical issue, but is particularly evident in the specific circumstances of child monitoring. Even if unexpected prior art is cited, the technical value of this disclosure is demonstrated precisely under such specific constraints, and this can serve as strong circumstantial evidence that its conception was not easy.

[0076] Furthermore, this information processing system 1 may be configured to link with external third-party systems. For example, the information processing device 200 could link via API with an enrollment management system operated by a school corporation. By comparing the legitimate arrival and departure times received from the enrollment management system with the context determined by this system (e.g., "arrived at school," "on the way home"), abnormal situations such as impersonation or abduction can be detected with greater accuracy. As another example, it is possible to link with a crime prevention information system distributed by a local government and, if a user approaches an area where suspicious person information has been issued, the context can be automatically switched to "high alert mode," and the frequency of automatic photography and the level of notification to parents can be dynamically increased.
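The comparison with an enrollment management system could look like the sketch below. The function name, the context label `on_the_way_home`, and the 15-minute tolerance are hypothetical; the disclosure does not define the API of the external system or the comparison rule.

```python
from datetime import datetime, timedelta


def attendance_mismatch(official_departure: datetime, context: str,
                        observed_at: datetime,
                        tolerance: timedelta = timedelta(minutes=15)) -> bool:
    """True when 'on the way home' is observed well before the official departure."""
    return (context == "on_the_way_home"
            and observed_at < official_departure - tolerance)
```

A mismatch would be one possible input for raising the alert level; how such a signal is weighted is left open here.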

[0077] <Variation example: Automatic video recording during voice message recording> In the above embodiment, an example was described in which voice message recording and image capture are performed by separate operations, but this disclosure is not limited thereto. As a variation, a configuration in which these two functions are linked is also possible. For example, while the user is pressing a voice message button (for example, a first button included in the operation unit 110), the processor 101 may start voice recording using the microphone of the voice input / output unit 108, and at the same time, control the image capture control unit 122 to automatically start video recording using a camera (first camera 104 in this disclosure) located on the surface of the housing where the microphone's sound hole is located or on the surface of the housing close to the microphone's sound hole. The processor 101 continues voice and video recording as long as the user's button operation is detected or as long as voice input continues, and when voice recording and video recording are finished, the acquired voice data and video data are transmitted to the parent terminal 300 via the information processing device 200 along with the location information and timestamp acquired by the GPS receiver 107. With this configuration, the facial expressions and circumstances of the user at the very moment they are trying to communicate something can be recorded simultaneously in both voice and video, allowing the parent to understand the user's state based on richer information.

[0078] In this modified version, consideration for privacy is also incorporated. In the parent terminal 300, the setting management unit 314 provides a UI (for example, in the settings screen 900 in Figure 20) for switching on or off the automatic video recording function used during voice message recording. This setting can be made individually for each monitored terminal 100 or for each user, and if the user is above a certain age, this function can be disabled by default. In addition, if a person other than the user appears in the recorded video, the processor 101 of the monitored terminal 100 or the anonymization processing unit 222 of the information processing device 200 may be configured to automatically detect the face area of the person in question, perform anonymization processing, and then send the video to the parent terminal 300.

[0079] Furthermore, this modified version can be linked with more advanced situational judgment. The context analysis unit 230 of the information processing device 200 analyzes the content of the received voice message in real time, and if it detects keywords suggesting danger, such as "help" or "I'm scared," or acoustic features such as screams, it determines that the context is an "emergency situation." Based on this determination, the information processing device 200 may send an instruction to the monitored person's terminal 100 to also start recording with the second camera 105 to capture the surrounding environment. Separately, the system may also have a function that allows the guardian to remotely control the monitored person's terminal 100's camera to forcibly start imaging from the guardian's terminal 300.
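The keyword-based emergency judgment can be sketched as below, assuming that speech recognition has already produced a text transcript and that scream detection is supplied as a separate flag. The function name and the pattern list are illustrative; only the keywords "help" and "I'm scared" come from the text.

```python
import re

# Word-boundary patterns so that, e.g., "helpful" does not match "help".
DANGER_PATTERNS = [re.compile(r"\bhelp\b"), re.compile(r"\bi'm scared\b")]


def is_emergency_message(transcript: str, scream_detected: bool = False) -> bool:
    """Judge an 'emergency situation' from a transcript or an acoustic flag."""
    text = transcript.lower()
    return scream_detected or any(p.search(text) for p in DANGER_PATTERNS)
```

When this returns True, the information processing device 200 would be the component that instructs the terminal to start recording with the second camera 105.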

[0080] <7. Configuration and Operation of the Guardian Terminal>

[0081] Next, we will explain in detail the guardian terminal 300, which is used by others such as guardians. The guardian terminal 300 functions as an interface for remotely monitoring the situation of the person being monitored and making setting changes or communicating as needed.

[0082] <7.1. Hardware Configuration>

[0083] Figure 15 is a block diagram showing the hardware configuration of the guardian terminal 300 according to this embodiment. The guardian terminal 300 is assumed to be a typical smartphone or tablet device, and its main components include a processor 301 that comprehensively controls the operation of the entire terminal, a memory 302 that functions as a work area for the processor 301, and a storage 303 that permanently stores the OS, applications, received data, etc.

[0084] The device also includes a communication interface 304 for communicating with the information processing device 200 via a network NW, a display unit 305 for displaying map information, received images, and various setting screens, an operation unit 306 consisting of a touch panel and physical buttons for receiving user instructions, and an audio input / output unit 307 for playing voice messages and making calls.

[0085] <7.2. Functional Configuration>

[0086] Figure 16 is a functional block diagram showing the functional configuration of the parent terminal 300 according to this embodiment. The processor 301 of the parent terminal 300 mainly functions as a data receiving unit 311, a data display unit 312, a notification management unit 313, a setting management unit 314, a data transmission unit 315, and a UI control unit 316 by executing the monitoring application stored in the storage 303.

[0087] The data receiving unit 311 is responsible for receiving location information of the person being monitored, imaging data (images and videos), audio data, contextual information, and various notifications (including emergency notifications) transmitted from the information processing device 200.

[0088] The data display unit 312 displays the various received data on the display unit 305 in a format that parents can intuitively understand. For example, it can plot the location of the person being monitored on a map in real time, display received images and videos in an album format, and provide a UI that allows playback of voice messages.

[0089] The notification management unit 313 manages push notifications and other notifications received from the information processing device 200 and notifies parents through sound, vibration, screen display, etc. In particular, when an emergency notification such as "fall" is received, it has a function to generate a special alert (e.g., a loud alarm sound, a flashing display of the entire screen) that is clearly distinguished from other notifications to draw attention.

[0090] The settings management unit 314 provides an interface for parents to remotely change various settings on the monitored person's terminal 100. For example, the conditions for receiving notifications (e.g., notify only when there is a fall), the frequency of automatic photography, and the destination setting for the navigation function can be set via this settings management unit 314, and the settings are reflected on the monitored person's terminal 100 via the information processing device 200 from the data transmission unit 315.

[0091] The data transmission unit 315 transmits various requests and setting change information based on operations from the guardian to the information processing device 200. The UI control unit 316 controls the UI display of the display unit 305, receives input from the operation unit 306, and passes instructions to the corresponding processing unit.

[0092] Furthermore, this system is designed with transparency and explainability in mind for AI-driven automated decision-making. For example, when the UI control unit 316 receives automatic image capture and contextual information that caused it from the information processing device 200, it not only displays the image on the screen of the parent terminal 300 but also has a function to display the reason "why this image was taken" in simple text (e.g., "A sudden impact was detected, so the surrounding situation was photographed to consider the possibility of a fall"). This allows users to understand the basis of the AI's decisions, prevents it from becoming a black box, and fosters trust in the system.
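One simple way to attach a reason to an automatically captured image is a lookup from the triggering context to a display text, as in this sketch. The trigger key `impact_detected` and the table structure are assumptions; the reason sentence itself is the example given in the text.

```python
from typing import Optional

# Hypothetical mapping from a trigger identifier to the text shown to the parent.
CAPTURE_REASONS = {
    "impact_detected": (
        "A sudden impact was detected, so the surrounding situation was "
        "photographed to consider the possibility of a fall"
    ),
}


def capture_reason(trigger: str) -> Optional[str]:
    """Return the explanation for a trigger, or None when none is registered."""
    return CAPTURE_REASONS.get(trigger)
```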

[0093] <7.3. Processing Flow of the Guardian Terminal>

[0094] Next, referring to the flowchart in Figure 17, we will describe in detail each step of the main processing flow performed by the parent terminal 300.

[0095] (S1701: Waiting for data to be received from the information processing device) While the application is running, the parent terminal 300 is constantly waiting for various data (location information, images, audio, notifications, etc.) to be transmitted from the information processing device 200 via the data receiving unit 311.

[0096] (S1702: Data reception) When data is received from the information processing device 200, the data receiving unit 311 takes in the data and buffers it in a predetermined area in memory for subsequent processing.

[0097] (S1703: Determination of data type) The UI control unit 316 or the data receiving unit 311 determines the type of data received. Specifically, it identifies whether the data is real-time location information, image data (images / videos), voice messages, system notifications (context changes, SOS alerts, etc.), or other (confirmation of setting changes, etc.).

[0098] (S1704: Location information display processing) If the received data is determined to be location information, the data display unit 312 uses that location information data as input and plots or updates an icon on the map of the display unit 305 indicating the current location of the person being monitored. If necessary, past movement history is also drawn as a line.

[0099] (S1705: Image Data Display Processing) If the received data is determined to be image data (image or video), the data display unit 312 adds the data as a new item on the album function UI and displays a thumbnail. When a parent or guardian taps the thumbnail, the screen transitions to a detailed view screen where they can enlarge the image or play the video.

[0100] (S1706: Audio message playback processing) If the received data is determined to be an audio message, the data display unit 312 displays it as a new item with a play button in the message list UI. When a parent taps the play button, the audio message is played via the audio input / output unit 307.

[0101] (S1707: Notification display processing) If the received data is determined to be a system notification, the notification management unit 313 is responsible for processing it. Depending on the urgency of the notification, it generates a normal alert or an emergency alert and displays the notification content (e.g., "〇〇 has arrived at school", "A fall has been detected", etc.) as a pop-up on the screen.

[0102] (S1709: Accepting operation input from parents) In parallel with displaying this various information, the UI control unit 316 constantly receives input from parents (such as moving the map, zooming in on images, changing settings, etc.) via the operation unit 306.

[0103] (S1710: Processing according to operation) When the UI control unit 316 receives operation input from a parent or guardian, it executes processing according to the content of that operation. For example, if it is a setting change operation, it requests processing from the setting management unit 314 and sends the change content to the information processing device 200 via the data transmission unit 315.

[0104] (S1711: Processing complete or waiting continues) After each process is completed, the parent terminal 300 returns to state S1701 and waits for the next data reception or operation from the user.

[0105] <7.4. Specific UI Examples>

[0106] Figure 18 shows an example UI screen 700 for the map display function on the parent terminal 300. An icon 702 indicating the current location of the person being monitored is displayed in the map area 701. When the parent taps the icon 702, detailed information such as the context at that time (e.g., "walking," "stationary") and battery level is displayed in a pop-up 706. This allows the parent to understand not only "where the person is" but also "what they are doing there" in a comprehensive way.

[0107] Figure 19 shows an example UI screen 800 for the album function on the parent terminal 300. Images and videos sent from the monitored person's terminal 100 are displayed in chronological order as thumbnails 801. Each thumbnail may display the date and time it was taken, as well as automatically determined context tags (e.g., "school route," "park"). This allows parents to easily search and review images relevant to the situation.

[0108] Figure 20 shows an example UI screen 900 for the settings function on the parent terminal 300. On this screen, settings such as "Notification Settings" 901, "Automatic Shooting Settings" 902, "Geofencing Settings" 903, "Privacy Settings" 904 (such as turning image anonymization ON / OFF), and "SOS Settings" 905 (emergency contacts) can be configured in a list format. This allows for flexible monitoring settings tailored to the needs of each household.

[0109] This disclosure is not limited to the embodiments and applications described above, and various modifications are possible without departing from its essence. Furthermore, this disclosure allows for simpler implementations that do not require the integrated analysis of complex sensor information. For example, imaging control based on predefined rules, such as imaging at specified times based on the user's lifestyle patterns, imaging upon arrival at or departure from a specific location, or imaging in response to a user's request, is also included in the technical concept of this disclosure as broad context-aware imaging. In addition, the monitored person terminal 100 may transmit raw time-series data (raw data) acquired from the sensor group 106, or the image data itself captured by the first camera 104 and the second camera 105, to the information processing device 200. In this case, the context analysis unit 230 of the information processing device 200 determines the context of the monitored person based on the received raw data and image data. Since the information processing device 200 has more abundant computing resources than the monitored person terminal 100, it is possible to apply more complex and large-scale machine learning models (for example, Transformer-based time-series analysis models or behavior recognition models using deep learning). This has the advantage of enabling the recognition of more sophisticated contexts that were difficult to determine with the device alone (e.g., signs of emotional changes such as arguments with friends, or signs of poor health based on subtle changes in walking patterns).

[0110] <Summary>

[0111] [General Issue] One of the purposes of this disclosure is to provide a new information provision technology that enables others in remote locations to understand the state of a user in specific circumstances more deeply and intuitively.

[0112] [Note 1] The challenge is to provide optimal video information according to the user's context. [Note 1] A terminal for a person being monitored comprising a first camera provided on the front of the housing, a second camera provided on the back of the housing, a plurality of heterogeneous sensors for acquiring information about the user's state and surrounding environment, and a processor, wherein the processor determines the user's context based on information acquired from the plurality of heterogeneous sensors, and activates at least one of the first camera and the second camera selected according to the determined context to perform imaging. According to the above, the camera best suited to capturing the user's internal state or external situation is selectively used according to the user's context, thereby improving the accuracy of situational awareness by others in remote locations.

[0113] [Issue addressed by Note 2] The challenge is to detect particularly dangerous events among the user's emergencies, such as falls, immediately and with high accuracy. [Note 2] The monitored person terminal according to claim 1, wherein the processor determines the context to be an emergency based on a free fall and a subsequent impact detected by an inertial measuring device included in the plurality of heterogeneous sensors, and activates both the first camera and the second camera to perform imaging. According to the above, a fall of the user is detected immediately, and the user's facial expression and surroundings at that moment are recorded simultaneously and reported, enabling the guardian to judge the situation and respond quickly and appropriately.
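One possible way to detect "free fall and subsequent impact" from accelerometer samples is sketched below. The thresholds (in units of g), the minimum free-fall length, and the impact window are illustrative assumptions and do not come from the disclosure.

```python
import math

G = 9.81  # standard gravity, m/s^2


def detect_fall(samples, free_fall_g=0.3, impact_g=2.5, min_free_fall=3, impact_window=10):
    """Return True if a free-fall run is followed shortly by an impact.

    samples: sequence of (ax, ay, az) accelerometer readings in m/s^2.
    All thresholds are illustrative assumptions, not values from the disclosure.
    """
    magnitudes = [math.sqrt(ax * ax + ay * ay + az * az) / G for ax, ay, az in samples]
    run = 0  # length of the current free-fall run, in samples
    for i, m in enumerate(magnitudes):
        if m < free_fall_g:
            run += 1
            continue
        if run >= min_free_fall:
            # Free fall just ended: look for an impact spike in the next samples.
            window = magnitudes[i:i + impact_window]
            if any(v > impact_g for v in window):
                return True
        run = 0
    return False
```

Requiring both phases in sequence is what distinguishes a fall from an isolated jolt: an impact spike with no preceding free-fall run does not trigger the emergency context in this sketch.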

[0114] [Issue addressed by Note 3] The challenge is to detect more subtle abnormal situations that do not involve a clear impact such as a fall, for example poor physical condition or unusual behavioral patterns. [Note 3] The monitored person terminal according to claim 1, wherein the processor compares a previously stored output of the inertial measuring device corresponding to the user's daily movement pattern with a newly acquired output of the inertial measuring device, and determines the context to be an emergency when the comparison result exceeds a predetermined threshold. According to the above, by detecting deviations from the user's daily behavior, a variety of abnormal situations other than falls can be captured, which makes monitoring more comprehensive.
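The comparison in Note 3 is not tied to a specific metric in the text. As one hedged illustration, the sketch below scores a newly acquired window of a scalar movement feature (for example, mean acceleration magnitude per minute, an assumed feature) against a stored baseline using a z-score, and flags an emergency when the score exceeds an assumed threshold.

```python
from statistics import mean, pstdev


def deviation_score(baseline, current):
    """Absolute z-score of the current mean against the stored baseline."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # Degenerate baseline: any difference counts as an infinite deviation.
        return 0.0 if mean(current) == mu else float("inf")
    return abs(mean(current) - mu) / sigma


def is_emergency(baseline, current, threshold=3.0):
    """True when the deviation exceeds the (assumed) threshold."""
    return deviation_score(baseline, current) > threshold
```

A z-score is only one choice; a distance between longer time-series windows or a learned model could fill the same role.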

[0115] [Issue addressed by Note 4] The challenge with a mobile device camera is to capture the subject with optimal image quality, especially under complex lighting conditions such as backlighting. [Note 4] The monitored person terminal according to claim 1, wherein the processor controls the imaging parameters of one of the first camera and the second camera based on imaging information acquired by the other camera. According to the above, by using one camera as an advanced environmental sensor for the other camera, the imaging parameters can be dynamically optimized and the image quality improved.

[0116] [Issue addressed by Note 5] The challenge is to prevent the face of the user, who is the subject of the image, from being rendered completely black in backlit conditions, which makes the facial expression difficult to recognize. [Note 5] The monitored person terminal according to claim 4, wherein the processor automatically adjusts the exposure value of the first camera based on the average brightness of the image acquired by the second camera. According to the above, by automatically detecting the backlit condition and correcting the exposure, the visibility of the user's facial expression can be ensured.
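A minimal sketch of the exposure adjustment in Note 5 follows. The disclosure states only that the first camera's exposure value is adjusted based on the average brightness of the second camera's image; the Rec. 601 luma weights, the linear mapping, its direction, the target level, and the clamp range are all assumptions introduced here.

```python
def mean_luma(pixels):
    """Average Rec. 601 luma (0-255) of an iterable of (r, g, b) pixels."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        total += 0.299 * r + 0.587 * g + 0.114 * b
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return total / count


def exposure_compensation(rear_luma, target=118.0, max_ev=2.0):
    """Map the rear image's mean luma to an EV offset for the front camera.

    Assumed rule: a dark rear image yields a positive offset (brighten the face),
    a bright one a negative offset, clamped to +/- max_ev.
    """
    ev = max_ev * (target - rear_luma) / target
    return max(-max_ev, min(max_ev, ev))
```

In a real terminal the resulting offset would be passed to the camera driver's exposure control, which is outside the scope of this sketch.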

[0117] [Issue addressed by Note 6] The challenge is to record the skin of the user, who is the subject of imaging, in natural colors even under ambient light of various hues. [Note 6] The monitored person terminal according to claim 4, wherein the processor corrects the white balance of the first camera based on the color temperature information of the image acquired by the second camera. According to the above, by accurately determining the color of the ambient light and optimizing the white balance, more faithful color reproduction can be achieved.

[0118] [Issue addressed by Note 7] The problem is that users who are unfamiliar with operating the terminal, such as children, find it difficult to take a picture with the intended composition. [Note 7] The monitored person terminal according to claim 1, wherein the processor, when the first camera is capturing an image, recognizes the user's face from the image being captured, and, when the second camera is capturing an image, outputs guidance that leads the user toward a predetermined composition based on tilt information acquired from the inertial measuring device. According to the above, by providing guidance in real time during shooting, the success rate of shooting can be increased regardless of the user's shooting skills.

[0119] [Issue addressed by Note 8] When guidance is provided only as text or audio, it is difficult for users to understand intuitively. [Note 8] The monitored person terminal according to claim 7, wherein the processor displays the guidance superimposed on the live view image from the first camera or the second camera. According to the above, by displaying the guidance superimposed on the image being captured, users can intuitively understand what they should do, and usability is improved.

[0120] [Issue addressed by Note 9] The challenge is to prevent users from missing photo opportunities or composing their shots poorly, and to enable them to acquire higher-quality images easily. [Note 9] The monitored person terminal according to claim 1, wherein the processor, upon receiving an imaging instruction, automatically adjusts the imaging range based on the image captured by the first camera or the second camera, and then performs the imaging. According to the above, by having the system automatically adjust the composition after the shooting operation, the burden on the user is reduced and imaging fails less often.

[0121] [Issue addressed by Note 10] When the user operates the terminal while pressing a button with a finger, other operations become difficult or the visibility of the display is impaired. [Note 10] The monitored person terminal according to claim 1, wherein the processor switches to a voice command mode, in which imaging by the camera is controlled by voice, when it determines that imaging needs to be performed while the user is operating a button. According to the above, by automatically switching to hands-free operation depending on the situation, a seamless user experience can be provided and operability improved.

[0122] [Issue addressed by Note 11] When multiple functions are consolidated into a single button, operation becomes complicated and prone to errors. [Note 11] The monitored person terminal according to claim 1, further comprising a first button for receiving a voice message recording operation and a second button for receiving an imaging operation of the first camera or the second camera. According to the above, by assigning dedicated physical buttons to the main functions, more intuitive operation with fewer errors can be achieved.

[0123] [Issue addressed by Note 12] The problem is that if imaging is performed without the user intending it, the user may not be aware that it is happening, which can cause anxiety about privacy. [Note 12] The monitored person terminal according to claim 1, wherein the housing is equipped with a touch panel display, and the processor displays the video acquired by the first camera or the second camera on the display while that camera is performing imaging, and displays the video from both cameras on the display in a split view when both cameras are performing imaging simultaneously. According to the above, by displaying the video being captured in real time, the user is clearly notified that imaging is in progress, and the transparency of the system is ensured.

[0124] [Issue addressed by Note 13] In small terminals, providing separate operation buttons and displays increases the number of parts and imposes design constraints. [Note 13] The monitored person terminal according to claim 12, wherein the display is integrally formed on the surface of the operation button that accepts voice message recording operations. According to the above, by integrating the operation unit and the display unit, both miniaturization of the terminal and a simple, refined design can be achieved.

[0125] [Issue addressed by Note 14] The challenge is to prevent unintended imaging in places where privacy is particularly important, such as schools or friends' homes. [Note 14] The monitored person terminal according to claim 1, wherein the processor stops activation of the first camera and the second camera when the terminal's current location is within a pre-registered specific area, except when the context is determined to be an emergency. According to the above, by automatically restricting the camera function in specific locations, the user's privacy can be protected on a location-by-location basis.
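The location-based restriction in Note 14 can be sketched as follows, assuming that each pre-registered area is a circle given by a center and a radius (the disclosure does not specify the shape of the area). The `Zone` type and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A circular pre-registered area (hypothetical representation)."""
    lat: float
    lon: float
    radius_m: float


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine formula)."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def camera_allowed(lat, lon, zones, emergency):
    """Cameras stay off inside any registered zone unless an emergency is active."""
    if emergency:
        return True
    return not any(distance_m(lat, lon, z.lat, z.lon) <= z.radius_m for z in zones)
```

Checking the emergency flag first reflects the exception in Note 14: an emergency overrides the privacy restriction regardless of location.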

[0126] [Issue addressed by Note 15] The problem is that when a user such as a child gets lost, a one-sided navigation system that simply presents a route does not fully relieve the user's anxiety. [Note 15] The monitored person terminal according to claim 1, wherein the processor displays navigation information overlaid on the image captured by the second camera, and at the same time determines the user's psychological state from the facial information captured by the first camera, and controls the display mode of the navigation information according to the psychological state. According to the above, by dynamically adapting the navigation method to the user's psychological state, the user's anxiety can be relieved and the user can reach the destination with greater peace of mind.

[0127] [Issue addressed by Note 16] The processing power and battery resources of the terminal are limited, which makes it difficult to perform all advanced determination processing on the terminal side. [Note 16] An information processing system comprising an information processing device and the monitored person terminal according to claim 1, wherein the monitored person terminal transmits information acquired from the plurality of heterogeneous sensors to the information processing device, and the information processing device determines the context of the user based on the information received from the monitored person terminal, and transmits to the monitored person terminal a command instructing imaging according to the determined context. According to the above, by offloading computationally intensive context determination to the server side, terminal resources are saved while more advanced and complex situational judgments become possible. On the other hand, transmitting all information to the server may cause problems with communication delays and communication costs. For this reason, the system may adopt a hybrid configuration in which, for example, emergency contexts requiring immediate action, such as a fall, are judged on the terminal side, and advanced context determinations requiring more detailed analysis and comparison with past data are judged on the server side. This makes it possible to achieve both responsiveness and advanced analytical capability.
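The hybrid configuration described in [0127] can be sketched as a small routing function. The context label `"fall"`, the callables `send_to_server` and `start_imaging`, and the choice to upload sensor data even after a local emergency decision are assumptions made for illustration.

```python
# Contexts judged on the terminal for low latency (hypothetical set).
LOCAL_EMERGENCY_CONTEXTS = {"fall"}


def route(local_context, sensor_batch, send_to_server, start_imaging):
    """Handle emergencies locally; defer everything else to the server.

    local_context: label from a lightweight on-device detector, or None.
    send_to_server / start_imaging: caller-supplied callables (assumed interfaces).
    Returns "terminal" or "server" to show where the decision was made.
    """
    if local_context in LOCAL_EMERGENCY_CONTEXTS:
        start_imaging(local_context)   # act immediately, without a round trip
        send_to_server(sensor_batch)   # assumed: still upload for later analysis
        return "terminal"
    send_to_server(sensor_batch)       # server decides and may reply with a command
    return "server"
```

Keeping the local set small limits on-device computation to the cases where a communication delay would matter most.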

[0128] [Issue addressed by Note 17] When a third party other than the user is captured in an image, the challenge is how to protect the privacy of that third party. [Note 17] An information processing device that receives an image transmitted from a monitored person terminal equipped with a camera, detects at least a portion of a person or background contained in the image as sensitive information, and anonymizes the sensitive information in the image before transmitting the image to a guardian terminal. According to the above, by automatically anonymizing the sensitive information contained in the image on the server side, the privacy of those involved is protected, and an environment in which the service can be used with peace of mind is provided.
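The disclosure does not name a specific anonymization method. As one hedged example, the sketch below pixelates a rectangular region of a grayscale image by replacing each block with its average value. The detection step that produces the region is assumed to exist elsewhere and is not shown.

```python
def pixelate(image, box, block=2):
    """Return a copy of `image` with the region `box` replaced by block averages.

    image: list of rows of grayscale values (0-255).
    box: (top, left, bottom, right), bottom/right exclusive; assumed to come
         from a separate detector that is outside the scope of this sketch.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]  # leave the input image untouched
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            ys = range(y0, min(y0 + block, bottom))
            xs = range(x0, min(x0 + block, right))
            values = [image[y][x] for y in ys for x in xs]
            avg = sum(values) // len(values)
            for y in ys:
                for x in xs:
                    out[y][x] = avg
    return out
```

Pixels outside the box are copied unchanged, so only the region flagged as sensitive loses detail before the image is forwarded to the guardian terminal.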

[0129] [Issue addressed by Note 18] The problem is that if images sent in chronological order are separated from their context, it is difficult to understand later the circumstances under which they were taken. [Note 18] An information processing device that receives image data, location information, and time information transmitted from a monitored person terminal equipped with a camera, arranges the image data on a map in association with the location information and the time information, and outputs the resulting map display data. According to the above, by visualizing the image data on a map in association with spatiotemporal information, the events of the day can be understood intuitively as a story, which promotes communication.

[Explanation of Symbols]

[0130]
1 Information processing system
100 Monitored person terminal
101 Processor
102 Memory
103 Storage
104 First camera
105 Second camera
106 Sensor group
107 GPS receiver
108 Audio input/output unit
109 Display unit
110 Operation unit
111 Communication interface
121 Context determination unit
122 Imaging control unit
123 Parameter control unit
124 Guidance output unit
125 UI control unit
126 Navigation control unit
200 Information processing device
201 Processor
202 Memory
203 Storage
204 Communication interface
210 Communication unit
220 Data receiving unit
222 Anonymization processing unit
223 Map display data generation unit
230 Context analysis unit
240 Image/audio processing unit
250 Database
260 Notification generation unit
270 Data transmission unit
300 Guardian terminal
301 Processor
302 Memory
303 Storage
304 Communication interface
305 Display unit
306 Operation unit
307 Audio input/output unit
311 Data receiving unit
312 Data display unit
313 Notification management unit
314 Settings management unit
315 Data transmission unit
316 UI control unit
400 Home screen
405 Voice message button
406 Photo shooting button
407 SOS button
600 Emergency notification confirmation UI screen
601 Message display area
602 Confirmation button
700 Map display screen
702 Monitored person location icon
706 Pop-up
800 Album screen
801 Thumbnails
900 Settings screen
901 Notification settings
902 Automatic shooting settings
903 Geofence settings
904 Privacy settings
905 SOS settings
NW Network
P User

Claims

1. A monitored person terminal comprising: a first camera provided on the front of a housing; a second camera provided on the back of the housing; a plurality of heterogeneous sensors that acquire information about a user's state and surrounding environment; and a processor, wherein the processor determines the user's context based on information acquired from the plurality of heterogeneous sensors, and causes at least one of the first camera and the second camera to perform imaging according to the determined context.

2. The monitored person terminal according to claim 1, wherein the processor determines the context to be an emergency based on a free fall and a subsequent impact detected by an inertial measuring device included in the plurality of heterogeneous sensors, and activates both the first camera and the second camera to perform imaging.

3. The monitored person terminal according to claim 2, wherein the processor compares a previously stored output of the inertial measuring device corresponding to the user's daily movement pattern with a newly acquired output of the inertial measuring device, and determines the context to be an emergency when the result of the comparison exceeds a predetermined threshold.

4. The monitored person terminal according to claim 1, wherein the processor controls the imaging parameters of one of the first camera and the second camera based on imaging information acquired by the other camera.

5. The monitored person terminal according to claim 4, wherein the processor automatically adjusts the exposure value of the first camera based on the average brightness of the image acquired by the second camera.

6. The monitored person terminal according to claim 4, wherein the processor corrects the white balance of the first camera based on the color temperature information of the image acquired by the second camera.

7. The monitored person terminal according to claim 1, wherein the processor, when the first camera is capturing an image, recognizes the user's face from the image being captured, and, when the second camera is capturing an image, outputs guidance that leads the user toward a predetermined composition based on tilt information acquired from the inertial measuring device.

8. The monitored person terminal according to claim 7, wherein the processor superimposes the guidance on the live view image from the first camera or the second camera.

9. The monitored person terminal according to claim 1, wherein the processor, upon receiving an imaging instruction, automatically adjusts the imaging range based on the image captured by the first camera or the second camera, and then performs imaging.

10. The monitored person terminal according to claim 1, wherein the processor switches to a voice command mode, in which imaging by the camera is controlled by voice, when it determines that imaging is to be performed while the user is operating a button.

11. The monitored person terminal according to claim 1, wherein the housing further includes a first button for receiving a voice message recording operation and a second button for receiving an imaging operation of the first camera or the second camera.

12. The monitored person terminal according to claim 1, wherein the housing is equipped with a display, and the processor displays the video acquired by the first camera or the second camera on the display while that camera is performing imaging, and displays the video acquired by both cameras on the display in a split view when both cameras are performing imaging simultaneously.

13. The monitored person terminal according to claim 12, wherein the display is integrally formed on the surface of the operation button that accepts voice message recording operations.

14. The monitored person terminal according to claim 1, further comprising positioning means, wherein the processor stops operation of the first camera and the second camera when the location information acquired by the positioning means is within a predetermined area, except when the context is determined to be an emergency.

15. The monitored person terminal according to claim 1, wherein the processor displays navigation information overlaid on the image captured by the second camera, and at the same time determines the user's psychological state from the facial information captured by the first camera, and controls the display mode of the navigation information according to the psychological state.

16. An information processing system comprising an information processing device and the monitored person terminal according to claim 1, wherein the monitored person terminal transmits information acquired from the plurality of heterogeneous sensors to the information processing device, and the information processing device determines the context of the user based on the information received from the monitored person terminal, and transmits to the monitored person terminal a command instructing imaging according to the determined context.

17. An information processing device that receives an image transmitted from a monitored person terminal equipped with a camera, detects at least a portion of a person or background contained in the image as sensitive information, and anonymizes the sensitive information in the image before transmitting the image to a guardian terminal.

18. An information processing device that receives image data, location information, and time information transmitted from a monitored person terminal equipped with a camera, arranges the image data on a map in association with the location information and the time information, and outputs the resulting map display data.