Information processing device, information processing method, program, and system
The information processing device addresses the limitations of conventional GPS and camera-equipped monitoring devices by automatically anonymizing images in emergencies, ensuring privacy and reliability through secure data transmission and efficient battery use.
Patent Information
- Authority / Receiving Office: JP
- Patent Type: Applications
- Current Assignee / Owner: MIXI INC
- Filing Date: 2025-06-10
- Publication Date: 2026-07-02
AI Technical Summary
Conventional GPS monitoring devices struggle to provide detailed non-verbal information about the monitored object, leading to limitations in making appropriate judgments, and camera-equipped devices face issues with privacy violations, device reliability, and sustainability.
The proposed information processing device includes a control unit for emergency detection, an image processing unit for anonymization, and a communication unit for secure data transmission. It automatically activates the camera to capture and anonymize images only in emergencies, discards the raw images, and transmits only anonymized images and metadata, thereby protecting privacy and maintaining device reliability.
The device enhances monitoring quality by providing detailed non-verbal information while protecting privacy. It maintains device reliability through efficient battery use and secure data transmission, and reduces the risk of misuse and data leakage.
Abstract
Description
Technical Field
[0001] The present disclosure relates to an information processing apparatus, an information processing method, a program, and a system.
Background Art
[0002] In recent years, with the aging of society and the increase in dual-income households, the demand for devices that monitor the safety of children and the elderly has been rapidly increasing. Conventional GPS monitoring devices generally indirectly grasp the situation of the monitored object through the position information of the monitored object and the text conversion of the voice acquired by the built-in microphone. However, with only this information, it is difficult to grasp in detail the non-verbal information that is indispensable for a deeper understanding of the situation at that time, such as the expression, posture, and surrounding environment of the monitored object, and there are limitations in enabling the monitor to make appropriate judgments.
[0003] Under such circumstances, for the purpose of improving the quality of monitoring, the development of monitoring devices equipped with cameras has begun to be considered.
Prior Art Documents
Patent Documents
[0004]
Patent Document 1
Patent Document 2
Summary of the Invention
Problems to be Solved by the Invention
[0005] However, simply mounting a camera on a monitoring device may leave room for improvement in actual use.
[0006] This disclosure has been made in light of the challenges in the background art described above, and one of its purposes is to simultaneously solve multiple problems of camera-equipped monitoring devices, including privacy protection, device reliability and sustainability, and improvement of the quality of understanding the situation of the monitored person.
Means for Solving the Problem
[0007] One technology in this disclosure is an information processing device comprising: a control unit configured to acquire acceleration information and audio information, and to automatically activate a camera and capture an image when an emergency is detected based on the acquired acceleration information and audio information; an image processing unit configured to anonymize a person's face in the captured image; and a communication unit configured to transmit the anonymized image to an external location and to immediately discard the captured original image.
Brief Description of the Drawings
[0008]
[Figure 1] This figure shows the schematic configuration of the entire monitoring system according to one embodiment of the present disclosure.
[Figure 2] This figure shows the main hardware configuration of an information processing device according to one embodiment of the present disclosure.
[Figure 3] This is a block diagram showing the functional configuration of an information processing device according to one embodiment of the present disclosure.
[Figure 4] This flowchart shows the main processing flow from emergency detection to camera activation in an information processing device according to one embodiment of the present disclosure.
[Figure 5] This flowchart shows the processing flow for anonymization, transmission, and destruction of captured images in an information processing device according to one embodiment of the present disclosure.
[Figure 6] This is a data structure diagram showing an example of anonymized metadata transmitted externally from an information processing device according to one embodiment of the present disclosure.
[Figure 7] This figure shows an example of a parent terminal's operating mode setting screen related to the camera function of an information processing device according to one embodiment of the present disclosure.
[Figure 8] This figure shows an example of a privacy zone setting screen for an information processing device according to one embodiment of the present disclosure.
Modes for Carrying Out the Invention
[0009] The embodiments of this disclosure will be described in detail below with reference to the drawings. However, this disclosure is not limited to the embodiments described below, and various modifications and improvements that a person skilled in the art could conceive of are possible without departing from the spirit of this disclosure. Furthermore, the elements described in each embodiment may be combined and applied as appropriate depending on their purpose and function.
[0010] The following are some of the problems that the technology according to the embodiments of this disclosure may address.
[0011] Firstly, the greatest concern is the risk of serious privacy violations due to unintentional filming of the child being monitored, or of bystanders who accidentally enter the camera's field of view. This could cause discomfort and anxiety among those being monitored and those around them, potentially leading to resistance to using the device itself. Furthermore, if the raw image data is uploaded to the cloud, security risks such as data leakage and misuse increase, potentially undermining the reliability of the monitoring device itself.
[0012] Secondly, if the camera's functions are poorly designed, there is a risk that the child being monitored may use the camera as a toy out of curiosity. For example, if a child frequently activates the camera or takes videos for no reason, the device's battery will drain rapidly, significantly shortening the operating time, which is crucial for the monitoring function. In addition, the frequent generation and transmission of unnecessary image data can needlessly increase communication volume and storage consumption, which in turn risks reducing the stability and reliability of the monitoring device's intended functions, such as location information notification and emergency calls.
1. Overview of the Entire System
[0013] As shown in Figure 1, a monitoring system 100 according to one embodiment of the present disclosure comprises, as its main components, an information processing device 10 (monitoring device) carried by the child being monitored, a parent terminal 20 used by a parent or other caregiver, a server 30 that mediates communication between the information processing device 10 and the parent terminal 20 and performs various data management and service provision, and a network 40 that connects these components to each other.
[0014] The information processing device 10 is the core device of this disclosure. Its built-in camera function is activated automatically, and only when triggered by multifaceted, composite emergency detection using AI (voice analysis, acceleration data, GPS information, SOS button operation, etc.). Captured image data is anonymized and de-identified in real time by edge AI operating within the information processing device 10 (face blurring, personal information masking, background abstraction, etc.). After this anonymization process, only very small metadata and abstracted image data are sent to the parent terminal 20, and the original raw image data, which is the most sensitive from a privacy protection standpoint, is immediately discarded within the device. Furthermore, the camera function of the information processing device 10 is intentionally hidden from its user interface (UI / UX), so that children cannot freely activate and use it like a normal camera app. This simultaneously protects privacy, prevents misuse of the camera by children, maintains battery efficiency, and limits data communication volume, while improving the quality of monitoring by safely and quickly providing parents with non-verbal information about their children.
[0015] The parent terminal 20 is, for example, a general-purpose smartphone, tablet terminal, personal computer, or dedicated display device, on which dedicated application software for communicating with the information processing device 10 is installed. Through this application, the parent terminal 20 receives and displays anonymized information (such as location information, urgency based on sensor data, anonymized images, and metadata including non-verbal information) from the information processing device 10, allowing the parent to grasp the situation of the child in real time. The parent can also configure various settings of the information processing device 10 (such as the usage mode of the camera function, privacy zones, and the upper limit on the number of shots) through the application on the parent terminal 20.
[0016] The server 30 is typically an information processing system built in a cloud computing environment, and is responsible for receiving and aggregating data from the information processing device 10 and transferring the data to the parent terminal 20. It also receives setting information and requests from the parent terminal 20 and relays them to the information processing device 10. If necessary, the server 30 can also provide the information processing device 10 with update data for its AI learning model, base data for behavioral history learning, and the like. The server 30 collects only anonymized information and does not store raw image data, which keeps privacy protection thorough.
[0017] The network 40 is the communication infrastructure that enables data communication among the information processing device 10, the parent terminal 20, and the server 30. It includes the Internet as a public network, mobile communication networks (e.g., cellular networks such as LTE and 5G), Wi-Fi (wireless LAN), and short-range wireless communication technologies such as Bluetooth. Direct short-range wireless communication between the information processing device 10 and the parent terminal 20 may also be possible.
2. Description of Hardware Configuration
[0018] As shown in Figure 2, the information processing device 10 includes the main hardware components necessary for functioning as a monitoring device. These include a processor 11 as a control unit, a memory 12 for temporarily holding programs and data, a sensor unit 13 for acquiring external information, a camera unit 14 for taking images, an image processing unit 15 for performing specific processing on image data, a communication unit 16 for communicating with the outside, a storage 17 for permanently storing programs and data, a display unit 18 for visually conveying information, and a power supply unit 19 for supplying power to the entire system. These hardware components are electrically connected to each other via a bus or the like that enables high-speed data transmission, and by operating in cooperation, they realize the overall functions of the information processing device 10.
[0019] The processor 11 is a central processing unit (CPU) that serves as the brain of the information processing device 10. It reads out from the storage 17 the program code for an operating system (OS) and various application programs, in particular the programs for emergency detection, camera control, image anonymization processing, communication processing, and the like, which are the core of the present disclosure, and executes them to comprehensively control the operation of the entire information processing device 10. It is desirable to employ a high-performance, low-power processor.
[0020] The memory 12 is a volatile memory (e.g., RAM such as DRAM or SRAM) that temporarily stores the instruction codes and data being processed that are necessary when the processor 11 executes a program. It is also used as the cache memory of the CPU, a work area, and various buffers, enabling high-speed data access. In order to meet the real-time processing requirements of the monitoring device, a memory with sufficient capacity and speed is installed.
[0021] The sensor unit 13 comprehensively senses the external environment, such as the movement of the monitored object to which the information processing device 10 is attached, ambient sounds, and location information, as well as the state of the device itself, and acquires the corresponding information as digital data. Specifically, it includes an accelerometer for detecting the movement state and impacts of the monitored object, a microphone for picking up ambient sounds and voices, a GPS module for acquiring the device's current location information, and an SOS button (physical push button or touch sensor) for the monitored object to directly call for help in an emergency. The various information acquired by the sensor unit 13 is input to the processor 11 in an appropriate format.
[0022] The camera unit 14 includes a lens system for capturing high-definition images, an image sensor (CCD or CMOS sensor) that converts light into electrical signals, and an image signal processing circuit that converts analog signals from the image sensor into digital image data and performs basic image signal processing such as noise reduction and color correction. The camera unit 14 automatically activates only when an emergency is detected or at the request of a parent, and captures still images or short videos of the surrounding environment. For privacy protection, the captured image data is sent directly to the subsequent image processing unit 15 and is not transmitted directly to external devices.
[0023] The image processing unit 15 may be independently configured as a dedicated processor specialized for image processing (e.g., a GPU, an NPU (Neural Processing Unit), or a dedicated ASIC implementing specific functions), or it may function as part of the processor 11 (e.g., an image processing engine integrated into the processor 11). It receives raw image data captured by the camera unit 14 and performs anonymization and de-identification processing in real time and with low power consumption using the device's built-in edge AI. This edge AI processing applies techniques such as AI model lightweighting (e.g., model quantization, pruning), training specialized for specific tasks (face detection, object recognition, pose estimation, etc.), and efficient data pipeline processing (minimizing memory copies, optimizing parallel processing) to achieve fast and accurate image analysis and transformation in a small, resource-constrained device with limited battery capacity, such as a monitoring device.
[0024] The communication unit 16 includes a group of wireless communication modules for data communication between the information processing device 10 and external devices (parent terminal 20) or systems (server 30). This includes cellular communication modems such as LTE / 5G that enable wide-area communication, wireless LAN (Wi-Fi) modules used in home Wi-Fi environments, and Bluetooth modules and NFC (Near Field Communication) modules that enable short-range direct communication with the parent terminal. The communication unit 16 is responsible for securely transmitting only anonymized, low-capacity metadata and abstracted images to the outside.
[0025] Storage 17 is non-volatile memory (e.g., flash memory, eMMC, UFS, etc.) for the information processing device 10 to permanently store data. It stores the operating system executed by the processor 11, various application programs, parameters of the AI learning model, various restriction data set by the parent (usage mode, maximum number of shots, privacy zone settings, etc.), and past behavioral history data of the monitored subject. Raw image data is not stored long-term.
[0026] The display unit 18 is an indicator that visually notifies the child being monitored that the camera is in operation. For example, an LED indicator (which shows the status by flashing or changing color), a small LCD (Liquid Crystal Display), or an organic EL display can be used. The information processing device 10 is further equipped with an audio output unit (auditory notification unit) such as a speaker or buzzer (not shown), which can also provide auditory notification of the camera's operation through melodies or voice messages. This ensures that the child is aware that the camera has been activated, reducing their anxiety about unexpected recordings.
[0027] The power supply unit 19 is configured to supply stable power to the entire information processing device 10 and includes a high-energy-density, long-life rechargeable battery (e.g., lithium-ion battery), a charging circuit to control its charging, and a power management circuit to optimize power supply to each part and manage power consumption. The power supply unit 19 also has a function to constantly monitor the battery level and notify the processor 11 (particularly the camera control unit 102) of this information. This makes it possible to prioritize the maintenance of the device's core functions (location information notification and emergency calls), for example by temporarily suspending the camera function when the battery level is low.
3. Explanation of Functional Block Configuration
[0028] As shown in Figure 3, the information processing device 10, from a functional standpoint, comprises major functional blocks such as a sensor information acquisition unit 130, an emergency detection unit 101, a camera control unit 102, an image processing unit 103, a communication unit 104, a notification unit 105, a setting management unit 106, and a behavioral history learning unit 107. These functional blocks are software modules that are logically realized by the processor 11 of the information processing device 10 executing appropriate program code stored in the storage 17, or functional units realized by the cooperative operation of hardware and software.
[0029] The sensor information acquisition unit 130 is connected to the sensor unit 13 (Figure 2) as hardware, and continuously acquires a wide variety of sensor data as digital signals, including acceleration information, audio information, location information (GPS data), and SOS button operation information. The acquired raw data is pre-processed as needed (noise reduction, sampling rate conversion, etc.) and then supplied mainly to the emergency detection unit 101. Here, acceleration information and audio information are not limited to those directly acquired from dedicated acceleration sensors and microphones. For example, the image processing unit 103 may analyze image data captured by the camera unit 14 (Figure 2) to indirectly detect state changes corresponding to acceleration information from changes in the posture of the monitored subject (e.g., falling, sudden stop), or to indirectly detect abnormal sounds corresponding to audio information (e.g., screaming, collision sounds) by analyzing the waveform of ambient sounds contained in the image data (e.g., vibration patterns, sound source locations). This makes it possible to capture signs of an emergency from multiple information sources while reducing the number of parts in the device.
[0030] The emergency detection unit 101 determines whether the monitored person is in an "emergency situation" based on the sensor data (acceleration information, audio information, location information, SOS button operation information) supplied from the sensor information acquisition unit 130 and on AI learning results, including learning results based on past behavioral history provided by the behavioral history learning unit 107. Explicit expressions of intent from the monitored person (e.g., specific voice commands spoken directly to the device, specific gestures such as shaking the device in a specific pattern, operation of the SOS button, etc.) can also be treated as a type of emergency and used as an important input for the AI's judgment. The AI learning results are used to make a comprehensive judgment based on the time-series changes in sensor data, the correlation between sensors, and the degree of deviation from the monitored person's normal behavioral pattern. As a specific detection logic, for example, emergency keywords in the voice (e.g., "Help," "It hurts," "Dangerous") are detected in real time using AI transcription technology, while impacts such as falls and sudden stops are detected from acceleration sensor data at the same time. Furthermore, GPS information is used to detect deviations from the range of activity set by the parents and prolonged periods of inactivity in places where the child usually spends time. Instead of simply combining this information from multiple sensors using logical AND and OR, the AI learns these complex patterns and performs dynamic weighting and contextual judgment, which reduces false positives. In addition, an anomaly detection model is used to detect statistical deviations from a child's normal behavioral patterns (e.g., movements during sleep, normal walking patterns, time spent in specific locations) as anomalies.
The system also incorporates a mechanism for continuously improving judgment accuracy by having the AI learn from false positive cases that occur during operation and from feedback from caregivers. This complex and learning-based approach makes it possible to respond to subtle signs of emergencies and static dangers such as kidnapping, which are often overlooked by conventional single-sensor or simple threshold judgments.
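The multi-sensor judgment in paragraph [0030] can be illustrated with a minimal sketch. The disclosure describes a learned model with dynamic weighting; the code below substitutes fixed weights purely for illustration. The names `SensorSnapshot`, `emergency_score`, and `is_emergency`, the weights, the 1 g to 4 g impact scale, and the 0.6 threshold are all assumptions and do not come from the disclosure.

```python
from dataclasses import dataclass

# Keywords taken from the examples in paragraph [0030]; the list is illustrative.
EMERGENCY_KEYWORDS = ("help", "it hurts", "dangerous")


@dataclass
class SensorSnapshot:
    transcript: str          # text transcribed from the microphone
    peak_accel_g: float      # peak acceleration magnitude, in g
    outside_safe_area: bool  # GPS position outside the parent-defined range
    sos_pressed: bool        # SOS button state


def emergency_score(s: SensorSnapshot) -> float:
    """Return a score in [0, 1]. Weights are assumed, not learned."""
    if s.sos_pressed:
        # An explicit expression of intent is treated as an emergency.
        return 1.0
    text = s.transcript.lower()
    keyword = 1.0 if any(k in text for k in EMERGENCY_KEYWORDS) else 0.0
    # Map 1 g (at rest) .. 4 g (hard impact) linearly onto 0 .. 1 (assumed scale).
    impact = min(max((s.peak_accel_g - 1.0) / 3.0, 0.0), 1.0)
    geo = 1.0 if s.outside_safe_area else 0.0
    return 0.45 * keyword + 0.35 * impact + 0.20 * geo


def is_emergency(s: SensorSnapshot, threshold: float = 0.6) -> bool:
    """Combine the signals instead of relying on any single sensor."""
    return emergency_score(s) >= threshold
```

With these assumed weights, a keyword alone (0.45) does not cross the threshold, but a keyword combined with a strong impact does, which mirrors the idea of reducing false positives by weighing several signals together.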
[0031] The camera control unit 102 automatically activates the camera unit 14 (Figure 2) and captures images of the surrounding environment when the emergency detection unit 101 detects an emergency. This function is essential for protecting privacy and preventing misuse of the camera in the monitoring device. Specifically, the camera function is intentionally hidden both physically and logically from the user interface (UI / UX) of the information processing device 10, and the camera app icon and activation button typically found on smartphones are not provided in a way that children can operate. This prevents rapid battery drain and a wasteful increase in communication volume and storage consumption caused by children using the camera as a toy out of curiosity, thereby ensuring the reliability and sustainability of the entire monitoring function. The camera is activated only autonomously by the system when an emergency is detected. Here, "automatic activation" of the camera broadly includes any mode in which the control unit autonomously activates the camera based on the system's detection of an emergency (including an explicit expression of intent by the monitored person), rather than the monitored person directly pressing an operation button to activate the camera function. For example, if a specific voice command or gesture is detected, the control unit treats this expression as an emergency trigger and automatically activates the camera without direct operation by the monitored person. Furthermore, the camera control unit 102 strictly controls the operation of the camera function based on the restrictions on camera use that the parent has set, which it obtains from the setting management unit 106.
These restrictions include camera function usage modes that the parent can set through the app (e.g., emergency-only mode, parent-request-only mode), the maximum number of shots per day, or the maximum shooting time per shot. Also, based on battery level information from the power supply unit 19 (Figure 2), if the battery level falls below a preset threshold (e.g., 20%), the camera function will be temporarily and automatically stopped in order to prioritize maintaining the device's core functions (location information notification and emergency calls). In addition, if the AI detects excessive camera operation or abnormal operating patterns, such as repeatedly activating and stopping the camera in a short period of time, the camera function will be temporarily and automatically stopped in order to maintain the health of the device and prevent malfunctions.
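The restrictions in paragraph [0031] amount to a series of gate checks before the camera is activated. The following is a minimal sketch under stated assumptions: `CameraPolicy`, `may_activate_camera`, the mode strings, and the default limit of 10 shots per day are hypothetical; only the 20% battery threshold is taken from the example in the text.

```python
from dataclasses import dataclass


@dataclass
class CameraPolicy:
    mode: str = "emergency_only"   # or "parent_request_only" (assumed identifiers)
    max_shots_per_day: int = 10    # assumed default; set by the parent in practice
    min_battery_percent: int = 20  # threshold example given in the text


# Which trigger each usage mode accepts (assumed mapping).
ALLOWED_TRIGGERS = {
    "emergency_only": {"emergency"},
    "parent_request_only": {"parent_request"},
}


def may_activate_camera(policy, trigger, shots_today, battery_percent):
    """Return (allowed, reason) for one activation attempt."""
    if trigger not in ALLOWED_TRIGGERS.get(policy.mode, set()):
        return False, "trigger not allowed in this mode"
    if battery_percent < policy.min_battery_percent:
        # Core functions (location notification, emergency calls) take priority.
        return False, "battery below threshold"
    if shots_today >= policy.max_shots_per_day:
        return False, "daily shot limit reached"
    return True, "ok"
```

Returning a reason string alongside the decision is a design choice of this sketch; it would let the device log why an activation was suppressed.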
[0032] The image processing unit 103 receives raw image data captured under the control of the camera control unit 102 as input and executes the following multi-layered anonymization and de-identification processes in real time by running an AI model (edge AI) specifically for image processing within the information processing device 10. These processes are completed before any image data is transmitted outside the device.
• Anonymization of Faces: If the captured image contains the faces of people other than the person being monitored (e.g., passersby, shop staff, friends, or other third parties), the AI automatically identifies these faces and applies blurring, iconization, or mosaic processing to make them unidentifiable. This process is essential to ensure the protection of third-party privacy.
• Personal Information Masking: The AI detects personally identifiable information and privacy-related information in captured images, such as car license plates, home or store nameplates, specific signs, barcodes, QR codes (registered trademark), product names, company logos, and specific identification codes, and masks those parts with blackout, mosaic, or other de-identification patterns.
• Background Abstraction: As an optional feature, the background of an image, excluding the person or main subject, can be abstracted into a coarse, pixel-art-like form, converted to a solid color background, or filled with a specific pattern. This makes it difficult to identify the location where the photo was taken and further enhances the level of privacy protection.
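The mosaic processing mentioned above can be sketched in a few lines. This example assumes a grayscale image stored as a list of rows of integers and assumes that face bounding boxes are already supplied by some detector, which is not shown. The function names `pixelate_region` and `anonymize_faces` and the block size are illustrative, not part of the disclosure.

```python
def pixelate_region(image, box, block=4):
    """Return a copy of `image` with the region `box` replaced by a mosaic.

    `image` is a list of rows of ints; `box` is (x0, y0, x1, y1), half-open.
    """
    x0, y0, x1, y1 = box
    out = [row[:] for row in image]  # leave the input untouched
    for by in range(y0, y1, block):
        for bx in range(x0, x1, block):
            ys = range(by, min(by + block, y1))
            xs = range(bx, min(bx + block, x1))
            pixels = [image[y][x] for y in ys for x in xs]
            mean = sum(pixels) // len(pixels)
            # Every pixel in the block gets the block's mean value.
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def anonymize_faces(image, face_boxes, block=4):
    """Apply the mosaic to every detected face box (boxes are assumed inputs)."""
    for box in face_boxes:
        image = pixelate_region(image, box, block)
    return image
```

A larger `block` value removes more detail; choosing it relative to the face size is one way to keep faces unidentifiable at any distance.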
[0033] Furthermore, the image processing unit 103 may perform a process to convert the entire captured image into a non-realistic style (for example, a cartoon-like illustration, a watercolor-like painting, or a blocky pixel art style) in order to further improve the level of privacy protection. This conversion abstracts the details of the entire image, including the faces of people, and significantly reduces its realism compared to the original image, making it extremely difficult to visually identify specific individuals. In parallel with these image processing operations, the image processing unit 103 extracts non-verbal information as metadata from the anonymized image data, such as facial expression information of the person being monitored (e.g., smiling, anxious, crying, expressionless), posture information (e.g., standing, sitting, lying down, crouching, walking), and type information of the surrounding environment (e.g., indoors, outdoors, park, roadside, commercial facility), and supplies it to the communication unit 104.
[0034] The communication unit 104 transmits only metadata (analysis results) or abstracted images, which have been anonymized and deidentified by the image processing unit 103 and whose size has been drastically reduced, to an external location (mainly the parent terminal 20, or the server 30 used by the parent terminal 20). Here, "transmission to an external location" includes not only transmission to the server 30 on the cloud via a network 40 such as the internet, but also direct data transfer to the parent terminal 20 via short-range wireless communication (e.g., NFC, Bluetooth, Wi-Fi Direct, etc.), which is more secure. Most importantly, the original raw image data captured is immediately discarded within the device without being transmitted externally by the communication unit 104. This fundamentally eliminates the risk of raw images remaining on the cloud and provides the strongest privacy protection against external data leakage and misuse. Furthermore, by transmitting only low-capacity data, it prevents unnecessary increases in communication volume and storage capacity, reduces device operating costs, and contributes to battery life.
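The "transmit only the anonymized result, then discard the raw image" ordering in paragraph [0034] can be sketched as follows. The function `process_capture` and its callback parameters are hypothetical. Note also that Python cannot guarantee that no copy of the data remains in memory, so this only illustrates the control flow, not a secure-erase mechanism.

```python
def process_capture(raw, anonymize, send):
    """Anonymize `raw`, send only the result, and always wipe the raw buffer.

    `raw` is a mutable bytearray holding the captured frame; `anonymize` and
    `send` are caller-supplied callables (assumed interfaces).
    """
    try:
        payload = anonymize(raw)  # runs on-device, before any transmission
        send(payload)             # only anonymized data leaves the device
        return True
    finally:
        # Overwrite and empty the buffer even if anonymization or sending fails,
        # so the raw frame is never kept for a later retry.
        raw[:] = bytes(len(raw))
        raw.clear()
```

Placing the wipe in `finally` reflects the requirement that raw data is discarded regardless of whether transmission succeeds.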
[0035] The notification unit 105, based on instructions from the camera control unit 102, visually and / or audibly notifies the child being monitored that the camera is operating. Specifically, visual notification is provided by lighting or flashing an LED indicator built into the display unit 18 (Figure 2) of the information processing device 10, or by displaying a message such as "Camera in operation" or an icon on a small display. At the same time, auditory notification is provided by emitting a specific melody, a short voice message (e.g., "Camera activated"), or a buzzer sound through an audio output unit (speaker or buzzer, not shown). This ensures that the child is not caught unaware when the camera activates, gives them a sense of security that their privacy is being respected, and gives them an opportunity to adjust their behavior appropriately in light of the camera's presence.
[0036] The setting management unit 106 manages various setting information related to the camera function, received via the communication unit 104 from the dedicated application on the parent terminal 20 (Figure 1). This setting information is securely stored in the storage 17 (Figure 2) of the information processing device 10. The setting management unit 106 provides the camera control unit 102 and the emergency detection unit 101 with information such as the camera function usage mode set by the parent (emergency-only mode, parent-request-only mode, etc.), the maximum number of shots per day, the maximum shooting time per shot, and privacy zone settings that restrict shooting in specific locations. It also has a function to update this setting information in response to a request from the parent. With the privacy zone setting, the parent can specify a specific area on a map, such as their home or school, where they want camera shooting to be automatically stopped. When the information processing device 10 detects, based on GPS information, that it has entered that zone, the camera function automatically stops, even in an emergency.
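A privacy zone check of this kind can be sketched as a simple geofence test. The disclosure only says the parent specifies "a specific area on a map"; modeling each zone as a circle with a center and radius, and the names `haversine_m` and `in_privacy_zone`, are assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_privacy_zone(lat, lon, zones):
    """True if the position lies inside any circular zone (assumed zone shape)."""
    return any(
        haversine_m(lat, lon, z["latitude"], z["longitude"]) <= z["radius_m"]
        for z in zones
    )
```

The camera control logic would consult `in_privacy_zone` before activating the camera and suppress capture when it returns `True`.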
[0037] The behavioral history learning unit 107 continuously learns the past behavioral patterns of the child being monitored (e.g., usual route to school, places frequently visited, average time spent in specific locations, range of activity by time of day, etc.) through AI learning, based on location information (GPS data) obtained from the sensor information acquisition unit 130 and, as needed, other sensor information such as acceleration information and audio information, and generates and updates a profile. This learning result is provided to the emergency detection unit 101 and contributes to improving the accuracy of emergency detection. For example, it becomes possible to detect finer-grained, contextual behavioral abnormalities such as "deviation to places not usually visited" or "abnormally long periods of stillness," for instance when a child stays for a long time in a place that deviates significantly from the learned behavioral pattern (e.g., a place the child never usually goes to, or a dangerous place). Such abnormalities were difficult to judge with conventional detection based on simple deviation from an activity range. This increases the possibility of early detection of dangerous situations, such as kidnapping, that are difficult to detect with voice or acceleration sensors alone because the monitored person is still or moving silently.
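As a rough illustration of detecting "abnormally long periods of stillness," the sketch below flags a dwell time that deviates strongly from past dwell times at the same place. The disclosure describes AI-based learning of behavioral profiles; a plain z-score test is swapped in here as a simple stand-in. The function name `dwell_is_anomalous` and the threshold of 3.0 are assumptions.

```python
from statistics import mean, stdev


def dwell_is_anomalous(history_minutes, observed_minutes, z_threshold=3.0):
    """Flag `observed_minutes` if it is far from the historical dwell times.

    `history_minutes` holds past dwell times (in minutes) at one location.
    """
    if len(history_minutes) < 2:
        return False  # too little history to estimate a spread
    mu = mean(history_minutes)
    sigma = stdev(history_minutes)
    if sigma == 0:
        # All past visits had the same duration; any difference is a deviation.
        return observed_minutes != mu
    return abs(observed_minutes - mu) / sigma > z_threshold
```

A learned model could replace the z-score while keeping the same interface, so the emergency detection logic would not need to change.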
4. Description of the data structure
[0038] As shown in Figure 6, the anonymized, low-capacity metadata transmitted externally by the communication unit 104 can include at least the following information. This metadata is designed to be visually analyzed and displayed by a dedicated application on the parent terminal 20, enabling the parent to understand the situation of the monitored person in more detail and from multiple perspectives while respecting privacy.
[0039] Timestamp: This is information indicating the exact date and time the camera took the image. It is recorded in ISO 8601 format (e.g., 2025-06-05T10:30:00Z), among others.
[0040] child_id (child ID): A unique identifier used to uniquely identify a specific child being monitored. This allows for accurate situational awareness even when a parent is monitoring multiple children or when devices are switched. Example: USER12345.
[0041] location_data (location information data): Geographic location information of the monitored subject at the time the image was taken. Includes numerical pairs of latitude and longitude. Accuracy information and information from location estimation methods other than GPS (Wi-Fi positioning, base station positioning, etc.) may also be added. Example: { "latitude": 35.658034, "longitude": 139.701636}.
[0042] emergency_level (Emergency Level): A numerical value or category indicating the degree of urgency of the current situation, as determined by the emergency detection unit 101. For example, it may be expressed as a tiered category such as "Low," "Medium," "High," or "Critical," or as a numerical score from 0 to 100 (the higher the number, the greater the urgency). This is an important indicator for monitors to quickly assess the seriousness of the situation.
[0043] face_expression (facial expression information): Information about the facial expressions of the child being monitored, analyzed by the AI within the image processing unit 103. For example, it is transmitted as an emotional category such as "HAPPY," "CONFUSED," "ANXIOUS," "CRYING," "ANGRY," or "NEUTRAL," or as a probability score for each emotion (e.g., { "HAPPY": 0.85, "NEUTRAL": 0.10, "ANXIOUS": 0.05}). This allows parents to understand the child's mental state and emotional nuances.
[0044] Posture information: Information about the posture of the child being monitored, analyzed by the AI within the image processing unit 103. For example, it is transmitted in categories such as "standing," "sitting," "fallen," "crouching," "walking," "running," and "jumping." When combined with fall detection, it becomes possible to understand the situation in more detail, and for example, information such as "fallen" can prompt a quick response.
[0045] environment_type (Surrounding Environment Type): Type information about the surrounding environment captured by the camera. The AI of the image processing unit 103 makes a determination based on image analysis. For example, it is transmitted in categories such as "Indoor (INDOOR)", "Outdoor (OUTDOOR)", "Park (PARK)", "Roadside (ROADSIDE)", "Commercial Facility (COMMERCIAL_AREA)", "School (SCHOOL)", "Home (HOME)", "Station (STATION)", and "Hospital (HOSPITAL)". This makes it difficult to pinpoint the exact location, while still allowing parents to have an abstract understanding of the environment their child is in.
[0046] audio_event: Information about a specific audio event detected by an audio sensor or sound wave analysis from an image. This may include categories such as "SCREAM," "TRAFFIC_NOISE," "SILENCE," "SPEECH," "SIREN," "ANIMAL_SOUND," or "BREAKING_SOUND," or it may also include detection results of specific emergency keywords (e.g., "Help," "It hurts," "Stop") or the results of speech sentiment analysis.
[0047] masked_objects: Information indicating the type of object that has been masked and the anonymized region within that image. This is metadata that allows the parent device to understand which parts of the anonymized image were masked and for what reason when the anonymized image is displayed on the parent device. For example, it is in the form of an array such as [{"type": "license_plate", "area": "masked_region_coords_A"}, {"type": "signboard", "area": "masked_region_coords_B"}] and includes the type of anonymized object and the coordinate information of the anonymized region within the original image. This allows the parent to recognize that part of the image has been anonymized and to grasp the necessary information.
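The fields listed in paragraphs [0039] to [0047] could be assembled into one compact record as sketched below. The function `build_metadata` is hypothetical, the key names `timestamp` and `posture` are assumed (the text does not give identifiers for those two fields), and the `area` values are kept as the placeholders used in the text.

```python
import json


def build_metadata(timestamp, child_id, latitude, longitude, emergency_level,
                   face_expression, posture, environment_type, audio_event,
                   masked_objects):
    """Assemble the anonymized metadata record as a plain dictionary."""
    return {
        "timestamp": timestamp,            # ISO 8601; key name assumed
        "child_id": child_id,
        "location_data": {"latitude": latitude, "longitude": longitude},
        "emergency_level": emergency_level,
        "face_expression": face_expression,
        "posture": posture,                # key name assumed
        "environment_type": environment_type,
        "audio_event": audio_event,
        "masked_objects": masked_objects,
    }


example = build_metadata(
    timestamp="2025-06-05T10:30:00Z",
    child_id="USER12345",
    latitude=35.658034,
    longitude=139.701636,
    emergency_level="High",
    face_expression="ANXIOUS",
    posture="fallen",
    environment_type="PARK",
    audio_event="SCREAM",
    masked_objects=[
        {"type": "license_plate", "area": "masked_region_coords_A"},
        {"type": "signboard", "area": "masked_region_coords_B"},
    ],
)

# Compact JSON encoding keeps the transmitted payload small.
payload = json.dumps(example, separators=(",", ":")).encode("utf-8")
```

Serialized this way, the whole record is a few hundred bytes, far smaller than a raw image.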
5. Explanation of the processing flow
[0048] As shown in Figure 4, the main processes from emergency detection to camera activation in the information processing device 10 are performed by the following steps S100 to S105. This flow aims to suppress unnecessary camera activation and maximize privacy protection and battery efficiency.
[0049] First, in step S100, the sensor information acquisition unit 130 continuously and in real time acquires a wide range of sensor data from the sensor unit 13 mounted on the information processing device 10, including acceleration information (e.g., device shaking, impact), audio information (e.g., ambient sounds, speech), GPS information (current location, movement speed), and SOS button operation information (e.g., physical press, long press). This acquired raw data is stored in the device's internal high-speed buffer memory or temporary storage as input for emergency detection.
[0050] Next, in step S101, the emergency detection unit 101 comprehensively detects and determines whether the current situation is an "emergency" based on the latest sensor information acquired in step S100 and the AI learning results, which include learning results based on the past behavioral history of the monitored subject provided by the behavioral history learning unit 107 (Figure 3). These AI learning results are used to determine the time-series changes in sensor data, the complex intercorrelations between sensors, and statistical deviations from the monitored subject's normal behavioral patterns, which are difficult to judge from single sensor information alone. For example, if emergency keywords such as "help," "it hurts," and "it's dangerous" are detected in the audio using AI transcription technology, and at the same time, a strong impact such as a child falling or sudden stop is detected from the acceleration sensor data, it is evaluated as a high level of urgency. In addition, GPS information is used to detect deviations from a safe range of activity set in advance by the parent, and unnatural prolonged stillness in places where the child usually acts. Furthermore, the system recognizes an emergency if the monitored person explicitly presses the SOS button, utters a specific voice command (e.g., "Activate the camera"), or performs a specific gesture (e.g., shakes the device three times). Based on this combined information, the AI dramatically reduces false positives and accurately identifies only genuine emergencies where the camera should be activated.
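The combination of signals in step S101 can be sketched as a simple weighted score on the 0 to 100 scale mentioned for emergency_level. This rule-based scoring is only a stand-in for the AI judgment; the names `SensorSnapshot`, `urgency_score`, and `urgency_level`, and every weight and threshold, are assumptions made for illustration.

```python
from dataclasses import dataclass

# Keywords taken from the examples in the text.
EMERGENCY_KEYWORDS = ("help", "it hurts", "it's dangerous")


@dataclass
class SensorSnapshot:
    """Hypothetical summary of the latest sensor readings."""
    transcript: str = ""
    impact_g: float = 0.0
    outside_safe_area: bool = False
    abnormal_stillness: bool = False
    sos_pressed: bool = False


def urgency_score(snapshot, impact_threshold_g=3.0):
    """Return an urgency score from 0 to 100 (weights are assumed)."""
    if snapshot.sos_pressed:
        return 100  # explicit request from the monitored person
    text = snapshot.transcript.lower()
    keyword = any(k in text for k in EMERGENCY_KEYWORDS)
    impact = snapshot.impact_g >= impact_threshold_g
    score = 0
    score += 30 if keyword else 0
    score += 25 if impact else 0
    score += 20 if snapshot.outside_safe_area else 0
    score += 15 if snapshot.abnormal_stillness else 0
    if keyword and impact:
        score += 10  # simultaneous keyword and impact raises urgency further
    return min(score, 100)


def urgency_level(score):
    """Map the numeric score to the tiered categories named in the text."""
    if score >= 80:
        return "Critical"
    if score >= 60:
        return "High"
    if score >= 30:
        return "Medium"
    return "Low"
```

With these assumed weights, an emergency keyword alone yields "Medium", while a keyword together with a strong impact yields "High", reflecting the example in the paragraph above.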
[0051] In step S102, the emergency detection unit 101 determines whether or not an emergency has been detected based on the detection results in step S101. If no emergency is detected (S102: No), the process returns to step S100, and the continuous acquisition of sensor information and emergency detection processing are repeated. This thoroughly suppresses unnecessary camera activation.
[0052] On the other hand, if an emergency is detected (S102: Yes), in step S103, the camera control unit 102 comprehensively checks various settings related to restrictions on the use of the camera function and the self-protection status of the device itself. Specifically, it checks the camera function's "usage mode" set by the parent (e.g., "emergency-only mode" which activates only when the AI determines it is an emergency, or "parent request-only mode" which also accepts explicit requests from the parent), obtained from the settings management unit 106; whether a usage limit such as the maximum number of shots per day has been reached; whether the battery level, based on the battery level information from the power supply unit 19, has fallen below a preset threshold (e.g., 20%); and whether the AI has detected any excessive camera operation patterns, such as repeatedly starting and stopping the camera in a short period of time. It also checks whether the information processing device 10 is located within the privacy zone set by the parent, and if it is within the privacy zone, it restricts camera activation as a general rule.
[0053] In step S104, the camera control unit 102 determines whether or not to allow the camera to activate based on the usage restrictions and self-protection status checks confirmed in step S103. For example, if the battery level is low, if camera activation is not permitted in the parent settings, or if the device is determined to be within the privacy zone, the camera will not be allowed to activate, regardless of whether an emergency has been detected. If activation is not permitted (S104: No), the camera will not activate, and the process returns to step S100 to continue acquiring sensor information. This ensures that privacy and device health are given top priority. When returning to S100, it is also possible to emit a sound of 50 dB or higher so that others can hear it to signal an emergency.
[0054] If activation is permitted (S104: Yes), in step S105, the camera control unit 102 automatically activates the camera unit 14 and captures images of the surrounding environment. At the same time, the notification unit 105 visually (display unit 18) or audibly (audio output unit) notifies the child that the camera is operating. For example, the device's LED flashes and a short beep sounds. This ensures that the child is aware that the camera has been activated, increasing their awareness of privacy protection and reducing their anxiety about being photographed unexpectedly. Depending on the settings, S104 may be skipped, and the camera may always be activated in case of an emergency.
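The decisions in steps S102 to S105 can be summarized in a single gating function, sketched below. The function name `camera_activation_decision`, its parameters, and the reason strings are hypothetical; only the 20% battery threshold, the daily shot limit, the parent permission, the privacy zone check, and the option to skip S104 come from the text.

```python
def camera_activation_decision(emergency, *, permitted_by_parent, shots_today,
                               max_shots_per_day, battery_pct, in_privacy_zone,
                               battery_threshold_pct=20, skip_checks=False):
    """Return (activate, reason) following steps S102 to S105."""
    if not emergency:
        return (False, "no_emergency")        # S102: No, return to S100
    if skip_checks:
        return (True, "checks_skipped")       # optional setting: S104 skipped
    if in_privacy_zone:
        return (False, "privacy_zone")        # S104: No
    if not permitted_by_parent:
        return (False, "not_permitted")       # S104: No
    if shots_today >= max_shots_per_day:
        return (False, "daily_limit_reached") # S104: No
    if battery_pct < battery_threshold_pct:
        return (False, "battery_low")         # S104: No
    return (True, "ok")                       # S104: Yes, proceed to S105
```

Returning a reason alongside the decision makes it straightforward to notify the parent or to record why activation was refused.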
[0055] As shown in Figure 5, the process from anonymizing the captured image to transmitting it externally and then discarding the original raw image is carried out extremely quickly and securely through the following steps S200 to S203. This flow aims to maximize privacy protection and communication efficiency.
[0056] First, in step S200, the raw image data that has just been captured by the camera unit 14 (Figure 2) is input to the image processing unit 103. This raw image data is temporarily held in the memory 12 of the information processing device 10, but it is assumed that it will be immediately and completely discarded in subsequent processing, and it will not be stored long-term in the device's storage 17 or transmitted externally.
[0057] Next, in step S201, the image processing unit 103 uses the device's built-in edge AI (AI model) to analyze the raw image input in step S200 in real time and performs multiple anonymization and de-identification processes. Specifically, the AI recognizes people in the image and automatically detects the faces of people other than the person being monitored (e.g., passersby, shop staff, etc.), and anonymizes them by blurring, iconizing, or mosaic processing to make it impossible to identify individuals. The AI also detects personally identifiable information and privacy-related information such as license plates, nameplates, specific signs, shop names, barcodes, and QR codes (registered trademark), and masks those parts with a black fill or mosaic. If necessary, "background abstraction" processing is also performed, such as abstracting the entire background like pixel art, converting it to a single-color background, or filling it with a specific pattern, making it difficult to identify the shooting location and further enhancing the level of privacy protection. Furthermore, the image processing unit 103 may perform a process to convert the entire captured image into a non-realistic style that eliminates realistic representation (for example, an illustration style like Japanese manga, a transparent watercolor style, or a retro blocky pixel art style) in order to improve the level of privacy protection. This conversion abstracts the details of the entire image, including the faces of people, and significantly reduces the realism compared to the original image, making it extremely difficult to visually identify specific individuals.
In parallel with these anonymization and de-identification processes, the image processing unit 103 extracts non-verbal information as metadata from the anonymized image data, such as facial expression information (smiling, anxious, etc.), posture information (standing, sitting, lying down, etc.), and type information of the surrounding environment (indoors, outdoors, park, roadside, etc.), and supplies it to the communication unit 104.
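The masking part of step S201 can be sketched without any image library by treating the image as rows of grayscale pixel values. The detection step itself (the edge AI) is not modeled: the `detections` argument is assumed to be the output of some detector. The function `mask_regions` and the `[x, y, width, height]` coordinate format for `area` are assumptions, since the text does not define the coordinate representation.

```python
def mask_regions(image, detections, fill=0):
    """Black out detected regions and describe them as masked_objects metadata.

    image:      list of rows, each a list of grayscale pixel values
    detections: list of (object_type, x, y, width, height) tuples, assumed to
                come from a separate detector that is not shown here
    """
    height = len(image)
    width = len(image[0]) if height else 0
    masked = [row[:] for row in image]   # work on a copy of the pixel rows
    masked_objects = []
    for obj_type, x, y, w, h in detections:
        # Clip the rectangle to the image bounds.
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(width, x + w), min(height, y + h)
        for r in range(y0, y1):
            for c in range(x0, x1):
                masked[r][c] = fill
        masked_objects.append({
            "type": obj_type,
            "area": [x0, y0, max(0, x1 - x0), max(0, y1 - y0)],
        })
    return masked, masked_objects
```

The second return value has the same shape as the `masked_objects` array in the metadata, so it can be attached to the record that is transmitted.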
[0058] In step S202, the communication unit 104 transmits only the metadata (analysis results) or abstracted image, which has been anonymized and deidentified by the image processing unit 103 in step S201 and whose size has been drastically reduced, to the outside. This "outside" often refers to the parent terminal 20 via the application server 30 used by the parent terminal 20, but transmission to the outside is not limited to transmission via a server over a network such as the Internet, but also includes direct transfer to other devices such as the parent terminal 20 via short-range wireless communication such as Bluetooth or NFC (Near Field Communication). This direct transfer function ensures a more closed and secure data distribution route that does not go through the cloud.
[0059] In step S203, the communication unit 104 completely and immediately destroys the original raw image data captured in step S200, erasing it from the memory 12 in the device immediately after the communication unit 104 has finished transmitting the anonymized data to the outside, or immediately after the anonymization process is completed. This fundamentally eliminates the risk of the raw image data being stored in the device for a long period of time or being unintentionally leaked outside the device, thereby thoroughly protecting privacy. Here, "immediate destruction" means that, after its primary purpose (in this embodiment, anonymization and extraction of non-verbal information) has been achieved, the original raw image data is erased in a way that makes it difficult to recover, within a period short enough that access by other processes or unintended use is virtually impossible. Specifically, this includes a mode in which the raw image data is completely erased from the storage device immediately after the anonymization process is completed (for example, within a predetermined number of seconds, preferably within 1 second), or a mode in which the control unit actively performs the erasure process, rather than passive loss such as simply being overwritten with subsequent data. This clearly distinguishes it from a configuration in which the data is intentionally destroyed with a delay of several minutes or temporarily saved for analysis purposes. In this way, by ensuring that raw image data never leaves the device and only exists temporarily within the device, the reliability of the monitoring device in terms of privacy protection is established.
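The "active erasure" mode described above can be sketched as follows: the raw buffer is overwritten in place as soon as processing ends, whether or not processing succeeded. The function `process_and_destroy` and its parameters are hypothetical. Note that in Python this only clears the given buffer; it cannot guarantee that no other copies exist in memory, so the sketch shows the control flow rather than a verified secure-erase mechanism.

```python
import time


def process_and_destroy(raw, anonymize, deadline_s=1.0):
    """Run anonymize on the raw buffer, then actively zero the buffer.

    raw:       bytearray holding the raw image (mutable, so it can be wiped)
    anonymize: callable that receives a memoryview and must not keep it
    Returns (result, erased_within_deadline).
    """
    try:
        # A memoryview gives access to the pixels without copying them.
        with memoryview(raw) as view:
            result = anonymize(view)
    finally:
        finished_at = time.monotonic()
        raw[:] = bytes(len(raw))   # same-length overwrite, done in place
        elapsed = time.monotonic() - finished_at
    return result, elapsed <= deadline_s
```

Placing the overwrite in `finally` means the buffer is cleared even if anonymization raises an exception, which matches the intent that raw data is never left behind.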
6. Definitions of Terms
[0060] Information Processing Device 10 (Monitoring Device): The "information processing device" refers to a small electronic device intended to be carried daily by the child being monitored. This device integrates location information (GPS), communication functions, multiple environmental sensors, and camera functions, and its main purpose is to inform the caregiver of the child's situation while ensuring both safety and privacy. In terms of specific forms, it can take various forms, such as a wearable device worn by the child (e.g., smartwatch type, wristband type), a pendant type worn around the neck, a keychain type attached to a school bag or backpack, or a stationary device installed in the child's room. In this specification and in the claims, the term "sensor" does not refer only to physical, dedicated hardware sensors such as accelerometers or microphones unless clearly limited by the context. Rather, it should be interpreted to broadly mean functional components (modules, means, or units) for acquiring a particular type of information. For example, a functional configuration in which the image processing unit 103 analyzes image data captured by the camera unit 14 to estimate "acceleration information" from changes in the posture of the person being monitored, or to extract "audio information" from sound wave patterns and mouth movements in the image, is clearly included in the definition of "sensor" in this disclosure.
[0061] Emergency Situation: An "emergency situation" refers to a situation in which the life, physical, or mental safety of the child being monitored may be endangered. This situation is determined not by a single sensor or simple logical judgment, but by a combination of multiple sensor information such as acceleration information, audio information, location information, and SOS button operation information, as well as AI learning results (including behavioral history learning). Specifically, this includes detection of emergency keywords in audio using AI transcription technology (e.g., "Help," "It hurts," "Stop," "Fire!"), detection of falls or sudden stops, or violent exercise or impact using an acceleration sensor, deviation from a safe range of activity set by the parent using GPS, a combination of unnatural prolonged stillness in a place where the child usually operates, or a clear expression of intent such as the monitored child directly operating the SOS button. In addition, explicit expressions of intent from the monitored child (e.g., specific voice commands such as "Take a picture," "Help!", specific gestures such as shaking the device in a specific pattern, or operation of the SOS button) may also be included as a type of emergency situation based on the monitored child's intent. Furthermore, the term "emergency situation" in this disclosure is not necessarily limited to cases in which the AI makes a combined judgment based on multiple pieces of information. For example, individual dangerous events that can be directly judged from a single or relatively small amount of information, such as a "fall" detected based solely on acceleration information exceeding a predetermined threshold, or a "scream" detected based solely on audio information as matching a specific pattern, are clearly included as a form of "emergency" that can trigger camera activation.
[0062] Anonymization: "Anonymization" refers to any process that removes, transforms, or conceals information that could identify a specific individual from image data. This is an essential process for protecting privacy. In this embodiment, specific anonymization methods include the AI automatically identifying the faces of people other than children (third parties) and applying blurring, iconization, or mosaic processing, as well as the AI detecting personally identifiable information or privacy-related information such as license plates, nameplates, specific signs, and store names, and masking those parts with black or mosaic effects. Additionally, as an option, the entire background can be abstracted like pixel art or converted to a monochrome background to make it difficult to identify the shooting location and further enhance the level of privacy protection. Furthermore, a type of anonymization is also included where the entire image is converted to a non-realistic style (for example, an illustration style like Japanese manga, a painting style like a transparent watercolor, or a retro blocky pixel art style), significantly reducing the realism of the original image information, including people's faces, decreasing visual identifiability, and making it extremely difficult to visually identify a specific individual.
[0063] Edge AI: Edge AI refers to a technology that performs AI model inference processing directly on end devices (edge devices) such as the information processing device 10, rather than on cloud servers or remote data centers. This enables real-time anonymization and analysis of raw image data without transmitting it outside the device, minimizing communication delays, reducing data volume, enhancing data security, and protecting privacy (especially preventing raw data from leaking externally). To achieve high-performance AI processing under the constraints of limited computing resources and battery life, model optimization (e.g., quantization, pruning) and hardware acceleration (e.g., NPU) are essential.
[0064] Nonverbal Information: "Nonverbal information" refers to information that is not in the form of words (speech or text), and includes various situational information extracted from visual and auditory information to understand the situation of the person being monitored. Specifically, this includes the child's facial expressions (smiling, anxious, crying, etc.), posture (standing, sitting, lying down, etc.), the surrounding environment (indoors, outdoors, park, roadside, etc.), ambient noise levels, and the occurrence of specific sounds (shouting, sirens, etc.). This information is abstracted and structured as metadata and provided to parents without directly transmitting raw images, dramatically improving the "resolution" of monitoring while respecting privacy.
[0065] In this specification, "destruction" is not limited to physically or logically erasing data. For example, encrypting the original image data with a strong encryption algorithm, immediately destroying the decryption key used in the process, and not retaining it on the device itself or transmitting it externally, thereby making it virtually impossible for anyone to recover the original image, is also included in the concept of "destruction" as a process that effectively makes the data irreversible.
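The encrypt-and-discard-the-key form of "destruction" can be illustrated with a short sketch. Because the Python standard library has no block cipher, a one-time pad (XOR with a random key as long as the data) is swapped in here for the "strong encryption algorithm"; the function `crypto_shred` is hypothetical, and a real device would more likely use a vetted cipher and hardware key storage.

```python
import secrets


def crypto_shred(raw):
    """Encrypt raw (a bytearray) with a throwaway key, then wipe key and plaintext.

    Returns ciphertext that cannot be decrypted because the key is not kept.
    """
    # One-time pad: a fresh random key with the same length as the data.
    # Caveat: the temporary bytes object returned by token_bytes is not wiped.
    key = bytearray(secrets.token_bytes(len(raw)))
    ciphertext = bytes(b ^ k for b, k in zip(raw, key))
    key[:] = bytes(len(key))   # destroy the only copy of the key held here
    raw[:] = bytes(len(raw))   # wipe the plaintext buffer in place
    return ciphertext
```

The key is never returned, stored, or transmitted, so the ciphertext that remains is effectively irreversible, which is the property the paragraph above treats as equivalent to erasure.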
[0066] In this specification, a decision or action is made "on the basis" of information if that information is used as one of several factors that may influence the outcome in the process of that decision or action. Even if other information (e.g., GPS information, user settings, time information, etc.) is the primary determinant, or if multiple pieces of information are used in a weighted manner, the information is considered "on the basis" of the decision as long as it is referenced as input to the decision logic and has an objectively recognizable influence on the outcome.
7. Alternative configurations, variations, and applications
[0067] In this embodiment, the sensor unit 13 has been described as including an acceleration sensor, a microphone, a GPS module, and an SOS button. However, without departing from the spirit of this disclosure, it is not limited to this, and various types of sensors may be combined to further improve the accuracy of emergency detection. For example, it is conceivable to additionally install an ambient light sensor (to estimate changes in indoor and outdoor conditions or movement to unnatural locations from changes in ambient brightness), a temperature and humidity sensor (to detect abnormal body temperature or heatstroke risk), a heart rate sensor (to detect psychological stress or abnormal conditions), and a biometric information sensor (to detect deterioration of health status from breathing patterns and sweat volume). The information obtained from these additional sensors can be processed in parallel by multiple AI models, and it is possible to calculate the final level of urgency in a comprehensive manner, enabling more multifaceted monitoring of the safety of the person being monitored.
[0068] Furthermore, the specific methods of anonymization processing by edge AI are not limited to blurring or masking faces or abstracting backgrounds. Various image transformation technologies may be applied as long as the objective of achieving both privacy protection and information provision as described in this disclosure is achieved. For example, techniques such as transformation into a specific painting style, conversion to line drawings, or extraction of specific information (e.g., only the silhouette of a person) and complete deletion of other information can be considered. The accuracy and methods of anonymization processing may be dynamically adjusted according to the usage scenario of the information processing device 10 (e.g., indoors or outdoors), the settings of the monitor (e.g., high or low privacy protection level), or the resources of the device (e.g., battery level). For example, it is possible to control the system by switching to an anonymization method with a lower processing load when the battery is low.
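The battery-dependent switching mentioned at the end of paragraph [0068] might look like the sketch below. The function `choose_method`, the method names, and especially the relative cost ranking in `METHOD_COST` are assumptions; the disclosure does not state which method has the lowest processing load.

```python
# Assumed relative processing cost (1 = cheapest); not taken from the disclosure.
METHOD_COST = {
    "black_mask": 1,
    "mosaic": 2,
    "background_abstraction": 3,
    "style_conversion": 4,
}


def choose_method(battery_pct, preferred="style_conversion", low_battery_pct=20):
    """Use the preferred method normally, the cheapest one when battery is low."""
    if preferred not in METHOD_COST:
        raise ValueError(f"unknown anonymization method: {preferred}")
    if battery_pct < low_battery_pct:
        return min(METHOD_COST, key=METHOD_COST.get)
    return preferred
```

A fuller policy would also take the monitor's privacy level and the usage scenario into account, as the paragraph describes.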
[0069] The communication unit 16 can not only utilize mobile phone networks (LTE, 5G, etc.) but can also optimize communication efficiency by combining it with short-range wireless communication technologies such as Wi-Fi and Bluetooth. For example, when the information processing device 10 is in a pre-registered and reliable Wi-Fi environment such as a home or school, it may be permitted to transmit relatively large abstract images at a high frequency, while in other cellular network environments, it may be restricted to transmitting only metadata in order to reduce the amount of data transmitted. This makes it possible to provide optimal information according to the communication environment while keeping communication costs down.
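The network-dependent transmission policy in paragraph [0069] reduces to a small decision, sketched here. The function `choose_payload`, the network labels, and the idea of identifying a trusted Wi-Fi environment by its SSID are illustrative assumptions.

```python
def choose_payload(network, ssid=None, trusted_ssids=frozenset()):
    """Decide what may be sent over the current connection.

    network:       "wifi" or "cellular" (labels are assumed)
    ssid:          name of the connected Wi-Fi network, if any
    trusted_ssids: pre-registered networks such as home or school
    """
    if network == "wifi" and ssid in trusted_ssids:
        # Trusted Wi-Fi: larger abstract images may be sent as well.
        return "abstract_image_and_metadata"
    # Cellular or unknown Wi-Fi: keep the data volume low.
    return "metadata_only"
```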
[0070] Although this embodiment has been described primarily as a child monitoring device, the technical features of this disclosure are applicable to other fields as well. For example, in monitoring the elderly, particularly dementia patients, the AI learning model can be customized to detect behavioral abnormalities such as staying in the toilet for extended periods, staying in a specific room, or remaining still in an unnatural location for extended periods, in addition to fall detection. In pet monitoring, deviations from the pet's behavioral patterns (e.g., activity time, rest time) or signs of deteriorating health (e.g., abnormal posture, breathing sounds) can be detected. Furthermore, it can be applied to monitoring the safety of workers in specific work sites such as factories and construction sites. In this case, it can detect worker falls, abnormal postures, and intrusions into dangerous areas, and notify administrators in a way that respects privacy.
[0071] The information processing device disclosed herein may be configured as a wearable device worn by the person being monitored, or as a stationary device installed around the person being monitored. Wearable devices are suitable for real-time situational awareness because they are always worn, while stationary devices are suitable for monitoring a specific space. These configurations can also be used in combination.
[0072] Furthermore, the various functions described in this disclosure can be realized not only through hardware combinations, but also as computer programs, and can be understood as information processing methods that the computer (processor 11) executes. This program may be distributed from the server 30 to the information processing device 10 via a network such as the Internet, or it may be provided by being recorded on a computer-readable recording medium such as a USB memory stick, SD card, CD-ROM, SSD, or HDD.
[0073] (Variation example: Secure logging of the disposal process) In this modified example, when the communication unit 104 or related control unit performs the original image destruction process, it may be configured to record log data indicating the execution of the process (e.g., execution timestamp of the destruction process, hash value as an identifier for the destroyed image, and processing success status) in a secure area with tamper-proof functionality within the storage 17 of the information processing device 10. This log data cannot be accessed through normal user operation and can only be read, for example, in the device's diagnostic mode or after going through a proper authentication procedure. By making the existence and contents of this log verifiable retrospectively, a means is provided to objectively prove whether the device is properly executing the original image destruction process as designed, thereby reducing the difficulty of proving infringement.
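One way to make such a destruction log tamper-evident is a hash chain, sketched below with SHA-256 from the standard library. The class `DestructionLog` and its entry layout are hypothetical, and the hash chain is only one possible realization of the "tamper-proof functionality"; the secure storage area and access control are not modeled. The image hash must be computed before the raw data is erased.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


class DestructionLog:
    """Hypothetical append-only log in which each entry hashes the previous one."""

    FIELDS = ("timestamp", "image_hash", "success", "prev_hash")

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body):
        encoded = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def record(self, image_bytes, timestamp, success):
        """Append an entry identifying the destroyed image by its hash."""
        prev = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = {
            "timestamp": timestamp,
            "image_hash": hashlib.sha256(image_bytes).hexdigest(),
            "success": success,
            "prev_hash": prev,
        }
        self.entries.append({**body, "entry_hash": self._digest(body)})

    def verify(self):
        """Return True only if no entry was altered, removed, or reordered."""
        prev = GENESIS
        for entry in self.entries:
            body = {k: entry[k] for k in self.FIELDS}
            if entry["prev_hash"] != prev or self._digest(body) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True
```

Because each entry includes the hash of the one before it, changing or deleting an earlier record invalidates every later entry, which supports the retrospective verification described above.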
[0074] (Variation example: Log recording of decision logic) In this modified version, when the emergency detection unit 101 makes a decision to activate the camera, it may be configured to record log data indicating the main information that formed the basis of that decision (e.g., sensor data exceeding a threshold, detected keywords, identifier of the applied AI model, internal urgency score, etc.) in a secure area with tamper-proof functionality. By checking this log, it is possible to retrospectively verify what information the decision was "based on," thereby reducing the difficulty in proving infringement due to the black-box nature of AI decision-making.
[0075] This disclosure demonstrates that the processor 11 mounted on the information processing device 10 performs complex emergency detection using AI learning results, and further performs advanced image processing in real time using the device's built-in edge AI, resulting in significant technological improvements not found in conventional technologies and contributions to other related technological fields. This will be an important factor to be evaluated as a technological effect and improvement in computer functionality, particularly in determining patent eligibility in foreign jurisdictions such as the US and Europe (EP).
[0076] Reduced Processing Load: Compared to conventional cloud-based image processing systems, instead of sending all captured high-resolution raw images to a cloud server for processing, the anonymization process is completed within the information processing device 10. This significantly reduces the load on the computing resources of the cloud server and improves the overall scalability of the system. This directly contributes to reducing operating costs and ensuring stable operation in large-scale systems where numerous monitoring devices are running simultaneously.
[0077] Reduced communication frequency and volume: Since it's no longer necessary to frequently transmit high-resolution raw images over the network, the number and volume of communications can be dramatically reduced. This allows for more efficient use of bandwidth, offering significant economic benefits, especially for users with data-limited contract plans. Furthermore, it enables the stable provision of monitoring device functions even in areas with underdeveloped communication infrastructure or unstable communication environments. This results in not only reduced communication costs but also improved communication quality.
[0078] Improved real-time data processing and display: Because raw images are processed directly within the device, network latency associated with sending and receiving data to and from the cloud is virtually eliminated. This minimizes the time it takes for information to be provided to caregivers after an emergency occurs, dramatically improving the real-time nature of information delivery. For example, a parent's smartphone will receive a notification of the situation within seconds of a child falling, prompting a quick response (e.g., rushing to the scene, calling for emergency services). This solves the crucial technical challenge of response speed for monitoring devices.
[0079] Improved data robustness (security): Captured raw image data is never transmitted outside the information processing device 10 and is immediately discarded within the device. This fundamentally eliminates security risks such as data leakage, unauthorized access, or misuse on cloud servers or communication paths. This is the most powerful and radical solution for protecting privacy and provides a decisive technological advantage in ensuring user trust, especially in monitoring devices that handle sensitive data such as the personal information of children and the elderly.
[0080] Improvements in usability include preventing accidental operation, reducing the user's input burden, and improving visibility. To prevent accidental operation, the camera function is physically and logically hidden from the UI / UX of the information processing device 10 and limited to automatic activation by AI, thereby reliably preventing the child being monitored from unintentionally operating the camera or using it as a toy out of curiosity. This prevents problems such as unnecessary shooting, battery drain, and increased communication volume, and ensures the reliability and stable operation of the monitoring function over the long term.
[0081] Regarding the reduction of user input burden, the AI autonomously makes emergency decisions based on complex sensor information, so the monitored person (child) does not need to consciously perform complex operations (e.g., pressing a specific button multiple times or reciting a specific phrase) to communicate a dangerous situation accurately. When the monitored person does act deliberately in an emergency, a single intuitive action is enough: pressing the SOS button or calling out a short word (e.g., "Help"). This significantly reduces the psychological and physical burden on the monitored person.
[0082] To improve visibility, the application on the parent terminal 20 displays anonymized non-verbal information (facial expressions, posture, environment type, etc.) transmitted from the information processing device 10 not only as text information, but also graphically in a visually easy-to-understand format, such as icons, graphs, and abstract images. This makes it easier for parents to intuitively and quickly understand their child's situation, reducing the cognitive burden of grasping the information. Furthermore, by presenting appropriate responses as needed (e.g., route guidance on a map, dial buttons for emergency contacts), the burden on the parent is also reduced.
[0083] These technological improvements dramatically enhance the functionality, safety, efficiency, and usability of a specific type of computer system, namely information processing devices (monitoring devices), and contribute significantly to improving computer functionality and user interfaces.
[0084] [General tasks] One of the purposes of this disclosure is to simultaneously address several challenges in camera-equipped monitoring devices, including privacy protection, device reliability and sustainability, and improving the quality of understanding the situation of the person being monitored.
[0085] Issues corresponding to [Appendix 1] One of the purposes of this disclosure is to safely understand the situation of those being monitored while protecting their privacy. [Appendix 1] An information processing device comprising: a control unit configured to acquire acceleration information, acquire audio information, and automatically activate a camera to capture an image when an emergency is detected based on the acquired acceleration information and audio information; an image processing unit configured to anonymize a person's face in the captured image; and a communication unit configured to transmit the anonymized image to an external destination and immediately discard the original captured image. According to this information processing device, combining strict event-driven recording with real-time anonymization processing within the device fundamentally avoids unintended privacy violations of children and surrounding third parties, providing an environment in which caregivers can use camera-equipped monitoring devices with peace of mind. Furthermore, by thoroughly preventing misuse of the camera, battery life, which is extremely important for monitoring devices, is ensured (for example, it is possible to maintain battery life equivalent to or better than conventional GPS monitoring devices), and unnecessary increases in communication volume and storage capacity can be eliminated. As a result, the core functions of the monitoring device, such as location information notification and emergency calls, operate stably, and the reliability of the entire system is dramatically improved.
In addition, it becomes possible to safely and efficiently provide parents with non-verbal information such as the facial expressions, posture, and surrounding environment of the child being monitored, while giving maximum consideration to privacy. As a result, parents can understand the child's situation accurately at a level that could not be obtained with conventional voice text or location information alone, enabling them to make quicker and more accurate judgments and responses in emergencies, and to provide appropriate words and support in response to subtle changes in the child during normal times, dramatically improving the quality of situation sharing between parents and children.
[0086] Issues corresponding to [Appendix 2] One of the purposes of this disclosure is to reduce false positives and achieve more accurate and efficient emergency detection. [Appendix 2] An information processing device according to claim 1, wherein the control unit further detects an emergency based on a plurality of pieces of information from among the acceleration information, the audio information, the location information, and the operation information of the SOS button, and on an AI learning result based on the plurality of pieces of information, including a learning result based on the past behavior history of the information processing device. This allows the AI to learn the time-series changes and correlations of the sensor data. By applying dynamic weighting, anomaly detection models, and learning from past false positives, it dramatically reduces false positives and minimizes battery consumption.
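As an illustration only, the dynamic weighting and false-positive learning mentioned above could be sketched as follows. The disclosure does not specify a model, so the class name `EmergencyDetector`, the weights, the threshold, the learning rate, and the rule that a pressed SOS button always counts as an emergency are all assumptions made for this sketch. Each signal is assumed to be pre-normalized to a score between 0 and 1.

```python
from dataclasses import dataclass, field


@dataclass
class EmergencyDetector:
    """Hypothetical fusion of per-signal scores in [0, 1] into one decision."""

    # Assumed starting weights; the source does not give concrete values.
    weights: dict = field(default_factory=lambda: {
        "acceleration": 0.4,
        "audio": 0.4,
        "location": 0.2,
    })
    threshold: float = 0.5      # assumed decision threshold
    learning_rate: float = 0.2  # assumed step size for false-positive feedback

    def score(self, signals: dict) -> float:
        """Weighted average of the signals; missing signals count as 0."""
        total = sum(self.weights.values())
        weighted = sum(w * signals.get(name, 0.0) for name, w in self.weights.items())
        return weighted / total

    def is_emergency(self, signals: dict) -> bool:
        # Assumption: an explicit SOS button press bypasses the score.
        if signals.get("sos_button", 0.0) >= 1.0:
            return True
        return self.score(signals) >= self.threshold

    def report_false_positive(self, signals: dict) -> None:
        """Down-weight each signal in proportion to how strongly it fired."""
        for name in self.weights:
            self.weights[name] *= 1.0 - self.learning_rate * signals.get(name, 0.0)
```

In this sketch a single strong jolt on the accelerometer stays below the threshold, while a jolt combined with a loud sound crosses it. Calling `report_false_positive` after a false alarm lowers the score that the same pattern produces next time.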
[0087] Issues corresponding to [Appendix 3] One of the purposes of this disclosure is to protect the privacy of third parties in real time through image processing within the device. [Appendix 3] An information processing device according to claim 1, wherein the image processing unit further analyzes the captured image in real time using an edge AI built into the device and anonymizes the faces of persons other than the person being monitored. This enables high-speed and high-precision anonymization processing within the constraints of a small, low-power monitoring device, eliminating the risk of raw data leakage to external parties.
[0088] Issues corresponding to [Appendix 4] One of the purposes of this disclosure is to ensure that personally identifiable information is masked and to enhance privacy protection. [Appendix 4] An information processing device according to claim 1, wherein the image processing unit further analyzes the captured image in real time using an edge AI built into the device and masks personally identifiable information (such as license plates and nameplates). This allows for more comprehensive protection of third-party privacy and reduces the risk of leakage of personal information other than that of the monitored individuals, thanks to the technological advantages of real-time processing using edge AI.
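The selection logic of Appendices 3 and 4 (mask third-party faces and personally identifiable items, leave the monitored person's face) can be sketched as below. This is a minimal sketch, not the disclosed implementation: the detections are assumed to come from an on-device model that is not shown, the `Detection` type and its field names are hypothetical, and the image is simplified to a grid of grayscale values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical output of an on-device detector (not shown here)."""
    kind: str                      # e.g. "face", "license_plate", "nameplate"
    top: int
    left: int
    bottom: int                    # exclusive
    right: int                     # exclusive
    is_monitored_person: bool = False


# Kinds treated as personally identifiable information in this sketch.
PII_KINDS = {"license_plate", "nameplate"}


def should_mask(detection: Detection) -> bool:
    """Mask third-party faces and PII; keep the monitored person's face."""
    if detection.kind == "face":
        return not detection.is_monitored_person
    return detection.kind in PII_KINDS


def mask_regions(image, detections, fill=0):
    """Return a copy of a grayscale image (list of rows) with regions filled."""
    masked = [row[:] for row in image]
    height = len(masked)
    width = len(masked[0]) if masked else 0
    for d in detections:
        if not should_mask(d):
            continue
        # Clamp the box to the image bounds before filling.
        for y in range(max(0, d.top), min(height, d.bottom)):
            for x in range(max(0, d.left), min(width, d.right)):
                masked[y][x] = fill
    return masked
```

The function returns a new image rather than editing in place, so the caller decides explicitly when the unmasked original is discarded.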
[0089] Issues corresponding to [Appendix 5] One of the purposes of this disclosure is to fundamentally eliminate the risk of privacy infringement and to optimize communication volume and storage capacity. [Appendix 5] An information processing apparatus according to claim 1, wherein the communication unit further transmits only low-capacity metadata or abstract images anonymized by the image processing unit to the outside, and immediately discards the captured original image within the device. This fundamentally eliminates the risk of raw images being uploaded outside the device, providing the strongest protection against data leaks and misuse. Furthermore, low-volume data transmission optimizes communication volume and storage capacity, improving the reliability and sustainability of the monitoring function.
[0090] Issues corresponding to [Appendix 6] One of the purposes of this disclosure is to prevent misuse of cameras by children and to maintain the reliability of the monitoring function. [Appendix 6] An information processing device according to claim 1, wherein the control unit further hides the camera function from the user interface of the information processing device and automatically activates the camera only when an emergency is detected. This directly addresses the risks of battery drain and increased data usage caused by children using the camera as a toy, and significantly improves the reliability and sustainability of the monitoring function.
[0091] Issues corresponding to [Appendix 7] One of the purposes of this disclosure is to provide parents with high-quality nonverbal information about their child's situation, without relying on raw images. [Appendix 7] An information processing device according to claim 5, wherein the communication unit further transmits to the outside information including at least one of the facial expression information of the person being monitored, posture information, and type information of the surrounding environment, as anonymized low-capacity metadata. This provides parents with crucial clues to comprehensively assess their child's mental and physical state and surrounding environment without raw images, which is an added value not found in conventional monitoring devices. It also significantly improves the "resolution" of the monitoring function, thus justifying its inventiveness.
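A minimal sketch of the metadata-only transmission described in Appendices 5 and 7 follows. The field names, the label values, the `analyze` placeholder, and the `send` callback are all hypothetical; the disclosure names the three categories of information but not a data format. JSON is used here only as a convenient example encoding.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SceneMetadata:
    """Anonymized, low-capacity description of a captured scene."""
    expression: str   # facial expression of the monitored person
    posture: str      # posture of the monitored person
    environment: str  # type of surrounding environment


def analyze(raw_image: bytearray) -> SceneMetadata:
    """Placeholder for on-device inference; returns fixed example labels."""
    return SceneMetadata(expression="distressed", posture="lying", environment="roadside")


def process_and_discard(raw_image: bytearray, send) -> bytes:
    """Send only metadata, then discard the raw image even if sending fails."""
    metadata = analyze(raw_image)
    payload = json.dumps(asdict(metadata)).encode("utf-8")
    try:
        send(payload)
    finally:
        # Overwrite and empty the buffer. In Python this is best effort only;
        # it does not guarantee that no other copy exists in memory.
        raw_image[:] = bytes(len(raw_image))
        raw_image.clear()
    return payload
```

The `finally` clause reflects the requirement that the original image is discarded regardless of whether transmission succeeds. The payload in this sketch is well under one hundred bytes, which illustrates why the transmitted volume stays small compared with a raw image.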
[0092] Issues corresponding to [Appendix 8] One of the purposes of this disclosure is to prevent misuse while providing flexibility in camera use to meet parents' needs. [Appendix 8] An information processing device according to claim 1, wherein the control unit further restricts the operation of the camera function based on the camera function usage mode set by the parent (including emergency-only mode and parent-request-only mode), and / or the maximum number of shots and time per day. This allows parents to actively control and restrict the operation of the device's camera function, encouraging appropriate use of the device, balancing battery consumption optimization with privacy protection, and improving its reliability as a monitoring device.
[0093] Issues corresponding to [Appendix 9] One of the purposes of this disclosure is to ensure the healthy operation of the device and the continuity of its monitoring function. [Appendix 9] An information processing device according to claim 1, wherein the control unit further automatically and temporarily suspends the camera function when the battery level of the information processing device falls below a threshold and / or when the AI detects excessive operation of the camera. This avoids the risk of the entire monitoring function shutting down due to battery depletion or device malfunction, and ensures the continuity of the most important functions (location information and voice calls), thereby guaranteeing reliability and stable operation as a monitoring device.
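The usage restrictions of Appendix 8 and the self-protection check of Appendix 9 can be combined into a single permission decision, sketched below. The names `CameraPolicy`, `CameraMode`, `Trigger`, and `may_activate_camera`, as well as the default limits, are assumptions for illustration. The sketch covers only the battery threshold and the daily shot limit, not AI-based detection of excessive operation or the daily time limit.

```python
from dataclasses import dataclass
from enum import Enum


class CameraMode(Enum):
    EMERGENCY_ONLY = "emergency_only"
    PARENT_REQUEST_ONLY = "parent_request_only"


class Trigger(Enum):
    EMERGENCY = "emergency"
    PARENT_REQUEST = "parent_request"


@dataclass(frozen=True)
class CameraPolicy:
    """Parent-configured limits; the default values are assumed."""
    mode: CameraMode = CameraMode.EMERGENCY_ONLY
    max_shots_per_day: int = 10
    min_battery_percent: int = 20


def may_activate_camera(policy: CameraPolicy, trigger: Trigger,
                        shots_today: int, battery_percent: int) -> bool:
    """Return True only if every restriction allows the camera to start."""
    # Self-protection: keep the remaining battery for location and calls.
    if battery_percent < policy.min_battery_percent:
        return False
    # Usage restriction: daily shot limit set by the parent.
    if shots_today >= policy.max_shots_per_day:
        return False
    # Usage mode: the trigger must match the configured mode.
    if policy.mode is CameraMode.EMERGENCY_ONLY:
        return trigger is Trigger.EMERGENCY
    return trigger is Trigger.PARENT_REQUEST
```

Checking the battery first means that a low battery blocks the camera even for a valid trigger, which matches the stated priority of keeping location information and voice calls running.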
[0094] Issues corresponding to [Appendix 10] One of the purposes of this disclosure is to securely control the use of camera functions based on explicit expressions of intent from the person being monitored. [Appendix 10] An information processing device according to claim 1, wherein the detection of the emergency by the control unit further includes detecting the input of a specific voice command or a specific gesture by the person being monitored. This allows the system to safely and reliably reflect the wishes of the person being monitored if they choose to use the camera function, while suppressing unnecessary recording and raising awareness of privacy protection.
[0095] Issues corresponding to [Appendix 11] One of the purposes of this disclosure is to achieve both privacy protection and visual contextual understanding by converting images to a non-photorealistic style. [Appendix 11] An information processing device according to claim 1, wherein the image processing unit further performs a process that reduces the identifiability of a person in the image by converting the entire captured image into a non-photorealistic style. This abstracts the details of the entire image, including people's faces, making it difficult to identify specific individuals. As a result, it enhances the level of privacy protection while also imbuing recorded images with artistic value and visual enjoyment.
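The disclosure does not name a specific style-conversion technique. As a deliberately simple stand-in, the sketch below abstracts a whole grayscale image by block averaging (pixelation) followed by reducing the number of brightness levels (posterization). The function name `stylize` and its parameters are hypothetical, and a real device would more likely use a learned style-transfer model.

```python
def stylize(image, block=2, levels=4):
    """Pixelate and posterize a grayscale image given as a list of rows (0-255)."""
    height = len(image)
    width = len(image[0]) if image else 0
    step = 256 // levels  # width of one brightness band
    out = [[0] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Collect the pixels of this block, clipped at the image edge.
            cells = [(y, x)
                     for y in range(top, min(top + block, height))
                     for x in range(left, min(left + block, width))]
            mean = sum(image[y][x] for y, x in cells) // len(cells)
            # Snap the block average down to the start of its band.
            value = (mean // step) * step
            for y, x in cells:
                out[y][x] = value
    return out
```

Because every block is replaced by one quantized value, fine detail such as facial features is lost across the whole image while the coarse layout of the scene remains visible.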
[0096] Issues corresponding to [Appendix 12] One of the purposes of this disclosure is to reduce reliance on external networks and enable more direct and secure data transfer. [Appendix 12] An information processing device according to claim 1, wherein the communication unit is further configured to transmit the anonymized image directly to a parent terminal by short-range wireless communication. This allows for direct data exchange between the parent terminal and the information processing device without going through a cloud server, further reducing the risk of privacy violations and minimizing communication delays.
[0097] Issues corresponding to [Appendix 13] One of the purposes of this disclosure is to optimize privacy protection and convenience in the local storage of anonymized images. [Appendix 13] An information processing device according to claim 1, wherein the communication unit is configured to temporarily store the anonymized image in local storage within the device for a predetermined period of time before transmitting the anonymized image to an external destination, and to discard the anonymized image after the predetermined period has elapsed or after the transfer to the parent terminal has been completed. This ensures the convenience of parents being able to retrieve anonymized images directly from the device later, while minimizing privacy risks from the long-term storage of unnecessary images and reducing storage capacity consumption.
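The two discard conditions of Appendix 13 (retention period elapsed, or transfer completed) can be sketched with a small in-memory store. The class name `TemporaryImageStore`, its method names, and the one-hour default retention are assumptions; a real device would persist to flash storage rather than a dictionary. The optional `now` argument exists only so that the behavior can be checked without waiting.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TemporaryImageStore:
    """Holds anonymized images until they expire or are transferred."""

    retention_seconds: float = 3600.0  # assumed retention period
    _items: dict = field(default_factory=dict, init=False, repr=False)

    def put(self, image_id: str, anonymized: bytes, now: float = None) -> None:
        stored_at = time.monotonic() if now is None else now
        self._items[image_id] = (anonymized, stored_at)

    def get(self, image_id: str, now: float = None):
        """Return the image, or None if it expired or was never stored."""
        self.purge_expired(now)
        entry = self._items.get(image_id)
        return entry[0] if entry else None

    def mark_transferred(self, image_id: str) -> None:
        """Discard immediately once the parent terminal has the image."""
        self._items.pop(image_id, None)

    def purge_expired(self, now: float = None) -> int:
        """Discard every image older than the retention period."""
        now = time.monotonic() if now is None else now
        expired = [key for key, (_, stored_at) in self._items.items()
                   if now - stored_at >= self.retention_seconds]
        for key in expired:
            del self._items[key]
        return len(expired)
```

Only anonymized data is ever placed in this store. The raw image is assumed to have been discarded before `put` is called.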
[0098] Issues corresponding to [Appendix 14] One of the purposes of this disclosure is to acquire multifaceted emergency information using existing camera functions without relying on dedicated sensors. [Appendix 14] An information processing device according to claim 1, wherein the sensor further detects changes in the posture of the monitored object corresponding to acceleration information, and / or abnormalities in ambient sounds corresponding to sound information, by analyzing image data captured by the camera.
[Explanation of Symbols]
[0099]
100 Monitoring system
10 Information processing device (monitoring device)
11 Processor
12 Memory
13 Sensor unit
14 Camera unit
15 Image processing unit
16 Communication unit
17 Storage
18 Display
19 Power supply unit
20 Parent terminal
30 Server
40 Network
101 Emergency detection unit
102 Camera control unit
103 Image processing unit
104 Communication unit
105 Notification unit
106 Settings management unit
107 Behavior history learning unit
130 Sensor information acquisition unit
S100 Sensor information acquisition step
S101 Emergency detection step
S102 Emergency detection and judgment step
S103 Usage restriction / self-protection status check step
S104 Camera activation permission decision step
S105 Camera activation / notification step
S200 Raw image input step
S201 Anonymization / de-identification processing step
S202 Data transmission step
S203 Original image immediate discard step
Claims
1. An information processing device comprising: a control unit configured to acquire acceleration information, acquire audio information, and automatically activate a camera to capture an image when an emergency is detected based on the acquired acceleration information and audio information; an image processing unit configured to anonymize a person's face in the captured image; and a communication unit configured to transmit the anonymized image to an external destination and immediately discard the original captured image.
2. The information processing device according to claim 1, wherein the control unit is further configured to detect an emergency based on a plurality of pieces of information from among the acceleration information, the audio information, location information, and SOS button operation information, and on an AI learning result based on the plurality of pieces of information, including a learning result based on the past behavior history of the information processing device.
3. The information processing device according to claim 1, wherein the image processing unit is further configured to analyze the captured image in real time using an edge AI built into the device and to anonymize the faces of persons other than the person being monitored.
4. The information processing device according to claim 1, wherein the image processing unit is further configured to analyze the captured image in real time using an edge AI built into the device and to mask personally identifiable information (such as license plates and nameplates).
5. The information processing device according to claim 1, wherein the communication unit is further configured to transmit to the outside only low-capacity metadata or abstracted images anonymized by the image processing unit, and to immediately discard the original captured image within the device.
6. The information processing device according to claim 1, wherein the control unit is further configured to hide the camera function from the user interface of the information processing device and to automatically activate the camera only when an emergency is detected.
7. The information processing device according to claim 5, wherein the communication unit is further configured to transmit to the outside, as the anonymized low-capacity metadata, information including at least one of facial expression information of the person being monitored, posture information, and type information of the surrounding environment.
8. The information processing device according to claim 1, wherein the control unit is further configured to restrict the operation of the camera function based on a camera function usage mode set by the parent (including an emergency-only mode and a parent-request-only mode), and / or a maximum number of shots and time per day.
9. The information processing device according to claim 1, wherein the control unit is further configured to automatically and temporarily suspend the camera function when the battery level of the information processing device falls below a threshold, and / or when the AI detects excessive operation of the camera.
10. The information processing device according to claim 1, wherein the detection of the emergency by the control unit further includes detecting a specific voice command or a specific gesture by the person being monitored.
11. The information processing device according to claim 1, wherein the image processing unit is further configured to perform a process that reduces the identifiability of a person in the image by converting the entire captured image into a non-photorealistic style.
12. The information processing device according to claim 1, wherein the communication unit is further configured to transmit the anonymized image directly to a parent terminal by short-range wireless communication.
13. The information processing device according to claim 1, wherein the communication unit is configured to temporarily store the anonymized image in local storage within the device for a predetermined period of time before transmitting the anonymized image to an external destination, and to discard the anonymized image after the predetermined period has elapsed or after the transfer to the parent terminal has been completed.
14. The information processing device according to claim 1, wherein the acceleration information and / or the audio information is obtained by analyzing image data captured by the camera to detect changes in the posture of the monitored object corresponding to the acceleration information, and / or abnormalities in ambient sounds corresponding to the audio information.
15. An information processing method in which a processor: acquires acceleration information; acquires audio information; automatically activates a camera to capture an image when an emergency is detected based on the acquired acceleration information and audio information; anonymizes a person's face in the captured image; and transmits the anonymized image to an external destination and immediately discards the original captured image.
16. A program that causes a processor to execute a process comprising: acquiring acceleration information; acquiring audio information; automatically activating a camera to capture an image when an emergency is detected based on the acquired acceleration information and audio information; anonymizing a person's face in the captured image; and transmitting the anonymized image to an external destination and immediately discarding the original captured image.
17. A system comprising a server and a terminal device, wherein the terminal device comprises: a control unit configured to acquire acceleration information, acquire audio information, and automatically activate a camera to capture an image when an emergency is detected based on the acquired acceleration information and audio information; an image processing unit configured to anonymize a person's face in the captured image; and a communication unit configured to transmit the anonymized image to an external destination and immediately discard the original captured image.