Information processing device, information processing system, information processing method, and program
The information processing device addresses privacy challenges in child monitoring by analyzing and deleting sensitive data on the edge, generating deletion certificates, and optimizing privacy policies, ensuring robust protection and convenience.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- MIXI INC
- Filing Date
- 2025-06-17
- Publication Date
- 2026-07-01
AI Technical Summary
Conventional child monitoring devices face challenges in managing sensitive personal information due to strict privacy regulations and user difficulty in setting optimal privacy settings, leading to insufficient protection and convenience.
The information processing device analyzes and deletes sensitive data on an edge computing device, generates deletion certification information, and transmits only metadata to a server, while AI dynamically generates privacy policies based on user inputs.
The device ensures a high level of user privacy protection, reduces compliance risks, and enhances service convenience by automating privacy settings without requiring specialized knowledge, while minimizing storage and communication costs.
Abstract
Description
Technical Field
[0001] The present disclosure relates to an information processing apparatus, an information processing system, an information processing method, and a program, and particularly to a technique for dynamically generating a user's privacy settings using AI (Artificial Intelligence) and for processing and deleting personal data on an edge computing device.
Background Art
[0002] Conventionally, there have been child monitoring devices having position information such as GPS and a communication function. These devices have provided functions for sending and receiving voice messages between parents and children and learning behavior patterns by AI, giving a sense of security to guardians.
[0003] However, these conventional systems have commonly used an architecture that, for convenience, stores sensitive personal information such as voice data on a cloud server for long periods. Such an architecture is subject to strict privacy regulations such as COPPA (Children's Online Privacy Protection Act) in the United States and GDPR (General Data Protection Regulation) in Europe. Conventional mechanisms have therefore been insufficient from the standpoint of privacy protection.
[0004] For example, when privacy settings are left to the users themselves, the setting items are technical and difficult to understand, and it can be extremely difficult for a guardian to configure the operation of the device optimally in line with abstract notions such as the guardian's own view of privacy and the characteristics of the child.
Prior Art Documents
Patent Documents
[0005]
Patent Document 1
Non-Patent Documents
[0006]
Non-Patent Document 1
[0007] The present disclosure has been made in view of the aforementioned problems with the prior art, and its purpose is to provide a new information processing device, information processing system, information processing method, and program that can achieve both a high level of user privacy protection and service convenience and security in information processing.
Means for Solving the Problem
[0008] To solve the above problems, an information processing device according to one aspect of the present disclosure comprises: an audio input unit that acquires ambient sound; a processor that analyzes and deletes the audio data acquired by the audio input unit and generates deletion certification information; and a communication unit that transmits the deletion certification information generated by the processor to an external party.
Brief Description of the Drawings
[0009]
[Figure 1] This is an overview diagram showing the overall configuration of an information processing system according to one embodiment of the present disclosure.
[Figure 2A] This is a block diagram showing an example of the hardware configuration of the child monitoring device according to this embodiment.
[Figure 2B] This is a block diagram showing an example of the server hardware configuration according to this embodiment.
[Figure 3A] This is a block diagram showing an example of the functional configuration of the child monitoring device according to this embodiment.
[Figure 3B] This block diagram shows an example of the server's functional configuration according to this embodiment.
[Figure 4A] This diagram shows an example of the data structure of a privacy policy.
[Figure 4B] This figure shows an example of the data structure of deletion certificate information.
[Figure 5] This flowchart shows an example of the process for automatically generating a privacy policy.
[Figure 6] This flowchart shows an example of the audio data processing flow on a child monitoring device.
[Figure 7] This figure shows an example of a privacy settings screen displayed on a user's device.
[Figure 8] This flowchart shows an example of the process for adjusting the privacy policy according to a modification of this disclosure.
[Figure 9] This flowchart shows an example of the flow of the continuous learning process of the policy generation AI according to a modification of this disclosure.
[Figure 10] This figure shows an example of a policy confirmation and adjustment screen displayed on the user's terminal in a modification of this disclosure.
[Figure 11] This figure shows an example of an activity report screen displayed on a user's terminal.
[Figure 12] This is a sequence diagram showing the entire process of generating and applying a privacy policy according to one embodiment of this disclosure.
[Figure 13] This block diagram shows an example of a detailed internal functional configuration of the on-device AI analysis unit according to this embodiment.
Modes for Carrying Out the Invention
[0010] The embodiments of this disclosure will be described in detail below with reference to the drawings. The components of each embodiment can be combined as appropriate to the extent technically feasible.
[0011] (Overview of the entire system) FIG. 1 is an overview diagram showing the overall configuration of an information processing system 1 according to an embodiment of the present disclosure. The information processing system 1 of the present embodiment is composed of three main elements: a child monitoring device 100 possessed by a child, a user terminal 200 operated by a guardian, and a server 300 managed by a service provider. These are communicably connected to each other via a wide-area network 400 including the Internet and a mobile phone network, and operate in cooperation. The child monitoring device 100 corresponds to an information processing device according to an aspect of the present disclosure.
[0012] The basic idea of this system is to complete the processing of sensitive raw data related to privacy (voice data in this embodiment) on the edge side as much as possible, that is, within the child monitoring device 100. The guardian inputs intuitive and abstract information such as their own privacy concept and the child's personality through a dedicated application on the user terminal 200. This user setting information is transmitted to the server 300, and the AI on the server 300 interprets this and dynamically generates a "privacy policy" which is a specific operation rule executable by the device 100. The generated privacy policy is distributed to the child monitoring device 100 via the network 400 and applied.
[0013] The child monitoring device 100 analyzes the acquired voice data on the device according to the applied privacy policy. Voice data that the analysis determines need not be stored from the perspective of privacy protection is immediately deleted on the device using a method that makes recovery difficult. Then, only the minimum necessary metadata extracted from the analysis results (e.g., the text of the detected keyword) and "deletion certification information" proving that the voice data has been properly deleted according to the policy are transmitted to the server 300. This fundamentally eliminates the risk, inherent in conventional cloud-centered systems, of long-term storage of sensitive information on the server, and embodies the principle of privacy by design.
[0014] (Interpretation of the concept of constituent elements) In this specification, each constituent element of a claim is used in a manner that also encompasses higher-level and intermediate concepts, as described below. The voice input unit is not limited to the voice input unit 104, but can be more broadly understood as a biometric information acquisition unit (intermediate concept) that acquires various information to understand the user's state, such as images from a camera and heart rate from wearable sensors. Furthermore, in its broadest sense, it can be interpreted as a form of a sensor data acquisition unit (higher concept) that acquires all kinds of sensor information, including acceleration, temperature, and location information. The analysis and deletion of audio data by the processor is not limited to embodiments that analyze audio data and delete it using methods such as DoD 5220.22-M. More broadly, it includes not only deletion but also the concept of data sanitization (intermediate concept), which irreversibly transforms data into a state where individuals cannot be identified, such as anonymization, masking, and data aggregation. Furthermore, as described in Modification 13, ephemeral processing (intermediate concept), which does not record data in non-volatile memory in the first place and erases it immediately upon completion of processing, also has substantially the same effect as deletion. These are based on the idea of data lifecycle management (higher concept), which consistently manages data from generation to disposal. Deletion certification information is not limited to information about individual deletion actions as shown in Figure 4B, but more broadly includes process execution logs that prove that a series of processes such as deletion, analysis, and notification have been performed, as well as audit trails used for compliance purposes (both are intermediate concepts). 
These are a type of status information (higher concept) that communicates the state of a device and processing results to the outside.
[0015] (Hardware configuration) Next, the hardware configuration of each device constituting this embodiment will be described in more detail.
[0016] Figure 2A is a block diagram showing an example of the hardware configuration of the child monitoring device 100. The child monitoring device 100 is a small device intended to be carried by children on a daily basis, and at its core is a control unit 101 that manages the operation of the entire device. The control unit 101 is typically an ARM-based SoC (System-on-a-Chip) that operates with low power consumption and includes a CPU core, a DSP (Digital Signal Processor), and a processor specialized for AI processing such as an NPU (Neural Processing Unit). The storage unit 102 consists of volatile memory for storing programs and temporary data and non-volatile memory for storing the OS, applications, AI models, etc., and is composed of semiconductor memory such as LPDDR4 RAM, eMMC, or UFS. The communication unit 103 includes a cellular modem compatible with LTE and 5G, as well as Wi-Fi and Bluetooth modules, and is responsible for connecting to the network 400. The voice input unit 104 consists of a highly sensitive MEMS microphone and an audio codec including a noise-canceling circuit, which clearly acquires ambient sounds. The location information acquisition unit 105 is a GNSS receiver compatible with multiple satellite positioning systems such as GPS, GLONASS, and Galileo, and acquires highly accurate location information. In addition to these, it also includes a battery, accelerometer, speaker, LED indicator, etc. (not shown).
[0017] The user terminal 200 is typically a smartphone or tablet owned by a parent or guardian. Its hardware configuration is typical, featuring a high-resolution touchscreen display, a high-performance application processor, large memory and storage capacity, and various communication modules (Wi-Fi, Bluetooth, NFC, etc.). Parents use this touchscreen to operate a dedicated application, inputting privacy settings and reviewing activity reports.
[0018] Server 300 is typically a physical server located within a data center or a virtual server built on a cloud computing environment. As shown in Figure 2B, its hardware configuration is designed for high availability and reliability, including a multi-core server CPU (e.g., Intel Xeon, AMD EPYC), large-capacity memory with error correction (ECC RAM), storage with high-speed SSDs or HDDs in a RAID configuration, and a high-speed network interface card (NIC) for connecting to a high-bandwidth network, in order to handle simultaneous access from a large number of clients.
[0019] (Functional block configuration) Next, the functional configuration of each device in this embodiment, including their coordination and processing details, will be described in more detail. These functions are mainly realized by the control unit (processor) of each device executing programs stored in the memory unit.
[0020] Figure 3A shows an example of the functional configuration of the child monitoring device 100. The voice acquisition unit 111 continuously acquires a PCM format digital voice stream from the voice input unit 104 (microphone) and temporarily stores it in an internal ring buffer. When the policy application unit 112 receives a new privacy policy from the server 300, it verifies it and stores it in a predetermined area in the storage unit 102. At the same time, it notifies the currently operating on-device AI analysis unit 113 that the policy to be referenced has been updated and prompts it to reset the parameters of the AI model as necessary.
[0021] The on-device AI analysis unit 113 is one of the core functions of this device and reads the privacy policy provided by the policy application unit 112 as a kind of configuration file. Then, it runs a keyword spotting model on the audio data in the ring buffer using the keyword list specified in the policy, and extracts acoustic features based on emotion categories and thresholds and evaluates them with an emotion analysis model. If the analysis detects an event (keyword or specific emotion) that matches the policy, it generates structured analysis result data that includes information about that event (e.g., detected keyword "pain", emotion "fear", level "9.2", timestamp and hash value of the corresponding audio data). This analysis result data is passed to the notification control unit 116 and the data deletion unit 114 for subsequent processing.
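The structured analysis result data described above can be illustrated as follows. This is a minimal sketch only: the class name, field names, and hash algorithm (SHA-256) are assumptions introduced for illustration and are not specified in this disclosure.

```python
import hashlib
import time
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class AnalysisResult:
    """Hypothetical record for one detected event (field names are assumptions)."""
    keyword: Optional[str]   # detected keyword, e.g. "pain"
    emotion: Optional[str]   # detected emotion category, e.g. "fear"
    level: float             # emotion level reported by the model, e.g. 9.2
    timestamp: float         # capture time in seconds since the epoch
    audio_sha256: str        # hash of the audio block the event refers to


def build_result(audio_block, keyword, emotion, level, timestamp=None):
    """Package a detection together with a hash of the corresponding audio."""
    return AnalysisResult(
        keyword=keyword,
        emotion=emotion,
        level=level,
        timestamp=time.time() if timestamp is None else timestamp,
        audio_sha256=hashlib.sha256(audio_block).hexdigest(),
    )


# Example using the values mentioned in the text; the audio bytes are dummy data.
result = build_result(b"\x00\x01" * 160, "pain", "fear", 9.2, timestamp=1750000000.0)
record = asdict(result)  # plain dict, ready to hand to later processing stages
```

Carrying only a hash of the audio, rather than the audio itself, lets the later deletion and notification steps refer to the same block without holding another copy of the raw data.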
[0022] The data deletion unit 114 receives analysis result data from the on-device AI analysis unit 113. It compares the information contained in that data with the privacy policy and, if it determines that the audio data is "subject to deletion" (for example, if it is determined that there is no urgency), it performs an overwrite deletion process on the corresponding audio data block in the ring buffer held by the audio acquisition unit 111 using the method specified in the policy (e.g., DoD 5220.22-M). Once it confirms that the deletion process has been successfully completed, it passes information including the hash value and timestamp of the deleted audio data, and the processing completion status, to the deletion certificate information generation unit 115.
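The overwrite deletion of an audio block in the ring buffer can be sketched as below. This is an illustration under stated assumptions: the three-pass pattern (zeros, ones, random bytes) is loosely modeled on the pattern commonly associated with DoD 5220.22-M, the function name is hypothetical, and an in-memory overwrite in Python does not by itself address copies held elsewhere or flash wear leveling.

```python
import hashlib
import os


def overwrite_block(buffer: bytearray, start: int, length: int) -> str:
    """Overwrite buffer[start:start+length] in place.

    Returns the SHA-256 hash of the original contents, computed before the
    overwrite, so it can be passed on for the deletion certificate.
    """
    end = start + length
    if start < 0 or length < 0 or end > len(buffer):
        raise ValueError("block lies outside the buffer")
    original_hash = hashlib.sha256(bytes(buffer[start:end])).hexdigest()
    # Assumed pass pattern: all zeros, all ones, then random bytes.
    for fill in (b"\x00" * length, b"\xff" * length, os.urandom(length)):
        buffer[start:end] = fill  # same length, so the buffer size is unchanged
    return original_hash


# Dummy ring buffer holding four 20-byte blocks; only the first is deleted.
ring = bytearray(b"secret-audio-samples" * 4)
digest = overwrite_block(ring, 0, 20)
```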
[0023] When the deletion certificate information generation unit 115 receives information from the data deletion unit 114, it starts generating certificate information. It obtains the identifier of the currently applied policy from the policy application unit 112, and combines this with the information received from the data deletion unit 114 and the current time to generate structured deletion certificate information as shown in Figure 4B. The generated certificate information is placed in a queue to be sent to the server 300 via the communication unit 103.
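A minimal sketch of this generation step follows. The field names are assumptions (the actual structure of Figure 4B is not reproduced here), and the added digest over the canonical JSON form is an illustrative integrity check rather than a feature stated in this disclosure.

```python
import hashlib
import json
import time
from collections import deque


def make_deletion_certificate(policy_id, audio_sha256, deleted_at, status, now=None):
    """Combine the deletion report with the active policy id and the current time."""
    cert = {
        "policy_id": policy_id,        # identifier of the currently applied policy
        "audio_sha256": audio_sha256,  # hash of the deleted audio data
        "deleted_at": deleted_at,      # timestamp reported by the deletion step
        "status": status,              # processing completion status
        "issued_at": time.time() if now is None else now,
    }
    # Assumed extra: a digest over the canonical JSON lets a receiver detect corruption.
    canonical = json.dumps(cert, sort_keys=True, separators=(",", ":"))
    cert["digest"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return cert


send_queue = deque()  # stands in for the queue of items awaiting transmission
cert = make_deletion_certificate(
    "policy-001", "ab" * 32, 1750000000.0, "completed", now=1750000001.0
)
send_queue.append(cert)
```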
[0024] The notification control unit 116 receives analysis result data from the on-device AI analysis unit 113, and if it determines that the information is subject to "urgent notification" according to the policy, it activates the notification process. In addition to the analysis result data, it packages additional information such as the current location information obtained from the location information acquisition unit 105 and the device's battery level, and sends a high-priority urgent notification request to the server 300 via the communication unit 103.
[0025] Figure 13 shows a more detailed internal functional configuration of the on-device AI analysis unit 113, which is the core of this embodiment. The on-device AI analysis unit 113 includes an audio buffering unit 1301 that temporarily holds audio data from the audio acquisition unit 111, a feature extraction unit 1302 that extracts acoustic features such as MFCCs from the audio data, a keyword inference unit 1303 that executes a keyword spotting model, an emotion / context inference unit 1304 that executes an emotion analysis model, and an overall judgment unit 1305 that compares the results of each inference unit with the privacy policy to determine the final action. This multi-stage configuration enables highly advanced and reliable on-device analysis.
[0026] Figure 3B shows an example of the functional configuration of server 300. The reception unit 311 is a window that receives all external communications, such as user setting information sent from user terminals 200, analysis metadata, deletion certificate information, and emergency notification requests sent from child monitoring devices 100. It verifies the legitimacy of the received data and distributes the information to subsequent functional units that perform appropriate processing according to the type of data.
[0027] When the policy generation unit 312 receives user setting information from the reception unit 311, it uses an internal AI model to generate a privacy policy optimized for that user. At this time, it optionally refers, via the external information linkage unit 313, to external sources such as local suspicious-person information databases provided by the National Police Agency and weather information services, and reflects the risk level appropriate to the context in the policy (e.g., automatically increases the level of security monitoring in areas with a high number of suspicious person reports). The generated policy is sent to the target device 100 via the reception unit 311.
[0028] The information management unit 314 securely stores the analysis metadata and deletion certificate information transmitted from device 100 in a database, linked to the user account. This database serves as a source of information for parents to view past activity history from the user terminal 200. The information management unit 314 provides an API to the application on the user terminal 200 for generating an activity report screen, such as the one shown in Figure 11.
[0029] (Processing flow) Next, the information processing flow in this embodiment will be explained in more detail using Figures 5, 6, and 12.
[0030] Figure 12 is a sequence diagram showing the entire process from the generation of a privacy policy to its application by the user terminal 200, server 300, and child monitoring device 100 working together. First, the parent operates the user terminal 200 and inputs abstract preferences regarding privacy. The user terminal 200 sends these preferences to the server 300 as "user setting information." The policy generation unit 312 of the server 300 uses AI to generate an executable "privacy policy" based on the received user setting information. The generated policy is sent to the child monitoring device 100. The policy application unit 112 of the device 100 applies the policy and responds to the server 300 with the result. In this way, the three parties work together in cooperation, making it possible to seamlessly reflect the user's abstract preferences in the concrete actions of the device.
[0031] Figure 5 is a flowchart of the automated privacy policy generation process. First, when a parent operates the user terminal 200 and enters abstract preference information on the privacy settings screen 700 shown in Figure 7 (step S501), the entered information is sent to the server 300 as user setting information. For example, if the parent drags the slider 701 on the screen towards "Privacy Protection," the application running on the user terminal 200 changes, in response to this operation, the value of an internal parameter (e.g., privacy_preference) that ranges from 0.0 (emphasis on security monitoring) to 1.0 (emphasis on privacy protection). Once this operation is complete, the user setting information, including the set parameter value, is sent as a POST request over HTTPS to a predetermined API endpoint provided by the server 300 (e.g., /api/v1/user_settings). On server 300, the reception unit 311 receives this information and passes it to the policy generation unit 312. The policy generation unit 312 takes the received user setting information as input and analyzes it using an internal AI model (step S502). This analysis can optionally include contextual information such as local security information obtained from external services by the external information linkage unit 313 (step S503). Based on the analysis results, the policy generation unit 312 generates a privacy policy consisting of specific control parameters that the child monitoring device 100 can execute (step S504) and transmits it to the target child monitoring device 100 via the communication unit 303 (step S505).
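The slider operation and the POST request of step S501 can be sketched as follows. The host name, the JSON body layout, and the function names are assumptions; only the parameter name privacy_preference, its 0.0 to 1.0 range, and the endpoint path come from the text. The request object is built but not sent.

```python
import json
import urllib.request

# Hypothetical host; the text gives only the endpoint path.
API_URL = "https://server.example/api/v1/user_settings"


def slider_to_preference(position: float, track_length: float) -> float:
    """Map a slider position to privacy_preference, clamped to [0.0, 1.0]."""
    if track_length <= 0:
        raise ValueError("track_length must be positive")
    return min(1.0, max(0.0, position / track_length))


def build_settings_request(privacy_preference: float) -> urllib.request.Request:
    """Build (without sending) the HTTPS POST carrying the user setting information."""
    body = json.dumps({"privacy_preference": privacy_preference}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A slider dragged three quarters of the way towards "Privacy Protection".
request = build_settings_request(slider_to_preference(180, 240))
```

Clamping on the terminal side keeps out-of-range values from reaching the server, although a real server would still be expected to validate the input itself.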
[0032] Figure 6 is a flowchart of the voice data processing on the child monitoring device 100. When the voice acquisition unit 111 of the child monitoring device 100 acquires voice data (step S601), the on-device AI analysis unit 113 starts processing. The on-device AI analysis unit 113 reads the currently valid privacy policy from the policy application unit 112 and analyzes the voice data according to its contents (step S602). Next, the on-device AI analysis unit 113 compares the analysis results obtained in step S602 (detection keywords, emotion level, etc.) with the privacy policy and determines whether the voice data is urgent or not (step S603).
[0033] If the analysis result falls under the category of "Emergency Notification Target" in the policy (YES in step S603), the on-device AI analysis unit 113 requests processing from the notification control unit 116. The notification control unit 116 performs exception processing, such as sending an emergency notification to the guardian via the server 300 (step S604).
[0034] On the other hand, if the analysis result does not fall under the category of "Emergency Notification Target" (NO in step S603), the on-device AI analysis unit 113 extracts, according to the policy, metadata from the voice data that should be reported to the server later (e.g., some of the transcribed words) (step S605). Then, the on-device AI analysis unit 113 instructs the data deletion unit 114 to delete the data. The data deletion unit 114 deletes the original raw voice data from the storage unit 102 using the method specified in the policy, which makes recovery difficult (step S606). After the deletion is complete, the deletion certificate information generation unit 115 generates deletion certificate information upon receiving notification from the data deletion unit 114 (step S607). Finally, the metadata extracted in step S605 and the deletion certificate information generated in step S607 are sent to the server 300 via the communication unit 103 (step S608).
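The branch structure of steps S601 to S608 can be summarized in a short sketch. The policy fields and message formats are hypothetical, a list of words stands in for the output of the on-device models, and the deletion step is reduced to a single zero-fill pass for brevity.

```python
import hashlib

# Hypothetical policy; the field names are assumptions for illustration.
POLICY = {"id": "policy-001", "urgent_keywords": {"help"}, "report_keywords": {"park"}}


def process_audio(words, audio: bytearray, policy):
    """Return the messages the device would send for one audio block.

    `words` stands in for the result of the on-device analysis (S602).
    """
    detected = set(words)
    urgent = detected & policy["urgent_keywords"]
    if urgent:                                                 # S603: YES
        # S604: the emergency path; the audio is left untouched in this sketch.
        return [{"type": "emergency", "keywords": sorted(urgent)}]
    metadata = sorted(detected & policy["report_keywords"])    # S605
    audio_hash = hashlib.sha256(bytes(audio)).hexdigest()
    audio[:] = bytes(len(audio))                               # S606 (simplified)
    certificate = {"type": "deletion_certificate",             # S607
                   "policy_id": policy["id"], "audio_sha256": audio_hash}
    return [{"type": "metadata", "keywords": metadata}, certificate]  # S608


buf = bytearray(b"abc")
messages = process_audio(["park", "today"], buf, POLICY)
```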
[0035] (Modifications) This disclosure is not limited to the embodiments described above, and various modifications are possible without departing from its spirit. Several modifications are given below.
[0036] (Particularly advantageous embodiments that provide a basis for inventiveness and feasibility) One particularly advantageous embodiment of this disclosure addresses a challenge that has been difficult to solve in the prior art: enabling users without specialized knowledge to reconcile, to a high degree, their privacy values with the conflicting requirement of ensuring the safety of their children. To this end, this embodiment includes a series of coordinated configurations for translating the user's abstract intentions into reliable, concrete device controls. Specifically, the AI model used in the policy generation unit 312 of server 300 is first trained using high-quality training data created by a panel of experts. This training data consists of input data and ground truth label data. The input data is a vector representation of hundreds of "persona scenarios" that combine various child personalities, parental values, living environments, etc. The ground truth label data, in turn, is the optimal combination of dozens of control parameters (the optimal policy) for each persona scenario, determined by consensus among a panel of experts consisting of child psychologists, privacy experts, IT security experts, etc., based on their respective expertise. Next, when the policy generation unit 312 receives user setting information from the user terminal 200, including abstract information such as "the child is prone to worry" and "privacy is valued more highly," it uses the trained AI model to dynamically generate the privacy policy that best suits the user's input. This generation process does not adjust a single parameter; rather, it adjusts multiple control parameters, such as the list of detection keywords, the threshold for the emotion analysis model, and the data deletion method, in a coordinated and non-linear manner. The child monitoring device 100 then performs analysis based on a policy generated specifically for this user.
This configuration allows the user to implement their vague ideas as reliable, concrete device settings, as if they had received consultation from a dedicated expert. This has the remarkable effect of providing users with a high level of satisfaction and unprecedented peace of mind, something that cannot be predicted from a simple combination of functions.
[0037] (First variation: Changing the policy generator) The purpose of this modified version is to enable autonomous operation even when communication with the server 300 is not possible, thereby ensuring privacy protection. To this end, the storage unit 102 of the child monitoring device 100 stores a lightweight version of the AI model of the policy generation unit 312 of the server 300 (for example, a version with reduced model size using techniques such as quantization or knowledge distillation), or a simpler rule-based inference engine. The user inputs abstract information (e.g., "normal mode," "privacy priority mode," etc.) by directly operating buttons or a small display on the device 100. When the control unit 101 of the device 100 receives this input, it starts the stored policy generation engine, generates a privacy policy on the spot, and sets it in the policy application unit 112. This results in benefits such as reduced communication costs, ensured usability in offline environments, and redundancy in the event of server failure.
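The simpler rule-based inference engine mentioned in this variation could, for example, be a lookup from the selected mode to a preset policy. In the sketch below, the mode names follow the examples in the text, while the parameter names and values are assumptions chosen only for illustration.

```python
import copy

# Hypothetical preset table stored on the device.
PRESETS = {
    "normal mode": {
        "keywords": ["help", "stop"], "emotion_threshold": 7.0, "deletion_passes": 1,
    },
    "privacy priority mode": {
        "keywords": ["help"], "emotion_threshold": 9.0, "deletion_passes": 3,
    },
}


def generate_policy_offline(mode: str) -> dict:
    """Generate a privacy policy on the device without contacting the server."""
    try:
        preset = PRESETS[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
    policy = copy.deepcopy(preset)  # never hand out the shared preset itself
    policy["generated_by"] = "on-device rule engine"
    return policy


policy = generate_policy_offline("privacy priority mode")
```

A table-driven engine of this kind trades the flexibility of the server-side model for predictable behavior and a very small footprint, which suits the offline fallback role described here.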
[0038] (Second variation: User feedback and manual adjustments) The purpose of this modified version is to increase the transparency of policies generated by AI and improve user control and satisfaction. To this end, a dedicated application running on the user terminal 200 is modified to include two functions: a "policy visualization unit" that displays the content of the policy received from the server 300 in a format that is easy for humans to understand, and a "policy adjustment reception unit" that accepts adjustment instructions from the user. Figure 8 shows a flowchart of this modified version, and Figure 10 shows an example of the policy confirmation / adjustment screen 1000. After the server 300 generates a policy using AI (step S801), it sends the policy itself, along with visualization data summarizing its content, to the user terminal 200 (step S802). Based on this data, the policy visualization unit on the user terminal 200 displays a policy summary 1001 (e.g., "Detects 'Help'") as shown in Figure 10 (step S803), and asks the user whether adjustments are necessary (step S804). If the user is satisfied with the displayed content, they select the "Apply as is" button 1004, and the process ends. If adjustments are needed, selecting the "Adjust Policy" button 1003 displays an input field 1002 for adding keywords and a slider (not shown) for changing thresholds, allowing for fine-tuning of the policy (step S805). The adjustments are formatted by the policy adjustment reception unit and sent to the server 300 (step S806). The server 300 reflects the adjustments in the original policy and sends and applies the final version of the policy to the device 100 (step S807). This entire process allows users to enjoy the convenience of AI automation while retaining final decision-making power, significantly increasing confidence in the system.
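The step in which the server reflects the user's adjustments in the original policy (step S807) can be sketched as a merge that leaves the AI-generated policy unmodified. The function name and the policy fields are hypothetical.

```python
def apply_adjustments(policy: dict, added_keywords=(), emotion_threshold=None) -> dict:
    """Return a new policy with the user's adjustments merged in."""
    adjusted = dict(policy)
    merged = list(policy.get("keywords", []))
    for word in added_keywords:
        if word not in merged:  # avoid duplicate keywords
            merged.append(word)
    adjusted["keywords"] = merged
    if emotion_threshold is not None:
        adjusted["emotion_threshold"] = emotion_threshold
    return adjusted


generated = {"keywords": ["help"], "emotion_threshold": 8.0}
final = apply_adjustments(generated, added_keywords=["stop", "help"], emotion_threshold=7.5)
```

Returning a new dictionary keeps the original AI-generated policy available, which would allow the two versions to be compared or the adjustment to be undone.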
[0039] (Third variation: Continuous learning loop) The purpose of this modified version is to continuously improve the policy generation AI based on actual usage data, thereby realizing a truly personalized AI that better fits the unique circumstances of each household and the sensibilities of parents. To this end, a function to store and manage user feedback information as training data is added to the information management unit 314 of the server 300, and a function to periodically retrain the AI model is added to the policy generation unit 312. Figure 9 shows a flowchart of the learning process of this modified version. When an emergency notification is sent from the child monitoring device 100 (step S901), the user terminal 200 displays the notification content along with a feedback input screen (step S902). Parents input feedback indicating whether the notification reflected a truly dangerous situation ("Action Required") or was an overreaction ("No Problem") (step S903). This feedback information (e.g., "No Problem") is stored on server 300 as training data, along with the user setting information that led to it (e.g., "anxious"), the analysis result (e.g., the keyword "no"), and the context (e.g., location "home") (steps S904, S905). Server 300 retrains (fine-tunes) the policy generation AI model once a certain amount of such training data has accumulated, or when triggered on a weekly schedule (step S906). This allows the AI to learn, for example, that "parents tend not to perceive an emergency even if the word 'no' is uttered at home," and subsequently to act adaptively, such as by lowering the notification priority in similar situations.
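The stored feedback record and the retraining trigger of step S906 can be illustrated as follows. The record fields mirror the examples in the text, while the sample-count threshold of 100 is an assumed value; the text specifies only "a certain amount" and a weekly trigger.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FeedbackSample:
    """One training example built from a parent's feedback (fields are assumptions)."""
    feedback: str      # "Action Required" or "No Problem"
    user_setting: str  # e.g. "anxious"
    keyword: str       # e.g. "no"
    location: str      # e.g. "home"


MIN_NEW_SAMPLES = 100                 # assumed threshold for "a certain amount"
RETRAIN_INTERVAL = timedelta(days=7)  # weekly trigger


def should_retrain(samples, last_trained: datetime, now: datetime) -> bool:
    """Retrain when enough samples have accumulated or a week has passed."""
    return len(samples) >= MIN_NEW_SAMPLES or now - last_trained >= RETRAIN_INTERVAL


sample = FeedbackSample("No Problem", "anxious", "no", "home")
last = datetime(2025, 6, 1)
```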
[0040] (Fourth variation: Enhanced proof through hardware) The purpose of this modification is to enhance the reliability of the deletion certificate to the highest cryptographic level and to ensure its legal evidentiary value in the event of any unforeseen circumstances. To this end, a tamper-resistant hardware security module, such as a TPM (Trusted Platform Module), is added to the hardware configuration of the child monitoring device 100 (Figure 2A). This module securely stores a device-specific secret key that is written during manufacturing and is extremely difficult to extract from the outside. The processing flow in this configuration is as follows: First, the deletion certificate information generation unit 115 generates certificate information, including hash values and timestamps, as usual. Next, it passes the generated certificate information to the TPM in the device. The TPM uses its internally held secret key to generate a digital signature for the entire received certificate information. Finally, the deletion certificate information generation unit 115 adds this digital signature to the original certificate information to create the final deletion certificate information and sends it to the server 300. The receiving party (server 300 or a third party conducting an audit) can verify this signature using the corresponding public key, mathematically confirming that the certification information was "undoubtedly issued from that device at that time and has not been tampered with since." This strongly ensures the non-repudiation nature of deletion actions.
[0041] (Fifth variation: Application to data other than audio) The architecture described in this disclosure—"policy generation from abstract intentions," "processing and deletion at the edge," and "deletion proof"—is not limited to voice data but can be applied across other types of sensitive personal information that a device may acquire. For example, an imaging unit (camera) and an action sensor (accelerometer) can be added to device 100, and an image analysis model (object recognition, face detection) and an action recognition model (fall detection, activity level estimation) can be installed in the on-device AI analysis unit 113. When a user selects "prioritize children's privacy" on the privacy settings screen, the AI interprets this and generates the aforementioned policy for voice data, as well as simultaneously generating multiple policies depending on the data type. For image data, this might include policies such as "images showing faces other than registered family members will be blurred and deleted so that individuals cannot be identified," and for action logs, "daily walking and play logs will be aggregated every hour, and only the average value will be recorded, while the original detailed logs will be deleted." This allows users to automatically apply protection based on a consistent privacy philosophy to various types of personal information handled by the device by simply making one abstract setting, significantly improving convenience.
[0042] (Sixth variation: Hybrid analysis processing) This modification aims to suppress the device's computing resources and power consumption while enabling precise analysis by high-performance AI on the cloud when necessary. Therefore, the analysis process is divided into two stages: primary screening performed on the device and secondary analysis performed on the server. The on-device AI analysis unit 113 of the child monitoring device 100 continuously monitors audio data with a lightweight model and detects "cautionary" level events (e.g., keywords that are not urgent but require contextual understanding, such as "stranger"). If detected, the device does not immediately delete the audio data but sends a confirmation notification to the user terminal 200 asking, "We have detected a conversation of concern. Do you want to allow detailed analysis by expert AI?" Only if the user grants permission via the application, the relevant portion of the audio data is encrypted and sent to the server 300. The server 300 uses a large-scale natural language processing model to perform more advanced contextual understanding and intent interpretation, and reports the results (e.g., "possibility of persistent harassment from a third party") to the guardian. After this secondary analysis is complete, the audio data on the server is immediately deleted, and a deletion certificate is issued. This configuration allows for a higher level of security while reducing device battery consumption and giving users the right to self-determination regarding their privacy.
[0043] (Seventh variation: Context-aware dynamic policy switching) This variation aims to automatically apply the most appropriate privacy policy according to the child's situation (context) and thereby achieve granular privacy protection. To this end, multiple situation-specific policies, such as a "Home Policy," a "School Policy," and a "Park Policy," are stored in the device 100 in advance. The control unit 101 of the device 100 functions as a "context estimation unit" that integrates information from built-in sensors (the location information acquisition unit 105, Wi-Fi information from the communication unit 103, time information from the memory unit 102, an acceleration sensor not shown, etc.) to estimate the current context in real time. For example, if the context estimation unit determines from GPS information that the child is within the registered "school" geofence and from time information that the child is currently "in class," it instructs the policy application unit 112 to switch automatically to the "School / In Class Policy," which includes settings such as suppressing notification levels and lowering microphone sensitivity. This configuration eliminates the need for the user to change settings manually and seamlessly achieves privacy protection suited to the situation.
[0044] (Eighth variation: Cooperation among multiple devices and multiple users) This variation aims to address situations involving multiple users and guardians, such as siblings or family members, and to achieve consistent monitoring for the entire group. To this end, the server 300 has a "group management function" that manages multiple child monitoring devices 100 and multiple user terminals 200 as a single "family group." If device A detects an emergency, the notification is sent simultaneously to all registered guardian terminals (father, mother, etc.) in the group. When any guardian responds to the notification, that status (e.g., "Mother is responding") is shared with the other guardians' terminals to prevent duplicate responses. Furthermore, each device 100 is equipped with an "inter-device communication unit" that uses a short-range wireless communication function such as Bluetooth to detect the presence of other family devices. For example, if the older brother's device A and the younger brother's device B are near each other, the devices recognize each other's presence and reflect that context ("brothers are together") in the policy. Specifically, adaptive control becomes possible, such as temporarily lowering the sensitivity of keyword detection so that casual conversations between siblings do not trigger unnecessary notifications. This broadens the monitoring network, facilitates smoother communication among guardians, and enables more realistic and flexible operation.
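As a non-limiting illustration, context estimation from location and time can be sketched as follows. The geofence coordinates, radii, class hours, and the "Default Policy" fallback are hypothetical values introduced only for this example; Wi-Fi and acceleration inputs are omitted.

```python
import math
from datetime import time


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres (haversine formula)."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


# Hypothetical registered geofences: name -> (latitude, longitude, radius in metres).
GEOFENCES = {
    "school": (35.6600, 139.7000, 150.0),
    "home": (35.6700, 139.7100, 50.0),
}
CLASS_HOURS = (time(8, 30), time(15, 0))  # hypothetical "in class" window


def select_policy(lat: float, lon: float, now: time) -> str:
    """Return the name of the situation-specific policy to apply."""
    for name, (fence_lat, fence_lon, radius) in GEOFENCES.items():
        if distance_m(lat, lon, fence_lat, fence_lon) <= radius:
            if name == "school" and CLASS_HOURS[0] <= now <= CLASS_HOURS[1]:
                return "School / In Class Policy"
            return f"{name.capitalize()} Policy"
    return "Default Policy"  # fallback when no geofence matches (assumption)
```

The returned policy name would then be handed to the policy application unit 112, which is outside the scope of this sketch.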
[0045] (9th variation: Selective analysis based on speaker identification) The purpose of this modification is to identify participants in a conversation and to more precisely control what is protected by privacy. To this end, the on-device AI analysis unit 113 of the child monitoring device 100 is equipped with speaker identification or speaker diarization functions. Prior to use, the child who is using the device, or family members (parents, siblings, etc.), are registered as "trusted speakers" by the parent or guardian. When the on-device AI analysis unit 113 acquires audio, it first performs speaker identification to determine whether the speaker is a "registered family member" or an "unregistered third party." The privacy policy then includes rules such as "conversations between trusted speakers will not be analyzed or notified in principle and will be deleted immediately" and "only when the voice of an unregistered third party is detected will keyword and sentiment analysis be performed in detail." This prevents the AI from excessively intervening in private conversations between family members, reduces the user's psychological resistance, and allows monitoring to be concentrated on interactions with external individuals, further improving the balance between privacy protection and safety.
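As a non-limiting illustration, the two policy rules quoted above can be sketched as a single decision function. The speaker labels are assumed to come from a speaker identification model that is not shown, and the function name and result keys are hypothetical.

```python
def plan_processing(speaker_ids: list, trusted_speakers: set) -> dict:
    """Decide how to handle an audio segment from its identified speakers.

    - Only trusted speakers: skip analysis and delete immediately.
    - Any unregistered speaker: run detailed keyword and sentiment analysis.
    """
    unregistered = [s for s in speaker_ids if s not in trusted_speakers]
    if unregistered:
        return {"analyze": True, "delete_immediately": False, "unregistered": unregistered}
    return {"analyze": False, "delete_immediately": True, "unregistered": []}
```

For example, with `trusted_speakers = {"child", "mother"}`, a segment labeled `["child", "mother"]` is deleted without analysis, while a segment labeled `["child", "unknown_1"]` is passed on for detailed analysis.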
[0046] (Tenth variation: Blockchain utilization for deletion certificates) This modification aims to maximize the reliability, transparency, and auditability of deletion certificate information. To this end, server 300 participates as a node in a consortium-type or private blockchain network jointly managed by service providers, parental representatives, and third-party auditing bodies. When the child monitoring device 100 generates deletion certificate information and sends it to server 300, the information management unit 314 calculates a hash value from that certificate information, creates a transaction containing that hash value, and records (commits) it to the blockchain. Once a hash value is recorded on the blockchain, it becomes virtually impossible to tamper with it later. This ensures that even the service provider itself cannot illegally alter past deletion records. When an audit becomes necessary, auditors can objectively verify the integrity of the data by comparing the records on the blockchain with the original deletion certificate information held by the server. This mechanism is an extremely powerful technical means of fulfilling the "accountability" required by GDPR and other regulations, and significantly improves the trustworthiness of businesses.
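As a non-limiting illustration, the tamper-evidence property described above can be sketched with a single-process hash chain. This is a simplified stand-in for a consortium or private blockchain: there is no network, no consensus among nodes, and no transaction format, and the class and field names are assumptions for this example.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def certificate_hash(certificate: dict) -> str:
    """Hash of the deletion certificate information (deterministic JSON form)."""
    return sha256_hex(json.dumps(certificate, sort_keys=True).encode("utf-8"))


class Ledger:
    """Append-only chain in which each block commits to its predecessor."""

    GENESIS = "0" * 64

    def __init__(self):
        self.blocks = []

    def commit(self, certificate: dict) -> dict:
        """Record the certificate's hash value as a new block."""
        cert_hash = certificate_hash(certificate)
        prev_hash = self.blocks[-1]["block_hash"] if self.blocks else self.GENESIS
        block = {
            "cert_hash": cert_hash,
            "prev_hash": prev_hash,
            "block_hash": sha256_hex((prev_hash + cert_hash).encode("ascii")),
        }
        self.blocks.append(block)
        return block

    def verify_chain(self) -> bool:
        """Recompute every link; any altered record breaks the chain."""
        prev = self.GENESIS
        for block in self.blocks:
            expected = sha256_hex((prev + block["cert_hash"]).encode("ascii"))
            if block["prev_hash"] != prev or block["block_hash"] != expected:
                return False
            prev = block["block_hash"]
        return True

    def contains(self, certificate: dict) -> bool:
        """Audit check: is this original certificate recorded on the chain?"""
        target = certificate_hash(certificate)
        return any(block["cert_hash"] == target for block in self.blocks)
```

An auditor would call `contains` with the original deletion certificate information held by the server and `verify_chain` to confirm that no past record has been altered.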
[0047] (11th variation: Reinforcement of the interpretation of deletion certificates and their generation timing) As used herein, "deletion certification information" is not limited to information generated for each individual deletion act. For example, the technical concept of this disclosure also includes a method of aggregating information on multiple deletion acts performed within a predetermined period and generating and transmitting it as a single "deletion summary report." Furthermore, the phrase "delete and generate deletion certification information" is intended to encompass not only sequential processing immediately after a deletion act, but also methods in which information proving the deletion act is generated later, after a predetermined delay or batch processing period.
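As a non-limiting illustration, a "deletion summary report" aggregating the deletion acts of a predetermined period can be sketched as follows. The field names `deleted_at` (epoch seconds) and `policy_id`, and the layout of the report, are assumptions for this example.

```python
from collections import Counter


def build_summary_report(certificates: list, period_start: int, period_end: int) -> dict:
    """Aggregate deletion acts with period_start <= deleted_at < period_end."""
    in_period = [c for c in certificates if period_start <= c["deleted_at"] < period_end]
    return {
        "period_start": period_start,
        "period_end": period_end,
        "deletion_count": len(in_period),
        # Number of deletions performed under each privacy policy.
        "by_policy": dict(Counter(c["policy_id"] for c in in_period)),
    }
```

Such a report could be generated by batch processing after the period ends, which corresponds to the delayed generation timing described above.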
[0048] (Twelfth variation: Expanding the role of AI and incorporating fixed-policy options) The generation of privacy policies is not limited to a form in which AI calculates control parameters from scratch. For example, it is conceivable that the AI could present a recommended policy from among several "base policies" held by the server that best match the user's intentions, and the user could then select this policy. Furthermore, a form in which the user directly selects from several fixed policy sets supervised by experts, without the intervention of AI, can also be included within the scope of the technical concept as one way of solving the problem of this disclosure, which is that users without specialized knowledge can achieve optimal settings based on abstract intentions.
[0049] (13th variation: Effective deletion by ephemeral processing) In this specification, "deletion" of audio data is not limited to the act of erasing data recorded on a storage medium, but encompasses any technical means that manage the entire lifecycle of the data and make it unrecoverable upon completion of processing. For example, an ephemeral processing architecture in which audio data is processed in a secure area on volatile memory within the processor without writing it to non-volatile storage, and the data is physically destroyed upon completion of processing, is also one technical implementation of "deletion" in this disclosure.
[0050] (14th variation: Collaboration with childcare and education support services) In this modified version, the service value is expanded from a monitoring function to professional childcare support. The data stored on server 300 consists only of metadata and deletion certification information, excluding raw audio. By statistically processing this data, it is possible to understand the emotional tendencies of children (e.g., "the frequency of detecting keywords indicating negative emotions has increased by 20% compared to last month") in an anonymized form. Only with the user's explicit consent can this anonymized statistical trend data be shared with affiliated local public health nurses, childcare counselors, or educational institutions. This allows professionals to detect early signs of families that may need support without identifying individuals, and to notify the user terminal 200 at the appropriate time with push notifications of support information and guidance to consultation services, such as "Have you noticed anything concerning about your child recently?"
[0051] (15th variation: A new business model through collaboration with insurance services) This modified version leverages the high reliability and verifiability of this disclosure to create a new business model. The "deletion certificate information" in this disclosure (especially when enhanced with TPM in the fourth modified version or blockchain in the tenth modified version) serves as strong evidence that businesses are properly fulfilling their privacy protection obligations. This mechanism can be used in conjunction with cyber insurance services that address personal data breaches. For example, a service model could be created that offers discounts on insurance premiums to businesses that implement this system and allow regular audits of deletion certificate logs. This would make the adoption of privacy protection technology an economic incentive rather than merely a cost, and is expected to promote the widespread adoption of the technology.
[0052] (16th variation: Real-time visualization of privacy impact) This modified version further improves the UI / UX to allow users to configure privacy settings more intuitively and with greater confidence. On the user terminal 200's privacy settings screen (Figure 7), when a parent moves a slider or changes a setting, the system simulates in real time how much the change will affect privacy and displays it graphically. For example, when the slider is moved to "Prioritize Security Monitoring," specific predicted values such as "With this setting, an average of approximately 3 minutes of conversations per day will be analyzed, and of that, approximately 30 seconds may result in notifications" and performance indicators such as "Average time to data deletion: 1.5 seconds" are displayed. This allows users to intuitively understand the consequences of their choices and make more informed decisions.
[0053] (Display activity report) Figure 11 shows an example of an activity report screen 1100 displayed on the user terminal 200 based on information managed by the information management unit 314 of server 300. This screen displays an overview of detected keywords 1101 and a message 1102 indicating that audio data has been deleted for privacy protection. When a parent taps the "Detailed Log" button 1103, the application sends a GET request to the /api/v1/logs/deletion_proofs endpoint of server 300, including an authentication token in the header. The information management unit 314 of server 300 receives this and returns a list of deletion certificates 420 associated with the user in JSON format. The application parses this list and transitions to a screen displaying a list of timestamps and applicable policy IDs for each certificate. When the user taps an item in the list, details of the individual certificate are displayed, as shown in Figure 4B. This allows parents to understand an overview of their child's situation while confirming that their privacy is protected.
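As a non-limiting illustration, the request and the parsing of the response can be sketched as follows. The endpoint path is taken from the description above; the base URL, the use of a Bearer scheme in the Authorization header, and the JSON field names `timestamp` and `policy_id` are assumptions for this example. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_deletion_proofs_request(base_url: str, token: str) -> urllib.request.Request:
    """Build the GET request for the list of deletion certificates."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/logs/deletion_proofs",
        headers={
            "Authorization": f"Bearer {token}",  # header scheme is an assumption
            "Accept": "application/json",
        },
        method="GET",
    )


def summarize_certificates(response_body: str) -> list:
    """Extract (timestamp, policy ID) pairs for the list screen."""
    certificates = json.loads(response_body)
    return [(c["timestamp"], c["policy_id"]) for c in certificates]
```

In an application, the request object would be passed to `urllib.request.urlopen` and the decoded response body to `summarize_certificates`.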
[0054] [Appendix]
[0055] [General tasks] One of the purposes of this disclosure is to provide technology that enables a high level of user privacy protection in information processing while simultaneously achieving both the convenience and security of the service.
[0056] Issues corresponding to [Appendix 1] One of the purposes of this disclosure is to provide a basic mechanism that completes processing within the device that acquires sensitive audio data, thereby ensuring transparency in the processing. [Appendix 1] An information processing device comprising: an audio input unit for acquiring ambient sound; a processor for analyzing and deleting the audio data acquired by the audio input unit and generating deletion certification information; and a communication unit for transmitting the deletion certification information generated by the processor to an external destination. According to the above-described information processing device, audio data is analyzed and deleted within the device, and proof of this is transmitted, thus ensuring transparency of the process while protecting privacy. More specifically, this makes it possible to achieve both the convenience and the security of the service while strongly protecting user privacy. For example, by deleting sensitive biometric information on the device immediately as a rule and proving that the deletion was executed, compliance risks related to privacy laws can be reduced. In addition, since the AI generates suitable privacy settings from the user's abstract intentions without requiring specialized knowledge, the input burden on the user is reduced and usability is improved. The disclosed architecture also brings significant advantages to service providers. Because there is no need to send large amounts of data such as raw audio data to the server and store it there, server storage costs and the computing costs of data processing can be significantly reduced compared to conventional cloud-centric systems. Also, since communication from the device to the server consists only of lightweight metadata and deletion certification information, network bandwidth is used efficiently, enabling stable service operation even when many devices are connected simultaneously. In addition, because audio analysis is completed on the device, there is no delay (latency) due to round-trip communication with the cloud, which improves responsiveness (real-time performance) and enables faster detection of emergencies.
[0057] Issues corresponding to [Appendix 2] One of the purposes of this disclosure is to analyze audio data systematically based on predetermined rules. [Appendix 2] The information processing device according to Appendix 1, wherein the processor analyzes the audio data based on a predetermined privacy policy. This enables data processing based on consistent rules rather than ad hoc handling.
[0058] Issues corresponding to [Appendix 3] One of the purposes of this disclosure is to automate the generation of complex privacy policies and reduce the burden on users. [Appendix 3] The information processing device according to Appendix 2, wherein the privacy policy is automatically generated using AI. This makes it easy to obtain appropriate policies for different situations, even without specialized knowledge.
[0059] Issues corresponding to [Appendix 4] One of the purposes of this disclosure is to generate a specific privacy policy from the user's non-expert, intuitive input. [Appendix 4] The information processing device according to Appendix 3, wherein the privacy policy is generated based on user setting information that includes abstract information about the user of the information processing device and information regarding privacy preferences. This enables personalized privacy settings that reflect the user's true intentions.
[0060] Issues corresponding to [Appendix 5] One of the purposes of this disclosure is to enhance the evidentiary value of data deletion and ensure traceability. [Appendix 5] The information processing device according to any one of Appendices 2 to 4, wherein the deletion certification information includes the identifier of the privacy policy used in the analysis. This makes it possible to objectively track and prove when and under what rules data was deleted.
[0061] Issues corresponding to [Appendix 6] One of the purposes of this disclosure is to decide more specifically whether or not to delete audio data, depending on its content. [Appendix 6] The information processing device according to any one of Appendices 2 to 5, wherein the processor determines whether or not to delete the audio data based on the keywords or emotional intensity contained in the audio data. This makes the policy-based deletion decision logic concrete and enables more flexible control.
[0062] Issues corresponding to [Appendix 7] One of the purposes of this disclosure is to clarify the processing flow that ensures only low-privacy-risk information is retained and high-risk raw data is reliably deleted. [Appendix 7] The information processing device according to any one of Appendices 1 to 6, wherein the processor, when it determines that the audio data is a normal conversation, extracts metadata from the audio data and then performs the deletion. This makes it possible to retain useful information while minimizing the risk of privacy violations from raw audio data.
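As a non-limiting illustration, the deletion decision of Appendix 6 and the extract-then-delete order of Appendix 7 can be sketched as follows. The policy keys, the metadata fields, and the function name are assumptions for this example. Overwriting a Python buffer does not guarantee that no other copy of the data remains in memory, so the zeroing step only indicates where a device-level erase would take place.

```python
import hashlib


def process_audio(audio: bytearray, keywords_found: list,
                  emotion_intensity: float, policy: dict) -> dict:
    """Decide from keywords or emotional intensity whether to delete the audio.

    For a normal conversation, metadata is extracted first and the raw
    audio is deleted afterwards; otherwise the audio is left untouched.
    """
    has_flagged_keyword = bool(set(keywords_found) & set(policy["flagged_keywords"]))
    too_intense = emotion_intensity >= policy["emotion_threshold"]
    if has_flagged_keyword or too_intense:
        return {"deleted": False, "metadata": None}

    # Step 1: extract low-risk metadata before the raw audio is lost.
    metadata = {
        "keywords": sorted(keywords_found),
        "size_bytes": len(audio),
        "sha256": hashlib.sha256(audio).hexdigest(),
    }
    # Step 2: overwrite the buffer with zeros, then release it.
    audio[:] = bytes(len(audio))
    audio.clear()
    return {"deleted": True, "metadata": metadata}
```

The hash recorded in the metadata is computed before deletion so that it can later be included in the deletion certification information.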
[0063] Issues corresponding to [Appendix 8] One of the purposes of this disclosure is to reduce the risk of deleted data being recovered and to further enhance privacy protection. [Appendix 8] The information processing device according to any one of Appendices 1 to 7, wherein the processor performs the deletion of the audio data in a manner that makes it difficult to recover the data. This provides users with a high level of reassurance and trust.
[0064] Issues corresponding to [Appendix 9] One of the purposes of this disclosure is to provide a balanced system that prioritizes privacy protection while also being able to respond appropriately to situations that threaten user safety. [Appendix 9] The information processing device according to any one of Appendices 1 to 8, wherein the processor, when it determines that the audio data is urgent, does not perform the deletion and sends a notification to an external device. This allows for exceptions that ensure the safety of users who truly need protection, rather than focusing solely on privacy protection.
[Explanation of symbols]
[0065]
1 ... Information processing system
100 ... Child monitoring device (information processing device)
101, 301 ... Control unit (processor)
102, 302 ... Storage unit
103, 303 ... Communication unit
104 ... Voice input unit
105 ... Location information acquisition unit
106 ... TPM (Trusted Platform Module)
111 ... Speech acquisition unit
112 ... Policy application unit
113 ... On-device AI analysis unit
114 ... Data deletion unit
115 ... Deletion certificate information generation unit
116 ... Notification control unit
200 ... User terminal
300 ... Server
400 ... Network
410 ... Privacy policy
420 ... Deletion certificate information
700 ... Privacy settings screen
701 ... Slider
1000 ... Policy confirmation / adjustment screen
1001 ... Policy summary
1002 ... Input field for adding keywords
1003 ... "Adjust policy" button
1004 ... "Apply as is" button
1100 ... Activity report screen
1101 ... Keyword overview
1102 ... Data deletion message
1103 ... Detailed log confirmation button
1301 ... Audio buffering unit
1302 ... Feature extraction unit
1303 ... Keyword inference unit
1304 ... Emotion / context inference unit
1305 ... Overall judgment unit
Claims
1. An information processing apparatus comprising: a voice input unit that acquires ambient sounds; a processor that analyzes and deletes the audio data acquired by the voice input unit and generates deletion certificate information; and a communication unit that transmits the deletion certificate information generated by the processor to an external party.
2. The information processing apparatus according to claim 1, wherein the processor analyzes the audio data based on a predetermined privacy policy.
3. The information processing apparatus according to claim 2, wherein the privacy policy is automatically generated using AI.
4. The information processing apparatus according to claim 3, wherein the privacy policy is generated based on user setting information that includes abstract information about the user of the information processing apparatus and information regarding the user's privacy preferences.
5. The information processing apparatus according to claim 2, wherein the deletion certificate information includes the identifier of the privacy policy used in the analysis.
6. The information processing apparatus according to claim 2, wherein the processor determines whether or not to delete the audio data based on the keywords or emotional intensity contained in the audio data.
7. The information processing apparatus according to claim 1, wherein the processor, when it determines that the audio data is a normal conversation, extracts metadata from the audio data and then performs the deletion.
8. The information processing apparatus according to claim 1, wherein the processor deletes the audio data in a manner that makes it difficult to recover the data.
9. The information processing apparatus according to claim 1, wherein the processor, when it determines that the audio data is urgent, does not perform the deletion and sends a notification to an external device.
10. The information processing apparatus according to claim 1, wherein the voice input unit is a biometric information acquisition unit that acquires biometric information including at least one of the user's voice, heart rate, or body movement.
11. The information processing apparatus according to claim 1, wherein the deletion by the processor is a data sanitization process that includes a process for anonymizing, masking, or summarizing the audio data.
12. The information processing device according to claim 1, wherein the deletion certification information is configured as an audit trail including the entity that performed the deletion, the time, and a unique identifier for the target data.
13. The information processing apparatus according to claim 1, further comprising a tamper-resistant security module, wherein the processor uses the private key held by the security module to digitally sign the deletion certificate information.
14. An information processing method in which a processor: acquires audio data from an audio input unit; analyzes and deletes the acquired audio data and generates deletion certificate information; and transmits the generated deletion certificate information to an external party.
15. A program that causes a processor to execute a process comprising: acquiring audio data from an audio input unit; analyzing and deleting the acquired audio data and generating deletion certificate information; and transmitting the generated deletion certificate information to an external party.
16. An information processing system comprising an information processing device and a server, wherein the information processing device comprises: an audio input unit that acquires ambient sound; a processor that analyzes and deletes the audio data acquired by the audio input unit and generates deletion certification information; and a communication unit that transmits the deletion certification information generated by the processor to an external destination, and wherein the server receives the user setting information, generates the privacy policy using AI, and transmits it to the information processing device.
17. The information processing system according to claim 16, characterized in that the server transmits a summary of the contents of the privacy policy generated by the AI to a user terminal operated by the user, and receives an approval instruction from the user, before transmitting the privacy policy generated by the AI to the information processing device.
18. The information processing system according to claim 16, characterized in that the server collects feedback information from users regarding notifications from the information processing device and uses the feedback information to retrain the AI.
19. The information processing system according to claim 16, characterized in that the server generates a transaction based on the deletion certificate information received from the information processing device and records it on the blockchain network.