system

The system addresses real-time anomaly detection and identification of suspicious persons using facial recognition and sensor technologies, enhancing community safety by integrating with external organizations for rapid response.

JP2026105345A · Pending · Publication Date: 2026-06-26 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-16
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Conventional security systems are inadequate in identifying suspicious persons and detecting anomalies in real-time, leading to insufficient safety measures for elderly and families, especially in preventing intrusions.

Method used

A system utilizing facial recognition algorithms, vibration and opening/closing sensors, and an information processing device to detect and respond to anomalies, integrating with external organizations for enhanced community safety.

Benefits of technology

Ensures user safety by promptly identifying suspicious individuals and detecting abnormalities, reducing intrusion risk through immediate alerts and coordinated responses.

✦ Generated by Eureka AI based on patent content.


Abstract

[Problem] To provide a system. [Solution] A system including: a processing device that uses a facial recognition algorithm to detect human features from video data and compares them with a subject list; a device that monitors abnormalities in entrances and windows using a vibration detection unit and an opening / closing detection unit; a mechanism that analyzes a person's behavior and facial expressions to quantify the degree of danger and issues a warning based on that evaluation; a means by which the information processing unit shares information about abnormal individuals with multiple external organizations; and a means of transmitting a warning to a mobile communication device when an anomaly is detected.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, the method including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] In recent years, home intrusions and visits by suspicious persons have increased, making it difficult for the elderly and families in particular to live their daily lives safely. There is therefore a demand for a system that can detect intruders and suspicious persons in advance and respond appropriately. However, conventional security systems are insufficient at identifying suspicious persons and immediately detecting abnormalities, and have not improved the safety of the area as a whole. New technical means are required to solve these problems.

Means for Solving the Problems

[0005] This invention provides a system that uses a facial recognition algorithm to detect and match specific individuals from video data. Furthermore, vibration and opening / closing sensors installed on windows and doors enable immediate detection of anomalies. By analyzing behavior and facial expressions, the system assesses the level of risk posed by visitors and promptly issues alarms as needed. In addition, it has an information processing device that has the function of sharing information with external organizations, supporting the improvement of overall community safety. Through this series of measures, user safety can be ensured and the risk of intrusion can be significantly reduced.

[0006] A "face recognition algorithm" is a technology that identifies a person's face from video data, analyzes its features, and compares and matches them against a specific database.

[0007] A "vibration sensor" is a device that detects vibrations in an object and senses any abnormalities.

[0008] An "open / close sensor" is a device that detects the open / closed state of doors and windows and alerts the user if there is any abnormal opening or closing.

[0009] "Suspicious person information" refers to data about individuals or behaviors that could pose a threat and that therefore call for vigilance or a response.

[0010] An "information processing device" is a part of a system that has functions for collecting, analyzing, managing, and sharing information with external parties.

[0011] "Behavioral analysis" is a technique that collects data on a person's movements and behavior, and uses that data to infer situations and intentions.

[0012] "Facial expression analysis" is a technology that recognizes a person's facial expressions and infers their emotions and intentions.

[0013] "Risk assessment" is the process of quantifying potential threats and risks based on behavior and facial expressions, and determining their importance.

Brief Description of the Drawings

[0014]
[Figure 1] It is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] It is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] It is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] It is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] It is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] It is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] It is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 10] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.

MODE FOR CARRYING OUT THE INVENTION

[0015] Hereinafter, an example of an embodiment of the system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0016] First, the terms used in the following description will be explained.

[0017] In the following embodiments, the numbered processor (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.

[0018] In the following embodiments, the numbered RAM (Random Access Memory) is a memory in which information is temporarily stored and is used as a work memory by the processor.

[0019] In the following embodiments, the numbered storage is one or more non-volatile storage devices that store various programs and various parameters, etc. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), or magnetic tapes, etc.

[0020] In the following embodiments, the numbered communication interface (I / F) is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), or Bluetooth (registered trademark).

[0021] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."

[0022] [First Embodiment]

[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0031] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0035] This invention is constructed as a security system for monitoring living spaces. The system consists of multiple terminals and a server, each terminal equipped with a surveillance camera and sensors. The operation of the system is described below in natural language.

[0036] First, the terminal acquires real-time video footage from its surveillance camera and sends the video data to the server. The server processes the received video data, uses a facial recognition algorithm to identify people's faces, and compares them with a pre-registered database. If a person is identified as a suspicious individual, the server immediately sends an alert to the user and, if necessary, contacts the security company.

[0037] Next, vibration sensors and open / close sensors connected to the terminal constantly monitor for abnormalities in windows and doors. If a sensor detects an abnormality, the terminal reports that information to the server. Based on that information, the server issues an alarm to the user and the security company.

[0038] Furthermore, by analyzing the visitor's behavior and facial expressions in the video, the server scores the visitor's level of danger. Based on this score, it automatically determines the necessary actions and sends appropriate notifications to the user and the police.

[0039] For example, if a person exhibiting suspicious behavior is near the entrance, the terminal can detect this behavior, and if the server assesses the risk level as high, it can immediately coordinate with the security company to take countermeasures.

[0040] Finally, the server uses the collected suspicious person information and anomaly detection logs to create regional alert information via a generative AI, and shares this information with local governments and relevant crime prevention organizations. This enhances overall regional security and promotes a sense of security among residents. With this configuration, the system of the present invention can provide advanced crime prevention functions.

[0041] The following describes the processing flow.

[0042] Step 1:

[0043] The terminal acquires video data from the surveillance camera in real time. The acquired data is sent to the server based on the communication protocol.

[0044] Step 2:

[0045] The server analyzes the received video data and applies a facial recognition algorithm. It then compares the data with a database to check if the identified person is on the registered list of suspicious individuals.
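
The comparison in Step 2 can be sketched as follows. This is a minimal illustration, not the method of the disclosure: it assumes each face has already been reduced to a fixed-length feature vector by some upstream model, and it uses cosine similarity with a threshold of 0.8. The similarity measure, the threshold, and the names `cosine_similarity` and `match_against_list` are all hypothetical choices that do not appear in the source.

```python
import math
from typing import Mapping, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def match_against_list(
    face: Sequence[float],
    registered: Mapping[str, Sequence[float]],
    threshold: float = 0.8,  # assumed value, not taken from the source
) -> Optional[str]:
    """Return the ID of the most similar registered entry, or None if none reaches the threshold."""
    best_id: Optional[str] = None
    best_score = threshold
    for person_id, reference in registered.items():
        score = cosine_similarity(face, reference)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id
```

A return value of `None` means the face is not on the registered list; any other value is the ID of the closest registered entry.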

[0046] Step 3:

[0047] If the server detects a suspicious person, it will immediately notify the user and, if necessary, send an alert to the security company. This notification will include information such as the suspicious person's facial image, the date and time of detection, and the location where the incident occurred.

[0048] Step 4:

[0049] The terminal monitors data from vibration sensors and open / close sensors installed on windows and doors, and if there is any abnormal vibration or opening / closing, it determines it to be an anomaly.
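
The anomaly decision in Step 4 might look like the sketch below. The source does not say how "abnormal" is defined, so the sketch assumes a fixed vibration limit and an "armed" monitoring state. `SensorReading`, `is_anomaly`, `armed`, and `vibration_limit` are hypothetical names, and the units are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    sensor_id: str
    vibration: float  # vibration amplitude in arbitrary units (assumed)
    is_open: bool     # state reported by the open / close sensor


def is_anomaly(reading: SensorReading, armed: bool, vibration_limit: float = 1.0) -> bool:
    """Flag a reading as anomalous only while monitoring is armed.

    An opening, or a vibration above the assumed limit, counts as an anomaly.
    """
    if not armed:
        return False
    return reading.is_open or reading.vibration > vibration_limit
```

For example, `is_anomaly(SensorReading("window-1", 2.5, False), armed=True)` evaluates to `True`, because the vibration exceeds the assumed limit.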

[0050] Step 5:

[0051] A terminal that detects an anomaly reports the information to the server. Based on the details of the anomaly, the server issues an alert to the user. Simultaneously, the security company is also automatically notified.

[0052] Step 6:

[0053] The terminal analyzes the visitor's behavior and facial expressions from video data, and the server scores the level of risk based on the results. The scoring is performed in stages based on pre-set criteria.
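
One way to read "scoring in stages based on pre-set criteria" is a weighted score mapped onto a few discrete stages. The sketch below assumes the behavior and expression analyses each yield a value between 0 and 1. The weight of 0.6, the stage boundaries of 0.4 and 0.7, and the function names are illustrative assumptions, not values from the source.

```python
def risk_score(behavior: float, expression: float, w_behavior: float = 0.6) -> float:
    """Combine two analysis results in [0, 1] into one weighted score in [0, 1]."""
    for value in (behavior, expression, w_behavior):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs and weight must lie in [0, 1]")
    return w_behavior * behavior + (1.0 - w_behavior) * expression


def risk_stage(score: float) -> str:
    """Map a score onto assumed stages: low, medium, high."""
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"
```

Under these assumptions, a visitor with a behavior result of 0.9 and an expression result of 0.8 falls into the "high" stage, which is the condition Step 7 acts on.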

[0054] Step 7:

[0055] If the server assigns a high risk score, it sends a warning notification to the police and nearby residents. This allows for a swift response.

[0056] Step 8:

[0057] The server uses AI to analyze suspicious person information and anomaly detection logs, generating local alert information. This information is shared with local governments and crime prevention organizations to improve safety throughout the community.

[0058] (Example 1)

[0059] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0060] In modern society, enhancing the safety of living spaces is a crucial issue. Conventional security systems have problems such as delayed response times and insufficient information sharing. In particular, early detection of anomalies and sharing of warning information throughout the community are often inadequate, which has been a major obstacle to improving security.

[0061] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0062] In this invention, the server includes means for extracting an individual's face from video information using video recognition technology and matching it with a registration list, means for detecting abnormalities in openings using a physical change detection device, and means for analyzing human movements and facial expressions to quantify risk and generate notifications based on the numerical results. This enables real-time anomaly detection, rapid information sharing, and effective crime prevention measures across the entire region.

[0063] "Video recognition technology" is a technology that analyzes video data acquired using cameras and sensors to identify and extract specific people or objects from that data.

[0064] "Extracting individual faces" refers to identifying the human face portion from video data and extracting that information in a usable format.

[0065] "Matching against the registration list" means comparing the extracted information with information in a pre-registered database to confirm that they match.

[0066] A "physical change detection device" is a device used to sense physical changes in the environment, and its purpose is to detect vibrations and changes in opening and closing.

[0067] "Detecting abnormalities in openings" means detecting unusual behavior or conditions in openings such as windows and doors, and determining that some kind of abnormality has occurred.

[0068] "Analyzing human actions and facial expressions" refers to the technology of analyzing data from an individual's actions and facial expressions in order to evaluate their intentions and emotions.

[0069] "Quantifying risk" means assigning a numerical score to the degree of danger in a given situation based on information obtained from analyzing movements and facial expressions.

[0070] "Generating notifications" refers to the process of creating necessary alerts and reports based on analysis results and informing relevant parties.

[0071] An "information processing device" is a device for receiving, processing, and transmitting data, and it integrates and manages all the data within a system.

[0072] "Sharing information on suspicious individuals" means sharing information about detected suspicious persons or unusual behavior with multiple relevant organizations in cooperation with them.

[0073] "Real-time data transmission" means instantly sending the latest information that needs to be considered to other devices and systems, enabling a rapid response.

[0074] This invention is a monitoring system for improving the safety of living spaces, and mainly consists of a terminal, a server, and a user. The terminal is equipped with a surveillance camera and various sensors, and is responsible for monitoring video information and physical changes in real time. Specifically, the hardware combines a high-resolution camera and physical change sensors, which enables detailed observation over a wide area.

[0075] Terminal operation

[0076] The terminal acquires video data using surveillance cameras and transmits it to the server in real time. Furthermore, it detects physical anomalies using vibration sensors and door / window sensors. Encryption technology is used for communication, and low-power communication methods such as LoRaWAN are employed, allowing for efficient data transmission.

[0077] Server Processing

[0078] The server receives data from the terminal and uses video recognition technology and facial recognition algorithms to identify individuals and compare them with a registration list. Specifically, it performs analysis using a neural network with TensorFlow (registered trademark). In addition, it utilizes facial recognition technology based on OpenCV to analyze human movements and facial expressions. The analysis results are reflected in a risk score, and if an anomaly is detected, the user is immediately notified.

[0079] The server generates regional alert information by inputting the collected data into an AI model, and shares this information with external crime prevention organizations and local governments. This sharing enhances the overall safety of the region.

[0080] User response

[0081] Users receive alerts and notifications from the server via their smartphones or other connected devices. This enables quick responses and easy notification to security companies as needed. For example, if suspicious activity is detected, the system can quickly coordinate with security companies to take countermeasures.

[0082] Example of a prompt

[0083] "What is the optimal algorithm for identifying suspicious individuals in real-time monitoring of living spaces?"

[0084] "Please tell me how to improve the efficiency of alert transmission based on sensor anomaly detection in security systems."

[0085] "How can we improve the risk scoring algorithm based on visitor behavior analysis?"

[0086] This system is expected to provide residents with a safer and more secure living environment.

[0087] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0088] Step 1:

[0089] The terminal uses a surveillance camera to acquire video data in real time. This video data is transmitted to the server via a secure communication protocol. The input is video from the surveillance camera, and the output is an encrypted video stream transferred to the server. Specifically, a high-resolution camera continuously captures video, encodes the data, and transmits it over the network.

[0090] Step 2:

[0091] The server analyzes the received video data and extracts individual faces using video recognition technology. The input here is video data transmitted from the terminal, and the output is the extracted face region data. This process is executed by a face detection algorithm using a neural network based on TensorFlow. Specifically, it detects and extracts face regions on a frame-by-frame basis.

[0092] Step 3:

[0093] The server uses a face recognition algorithm to compare extracted face data with a registered list. The input is the detected face data, and the output is the matching result. Here, a comparison operation is performed to determine whether the face matches a face registered in the existing database. Specifically, the recognition result is generated by comparing face features.

[0094] Step 4:

[0095] The physical change detection device connected to the terminal constantly monitors for abnormalities in openings such as windows and doors. The input is real-time data from physical condition sensors, and the output is an anomaly detection trigger signal. When an anomaly is detected, the information is immediately reported to the server. Specifically, it performs continuous monitoring of vibrations and opening / closing operations and generates alerts in the event of an anomaly.

[0096] Step 5:

[0097] The server analyzes human movements and facial expressions to quantify the risk of a situation. The input is movement and facial expression information extracted from video data, and the output is a risk score. It utilizes OpenCV for facial recognition and generates an alert immediately if an anomaly is detected. Specifically, it analyzes facial expression data and performs evaluations based on established criteria.

[0098] Step 6:

[0099] The server generates necessary notifications based on the analysis results and sends them to the user's smart device. Inputs are risk scores and anomaly detection information, while output is an alert message to the user. Specifically, it generates alerts when pre-set thresholds are exceeded and sends notifications via SMS or a dedicated app.
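
The threshold check in Step 6 can be sketched as a function that takes the risk score and the anomaly flag and returns either an alert message or nothing. The threshold value, the message wording, and the names `Alert` and `build_alert` are hypothetical. Delivery via SMS or a dedicated app is left out because the source does not describe it in detail.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    recipient: str
    message: str


def build_alert(
    user_id: str,
    risk_score: float,
    sensor_anomaly: bool,
    threshold: float = 0.7,  # assumed pre-set threshold
) -> Optional[Alert]:
    """Return an Alert when the score exceeds the threshold or a sensor anomaly is present."""
    reasons = []
    if risk_score > threshold:
        reasons.append(f"risk score {risk_score:.2f} exceeds {threshold:.2f}")
    if sensor_anomaly:
        reasons.append("sensor anomaly detected")
    if not reasons:
        return None  # nothing to report
    return Alert(recipient=user_id, message="ALERT: " + "; ".join(reasons))
```

Returning `None` when nothing crosses the threshold keeps the decision about whether to alert separate from the transport used to deliver the alert.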

[0100] Step 7:

[0101] The server generates regional alert information and shares it with external organizations. The input is all collected anomaly information and analysis data, which is fed to a generative AI model, and the output is regional alert information. This information is transmitted in JSON format and shared with crime prevention agencies and local governments. Specifically, the server performs information analysis and documentation using a generative AI model, and shares information via secure data communication.
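
Step 7 says only that the regional alert information is transmitted in JSON format and does not define a schema. The sketch below shows one possible payload. The field names `region`, `issued_at`, `incident_count`, and `summary`, and the function name `regional_alert_json`, are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def regional_alert_json(
    region: str,
    summary: str,
    incident_count: int,
    issued_at: Optional[datetime] = None,
) -> str:
    """Serialize regional alert information to a JSON string (hypothetical schema)."""
    if issued_at is None:
        issued_at = datetime.now(timezone.utc)
    payload = {
        "region": region,
        "issued_at": issued_at.isoformat(),
        "incident_count": incident_count,
        "summary": summary,
    }
    # ensure_ascii=False keeps non-ASCII text (for example Japanese place names) readable.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)
```

The `summary` field would carry the text produced by the generative AI model, and the remaining fields let a receiving organization filter alerts without parsing free text.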

[0102] (Application Example 1)

[0103] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0104] In modern society, ensuring the safety of living spaces is crucial. However, conventional security systems have faced challenges in real-time anomaly detection and accurate identification of suspicious individuals, making it difficult to implement effective countermeasures. Furthermore, rapid information sharing and appropriate warning dissemination in the event of an anomaly are often insufficient, highlighting the need for the establishment of new technologies to address these issues.

[0105] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0106] In this invention, the server includes a processing unit that uses a facial recognition algorithm to detect the characteristics of a person from video data and compares them with a list of subjects; a device that uses a vibration detection unit and an opening / closing detection unit to monitor abnormalities in entrances and windows; and a mechanism that analyzes a person's behavior and facial expressions to quantify the degree of danger and issue a warning based on that evaluation. This makes it possible to quickly and accurately detect abnormal situations in living spaces and take appropriate action.

[0107] A "face recognition algorithm" is a computational method that detects faces of people in video data, analyzes their features, and compares them with a pre-registered catalog.

[0108] A "vibration detection unit" is a device that senses the vibration of an object and uses that data to detect anomalies.

[0109] A "door / window opening / closing detection unit" is a device that senses the open / closed state of a door or window and checks whether there is any abnormality.

[0110] An "information processing unit" is a central device that has the function of centrally managing, analyzing, and sharing multiple data sets.

[0111] "Means for transmitting warnings to mobile communication devices when an anomaly is detected" refers to a processing mechanism for transmitting relevant information to mobile communication terminals such as mobile phones when some kind of anomaly is detected.

[0112] A "generative model" is a machine learning algorithm that creates new information and insights based on given data.

[0113] To implement this invention, the following system needs to be constructed. The server plays a central role, processing video data acquired in real time from surveillance cameras using a facial recognition algorithm and analyzing the characteristics of individuals. The results of this analysis are compared with a pre-registered catalog to determine whether or not there are any suspicious individuals. The software used includes OpenCV and TensorFlow, a library specifically for machine learning. The server also receives data from vibration detection and opening / closing detection units and processes it to detect abnormalities in entrances and windows. The data from these sensors is transmitted to the server via a microcontroller such as Arduino or Raspberry Pi.

[0114] When an anomaly is detected, the server immediately transmits an alert to a smartphone, a mobile communication device. The notification is sent via a dedicated application on the mobile device. This application allows users to check anomaly information in real time and take necessary countermeasures quickly. For example, notifications are sent using Google Cloud Messaging. In addition, a generative AI model is used to generate regional alert information and share it with external organizations. As a result, overall regional security is enhanced.

[0115] As a concrete example, consider a scenario where a suspicious person is loitering near a user's home while the user is traveling. This system analyzes video data to detect the suspicious person and immediately sends an alert to the user's smartphone, enabling a quick response. An example of a prompt to the generative AI model is: "I want to develop an application that detects suspicious people from surveillance camera footage and warns the user based on the results of scoring their behavior. In particular, please tell me how to optimize the algorithm that analyzes the visitor's facial expressions and movements."

[0116] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0117] Step 1:

[0118] The server acquires video data in real time from the surveillance camera. It takes the raw video stream from the camera as input and stores the frames in buffer memory for processing. This stored data is then used to prepare for subsequent face recognition processing.
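
The frame buffering in Step 1 can be sketched with a fixed-size ring buffer, so that memory use stays bounded however long the stream runs. The class name `FrameBuffer`, the default capacity of 30 frames, and the policy of discarding the oldest frame are assumptions; the source says only that frames are stored in buffer memory.

```python
from collections import deque
from typing import Any, List


class FrameBuffer:
    """Bounded buffer that keeps only the most recent frames."""

    def __init__(self, capacity: int = 30) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen drops the oldest item automatically when full.
        self._frames: deque = deque(maxlen=capacity)

    def push(self, frame: Any) -> None:
        self._frames.append(frame)

    def latest(self, n: int) -> List[Any]:
        """Return up to n of the most recent frames, oldest first."""
        if n <= 0:
            return []
        return list(self._frames)[-n:]

    def __len__(self) -> int:
        return len(self._frames)
```

Face recognition in the next step would then read from `latest(n)` rather than from the raw stream, which decouples capture rate from processing rate.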

[0119] Step 2:

[0120] The server executes a face recognition algorithm on the acquired video data. Specifically, it uses the OpenCV library to detect faces of people in the video. The input is video frames, and the output is a list of the location information of the detected faces. This output is then used to analyze the features of the people.

[0121] Step 3:

[0122] The server compares the analyzed facial features with the pre-registered catalog information. The input is the detected facial features, and the output is either the matching catalog entry or a mismatch flag. If there is no match, the system marks the person as suspicious and records the information.
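The comparison in Step 3 could be sketched as follows, assuming that facial features are represented as numeric vectors and compared by cosine similarity. The function names, the dictionary layout of the registered entries, and the threshold of 0.9 are all illustrative assumptions; the source does not state which similarity measure or threshold is used.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_against_catalog(features, registered, threshold=0.9):
    """Return (entry_id, False) for the best match, or (None, True) as a mismatch flag.

    `registered` maps an entry id to its stored feature vector. The threshold
    is an arbitrary illustrative value, not one given in the source.
    """
    best_id, best_score = None, -1.0
    for entry_id, stored in registered.items():
        score = cosine_similarity(features, stored)
        if score > best_score:
            best_id, best_score = entry_id, score
    if best_score >= threshold:
        return best_id, False
    return None, True
```

The second element of the returned pair plays the role of the mismatch flag: when it is true, the caller would mark the person as suspicious and record the information.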

[0123] Step 4:

[0124] The vibration detection unit and opening / closing detection unit connected to the terminal transmit data to the server in real time. The input is the detection signal from the sensor, and the output is an anomaly detection flag. Based on this data, the server determines whether or not there is an anomaly and prepares to send a notification in the next step.
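A minimal sketch of how the anomaly detection flag in Step 4 might be derived from sensor signals is shown below. The field names, the `armed` parameter, and the vibration limit are hypothetical; the source only states that detection signals are the input and an anomaly detection flag is the output.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    """One sample from a door or window sensor (field names are assumptions)."""
    location: str
    vibration: float  # vibration amplitude in arbitrary sensor units
    is_open: bool     # state reported by the opening / closing detection unit


def anomaly_flag(reading: SensorReading, armed: bool, vibration_limit: float = 0.8) -> bool:
    """Return True when the reading should be treated as an anomaly.

    An anomaly is flagged if vibration exceeds the limit, or if the opening
    is open while the system is armed. Both rules are illustrative only.
    """
    if reading.vibration > vibration_limit:
        return True
    return armed and reading.is_open
```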

[0125] Step 5:

[0126] When an anomaly is detected, the server sends a notification to a mobile communication device. Specifically, it uses Google Cloud Messaging to send an alert to the user's smartphone. The input is an anomaly detection flag and the characteristics of the suspicious person, and the output is an alert message. The user can receive this message and check the situation.
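The alert message in Step 5 might be assembled as in the sketch below, which turns the anomaly detection flag and the person's characteristics into a JSON string. The payload keys and the function name are assumptions for illustration, and actual delivery through a push messaging service is deliberately left out.

```python
import json
from datetime import datetime, timezone


def build_alert(anomaly: bool, location: str, person_features: dict, detected_at: datetime):
    """Return a JSON alert string, or None when no anomaly was flagged."""
    if not anomaly:
        return None
    payload = {
        "type": "security_alert",   # hypothetical message type
        "location": location,
        "detected_at": detected_at.isoformat(),
        "person": person_features,  # e.g. catalog match result or descriptive attributes
    }
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)


example = build_alert(
    True, "entrance", {"registered": False},
    datetime(2024, 12, 16, 9, 30, tzinfo=timezone.utc),
)
```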

[0127] Step 6:

[0128] The server uses a generative AI model to create local alert information. This utilizes collected anomaly detection data and matching results as input. The output is alert information to be shared with external organizations. This information will raise crime prevention awareness throughout the community and strengthen response capabilities.
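One way the input to the generative AI model in Step 6 could be prepared is sketched below: collected events are formatted into a single text prompt. The event schema (`time`, `place`, `kind`) and the prompt wording are assumptions, and no particular model or API is called.

```python
def build_regional_alert_prompt(region: str, events: list) -> str:
    """Compose a text prompt summarizing anomaly events for a generative AI model.

    Each event is a dict with "time", "place" and "kind" keys (assumed schema).
    """
    lines = [f"- {e['time']} {e['place']}: {e['kind']}" for e in events]
    return (
        f"Write a short crime-prevention notice for residents of {region}.\n"
        "Base it only on the following detected events:\n"
        + "\n".join(lines)
    )
```

Restricting the prompt to the listed events is a design choice intended to keep the generated notice tied to what the system actually detected.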

[0129] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0130] This invention is a security system for ensuring the safety of living spaces, and incorporates an emotion engine to analyze the user's emotions. The system consists of multiple terminals and a server, and the terminals are equipped with surveillance cameras, vibration sensors, door / window sensors, and the emotion engine.

[0131] First, the terminal acquires video data from the surveillance camera in real time. Using a facial recognition algorithm, the server detects faces in this data and compares them with a pre-registered database. If a suspicious person is identified, the server immediately sends an alert to the user and, if necessary, notifies the security company.

[0132] Next, the terminal's vibration sensor and open / close sensor monitor for any abnormalities in the windows and doors. If any abnormality is detected, the terminal sends the information to the server, and an alarm is immediately issued.

[0133] Furthermore, an emotion engine built into the terminal analyzes the user's facial expressions in the video. Based on this analysis, the server infers the user's emotional state and adjusts the content and urgency of notifications as needed. For example, if the server detects that the user's emotions are unstable, it sends the user a faster, more detailed notification and also promptly notifies the security company.

[0134] In addition, the emotional information analyzed by the emotion engine is used as feedback when sharing information with external organizations. This allows external organizations to take more appropriate measures that account for the user's emotional state.

[0135] For example, if the emotion engine analyzes that a user is feeling fear upon encountering a suspicious person, an enhanced security mode may be automatically selected, prompting a rapid response. In this way, the system can provide advanced security features to protect the user's safety.

[0136] The following describes the processing flow.

[0137] Step 1:

[0138] The terminal acquires real-time video data of the living space via a surveillance camera. The acquired video is transmitted to the server via a secure communication path.

[0139] Step 2:

[0140] The server receives the video data and applies a facial recognition algorithm to detect people in the video. The detected facial data is then compared with a database to check for the presence of suspicious individuals.

[0141] Step 3:

[0142] If the server identifies a suspicious person, it will immediately send an alert to the user. The alert will include information about the detected suspicious person, their location, and the time.

[0143] Step 4:

[0144] The terminal's vibration and open / close sensors constantly monitor the status of windows and doors, detecting any abnormal vibrations or opening / closing events.

[0145] Step 5:

[0146] A terminal that detects an anomaly reports the information to the server. The server processes the information, immediately issues an alarm, and notifies the security company.

[0147] Step 6:

[0148] Using the emotion engine, the terminal analyzes the user's facial expressions in the captured video. The server then evaluates the user's emotional state based on the analyzed data.

[0149] Step 7:

[0150] If the user's emotional state indicates anxiety or fear, the server will increase the urgency of the notification and provide detailed information to both the user and the security company.

[0151] Step 8:

[0152] The emotional information analyzed by the emotion engine is used as feedback when sharing information with external organizations, so that countermeasures that take the user's emotions into consideration can be examined.

[0153] (Example 2)

[0154] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0155] In modern society, efficiently ensuring the safety of living spaces requires the rapid identification of suspicious individuals and the real-time detection of abnormalities in windows and doors. Furthermore, a key challenge is to further enhance safety by providing appropriate response measures that take into account the user's emotional state.

[0156] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0157] In this invention, the server includes means for recognizing a person's face from video information using video processing technology and comparing it with registered information; means for detecting abnormalities in openings using a vibration detection device and an opening detection device; and means for analyzing a person's emotional state to evaluate the degree of urgency and issuing an alarm according to the evaluation result. This makes it possible to quickly and effectively detect and respond to abnormalities in living spaces and enhance user safety.

[0158] "Video processing technology" refers to the technology of analyzing digital video data to extract or process specific information.

[0159] "Face recognition" means detecting the facial features of a person within video data and identifying them uniquely.

[0160] "Registered information" refers to identification information of individuals that is pre-stored in the system and is used to verify the identification results.

[0161] A "vibration detection device" is a device that senses vibrations in objects or structures and detects changes in those vibrations.

[0162] An "opening detection device" is a sensor device that detects when a door or window is opened or closed.

[0163] "Analyzing emotional states" is the process of scientifically evaluating a person's emotional state based on their facial expressions, tone of voice, and other factors.

[0164] "Assessing the urgency" means determining how quickly a response is needed when a particular situation occurs.

[0165] "Issuing an alarm" means sending a warning signal when certain conditions are met to alert those involved.

[0166] An "external organization" is a separate organization that shares and collaborates on information related to the system, distinct from the organization to which the system operators or users belong.

[0167] This invention relates to a security system composed of multiple terminals and a server, which includes a function to analyze human emotions in order to ensure the safety of living spaces. First, the terminals acquire video data in real time using surveillance cameras. Changes in the environment are monitored using the emotion engine, vibration detection device, opening detection device, and other components installed in each terminal. In this process, a general image recognition library can be used as the video processing technology for analyzing the video data.

[0168] The server executes a facial recognition algorithm to identify individuals from the acquired video data and compare them with the registered information. If the comparison identifies a suspicious person, an alert is sent to the user, and external organizations are notified as necessary. Furthermore, for emotion analysis, a deep learning library, for example, is used to analyze the user's emotional state, and the alarm content is adjusted based on the results.

[0169] For example, if the terminal detects abnormal vibrations in a window via a vibration detection device, it immediately sends that information to the server. This allows the server to issue an alarm. Furthermore, based on the results of emotion analysis, if it determines that the user is in an unstable state, it takes appropriate action according to the urgency of the situation.

[0170] As a concrete example, a prompt for the generative AI model could be input as, "If the user rapidly shows signs of anxiety, please tell me how to immediately send a notification and how to enhance the security mode accordingly," allowing the system to derive an appropriate response. In this way, the system of the present invention provides advanced security functions that enhance safety in living spaces.

[0171] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0172] Step 1:

[0173] The terminal acquires video data in real time using a surveillance camera. The input is live video from the camera, and the output is a video file or stream. This video data is temporarily stored within the terminal and prepared for subsequent processing.

[0174] Step 2:

[0175] The server receives video data transmitted from the terminal. The input is video data transmitted from the terminal over the network, and the output is data stored in the server's storage. The server then prepares to analyze this data.

[0176] Step 3:

[0177] The server executes a face recognition algorithm to recognize a person's face from video data. The input is video data stored on the server, and the output is face coordinate information and identifiers. During this process, video processing technology is used to analyze the data and compare it with registered information.

[0178] Step 4:

[0179] The server compares the recognized face with registered information. The input is the identifier of the recognized face, and the output is the result of the comparison. If there is a mismatch, a suspicious person is identified, and the suspicious person's information is stored on the server.

[0180] Step 5:

[0181] The server sends an alert to the user based on the results of the suspicious person identification. The input is the information about the suspicious person, and the output is a push notification or email to the user's device. This allows the user to quickly detect danger.

[0182] Step 6:

[0183] The terminal uses the vibration detection device and the opening detection device to monitor windows and doors for abnormalities. The input is real-time data from the sensors, and the output is the abnormality detection result. When an abnormality is detected, the server is immediately notified.

[0184] Step 7:

[0185] The server receives anomaly notifications and issues alarms. The input is an anomaly notification originating from the sensors, and the output is an alarm to the user, such as an audible or visual alarm. This allows the user to respond quickly to anomalies.

[0186] Step 8:

[0187] The emotion engine installed in the terminal analyzes the user's facial expressions from the video. The input is the user's face in the video data, and the output is an evaluation of the user's emotional state. The emotion engine identifies the user's anxiety and fear and sends the results to the server.

[0188] Step 9:

[0189] The server evaluates the urgency based on the emotion analysis results and adjusts the content of the notification. The input is the emotion analysis results, and the output is the adjusted urgency and content of the notification. As an example of this process, one could input the prompt, "If a user rapidly shows signs of anxiety, please tell me how to send an immediate notification and how to strengthen security mode accordingly," into the generative AI model.
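The urgency adjustment described in Step 9 can be sketched as a simple rule, assuming the emotion analysis results arrive as scores between 0 and 1. The `Urgency` levels, the score format, and the threshold of 0.6 are assumptions for illustration and are not taken from the source.

```python
from enum import IntEnum


class Urgency(IntEnum):
    LOW = 1
    NORMAL = 2
    HIGH = 3


def adjust_notification(base: Urgency, emotion_scores: dict, threshold: float = 0.6):
    """Raise urgency one level when anxiety or fear reaches the threshold.

    `emotion_scores` maps emotion names to values in [0, 1] (assumed format).
    Returns the adjusted urgency and whether detailed content should be included.
    """
    distress = max(emotion_scores.get("anxiety", 0.0), emotion_scores.get("fear", 0.0))
    if distress < threshold:
        return base, False
    # Cap the escalation at the highest defined level.
    raised = Urgency(min(base + 1, Urgency.HIGH))
    return raised, True
```

For instance, under these assumptions a fear score of 0.9 would raise a NORMAL notification to HIGH and request detailed content.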

[0190] (Application Example 2)

[0191] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0192] In modern society, improving the security of living spaces is an urgent issue. In particular, it is necessary to detect intruders and illegal activities in advance and respond quickly. However, existing security systems lack sufficient functions for adjusting alarms based on the user's emotional state and for real-time information sharing. Therefore, there is a need to provide security systems that integrate more advanced sensor technology and emotional analysis technology.

[0193] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0194] In this invention, the server includes means for detecting a person's face from image data using facial recognition technology and comparing it with a registered list of target persons; means for detecting abnormalities in openings using vibration detectors and opening / closing detectors; means for analyzing a person's behavior and facial expressions to evaluate their emotional state and adjusting alarms based on the evaluation results; means for transmitting warnings to the user's mobile device in real time and providing notification content tailored to the user's emotional state; and means for the information processing device to share suspicious person information with multiple external organizations. This enables flexible and rapid responses tailored to the user's emotional state while maintaining a high level of safety in the living space.

[0195] "Facial recognition technology" is a technology that automatically detects a person's face from image data and is used to identify individuals.

[0196] A "vibration detector" is a sensor that detects vibrations and is a device used to detect abnormal movement of objects or structures.

[0197] An "open / close detector" is a device that detects the open / closed status of windows and doors, and is used to detect attempts at unauthorized opening.

[0198] "Emotional state" refers to the emotional state exhibited by a person, and includes psychological or physiological responses.

[0199] An "information processing device" refers to a computer or server, which is a device used for receiving, analyzing, and transmitting data.

[0200] "External organizations" refer to groups or institutions other than those to which the user belongs, and include other organizations responsible for sharing crime prevention information and implementing countermeasures.

[0201] A "portable information terminal" refers to a portable computing device, such as a smartphone or tablet, used for sending and receiving information.

[0202] This invention is a security system for enhancing the safety of living spaces. The system mainly consists of a server, terminals (surveillance cameras, vibration detectors, door / window detectors), and a user's portable information terminal.

[0203] The server processes image data received from the camera using facial recognition technology and compares it against a registered list of individuals. It also analyzes data from vibration and opening / closing detectors to detect abnormalities in openings. Furthermore, the server uses information obtained from the video data to analyze the emotional state of individuals and adjusts the alarm based on that state. This allows for a stronger alarm and prompt action if the user experiences anxiety or fear.

[0204] Furthermore, the mobile device receives information from the server in real time and provides alerts and notifications to the user. The information displayed on the mobile device is customized according to the user's emotional state. This is achieved using software such as the Affectiva SDK for emotional analysis.

[0205] If a terminal detects an anomaly, the server will share information about the suspicious person with external organizations. This will strengthen the local crime prevention system.

[0206] For example, if the front door open / close detector detects an anomaly while the user is away from home at night, the server immediately sends an alert to the user's mobile device, allowing the user to remotely check the situation as needed. In this process, a generative AI model is used to analyze emotions and adjust the notification content accordingly.

[0207] An example of a prompt message is, "Please tell me how to implement an alarm function in a home security system that takes into account the user's emotional state."

[0208] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0209] Step 1:

[0210] The terminal acquires video data from surveillance cameras in real time. This data is then analyzed using facial recognition technology to extract feature points and compare them with a list of individuals. The input is video data, and the output is the determination of whether the person is registered or not.

[0211] Step 2:

[0212] The terminal's vibration and open / close sensors monitor the state of their respective targets (windows and doors) and detect anomalies. Input is sensor data, and output is whether or not an anomaly was detected. If an anomaly is detected, the sensor information is sent to the server.

[0213] Step 3:

[0214] The server receives video and sensor data and analyzes the user's facial expression data using the Affectiva SDK for emotion analysis. This analysis takes facial expression data as input and generates an emotional state as output.

[0215] Step 4:

[0216] Based on the emotional state, the server adjusts the content and urgency of the alarm. The input is the result of the emotional analysis, and the output is the adjusted alarm information. For example, if the user is feeling anxious, the alarm will be strengthened.

[0217] Step 5:

[0218] The server sends the adjusted alarm information to the user's mobile device. The mobile device receives this alarm information and displays a notification to the user. The input is the alarm information, and the output is the notification content presented to the user.

[0219] Step 6:

[0220] The server shares information about suspicious individuals with external organizations. It uses a generative AI model to create local crime prevention information and transmits it to external organizations. Inputs include anomaly detection data and emotion analysis data, while output is information shared with external organizations.
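Sharing one report with multiple external organizations, as in Step 6, might look like the following sketch. The organization names and the idea of passing one delivery callable per organization are hypothetical; the source does not describe the transport used.

```python
def share_with_organizations(report: dict, senders: dict) -> dict:
    """Send one report to every organization and record which deliveries succeeded.

    `senders` maps an organization name to a callable that delivers the report
    (assumed interface). A failure for one organization does not stop delivery
    to the others.
    """
    results = {}
    for name, send in senders.items():
        try:
            send(report)
            results[name] = True
        except Exception:
            # Record the failure so that delivery can be retried later.
            results[name] = False
    return results
```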

[0221] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0222] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. A prompt containing instructions is input to the data generation model 58, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.

[0223] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0224] [Second Embodiment]

[0225] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0226] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0227] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0228] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0229] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0230] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0231] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0232] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0233] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0234] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0235] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0236] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0237] This invention is constructed as a security system for monitoring living spaces. The system consists of multiple terminals and a server, each terminal equipped with a surveillance camera and sensors. The operation of the system is described below in natural language.

[0238] First, the terminal acquires real-time video from its surveillance camera and sends the video data to the server. The server processes the received video data, uses a facial recognition algorithm to identify people's faces, and compares them with a pre-registered database. If a person is identified as a suspicious individual, the server immediately sends an alert to the user and, if necessary, contacts the security company.

[0239] Next, vibration sensors and open / close sensors connected to the terminal constantly monitor for abnormalities in windows and doors. If a sensor detects an abnormality, the terminal reports that information to the server. Based on that information, the server issues an alarm to the user and the security company.

[0240] Furthermore, by analyzing the visitor's behavior and facial expressions in the video, the server scores the visitor's level of danger. Based on this score, it automatically determines the necessary actions and sends appropriate notifications to the user and the police.

[0241] For example, if a person exhibiting suspicious behavior is near the entrance, the terminal can detect this behavior, and if the server assesses the risk level as high, it can immediately coordinate with the security company to take countermeasures.

[0242] Finally, the server uses the collected suspicious person information and anomaly detection logs to create regional alert information with a generative AI model, and shares this information with local governments and relevant crime prevention organizations. This enhances overall regional security and promotes a sense of security among residents. With this configuration, the system of the present invention can provide advanced crime prevention functions.

[0243] The following describes the processing flow.

[0244] Step 1:

[0245] The terminal acquires video data from the surveillance camera in real time. The acquired data is sent to the server using a predetermined communication protocol.

[0246] Step 2:

[0247] The server analyzes the received video data and applies a facial recognition algorithm. It then compares the data with a database to check if the identified person is on the registered list of suspicious individuals.

[0248] Step 3:

[0249] If the server detects a suspicious person, it will immediately notify the user and, if necessary, send an alert to the security company. This notification will include information such as the suspicious person's facial image, the date and time of detection, and the location where the incident occurred.

[0250] Step 4:

[0251] The terminal monitors data from vibration sensors and open / close sensors installed on windows and doors, and if there is any abnormal vibration or opening / closing, it determines it to be an anomaly.
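The source does not say how "abnormal vibration" is judged. One possible approach, sketched below, compares each sample with a rolling baseline of recent normal samples. The class name, window size, multiplier `k`, and noise floor are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class VibrationMonitor:
    """Flags samples that deviate strongly from the recent baseline (illustrative)."""

    def __init__(self, window: int = 20, k: float = 3.0, min_samples: int = 5) -> None:
        self._history = deque(maxlen=window)
        self._k = k
        self._min_samples = min_samples

    def is_abnormal(self, sample: float) -> bool:
        """Return True if the sample exceeds mean + k * std of recent normal samples."""
        abnormal = False
        if len(self._history) >= self._min_samples:
            baseline = mean(self._history)
            spread = pstdev(self._history)
            # A small floor avoids flagging tiny noise when the baseline is flat.
            abnormal = sample > baseline + self._k * max(spread, 0.01)
        if not abnormal:
            # Only normal samples update the baseline.
            self._history.append(sample)
        return abnormal
```

Excluding abnormal samples from the baseline is a design choice that keeps a sustained disturbance from being absorbed into what the monitor treats as normal.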

[0252] Step 5:

[0253] A terminal that detects an anomaly reports the information to the server. Based on the details of the anomaly, the server issues an alert to the user. Simultaneously, the security company is also automatically notified.

[0254] Step 6:

[0255] The terminal analyzes the visitor's behavior and facial expressions from video data, and the server scores the level of risk based on the results. The scoring is performed in stages based on pre-set criteria.

[0256] Step 7:

[0257] If the server assigns a high risk score, it sends a warning notification to the police and nearby residents. This allows for a swift response.
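The staged scoring of Step 6 and the notification routing of Step 7 can be illustrated together as follows. The 0 to 100 scale, the cut-off values, the stage names, and the routing table are assumptions; the source states only that scoring is performed in stages based on pre-set criteria and that a high score triggers notification of the police and nearby residents.

```python
def risk_stage(score: float) -> str:
    """Map a numeric risk score in [0, 100] to a stage using preset cut-offs."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "high"
    if score >= 50:
        return "medium"
    return "low"


def recipients_for(stage: str) -> list:
    """Choose who is notified for each stage (routing table is an assumption)."""
    routing = {
        "low": ["user"],
        "medium": ["user", "security_company"],
        "high": ["user", "security_company", "police", "nearby_residents"],
    }
    return routing[stage]
```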

[0258] Step 8:

[0259] The server uses AI to analyze suspicious person information and anomaly detection logs, generating local alert information. This information is shared with local governments and crime prevention organizations to improve safety throughout the community.

[0260] (Example 1)

[0261] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0262] In modern society, enhancing the safety of living spaces is a crucial issue. Conventional security systems have problems such as delayed response times and insufficient information sharing. In particular, early detection of anomalies and sharing of warning information throughout the community are often inadequate, which has been a major obstacle to improving security.

[0263] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0264] In this invention, the server includes means for extracting an individual's face from video information using video recognition technology and matching it with a registration list, means for detecting abnormalities in openings using a physical change detection device, and means for analyzing human movements and facial expressions to quantify risk and generate notifications based on the numerical results. This enables real-time anomaly detection, rapid information sharing, and effective crime prevention measures across the entire region.

[0265] "Video recognition technology" is a technology that analyzes video data acquired using cameras and sensors to identify and extract specific people or objects from that data.

[0266] "Extracting individual faces" refers to identifying the human face portion from video data and extracting that information in a usable format.

[0267] "Matching against the registration list" means comparing the extracted information with information in a pre-registered database to confirm that they match.

[0268] A "physical change detection device" is a device used to sense physical changes in the environment, and its purpose is to detect vibrations and changes in opening and closing.

[0269] "Detecting abnormalities in openings" means detecting unusual behavior or conditions in openings such as windows and doors, and determining that some kind of abnormality has occurred.

[0270] "Analyzing human actions and facial expressions" refers to the technology of analyzing data from an individual's actions and facial expressions in order to evaluate their intentions and emotions.

[0271] "Quantifying risk" means assigning a numerical score to the degree of danger in a given situation based on information obtained from analyzing movements and facial expressions.

[0272] "Generating notifications" refers to the process of creating necessary alerts and reports based on analysis results and informing relevant parties.

[0273] An "information processing device" is a device for receiving, processing, and transmitting data, and it integrates and manages all the data within a system.

[0274] "Sharing information on suspicious individuals" means sharing information about detected suspicious persons or unusual behavior with multiple relevant organizations in cooperation with them.

[0275] "Real-time data transmission" means instantly sending the latest information that needs to be considered to other devices and systems, enabling a rapid response.

[0276] This invention is a monitoring system for improving the safety of living spaces, and mainly consists of a terminal, a server, and a user. The terminal is equipped with a surveillance camera and various sensors, and is responsible for monitoring video information and physical changes in real time. Specifically, the hardware combines a high-resolution camera and physical change sensors, which enables detailed observation over a wide area.

[0277] Terminal operation

[0278] The terminal acquires video data using surveillance cameras and transmits it to the server in real time. Furthermore, it detects physical anomalies using vibration sensors and door / window sensors. Encryption technology is used for communication, and low-power communication methods such as LoRaWAN are employed, allowing for efficient data transmission.
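
As a non-limiting illustration of why a low-power link such as LoRaWAN can carry this sensor traffic efficiently, the following sketch packs one sensor event into a fixed 9-byte payload. The `SensorEvent` name, the field layout, and the sensor type codes are assumptions made for illustration and are not defined in this disclosure; encryption is assumed to be applied by the communication layer and is not shown.

```python
import struct
from dataclasses import dataclass

# Hypothetical sensor type codes (not defined in the source).
VIBRATION = 1
OPEN_CLOSE = 2

# Assumed layout: device id (uint16), sensor type (uint8),
# reading in hundredths (uint16), Unix timestamp (uint32); big-endian, 9 bytes.
_LAYOUT = ">HBHI"


@dataclass(frozen=True)
class SensorEvent:
    device_id: int
    sensor_type: int
    reading: float   # e.g. vibration amplitude, or 1.0 / 0.0 for open / closed
    timestamp: int   # seconds since the Unix epoch


def encode_event(event: SensorEvent) -> bytes:
    """Pack an event into a fixed 9-byte payload."""
    # Store the reading as an integer number of hundredths, clamped to uint16.
    scaled = max(0, min(65535, round(event.reading * 100)))
    return struct.pack(_LAYOUT, event.device_id, event.sensor_type, scaled, event.timestamp)


def decode_event(payload: bytes) -> SensorEvent:
    """Inverse of encode_event; the reading is restored to two decimal places."""
    device_id, sensor_type, scaled, timestamp = struct.unpack(_LAYOUT, payload)
    return SensorEvent(device_id, sensor_type, scaled / 100, timestamp)
```

A fixed binary layout keeps each message far smaller than an equivalent text encoding, which matters on links with tight payload and duty-cycle limits.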

[0279] Server Processing

[0280] The server receives data from the terminal and uses video recognition technology and face recognition algorithms to identify a person and compare them with the registration list. Specifically, it performs analysis using a neural network by leveraging TensorFlow. Additionally, to analyze human actions and expressions, it utilizes emotion recognition technology based on OpenCV. The analysis results are reflected in the risk scoring, and if an anomaly is detected, an immediate notification is sent to the user.

[0281] The server generates regional security information by inputting the collected data into a generative AI model and shares the information with external security agencies and local governments. This sharing enhances the safety of the entire region.

[0282] User Response

[0283] Users receive alerts and notifications from the server via a smartphone or other connected devices. This enables quick response and allows for easy notification to a security company if necessary. As a specific example, when suspicious behavior is observed, it is possible to quickly take measures in cooperation with a security company.

[0284] Examples of Prompt Sentences

[0285] "What is the optimal algorithm for identifying suspicious persons in real-time monitoring of living spaces?"

[0286] "Please tell me how to improve the efficiency of alert transmission triggered by sensor anomaly detection in a security system."

[0287] "How should the risk scoring algorithm based on visitor behavior analysis be improved?"

[0288] This system is expected to provide an environment in which residents can live with greater peace of mind.

[0289] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0290] Step 1:

[0291] The terminal uses a surveillance camera to acquire video data in real time. This video data is transmitted to the server via a secure communication protocol. The input is video from the surveillance camera, and the output is an encrypted video stream transferred to the server. Specifically, a high-resolution camera continuously captures video, encodes the data, and transmits it over the network.

[0292] Step 2:

[0293] The server analyzes the received video data and extracts individual faces using video recognition technology. The input here is video data transmitted from the terminal, and the output is the extracted face region data. This process is executed by a face detection algorithm using a neural network based on TensorFlow. Specifically, it detects and extracts face regions on a frame-by-frame basis.

[0294] Step 3:

[0295] The server uses a face recognition algorithm to compare extracted face data with a registered list. The input is the detected face data, and the output is the matching result. Here, a comparison operation is performed to determine whether the face matches a face registered in the existing database. Specifically, the recognition result is generated by comparing face features.
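
The comparison of face features in Step 3 could, for example, be sketched as a nearest-match search over feature vectors. The use of cosine similarity, the 0.8 threshold, and the names `cosine_similarity` and `match_face` are assumptions for illustration; the disclosure does not specify the comparison metric, and the feature vectors themselves are assumed to come from the face detection stage.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_face(
    features: Sequence[float],
    registration_list: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed value
) -> Tuple[Optional[str], float]:
    """Return (person_id, similarity) for the best entry at or above the
    threshold, or (None, best_similarity) when nothing matches."""
    best_id, best_sim = None, -1.0
    for person_id, registered in registration_list.items():
        sim = cosine_similarity(features, registered)
        if sim > best_sim:
            best_id, best_sim = person_id, sim
    if best_sim >= threshold:
        return best_id, best_sim
    return None, best_sim
```

Returning the best similarity even on a mismatch lets later stages log how close the nearest registered entry was.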

[0296] Step 4:

[0297] The physical change detection device connected to the terminal constantly monitors for abnormalities in openings such as windows and doors. The input is real-time data from physical condition sensors, and the output is an anomaly detection trigger signal. When an anomaly is detected, the information is immediately reported to the server. Specifically, it performs continuous monitoring of vibrations and opening / closing operations and generates alerts in the event of an anomaly.
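
The monitoring in Step 4 might be sketched as follows, under the assumption that each sample carries a vibration amplitude and an open / closed state. The `OpeningMonitor` class, the threshold value, and the `armed` flag are hypothetical and serve only to show how a trigger signal could be derived from the two sensors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpeningMonitor:
    vibration_threshold: float = 0.5  # assumed units and value
    armed: bool = True                # whether opening events count as anomalies
    _last_open: bool = False          # previous open / closed state

    def check(self, vibration: float, is_open: bool) -> Optional[str]:
        """Return a trigger label for an anomalous sample, or None."""
        # Only the closed-to-open transition counts as an opening event.
        opened_now = is_open and not self._last_open
        self._last_open = is_open
        if vibration >= self.vibration_threshold:
            return "vibration"
        if self.armed and opened_now:
            return "opening"
        return None
```

Tracking the previous state means a door left open produces one trigger rather than one per sample.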

[0298] Step 5:

[0299] The server analyzes human movements and facial expressions to quantify the risk of a situation. The input is movement and facial expression information extracted from video data, and the output is a risk score. It utilizes OpenCV for facial recognition and generates an alert immediately if an anomaly is detected. Specifically, it analyzes facial expression data and performs evaluations based on established criteria.
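
One simple way to quantify risk as described in Step 5 is a weighted sum of normalized indicators. The feature names and weights below are assumptions chosen for illustration; the disclosure does not state the evaluation criteria, and the indicators are assumed to be produced upstream by the movement and facial expression analysis.

```python
# Hypothetical feature weights; the source does not specify the criteria.
WEIGHTS = {
    "loitering": 0.4,
    "approach_to_opening": 0.3,
    "negative_expression": 0.3,
}


def risk_score(features: dict) -> float:
    """Map indicators in [0, 1] to a risk score in [0, 100]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Missing indicators count as 0; out-of-range values are clamped.
        value = min(1.0, max(0.0, features.get(name, 0.0)))
        total += weight * value
    return round(100.0 * total, 1)
```

Because the weights sum to 1, the score stays within 0 to 100 regardless of how many indicators are present.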

[0300] Step 6:

[0301] The server generates necessary notifications based on the analysis results and sends them to the user's smart device. Inputs are risk scores and anomaly detection information, while output is an alert message to the user. Specifically, it generates alerts when pre-set thresholds are exceeded and sends notifications via SMS or a dedicated app.
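
The threshold-based alert generation in Step 6 could be sketched as below. The threshold of 70, the `build_alert` name, and the message wording are assumptions; actual delivery via SMS or a dedicated app is outside the scope of this sketch.

```python
from typing import Optional

ALERT_THRESHOLD = 70.0  # assumed value


def build_alert(score: float, sensor_trigger: Optional[str]) -> Optional[dict]:
    """Return an alert when the score exceeds the threshold or a sensor fired."""
    reasons = []
    if score > ALERT_THRESHOLD:
        reasons.append(f"risk score {score:.1f} exceeds {ALERT_THRESHOLD:.1f}")
    if sensor_trigger is not None:
        reasons.append(f"sensor anomaly: {sensor_trigger}")
    if not reasons:
        return None  # nothing to report
    return {
        "channels": ["app", "sms"],
        "message": "Alert: " + "; ".join(reasons),
    }
```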

[0302] Step 7:

[0303] The server generates regional alert information and shares it with external organizations. The input consists of all collected anomaly information and analysis data from a generative AI model; the output is regional alert information. This information is transmitted in JSON format and shared with crime prevention agencies and local governments. Specifically, the server analyzes and documents the information using the generative AI model and shares it via secure data communication.
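
As an illustration of the JSON format mentioned in Step 7, the following sketch serializes regional alert information. The field names and the `regional_alert_json` function are hypothetical, and the `summary` text is assumed to have been produced beforehand by the AI model; no schema is defined in this disclosure.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def regional_alert_json(
    area: str,
    events: list,
    summary: str,
    issued_at: Optional[datetime] = None,
) -> str:
    """Serialize regional alert information as a JSON string."""
    issued_at = issued_at or datetime.now(timezone.utc)
    payload = {
        "area": area,
        "issued_at": issued_at.isoformat(),
        "event_count": len(events),
        "events": events,      # collected anomaly information
        "summary": summary,    # text assumed to come from the AI model
    }
    # ensure_ascii=False keeps non-ASCII place names readable.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)
```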

[0304] (Application Example 1)

[0305] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as a "server", and the smart glasses 214 are referred to as a "terminal".

[0306] In modern society, it is important to ensure the safety of living spaces. However, conventional security systems have problems in that it is difficult to detect abnormalities in real time and accurately identify suspicious persons, and it is difficult to take effective countermeasures. In addition, rapid information sharing and appropriate warning transmission when an abnormality occurs are often insufficient, and the establishment of new technologies to solve this is required.

[0307] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0308] In this invention, the server includes a processing device that detects the characteristics of a person from video data using a face recognition algorithm and matches them with a target person directory, a device that monitors abnormalities in entrances and windows using a vibration detection unit and an opening / closing detection unit, and a mechanism that analyzes the actions and expressions of a person, quantifies the degree of danger, and issues a warning based on the evaluation. As a result, it becomes possible to quickly and accurately detect abnormal situations in the living space and take appropriate actions.

[0309] The "face recognition algorithm" is a calculation method for detecting the face of a person in video data, analyzing the characteristics, and matching them with a previously registered directory.

[0310] The "vibration detection unit" is a device that senses the vibration of an object and uses the data for detecting abnormalities.

[0311] The "opening / closing detection unit" is a device that senses the opening / closing state of a door or window and checks whether there is an abnormality.

[0312] The "information processing unit" is a central device that centrally manages a plurality of data and has functions for analysis and sharing.

[0313] "Means for transmitting warnings to mobile communication devices when an anomaly is detected" refers to a processing mechanism for transmitting relevant information to mobile communication terminals such as mobile phones when some kind of anomaly is detected.

[0314] A "generative model" is a machine learning algorithm that creates new information and insights based on given data.

[0315] To implement this invention, the following system needs to be constructed. The server plays a central role, processing video data acquired in real time from surveillance cameras using a facial recognition algorithm and analyzing the characteristics of individuals. The results of this analysis are compared with a pre-registered catalog to determine whether or not there are any suspicious individuals. The software used includes OpenCV and TensorFlow, a library specifically for machine learning. The server also receives data from vibration detection and opening / closing detection units and processes it to detect abnormalities in entrances and windows. The data from these sensors is transmitted to the server via a microcontroller such as Arduino or Raspberry Pi.

[0316] When an anomaly is detected, the server immediately transmits an alert to a mobile communication device such as a smartphone. The notification is sent via a dedicated application on the mobile device. This application allows users to check anomaly information in real time and take necessary countermeasures quickly. For example, Google Cloud Messaging is used to send notifications. In addition, a generative AI model is used to generate regional alert information and share it with external organizations. As a result, overall regional security is enhanced.

[0317] As a concrete example, consider a scenario where a suspicious person is loitering near a user's home while the user is traveling. This system analyzes video data to detect the suspicious person and immediately sends an alert to the user's smartphone, enabling a quick response. An example of a prompt to the generative AI model is: "I want to develop an application that detects suspicious people from surveillance camera footage and warns the user based on the results of scoring their behavior. In particular, please tell me how to optimize the algorithm that analyzes the visitor's facial expressions and movements."

[0318] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0319] Step 1:

[0320] The server acquires video data in real time from the surveillance camera. It takes the raw video stream from the camera as input and stores the frames in buffer memory for processing. This stored data is then used to prepare for subsequent face recognition processing.
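
The buffer memory described in Step 1 could, for example, be a bounded buffer that retains only the most recent frames. The `FrameBuffer` class and its default capacity are assumptions for illustration; frames are treated as opaque objects.

```python
from collections import deque


class FrameBuffer:
    """Keep only the most recent `capacity` frames."""

    def __init__(self, capacity: int = 30):
        self._frames = deque(maxlen=capacity)

    def push(self, frame) -> None:
        self._frames.append(frame)  # the oldest frame is dropped when full

    def latest(self, n: int) -> list:
        """Return up to n of the newest frames, oldest first."""
        if n <= 0:
            return []
        return list(self._frames)[-n:]

    def __len__(self) -> int:
        return len(self._frames)
```

A bounded buffer caps memory use when face recognition runs slower than the camera's frame rate.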

[0321] Step 2:

[0322] The server executes a face recognition algorithm on the acquired video data. Specifically, it uses the OpenCV library to detect faces of people in the video. The input is video frames, and the output is a list of the location information of the detected faces. This output is then used to analyze the features of the people.

[0323] Step 3:

[0324] The server uses the analyzed facial features to compare them with pre-registered catalog information. The input is the detected facial features, and the output is either matching catalog information or a mismatch flag. If there is no match, the system marks the person as suspicious and records the information.

[0325] Step 4:

[0326] The vibration detection unit and opening / closing detection unit connected to the terminal transmit data to the server in real time. The input is the detection signal from the sensor, and the output is an anomaly detection flag. Based on this data, the server determines whether or not there is an anomaly and prepares to send a notification in the next step.

[0327] Step 5:

[0328] When an anomaly is detected, the server sends a notification to a mobile communication device. Specifically, it uses Google Cloud Messaging to send an alert to the user's smartphone. The input is an anomaly detection flag and the characteristics of the suspicious person, and the output is an alert message. The user can receive this message and check the situation.

[0329] Step 6:

[0330] The server uses a generative AI model to create local alert information. This utilizes collected anomaly detection data and matching results as input. The output is alert information to be shared with external organizations. This information will raise crime prevention awareness throughout the community and strengthen response capabilities.

[0331] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0332] This invention is a security system for ensuring the safety of living spaces, and incorporates an emotion engine to analyze the user's emotions. The system consists of multiple terminals and a server, and the terminals are equipped with surveillance cameras, vibration sensors, door / window sensors, and the emotion engine.

[0333] First, the terminal acquires video data from the surveillance camera in real time. Using a facial recognition algorithm, the server detects faces in this data and compares them with a pre-registered database. If a suspicious person is identified, the server immediately sends an alert to the user and, if necessary, notifies the security company.

[0334] Next, the terminal's vibration sensor and open / close sensor monitor for any abnormalities in the windows and doors. If any abnormality is detected, the terminal sends the information to the server, and an alarm is immediately issued.

[0335] Furthermore, an emotion engine built into the device analyzes the user's facial expressions in the video. Based on this analysis, the server infers the user's emotional state and adjusts the content and urgency of notifications as needed. For example, if the server detects that the user's emotions are unstable, it will send the user a faster and more detailed notification and also prompt notification to the security company.

[0336] In addition, the emotion engine utilizes the analyzed emotional information and uses it as feedback when sharing information with external organizations. This allows external organizations to take more appropriate measures, taking into account the user's emotional state.

[0337] For example, if the emotion engine analyzes that a user is feeling fear upon encountering a suspicious person, an enhanced security mode may be automatically selected, prompting a rapid response. In this way, the system can provide advanced security features to protect the user's safety.

[0338] The following describes the processing flow.

[0339] Step 1:

[0340] The device acquires real-time video data of the living space via a surveillance camera. The acquired video is transmitted to the server via a secure communication path.

[0341] Step 2:

[0342] The server receives the video data and applies a facial recognition algorithm to detect people in the video. The detected facial data is then compared with a database to check for the presence of suspicious individuals.

[0343] Step 3:

[0344] If the server identifies a suspicious person, it will immediately send an alert to the user. The alert will include information about the detected suspicious person, their location, and the time.

[0345] Step 4:

[0346] The device's vibration and open / close sensors constantly monitor the status of windows and doors, detecting any abnormal vibrations or opening / closing events.

[0347] Step 5:

[0348] A terminal that detects an anomaly reports the information to the server. The server processes the information, immediately issues an alarm, and notifies the security company.

[0349] Step 6:

[0350] The device analyzes the user's facial expressions from the video it captures using an emotion engine. The server then evaluates the user's emotional state based on the analyzed data.

[0351] Step 7:

[0352] If the user's emotional state indicates anxiety or fear, the server will increase the urgency of the notification and provide detailed information to both the user and the security company.
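
The adjustment in Step 7 can be sketched as a one-level escalation when the estimated emotion is anxiety or fear. The urgency levels, the emotion labels, and the `adjust_notification` name are assumptions for illustration; the emotion label is assumed to be supplied by the emotion engine.

```python
URGENCY_LEVELS = ["low", "medium", "high"]   # assumed scale
ESCALATING_EMOTIONS = {"anxiety", "fear"}    # emotions that raise urgency


def adjust_notification(base_urgency: str, emotion: str) -> dict:
    """Raise urgency one level and widen the recipients for anxiety or fear."""
    level = URGENCY_LEVELS.index(base_urgency)
    escalated = emotion in ESCALATING_EMOTIONS
    if escalated:
        level = min(level + 1, len(URGENCY_LEVELS) - 1)
    recipients = ["user"]
    if escalated:
        recipients.append("security_company")
    return {
        "urgency": URGENCY_LEVELS[level],
        "detail": "detailed" if escalated else "summary",
        "recipients": recipients,
    }
```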

[0353] Step 8:

[0354] The emotional information analyzed by the emotion engine is used as feedback when sharing information with external organizations, so that countermeasures that take the user's emotions into consideration can be examined.

[0355] (Example 2)

[0356] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0357] In modern society, efficiently ensuring the safety of living spaces requires the rapid identification of suspicious individuals and the real-time detection of abnormalities in windows and doors. Furthermore, a key challenge is to further enhance safety by providing appropriate response measures that take into account the user's emotional state.

[0358] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0359] In this invention, the server includes means for recognizing a person's face from video information using video processing technology and comparing it with registered information; means for detecting abnormalities in openings using a vibration detection device and an opening detection device; and means for analyzing a person's emotional state to evaluate the degree of urgency and issuing an alarm according to the evaluation result. This makes it possible to quickly and effectively detect and respond to abnormalities in living spaces and enhance user safety.

[0360] "Video processing technology" refers to the technology of analyzing digital video data to extract or process specific information.

[0361] "Face recognition" means detecting the facial features of a person within video data and identifying them uniquely.

[0362] "Registration information" refers to identification information of individuals that is pre-stored in the system and is used to verify the identification results.

[0363] A "vibration detection device" is a device that senses vibrations in objects or structures and detects changes in those vibrations.

[0364] An "opening detection device" is a sensor device that detects when a door or window is opened or closed.

[0365] "Analyzing emotional states" is the process of scientifically evaluating a person's emotional state based on their facial expressions, tone of voice, and other factors.

[0366] "Assessing the urgency" means determining how quickly a response is needed when a particular situation occurs.

[0367] "Issuing an alarm" means sending a warning signal when certain conditions are met to alert those involved.

[0368] An "external organization" is a separate organization that shares and collaborates on information related to the system, distinct from the organization to which the system operators or users belong.

[0369] This invention relates to a security system composed of multiple terminals and a server, which includes a function to analyze human emotions in order to ensure the safety of living spaces. First, the terminals acquire video data in real time using surveillance cameras. Changes in the environment are monitored using emotion engines, vibration detection devices, opening detection devices, etc., installed in the terminals. In this process, a general image recognition library can be used as the video processing technology for analyzing the video data.

[0370] The server executes a facial recognition algorithm to identify individuals from the acquired video data and compare them with registered information. If a suspicious person is identified as a result of the comparison with registered information, an alert is sent to the user, and external organizations are notified as necessary. Furthermore, for emotion analysis, a deep learning library, for example, is used to analyze the user's emotional state, and the alarm content is adjusted based on the results.

[0371] For example, if the terminal detects abnormal vibrations in a window via a vibration detection device, it immediately sends that information to the server. This allows the server to issue an alarm. Furthermore, based on the results of emotion analysis, if it determines that the user is in an unstable state, it takes appropriate action according to the urgency of the situation.

[0372] As a concrete example, a prompt for the generative AI model could be input as, "If the user rapidly shows signs of anxiety, please tell me how to immediately send a notification and how to enhance the security mode accordingly," allowing the system to derive an appropriate response. In this way, the system of the present invention provides advanced security functions that enhance safety in living spaces.

[0373] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0374] Step 1:

[0375] The terminal acquires video data in real time using a surveillance camera. The input is live video from the camera, and the output is a video file or stream. This video data is temporarily stored within the terminal and prepared for subsequent processing.

[0376] Step 2:

[0377] The server receives video data transmitted from the terminal. The input is video data transmitted from the terminal over the network, and the output is data stored in the server's storage. The server then prepares to analyze this data.

[0378] Step 3:

[0379] The server executes a face recognition algorithm to recognize a person's face from video data. The input is video data stored on the server, and the output is face coordinate information and identifiers. During this process, video processing technology is used to analyze the data and compare it with registered information.

[0380] Step 4:

[0381] The server compares the recognized face with registered information. The input is the identifier of the recognized face, and the output is the result of the comparison. If there is a mismatch, a suspicious person is identified, and the suspicious person's information is stored on the server.

[0382] Step 5:

[0383] The server sends an alert to the user based on the results of the suspicious person identification. The input is the information about the suspicious person, and the output is a push notification or email to the user's device. This allows the user to quickly detect danger.

[0384] Step 6:

[0385] The terminal uses vibration and opening detection devices to monitor windows and doors for abnormalities. The input is real-time data from the sensors, and the output is the result of any detected abnormalities. When an abnormality is detected, the server is immediately notified.

[0386] Step 7:

[0387] The server receives anomaly notifications and issues alarms. The input is anomaly notifications from sensors, and the output is anomaly notifications to the user, such as audio or light alarms. This allows the user to respond quickly to anomalies.

[0388] Step 8:

[0389] The emotion engine installed in the device analyzes the user's facial expressions from video. The input is the user's face from the video data, and the output is an evaluation of their emotional state. The emotion analysis engine identifies the user's anxiety and fear and sends the results to the server.

[0390] Step 9:

[0391] The server evaluates the urgency based on the emotion analysis results and adjusts the content of the notification. The input is the emotion analysis results, and the output is the adjusted urgency and content of the notification. As an example of this process, one could input the prompt, "If a user rapidly shows signs of anxiety, please tell me how to send an immediate notification and how to strengthen security mode accordingly," into the generative AI model.

[0392] (Application Example 2)

[0393] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0394] In modern society, improving the security of living spaces is an urgent issue. In particular, it is necessary to detect intruders and illegal activities in advance and respond quickly. However, existing security systems lack sufficient functions for adjusting alarms based on the user's emotional state and for real-time information sharing. Therefore, there is a need to provide security systems that integrate more advanced sensor technology and emotional analysis technology.

[0395] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0396] In this invention, the server includes means for detecting a person's face from image data using facial recognition technology and comparing it with a registered list of target persons; means for detecting abnormalities in openings using vibration detectors and opening / closing detectors; means for analyzing a person's behavior and facial expressions to evaluate their emotional state and adjusting alarms based on the evaluation results; means for transmitting warnings to the user's mobile device in real time and providing notification content tailored to the user's emotional state; and means for the information processing device to share suspicious person information with multiple external organizations. This enables flexible and rapid responses tailored to the user's emotional state while maintaining a high level of safety in the living space.

[0397] "Facial recognition technology" is a technology that automatically detects a person's face from image data and is used to identify individuals.

[0398] A "vibration detector" is a sensor that detects vibrations and is a device used to detect abnormal movement of objects or structures.

[0399] An "open / close detector" is a device that detects the open / closed status of windows and doors, and is used to detect attempts at unauthorized opening.

[0400] "Emotional state" refers to the emotional state exhibited by a person, and includes psychological or physiological responses.

[0401] An "information processing device" refers to a computer or server, which is a device used for receiving, analyzing, and transmitting data.

[0402] "External organizations" refer to groups or institutions other than those to which the user belongs, and include other organizations responsible for sharing crime prevention information and implementing countermeasures.

[0403] A "portable information terminal" refers to a portable computing device, such as a smartphone or tablet, used for sending and receiving information.

[0404] This invention is a security system for enhancing the safety of living spaces. The system mainly consists of a server, terminals (surveillance cameras, vibration detectors, door / window detectors), and a user's portable information terminal.

[0405] The server processes image data received from the camera using facial recognition technology and compares it against a registered list of individuals. It also analyzes data from vibration and opening / closing detectors to detect abnormalities in openings. Furthermore, the server uses information obtained from the video data to analyze the emotional state of individuals and adjusts the alarm based on that state. This allows for a stronger alarm and prompt action if the user experiences anxiety or fear.

[0406] Furthermore, the mobile device receives information from the server in real time and provides alerts and notifications to the user. The information displayed on the mobile device is customized according to the user's emotional state. This is achieved using software such as the Affectiva SDK for emotional analysis.

[0407] If a terminal detects an anomaly, the server will share information about the suspicious person with external organizations. This will strengthen the local crime prevention system.

[0408] For example, if the front door open / close detector detects an anomaly while the user is away from home at night, the server immediately sends an alert to the user's mobile device, allowing the user to remotely check the situation as needed. In this process, a generative AI model is used to analyze emotions and adjust the notification content accordingly.

[0409] An example of a prompt message is, "Please tell me how to implement an alarm function in a home security system that takes into account the user's emotional state."

[0410] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0411] Step 1:

[0412] The terminal acquires video data from surveillance cameras in real time. This data is then analyzed using facial recognition technology to extract feature points and compare them with a list of individuals. The input is video data, and the output is the determination of whether the person is registered or not.

[0413] Step 2:

[0414] The terminal's vibration and open / close sensors monitor the state of their respective targets (windows and doors) and detect anomalies. Input is sensor data, and output is whether or not an anomaly was detected. If an anomaly is detected, the sensor information is sent to the server.

[0415] Step 3:

[0416] The server receives video and sensor data and analyzes the user's facial expression data using the Affectiva SDK for emotion analysis. This analysis takes facial expression data as input and generates an emotional state as output.

[0417] Step 4:

[0418] Based on the emotional state, the server adjusts the content and urgency of the alarm. The input is the result of the emotional analysis, and the output is the adjusted alarm information. For example, if the user is feeling anxious, the alarm will be strengthened.

[0419] Step 5:

[0420] The server sends the adjusted alarm information to the user's mobile device. The mobile device receives this alarm information and displays a notification to the user. The input is the alarm information, and the output is the notification content presented to the user.

[0421] Step 6:

[0422] The server shares information about suspicious individuals with external organizations. It uses a generative AI model to create local crime prevention information and transmits it to external organizations. Inputs include anomaly detection data and emotion analysis data, while output is information shared with external organizations.

[0423] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0424] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.

[0425] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0426] [Third Embodiment]

[0427] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0428] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0429] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0430] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0431] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0432] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0433] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0434] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0435] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0436] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0437] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0438] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0439] This invention is constructed as a security system for monitoring living spaces. The system consists of multiple terminals and a server, each terminal equipped with a surveillance camera and sensors. The operation of the system is described below in natural language.

[0440] First, the terminal acquires real-time video footage from its surveillance camera and sends the video data to the server. The server processes the received video data, uses a facial recognition algorithm to identify people's faces, and compares them with a pre-registered database. If a person is identified as a suspicious individual, the server immediately sends an alert to the user and, if necessary, contacts the security company.

[0441] Next, vibration sensors and open / close sensors connected to the terminal constantly monitor for abnormalities in windows and doors. If a sensor detects an abnormality, the terminal reports that information to the server. Based on that information, the server issues an alarm to the user and the security company.

[0442] Furthermore, by analyzing the visitor's behavior and facial expressions in the video, the server scores the visitor's level of danger. Based on this score, it automatically determines the necessary actions and sends appropriate notifications to the user and the police.

[0443] For example, if a person exhibiting suspicious behavior is near the entrance, the terminal can detect this behavior, and if the server assesses the risk level as high, it can immediately coordinate with the security company to take countermeasures.

[0444] Finally, the server uses the collected suspicious person information and anomaly detection logs to create regional alert information via a generative AI, and shares this information with local governments and relevant crime prevention organizations. This enhances overall regional security and promotes a sense of security among residents. With this configuration, the system of the present invention can provide advanced crime prevention functions.

[0445] The following describes the processing flow.

[0446] Step 1:

[0447] The terminal acquires video data from the surveillance camera in real time. The acquired data is sent to the server based on the communication protocol.

[0448] Step 2:

[0449] The server analyzes the received video data and applies a facial recognition algorithm. It then compares the data with a database to check if the identified person is on the registered list of suspicious individuals.

[0450] Step 3:

[0451] If the server detects a suspicious person, it will immediately notify the user and, if necessary, send an alert to the security company. This notification will include information such as the suspicious person's facial image, the date and time of detection, and the location where the incident occurred.

[0452] Step 4:

[0453] The terminal monitors data from vibration sensors and open / close sensors installed on windows and doors, and if there is any abnormal vibration or opening / closing, it determines it to be an anomaly.
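The anomaly determination in Step 4 could be sketched as follows. This is a minimal illustration, not the method of the disclosure: the threshold value, the normalized vibration scale, the `armed` flag, and the names `SensorReading` and `detect_anomaly` are all assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical threshold; the source does not specify a value or unit.
VIBRATION_THRESHOLD = 0.8


@dataclass
class SensorReading:
    sensor_id: str    # illustrative identifier such as "window-1"
    vibration: float  # vibration amplitude on an assumed 0.0 to 1.0 scale
    is_open: bool     # state reported by the open / close sensor
    armed: bool       # assumed flag: True when no opening is expected


def detect_anomaly(reading: SensorReading) -> list[str]:
    """Return the anomaly reasons for one reading (empty list if normal)."""
    reasons = []
    if reading.vibration > VIBRATION_THRESHOLD:
        reasons.append("abnormal_vibration")
    if reading.armed and reading.is_open:
        reasons.append("unexpected_opening")
    return reasons
```

A non-empty result corresponds to the case in which the terminal determines an anomaly and reports it to the server in Step 5.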

[0454] Step 5:

[0455] A terminal that detects an anomaly reports the information to the server. Based on the details of the anomaly, the server issues an alert to the user. Simultaneously, the security company is also automatically notified.

[0456] Step 6:

[0457] The terminal analyzes the visitor's behavior and facial expressions from video data, and the server scores the level of risk based on the results. The scoring is performed in stages based on pre-set criteria.
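As one way to picture the staged scoring in Step 6, the sketch below sums weights for observed behavioral cues and maps the total to a stage. The cue names, weights, and stage boundaries are hypothetical; the source only states that scoring is performed in stages based on pre-set criteria.

```python
# Hypothetical weights per observed cue (not taken from the source).
CUE_WEIGHTS = {
    "loitering": 30,
    "peering_into_window": 40,
    "face_covered": 25,
    "agitated_expression": 20,
}

# Hypothetical stage boundaries: (minimum score, label), highest first.
STAGES = [(80, "high"), (50, "medium"), (0, "low")]


def risk_score(cues):
    """Sum the weights of the distinct cues observed, capped at 100."""
    return min(100, sum(CUE_WEIGHTS.get(cue, 0) for cue in set(cues)))


def risk_stage(score):
    """Map a numeric score to the first stage whose minimum it reaches."""
    for minimum, label in STAGES:
        if score >= minimum:
            return label
    return "low"
```

For example, `risk_stage(risk_score(["loitering", "peering_into_window"]))` evaluates a score of 70 and falls in the "medium" stage under these assumed boundaries.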

[0458] Step 7:

[0459] If the server assigns a high risk score, it sends a warning notification to the police and nearby residents. This allows for a swift response.

[0460] Step 8:

[0461] The server uses AI to analyze suspicious person information and anomaly detection logs, generating local alert information. This information is shared with local governments and crime prevention organizations to improve safety throughout the community.

[0462] (Example 1)

[0463] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0464] In modern society, enhancing the safety of living spaces is a crucial issue. Conventional security systems have problems such as delayed response times and insufficient information sharing. In particular, early detection of anomalies and sharing of warning information throughout the community are often inadequate, which has been a major obstacle to improving security.

[0465] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0466] In this invention, the server includes means for extracting an individual's face from video information using video recognition technology and matching it with a registration list, means for detecting abnormalities in openings using a physical change detection device, and means for analyzing human movements and facial expressions to quantify risk and generate notifications based on the numerical results. This enables real-time anomaly detection, rapid information sharing, and effective crime prevention measures across the entire region.

[0467] "Video recognition technology" is a technology that analyzes video data acquired using cameras and sensors to identify and extract specific people or objects from that data.

[0468] "Extracting individual faces" refers to identifying the human face portion from video data and extracting that information in a usable format.

[0469] "Matching against the registration list" means comparing the extracted information with information in a pre-registered database to confirm that they match.

[0470] A "physical change detection device" is a device used to sense physical changes in the environment, and its purpose is to detect vibrations and changes in opening and closing.

[0471] "Detecting abnormalities in openings" means detecting unusual behavior or conditions in openings such as windows and doors, and determining that some kind of abnormality has occurred.

[0472] "Analyzing human actions and facial expressions" refers to the technology of analyzing data from an individual's actions and facial expressions in order to evaluate their intentions and emotions.

[0473] "Quantifying risk" means assigning a numerical score to the degree of danger in a given situation based on information obtained from analyzing movements and facial expressions.

[0474] "Generating notifications" refers to the process of creating necessary alerts and reports based on analysis results and informing relevant parties.

[0475] An "information processing device" is a device for receiving, processing, and transmitting data, and it integrates and manages all the data within a system.

[0476] "Sharing information on suspicious individuals" means sharing information about detected suspicious persons or unusual behavior with multiple relevant organizations in cooperation with them.

[0477] "Real-time data transmission" means instantly sending the latest information that needs to be considered to other devices and systems, enabling a rapid response.

[0478] This invention is a monitoring system for improving the safety of living spaces, and mainly involves a terminal, a server, and a user. The terminal is equipped with a surveillance camera and various sensors, and is responsible for monitoring video information and physical changes in real time. Specifically, the hardware combines a high-resolution camera and physical change sensors, which enables detailed observation over a wide area.

[0479] Terminal operation

[0480] The terminal acquires video data using surveillance cameras and transmits it to the server in real time. Furthermore, it detects physical anomalies using vibration sensors and door / window sensors. Encryption technology is used for communication, and low-power communication methods such as LoRaWAN are employed, allowing for efficient data transmission.

[0481] Server Processing

[0482] The server receives data from the terminal and uses video recognition technology and facial recognition algorithms to identify individuals and compare them with a registered list. Specifically, it performs analysis using a neural network with TensorFlow. In addition, it utilizes facial recognition technology based on OpenCV to analyze human movements and facial expressions. The analysis results are reflected in a risk score, and if an anomaly is detected, the user is immediately notified.

[0483] The server generates regional alert information by inputting the collected data into an AI model, and shares this information with external crime prevention organizations and local governments. This sharing enhances the overall safety of the region.

[0484] User response

[0485] Users receive alerts and notifications from the server via their smartphones or other connected devices. This enables quick responses and easy notification to security companies as needed. For example, if suspicious activity is detected, the system can quickly coordinate with security companies to take countermeasures.

[0486] Example of a prompt

[0487] "What is the optimal algorithm for identifying suspicious individuals in real-time monitoring of living spaces?"

[0488] "Please tell me how to improve the efficiency of alert transmission based on sensor anomaly detection in security systems."

[0489] "How can we improve the risk scoring algorithm based on visitor behavior analysis?"

[0490] This system is expected to provide residents with a safer and more secure living environment.

[0491] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0492] Step 1:

[0493] The terminal uses a surveillance camera to acquire video data in real time. This video data is transmitted to the server via a secure communication protocol. The input is video from the surveillance camera, and the output is an encrypted video stream transferred to the server. Specifically, a high-resolution camera continuously captures video, encodes the data, and transmits it over the network.

[0494] Step 2:

[0495] The server analyzes the received video data and extracts individual faces using video recognition technology. The input here is video data transmitted from the terminal, and the output is the extracted face region data. This process is executed by a face detection algorithm using a neural network based on TensorFlow. Specifically, it detects and extracts face regions on a frame-by-frame basis.

[0496] Step 3:

[0497] The server uses a face recognition algorithm to compare extracted face data with a registered list. The input is the detected face data, and the output is the matching result. Here, a comparison operation is performed to determine whether the face matches a face registered in the existing database. Specifically, the recognition result is generated by comparing face features.
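The comparison of face features in Step 3 could, for instance, be done by cosine similarity between feature vectors. The source does not name a similarity measure, so cosine similarity, the threshold value, and the function names below are assumptions made for illustration.

```python
import math

# Hypothetical similarity threshold; the source gives no value.
MATCH_THRESHOLD = 0.9


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_face(features, registered):
    """Return (person_id, score) for the best match, or (None, score) if below threshold.

    `registered` maps a person ID to that person's reference feature vector.
    """
    best_id, best_score = None, 0.0
    for person_id, reference in registered.items():
        score = cosine_similarity(features, reference)
        if score > best_score:
            best_id, best_score = person_id, score
    if best_score >= MATCH_THRESHOLD:
        return best_id, best_score
    return None, best_score
```

In practice the feature vectors would come from the face recognition model; here they are plain lists of floats so the comparison logic can be read in isolation.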

[0498] Step 4:

[0499] The physical change detection device connected to the terminal constantly monitors for abnormalities in openings such as windows and doors. The input is real-time data from physical condition sensors, and the output is an anomaly detection trigger signal. When an anomaly is detected, the information is immediately reported to the server. Specifically, it performs continuous monitoring of vibrations and opening / closing operations and generates alerts in the event of an anomaly.

[0500] Step 5:

[0501] The server analyzes human movements and facial expressions to quantify the risk of a situation. The input is movement and facial expression information extracted from video data, and the output is a risk score. It utilizes OpenCV for facial recognition and generates an alert immediately if an anomaly is detected. Specifically, it analyzes facial expression data and performs evaluations based on established criteria.

[0502] Step 6:

[0503] The server generates necessary notifications based on the analysis results and sends them to the user's smart device. Inputs are risk scores and anomaly detection information, while output is an alert message to the user. Specifically, it generates alerts when pre-set thresholds are exceeded and sends notifications via SMS or a dedicated app.
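A minimal sketch of the threshold-based alert generation in Step 6 follows. The threshold value, the rule for choosing between SMS and the dedicated app, and the function name `build_alert` are hypothetical; actual delivery through an SMS gateway or app push service is outside the sketch.

```python
from datetime import datetime, timezone

RISK_THRESHOLD = 70  # hypothetical pre-set threshold; the source gives no value


def build_alert(risk_score, anomalies, has_app, now=None):
    """Return an alert dict, or None when nothing warrants a notification."""
    over_threshold = risk_score > RISK_THRESHOLD
    if not over_threshold and not anomalies:
        return None
    reasons = list(anomalies)
    if over_threshold:
        reasons.insert(0, f"risk score {risk_score}")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        # Assumed rule: use the dedicated app when installed, otherwise SMS.
        "channel": "app" if has_app else "sms",
        "message": "Alert: " + ", ".join(reasons),
        "time": timestamp,
    }
```

Returning `None` for the no-alert case keeps the decision (whether to notify) separate from the transport (how to notify).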

[0504] Step 7:

[0505] The server generates regional alert information and shares it with external organizations. The input is all of the collected anomaly information and analysis data, which is supplied to a generative AI model; the output is regional alert information. This information is transmitted in JSON format and shared with crime prevention agencies and local governments. Specifically, the server analyzes and documents the information using the generative AI model, and shares it via secure data communication.
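The JSON-formatted regional alert mentioned in Step 7 might look like the payload built below. The field names (`region`, `summary`, `event_count`, `events`) are illustrative assumptions, since the source does not define a schema; the `summary` text stands in for the output of the generative AI model, which is not called here.

```python
import json


def build_regional_alert(region, summary, events):
    """Serialize regional alert information as a JSON string.

    `summary` is assumed to be text produced elsewhere by a generative AI model.
    `events` is a list of dicts describing collected anomalies.
    """
    payload = {
        "region": region,
        "summary": summary,
        "event_count": len(events),
        "events": events,
    }
    # ensure_ascii=False keeps non-ASCII text (e.g. Japanese place names) readable.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)
```

The resulting string would then be sent over whatever secure channel the deployment uses, for example HTTPS.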

[0506] (Application Example 1)

[0507] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0508] In modern society, ensuring the safety of living spaces is crucial. However, conventional security systems have faced challenges in real-time anomaly detection and accurate identification of suspicious individuals, making it difficult to implement effective countermeasures. Furthermore, rapid information sharing and appropriate warning dissemination in the event of an anomaly are often insufficient, highlighting the need for the establishment of new technologies to address these issues.

[0509] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0510] In this invention, the server includes a processing unit that uses a facial recognition algorithm to detect the characteristics of a person from video data and compares them with a list of subjects; a device that uses a vibration detection unit and an opening / closing detection unit to monitor abnormalities in entrances and windows; and a mechanism that analyzes a person's behavior and facial expressions to quantify the degree of danger and issue a warning based on that evaluation. This makes it possible to quickly and accurately detect abnormal situations in living spaces and take appropriate action.

[0511] A "face recognition algorithm" is a computational method that detects faces of people in video data, analyzes their features, and compares them with a pre-registered catalog.

[0512] A "vibration detection unit" is a device that senses the vibration of an object and uses that data to detect anomalies.

[0513] A "door / window opening / closing detection unit" is a device that senses the open / closed state of a door or window and checks whether there is any abnormality.

[0514] An "information processing unit" is a central device that has the function of centrally managing, analyzing, and sharing multiple data sets.

[0515] "Means for transmitting warnings to mobile communication devices when an anomaly is detected" refers to a processing mechanism for transmitting relevant information to mobile communication terminals such as mobile phones when some kind of anomaly is detected.

[0516] A "generative model" is a machine learning algorithm that creates new information and insights based on given data.

[0517] To implement this invention, the following system is constructed. The server plays a central role, processing video data acquired in real time from surveillance cameras using a facial recognition algorithm and analyzing the characteristics of individuals. The results of this analysis are compared with a pre-registered catalog to determine whether or not there are any suspicious individuals. The software used includes OpenCV and TensorFlow, a machine learning library. The server also receives data from vibration detection and opening / closing detection units and processes it to detect abnormalities in entrances and windows. The data from these sensors is transmitted to the server via a microcontroller or single-board computer such as an Arduino or Raspberry Pi.

[0518] When an anomaly is detected, the server immediately transmits an alert to a mobile communication device such as a smartphone. The notification is sent via a dedicated application on the mobile device. This application allows users to check anomaly information in real time and take necessary countermeasures quickly. For example, Google Cloud Messaging is used to send notifications. In addition, a generative AI model is used to generate regional alert information and share it with external organizations. As a result, overall regional security is enhanced.

[0519] As a concrete example, consider a scenario where a suspicious person is loitering near a user's home while the user is traveling. This system analyzes video data to detect the suspicious person and immediately sends an alert to the user's smartphone, enabling a quick response. An example of a prompt to the generative AI model is: "I want to develop an application that detects suspicious people from surveillance camera footage and warns the user based on the results of scoring their behavior. In particular, please tell me how to optimize the algorithm that analyzes the visitor's facial expressions and movements."

[0520] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0521] Step 1:

[0522] The server acquires video data in real time from the surveillance camera. It takes the raw video stream from the camera as input and stores the frames in buffer memory for processing. This stored data is then used to prepare for subsequent face recognition processing.
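The frame buffering in Step 1 can be pictured as a fixed-capacity queue that discards the oldest frame when full. This is only a sketch under that assumption; the class name `FrameBuffer`, the default capacity, and the drop-oldest policy are not specified in the source.

```python
from collections import deque


class FrameBuffer:
    """Fixed-capacity buffer that silently drops the oldest frame when full."""

    def __init__(self, capacity=3):
        # deque with maxlen evicts from the left when a new item is appended.
        self._frames = deque(maxlen=capacity)

    def push(self, frame):
        self._frames.append(frame)

    def pop_oldest(self):
        """Return the oldest buffered frame, or None if the buffer is empty."""
        return self._frames.popleft() if self._frames else None

    def __len__(self):
        return len(self._frames)
```

Dropping the oldest frames bounds memory use when face recognition runs slower than the camera's frame rate, at the cost of skipping some frames.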

[0523] Step 2:

[0524] The server executes a face recognition algorithm on the acquired video data. Specifically, it uses the OpenCV library to detect faces of people in the video. The input is video frames, and the output is a list of the location information of the detected faces. This output is then used to analyze the features of the people.

[0525] Step 3:

[0526] The server compares the analyzed facial features with pre-registered catalog information. The input is the detected facial features, and the output is either matching catalog information or a mismatch flag. If there is no match, the system marks the person as suspicious and records the information.

[0527] Step 4:

[0528] The vibration detection unit and opening / closing detection unit connected to the terminal transmit data to the server in real time. The input is the detection signal from the sensor, and the output is an anomaly detection flag. Based on this data, the server determines whether or not there is an anomaly and prepares to send a notification in the next step.

[0529] Step 5:

[0530] When an anomaly is detected, the server sends a notification to a mobile communication device. Specifically, it uses Google Cloud Messaging to send an alert to the user's smartphone. The input is an anomaly detection flag and the characteristics of the suspicious person, and the output is an alert message. The user can receive this message and check the situation.

[0531] Step 6:

[0532] The server uses a generative AI model to create local alert information. This utilizes collected anomaly detection data and matching results as input. The output is alert information to be shared with external organizations. This information will raise crime prevention awareness throughout the community and strengthen response capabilities.
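One way the inputs named in Step 6 could be handed to a generative AI model is by assembling them into a text prompt, as sketched below. The prompt wording, the record fields (`time`, `location`, `type`, `result`), and the function name are hypothetical, and no model is actually invoked.

```python
def build_alert_prompt(anomalies, matches):
    """Assemble a text prompt from anomaly records and face-matching results."""
    lines = [
        "Create a short regional crime-prevention alert from the records below.",
        "Anomaly detections:",
    ]
    if anomalies:
        lines.extend(f"- {a['time']} {a['location']}: {a['type']}" for a in anomalies)
    else:
        lines.append("- none")
    lines.append("Face matching results:")
    if matches:
        lines.extend(f"- {m['time']} {m['location']}: {m['result']}" for m in matches)
    else:
        lines.append("- none")
    return "\n".join(lines)
```

The returned string would be passed to whichever generative AI service the deployment uses, and the model's reply would become the alert information shared with external organizations.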

[0533] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0534] This invention is a security system for ensuring the safety of living spaces, and incorporates an emotion engine to analyze the user's emotions. The system consists of multiple terminals and a server, and the terminals are equipped with surveillance cameras, vibration sensors, door / window sensors, and the emotion engine.

[0535] First, the terminal acquires video data from the surveillance camera in real time. Using a facial recognition algorithm, the server detects faces in this data and compares them with a pre-registered database. If a suspicious person is identified, the server immediately sends an alert to the user and, if necessary, notifies the security company.

[0536] Next, the terminal's vibration sensor and open / close sensor monitor for any abnormalities in the windows and doors. If any abnormality is detected, the terminal sends the information to the server, and an alarm is immediately issued.

[0537] Furthermore, an emotion engine built into the terminal analyzes the user's facial expressions in the video. Based on this analysis, the server infers the user's emotional state and adjusts the content and urgency of notifications as needed. For example, if the server detects that the user's emotions are unstable, it sends the user a faster and more detailed notification and also promptly notifies the security company.

[0538] In addition, the emotional information analyzed by the emotion engine is used as feedback when sharing information with external organizations. This allows external organizations to take more appropriate measures, taking into account the user's emotional state.

[0539] For example, if the emotion engine analyzes that a user is feeling fear upon encountering a suspicious person, an enhanced security mode may be automatically selected, prompting a rapid response. In this way, the system can provide advanced security features to protect the user's safety.

[0540] The following describes the processing flow.

[0541] Step 1:

[0542] The terminal acquires real-time video data of the living space via a surveillance camera. The acquired video is transmitted to the server via a secure communication path.

[0543] Step 2:

[0544] The server receives the video data and applies a facial recognition algorithm to detect people in the video. The detected facial data is then compared with a database to check for the presence of suspicious individuals.

[0545] Step 3:

[0546] If the server identifies a suspicious person, it will immediately send an alert to the user. The alert will include information about the detected suspicious person, their location, and the time.

[0547] Step 4:

[0548] The terminal's vibration and open / close sensors constantly monitor the status of windows and doors, detecting any abnormal vibrations or opening / closing events.

[0549] Step 5:

[0550] A terminal that detects an anomaly reports the information to the server. The server processes the information, immediately issues an alarm, and notifies the security company.

[0551] Step 6:

[0552] Using the emotion engine, the terminal analyzes the user's facial expressions in the video it captures. The server then evaluates the user's emotional state based on the analyzed data.

[0553] Step 7:

[0554] If the user's emotional state indicates anxiety or fear, the server will increase the urgency of the notification and provide detailed information to both the user and the security company.
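The escalation rule in Step 7 could be expressed as a small function like the one below. The three-level urgency scale, the emotion labels, and the name `adjust_notification` are assumptions for illustration; the source only states that anxiety or fear raises the urgency and triggers detailed information for both the user and the security company.

```python
URGENCY_LEVELS = ["normal", "elevated", "urgent"]  # hypothetical scale
ESCALATING_EMOTIONS = {"anxiety", "fear"}          # emotions named in Step 7


def adjust_notification(base_urgency, user_emotion):
    """Raise urgency one level and widen recipients when the user shows anxiety or fear."""
    level = URGENCY_LEVELS.index(base_urgency)
    escalate = user_emotion in ESCALATING_EMOTIONS
    if escalate:
        level = min(level + 1, len(URGENCY_LEVELS) - 1)
    return {
        "urgency": URGENCY_LEVELS[level],
        "detailed": escalate,
        "recipients": ["user", "security_company"] if escalate else ["user"],
    }
```

The emotion label itself would come from the emotion engine; this function covers only the decision made once that label is available.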

[0555] Step 8:

[0556] The emotional information analyzed by the emotion engine is used as feedback when sharing information with external organizations, so that countermeasures that take the user's emotions into consideration can be examined.

[0557] (Example 2)

[0558] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0559] In modern society, efficiently ensuring the safety of living spaces requires the rapid identification of suspicious individuals and the real-time detection of abnormalities in windows and doors. Furthermore, a key challenge is to further enhance safety by providing appropriate response measures that take into account the user's emotional state.

[0560] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0561] In this invention, the server includes means for recognizing a person's face from video information using video processing technology and comparing it with registered information; means for detecting abnormalities in openings using a vibration detection device and an opening detection device; and means for analyzing a person's emotional state to evaluate the degree of urgency and issuing an alarm according to the evaluation result. This makes it possible to quickly and effectively detect and respond to abnormalities in living spaces and enhance user safety.

[0562] "Video processing technology" refers to the technology of analyzing digital video data to extract or process specific information.

[0563] "Face recognition" means detecting the facial features of a person within video data and identifying them uniquely.

[0564] "Registration information" refers to identification information of individuals that is pre-stored in the system and is used to verify the identification results.

[0565] A "vibration detection device" is a device that senses vibrations in objects or structures and detects changes in those vibrations.

[0566] An "opening detection device" is a sensor device that detects when a door or window is opened or closed.

[0567] "Analyzing emotional states" is the process of scientifically evaluating a person's emotional state based on their facial expressions, tone of voice, and other factors.

[0568] "Assessing the urgency" means determining how quickly a response is needed when a particular situation occurs.

[0569] "Issuing an alarm" means sending a warning signal when certain conditions are met to alert those involved.

[0570] An "external organization" is a separate organization that shares and collaborates on information related to the system, distinct from the organization to which the system operators or users belong.

[0571] This invention relates to a security system composed of multiple terminals and a server, which includes a function to analyze human emotions in order to ensure the safety of living spaces. First, the terminals acquire video data in real time using surveillance cameras. Changes in the environment are monitored using the emotion engines, vibration detection devices, opening detection devices, and the like installed in the terminals. In this process, a general image recognition library can be used as the video processing technology for analyzing the video data.

[0572] The server executes a facial recognition algorithm to identify individuals from the acquired video data and compare them with registered information. If a suspicious person is identified as a result of the comparison with registered information, an alert is sent to the user, and external organizations are notified as necessary. Furthermore, for emotion analysis, a deep learning library, for example, is used to analyze the user's emotional state, and the alarm content is adjusted based on the results.

[0573] For example, if the terminal detects abnormal vibrations in a window via a vibration detection device, it immediately sends that information to the server. This allows the server to issue an alarm. Furthermore, based on the results of emotion analysis, if it determines that the user is in an unstable state, it takes appropriate action according to the urgency of the situation.

[0574] As a concrete example, a prompt such as "If the user rapidly shows signs of anxiety, please tell me how to immediately send a notification and how to enhance the security mode accordingly" could be input to the generative AI model, allowing the system to derive an appropriate response. In this way, the system of the present invention provides advanced security functions that enhance safety in living spaces.

[0575] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0576] Step 1:

[0577] The terminal acquires video data in real time using a surveillance camera. The input is live video from the camera, and the output is a video file or stream. This video data is temporarily stored within the terminal and prepared for subsequent processing.

[0578] Step 2:

[0579] The server receives video data transmitted from the terminal. The input is video data transmitted from the terminal over the network, and the output is data stored in the server's storage. The server then prepares to analyze this data.

[0580] Step 3:

[0581] The server executes a face recognition algorithm to recognize a person's face from video data. The input is video data stored on the server, and the output is face coordinate information and identifiers. During this process, video processing technology is used to analyze the data and compare it with registered information.

[0582] Step 4:

[0583] The server compares the recognized face with registered information. The input is the identifier of the recognized face, and the output is the result of the comparison. If there is a mismatch, a suspicious person is identified, and the suspicious person's information is stored on the server.
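
The comparison in Step 4 can be illustrated with a minimal sketch. It assumes that registered information is held as a set of face identifiers and that server storage is a simple in-memory list; the names `ComparisonResult`, `SuspiciousPersonStore`, and `compare_with_registered` are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ComparisonResult:
    face_id: str          # identifier output by face recognition (Step 3)
    registered: bool      # True when the identifier is in the registered information
    checked_at: datetime


@dataclass
class SuspiciousPersonStore:
    """In-memory stand-in for server storage (an assumption for illustration)."""
    records: list = field(default_factory=list)

    def save(self, result: ComparisonResult) -> None:
        self.records.append(result)


def compare_with_registered(face_id, registered_ids, store):
    """Treat a face whose identifier is not registered as a suspicious person."""
    result = ComparisonResult(
        face_id=face_id,
        registered=face_id in registered_ids,
        checked_at=datetime.now(timezone.utc),
    )
    if not result.registered:
        store.save(result)  # keep the suspicious person's information on the server
    return result
```

In this sketch only mismatches are stored, which mirrors the statement that the suspicious person's information is kept on the server.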

[0584] Step 5:

[0585] The server sends an alert to the user based on the results of the suspicious person identification. The input is the information about the suspicious person, and the output is a push notification or email to the user's device. This allows the user to quickly detect danger.

[0586] Step 6:

[0587] The terminal uses vibration and opening / closing detection devices to monitor windows and doors for abnormalities. The input is real-time data from the sensors, and the output is the result of any detected abnormalities. When an abnormality is detected, the server is immediately notified.

[0588] Step 7:

[0589] The server receives anomaly notifications and issues alarms. The input is anomaly notifications from sensors, and the output is anomaly notifications to the user, such as audio or light alarms. This allows the user to respond quickly to anomalies.
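
Steps 6 and 7 can be sketched as follows. This is only an illustration under stated assumptions: the vibration threshold `VIBRATION_LIMIT`, the `armed` flag, and all function and field names are hypothetical, since the source does not specify how an abnormality is decided.

```python
from dataclasses import dataclass
from typing import Optional

VIBRATION_LIMIT = 0.8  # assumed threshold in arbitrary units


@dataclass
class SensorReading:
    location: str     # e.g. "window" or "door"
    vibration: float  # output of the vibration detection device
    is_open: bool     # output of the opening / closing detection device


def detect_anomaly(reading: SensorReading, armed: bool) -> Optional[str]:
    """Terminal side (Step 6): describe the abnormality, or return None."""
    if reading.vibration > VIBRATION_LIMIT:
        return f"abnormal vibration at {reading.location}"
    if armed and reading.is_open:
        # Assumption: an opening only counts as abnormal while monitoring is armed.
        return f"unexpected opening at {reading.location}"
    return None


def issue_alarm(anomaly: Optional[str]) -> list:
    """Server side (Step 7): turn a notification into audio and light alarms."""
    if anomaly is None:
        return []
    return [("sound", anomaly), ("light", anomaly)]
```

For example, `issue_alarm(detect_anomaly(SensorReading("window", 0.9, False), armed=True))` yields one sound alarm and one light alarm for the window.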

[0590] Step 8:

[0591] The emotion engine installed in the terminal analyzes the user's facial expressions from the video. The input is the user's face from the video data, and the output is an evaluation of their emotional state. The emotion engine identifies the user's anxiety and fear and sends the results to the server.

[0592] Step 9:

[0593] The server evaluates the urgency based on the emotion analysis results and adjusts the content of the notification. The input is the emotion analysis results, and the output is the adjusted urgency and content of the notification. As an example of this process, the prompt "If a user rapidly shows signs of anxiety, please tell me how to send an immediate notification and how to strengthen security mode accordingly" could be input into the generative AI model.
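
As a rough sketch of the urgency evaluation in Step 9, the following assumes that the emotion engine reports anxiety and fear as scores between 0 and 1. The cut-off values, the message prefixes, and the function names are assumptions for illustration; the source leaves the actual rule to the system or to the generative AI model.

```python
def assess_urgency(anxiety: float, fear: float) -> str:
    """Map emotion scores in [0, 1] to an urgency level (assumed cut-offs)."""
    peak = max(anxiety, fear)
    if peak >= 0.7:
        return "high"
    if peak >= 0.4:
        return "medium"
    return "low"


def adjust_notification(base_message: str, urgency: str) -> dict:
    """Adjust the notification content according to the evaluated urgency."""
    prefixes = {"high": "URGENT: ", "medium": "Warning: ", "low": ""}
    return {
        "urgency": urgency,
        "message": prefixes[urgency] + base_message,
        # Assumption: only high-urgency notifications repeat until acknowledged.
        "repeat_until_acknowledged": urgency == "high",
    }
```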

[0594] (Application Example 2)

[0595] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0596] In modern society, improving the security of living spaces is an urgent issue. In particular, it is necessary to detect intruders and illegal activities in advance and respond quickly. However, existing security systems lack sufficient functions for adjusting alarms based on the user's emotional state and for real-time information sharing. Therefore, there is a need to provide security systems that integrate more advanced sensor technology and emotional analysis technology.

[0597] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0598] In this invention, the server includes means for detecting a person's face from image data using facial recognition technology and comparing it with a registered list of target persons; means for detecting abnormalities in openings using vibration detectors and opening / closing detectors; means for analyzing a person's behavior and facial expressions to evaluate their emotional state and adjusting alarms based on the evaluation results; means for transmitting warnings to the user's mobile device in real time and providing notification content tailored to the user's emotional state; and means for the information processing device to share suspicious person information with multiple external organizations. This enables flexible and rapid responses tailored to the user's emotional state while maintaining a high level of safety in the living space.

[0599] "Facial recognition technology" is a technology that automatically detects a person's face from image data and is used to identify individuals.

[0600] A "vibration detector" is a sensor that detects vibrations and is a device used to detect abnormal movement of objects or structures.

[0601] An "open / close detector" is a device that detects the open / closed status of windows and doors, and is used to detect attempts at unauthorized opening.

[0602] "Emotional state" refers to the emotional state exhibited by a person, and includes psychological or physiological responses.

[0603] An "information processing device" refers to a computer or server, which is a device used for receiving, analyzing, and transmitting data.

[0604] "External organizations" refer to groups or institutions other than those to which the user belongs, and include other organizations responsible for sharing crime prevention information and implementing countermeasures.

[0605] A "personal information terminal" refers to a portable computing device, such as a smartphone or tablet, used for sending and receiving information.

[0606] This invention is a security system for enhancing the safety of living spaces. The system mainly consists of a server, terminals (surveillance cameras, vibration detectors, door / window detectors), and a user's portable information terminal.

[0607] The server processes image data received from the camera using facial recognition technology and compares it against a registered list of individuals. It also analyzes data from vibration and opening / closing detectors to detect abnormalities in openings. Furthermore, the server uses information obtained from the video data to analyze the emotional state of individuals and adjusts the alarm based on that state. This allows for a stronger alarm and prompt action if the user experiences anxiety or fear.

[0608] Furthermore, the mobile device receives information from the server in real time and provides alerts and notifications to the user. The information displayed on the mobile device is customized according to the user's emotional state. This is achieved using software such as the Affectiva SDK for emotional analysis.

[0609] If a terminal detects an anomaly, the server will share information about the suspicious person with external organizations. This will strengthen the local crime prevention system.

[0610] For example, if the front door open / close detector detects an anomaly while the user is away from home at night, the server immediately sends an alert to the user's mobile device, allowing the user to remotely check the situation as needed. In this process, a generative AI model is used to analyze emotions and adjust the notification content accordingly.

[0611] An example of a prompt message is, "Please tell me how to implement an alarm function in a home security system that takes into account the user's emotional state."

[0612] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0613] Step 1:

[0614] The terminal acquires video data from surveillance cameras in real time. This data is then analyzed using facial recognition technology to extract feature points and compare them with a list of individuals. The input is video data, and the output is the determination of whether the person is registered or not.

[0615] Step 2:

[0616] The terminal's vibration and open / close sensors monitor the state of their respective targets (windows and doors) and detect anomalies. Input is sensor data, and output is whether or not an anomaly was detected. If an anomaly is detected, the sensor information is sent to the server.

[0617] Step 3:

[0618] The server receives video and sensor data and analyzes the user's facial expression data using the Affectiva SDK for emotion analysis. This analysis takes facial expression data as input and generates an emotional state as output.

[0619] Step 4:

[0620] Based on the emotional state, the server adjusts the content and urgency of the alarm. The input is the result of the emotional analysis, and the output is the adjusted alarm information. For example, if the user is feeling anxious, the alarm will be strengthened.

[0621] Step 5:

[0622] The server sends the adjusted alarm information to the user's mobile device. The mobile device receives this alarm information and displays a notification to the user. The input is the alarm information, and the output is the notification content presented to the user.

[0623] Step 6:

[0624] The server shares information about suspicious individuals with external organizations. It uses a generative AI model to create local crime prevention information and transmits it to external organizations. Inputs include anomaly detection data and emotion analysis data, while output is information shared with external organizations.

[0625] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0626] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.

[0627] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0628] [Fourth Embodiment]

[0629] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0630] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0631] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0632] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0633] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0634] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0635] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0636] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0637] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0638] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0639] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0640] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0641] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0642] This invention is constructed as a security system for monitoring living spaces. The system consists of multiple terminals and a server, each terminal equipped with a surveillance camera and sensors. The operation of the system is described below in natural language.

[0643] First, the terminal acquires real-time video footage from its surveillance camera and sends the video data to the server. The server processes the received video data, uses a facial recognition algorithm to identify people's faces, and compares them with a pre-registered database. If a person is identified as a suspicious individual, the server immediately sends an alert to the user and, if necessary, contacts the security company.

[0644] Next, vibration sensors and open / close sensors connected to the terminal constantly monitor for abnormalities in windows and doors. If a sensor detects an abnormality, the terminal reports that information to the server. Based on that information, the server issues an alarm to the user and the security company.

[0645] Furthermore, by analyzing the visitor's behavior and facial expressions in the video, the server scores the visitor's level of danger. Based on this score, it automatically determines the necessary actions and sends appropriate notifications to the user and the police.

[0646] For example, if a person exhibiting suspicious behavior is near the entrance, the terminal can detect this behavior, and if the server assesses the risk level as high, it can immediately coordinate with the security company to take countermeasures.

[0647] Finally, the server uses the collected suspicious person information and anomaly detection logs to create regional alert information via a generative AI, and shares this information with local governments and relevant crime prevention organizations. This enhances overall regional security and promotes a sense of security among residents. With this configuration, the system of the present invention can provide advanced crime prevention functions.

[0648] The following describes the processing flow.

[0649] Step 1:

[0650] The terminal acquires video data from the surveillance camera in real time. The acquired data is sent to the server based on the communication protocol.

[0651] Step 2:

[0652] The server analyzes the received video data and applies a facial recognition algorithm. It then compares the data with a database to check if the identified person is on the registered list of suspicious individuals.

[0653] Step 3:

[0654] If the server detects a suspicious person, it will immediately notify the user and, if necessary, send an alert to the security company. This notification will include information such as the suspicious person's facial image, the date and time of detection, and the location where the incident occurred.
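
The notification contents listed in Step 3 (facial image, date and time of detection, and location) could be carried in a small record such as the one sketched below. The class name, field names, and message wording are hypothetical, and the facial image is assumed to be referenced by a file path rather than embedded.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SuspiciousPersonAlert:
    face_image_path: str   # reference to the stored facial image (assumed to be a path)
    detected_at: datetime  # date and time of detection
    location: str          # location where the incident occurred


def recipients_for(notify_security_company: bool) -> list:
    """The user is always notified; the security company only if necessary."""
    targets = ["user"]
    if notify_security_company:
        targets.append("security_company")
    return targets


def format_alert(alert: SuspiciousPersonAlert) -> str:
    """Render the alert as a short human-readable message."""
    return (
        f"Suspicious person detected at {alert.location} "
        f"on {alert.detected_at:%Y-%m-%d %H:%M} (image: {alert.face_image_path})"
    )
```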

[0655] Step 4:

[0656] The terminal monitors data from vibration sensors and open / close sensors installed on windows and doors, and if there is any abnormal vibration or opening / closing, it determines it to be an anomaly.

[0657] Step 5:

[0658] A terminal that detects an anomaly reports the information to the server. Based on the details of the anomaly, the server issues an alert to the user. Simultaneously, the security company is also automatically notified.

[0659] Step 6:

[0660] The terminal analyzes the visitor's behavior and facial expressions from video data, and the server scores the level of risk based on the results. The scoring is performed in stages based on pre-set criteria.
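
One way to picture the staged scoring in Step 6 is a weighted sum of observed behaviors that is then mapped onto discrete stages. The behavior labels, weights, and stage boundaries below are purely illustrative assumptions; the source states only that scoring is performed in stages based on pre-set criteria.

```python
import bisect

# Hypothetical weights per observed behavior (not taken from the source).
CRITERIA = {
    "loitering": 30,
    "peering_into_window": 40,
    "face_covered": 20,
    "agitated_expression": 10,
}

STAGE_BOUNDS = [30, 60]                 # assumed boundaries between stages
STAGE_NAMES = ["low", "medium", "high"]


def risk_score(observations: set) -> int:
    """Sum the weights of the observed behaviors, capped at 100."""
    return min(100, sum(CRITERIA.get(o, 0) for o in observations))


def risk_stage(score: int) -> str:
    """Convert a numeric score into a stage: <30 low, 30-59 medium, >=60 high."""
    return STAGE_NAMES[bisect.bisect_right(STAGE_BOUNDS, score)]
```

With these assumed weights, a visitor who is both loitering and peering into a window scores 70 and falls in the "high" stage.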

[0661] Step 7:

[0662] If the server assigns a high risk score, it sends a warning notification to the police and nearby residents. This allows for a swift response.

[0663] Step 8:

[0664] The server uses AI to analyze suspicious person information and anomaly detection logs, generating local alert information. This information is shared with local governments and crime prevention organizations to improve safety throughout the community.

[0665] (Example 1)

[0666] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0667] In modern society, enhancing the safety of living spaces is a crucial issue. Conventional security systems have problems such as delayed response times and insufficient information sharing. In particular, early detection of anomalies and sharing of warning information throughout the community are often inadequate, which has been a major obstacle to improving security.

[0668] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0669] In this invention, the server includes means for extracting an individual's face from video information using video recognition technology and matching it with a registration list, means for detecting abnormalities in openings using a physical change detection device, and means for analyzing human movements and facial expressions to quantify risk and generate notifications based on the numerical results. This enables real-time anomaly detection, rapid information sharing, and effective crime prevention measures across the entire region.

[0670] "Video recognition technology" is a technology that analyzes video data acquired using cameras and sensors to identify and extract specific people or objects from that data.

[0671] "Extracting individual faces" refers to identifying the human face portion from video data and extracting that information in a usable format.

[0672] "Matching against the registration list" means comparing the extracted information with information in a pre-registered database to confirm that they match.

[0673] A "physical change detection device" is a device used to sense physical changes in the environment, and its purpose is to detect vibrations and changes in opening and closing.

[0674] "Detecting abnormalities in openings" means detecting unusual behavior or conditions in openings such as windows and doors, and determining that some kind of abnormality has occurred.

[0675] "Analyzing human movements and facial expressions" refers to analyzing data on an individual's movements and facial expressions in order to evaluate their intentions and emotions.

[0676] "Quantifying risk" means assigning a numerical score to the degree of danger in a given situation based on information obtained from analyzing movements and facial expressions.

[0677] "Generating notifications" refers to the process of creating necessary alerts and reports based on analysis results and informing relevant parties.

[0678] An "information processing device" is a device for receiving, processing, and transmitting data, and it integrates and manages all the data within a system.

[0679] "Sharing information on suspicious individuals" means sharing information about detected suspicious persons or unusual behavior with multiple relevant organizations in cooperation with them.

[0680] "Real-time data transmission" means instantly sending the latest information that needs to be considered to other devices and systems, enabling a rapid response.

[0681] This invention is a monitoring system for improving the safety of living spaces, and mainly consists of a terminal, a server, and a user. The terminal is equipped with a surveillance camera and various sensors, and is responsible for monitoring video information and physical changes in real time. Specifically, the hardware combines a high-resolution camera and physical change sensors, which enables detailed observation over a wide area.

[0682] Terminal operation

[0683] The terminal acquires video data using surveillance cameras and transmits it to the server in real time. Furthermore, it detects physical anomalies using vibration sensors and door / window sensors. Encryption technology is used for communication, and low-power communication methods such as LoRaWAN are employed, allowing for efficient data transmission.

[0684] Server processing

[0685] The server receives data from the terminal and uses video recognition technology and facial recognition algorithms to identify individuals and compare them with a registered list. Specifically, it performs analysis using a neural network with TensorFlow. In addition, it utilizes facial recognition technology based on OpenCV to analyze human movements and facial expressions. The analysis results are reflected in a risk score, and if an anomaly is detected, the user is immediately notified.

[0686] The server generates regional alert information by inputting the collected data into an AI model, and shares this information with external crime prevention organizations and local governments. This sharing enhances the overall safety of the region.

[0687] User response

[0688] Users receive alerts and notifications from the server via their smartphones or other connected devices. This enables quick responses and easy notification to security companies as needed. For example, if suspicious activity is detected, the system can quickly coordinate with security companies to take countermeasures.

[0689] Example of a prompt

[0690] "What is the optimal algorithm for identifying suspicious individuals in real-time monitoring of living spaces?"

[0691] "Please tell me how to improve the efficiency of alert transmission based on sensor anomaly detection in security systems."

[0692] "How can we improve the risk scoring algorithm based on visitor behavior analysis?"

[0693] This system is expected to provide residents with a safer and more secure living environment.

[0694] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0695] Step 1:

[0696] The terminal uses a surveillance camera to acquire video data in real time. This video data is transmitted to the server via a secure communication protocol. The input is video from the surveillance camera, and the output is an encrypted video stream transferred to the server. Specifically, a high-resolution camera continuously captures video, encodes the data, and transmits it over the network.

[0697] Step 2:

[0698] The server analyzes the received video data and extracts individual faces using video recognition technology. The input here is video data transmitted from the terminal, and the output is the extracted face region data. This process is executed by a face detection algorithm using a neural network based on TensorFlow. Specifically, it detects and extracts face regions on a frame-by-frame basis.

[0699] Step 3:

[0700] The server uses a face recognition algorithm to compare extracted face data with a registered list. The input is the detected face data, and the output is the matching result. Here, a comparison operation is performed to determine whether the face matches a face registered in the existing database. Specifically, the recognition result is generated by comparing face features.
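
The feature comparison in Step 3 can be illustrated with a minimal sketch that assumes face features are fixed-length numeric vectors and that similarity is measured by cosine similarity. Both the similarity measure and the 0.9 threshold are assumptions; the source says only that face features are compared.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_face(features, registered, threshold=0.9):
    """Return (name, similarity) for the best match, or (None, similarity) if below threshold."""
    best_name, best_sim = None, 0.0
    for name, reference in registered.items():
        sim = cosine_similarity(features, reference)
        if sim > best_sim:
            best_name, best_sim = name, sim
    if best_sim >= threshold:
        return best_name, best_sim
    return None, best_sim
```

Here `registered` stands in for the registered list as a mapping from a name to a reference feature vector; a real system would obtain these vectors from its face recognition model.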

[0701] Step 4:

[0702] The physical change detection device connected to the terminal constantly monitors for abnormalities in openings such as windows and doors. The input is real-time data from physical condition sensors, and the output is an anomaly detection trigger signal. When an anomaly is detected, the information is immediately reported to the server. Specifically, it performs continuous monitoring of vibrations and opening / closing operations and generates alerts in the event of an anomaly.
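
The "anomaly detection trigger signal" in Step 4 can be sketched as an edge-triggered detector that fires once when the state changes from normal to abnormal, rather than on every abnormal sample. The class name, the threshold, and the rule that any opening counts as abnormal while monitoring is active are assumptions made for illustration.

```python
class AnomalyTrigger:
    """Emit a trigger only on the transition from normal to abnormal."""

    def __init__(self, vibration_limit: float = 0.8):
        self.vibration_limit = vibration_limit  # assumed threshold, arbitrary units
        self._abnormal = False                  # state seen at the previous sample

    def update(self, vibration: float, is_open: bool) -> bool:
        """Process one sensor sample; return True when a report should be sent."""
        abnormal = vibration > self.vibration_limit or is_open
        fire = abnormal and not self._abnormal
        self._abnormal = abnormal
        return fire
```

Edge triggering keeps a continuing vibration from producing a flood of identical reports to the server, while a new abnormality after a quiet period is still reported.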

[0703] Step 5:

[0704] The server analyzes human movements and facial expressions to quantify the risk of a situation. The input is movement and facial expression information extracted from video data, and the output is a risk score. It utilizes OpenCV for facial recognition and generates an alert immediately if an anomaly is detected. Specifically, it analyzes facial expression data and performs evaluations based on established criteria.

[0705] Step 6:

[0706] The server generates necessary notifications based on the analysis results and sends them to the user's smart device. Inputs are risk scores and anomaly detection information, while output is an alert message to the user. Specifically, it generates alerts when pre-set thresholds are exceeded and sends notifications via SMS or a dedicated app.
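
A minimal sketch of Step 6 follows, assuming a single pre-set risk threshold and delivery channels supplied as plain callables. The threshold value, the channel names `"sms"` and `"app"`, and the function names are hypothetical; a real deployment would call an SMS gateway or push service here.

```python
from typing import Callable, Dict, List, Optional

RISK_THRESHOLD = 70  # assumed value; the source only says thresholds are pre-set


def build_alert(risk_score: int, anomalies: List[str]) -> Optional[str]:
    """Return an alert message if the threshold is exceeded or an anomaly was reported."""
    if risk_score <= RISK_THRESHOLD and not anomalies:
        return None
    parts = [f"risk score {risk_score}"] + anomalies
    return "Security alert: " + "; ".join(parts)


def dispatch(
    message: str,
    channels: Dict[str, Callable[[str], None]],
    preferred: List[str],
) -> List[str]:
    """Send the message through each preferred channel that is configured."""
    used = []
    for name in preferred:
        sender = channels.get(name)
        if sender is not None:
            sender(message)
            used.append(name)
    return used
```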

[0707] Step 7:

[0708] The server generates regional alert information and shares it with external organizations. The input consists of all collected anomaly information and analysis data from a generative AI model; the output is regional alert information. This information is transmitted in JSON format and shared with crime prevention agencies and local governments. Specifically, it performs information analysis and documentation using a generative AI model, and shares information via secure data communication.
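
The JSON-formatted regional alert information in Step 7 might look like the output of the sketch below. The field names are hypothetical, since the source does not define a schema, and the `summary` text is assumed to be supplied by the generative AI model rather than produced here.

```python
import json
from datetime import datetime, timezone


def build_regional_alert(region, anomalies, summary, generated_at=None) -> str:
    """Assemble regional alert information as a JSON string (assumed schema)."""
    generated_at = generated_at or datetime.now(timezone.utc)
    payload = {
        "region": region,
        "generated_at": generated_at.isoformat(),
        "anomaly_count": len(anomalies),
        "anomalies": anomalies,
        "summary": summary,  # text assumed to come from the generative AI model
    }
    # ensure_ascii=False keeps non-ASCII place names readable in the output.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)
```

Transport security (the "secure data communication" mentioned in the step) is outside the scope of this sketch.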

[0709] (Application Example 1)

[0710] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0711] In modern society, ensuring the safety of living spaces is crucial. However, conventional security systems have faced challenges in real-time anomaly detection and accurate identification of suspicious individuals, making it difficult to implement effective countermeasures. Furthermore, rapid information sharing and appropriate warning dissemination in the event of an anomaly are often insufficient, highlighting the need for the establishment of new technologies to address these issues.

[0712] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0713] In this invention, the server includes a processing unit that uses a facial recognition algorithm to detect the characteristics of a person from video data and compares them with a list of subjects; a device that uses a vibration detection unit and an opening / closing detection unit to monitor abnormalities in entrances and windows; and a mechanism that analyzes a person's behavior and facial expressions to quantify the degree of danger and issue a warning based on that evaluation. This makes it possible to quickly and accurately detect abnormal situations in living spaces and take appropriate action.

[0714] A "face recognition algorithm" is a computational method that detects faces of people in video data, analyzes their features, and compares them with a pre-registered catalog.

[0715] A "vibration detection unit" is a device that senses the vibration of an object and uses that data to detect anomalies.

[0716] A "door / window opening / closing detection unit" is a device that senses the open / closed state of a door or window and checks whether there is any abnormality.

[0717] An "information processing unit" is a central device that has the function of centrally managing, analyzing, and sharing multiple data sets.

[0718] "Means for transmitting warnings to mobile communication devices when an anomaly is detected" refers to a processing mechanism for transmitting relevant information to mobile communication terminals such as mobile phones when some kind of anomaly is detected.

[0719] A "generative model" is a machine learning algorithm that creates new information and insights based on given data.

[0720] To implement this invention, the following system needs to be constructed. The server plays a central role, processing video data acquired in real time from surveillance cameras using a facial recognition algorithm and analyzing the characteristics of individuals. The results of this analysis are compared with a pre-registered catalog to determine whether or not there are any suspicious individuals. The software used includes OpenCV and TensorFlow, a library specifically for machine learning. The server also receives data from vibration detection and opening / closing detection units and processes it to detect abnormalities in entrances and windows. The data from these sensors is transmitted to the server via a microcontroller board or single-board computer such as an Arduino or Raspberry Pi.

[0721] When an anomaly is detected, the server immediately transmits an alert to a mobile communication device such as a smartphone. The notification is delivered through a dedicated application on the mobile device. This application allows users to check anomaly information in real time and take necessary countermeasures quickly. For example, Google Cloud Messaging is used to send notifications. In addition, a generative AI model is used to generate regional alert information and share it with external organizations. As a result, overall regional security is enhanced.

[0722] As a concrete example, consider a scenario where a suspicious person is loitering near a user's home while the user is traveling. The system analyzes video data to detect the suspicious person and immediately sends an alert to the user's smartphone, enabling a quick response. An example of a prompt to the generative AI model is: "I want to develop an application that detects suspicious people from surveillance camera footage and warns the user based on the results of scoring their behavior. In particular, please tell me how to optimize the algorithm that analyzes the visitor's facial expressions and movements."

[0723] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0724] Step 1:

[0725] The server acquires video data in real time from the surveillance camera. It takes the raw video stream from the camera as input and stores the frames in buffer memory for processing. This stored data is then used to prepare for subsequent face recognition processing.
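
The frame buffering in Step 1 can be sketched as follows. This is a minimal illustration only: the class name `FrameBuffer` and the fixed capacity are assumptions not found in the specification, and frames are treated as opaque objects rather than decoded camera images.

```python
from collections import deque


class FrameBuffer:
    """Fixed-capacity buffer that keeps only the most recent frames (hypothetical helper)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen discards the oldest frame automatically when full.
        self._frames = deque(maxlen=capacity)

    def push(self, frame) -> None:
        """Store one frame taken from the raw video stream."""
        self._frames.append(frame)

    def latest(self, n: int) -> list:
        """Return up to n of the most recent frames, oldest first."""
        if n <= 0:
            return []
        return list(self._frames)[-n:]

    def __len__(self) -> int:
        return len(self._frames)
```

A bounded buffer of this kind keeps memory use constant while leaving the most recent frames available for the face recognition processing of Step 2.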

[0726] Step 2:

[0727] The server executes a face recognition algorithm on the acquired video data. Specifically, it uses the OpenCV library to detect faces of people in the video. The input is video frames, and the output is a list of the location information of the detected faces. This output is then used to analyze the features of the people.

[0728] Step 3:

[0729] The server uses the analyzed facial features to compare them with pre-registered catalog information. The input is the detected facial features, and the output is either matching catalog information or a mismatch flag. If there is no match, the system marks the person as suspicious and records the information.
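
The comparison in Step 3 can be illustrated with a short sketch. It assumes that facial features are represented as numeric vectors and that cosine similarity with a fixed threshold decides a match; the specification does not state the feature format or the matching criterion, and the function names and the threshold value 0.9 are hypothetical.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_against_catalog(features, catalog: dict, threshold: float = 0.9):
    """Return (name, score) for the best catalog match.

    When no entry reaches the threshold, name is None, which serves as the
    mismatch flag described in Step 3.
    """
    best_name, best_score = None, -1.0
    for name, reference in catalog.items():
        score = cosine_similarity(features, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score
```

For example, with a catalog of `{"resident_a": [1.0, 0.0, 0.0]}`, the vector `[0.99, 0.05, 0.0]` is returned as a match for `resident_a`, while a vector pointing in a clearly different direction yields `None` and would be recorded as a suspicious person.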

[0730] Step 4:

[0731] The vibration detection unit and opening / closing detection unit connected to the terminal transmit data to the server in real time. The input is the detection signal from the sensor, and the output is an anomaly detection flag. Based on this data, the server determines whether or not there is an anomaly and prepares to send a notification in the next step.
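
One possible form of the determination in Step 4 is sketched below. The field names, the vibration limit of 1.5 (arbitrary units), and the rule that an opening counts as abnormal only while the system is armed are all assumptions made for illustration; the specification does not define the decision rule.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    sensor_id: str
    vibration: float  # vibration amplitude in arbitrary units (assumed)
    is_open: bool     # state reported by the opening / closing detection unit
    armed: bool       # whether openings should be treated as abnormal (assumed)


def anomaly_flag(reading: SensorReading, vibration_limit: float = 1.5) -> bool:
    """Return True when the reading should raise the anomaly detection flag."""
    if reading.vibration > vibration_limit:
        return True
    # An opening is only abnormal while the system is armed.
    return reading.armed and reading.is_open
```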

[0732] Step 5:

[0733] When an anomaly is detected, the server sends a notification to a mobile communication device. Specifically, it uses Google Cloud Messaging to send an alert to the user's smartphone. The input is an anomaly detection flag and the characteristics of the suspicious person, and the output is an alert message. The user can receive this message and check the situation.
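
The construction of the alert message in Step 5 can be sketched as follows. The JSON layout and the field names are assumptions, and the sketch deliberately stops before delivery: it does not call Google Cloud Messaging or any other messaging service.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_alert(anomaly_detected: bool,
                suspicious_features: Optional[dict],
                detected_at: datetime) -> Optional[str]:
    """Return a JSON alert message, or None when there is nothing to report."""
    if not anomaly_detected and suspicious_features is None:
        return None
    payload = {
        "type": "security_alert",  # hypothetical message type
        "anomaly_detected": anomaly_detected,
        # Characteristics of the suspicious person, if one was identified.
        "suspicious_person": suspicious_features,
        "detected_at": detected_at.astimezone(timezone.utc).isoformat(),
    }
    return json.dumps(payload, ensure_ascii=False)
```

The resulting string would be handed to whichever notification service the dedicated application uses.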

[0734] Step 6:

[0735] The server uses a generative AI model to create local alert information. This utilizes collected anomaly detection data and matching results as input. The output is alert information to be shared with external organizations. This information will raise crime prevention awareness throughout the community and strengthen response capabilities.
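
As an illustration of Step 6, the input to the generative AI model could be assembled from the collected data as shown below. The prompt wording and the function name are hypothetical, and the sketch only builds the text; it does not call any model.

```python
from typing import List


def build_regional_alert_prompt(area: str,
                                anomaly_events: List[str],
                                match_results: List[str]) -> str:
    """Assemble a text prompt from collected events (no model is called)."""
    lines = [
        f"Write a short crime-prevention notice for the {area} area.",
        "Base it only on the events listed below and do not identify residents.",
        "Anomaly detection events:",
    ]
    # Fall back to an explicit placeholder when a list is empty.
    lines += [f"- {event}" for event in anomaly_events] or ["- (none)"]
    lines.append("Face matching results:")
    lines += [f"- {result}" for result in match_results] or ["- (none)"]
    return "\n".join(lines)
```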

[0736] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0737] This invention is a security system for ensuring the safety of living spaces, and incorporates an emotion engine to analyze the user's emotions. The system consists of multiple terminals and a server, and the terminals are equipped with surveillance cameras, vibration sensors, door / window sensors, and the emotion engine.

[0738] First, the terminal acquires video data from the surveillance camera in real time. Using a facial recognition algorithm, the server detects faces in this data and compares them with a pre-registered database. If a suspicious person is identified, the server immediately sends an alert to the user and, if necessary, notifies the security company.

[0739] Next, the terminal's vibration sensor and open / close sensor monitor for any abnormalities in the windows and doors. If any abnormality is detected, the terminal sends the information to the server, and an alarm is immediately issued.

[0740] Furthermore, an emotion engine built into the terminal analyzes the user's facial expressions in the video. Based on this analysis, the server infers the user's emotional state and adjusts the content and urgency of notifications as needed. For example, if the server detects that the user's emotions are unstable, it sends the user a faster, more detailed notification and also promptly notifies the security company.

[0741] In addition, the emotional information analyzed by the emotion engine is used as feedback when sharing information with external organizations. This allows external organizations to take more appropriate measures that take the user's emotional state into account.

[0742] For example, if the emotion engine analyzes that a user is feeling fear upon encountering a suspicious person, an enhanced security mode may be automatically selected, prompting a rapid response. In this way, the system can provide advanced security features to protect the user's safety.

[0743] The following describes the processing flow.

[0744] Step 1:

[0745] The device acquires real-time video data of the living space via a surveillance camera. The acquired video is transmitted to the server via a secure communication path.

[0746] Step 2:

[0747] The server receives the video data and applies a facial recognition algorithm to detect people in the video. The detected facial data is then compared with a database to check for the presence of suspicious individuals.

[0748] Step 3:

[0749] If the server identifies a suspicious person, it will immediately send an alert to the user. The alert will include information about the detected suspicious person, their location, and the time.

[0750] Step 4:

[0751] The device's vibration and open / close sensors constantly monitor the status of windows and doors, detecting any abnormal vibrations or opening / closing events.

[0752] Step 5:

[0753] A terminal that detects an anomaly reports the information to the server. The server processes the information, immediately issues an alarm, and notifies the security company.

[0754] Step 6:

[0755] The device analyzes the user's facial expressions from the video it captures using an emotion engine. The server then evaluates the user's emotional state based on the analyzed data.

[0756] Step 7:

[0757] If the user's emotional state indicates anxiety or fear, the server will increase the urgency of the notification and provide detailed information to both the user and the security company.

[0758] Step 8:

[0759] The emotional information analyzed by the emotion engine is used as feedback when sharing information with external organizations, so that countermeasures that take the user's emotions into consideration can be examined.
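
The escalation described in Step 7 can be sketched as follows. The three urgency levels, the emotion label strings, and the rule that the security company is added as a recipient only when urgency is escalated are assumptions made for illustration.

```python
from enum import IntEnum


class Urgency(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Emotions that raise urgency according to Step 7; the label strings are assumed.
ESCALATING_EMOTIONS = {"anxiety", "fear"}


def adjust_urgency(base: Urgency, user_emotion: str) -> Urgency:
    """Raise urgency by one level (capped at HIGH) for anxiety or fear."""
    if user_emotion in ESCALATING_EMOTIONS:
        return Urgency(min(base + 1, Urgency.HIGH))
    return base


def plan_notification(base: Urgency, user_emotion: str) -> dict:
    """Decide urgency, level of detail, and recipients for one notification."""
    escalated = user_emotion in ESCALATING_EMOTIONS
    return {
        "urgency": adjust_urgency(base, user_emotion),
        "detailed": escalated,
        "recipients": ["user", "security_company"] if escalated else ["user"],
    }
```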

[0760] (Example 2)

[0761] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0762] In modern society, efficiently ensuring the safety of living spaces requires the rapid identification of suspicious individuals and the real-time detection of abnormalities in windows and doors. Furthermore, a key challenge is to further enhance safety by providing appropriate response measures that take into account the user's emotional state.

[0763] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0764] In this invention, the server includes means for recognizing a person's face from video information using video processing technology and comparing it with registered information; means for detecting abnormalities in openings using a vibration detection device and an opening detection device; and means for analyzing a person's emotional state to evaluate the degree of urgency and issuing an alarm according to the evaluation result. This makes it possible to quickly and effectively detect and respond to abnormalities in living spaces and enhance user safety.

[0765] "Image processing technology" refers to the technology of analyzing digital video data to extract or process specific information.

[0766] "Face recognition" means detecting the facial features of a person within video data and identifying them uniquely.

[0767] "Registration information" refers to identification information of individuals that is pre-stored in the system and is used to verify the identification results.

[0768] A "vibration detection device" is a device that senses vibrations in objects or structures and detects changes in those vibrations.

[0769] An "opening detection device" is a sensor device that detects when a door or window is opened or closed.

[0770] "Analyzing emotional states" is the process of scientifically evaluating a person's emotional state based on their facial expressions, tone of voice, and other factors.

[0771] "Assessing the urgency" means determining how quickly a response is needed when a particular situation occurs.

[0772] "Issuing an alarm" means sending a warning signal when certain conditions are met to alert those involved.

[0773] An "external organization" is a separate organization that shares and collaborates on information related to the system, distinct from the organization to which the system operators or users belong.

[0774] This invention relates to a security system composed of multiple terminals and a server, which includes a function to analyze human emotions in order to ensure the safety of living spaces. First, the terminals acquire video data in real time using surveillance cameras. Changes in the environment are monitored using emotion engines, vibration detection devices, opening detection devices, etc., installed in the terminals. In this process, a general image recognition library can be used as the video processing technology for analyzing the video data.

[0775] The server executes a facial recognition algorithm to identify individuals from the acquired video data and compare them with registered information. If a suspicious person is identified as a result of the comparison with registered information, an alert is sent to the user, and external organizations are notified as necessary. Furthermore, for emotion analysis, a deep learning library, for example, is used to analyze the user's emotional state, and the alarm content is adjusted based on the results.

[0776] For example, if the terminal detects abnormal vibrations in a window via a vibration detection device, it immediately sends that information to the server. This allows the server to issue an alarm. Furthermore, based on the results of emotion analysis, if it determines that the user is in an unstable state, it takes appropriate action according to the urgency of the situation.

[0777] As a concrete example, a prompt for the generative AI model could be input as, "If the user rapidly shows signs of anxiety, please tell me how to immediately send a notification and how to enhance the security mode accordingly," allowing the system to derive an appropriate response. In this way, the system of the present invention provides advanced security functions that enhance safety in living spaces.

[0778] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0779] Step 1:

[0780] The terminal acquires video data in real time using a surveillance camera. The input is live video from the camera, and the output is a video file or stream. This video data is temporarily stored within the terminal and prepared for subsequent processing.

[0781] Step 2:

[0782] The server receives video data transmitted from the terminal. The input is video data transmitted from the terminal over the network, and the output is data stored in the server's storage. The server then prepares to analyze this data.

[0783] Step 3:

[0784] The server executes a face recognition algorithm to recognize a person's face from video data. The input is video data stored on the server, and the output is face coordinate information and identifiers. During this process, video processing technology is used to analyze the data and compare it with registered information.

[0785] Step 4:

[0786] The server compares the recognized face with registered information. The input is the identifier of the recognized face, and the output is the result of the comparison. If there is a mismatch, a suspicious person is identified, and the suspicious person's information is stored on the server.

[0787] Step 5:

[0788] The server sends an alert to the user based on the results of the suspicious person identification. The input is the information about the suspicious person, and the output is a push notification or email to the user's device. This allows the user to quickly detect danger.

[0789] Step 6:

[0790] The terminal uses vibration and opening detection devices to monitor windows and doors for abnormalities. The input is real-time data from the sensors, and the output is the result of any detected abnormalities. When an abnormality is detected, the server is immediately notified.
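
One way the terminal in Step 6 could avoid notifying the server repeatedly for the same event is to report only the transition from a normal to an abnormal reading. The following sketch shows that idea; the edge-triggered behavior and the function name are assumptions, since the specification states only that the server is notified immediately.

```python
from typing import Callable, Iterable, List


def abnormal_transitions(samples: Iterable[float],
                         is_abnormal: Callable[[float], bool]) -> List[int]:
    """Return the sample indices at which readings change from normal to abnormal."""
    indices = []
    previous_abnormal = False
    for i, value in enumerate(samples):
        current_abnormal = is_abnormal(value)
        if current_abnormal and not previous_abnormal:
            indices.append(i)  # rising edge: notify the server once
        previous_abnormal = current_abnormal
    return indices
```

With the readings `[0.1, 0.2, 2.0, 2.5, 0.3, 1.9]` and a limit of 1.5, the server would be notified at indices 2 and 5, not at index 3, because the abnormal state had already been reported.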

[0791] Step 7:

[0792] The server receives anomaly notifications and issues alarms. The input is anomaly notifications from sensors, and the output is anomaly notifications to the user, such as audio or light alarms. This allows the user to respond quickly to anomalies.

[0793] Step 8:

[0794] The emotion engine installed in the terminal analyzes the user's facial expressions from video. The input is the user's face from the video data, and the output is an evaluation of their emotional state. The emotion engine identifies the user's anxiety and fear and sends the results to the server.

[0795] Step 9:

[0796] The server evaluates the urgency based on the emotion analysis results and adjusts the content of the notification. The input is the emotion analysis results, and the output is the adjusted urgency and content of the notification. As an example of this process, one could input the prompt, "If a user rapidly shows signs of anxiety, please tell me how to send an immediate notification and how to strengthen security mode accordingly," into the generative AI model.

[0797] (Application Example 2)

[0798] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0799] In modern society, improving the security of living spaces is an urgent issue. In particular, it is necessary to detect intruders and illegal activities in advance and respond quickly. However, existing security systems lack sufficient functions for adjusting alarms based on the user's emotional state and for real-time information sharing. Therefore, there is a need to provide security systems that integrate more advanced sensor technology and emotional analysis technology.

[0800] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0801] In this invention, the server includes means for detecting a person's face from image data using facial recognition technology and comparing it with a registered list of target persons; means for detecting abnormalities in openings using vibration detectors and opening / closing detectors; means for analyzing a person's behavior and facial expressions to evaluate their emotional state and adjusting alarms based on the evaluation results; means for transmitting warnings to the user's mobile device in real time and providing notification content tailored to the user's emotional state; and means for the information processing device to share suspicious person information with multiple external organizations. This enables flexible and rapid responses tailored to the user's emotional state while maintaining a high level of safety in the living space.

[0802] "Facial recognition technology" is a technology that automatically detects a person's face from image data and is used to identify individuals.

[0803] A "vibration detector" is a sensor that detects vibrations and is a device used to detect abnormal movement of objects or structures.

[0804] An "open / close detector" is a device that detects the open / closed status of windows and doors, and is used to detect attempts at unauthorized opening.

[0805] "Emotional state" refers to the emotional state exhibited by a person, and includes psychological or physiological responses.

[0806] An "information processing device" refers to a computer or server, which is a device used for receiving, analyzing, and transmitting data.

[0807] "External organizations" refer to groups or institutions other than those to which the user belongs, and include other organizations responsible for sharing crime prevention information and implementing countermeasures.

[0808] A "portable information terminal" refers to a portable computing device, such as a smartphone or tablet, used for sending and receiving information.

[0809] This invention is a security system for enhancing the safety of living spaces. The system mainly consists of a server, terminals (surveillance cameras, vibration detectors, door / window detectors), and a user's portable information terminal.

[0810] The server processes image data received from the camera using facial recognition technology and compares it against a registered list of individuals. It also analyzes data from vibration and opening / closing detectors to detect abnormalities in openings. Furthermore, the server uses information obtained from the video data to analyze the emotional state of individuals and adjusts the alarm based on that state. This allows for a stronger alarm and prompt action if the user experiences anxiety or fear.

[0811] Furthermore, the mobile device receives information from the server in real time and provides alerts and notifications to the user. The information displayed on the mobile device is customized according to the user's emotional state. This is achieved using software such as the Affectiva SDK for emotional analysis.

[0812] If a terminal detects an anomaly, the server will share information about the suspicious person with external organizations. This will strengthen the local crime prevention system.

[0813] For example, if the front door open / close detector detects an anomaly while the user is away from home at night, the server immediately sends an alert to the user's mobile device, allowing the user to remotely check the situation as needed. In this process, a generative AI model is used to analyze emotions and adjust the notification content accordingly.

[0814] An example of a prompt message is, "Please tell me how to implement an alarm function in a home security system that takes into account the user's emotional state."

[0815] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0816] Step 1:

[0817] The terminal acquires video data from surveillance cameras in real time. This data is then analyzed using facial recognition technology to extract feature points and compare them with a list of individuals. The input is video data, and the output is the determination of whether the person is registered or not.

[0818] Step 2:

[0819] The terminal's vibration and open / close sensors monitor the state of their respective targets (windows and doors) and detect anomalies. Input is sensor data, and output is whether or not an anomaly was detected. If an anomaly is detected, the sensor information is sent to the server.

[0820] Step 3:

[0821] The server receives video and sensor data and analyzes the user's facial expression data using the Affectiva SDK for emotion analysis. This analysis takes facial expression data as input and generates an emotional state as output.

[0822] Step 4:

[0823] Based on the emotional state, the server adjusts the content and urgency of the alarm. The input is the result of the emotional analysis, and the output is the adjusted alarm information. For example, if the user is feeling anxious, the alarm will be strengthened.
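
The adjustment in Step 4 can be sketched as follows, assuming the emotion analysis yields scores between 0 and 1 for each emotion. The `Alarm` record, the three-level scale, and the threshold of 0.6 are hypothetical, and the sketch does not reproduce the interface of the Affectiva SDK.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Alarm:
    message: str
    level: int  # 1 (informational) to 3 (urgent); the scale is assumed
    include_details: bool = False


def adjust_alarm(alarm: Alarm, emotion_scores: dict, threshold: float = 0.6) -> Alarm:
    """Strengthen the alarm when the anxiety or fear score reaches the threshold."""
    distress = max(emotion_scores.get("anxiety", 0.0),
                   emotion_scores.get("fear", 0.0))
    if distress < threshold:
        return alarm
    # Raise the level by one (capped at 3) and attach detailed information.
    return replace(alarm, level=min(alarm.level + 1, 3), include_details=True)
```

Returning a new record instead of modifying the original keeps the unadjusted alarm available, for example for logging.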

[0824] Step 5:

[0825] The server sends the adjusted alarm information to the user's mobile device. The mobile device receives this alarm information and displays a notification to the user. The input is the alarm information, and the output is the notification content presented to the user.

[0826] Step 6:

[0827] The server shares information about suspicious individuals with external organizations. It uses a generative AI model to create local crime prevention information and transmits it to external organizations. Inputs include anomaly detection data and emotion analysis data, while output is information shared with external organizations.

[0828] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing unit 12. In the data processing unit 12, the specific processing unit 290 acquires the audio data.

[0829] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.

[0830] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0831] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping. Specifically, the emotion identification model 59 may determine the user's emotion according to an emotion map (see Figure 9), which is one such mapping. Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0832] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0833] These emotions are distributed at the 3 o'clock position on the emotion map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0834] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion is located, the more visible it becomes (that is, the more it is expressed in actions).

[0835] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0836] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0837] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
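
The property that emotions located close together have similar values can be illustrated with a small sketch. It shows one possible way to build training targets that decay with distance on the map; this is an assumption for illustration and not the training procedure of the emotion identification model 59. The coordinates below are hypothetical and are not taken from Figure 9 or Figure 10.

```python
import math

# Hypothetical (x, y) positions on the emotion map; not taken from the figures.
EMOTION_POSITIONS = {
    "reassured": (0.60, 0.10),
    "calm": (0.55, 0.00),
    "confident": (0.65, 0.20),
    "fear": (-0.70, -0.50),
}


def soft_targets(label: str, scale: float = 0.3) -> dict:
    """Target emotion values that decay with map distance from the labeled emotion."""
    cx, cy = EMOTION_POSITIONS[label]
    return {
        name: math.exp(-math.hypot(x - cx, y - cy) / scale)
        for name, (x, y) in EMOTION_POSITIONS.items()
    }
```

With these positions, a sample labeled "reassured" receives a target of 1.0 for "reassured", values of roughly 0.7 for the nearby "calm" and "confident", and a value close to 0 for the distant "fear".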

[0838] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0839] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0840] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0841] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0842] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0843] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0844] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0845] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0846] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0847] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of those aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as this does not deviate from the essence of the technical aspects of this disclosure. Furthermore, to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that are not needed to enable implementation of those aspects have been omitted from the descriptions and illustrations presented above.

[0848] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0849] The following is further disclosed regarding the embodiments described above.

[0850] (Claim 1)

[0851] A means for detecting a person's face from video data using a facial recognition algorithm and matching it with a list of subjects,

[0852] A means for detecting abnormalities in windows and doors using vibration sensors and opening / closing sensors,

[0853] A means of analyzing a person's behavior and facial expressions to assess the degree of risk, and providing notification based on the assessment results,

[0854] A system that includes means for an information processing device to share information about suspicious individuals with multiple external organizations.

[0855] (Claim 2)

[0856] The system according to claim 1, which detects anomalies in real time and issues an alarm based on video data and sensor data.

[0857] (Claim 3)

[0858] The system according to claim 1, which generates regional warning information using a generative model and shares that information with external organizations.
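As an illustration only, the matching of a detected face with a list of subjects could be sketched as a comparison of feature vectors. The disclosure does not specify the facial recognition algorithm, so the embedding vectors, the cosine-similarity measure, the 0.8 threshold, and the names match_subject and subject_list below are all assumptions introduced for this example.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_subject(
    embedding: Sequence[float],
    subject_list: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # hypothetical value, not taken from the disclosure
) -> Optional[str]:
    """Return the ID of the most similar subject, or None when no entry
    reaches the similarity threshold."""
    best_id, best_score = None, threshold
    for subject_id, reference in subject_list.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_id, best_score = subject_id, score
    return best_id


# Hypothetical subject list: ID -> reference feature vector.
subject_list = {
    "subject-001": [0.9, 0.1, 0.0],
    "subject-002": [0.0, 1.0, 0.0],
}

known = match_subject([0.88, 0.12, 0.01], subject_list)
unknown = match_subject([0.5, 0.5, 0.7], subject_list)
```

In this sketch a face that matches no entry yields None, so a later step can treat listed and unlisted persons differently.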

[0859] "Example 1"

[0860] (Claim 1)

[0861] A means of extracting an individual's face from video information using video recognition technology and matching it with a registered list,

[0862] A means for detecting abnormalities in the opening using a physical change detection device,

[0863] A means of analyzing human movements and facial expressions to quantify risk and generating notifications based on those numerical results,

[0864] A means by which information processing equipment shares information about suspicious individuals with multiple external organizations,

[0865] A system that includes means for transmitting data in real time.

[0866] (Claim 2)

[0867] The system according to claim 1, which immediately identifies an anomaly and generates an alarm based on data.

[0868] (Claim 3)

[0869] The system according to claim 1, which generates area alert information using a generation device and shares that information with an external organization.
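The quantification of risk and the generation of a notification from the numerical result could look like the following minimal sketch. The input features, the weights, the 120-second loitering cap, and the two thresholds are hypothetical choices made for illustration; the disclosure does not state how the risk value is computed.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    loitering_seconds: float  # time the person stayed near the opening
    on_registered_list: bool  # result of the face-matching step
    expression_score: float   # 0.0 (calm) to 1.0 (agitated), from an upstream model


def risk_score(obs: Observation) -> float:
    """Combine the observations into one risk value in the range 0.0 to 1.0.
    The weights are illustrative assumptions."""
    loiter = min(max(obs.loitering_seconds, 0.0) / 120.0, 1.0)
    listed = 1.0 if obs.on_registered_list else 0.0
    expression = min(max(obs.expression_score, 0.0), 1.0)
    return round(0.4 * loiter + 0.4 * listed + 0.2 * expression, 3)


def notification_level(score: float) -> str:
    """Map the numerical risk to a notification level (thresholds assumed)."""
    if score >= 0.7:
        return "alarm"
    if score >= 0.4:
        return "caution"
    return "none"


high = notification_level(risk_score(Observation(120.0, True, 0.5)))
low = notification_level(risk_score(Observation(30.0, False, 0.2)))
```

Keeping the score and the level as separate functions lets the thresholds be tuned without touching the scoring itself.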

[0870] "Application Example 1"

[0871] (Claim 1)

[0872] A processing device that uses a facial recognition algorithm to detect human features from video data and compares them with a subject list,

[0873] A device that monitors abnormalities in entrances and windows using a vibration detection unit and an opening / closing detection unit,

[0874] A system that analyzes a person's behavior and facial expressions to quantify the degree of danger and issues a warning based on that evaluation,

[0875] A means by which the information processing unit shares information about abnormal individuals with multiple external organizations,

[0876] A means of transmitting a warning to a mobile communication device when an anomaly is detected,

[0877] A system that includes this.

[0878] (Claim 2)

[0879] The system according to claim 1, which identifies and notifies of anomalies in real time based on video data and detection data.

[0880] (Claim 3)

[0881] The system according to claim 1, which uses a generative model to generate regional alert information and shares that data with external organizations.
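The monitoring of entrances and windows by the vibration detection unit and the opening / closing detection unit, followed by a warning for a mobile communication device, might be sketched as follows. The reading format, the vibration limit of 2.5 (arbitrary units), and the "armed" flag are assumptions of this example, and the warning is represented only as a message string rather than an actual transmission.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SensorReading:
    location: str     # e.g. "front door"
    vibration: float  # amplitude from the vibration detection unit (arbitrary units)
    is_open: bool     # state from the opening / closing detection unit


def detect_anomaly(
    reading: SensorReading, armed: bool, vibration_limit: float = 2.5
) -> Optional[str]:
    """Return a warning message for one reading, or None if it looks normal.
    The limit and the armed/disarmed rule are hypothetical."""
    if reading.vibration > vibration_limit:
        return f"Strong vibration at {reading.location}"
    if armed and reading.is_open:
        return f"{reading.location} opened while armed"
    return None


def warnings_for(readings: List[SensorReading], armed: bool) -> List[str]:
    """Collect the warnings that would be sent to the mobile device."""
    messages = []
    for reading in readings:
        message = detect_anomaly(reading, armed)
        if message is not None:
            messages.append(message)
    return messages


readings = [
    SensorReading("front door", 3.1, False),
    SensorReading("living-room window", 0.2, True),
    SensorReading("back door", 0.1, False),
]
armed_warnings = warnings_for(readings, armed=True)
disarmed_warnings = warnings_for(readings, armed=False)
```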

[0882] "Example 2 of combining an emotion engine"

[0883] (Claim 1)

[0884] A means of recognizing a human face from video information using video processing technology and comparing it with registered information,

[0885] A means for detecting abnormalities in an opening using a vibration detection device and an opening detection device,

[0886] A means of analyzing a person's emotional state to assess the urgency and issuing an alarm based on the assessment results,

[0887] A system that includes means for an information processing device to share abnormal information, including emotional information, with multiple external organizations.

[0888] (Claim 2)

[0889] The system according to claim 1, which detects anomalies in real time and issues an alarm based on video information and information from a detection device.

[0890] (Claim 3)

[0891] The system according to claim 1, which generates environmental warning information using a generation system and shares that information with an external organization.
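Sharing abnormal information, including emotional information, with multiple external organizations could be sketched as a simple fan-out of one serialized report. The report fields, the organization names, and the in-memory "senders" below are hypothetical stand-ins; the disclosure does not define a message format or a transport.

```python
import json
from typing import Callable, Dict, List

# A sender delivers one serialized report to one organization.
Sender = Callable[[str], None]


def build_report(event_id: str, location: str, risk: float, emotion: str) -> str:
    """Serialize the abnormality report, including the emotional information."""
    return json.dumps(
        {"event_id": event_id, "location": location, "risk": risk, "emotion": emotion},
        sort_keys=True,
    )


def share_report(report: str, recipients: Dict[str, Sender]) -> List[str]:
    """Send the report to every organization and return the names that
    accepted it. A failure at one organization does not block the others."""
    delivered = []
    for name, send in recipients.items():
        try:
            send(report)
            delivered.append(name)
        except Exception:
            continue
    return delivered


def unreachable(report: str) -> None:
    """Simulates an organization whose endpoint is down."""
    raise ConnectionError("endpoint unavailable")


# In-memory outboxes stand in for real network endpoints.
outbox: Dict[str, List[str]] = {"police": [], "security_company": []}
recipients: Dict[str, Sender] = {
    "police": outbox["police"].append,
    "neighborhood_association": unreachable,
    "security_company": outbox["security_company"].append,
}

report = build_report("evt-001", "front door", 0.9, "agitated")
delivered = share_report(report, recipients)
```

Returning the list of successful recipients makes it possible to retry only the organizations that could not be reached.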

[0892] "Application example 2 when combining with an emotional engine"

[0893] (Claim 1)

[0894] A means for detecting a person's face from image data using facial recognition technology and matching it with a registered list of subjects,

[0895] A means for detecting an abnormality in an opening using a vibration detector and an opening / closing detector,

[0896] A means for analyzing a person's behavior and facial expressions to evaluate their emotional state and adjusting alarms based on the evaluation results,

[0897] A means of sending warnings to the user's mobile device in real time and providing notification content tailored to the user's emotional state,

[0898] A system that includes means for an information processing device to share information about suspicious individuals with multiple external organizations.
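The adjustment of alarms and of notification content to the user's emotional state, as listed in the means above, could be sketched as below. The emotion labels, message templates, and volume values are invented for this example; the disclosure does not say how the emotional state is represented or how the content is varied.

```python
from typing import Dict

# Hypothetical message templates keyed by the user's estimated emotional state.
TEMPLATES: Dict[str, str] = {
    "anxious": "{event}. No action is needed from you right now.",
    "calm": "{event}. Please check the camera feed when convenient.",
}
DEFAULT_TEMPLATE = "{event}."

# Hypothetical alarm volume (0 to 10) per emotional state.
ALARM_VOLUME: Dict[str, int] = {"anxious": 4, "calm": 8}
DEFAULT_VOLUME = 6


def tailor_notification(event: str, user_emotion: str) -> str:
    """Build the notification text for the user's mobile device."""
    return TEMPLATES.get(user_emotion, DEFAULT_TEMPLATE).format(event=event)


def alarm_volume(user_emotion: str) -> int:
    """Pick an alarm volume suited to the user's emotional state."""
    return ALARM_VOLUME.get(user_emotion, DEFAULT_VOLUME)


message = tailor_notification("Window opened at 02:14", "anxious")
volume = alarm_volume("anxious")
```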

[0899] (Claim 2)

[0900] The system according to claim 1, which immediately detects an anomaly and issues a warning based on image data and detector data.

[0901] (Claim 3)

[0902] The system according to claim 1, which generates area alert information using generation technology and shares that information with an external organization.

[Explanation of Symbols]

[0903]
10, 210, 310, 410 Data Processing Systems
12 Data Processing Devices
14 Smart Devices
214 Smart Glasses
314 Headset-type terminal
414 Robots

Claims

1. A processing device that uses a facial recognition algorithm to detect human features from video data and compares them with a subject list,
A device that monitors abnormalities in entrances and windows using a vibration detection unit and an opening / closing detection unit,
A system that analyzes a person's behavior and facial expressions to quantify the degree of danger and issues a warning based on that evaluation,
A means by which the information processing unit shares information about abnormal individuals with multiple external organizations,
A means of transmitting a warning to a mobile communication device when an anomaly is detected,
A system that includes this.

2. The system according to claim 1, which identifies and notifies of anomalies in real time based on video data and detection data.

3. The system according to claim 1, which generates regional alert information using a generative model and shares that data with external organizations.