system

The security system leverages generative AI and robot control for efficient and immediate threat detection and response, addressing the limitations of conventional systems by continuously optimizing its performance.

JP2026103510A · Pending · Publication Date: 2026-06-24 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-12
Publication Date
2026-06-24

AI Technical Summary

Technical Problem

Conventional security systems in residential areas respond too slowly and detect threats incompletely, which leads to higher security costs and risks.

Method used

A security system utilizing generative artificial intelligence for anomaly detection, robot control, and continuous optimization, which includes sensor devices for data collection, anomaly detection, risk assessment, alert generation, and robot patrol activities to enhance security efficiency and effectiveness.

Benefits of technology

The system provides rapid and accurate anomaly detection, enabling immediate alerts and autonomous robot patrols, thereby enhancing safety and reducing response time, while continuously optimizing its performance through machine learning.

✦ Generated by Eureka AI based on patent content.

Figure 2026103510000001_ABST

Abstract

A system is provided. [Solution] The system includes: a means of collecting surrounding information using various detection devices; an automated learning model means for analyzing that information and detecting anomalies; a means for performing an evaluation based on the anomaly detection and generating an alarm; a means of controlling mobile devices to patrol the site; a means of evaluating and optimizing the operation record of the entire system; and a means of enabling real-time confirmation of the anomaly from a mobile device.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, the method including the steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of the chatbot's character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance that responds to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] Crime in modern residential areas is on the rise, and criminal methods are becoming more sophisticated and vicious. As a result, the security costs that residents bear in order to live safely are also increasing. Conventional security methods cannot ensure sufficient security because they lack immediacy and detect threats incompletely. Solving this problem requires more efficient and effective security solutions.

Means for Solving the Problems

[0005] This invention provides a means of collecting ambient data using various sensor devices and of detecting anomalies by introducing generative artificial intelligence that analyzes this data. Furthermore, it evaluates risks based on anomaly detection and automatically generates alerts. The invention aims to realize a security system that balances safety and efficiency by incorporating a means of controlling robots to conduct patrol activities for on-site situation confirmation, and a means of analyzing the operation logs of the entire system and continuously optimizing it.

[0006] A "sensor device" is a device used to acquire data such as sounds and images from the environment, and is used to collect the data that a system needs.

[0007] "Generative artificial intelligence means" refers to a part of a system that includes a process for analyzing collected data, detecting anomalies and patterns using machine learning techniques, and assessing risks.

[0008] "Anomaly detection" refers to the function of identifying potential threats and risks by recognizing patterns or behaviors that deviate from normal conditions.

[0009] "Risk assessment" is the process of determining the impact and urgency of detected anomalies and deciding on appropriate countermeasures.

[0010] "Alert generation" refers to a function that notifies pre-designated stakeholders of a warning when the system detects an abnormal or dangerous situation.

[0011] "Robot control" is the technology of remotely or autonomously operating robots and sending commands to perform specified tasks.

[0012] "Patrol activities" refer to actions in which a robot moves around a specific area, monitoring it and checking for any abnormalities.

[0013] An "operation log" is a collection of information that maintains the overall operating status and events of the system, and is used for later analysis and improvement.

[0014] "Optimization" refers to the process of adjusting operational parameters and configurations to improve the efficiency and effectiveness of a system or process.

Brief Description of the Drawings

[0015]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the essential functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map on which multiple emotions are mapped.
[Figure 10] An emotion map on which multiple emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of a data processing system in Embodiment 2 when a sentiment engine is combined.
[Figure 14] A sequence diagram showing the processing flow of a data processing system in Application Example 2 when a sentiment engine is combined.

Mode for Carrying Out the Invention

[0016] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0017] First, the terms used in the following description will be explained.

[0018] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.

[0019] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory that temporarily stores information and is used as a work memory by the processor.

[0020] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0021] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0022] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."

[0023] [First Embodiment]

[0024] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0025] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0026] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0027] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0028] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0029] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0030] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0031] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0032] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0033] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0034] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0035] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0036] The security system according to the present invention is a highly efficient security solution for houses and apartments, implemented with a configuration that includes various sensor devices, a generative AI, and a controlled robot. A specific embodiment is described below.

[0037] Data collection

[0038] The server continuously collects audio and image data from sensor devices placed around houses and apartment buildings. These include surveillance cameras, microphones, and infrared sensors. Because the collected data is difficult to handle directly, the server performs pre-processing, such as noise removal.

[0039] Data analysis and anomaly detection

[0040] The server sends pre-processed data to a generative AI model for analysis. The generative AI uses machine learning algorithms to analyze the data and detect abnormal changes or patterns from the normal state.
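
The idea of detecting "abnormal changes or patterns from the normal state" can be illustrated with a minimal sketch. The source assigns this step to a generative AI model; the block below swaps in a plain z-score test against a baseline of normal readings purely for illustration. The function name `detect_anomalies`, the threshold of 3.0, and the sample sound levels are all assumptions, not part of the disclosure.

```python
from statistics import mean, stdev


def detect_anomalies(baseline, readings, z_threshold=3.0):
    """Return (index, value, z_score) for readings far from the baseline.

    `baseline` holds readings recorded in the normal state; a reading is
    flagged when it lies more than `z_threshold` standard deviations away.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; cannot compute z-scores")
    flagged = []
    for i, value in enumerate(readings):
        z = (value - mu) / sigma
        if abs(z) > z_threshold:
            flagged.append((i, value, z))
    return flagged


# Hypothetical sound levels in dB: a quiet baseline, then one loud event.
baseline_db = [31.0, 30.5, 32.0, 31.5, 30.0, 31.0, 32.5, 30.5]
night_db = [31.0, 30.0, 58.0, 31.5]
anomalies = detect_anomalies(baseline_db, night_db)
```

Here only the third night-time reading is flagged. A learned model would replace the fixed statistical rule, but the contract is the same: preprocessed data in, deviations from normal out.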

[0041] Risk assessment and alert generation

[0042] If an anomaly is detected, the server assesses the risk based on feedback from the AI. Based on the risk assessment, the server generates an alert notifying the user or security company of the anomaly. This facilitates immediate response.

[0043] Robot control and patrol activities

[0044] After detecting an anomaly, the server instructs the robot to patrol the site. The robot autonomously moves along the designated route, monitoring its surroundings using sensors. During this process, any newly acquired data is sent back to the server and re-analyzed by the generative AI as needed.

[0045] Continuous optimization

[0046] The server continuously records system-wide operation logs and analyzes them for further improvement. The generative AI continues machine learning based on new information and detected patterns, evolving its model to improve anomaly detection performance in subsequent iterations.

[0047] Specific example: If a garden sensor detects an unusual sound late at night, the server sends the audio data to the generative AI, which judges that it indicates a suspicious person. The user is immediately notified with an alert, and a robot is instructed to patrol the garden for further investigation. Real-time information allows quick and appropriate countermeasures to be taken. This entire process is recorded as a system operation log and used for future optimization.

[0048] Thus, the present invention realizes a highly efficient and accurate security system by combining a sensor device, a generative AI, and a robot.

[0049] The following describes the processing flow.

[0050] Step 1:

[0051] The server collects audio and image data in real time from sensors installed outside and inside houses and apartments. This includes surveillance cameras, microphones, and infrared sensors, and the collected data undergoes initial processing such as noise reduction and data format standardization.

[0052] Step 2:

[0053] The server sends the pre-processed data to the generative AI model. The generative AI model analyzes the data and executes speech recognition and image recognition algorithms to identify abnormal activity and sounds. This analysis detects any patterns that deviate from the normal state.

[0054] Step 3:

[0055] The generative AI model determines the presence or absence of anomalies based on the analysis results and performs a risk assessment. In this risk assessment, the degree of impact caused by the anomalies is calculated based on the type and frequency of the detected anomalies.
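
As a rough illustration of calculating impact "based on the type and frequency of the detected anomalies," the sketch below weights each anomaly type and multiplies by how often it occurred. The severity weights, the score cut-offs, and the level names are hypothetical values chosen for the example; the source does not specify them.

```python
from collections import Counter

# Hypothetical severity weight per anomaly type (not from the source).
SEVERITY = {
    "glass_break": 5,
    "unknown_person": 4,
    "unusual_sound": 2,
    "motion": 1,
}


def assess_risk(events):
    """Return (score, level) for a list of detected anomaly type names.

    The score is the sum over types of severity * frequency; unknown
    types fall back to a weight of 1.
    """
    counts = Counter(events)
    score = sum(SEVERITY.get(kind, 1) * n for kind, n in counts.items())
    if score >= 8:
        level = "high"
    elif score >= 4:
        level = "medium"
    else:
        level = "low"
    return score, level
```

With these assumed weights, a single unusual sound scores low, a repeated one rises to medium, and an unknown person combined with breaking glass scores high.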

[0056] Step 4:

[0057] Based on feedback from the generative AI model, the server generates an alert if an anomaly is detected. The alert includes information such as the time, location, and risk level of the anomaly, and is immediately sent to users and security personnel.
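
One possible shape for such an alert is sketched below: a small record carrying the time, location, and risk level named above, serialized to JSON for delivery. The field names, the extra `summary` field, and the JSON encoding are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Alert:
    detected_at: str  # ISO 8601 timestamp of the anomaly
    location: str     # where the anomaly was detected
    risk_level: str   # e.g. "low", "medium", "high" (assumed labels)
    summary: str      # short human-readable description (assumed field)


def build_alert(location, risk_level, summary, now=None):
    """Create an alert stamped with the current UTC time unless `now` is given."""
    now = now or datetime.now(timezone.utc)
    return Alert(now.isoformat(), location, risk_level, summary)


def to_payload(alert):
    """Serialize the alert to a JSON string for sending to a terminal."""
    return json.dumps(asdict(alert), sort_keys=True)
```

Passing `now` explicitly keeps the function testable; in operation it would default to the time the server raised the alert.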

[0058] Step 5:

[0059] The server instructs the robot to patrol the site as needed. The instructed robot then patrols a pre-set path, collecting further data using sensors and cameras, and transmitting it to the server in real time.
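
A patrol along a pre-set path could be planned as in the sketch below, which treats the path as a closed loop of waypoints and starts the loop at the waypoint nearest the reported anomaly. The coordinates, the `plan_patrol` name, and the nearest-waypoint rule are illustrative assumptions; the source only states that the robot follows a pre-set path.

```python
import math

# Hypothetical pre-set loop around a property, in metres.
PRESET_ROUTE = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]


def plan_patrol(route, anomaly_xy):
    """Rotate the closed loop so it starts at the waypoint nearest the anomaly."""
    if not route:
        return []
    start = min(range(len(route)), key=lambda i: math.dist(route[i], anomaly_xy))
    return route[start:] + route[:start]


def route_length(route):
    """Total length of the closed loop through all waypoints."""
    return sum(
        math.dist(route[i], route[(i + 1) % len(route)])
        for i in range(len(route))
    )
```

The waypoint order within the loop is unchanged, so the robot still covers the whole pre-set path; only the starting point depends on where the anomaly was reported.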

[0060] Step 6:

[0061] The server re-evaluates the new data sent from the robot and reassesses the overall situation. Additional alerts are generated and further risk assessments are made as needed.

[0062] Step 7:

[0063] The server records system-wide operation logs in a database, which are then used for subsequent analysis and system optimization. Furthermore, the generative AI model uses these operation logs to improve accuracy through machine learning and enhance its threat recognition capabilities.
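
Recording operation logs in a database might look like the following sketch, which uses SQLite from the Python standard library. The table name, column layout, and step labels are assumptions; the source does not describe the schema of the database 24.

```python
import json
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the operation log; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS operation_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts TEXT NOT NULL, step TEXT NOT NULL, detail TEXT NOT NULL)"
    )
    return conn


def record(conn, ts, step, detail):
    """Append one event; `detail` is any JSON-serializable value."""
    conn.execute(
        "INSERT INTO operation_log (ts, step, detail) VALUES (?, ?, ?)",
        (ts, step, json.dumps(detail)),
    )
    conn.commit()


def steps_between(conn, start_ts, end_ts):
    """Return step names logged in [start_ts, end_ts], in insertion order."""
    rows = conn.execute(
        "SELECT step FROM operation_log WHERE ts BETWEEN ? AND ? ORDER BY id",
        (start_ts, end_ts),
    )
    return [row[0] for row in rows]
```

Timestamps are stored as ISO 8601 strings so that plain string comparison in the query matches chronological order.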

[0064] This processing flow allows the system to quickly and efficiently enhance the security of homes and apartments, providing residents with a sense of safety.

[0065] (Example 1)

[0066] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0067] In recent years, security problems such as crime and break-ins have been increasing, particularly in urban areas. Conventional security systems can take a long time to detect and respond to anomalies, which highlights the need for both immediacy and accuracy. The present invention aims to provide a security system that enables advanced anomaly detection and rapid response.

[0068] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0069] In this invention, the server includes means for collecting surrounding information using a remote information acquisition device, means for pre-processing the information, transmitting it to a generative artificial intelligence for analysis, and detecting anomalies, and means for evaluating the degree of risk based on the anomaly detection and creating a notification. This enables rapid and effective anomaly detection and response.

[0070] A "remote information acquisition device" is a device that senses surrounding information such as sound, video, and motion, and provides the data to a server.

[0071] "Preprocessing" is the process of removing unwanted noise and interference from collected information, preparing it for analysis.

[0072] "Generative artificial intelligence" is a type of artificial intelligence that uses machine learning algorithms to analyze input data and perform anomaly detection and pattern recognition.

[0073] "Assessing the degree of risk" means determining the risk level of an event based on the circumstances of its occurrence and its likelihood.

[0074] An "autonomous device" is a robot that automatically moves along a designated path based on programmed commands while monitoring its surroundings.

[0075] "Operational records" refer to data that records the system's operation history and processing results, and are used for later analysis.

[0076] "Continuous optimization" refers to the activity of progressively improving the system's performance and processing accuracy based on collected operational records.

[0077] To implement this invention, a sensor device, a server, a generative AI model, and an autonomous robot device are used.

[0078] Data collection:

[0079] The server collects data from sensor devices installed in houses and apartments. These include microphones to capture audio data, surveillance cameras to capture images, and infrared sensors to detect motion. These devices sense information about the surroundings and transmit the data to the server.

[0080] Data preprocessing and analysis:

[0081] The server performs preprocessing on the collected data, such as noise reduction and data normalization, and then sends it to the generative AI model. The generative AI model analyzes the collected data using machine learning algorithms and detects anomalies. This analysis identifies events that deviate from normal patterns.

[0082] Risk assessment and notification:

[0083] The server assesses the risk based on the results of anomaly detection. If the risk level is determined to be high, the server immediately generates a notification and issues an alert to the user or security company.

[0084] Robot patrol:

[0085] When an anomaly is detected, the server instructs the autonomous robot to patrol. The robot automatically moves within the designated area and acquires more detailed data. The acquired data is then sent back to the server for further analysis.

[0086] Continuous optimization:

[0087] The server records all operations that occur within the system and uses this record to optimize the entire system. The generative AI model continuously learns from this data to improve performance.

[0088] For example, if a suspicious sound occurs in the garden at night, the sound is picked up by a microphone. The server sends the audio data to the generative AI model, which analyzes it for anomalies. If a potential intruder is detected, the server sends an alert notification to the user and instructs a robot to patrol the area. This process enables a quick and effective response.

[0089] As an example of a prompt, the generative AI model is given instructions such as, "Analyze data on unusual noises detected in the garden late at night and identify suspicious activity." This allows the generative AI model to perform the appropriate analysis.
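
A prompt of this kind could be assembled from the detection context, as in the sketch below. The template mirrors the example wording quoted above; the parameter names and the appended observation list are assumptions added for illustration, and no particular model API is implied.

```python
PROMPT_TEMPLATE = (
    "Analyze data on {event} detected in the {location} {time_of_day} "
    "and identify suspicious activity."
)


def build_prompt(event, location, time_of_day, observations):
    """Fill the template and append one bullet per preprocessed observation."""
    header = PROMPT_TEMPLATE.format(
        event=event, location=location, time_of_day=time_of_day
    )
    lines = [f"- {obs}" for obs in observations]
    return "\n".join([header, "Observations:", *lines])


prompt = build_prompt(
    "unusual noises",
    "garden",
    "late at night",
    ["02:31 impact sound, 58 dB", "02:32 motion near fence"],
)
```

Keeping the instruction in a template makes it straightforward to reuse the same wording for other locations or event types.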

[0090] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0091] Step 1:

[0092] Data collection

[0093] The server continuously collects ambient audio and image data from sensor devices installed in houses and apartments. Examples of sensor devices include microphones for audio acquisition and surveillance cameras for image acquisition. The input data consists of audio and image data, including noise. This data is transmitted to the server via the network.

[0094] Step 2:

[0095] Data preprocessing

[0096] The server performs noise reduction and data normalization on the collected audio and image data. For audio data, background noise is reduced and important audio signals are extracted. For image data, images are filtered to prevent unwanted light reflections and false detections. As a result of this preprocessing, clear audio and image data suitable for analysis is obtained.
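
The two preprocessing operations named here, noise reduction and normalization, can be sketched for a one-dimensional audio signal as follows. A trailing moving average stands in for noise reduction and peak scaling for normalization; both are simple stand-ins chosen for the example, not the methods used in the disclosure.

```python
def moving_average(samples, window=3):
    """Smooth a signal with a trailing moving average (a basic noise filter)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - window + 1)
        chunk = samples[lo:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def peak_normalize(samples):
    """Scale samples so the largest absolute value becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)  # silent or empty signal: nothing to scale
    return [s / peak for s in samples]


def preprocess(samples, window=3):
    """Noise reduction followed by normalization, in the order described above."""
    return peak_normalize(moving_average(samples, window))
```

Normalizing after smoothing means every sensor's output lands in the same range, which keeps downstream analysis independent of microphone gain.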

[0097] Step 3:

[0098] Data analysis and anomaly detection

[0099] The server sends the preprocessed data to the generative AI model together with a prompt instructing it to detect abnormal patterns. The generative AI model uses machine learning algorithms to identify deviations from normal patterns. The input is the filtered, clear audio and image data, and the output is information about the presence or absence of anomalies and their patterns.

[0100] Step 4:

[0101] Risk assessment and notification generation

[0102] The server performs a risk assessment based on anomaly detection information from the generative AI model. Depending on the level of risk, the server creates and sends alert notifications to users and security companies in real time. The input is anomaly detection information, and the output is the content and priority of the notification.
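
The mapping from risk level to notification content and priority might be sketched as below. The three level names, the numeric priorities, and the rule that only high-risk events are escalated to the security company are assumptions for illustration; the source leaves the policy open.

```python
# Hypothetical priority per risk level; 1 is the most urgent.
PRIORITY = {"low": 3, "medium": 2, "high": 1}


def route_notification(risk_level, anomaly):
    """Return the notification content, priority, and recipients for an anomaly."""
    if risk_level not in PRIORITY:
        raise ValueError(f"unknown risk level: {risk_level!r}")
    recipients = ["user"]
    if risk_level == "high":
        # Assumed escalation rule: involve the security company only at high risk.
        recipients.append("security_company")
    return {
        "priority": PRIORITY[risk_level],
        "recipients": recipients,
        "content": f"{anomaly} (risk: {risk_level})",
    }
```

For example, a high-risk "possible intruder in garden" event yields priority 1 and is addressed to both the user and the security company.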

[0103] Step 5:

[0104] Instructions for robot patrol

[0105] When the server detects an anomaly, it instructs the autonomous robot device to patrol the site. The robot automatically moves along the designated path and collects additional sensor data. This data is sent back to the server and re-analyzed by the generative AI model as needed. The inputs are the patrol instructions and sensor data, and the output is additional monitoring data.

[0106] Step 6:

[0107] Continuous optimization of the system

[0108] The server continuously records the operation logs of the entire system and uses them to train the generative AI model. This improves the anomaly detection capability in subsequent runs and optimizes the entire system. The input is the system's operation history, and the output is the performance of the improved generative AI model.
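
The source describes retraining the generative AI model on operation logs. As a much simpler stand-in for that feedback loop, the sketch below tunes a single detection threshold from logged alert outcomes. The outcome labels, the step size, the target false-alarm rate, and the bounds are all hypothetical.

```python
def tune_threshold(threshold, outcomes, step=0.25,
                   max_false_rate=0.2, lower=1.0, upper=6.0):
    """Adjust a detection threshold from logged outcomes.

    `outcomes` is a list of "true_alarm", "false_alarm", or "missed" labels
    taken from the operation log (assumed to be user-confirmed).
    """
    if not outcomes:
        return threshold
    false_rate = outcomes.count("false_alarm") / len(outcomes)
    if "missed" in outcomes:
        threshold -= step  # a missed event: become more sensitive
    elif false_rate > max_false_rate:
        threshold += step  # too many false alarms: become less sensitive
    return min(upper, max(lower, threshold))
```

Missed events take precedence over false alarms in this sketch, on the assumption that failing to detect a threat costs more than an unnecessary alert.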

[0109] (Application Example 1)

[0110] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0111] Existing security systems face challenges in detecting anomalies early and responding quickly. In particular, in residential and apartment buildings, flexible responses tailored to the specific conditions and environments of individual locations are required, but conventional sensors and monitoring technologies have limitations. Furthermore, there are insufficient means for users to check the status of their homes in real time and take appropriate action while away.

[0112] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0113] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information using an automated learning model to detect anomalies, and means for enabling real-time confirmation of anomalies from a mobile terminal. This makes it possible for users to quickly grasp anomalies no matter where they are and take necessary actions in real time.

[0114] "Various detection devices" refers to a diverse group of sensors installed to acquire external environmental information such as sound, video, temperature, and motion.

[0115] "Information" refers to all data collected by various detection devices, including audio data, video data, and other forms of environmental data.

[0116] An "automated learning model" is a machine learning system that includes algorithms for analyzing collected information and detecting anomalies.

[0117] An "abnormality" refers to a unique pattern or condition that deviates from the normal state, and includes situations that are predicted to pose a safety risk.

[0118] A "mobile device" is an electronic device that a user can carry with them, such as a smartphone or smart glasses.

[0119] "Means that enable the confirmation of anomalies in real time" refers to technology that allows users to instantly recognize situations occurring at their home or facility visually or audibly through their mobile devices.

[0120] The system for carrying out this invention consists of various detection devices, an automated learning model, a mobile terminal, and a server.

[0121] The server continuously collects information about the surroundings from various detection devices installed in homes and facilities. These include voice sensors, cameras, and temperature sensors, and the data obtained from these sensors is first aggregated on the server.

[0122] The server inputs the collected information into an automated learning model to detect anomalies. The automated learning model is built using software such as Python and TensorFlow (registered trademark), and it analyzes the information to identify patterns that deviate from the normal state. During this process, noise reduction and data preprocessing are performed using tools such as OpenCV.

[0123] When an anomaly is detected, the server sends a real-time notification to the mobile device, allowing the user to view live video from the scene via their smartphone or smart glasses. This enables the user to quickly grasp the details of the anomaly and immediately consider countermeasures.

[0124] For example, if suspicious activity is detected in a residential yard late at night, the server will determine this to be an anomaly and send a notification to the user's mobile device. The user can then view the real-time video through smart glasses to check for any suspicious individuals.

[0125] An example of a prompt message is, "An unusual sound has been detected in the garden of a house. Analyze the camera footage and identify the anomaly." This instruction is input into the automated learning model, supporting appropriate actions. This system allows users to live with peace of mind.

[0126] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0127] Step 1:

[0128] The server continuously collects information about the surroundings from various detection devices. This includes data from sound sensors, cameras, and other sources. The collected data is raw and requires noise reduction and formatting.

[0129] Step 2:

[0130] The server performs noise reduction and other necessary preprocessing on the collected information. This involves using image processing libraries such as OpenCV to reduce noise in video data and normalizing the sampling of audio data. This process improves data accuracy and facilitates analysis in the next step.

[0131] Step 3:

[0132] The server inputs the preprocessed information into the automated learning model and performs anomaly detection. The generative AI model is built using TensorFlow and identifies situations that deviate from normal by analyzing information patterns. Here, it is instructed to generate an alarm based on conditions that are judged to be anomalies.

[0133] Step 4:

[0134] If the server detects an anomaly, it immediately sends an alert to the mobile device. Upon receiving this alert, the user's device can request and view a real-time video feed from the site. The video data is sent to the device using streaming technology.

[0135] Step 5:

[0136] Users check the situation on-site via their mobile devices or smart glasses and carefully examine the content of alerts. Based on the confirmed information, they decide on appropriate actions. For example, after confirming an anomaly, they might contact the security company directly.

[0137] Step 6:

[0138] The server collects operation logs for each processing step and analyzes them to improve the entire system. The system's operational data is used to improve the accuracy of the automated learning model, enabling continuous optimization.

[0139] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the identification processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform identification processing using the user's emotions.

[0140] The system according to the present invention combines various sensor devices, a generative AI, and an emotion engine to significantly enhance the security of houses and apartments and enable interaction with users. The embodiments thereof are described in detail below.

[0141] Data collection and analysis

[0142] The server continuously collects audio and image data from sensor devices installed in houses and apartments. This includes cameras, microphones, and temperature sensors. The collected data is preprocessed in real time and sent to a generative AI model. The generative AI model analyzes this data and performs speech and image recognition to determine if there are any unusual sounds or movements.

[0143] Anomaly detection and risk assessment

[0144] The server assesses the risk of the current situation based on anomalies detected by the generative AI model. Depending on the assessed risk, an alert is triggered according to the type and frequency of the anomaly and the security policy. Users and security personnel are notified of the alert immediately.
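One way to combine anomaly type, anomaly frequency, and a security policy is sketched below. The policy contents (type weights, time window, alert threshold) and all names are hypothetical; the specification does not state how these factors are weighted.

```python
# Hypothetical policy: per-type weights, the look-back window, and the score that triggers an alert.
POLICY = {
    "weights": {"glass_break": 5, "unknown_person": 4, "loud_noise": 2},
    "window_seconds": 300,
    "alert_threshold": 6,
}


def risk_score(events, now, policy=POLICY):
    """Sum the weights of anomalies seen within the policy window ending at `now`.

    `events` is a list of (timestamp_seconds, anomaly_type) tuples; unknown types weigh 1.
    """
    start = now - policy["window_seconds"]
    return sum(
        policy["weights"].get(kind, 1)
        for ts, kind in events
        if start <= ts <= now
    )


def should_alert(events, now, policy=POLICY):
    """Apply the policy threshold to the windowed score."""
    return risk_score(events, now, policy) >= policy["alert_threshold"]
```

Under this sketch a single loud noise does not raise an alert, but repeated noises followed by an unknown person within five minutes do, which reflects the role of frequency in the assessment.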

[0145] Robot control

[0146] If an anomaly is detected, the server issues a command to the robot to patrol the site. The robot autonomously moves along the designated route, continuously collecting data from its surroundings and providing real-time feedback to the server. This allows for seamless on-site verification and response to anomalies.
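The command-and-feedback loop between server and robot can be sketched as follows. The class names, the waypoint representation, and the `sense` callback are assumptions made for illustration; actual motion control and sensing are outside the scope of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class PatrolCommand:
    robot_id: str
    route: list  # ordered waypoint names, e.g. ["entrance", "corridor"]


@dataclass
class PatrolRobot:
    robot_id: str
    reports: list = field(default_factory=list)

    def execute(self, command, sense):
        """Visit each waypoint in order and record what `sense(waypoint)` returns."""
        for waypoint in command.route:
            self.reports.append(
                {"robot": self.robot_id, "at": waypoint, "reading": sense(waypoint)}
            )
        return self.reports
```

In a real deployment each report would be sent to the server as it is produced, so that the server can re-run anomaly detection on the fresh data.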

[0147] Emotion-based interaction

[0148] When a user interacts with the system, it utilizes an emotion engine. This engine analyzes the user's emotions from their tone of voice and facial expressions, and adjusts the system's response accordingly. For example, if a user is feeling anxious, the system can raise its alert level, providing more detailed monitoring and faster informational feedback.
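The adjustment described here can be sketched as a small function that raises the alert level by one step when the emotion engine reports anxiety. The level names, the intensity scale, and the threshold are hypothetical.

```python
ALERT_LEVELS = ["normal", "elevated", "high"]


def adjust_alert_level(current, emotion, intensity, threshold=0.5):
    """Raise the alert level one step when the user appears anxious; never lower it here."""
    idx = ALERT_LEVELS.index(current)
    if emotion == "anxious" and intensity >= threshold:
        idx = min(idx + 1, len(ALERT_LEVELS) - 1)
    return ALERT_LEVELS[idx]
```

Keeping the adjustment monotonic in this sketch is a design choice: a calm user does not lower a level that was raised by sensor evidence.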

[0149] Specific example: If a user detects a suspicious noise at night, the system immediately sends the audio data to the AI to detect the anomaly. If the user tells the system via intercom that they are "anxious," the emotion engine analyzes this information, strengthens the alert mode, and prompts the robots to patrol promptly. It also provides detailed alert information to offer reassurance. In this way, the system enables advanced interaction that responds to user needs in real time.

[0150] This entire system analyzes operation logs in detail and uses the data for subsequent system evaluation and optimization. This enables the provision of continuous and effective security.

[0151] The following describes the processing flow.

[0152] Step 1:

[0153] The server collects audio and image data in real time from sensor devices placed in the surrounding area. These devices include surveillance cameras and microphones. The collected data is pre-processed, for example by noise reduction, so that changes in the environment are captured accurately.

[0154] Step 2:

[0155] Upon receiving pre-processed data, the server sends it to a generative AI model, which identifies unusual activity and sounds through speech and image recognition. The generative AI model can continuously process data to improve its accuracy by repeatedly learning from past data.

[0156] Step 3:

[0157] In response to detected anomalies, the server evaluates the alert level based on the risk assessment results calculated by the generative AI model. If the risk is determined to be high, the server immediately generates an alert and notifies users and security personnel of the relevant information.

[0158] Step 4:

[0159] When a user receives an alert and accesses the system to check the situation, the system uses its emotion engine to analyze the user's emotional state. It determines emotions from voice tone and facial expression data, and automatically raises the system's alert level if the user is feeling anxious.

[0160] Step 5:

[0161] The server instructs robots to patrol areas where anomalies have been detected. The robots move along pre-set paths, scanning their surroundings with built-in sensors and sending the latest status data to the server for further anomaly detection.

[0162] Step 6:

[0163] The server analyzes the data from the robots and reconfirms whether there are any anomalies. It issues additional alerts as needed to strengthen security.

[0164] Step 7:

[0165] The server records the operation logs collected at each processing step in a database. These logs are used to optimize the system and improve the learning accuracy of the generative AI model, resulting in continuous improvement of the entire system.
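Recording per-step operation logs in a database can be sketched with Python's built-in `sqlite3` module. The table name, column layout, and helper names are assumptions; the specification does not name a database product or schema.

```python
import json
import sqlite3
import time


def open_log_db(path=":memory:"):
    """Open (or create) the log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS operation_log ("
        "ts REAL NOT NULL, step TEXT NOT NULL, detail TEXT NOT NULL)"
    )
    return conn


def log_step(conn, step, detail, ts=None):
    """Append one record; `detail` is any JSON-serializable dict."""
    conn.execute(
        "INSERT INTO operation_log (ts, step, detail) VALUES (?, ?, ?)",
        (time.time() if ts is None else ts, step, json.dumps(detail)),
    )
    conn.commit()


def steps_logged(conn):
    """Return how many records each step has produced, for later analysis."""
    rows = conn.execute(
        "SELECT step, COUNT(*) FROM operation_log GROUP BY step ORDER BY step"
    )
    return dict(rows.fetchall())
```

Storing the step name alongside a free-form JSON detail keeps the schema stable while each processing step logs whatever fields it needs.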

[0166] Through this series of processes, the system efficiently ensures user safety and, by utilizing the emotion engine, enables fast and flexible interaction.

[0167] (Example 2)

[0168] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0169] In modern living spaces, there is a growing demand for both enhanced safety and improved convenience. In particular, there is a growing need for security systems that can quickly and accurately detect anomalies within homes and buildings and respond appropriately. However, existing systems have limitations in both anomaly detection and user response, making it difficult to provide users with a greater sense of security. Therefore, the present invention aims to provide a system that enhances anomaly detection capabilities while enabling flexible responses tailored to the user's emotions.

[0170] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0171] In this invention, the server includes means for collecting environmental data using various detection devices, means for formatting the data, transmitting it to an intelligent model for analysis and detection of anomalies, and means for analyzing the user's emotional state through audio and video and adjusting the system response. This makes it possible to improve the accuracy of anomaly detection and provide appropriate responses based on the user's emotions.

[0172] A "detection device" refers to various sensors used to collect data such as sound, images, and temperature from the environment.

[0173] An "intelligent model" refers to artificial intelligence technology that uses collected data to detect and analyze anomalies.

[0174] "Mobile units" refer to automated machines such as robots that autonomously patrol designated routes and monitor changes in the environment.

[0175] "Emotional state" refers to the psychological state of a user, as analyzed from their voice tone, facial expressions, and other factors.

[0176] "Operation records" refer to the overall operational history of the system, and serve as a source of information for optimizing the system based on this data.

[0177] "Risk assessment" is the process of analyzing the risks posed by detected anomalies and using that information to generate appropriate alarms.

[0178] An "alarm" is a warning and response instruction issued by the system in response to detected anomalies or dangers.

[0179] This invention is a system that enhances the security of homes and buildings and enables advanced user interaction. The system consists of several important elements, and its specific embodiments will be described below.

[0180] Hardware and software usage

[0181] The server is responsible for collecting data from various detection devices installed in houses and apartments. Specifically, these include cameras, microphones, and temperature sensors. These devices acquire image data, audio data, and ambient temperature data, and this information is used for anomaly detection.

[0182] The server preprocesses the collected data and sends it to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to analyze and detect abnormal sounds, movements, and temperature changes. The software employs advanced AI analysis algorithms and noise filtering tools.

[0183] Interaction and emotion analysis

[0184] When users interact directly with the system, they can communicate their state to the system through an emotion engine. The emotion engine analyzes the user's voice tone and facial expressions, and the system optimizes its response based on this analysis. This feature allows the system to respond flexibly according to the user's emotional state.

[0185] Specific example:

[0186] If a user reports hearing suspicious noises at night and expresses concern, the server sends the audio data to the generative AI to detect any abnormalities. If the audio analysis determines that the user is feeling anxious, the system raises its alert level, instructs robots to patrol, and checks the situation. The system also sends detailed monitoring information to the user via push notifications and email to provide reassurance.

[0187] Example of a prompt

[0188] "Please describe a system that detects abnormal sounds at night, analyzes the user's emotions, and then takes appropriate action."

[0189] Thus, by combining precise sensor technology, a generative AI model, and an emotion recognition engine, the system of the present invention can solve various challenges in modern residential security and provide users with peace of mind and safety.

[0190] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0191] Step 1:

[0192] The server collects environmental data from various detection devices. This includes image data from cameras, audio data from microphones, and temperature data from temperature sensors. The input data is sent to the server in its raw state and in various formats. Specifically, the camera periodically captures images of the gate and the interior of the room, the microphone records background sounds, and the temperature sensor measures the room temperature. The output indicates the completion of data collection.

[0193] Step 2:

[0194] The server preprocesses the collected raw data. Data processing is performed to remove noise, standardize data formats, and highlight important features. Specifically, background noise is removed from audio data, and the resolution of image data is adjusted. The input is the raw data collected in step 1, and the output is clear data ready for analysis.
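Standardizing data formats can be illustrated by mapping heterogeneous raw payloads onto one record type. The sketch below assumes raw payloads arrive as dictionaries with the hypothetical keys `id`, `kind`, `ts`, `value`, and an optional `unit`; none of these names come from the specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    kind: str          # "audio", "image", or "temperature"
    timestamp: float   # seconds since the Unix epoch, UTC
    value: object


def to_epoch(ts):
    """Accept either epoch seconds or an ISO 8601 string that carries a UTC offset."""
    if isinstance(ts, (int, float)):
        return float(ts)
    return datetime.fromisoformat(ts).astimezone(timezone.utc).timestamp()


def standardize(raw):
    """Map one raw payload onto a Reading with a common timestamp and unit convention."""
    value = raw["value"]
    if raw["kind"] == "temperature" and raw.get("unit") == "F":
        value = (value - 32) * 5 / 9  # store temperatures in Celsius
    return Reading(raw["id"], raw["kind"], to_epoch(raw["ts"]), value)
```

Once every reading shares the same timestamp and unit conventions, the later steps can treat camera, microphone, and temperature data uniformly.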

[0195] Step 3:

[0196] The server sends the pre-processed data to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to execute an anomaly detection algorithm. Specifically, the AI model analyzes the data and identifies abnormal movements and sounds. The input is the pre-processed data, and the output is the result of determining whether or not an anomaly exists.

[0197] Step 4:

[0198] The server assesses the risk based on anomalies detected by the AI model. It analyzes the type and frequency of anomalies and generates alarms based on security policies. Specifically, data identified as anomalous is recorded, and users are notified based on this information. The input is the anomaly detection result from the AI model, and the output is alarm information.

[0199] Step 5:

[0200] If an anomaly is detected, the server issues a patrol command to the mobile robot. The robot patrols the site according to the designated route and collects additional data. Specifically, the robot automatically moves through corridors and entrances, and feeds back newly recorded video and audio to the server in real time. The input is the patrol command from the server, and the output is the results of the site inspection.

[0201] Step 6:

[0202] When a user interacts with the system, the server uses an emotion engine to analyze this information. The emotion engine determines the user's emotional state based on their tone of voice and facial expressions, and optimizes the system's response. For example, if a user says "I'm anxious," the system analyzes this and raises the alert level. The input is the emotional feedback from the user, and the output is the adjusted system response.

[0203] (Application Example 2)

[0204] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0205] Improving the safety of homes and buildings requires rapid and accurate detection of anomalies, as well as responses suited to the user's emotions. However, conventional systems suffer from insufficient anomaly detection accuracy and poor user interaction quality, resulting in a lack of safety and convenience. The challenge is to solve this problem and provide a higher level of security and user experience.

[0206] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0207] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information and detecting anomalies using a generative electronic brain, and means for interacting with the user using emotion recognition and adjusting the response based on the user's emotions. This enables immediate detection of anomalies, risk assessment, and optimal interaction according to the user's emotions.

[0208] A "detection device" is a device that collects information such as sound, images, and temperature from the external environment.

[0209] A "generative electronic brain" is a system that uses advanced artificial intelligence technology to analyze collected information and detect anomalies.

[0210] A "means for assessing risk" refers to a function that determines the risk of the current situation based on the results of anomaly detection and generates appropriate warnings.

[0211] "Machines" refers to devices such as robots that move autonomously, collect information, and patrol areas with abnormalities.

[0212] "Emotion recognition" is a technology that analyzes a user's emotions from their voice and facial expressions, making it possible to adjust the system's operation based on the user's emotions.

[0213] "Operation records" refer to logs that record the overall operating status and processing details of the system in chronological order, and analyzing these logs provides information that can be used to optimize the system.

[0214] The system for carrying out this invention comprises various detection devices, a generative electronic brain, and an emotion recognition function, and is designed to enhance the safety of homes and buildings. Specific embodiments are shown below.

[0215] First, the server continuously collects information about the surroundings through various detection devices such as cameras, microphones, and temperature sensors. This data is processed in real time and sent to a cloud-based server. Services such as AWS Lambda and Google Cloud Functions are used for this process.
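As an illustration of the serverless option mentioned above, the sketch below shows an ingestion function in the `handler(event, context)` form that AWS Lambda uses for Python. Only that calling convention is taken from AWS; the event layout (`{"readings": [...]}`), the validation rule, and the response body are hypothetical.

```python
import json


def lambda_handler(event, context):
    """Entry point in the handler(event, context) form used by AWS Lambda for Python.

    The event layout is a hypothetical payload, not one defined by AWS.
    """
    readings = event.get("readings", [])
    # Keep only readings that carry the fields later steps would need.
    accepted = [r for r in readings if "sensor_id" in r and "value" in r]
    return {
        "statusCode": 200,
        "body": json.dumps(
            {"accepted": len(accepted), "rejected": len(readings) - len(accepted)}
        ),
    }
```

Because the function is plain Python, it can be exercised locally by calling it with a dictionary and `None` as the context before it is deployed.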

[0216] The server analyzes the collected data using the generative electronic brain to detect anomalies. The generative electronic brain uses a generative AI model and can identify anomalies by performing voice recognition and image recognition. Furthermore, it uses emotion recognition to analyze voice and text from the user to understand the user's emotional state.

[0217] If an anomaly is detected, the server assesses the risk and issues a command to the machine to patrol the area. The machine autonomously moves within the designated area, collecting and feeding back data in real time, and seamlessly confirming and responding to the anomaly.

[0218] For interaction, the system uses an emotion engine whenever a user interacts with it. The emotion engine analyzes the user's emotions from their voice tone and facial expressions, and adjusts the system's response accordingly. For example, if a user feels anxious, the system can raise its alert level and provide detailed monitoring and rapid informational feedback.

[0219] For example, if a user reports a "suspicious noise" to the system, the emotion engine analyzes it, immediately intensifies the alert mode, and the machine patrols the area quickly. The user is also notified of detailed alert information, providing a sense of security.

[0220] Examples of prompt messages include the following:

[0221] "A suspicious noise has been detected in the user's home. Please demonstrate the process of using an emotion engine to alleviate the user's anxiety."

[0222] "Please explain in detail how you plan to enhance user confidence after detecting an anomaly."

[0223] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0224] Step 1:

[0225] The server collects audio data, image data, and temperature information from various detection devices. This input data is preprocessed, for example through normalization and filtering, to improve data quality, and then prepared as input for the generative AI model.

[0226] Step 2:

[0227] The server inputs pre-processed data into a generative AI model for speech and image recognition. The generative AI model performs pattern matching and feature extraction to detect anomalies and determines whether an anomaly exists. The output consists of a flag indicating the presence or absence of an anomaly and metadata about the characteristics of that anomaly.

[0228] Step 3:

[0229] The server assesses the risk based on the results of anomaly detection. It receives anomaly flags and metadata as input and applies a rule-based algorithm to evaluate the risk level. The output is a numerical score indicating the risk level, which is used to determine the next step.
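The interface between Steps 2 and 3 (an anomaly flag plus metadata in, a numerical score out) can be sketched as a small rule table. The metadata keys, the rules, and the point values are illustrative assumptions; the specification says only that a rule-based algorithm produces a numerical risk level.

```python
from dataclasses import dataclass, field


@dataclass
class DetectionResult:
    anomaly: bool
    metadata: dict = field(default_factory=dict)


# Each rule is (predicate over metadata, points added when it holds). All values are illustrative.
RULES = [
    (lambda m: m.get("kind") == "person", 40),
    (lambda m: m.get("zone") == "indoor", 30),
    (lambda m: m.get("night", False), 20),
    (lambda m: m.get("confidence", 0.0) >= 0.9, 10),
]


def risk_level(result, rules=RULES):
    """Return a 0-100 score: zero without an anomaly, otherwise the sum of matching rules."""
    if not result.anomaly:
        return 0
    return min(100, sum(points for pred, points in rules if pred(result.metadata)))
```

A score of this kind can then be compared against a threshold to decide whether the patrol command in the next step is issued.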

[0230] Step 4:

[0231] If an anomaly or high risk is detected, the server issues a patrol command to the machine. Using patrol route information as input, the server instructs the machine to move, collects surrounding data in real time, and feeds it back to the server. This process complements on-site anomaly verification.

[0232] Step 5:

[0233] The server receives input from the user and performs emotion recognition. It takes user voice and text data as input and analyzes it using an emotion engine. This analysis uses natural language processing algorithms to identify the user's emotions. The output is information about the type and intensity of the emotion.
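The input and output of this step (user text in, emotion type and intensity out) can be sketched with a keyword lexicon. This is a toy stand-in for the emotion engine: a real implementation would use a trained model, and the lexicon entries, weights, and function name below are hypothetical.

```python
import re

# Hypothetical lexicon: word -> (emotion, weight).
LEXICON = {
    "anxious": ("anxious", 1.0),
    "scared": ("anxious", 1.0),
    "worried": ("anxious", 0.7),
    "suspicious": ("anxious", 0.5),
    "fine": ("calm", 0.6),
    "relieved": ("calm", 0.8),
}


def recognize_emotion(text):
    """Return (emotion, intensity in [0, 1]) from keyword matches, or ("neutral", 0.0)."""
    totals = {}
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in LEXICON:
            emotion, weight = LEXICON[word]
            totals[emotion] = totals.get(emotion, 0.0) + weight
    if not totals:
        return ("neutral", 0.0)
    emotion = max(totals, key=totals.get)
    return (emotion, min(1.0, totals[emotion]))
```

Voice input would first need to be transcribed to text before a function of this kind could be applied; tone of voice and facial expressions are not covered by this sketch.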

[0234] Step 6:

[0235] Based on the results of the emotion recognition, the server adjusts the system's response. For example, if the user is feeling anxious, it raises the alert level and adjusts the system to collect more data to reassure the user. This process aims to enhance informational feedback to the user.

[0236] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0237] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0238] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0239] [Second Embodiment]

[0240] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0241] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0242] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0243] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0244] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0245] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0246] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0247] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0248] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0249] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0250] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0251] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0252] The security system according to the present invention is a highly efficient security solution for houses and apartments, and is implemented with a configuration that includes various sensor devices, a generative AI, and a control robot. A specific embodiment thereof is described below.

[0253] Data collection

[0254] The server continuously collects audio and image data from sensor devices placed around houses and apartment buildings. These include surveillance cameras, microphones, and infrared sensors. Because the collected data is difficult to handle directly, the server performs pre-processing, such as noise removal.

[0255] Data analysis and anomaly detection

[0256] The server sends the pre-processed data to a generative AI model for analysis. The generative AI uses machine learning algorithms to analyze the data and detect changes or patterns that deviate from the normal state.

[0257] Risk assessment and alert generation

[0258] If an anomaly is detected, the server assesses the risk based on feedback from the AI. Based on the risk assessment, the server generates an alert notifying the user or security company of the anomaly. This facilitates immediate response.

[0259] Robot control and patrol activities

[0260] After detecting an anomaly, the server instructs the robot to patrol the site. The robot autonomously moves along the designated route, monitoring its surroundings using sensors. During this process, any newly acquired data is sent back to the server and re-analyzed by the generative AI as needed.

[0261] Continuous optimization

[0262] The server continuously records system-wide operation logs and analyzes them for further improvement. The generative AI continues machine learning based on new information and detected patterns, evolving its model to improve anomaly detection performance in subsequent iterations.

[0263] Specific example: If a garden sensor detects an unusual sound late at night, the server sends the audio data to the generative AI, which identifies it as indicating a suspicious person. The user is immediately notified with an alert, and a robot is instructed to patrol the garden for further investigation. Real-time information allows quick and appropriate countermeasures to be taken. This entire process is recorded as a system operation log and used for future optimization.

[0264] Thus, the present invention realizes a highly efficient and accurate security system by combining a sensor device, a generative AI, and a robot.

[0265] The following describes the processing flow.

[0266] Step 1:

[0267] The server collects audio and image data in real time from sensors installed outside and inside houses and apartments. This includes surveillance cameras, microphones, and infrared sensors, and the collected data undergoes initial processing such as noise reduction and data format standardization.

[0268] Step 2:

[0269] The server sends the pre-processed data to the generative AI model. The generative AI model analyzes the data and executes speech recognition and image recognition algorithms to identify abnormal activity and sounds. This analysis detects any patterns that deviate from the normal state.

[0270] Step 3:

[0271] The generative AI model determines the presence or absence of anomalies based on the analysis results and performs a risk assessment. In this risk assessment, the degree of impact caused by the anomalies is calculated based on the type and frequency of the detected anomalies.

[0272] Step 4:

[0273] Based on feedback from the generative AI model, the server generates an alert if it detects an anomaly. The alert includes information such as the time, location, and risk level of the anomaly, and users and security personnel are notified of it immediately.
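An alert carrying the fields named above (time, location, and risk level) could be serialized as JSON along the following lines. The field names and the function are hypothetical, and delivery to users and security personnel is not shown.

```python
import json
from datetime import datetime, timezone


def build_alert(location, risk_level, detected_at=None):
    """Serialize an alert with the time, location, and risk level of the anomaly."""
    when = detected_at or datetime.now(timezone.utc)
    return json.dumps(
        {"time": when.isoformat(), "location": location, "risk_level": risk_level},
        sort_keys=True,
    )
```

Using a timezone-aware UTC timestamp in ISO 8601 form keeps alerts comparable across devices in different time zones.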

[0274] Step 5:

[0275] The server instructs the robot to patrol the site as needed. The instructed robot then patrols a pre-set path, collecting further data using sensors and cameras, and transmitting it to the server in real time.

[0276] Step 6:

[0277] The server re-evaluates the new data sent from the robot and reassesses the overall situation. Additional alerts are generated and further risk assessments are made as needed.

[0278] Step 7:

[0279] The server records system-wide operation logs in a database, which are then used for subsequent analysis and system optimization. Furthermore, the generative AI model uses these operation logs to improve accuracy through machine learning and enhance its threat recognition capabilities.
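How operation logs might feed back into detection accuracy can be sketched with a much simpler mechanism than retraining a model: choosing the alert threshold that would have misclassified the fewest logged events. This is a swapped-in technique for illustration only. It assumes each logged event carries the score the system produced and a confirmation label supplied afterwards by users or security personnel, neither of which is specified in the text.

```python
def tune_threshold(logged, candidates):
    """Pick the candidate threshold that misclassifies the fewest logged events.

    `logged` is a list of (score, was_real_threat) pairs taken from the operation logs.
    Ties are resolved in favour of the earliest candidate.
    """
    def errors(threshold):
        return sum((score >= threshold) != was_real for score, was_real in logged)

    return min(candidates, key=errors)
```

Run periodically over recent logs, a step like this would let the alert threshold drift toward the value that best separates confirmed threats from false alarms.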

[0280] This processing flow allows the system to quickly and efficiently enhance the security of homes and apartments, providing residents with a sense of safety.

[0281] (Example 1)

[0282] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0283] In recent years, security problems such as crime and intrusion by suspicious persons have been increasing, especially in urban areas. Conventional security systems may take time to detect and respond to abnormalities, whereas both immediacy and accuracy are required. The object of the present invention is to provide a crime prevention system that enables advanced abnormality detection and prompt response.

[0284] The specific processing by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0285] In this invention, the server includes means for collecting surrounding information using a remote information acquisition device, means for preprocessing the information, transmitting it to a generative artificial intelligence for analysis, and detecting abnormalities, and means for evaluating the risk level based on the detected abnormalities and creating a notification. This enables quick and effective abnormality detection and response.

[0286] The "remote information acquisition device" is a device that senses surrounding information such as sound, video, and movement and provides data to the server.

[0287] "Preprocessing" is a process of removing unnecessary noise and interference from the collected information and preparing it in a state suitable for analysis.

[0288] The "generative artificial intelligence" is an artificial intelligence that analyzes input data using a machine learning algorithm and performs abnormality detection and pattern recognition.

[0289] "Evaluating the risk level" means determining the risk level of an event based on the occurrence situation and possibility of an abnormality.

[0290] The "autonomous device" is a robot that monitors the surrounding situation while automatically moving along a specified route based on programmed commands.

[0291] The "operation record" is data that records the operation history and processing results of the system and is used for later analysis.

[0292] "Continuous optimization" refers to the activity of progressively improving the system's performance and processing accuracy based on collected operational records.

[0293] To implement this invention, a sensor device, a server, a generative AI model, and an autonomous robot device are used.

[0294] Data collection:

[0295] The server collects data from sensor devices installed in houses and apartments. These include microphones to capture audio data, surveillance cameras to capture images, and infrared sensors to detect motion. These devices sense information about the surroundings and transmit the data to the server.

[0296] Data preprocessing and analysis:

[0297] The server performs preprocessing on the collected data, such as noise reduction and data normalization, and then sends it to the generative AI model. The generative AI model analyzes the collected data using machine learning algorithms and detects anomalies. This analysis identifies events that deviate from normal patterns.

[0298] Risk assessment and notification:

[0299] The server assesses the risk based on the results of anomaly detection. If the risk level is determined to be high, the server immediately generates a notification and issues an alert to the user or security company.

[0300] Robot patrol:

[0301] When an anomaly is detected, the server instructs the autonomous robot to patrol. The robot automatically moves within the designated area and acquires more detailed data. The acquired data is then sent back to the server for further analysis.

[0302] Continuous optimization:

[0303] The server records all operations occurring within the system and uses this record to optimize the entire system. The generative AI model conducts continuous learning based on this data to improve performance.

[0304] As a specific example, when an unusual sound is detected in the garden at night, the sound is collected by a microphone. The server sends the audio data to the generative AI model for anomaly analysis. When the possibility of a suspicious person is detected, the server sends an alert notification to the user and instructs the robot to patrol the scene. This process enables a quick and effective response.

[0305] As an example of a prompt sentence, instructions such as "Analyze the data of the abnormal sound detected in the garden at midnight and identify suspicious activities." are provided to the generative AI model. Thereby, the generative AI model performs appropriate analysis.
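The prompt sentence above can be assembled from the fields of a detected event. The following is a minimal sketch only; the function name `build_prompt` and its parameters are hypothetical and do not appear in this disclosure.

```python
def build_prompt(signal: str, location: str, time_of_day: str) -> str:
    """Fill the example prompt sentence with the details of a detected event.

    The parameter names are illustrative assumptions.
    """
    return (
        f"Analyze the data of the {signal} detected in the {location} "
        f"at {time_of_day} and identify suspicious activities."
    )


prompt = build_prompt("abnormal sound", "garden", "midnight")
```

Keeping the wording in one template means the same instruction can be reused for other sensors or locations by changing only the arguments.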

[0306] The flow of the specific process in Example 1 will be described using FIG. 11.

[0307] Step 1:

[0308] Data collection

[0309] The server continuously collects ambient audio and image data from sensor devices installed in houses and condominiums. Examples of sensor devices include microphones that acquire audio and surveillance cameras that acquire images. The input data is audio and images in a state containing noise. These data are transmitted to the server via the network.

[0310] Step 2:

[0311] Preprocessing of data

[0312] The server performs noise reduction and data normalization on the collected audio and image data. For audio data, background noise is reduced and important audio signals are extracted. For image data, images are filtered to prevent unwanted light reflections and false detections. As a result of this preprocessing, clear audio and image data suitable for analysis is obtained.
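The noise reduction and normalization in this step can be illustrated with a minimal sketch. It assumes that audio arrives as a list of floating-point samples; the moving-average filter, the window size, and the function names are illustrative assumptions, not the filtering method specified in this disclosure.

```python
from typing import List


def moving_average(samples: List[float], window: int = 3) -> List[float]:
    """Smooth a signal with a centered moving average (a simple noise reducer)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


def peak_normalize(samples: List[float]) -> List[float]:
    """Scale samples so the largest absolute value is 1.0.

    An all-zero (or empty) input is returned unchanged.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]


def preprocess_audio(samples: List[float], window: int = 3) -> List[float]:
    """Apply noise reduction first, then normalization."""
    return peak_normalize(moving_average(samples, window))
```

Normalizing after smoothing keeps the output range fixed regardless of how much the filter attenuates the signal.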

[0313] Step 3:

[0314] Data analysis and anomaly detection

[0315] The server sends the preprocessed data, together with a prompt message, to the generative AI model to detect abnormal patterns. The generative AI model uses machine learning algorithms to identify deviations from normal patterns. The input is filtered, clear audio and image data, and the output is information about the presence or absence of anomalies and their patterns.
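The idea of flagging data that deviates from normal patterns can be shown with a much simpler stand-in than a generative AI model. The sketch below uses a z-score threshold against a baseline of normal readings; this is a substituted technique chosen for illustration, and the function name and the default threshold of 3.0 are assumptions.

```python
from statistics import mean, pstdev
from typing import List


def detect_anomalies(
    baseline: List[float], observed: List[float], threshold: float = 3.0
) -> List[int]:
    """Return indices of observed values that deviate from the baseline.

    A value is flagged when it lies more than `threshold` population
    standard deviations from the baseline mean.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0.0:
        # A constant baseline: any different value counts as a deviation.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]
```

The output corresponds to the "presence or absence of anomalies" described in this step: an empty list means no anomaly was found.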

[0316] Step 4:

[0317] Risk assessment and notification generation

[0318] The server performs a risk assessment based on anomaly detection information from the generative AI model. Depending on the level of risk, the server creates and sends alert notifications to users and security companies in real time. The input is anomaly detection information, and the output is the content and priority of the notification.
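The mapping from anomaly detection information to notification content and priority might look like the following sketch. The class names, the `confidence` field, the recipient labels, and the 0.8 and 0.5 thresholds are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Anomaly:
    kind: str          # e.g. "unusual_sound" (hypothetical label)
    confidence: float  # 0.0 to 1.0, assumed to be reported by the model


@dataclass
class Notification:
    priority: str
    recipients: List[str]
    message: str


def build_notification(anomaly: Anomaly) -> Optional[Notification]:
    """Turn anomaly information into a notification, or None for low risk.

    The thresholds below are illustrative assumptions.
    """
    if anomaly.confidence >= 0.8:
        return Notification(
            "high", ["user", "security_company"], f"High risk: {anomaly.kind}"
        )
    if anomaly.confidence >= 0.5:
        return Notification("medium", ["user"], f"Possible anomaly: {anomaly.kind}")
    return None  # low risk: no alert is sent
```

Only high-priority notifications are addressed to the security company in this sketch, which keeps routine alerts from reaching an external party.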

[0319] Step 5:

[0320] Instructions for robot patrol

[0321] When the server detects an anomaly, it instructs the autonomous robot device to patrol the site. The robot automatically moves along the designated path and collects additional sensor data. This data is sent back to the server and re-analyzed by the generative AI model as needed. The inputs are the patrol instructions and sensor data, and the output is additional monitoring data.

[0322] Step 6:

[0323] Continuous optimization of the system

[0324] The server continuously records the operation logs of the entire system and uses them to train the generative AI model. This improves the anomaly detection capability in subsequent runs and optimizes the entire system. The input is the system's operation history, and the output is the performance of the improved generative AI model.
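One possible shape for the operation log is a list of timestamped records that can be exported as JSON Lines and filtered for later training. This is a sketch under stated assumptions: the class `OperationLog`, the step names, and the record fields are hypothetical.

```python
import json
import time
from typing import Any, Dict, List


class OperationLog:
    """In-memory operation log; a real system would persist to a database."""

    def __init__(self) -> None:
        self._records: List[Dict[str, Any]] = []

    def record(self, step: str, result: Dict[str, Any]) -> None:
        """Append one timestamped record for a processing step."""
        self._records.append(
            {"timestamp": time.time(), "step": step, "result": result}
        )

    def to_jsonl(self) -> str:
        """Serialize all records, one JSON object per line."""
        return "\n".join(json.dumps(r, sort_keys=True) for r in self._records)

    def training_examples(self) -> List[Dict[str, Any]]:
        """Return only anomaly-detection records for a later retraining job."""
        return [r for r in self._records if r["step"] == "anomaly_detection"]
```

Recording every step in one uniform format lets the same log serve both auditing and model improvement.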

[0325] (Application Example 1)

[0326] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0327] Existing security systems face challenges in detecting anomalies early and responding quickly. In particular, in residential and apartment buildings, flexible responses tailored to the specific conditions and environments of individual locations are required, but conventional sensors and monitoring technologies have limitations. Furthermore, there are insufficient means for users to check the status of their homes in real time and take appropriate action while away.

[0328] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0329] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information using an automated learning model to detect anomalies, and means for enabling real-time confirmation of anomalies from a mobile terminal. This makes it possible for users to quickly grasp anomalies no matter where they are and take necessary actions in real time.

[0330] "Various detection devices" refers to a diverse group of sensors installed to acquire external environmental information such as sound, video, temperature, and motion.

[0331] "Information" refers to all data collected by various detection devices, including audio data, video data, and other forms of environmental data.

[0332] An "automated learning model" is a machine learning system that includes algorithms for analyzing collected information and detecting anomalies.

[0333] An "abnormality" refers to a unique pattern or condition that deviates from the normal state, and includes situations that are predicted to pose a safety risk.

[0334] A "mobile device" is an electronic device that a user can carry with them, such as a smartphone or smart glasses.

[0335] "Means that enable the confirmation of anomalies in real time" refers to technology that allows users to instantly recognize situations occurring at their home or facility visually or audibly through their mobile devices.

[0336] The system for carrying out this invention consists of various detection devices, an automated learning model, a mobile terminal, and a server.

[0337] The server continuously collects information about the surroundings from various detection devices installed in homes and facilities. These include voice sensors, cameras, and temperature sensors, and the data obtained from these sensors is first aggregated on the server.

[0338] The server inputs the collected information into an automated learning model to detect anomalies. The automated learning model is built using software such as Python and TensorFlow, and it analyzes the information to identify patterns that deviate from the normal state. During this process, noise reduction and data preprocessing are performed using tools such as OpenCV.

[0339] When an anomaly is detected, the server sends a real-time notification to the mobile device, allowing the user to view live video from the scene via their smartphone or smart glasses. This enables the user to quickly grasp the details of the anomaly and immediately consider countermeasures.

[0340] For example, if suspicious activity is detected in a residential yard late at night, the server will determine this to be an anomaly and send a notification to the user's mobile device. The user can then view the real-time video through smart glasses to check for any suspicious individuals.

[0341] An example of a prompt message is, "An unusual sound has been detected in the garden of a house. Analyze the camera footage and identify the anomaly." This instruction is input into the automated learning model, supporting appropriate actions. This system allows users to live with peace of mind.

[0342] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0343] Step 1:

[0344] The server continuously collects information about the surroundings from various detection devices. This includes data from sound sensors, cameras, and other sources. The collected data is raw and requires noise reduction and formatting.

[0345] Step 2:

[0346] The server performs noise reduction and other necessary preprocessing on the collected information. This involves using an image processing library such as OpenCV to reduce noise in video data and normalizing the sampling of audio data. This process improves data accuracy and facilitates analysis in the next step.

[0347] Step 3:

[0348] The server inputs the preprocessed information into the automated learning model and performs anomaly detection. The generative AI model is built using TensorFlow and identifies situations that deviate from normal by analyzing information patterns. Here, it is instructed to generate an alarm based on conditions that are judged to be anomalies.

[0349] Step 4:

[0350] If the server detects an anomaly, it immediately sends an alert to the mobile device. Upon receiving this alert, the user's device can request and view a real-time video feed from the site. The video data is sent to the device using streaming technology.

[0351] Step 5:

[0352] Users check the situation on-site via their mobile devices or smart glasses and carefully examine the content of alerts. Based on the confirmed information, they decide on appropriate actions. For example, after confirming an anomaly, they might contact the security company directly.

[0353] Step 6:

[0354] The server collects operation logs for each processing step and analyzes them to improve the entire system. The system's operational data is used to improve the accuracy of the automated learning model, enabling continuous optimization.

[0355] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0356] The system according to the present invention combines various sensor devices, generative AI, and an emotion engine to significantly enhance the security of houses and apartments and enable interaction with users. The embodiments thereof are described in detail below.

[0357] Data collection and analysis

[0358] The server continuously collects audio and image data from sensor devices installed in houses and apartments. This includes cameras, microphones, and temperature sensors. The collected data is preprocessed in real time and sent to a generative AI model. The generative AI model analyzes this data and performs speech and image recognition to determine if there are any unusual sounds or movements.

[0359] Anomaly detection and risk assessment

[0360] The server assesses the risk of the current situation based on anomalies detected by the generative AI model. Based on the assessed risk, an alert is triggered according to the type and frequency of the anomaly, as well as the security policy. The alert is immediately sent to users and security personnel.

[0361] Robot control

[0362] If an anomaly is detected, the server issues a command to the robot to patrol the site. The robot autonomously moves along the designated route, continuously collecting data from its surroundings and providing real-time feedback to the server. This allows for seamless on-site verification and response to anomalies.

[0363] Emotion-based interaction

[0364] When a user interacts with the system, it utilizes an emotion engine. This engine analyzes the user's emotions from their tone of voice and facial expressions, and adjusts the system's response accordingly. For example, if a user is feeling anxious, the system can raise its alert level, providing more detailed monitoring and faster informational feedback.

[0365] Specific example: If a user notices a suspicious noise at night, the system immediately sends the voice data to the generative AI to detect the anomaly. If the user communicates to the system via intercom that they are "anxious," the emotion engine analyzes this information, enhances the alert mode, and prompts robots to patrol quickly. It also provides detailed alert information to offer reassurance. In this way, the system enables advanced interaction that responds to user needs in real time.

[0366] This entire system analyzes operation logs in detail and uses the data for subsequent system evaluation and optimization. This enables the provision of continuous and effective security.

[0367] The following describes the processing flow.

[0368] Step 1:

[0369] The server collects audio and image data in real time from sensor devices placed in the surrounding area. This includes surveillance cameras and microphones, and the collected data is pre-processed, such as noise reduction, to accurately capture changes in the environment.

[0370] Step 2:

[0371] Upon receiving pre-processed data, the server sends it to a generative AI model, which identifies unusual activity and sounds through speech and image recognition. The generative AI model can continuously process data to improve its accuracy by repeatedly learning from past data.

[0372] Step 3:

[0373] In response to detected anomalies, the server evaluates the alert level based on the risk assessment results calculated by the generative AI model. If the risk is determined to be high, the server immediately generates an alert and notifies users and security personnel of the relevant information.

[0374] Step 4:

[0375] When a user receives an alert and accesses the system to check the situation, the system uses its emotion engine to analyze the user's emotional state. It determines emotions from voice tone and facial expression data, and automatically raises the system's alert level if the user is feeling anxious.

[0376] Step 5:

[0377] The server instructs robots to patrol areas where anomalies have been detected. The robots move along pre-set paths, scanning their surroundings with built-in sensors and sending the latest status data to the server for further anomaly detection.

[0378] Step 6:

[0379] The server analyzes the data from the robots and reconfirms whether there are any anomalies. It issues additional alerts as needed to strengthen security.

[0380] Step 7:

[0381] The server records the operation logs collected at each processing step in a database. These logs are used to optimize the system and improve the learning accuracy of the generative AI model, resulting in continuous improvement of the entire system.

[0382] Through this series of processes, the system efficiently ensures user safety and, by utilizing the emotion engine, enables fast and flexible interaction.

[0383] (Example 2)

[0384] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0385] In modern living spaces, there is a growing demand for both enhanced safety and improved convenience. In particular, there is a growing need for security systems that can quickly and accurately detect anomalies within homes and buildings and respond appropriately. However, existing systems have limitations in both anomaly detection and user response, making it difficult to provide users with a greater sense of security. Therefore, the present invention aims to provide a system that enhances anomaly detection capabilities while enabling flexible responses tailored to the user's emotions.

[0386] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0387] In this invention, the server includes means for collecting environmental data using various detection devices, means for formatting the data, transmitting it to an intelligent model for analysis and detection of anomalies, and means for analyzing the user's emotional state through audio and video and adjusting the system response. This makes it possible to improve the accuracy of anomaly detection and provide appropriate responses based on the user's emotions.

[0388] A "detection device" refers to various sensors used to collect data such as sound, images, and temperature from the environment.

[0389] An "intelligent model" refers to artificial intelligence technology that uses collected data to detect and analyze anomalies.

[0390] "Mobile units" refer to automated machines such as robots that autonomously patrol designated routes and monitor changes in the environment.

[0391] "Emotional state" refers to the psychological state of a user, as analyzed from their voice tone, facial expressions, and other factors.

[0392] "Operation records" refer to the overall operational history of the system, and serve as a source of information for optimizing the system based on this data.

[0393] "Risk assessment" is the process of analyzing the risks posed by detected anomalies and using that information to generate appropriate alarms.

[0394] An "alarm" is a warning and response instruction issued by the system in response to detected anomalies or dangers.

[0395] This invention is a system that enhances the security of homes and buildings and enables advanced user interaction. The system consists of several important elements, and its specific embodiments will be described below.

[0396] Hardware and software usage

[0397] The server is responsible for collecting data from various detection devices installed in houses and apartments. Specifically, these include cameras, microphones, and temperature sensors. These devices acquire image data, audio data, and ambient temperature data, and this information is used for anomaly detection.

[0398] The server preprocesses the collected data and sends it to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to analyze and detect abnormal sounds, movements, and temperature changes. The software employs advanced AI analysis algorithms and noise filtering tools.

[0399] Interaction and emotion analysis

[0400] When users interact directly with the system, they can communicate their state to the system through an emotion engine. The emotion engine analyzes the user's voice tone and facial expressions, and the system optimizes its response based on this analysis. This feature allows the system to respond flexibly according to the user's emotional state.

[0401] Specific example:

[0402] If a user reports hearing suspicious noises at night and expresses concern, the server sends the audio data to the generative AI to detect any abnormalities. If the audio analysis determines that the user is feeling anxious, the system raises its alert level, instructs robots to patrol, and checks the situation. The system also sends detailed monitoring information to the user via push notifications and email to provide reassurance.

[0403] Example of a prompt

[0404] "Please describe a system that detects abnormal sounds at night, analyzes the user's emotions, and then takes appropriate action."

[0405] Thus, by combining precise sensor technology, a generative AI model, and an emotion recognition engine, the system of the present invention can solve various challenges in modern residential security and provide users with peace of mind and safety.

[0406] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0407] Step 1:

[0408] The server collects environmental data from various detection devices. This includes image data from cameras, audio data from microphones, and temperature data from temperature sensors. The input data is sent to the server in its raw state and in various formats. Specifically, the camera periodically captures images of the gate and the interior of the room, the microphone records background sounds, and the temperature sensor measures the room temperature. The output indicates the completion of data collection.

[0409] Step 2:

[0410] The server preprocesses the collected raw data. Data processing is performed to remove noise, standardize data formats, and highlight important features. Specifically, background noise is removed from audio data, and the resolution of image data is adjusted. The input is the raw data collected in step 1, and the output is clear data ready for analysis.

[0411] Step 3:

[0412] The server sends the preprocessed data to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to execute an anomaly detection algorithm. Specifically, the AI model analyzes the data and identifies abnormal movements and sounds. The input is preprocessed data, and the output is the result of determining whether or not an anomaly exists.

[0413] Step 4:

[0414] The server assesses the risk based on anomalies detected by the AI model. It analyzes the type and frequency of anomalies and generates alarms based on security policies. Specifically, data identified as anomalous is recorded, and users are notified based on this information. The input is the anomaly detection result from the AI model, and the output is alarm information.

[0415] Step 5:

[0416] If an anomaly is detected, the server issues a patrol command to the mobile robot. The robot patrols the site according to the designated route and collects additional data. Specifically, the robot automatically moves through corridors and entrances, and feeds back newly recorded video and audio to the server in real time. The input is the patrol command from the server, and the output is the results of the site inspection.
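A patrol command and the data it returns can be modeled as follows. This is a minimal sketch: the `PatrolCommand` fields, the waypoint representation, and the `sense` callback that stands in for the robot's sensors are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Waypoint = Tuple[float, float]  # (x, y) position on a site map (assumed)


@dataclass
class PatrolCommand:
    robot_id: str
    route: List[Waypoint]
    reason: str


def run_patrol(
    command: PatrolCommand, sense: Callable[[Waypoint], Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Visit each waypoint in order and collect one observation per waypoint.

    `sense` is a placeholder for the robot's camera and microphone.
    """
    observations = []
    for index, waypoint in enumerate(command.route):
        observations.append(
            {
                "robot_id": command.robot_id,
                "index": index,
                "waypoint": waypoint,
                "reading": sense(waypoint),
            }
        )
    return observations
```

Here the input is the patrol command and the output is the list of observations, matching the input and output described for this step.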

[0417] Step 6:

[0418] When a user interacts with the system, the server uses an emotion engine to analyze this information. The emotion engine determines the user's emotional state based on their tone of voice and facial expressions, and optimizes the system's response. For example, if a user says "I'm anxious," the system analyzes this and raises the alert level. The input is the emotional feedback from the user, and the output is the adjusted system response.
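The adjustment of the system response to the user's emotional state can be sketched as a small state update. The three alert levels, the emotion label "anxious", and the 0.5 intensity threshold are hypothetical choices, not values given in this disclosure.

```python
from enum import IntEnum


class AlertLevel(IntEnum):
    NORMAL = 0
    ELEVATED = 1
    HIGH = 2


def adjust_alert_level(
    current: AlertLevel, emotion: str, intensity: float
) -> AlertLevel:
    """Raise the alert level by one step when the user is anxious.

    The level is capped at HIGH and is never lowered by this function.
    """
    if emotion == "anxious" and intensity >= 0.5:
        return AlertLevel(min(current + 1, AlertLevel.HIGH))
    return current
```

Raising the level one step at a time avoids jumping straight to the highest level on a single utterance.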

[0419] (Application Example 2)

[0420] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0421] Improving the safety of homes and buildings requires rapid and accurate detection of anomalies, as well as appropriate responses that respond to user emotions. However, conventional systems suffer from insufficient anomaly detection accuracy and poor user interaction quality, resulting in a lack of safety and convenience. The challenge is to solve this problem and provide a higher level of security and user experience.

[0422] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0423] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information and detecting anomalies using a generative electronic brain, and means for interacting with the user using emotion recognition and adjusting the response based on the user's emotions. This enables immediate detection of anomalies, risk assessment, and optimal interaction according to the user's emotions.

[0424] A "detection device" is a device that collects information such as sound, images, and temperature from the external environment.

[0425] A "generative electronic brain" is a system that uses advanced artificial intelligence technology to analyze collected information and detect anomalies.

[0426] A "means for assessing risk" refers to a function that determines the risk of the current situation based on the results of anomaly detection and generates appropriate warnings.

[0427] "Machines" refers to devices such as robots that move autonomously, collect information, and patrol areas with abnormalities.

[0428] "Emotion recognition" is a technology that analyzes a user's emotions from their voice and facial expressions, making it possible to adjust the system's operation based on the user's emotions.

[0429] "Operation records" refer to logs that record the overall operating status and processing details of the system in chronological order, and analyzing these logs provides information that can be used to optimize the system.

[0430] The system for carrying out this invention comprises various detection devices, a generative electronic brain, and an emotion recognition function, and is designed to enhance the safety of homes and buildings. Specific embodiments are shown below.

[0431] First, the server continuously collects information about the surroundings through various detection devices such as cameras, microphones, and temperature sensors. This data is processed in real time and sent to a cloud-based server. Services such as AWS Lambda and Google Cloud Functions are used for this process.

[0432] The server analyzes the collected data using the generative electronic brain to detect anomalies. The generative electronic brain uses a generative AI model and can identify anomalies by performing voice recognition and image recognition. Furthermore, it uses emotion recognition to analyze voice and text from the user to understand the user's emotional state.

[0433] If an anomaly is detected, the server assesses the risk and issues a command to the machine to patrol the area. The machine autonomously moves within the designated area, collecting and feeding back data in real time, and seamlessly confirming and responding to the anomaly.

[0434] Interaction is supported by emotion recognition: an emotion engine is used when users interact with the system. The emotion engine analyzes the user's emotions from their voice tone and facial expressions, and adjusts the system's response accordingly. For example, if a user feels anxious, the system can raise its alert level and provide detailed monitoring and rapid informational feedback.

[0435] For example, if a user reports a "suspicious noise" to the system, the emotion engine analyzes it, immediately intensifies the alert mode, and the machine patrols the area quickly. The user is also notified of detailed alert information, providing a sense of security.

[0436] Examples of prompt messages include the following:

[0437] "A suspicious noise has been detected in the user's home. Please demonstrate the process of using an emotion engine to alleviate the user's anxiety."

[0438] "Please explain in detail how you plan to enhance user confidence after detecting an anomaly."

[0439] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0440] Step 1:

[0441] The server collects audio data, image data, and temperature information from various detection devices. This input data is preprocessed, such as through normalization and filtering, to improve data quality, and then prepared as input for the generative AI model.

[0442] Step 2:

[0443] The server inputs pre-processed data into a generative AI model for speech and image recognition. The generative AI model performs pattern matching and feature extraction to detect anomalies and determines whether an anomaly exists. The output consists of a flag indicating the presence or absence of an anomaly and metadata about the characteristics of that anomaly.

[0444] Step 3:

[0445] The server assesses the risk based on the results of anomaly detection. It receives anomaly flags and metadata as input and applies a rule-based algorithm to evaluate the risk level. The output is a numerical score indicating the risk level, which is used to determine the next step.
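A rule-based evaluation of this kind could be sketched as a table of rules that each add points to a numerical score. The metadata keys, the point values, the base score, and the patrol threshold below are all hypothetical.

```python
from typing import Dict

# Hypothetical rule table: points added per anomaly characteristic.
RULES: Dict[str, int] = {
    "night_time": 30,
    "person_detected": 40,
    "repeated": 20,
}


def risk_score(anomaly_flag: bool, metadata: Dict[str, bool]) -> int:
    """Return a risk score from 0 to 100 for an anomaly flag and its metadata."""
    if not anomaly_flag:
        return 0
    score = 10  # base score for any detected anomaly (assumed)
    for key, points in RULES.items():
        if metadata.get(key, False):
            score += points
    return min(score, 100)


def next_step(score: int, patrol_threshold: int = 50) -> str:
    """Choose the next action from the score; the threshold is an assumption."""
    return "dispatch_patrol" if score >= patrol_threshold else "log_only"
```

Because each rule is a separate table entry, a security policy can be changed by editing the table rather than the scoring logic.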

[0446] Step 4:

[0447] If an anomaly or high risk is detected, the server issues a patrol command to the machine. Using patrol route information as input, the server instructs the machine to move, collects surrounding data in real time, and feeds it back to the server. This process complements on-site anomaly verification.

[0448] Step 5:

[0449] The server receives input from the user and performs emotion recognition. It takes user voice and text data as input and analyzes it using an emotion engine. This analysis uses natural language processing algorithms to identify the user's emotions. The output is information about the type and intensity of the emotion.
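As a simplified stand-in for the emotion engine, the sketch below estimates an emotion type and intensity from text using a small keyword lexicon. A keyword lookup is a substituted technique used only for illustration; the lexicon, the weights, and the function name are assumptions, and a practical engine would rely on a trained model.

```python
import re
from typing import Dict, Tuple

# Hypothetical keyword weights for each emotion type.
LEXICON: Dict[str, Dict[str, float]] = {
    "anxious": {"anxious": 0.8, "worried": 0.7, "scared": 0.9, "afraid": 0.9},
    "calm": {"fine": 0.6, "okay": 0.5, "relieved": 0.8},
}


def estimate_emotion(text: str) -> Tuple[str, float]:
    """Return (emotion type, intensity between 0 and 1).

    The strongest matching keyword wins; ("neutral", 0.0) is returned
    when no keyword matches.
    """
    words = re.findall(r"[a-z']+", text.lower())
    best: Tuple[str, float] = ("neutral", 0.0)
    for emotion, keywords in LEXICON.items():
        for word in words:
            weight = keywords.get(word, 0.0)
            if weight > best[1]:
                best = (emotion, weight)
    return best
```

The returned pair corresponds to the "type and intensity of the emotion" named as the output of this step.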

[0450] Step 6:

[0451] Based on the results of the emotion recognition, the server adjusts the system's response. For example, if the user is feeling anxious, it raises the alert level and adjusts the system to collect more data to reassure the user. This process aims to enhance informational feedback to the user.

[0452] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0453] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0454] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0455] [Third Embodiment]

[0456] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0457] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0458] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0459] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0460] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0461] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0462] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0463] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0464] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0465] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0466] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0467] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0468] The security system according to the present invention is a highly efficient security solution for houses and apartments, and is implemented with a configuration that includes various sensor devices, a generative AI, and a controllable robot. A specific embodiment thereof is described below.

[0469] Data collection

[0470] The server continuously collects audio and image data from sensor devices placed around houses and apartment buildings. These include surveillance cameras, microphones, and infrared sensors. Because the collected data is difficult to handle directly, the server performs pre-processing, such as noise removal.
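
As one possible illustration of the noise removal mentioned above, the following minimal sketch smooths a stream of sensor samples with a trailing moving average. The function name `moving_average`, the window size, and the sample values are assumptions introduced for illustration; the disclosure does not specify which filter the server uses.

```python
def moving_average(samples, window=3):
    """Smooth a sequence with a trailing moving average (a simple noise filter)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(samples)):
        # Average the current sample with up to (window - 1) preceding samples.
        start = max(0, i - window + 1)
        chunk = samples[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical raw readings containing a single spike.
raw = [0.0, 0.0, 9.0, 0.0, 0.0]
clean = moving_average(raw, window=3)
```

A larger window suppresses more noise but also blurs short events, so the choice would depend on the sensor in question.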

[0471] Data analysis and anomaly detection

[0472] The server sends pre-processed data to a generative AI model for analysis. The generative AI uses machine learning algorithms to analyze the data and detect abnormal changes or patterns from the normal state.
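
The idea of detecting a deviation from the normal state can be sketched with a simple statistical stand-in. The example below is not the generative AI model described here; it is a z-score check, assumed only for illustration, that flags readings far from a baseline recorded under normal conditions. The names `find_anomalies`, `baseline`, and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(baseline, readings, threshold=3.0):
    """Return indices of readings whose z-score against the baseline exceeds threshold."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as a deviation.
        return [i for i, x in enumerate(readings) if x != mu]
    return [i for i, x in enumerate(readings) if abs(x - mu) / sigma > threshold]


# Hypothetical sound levels (dB) recorded under normal conditions, then new readings.
baseline = [40, 42, 41, 39, 40, 41, 40, 42]
readings = [41, 40, 78, 39]
anomalous = find_anomalies(baseline, readings)
```

Here only the third reading (index 2) lies far outside the baseline spread, so it is the only one flagged.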

[0473] Risk assessment and alert generation

[0474] If an anomaly is detected, the server assesses the risk based on feedback from the AI. Based on the risk assessment, the server generates an alert notifying the user or security company of the anomaly. This facilitates immediate response.
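
A minimal sketch of how such an alert might be represented and serialized for delivery follows. The field set (time, location, risk level, summary) and the names `Alert`, `build_alert`, and `to_notification` are assumptions for illustration; the transport to the user or security company is not shown.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class Alert:
    detected_at: str   # ISO 8601 timestamp of the detection
    location: str      # where the anomaly was observed
    risk_level: str    # result of the risk assessment
    summary: str       # short human-readable description


def build_alert(location, risk_level, summary, now=None):
    """Create an alert stamped with the current UTC time unless one is supplied."""
    now = now or datetime.now(timezone.utc)
    return Alert(now.isoformat(), location, risk_level, summary)


def to_notification(alert):
    """Serialize the alert as JSON for sending to a user or security company."""
    return json.dumps(asdict(alert), sort_keys=True)


alert = build_alert(
    "garden", "high", "unusual sound",
    now=datetime(2024, 12, 12, 23, 30, tzinfo=timezone.utc),
)
payload = to_notification(alert)
```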

[0475] Robot control and patrol activities

[0476] After detecting an anomaly, the server instructs the robot to patrol the site. The robot autonomously moves along the designated route, monitoring its surroundings using sensors. During this process, any newly acquired data is sent back to the server and re-analyzed by the generative AI as needed.
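
The patrol along a designated route can be sketched as an ordered list of waypoints at which the robot takes a reading. This is only an illustration under stated assumptions: `Waypoint`, `patrol`, and `fake_sensor` are hypothetical names, and actual navigation and the link back to the server are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    name: str
    x: float
    y: float


def patrol(route, read_sensor):
    """Visit each waypoint in order and collect one reading per stop."""
    observations = []
    for waypoint in route:
        # A real robot would navigate here; this sketch only records the visit.
        observations.append({"waypoint": waypoint.name, "reading": read_sensor(waypoint)})
    return observations


def fake_sensor(waypoint):
    """Stand-in sensor that returns a value derived from the position."""
    return 40.0 + waypoint.x


route = [Waypoint("gate", 0, 0), Waypoint("garden", 5, 2), Waypoint("back door", 8, 6)]
report = patrol(route, fake_sensor)
```

The resulting `report` list is the kind of newly acquired data that would be sent back to the server for re-analysis.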

[0477] Continuous optimization

[0478] The server continuously records system-wide operation logs and analyzes them for further improvement. The generative AI continues machine learning based on new information and detected patterns, evolving its model to improve anomaly detection performance in subsequent iterations.
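
Recording operation logs for later analysis could look like the following sketch, which assumes an SQLite store purely for illustration. The table name `operation_log`, its columns, and the helper functions are hypothetical; the retraining of the model from these logs is not shown.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the log store; an in-memory database is used by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS operation_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "step TEXT NOT NULL, detail TEXT NOT NULL)"
    )
    return conn


def record(conn, step, detail):
    """Append one operation record."""
    conn.execute("INSERT INTO operation_log (step, detail) VALUES (?, ?)", (step, detail))
    conn.commit()


def count_by_step(conn):
    """Summarize how many records exist per processing step."""
    rows = conn.execute(
        "SELECT step, COUNT(*) FROM operation_log GROUP BY step ORDER BY step"
    )
    return dict(rows.fetchall())


conn = open_log()
record(conn, "detect", "unusual sound in garden")
record(conn, "alert", "user notified")
record(conn, "detect", "motion at gate")
summary = count_by_step(conn)
```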

[0479] Specific example: If a garden sensor detects an unusual sound late at night, the server sends the audio data to the generative AI, which judges that the sound may indicate a suspicious person. The user is immediately notified with an alert, and a robot is instructed to patrol the garden for further investigation. Real-time information allows quick and appropriate countermeasures to be taken. This entire process is recorded as a system operation log and used for future optimization.

[0480] Thus, the present invention realizes a highly efficient and accurate security system by combining a sensor device, a generative AI, and a robot.

[0481] The following describes the processing flow.

[0482] Step 1:

[0483] The server collects audio and image data in real time from sensors installed outside and inside houses and apartments. This includes surveillance cameras, microphones, and infrared sensors, and the collected data undergoes initial processing such as noise reduction and data format standardization.

[0484] Step 2:

[0485] The server sends the pre-processed data to the generative AI model. The generative AI model analyzes the data and executes speech recognition and image recognition algorithms to identify abnormal activity and sounds. This analysis detects any patterns that deviate from the normal state.

[0486] Step 3:

[0487] The generative AI model determines the presence or absence of anomalies based on the analysis results and performs a risk assessment. In this risk assessment, the degree of impact caused by the anomalies is calculated based on the type and frequency of the detected anomalies.

[0488] Step 4:

[0489] Based on feedback from the generative AI model, the server generates an alert if an anomaly has been detected. The alert includes information such as the time, location, and risk level of the anomaly, and is immediately sent to users and security personnel.

[0490] Step 5:

[0491] The server instructs the robot to patrol the site as needed. The instructed robot then patrols a pre-set path, collecting further data using sensors and cameras, and transmitting it to the server in real time.

[0492] Step 6:

[0493] The server re-evaluates the new data sent from the robot and reassesses the overall situation. Additional alerts are generated and further risk assessments are made as needed.

[0494] Step 7:

[0495] The server records system-wide operation logs in a database, which are then used for subsequent analysis and system optimization. Furthermore, the generative AI model uses these operation logs to improve accuracy through machine learning and enhance its threat recognition capabilities.

[0496] This processing flow allows the system to quickly and efficiently enhance the security of homes and apartments, providing residents with a sense of safety.
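
The risk assessment in Step 3, in which the degree of impact is calculated from the type and frequency of detected anomalies, can be sketched as follows. The type weights, the score thresholds, and the names `TYPE_WEIGHTS`, `risk_score`, and `risk_level` are invented for illustration and are not taken from this disclosure.

```python
# Hypothetical weights per anomaly type; unknown types fall back to a weight of 1.
TYPE_WEIGHTS = {"glass_break": 5, "unknown_person": 4, "unusual_sound": 2}


def risk_score(anomaly_type, occurrences):
    """Combine anomaly type and frequency into a single score."""
    weight = TYPE_WEIGHTS.get(anomaly_type, 1)
    return weight * occurrences


def risk_level(score):
    """Map a score to a coarse level using assumed thresholds."""
    if score >= 8:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


level = risk_level(risk_score("unusual_sound", 3))
```

With these assumed values, three unusual sounds score 6 and are rated "medium", while two glass-break events score 10 and are rated "high".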

[0497] (Example 1)

[0498] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0499] In recent years, security problems such as crime and intrusions have been increasing, particularly in urban areas. Conventional security systems can be slow to detect and respond to anomalies, highlighting the need for both immediacy and accuracy. The present invention aims to provide a security system that enables advanced anomaly detection and rapid response.

[0500] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0501] In this invention, the server includes means for collecting surrounding information using a remote information acquisition device, means for pre-processing the information, transmitting it to a generative artificial intelligence for analysis, and detecting anomalies, and means for evaluating the degree of risk based on the anomaly detection and creating a notification. This enables rapid and effective anomaly detection and response.

[0502] A "remote information acquisition device" is a device that senses surrounding information such as sound, video, and motion, and provides the data to a server.

[0503] "Preprocessing" is the process of removing unwanted noise and interference from collected information, preparing it for analysis.

[0504] "Generative artificial intelligence" is a type of artificial intelligence that uses machine learning algorithms to analyze input data and perform anomaly detection and pattern recognition.

[0505] "Assessing the degree of risk" means determining the risk level of an event based on the circumstances of its occurrence and its likelihood.

[0506] An "autonomous device" is a robot that automatically moves along a designated path based on programmed commands while monitoring its surroundings.

[0507] "Operational records" refer to data that records the system's operation history and processing results, and are used for later analysis.

[0508] "Continuous optimization" refers to the activity of progressively improving the system's performance and processing accuracy based on collected operational records.

[0509] To implement this invention, a sensor device, a server, a generative AI model, and an autonomous robot device are used.

[0510] Data collection:

[0511] The server collects data from sensor devices installed in houses and apartments. These include microphones to capture audio data, surveillance cameras to capture images, and infrared sensors to detect motion. These devices sense information about the surroundings and transmit the data to the server.

[0512] Data preprocessing and analysis:

[0513] The server performs preprocessing on the collected data, such as noise reduction and data normalization, and then sends it to the generative AI model. The generative AI model analyzes the collected data using machine learning algorithms and detects anomalies. This analysis identifies events that deviate from normal patterns.
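
For audio, the noise reduction and data normalization described above might be approximated by a noise gate followed by peak normalization. This is a simplified sketch under assumptions: the function names, the gate floor of 0.05, and the sample values are hypothetical, and a production system would likely use more sophisticated filtering.

```python
def noise_gate(samples, floor=0.05):
    """Zero out samples whose magnitude is below the floor (treated as background noise)."""
    return [s if abs(s) >= floor else 0.0 for s in samples]


def peak_normalize(samples):
    """Scale samples so that the largest magnitude becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent or empty input: nothing to scale.
        return list(samples)
    return [s / peak for s in samples]


# Hypothetical audio samples in the range [-1.0, 1.0].
raw_audio = [0.01, -0.02, 0.25, -0.5, 0.03]
prepared = peak_normalize(noise_gate(raw_audio))
```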

[0514] Risk assessment and notification:

[0515] The server assesses the risk based on the results of anomaly detection. If the risk level is determined to be high, the server immediately generates a notification and issues an alert to the user or security company.

[0516] Robot patrol:

[0517] When an anomaly is detected, the server instructs the autonomous robot to patrol. The robot automatically moves within the designated area and acquires more detailed data. The acquired data is then sent back to the server for further analysis.

[0518] Continuous optimization:

[0519] The server records all operations that occur within the system and uses this record to optimize the entire system. The generative AI model continuously learns from this data to improve performance.

[0520] For example, if a suspicious sound is detected in the garden at night, the sound is collected by a microphone. The server sends the audio data to the generative AI model for analysis of the anomaly. If a potential intruder is detected, the server sends an alert notification to the user and instructs a robot to patrol the area. This process enables a quick and effective response.

[0521] As an example of a prompt, the generative AI model is given instructions such as, "Analyze data on unusual noises detected in the garden late at night and identify suspicious activity." This allows the generative AI model to perform the appropriate analysis.
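
A prompt of this form could be assembled from the details of a detection event, as in the sketch below. The function `build_prompt` and its parameters are hypothetical; only the wording of the resulting sentence comes from the example above, and the call to the model itself is not shown.

```python
def build_prompt(location, time_of_day, event):
    """Fill the example instruction with the details of a detection event."""
    return (
        f"Analyze data on {event} detected in the {location} {time_of_day} "
        "and identify suspicious activity."
    )


prompt = build_prompt("garden", "late at night", "unusual noises")
```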

[0522] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0523] Step 1:

[0524] Data collection

[0525] The server continuously collects ambient audio and image data from sensor devices installed in houses and apartments. Examples of sensor devices include microphones for audio acquisition and surveillance cameras for image acquisition. The input data consists of audio and image data, including noise. This data is transmitted to the server via the network.

[0526] Step 2:

[0527] Data preprocessing

[0528] The server performs noise reduction and data normalization on the collected audio and image data. For audio data, background noise is reduced and important audio signals are extracted. For image data, images are filtered to prevent unwanted light reflections and false detections. As a result of this preprocessing, clear audio and image data suitable for analysis is obtained.

[0529] Step 3:

[0530] Data analysis and anomaly detection

[0531] The server sends the pre-processed data to the generative AI model together with a prompt instructing it to detect abnormal patterns. The generative AI model uses machine learning algorithms to identify deviations from normal patterns. The input is the filtered, clear audio and image data, and the output is information about the presence or absence of anomalies and their patterns.

[0532] Step 4:

[0533] Risk assessment and notification generation

[0534] The server performs a risk assessment based on anomaly detection information from the generative AI model. Depending on the level of risk, the server creates and sends alert notifications to users and security companies in real time. The input is anomaly detection information, and the output is the content and priority of the notification.

[0535] Step 5:

[0536] Instructions for robot patrol

[0537] When the server detects an anomaly, it instructs the autonomous robot device to patrol the site. The robot automatically moves along the designated path and collects additional sensor data. This data is sent back to the server and re-analyzed by the generative AI model as needed. The inputs are the patrol instructions and sensor data, and the output is additional monitoring data.

[0538] Step 6:

[0539] Continuous optimization of the system

[0540] The server continuously records the operation logs of the entire system and uses them to train the generative AI model. This improves the anomaly detection capability in subsequent runs and optimizes the entire system. The input is the system's operation history, and the output is the performance of the improved generative AI model.
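
The way Steps 1 to 6 chain together can be sketched as a single function that passes already-collected data through each stage in turn. Every stage is injected as a callable, and all names (`run_pipeline`, the stage parameters, and the toy lambdas) are assumptions for illustration rather than the actual implementation.

```python
def run_pipeline(raw, preprocess, detect, assess, notify, patrol, log):
    """Run the stages once over already-collected data; each stage is injected."""
    data = preprocess(raw)              # Step 2: preprocessing
    anomaly = detect(data)              # Step 3: anomaly detection
    log("detect", anomaly)              # Step 6: keep an operation record
    if anomaly is None:
        return "no_anomaly"
    level = assess(anomaly)             # Step 4: risk assessment
    notify(anomaly, level)              # Step 4: notification
    log("notify", level)
    extra = patrol(anomaly)             # Step 5: robot patrol gathers more data
    log("patrol", extra)
    return level


# Toy stand-ins for each stage, recording what happened in `events`.
events = []
result = run_pipeline(
    raw=[0.1, 0.9, 0.2],
    preprocess=lambda xs: [round(x, 1) for x in xs],
    detect=lambda xs: "loud_sound" if max(xs) > 0.8 else None,
    assess=lambda anomaly: "high",
    notify=lambda anomaly, level: events.append(("notify", anomaly, level)),
    patrol=lambda anomaly: {"checked": "garden"},
    log=lambda step, detail: events.append(("log", step, detail)),
)
```

Injecting the stages keeps the control flow testable without real sensors, a model, or a robot.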

[0541] (Application Example 1)

[0542] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0543] Existing security systems face challenges in detecting anomalies early and responding quickly. In particular, in residential and apartment buildings, flexible responses tailored to the specific conditions and environments of individual locations are required, but conventional sensors and monitoring technologies have limitations. Furthermore, there are insufficient means for users to check the status of their homes in real time and take appropriate action while away.

[0544] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0545] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information using an automated learning model to detect anomalies, and means for enabling real-time confirmation of anomalies from a mobile terminal. This makes it possible for users to quickly grasp anomalies no matter where they are and take necessary actions in real time.

[0546] "Various detection devices" refers to a diverse group of sensors installed to acquire external environmental information such as sound, video, temperature, and motion.

[0547] "Information" refers to all data collected by various detection devices, including audio data, video data, and other forms of environmental data.

[0548] An "automated learning model" is a machine learning system that includes algorithms for analyzing collected information and detecting anomalies.

[0549] An "abnormality" refers to a unique pattern or condition that deviates from the normal state, and includes situations that are predicted to pose a safety risk.

[0550] A "mobile device" is an electronic device that a user can carry with them, such as a smartphone or smart glasses.

[0551] "Means that enable the confirmation of anomalies in real time" refers to technology that allows users to instantly recognize situations occurring at their home or facility visually or audibly through their mobile devices.

[0552] The system for carrying out this invention consists of various detection devices, an automated learning model, a mobile terminal, and a server.

[0553] The server continuously collects information about the surroundings from various detection devices installed in homes and facilities. These include voice sensors, cameras, and temperature sensors, and the data obtained from these sensors is first aggregated on the server.

[0554] The server inputs the collected information into an automated learning model to detect anomalies. The automated learning model is built using software such as Python and TensorFlow, and it analyzes the information to identify patterns that deviate from the normal state. During this process, noise reduction and data preprocessing are performed using tools such as OpenCV.

[0555] When an anomaly is detected, the server sends a real-time notification to the mobile device, allowing the user to view live video from the scene via their smartphone or smart glasses. This enables the user to quickly grasp the details of the anomaly and immediately consider countermeasures.

[0556] For example, if suspicious activity is detected in a residential yard late at night, the server will determine this to be an anomaly and send a notification to the user's mobile device. The user can then view the real-time video through smart glasses to check for any suspicious individuals.

[0557] An example of a prompt message is, "An unusual sound has been detected in the garden of a house. Analyze the camera footage and identify the anomaly." This instruction is input into the automated learning model, supporting appropriate actions. This system allows users to live with peace of mind.

[0558] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0559] Step 1:

[0560] The server continuously collects information about the surroundings from various detection devices. This includes data from sound sensors, cameras, and other sources. The collected data is raw and requires noise reduction and formatting.

[0561] Step 2:

[0562] The server performs noise reduction and other necessary preprocessing on the collected information. This involves using image processing tools such as OpenCV to reduce noise in video data and normalizing the sampling of audio data. This process improves data accuracy and facilitates analysis in the next step.

[0563] Step 3:

[0564] The server inputs the pre-processed information into the automated learning model and performs anomaly detection. The automated learning model is built using TensorFlow and identifies situations that deviate from normal by analyzing information patterns. Here, the model is instructed to generate an alarm for conditions that are judged to be anomalies.

[0565] Step 4:

[0566] If the server detects an anomaly, it immediately sends an alert to the mobile device. Upon receiving this alert, the user's device can request and view a real-time video feed from the site. The video data is sent to the device using streaming technology.

[0567] Step 5:

[0568] Users check the situation on-site via their mobile devices or smart glasses and carefully examine the content of alerts. Based on the confirmed information, they decide on appropriate actions. For example, after confirming an anomaly, they might contact the security company directly.

[0569] Step 6:

[0570] The server collects operation logs for each processing step and analyzes them to improve the entire system. The system's operational data is used to improve the accuracy of the automated learning model, enabling continuous optimization.
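
If the normalization of audio sampling in Step 2 is read as bringing recordings to a common sample rate, which is an assumption, it could be sketched with plain linear interpolation. The function `resample_linear` is hypothetical and applies no anti-aliasing filter, so it is a rough illustration only.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if not samples:
        return []
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / dst_rate
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# Hypothetical 8 kHz clip brought up to 16 kHz.
upsampled = resample_linear([0.0, 1.0, 0.0, -1.0], src_rate=8000, dst_rate=16000)
```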

[0571] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0572] The system according to the present invention combines various sensor devices, a generative AI, and an emotion engine to significantly enhance the security of houses and apartments and enable interaction with users. The embodiments thereof are described in detail below.

[0573] Data collection and analysis

[0574] The server continuously collects audio and image data from sensor devices installed in houses and apartments. This includes cameras, microphones, and temperature sensors. The collected data is preprocessed in real time and sent to a generative AI model. The generative AI model analyzes this data and performs speech and image recognition to determine if there are any unusual sounds or movements.

[0575] Anomaly detection and risk assessment

[0576] The server assesses the risk of the current situation based on anomalies detected by the generative AI model. An alert is triggered according to the assessed risk, which reflects the type and frequency of the anomaly as well as the security policy. The alert is immediately sent to users and security personnel.

[0577] Robot control

[0578] If an anomaly is detected, the server issues a command to the robot to patrol the site. The robot autonomously moves along the designated route, continuously collecting data from its surroundings and providing real-time feedback to the server. This allows for seamless on-site verification and response to anomalies.

[0579] Emotion-based interaction

[0580] When a user interacts with the system, it utilizes an emotion engine. This engine analyzes the user's emotions from their tone of voice and facial expressions, and adjusts the system's response accordingly. For example, if a user is feeling anxious, the system can raise its alert level, providing more detailed monitoring and faster informational feedback.
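
The adjustment of the alert level according to the user's emotion can be sketched as below. The sketch assumes the emotion label has already been estimated elsewhere (for example by the emotion identification model 59); the level names, the set of emotions treated as anxious, and the function `adjust_alert_level` are hypothetical.

```python
# Assumed ordering of alert levels, from least to most vigilant.
ALERT_LEVELS = ["normal", "elevated", "high"]
# Assumed emotion labels that should raise the alert level.
ANXIOUS_EMOTIONS = {"anxious", "afraid"}


def adjust_alert_level(current, emotion):
    """Raise the alert level by one step when the user appears anxious."""
    index = ALERT_LEVELS.index(current)
    if emotion in ANXIOUS_EMOTIONS:
        # Never go beyond the highest defined level.
        index = min(index + 1, len(ALERT_LEVELS) - 1)
    return ALERT_LEVELS[index]


new_level = adjust_alert_level("normal", "anxious")
```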

[0581] Specific example: If a user notices a suspicious noise at night, the system immediately sends the audio data to the AI to detect the anomaly. If the user tells the system via intercom that they are "anxious," the emotion engine analyzes this information, strengthens the alert mode, and prompts robots to patrol promptly. It also provides detailed alert information to offer reassurance. In this way, the system enables advanced interaction that responds to user needs in real time.

[0582] This entire system analyzes operation logs in detail and uses the data for subsequent system evaluation and optimization. This enables the provision of continuous and effective security.

[0583] The following describes the processing flow.

[0584] Step 1:

[0585] The server collects audio and image data in real time from sensor devices placed in the surrounding area. This includes surveillance cameras and microphones, and the collected data is pre-processed, such as noise reduction, to accurately capture changes in the environment.

[0586] Step 2:

[0587] Upon receiving pre-processed data, the server sends it to a generative AI model, which identifies unusual activity and sounds through speech and image recognition. The generative AI model can continuously process data to improve its accuracy by repeatedly learning from past data.

[0588] Step 3:

[0589] In response to detected anomalies, the server evaluates the alert level based on the risk assessment results calculated by the generative AI model. If the risk is determined to be high, the server immediately generates an alert and notifies users and security personnel of the relevant information.

[0590] Step 4:

[0591] When a user receives an alert and accesses the system to check the situation, the system uses its emotion engine to analyze the user's emotional state. It determines emotions from voice tone and facial expression data, and automatically raises the system's alert level if the user is feeling anxious.

[0592] Step 5:

[0593] The server instructs robots to patrol areas where anomalies have been detected. The robots move along pre-set paths, scanning their surroundings with built-in sensors and sending the latest status data to the server for further anomaly detection.

[0594] Step 6:

[0595] The server analyzes the data from the robots and reconfirms whether there are any anomalies. It issues additional alerts as needed to strengthen security.

[0596] Step 7:

[0597] The server records the operation logs collected at each processing step in a database. These logs are used to optimize the system and improve the learning accuracy of the generative AI model, resulting in continuous improvement of the entire system.

[0598] Through this series of processes, the system efficiently ensures user safety and, by utilizing the emotion engine, enables fast and flexible interaction.

[0599] (Example 2)

[0600] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0601] In modern living spaces, there is a growing demand for both enhanced safety and improved convenience. In particular, there is a growing need for security systems that can quickly and accurately detect anomalies within homes and buildings and respond appropriately. However, existing systems have limitations in both anomaly detection and user response, making it difficult to provide users with a greater sense of security. Therefore, the present invention aims to provide a system that enhances anomaly detection capabilities while enabling flexible responses tailored to the user's emotions.

[0602] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0603] In this invention, the server includes means for collecting environmental data using various detection devices, means for formatting the data, transmitting it to an intelligent model for analysis and detection of anomalies, and means for analyzing the user's emotional state through audio and video and adjusting the system response. This makes it possible to improve the accuracy of anomaly detection and provide appropriate responses based on the user's emotions.

[0604] A "detection device" refers to various sensors used to collect data such as sound, images, and temperature from the environment.

[0605] An "intelligent model" refers to artificial intelligence technology that uses collected data to detect and analyze anomalies.

[0606] "Mobile units" refer to automated machines such as robots that autonomously patrol designated routes and monitor changes in the environment.

[0607] "Emotional state" refers to the psychological state of a user, as analyzed from their voice tone, facial expressions, and other factors.

[0608] "Operation records" refer to the overall operational history of the system, and serve as a source of information for optimizing the system based on this data.

[0609] "Risk assessment" is the process of analyzing the risks posed by detected anomalies and using that information to generate appropriate alarms.

[0610] An "alarm" is a warning and response instruction issued by the system in response to detected anomalies or dangers.

[0611] This invention is a system that enhances the security of homes and buildings and enables advanced user interaction. The system consists of several important elements, and its specific embodiments will be described below.

[0612] Hardware and software usage

[0613] The server is responsible for collecting data from various detection devices installed in houses and apartments. Specifically, these include cameras, microphones, and temperature sensors. These devices acquire image data, audio data, and ambient temperature data, and this information is used for anomaly detection.

[0614] The server preprocesses the collected data and sends it to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to analyze and detect abnormal sounds, movements, and temperature changes. The software employs advanced AI analysis algorithms and noise filtering tools.

[0615] Interaction and emotion analysis

[0616] When users interact directly with the system, they can communicate their state to the system through an emotion engine. The emotion engine analyzes the user's voice tone and facial expressions, and the system optimizes its response based on this analysis. This feature allows the system to respond flexibly according to the user's emotional state.

[0617] Specific example:

[0618] If a user reports hearing suspicious noises at night and expresses concern, the server sends the audio data to the generative AI to detect any abnormalities. If the audio analysis determines that the user is feeling anxious, the system raises its alert level, instructs robots to patrol, and checks the situation. The system also sends detailed monitoring information to the user via push notifications and email to provide reassurance.

[0619] Example of a prompt

[0620] "Please describe a system that detects abnormal sounds at night, analyzes the user's emotions, and then takes appropriate action."

[0621] Thus, by combining precise sensor technology, a generative AI model, and an emotion recognition engine, the system of the present invention can solve various challenges in modern residential security and provide users with peace of mind and safety.

[0622] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0623] Step 1:

[0624] The server collects environmental data from various detection devices. This includes image data from cameras, audio data from microphones, and temperature data from temperature sensors. The input data is sent to the server in its raw state and in various formats. Specifically, the camera periodically captures images of the gate and the interior of the room, the microphone records background sounds, and the temperature sensor measures the room temperature. The output indicates the completion of data collection.

[0625] Step 2:

[0626] The server preprocesses the collected raw data. Data processing is performed to remove noise, standardize data formats, and highlight important features. Specifically, background noise is removed from audio data, and the resolution of image data is adjusted. The input is the raw data collected in step 1, and the output is clear data ready for analysis.

[0627] Step 3:

[0628] The server sends the pre-processed data to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to execute an anomaly detection algorithm. Specifically, the AI model analyzes the data and identifies abnormal movements and sounds. The input is pre-processed data, and the output is the result of determining whether or not an anomaly exists.

[0629] Step 4:

[0630] The server assesses the risk based on anomalies detected by the AI model. It analyzes the type and frequency of anomalies and generates alarms based on security policies. Specifically, data identified as anomalous is recorded, and users are notified based on this information. The input is the anomaly detection result from the AI model, and the output is alarm information.

[0631] Step 5:

[0632] If an anomaly is detected, the server issues a patrol command to the mobile robot. The robot patrols the site according to the designated route and collects additional data. Specifically, the robot automatically moves through corridors and entrances, and feeds back newly recorded video and audio to the server in real time. The input is the patrol command from the server, and the output is the results of the site inspection.

[0633] Step 6:

[0634] When a user interacts with the system, the server uses an emotion engine to analyze this information. The emotion engine determines the user's emotional state based on their tone of voice and facial expressions, and optimizes the system's response. For example, if a user says "I'm anxious," the system analyzes this and raises the alert level. The input is the emotional feedback from the user, and the output is the adjusted system response.
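The adjustment described in Step 6 can be illustrated with a minimal sketch. The alert level names, the keyword list, and the `SystemResponse` fields are assumptions made for illustration only; the source does not specify them, and keyword spotting is merely a stand-in for the emotion engine, which analyzes tone of voice and facial expressions.

```python
from dataclasses import dataclass

# Hypothetical alert levels; the source only says the level is "raised".
ALERT_LEVELS = ["normal", "elevated", "high"]

# Hypothetical keyword list standing in for the emotion engine.
ANXIETY_KEYWORDS = ("anxious", "scared", "worried", "afraid")


@dataclass
class SystemResponse:
    alert_level: str
    patrol_requested: bool
    notify_user: bool


def estimate_anxiety(utterance: str) -> bool:
    """Rough stand-in for the emotion engine: keyword spotting on text."""
    text = utterance.lower()
    return any(word in text for word in ANXIETY_KEYWORDS)


def adjust_response(current_level: str, utterance: str) -> SystemResponse:
    """Raise the alert level by one step when the user sounds anxious."""
    index = ALERT_LEVELS.index(current_level)
    if estimate_anxiety(utterance):
        index = min(index + 1, len(ALERT_LEVELS) - 1)
        return SystemResponse(ALERT_LEVELS[index], True, True)
    return SystemResponse(ALERT_LEVELS[index], False, False)
```

Under these assumptions, `adjust_response("normal", "I'm anxious")` moves the system to the "elevated" level and requests a patrol, while a neutral utterance leaves the level unchanged.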

[0635] (Application Example 2)

[0636] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0637] Improving the safety of homes and buildings requires rapid and accurate detection of anomalies, as well as appropriate responses that respond to user emotions. However, conventional systems suffer from insufficient anomaly detection accuracy and poor user interaction quality, resulting in a lack of safety and convenience. The challenge is to solve this problem and provide a higher level of security and user experience.

[0638] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0639] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information and detecting anomalies using a generative electronic brain, and means for interacting with the user using emotion recognition and adjusting the response based on the user's emotions. This enables immediate detection of anomalies, risk assessment, and optimal interaction according to the user's emotions.

[0640] A "detection device" is a device that collects information such as sound, images, and temperature from the external environment.

[0641] A "generative electronic brain" is a system that uses advanced artificial intelligence technology to analyze collected information and detect anomalies.

[0642] A "means for assessing risk" refers to a function that determines the risk of the current situation based on the results of anomaly detection and generates appropriate warnings.

[0643] "Machines" refers to devices such as robots that move autonomously, collect information, and patrol areas with abnormalities.

[0644] "Emotion recognition" is a technology that analyzes a user's emotions from their voice and facial expressions, making it possible to adjust the system's operation based on the user's emotions.

[0645] "Operation records" refer to logs that record the overall operating status and processing details of the system in chronological order, and analyzing these logs provides information that can be used to optimize the system.

[0646] The system for carrying out this invention comprises various detection devices, a generative electronic brain means, and an emotion recognition function, and is designed to enhance the safety of homes and buildings. Specific embodiments are shown below.

[0647] First, the server continuously collects information about the surroundings through various detection devices such as cameras, microphones, and temperature sensors. This data is processed in real time and sent to a cloud-based server. Services such as AWS Lambda and Google Cloud Functions are used for this process.
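As a hedged illustration of how such a serverless function might receive a sensor reading, the following sketch uses the `(event, context)` handler form that AWS Lambda's Python runtime invokes. The event layout, the sensor names, and the response fields are assumptions for this sketch and are not taken from the source.

```python
import json

# Hypothetical set of sensor kinds, based on the devices named in the text.
KNOWN_SENSORS = {"camera", "microphone", "temperature"}


def _reply(status: int, payload: dict) -> dict:
    """Build an HTTP-style response dictionary."""
    return {"statusCode": status, "body": json.dumps(payload)}


def handler(event, context):
    """Accept one sensor reading and acknowledge it.

    The event layout ({"body": "<json string>"}) is an assumption; a real
    deployment would follow whatever trigger is configured.
    """
    try:
        reading = json.loads(event["body"])
    except (KeyError, TypeError, json.JSONDecodeError):
        return _reply(400, {"error": "invalid payload"})

    if not isinstance(reading, dict):
        return _reply(400, {"error": "payload must be an object"})
    if reading.get("sensor") not in KNOWN_SENSORS or "value" not in reading:
        return _reply(400, {"error": "unknown sensor or missing value"})

    # Forwarding to the analysis stage is out of scope; only acknowledge here.
    return _reply(200, {"accepted": reading["sensor"]})
```

Validating the payload at the entry point keeps malformed readings away from the later analysis stage, which is one reasonable design choice among several.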

[0648] The server analyzes the collected data using a generative electronic brain to detect anomalies. The generative electronic brain uses a generative AI model and can identify anomalies by performing voice recognition and image recognition. Furthermore, it uses emotion recognition to analyze voice and text from the user to understand the user's emotional state.

[0649] If an anomaly is detected, the server assesses the risk and issues a command to the machine to patrol the area. The machine autonomously moves within the designated area, collecting and feeding back data in real time, and seamlessly confirming and responding to the anomaly.

[0650] When users interact with the system, an emotion engine is used to make the interaction responsive to their emotions. The emotion engine analyzes the user's emotions from their voice tone and facial expressions, and adjusts the system's response accordingly. For example, if a user feels anxious, the system can raise its alert level and provide detailed monitoring and rapid informational feedback.

[0651] For example, if a user reports a "suspicious noise" to the system, the emotion engine analyzes it, immediately intensifies the alert mode, and the machine patrols the area quickly. The user is also notified of detailed alert information, providing a sense of security.

[0652] Examples of prompt messages include the following:

[0653] "A suspicious noise has been detected in the user's home. Please demonstrate the process of using an emotion engine to alleviate the user's anxiety."

[0654] "Please explain in detail how you plan to enhance user confidence after detecting an anomaly."

[0655] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0656] Step 1:

[0657] The server collects audio data, image data, and temperature information from various detection devices. This input data is preprocessed, for example through normalization and filtering, to improve data quality, and then prepared as input for the generative AI model.

[0658] Step 2:

[0659] The server inputs pre-processed data into a generative AI model for speech and image recognition. The generative AI model performs pattern matching and feature extraction to detect anomalies and determines whether an anomaly exists. The output consists of a flag indicating the presence or absence of an anomaly and metadata about the characteristics of that anomaly.

[0660] Step 3:

[0661] The server assesses the risk based on the results of anomaly detection. It receives anomaly flags and metadata as input and applies a rule-based algorithm to evaluate the risk level. The output is a numerical score indicating the risk level, which is used to determine the next step.

[0662] Step 4:

[0663] If an anomaly or high risk is detected, the server issues a patrol command to the machine. Using patrol route information as input, the server instructs the machine to move, collects surrounding data in real time, and feeds it back to the server. This process complements on-site anomaly verification.

[0664] Step 5:

[0665] The server receives input from the user and performs emotion recognition. It takes user voice and text data as input and analyzes it using an emotion engine. This analysis uses natural language processing algorithms to identify the user's emotions. The output is information about the type and intensity of the emotion.

[0666] Step 6:

[0667] Based on the results of the emotion analysis, the server adjusts the system's response. For example, if the user is feeling anxious, it raises the alert level and adjusts the system to collect more data in order to reassure the user. This process aims to enhance informational feedback to the user.
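The rule-based risk evaluation of Step 3 and the patrol decision of Step 4 can be sketched as follows. The metadata fields, the weights, and the threshold are hypothetical values chosen for illustration; the source states only that a rule-based algorithm produces a numerical score that determines the next step.

```python
from dataclasses import dataclass


@dataclass
class AnomalyResult:
    detected: bool      # anomaly flag from Step 2
    kind: str           # metadata, e.g. "glass_break", "motion", "voice"
    confidence: float   # metadata, 0.0 to 1.0
    at_night: bool      # metadata, time-of-day context


# Hypothetical base weights per anomaly type; not taken from the source.
BASE_WEIGHTS = {"glass_break": 70, "motion": 40, "voice": 30}
DEFAULT_WEIGHT = 20     # hypothetical weight for unlisted anomaly types
NIGHT_BONUS = 20        # hypothetical increase for night-time events
PATROL_THRESHOLD = 50   # hypothetical cut-off for dispatching the machine


def risk_score(result: AnomalyResult) -> int:
    """Return a risk score from 0 to 100 using simple additive rules."""
    if not result.detected:
        return 0
    score = BASE_WEIGHTS.get(result.kind, DEFAULT_WEIGHT) * result.confidence
    if result.at_night:
        score += NIGHT_BONUS
    return min(100, round(score))


def should_patrol(result: AnomalyResult) -> bool:
    """Decide whether the server issues a patrol command (Step 4)."""
    return risk_score(result) >= PATROL_THRESHOLD
```

With these assumed values, a confident glass-break detection at night scores well above the threshold, while a low-confidence daytime motion event does not trigger a patrol.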

[0668] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0669] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
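The input and output relationship described for the data generation model 58 can be sketched as a request that bundles a prompt with inference data of several modalities. All class and function names here are hypothetical, and `echo_model` is a placeholder that only reports what it received; it does not represent the API of any actual generative AI service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MODALITIES = ("audio", "text", "image")


@dataclass
class InferenceItem:
    modality: str   # one of MODALITIES
    payload: bytes  # raw data for that modality


@dataclass
class ModelRequest:
    prompt: str                                   # instructions for the model
    items: List[InferenceItem] = field(default_factory=list)

    def add(self, modality: str, payload: bytes) -> "ModelRequest":
        """Attach one piece of inference data and return self for chaining."""
        if modality not in MODALITIES:
            raise ValueError(f"unsupported modality: {modality}")
        self.items.append(InferenceItem(modality, payload))
        return self


ModelFn = Callable[[ModelRequest], Dict[str, str]]


def run_inference(model: ModelFn, request: ModelRequest) -> Dict[str, str]:
    """Pass the request to a model callable after checking the prompt."""
    if not request.prompt.strip():
        raise ValueError("a prompt containing instructions is required")
    return model(request)


def echo_model(request: ModelRequest) -> Dict[str, str]:
    """Placeholder model: reports the received modalities as text output."""
    kinds = sorted({item.modality for item in request.items})
    return {"format": "text", "result": "received " + ", ".join(kinds)}
```

Keeping the model behind a plain callable lets the same request structure be used regardless of which generative AI is ultimately connected.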

[0670] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0671] [Fourth Embodiment]

[0672] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0673] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0674] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0675] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0676] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0677] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0678] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0679] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0680] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0681] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0682] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0683] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0684] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0685] The security system according to the present invention is a highly efficient security solution for houses and apartments, and is implemented with a configuration that includes various sensor devices, a generative AI, and a control robot. A specific embodiment thereof is described below.

[0686] Data collection

[0687] The server continuously collects audio and image data from sensor devices placed around houses and apartment buildings. These include surveillance cameras, microphones, and infrared sensors. Because the collected data is difficult to handle directly, the server performs pre-processing, such as noise removal.

[0688] Data analysis and anomaly detection

[0689] The server sends pre-processed data to a generative AI model for analysis. The generative AI uses machine learning algorithms to analyze the data and detect abnormal changes or patterns from the normal state.
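The idea of detecting deviations from the normal state can be illustrated with a simple statistical baseline. This sketch deliberately substitutes a z-score test for the generative AI model, which the source does not specify in detail; the threshold and the use of sound levels as input are assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def find_anomalies(
    baseline: Sequence[float],
    samples: Sequence[float],
    threshold: float = 3.0,  # hypothetical cut-off in standard deviations
) -> List[int]:
    """Return indices of samples that deviate strongly from the baseline.

    `baseline` holds readings from the normal state (for example, sound
    levels on a quiet night); `samples` holds new readings to check.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly constant baseline: any different value is a deviation.
        return [i for i, x in enumerate(samples) if x != mu]
    return [i for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]
```

For example, with a baseline of quiet-night sound levels around 40, a new reading of 75 would be flagged while readings of 39 to 41 would not.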

[0690] Risk assessment and alert generation

[0691] If an anomaly is detected, the server assesses the risk based on feedback from the AI. Based on the risk assessment, the server generates an alert notifying the user or security company of the anomaly. This facilitates immediate response.

[0692] Robot control and patrol activities

[0693] After detecting an anomaly, the server instructs the robot to patrol the site. The robot autonomously moves along the designated route, monitoring its surroundings using sensors. During this process, any newly acquired data is sent back to the server and re-analyzed by the generative AI as needed.

[0694] Continuous optimization

[0695] The server continuously records system-wide operation logs and analyzes them for further improvement. The generative AI continues machine learning based on new information and detected patterns, evolving its model to improve anomaly detection performance in subsequent iterations.

[0696] Specific example: If a garden sensor detects an unusual sound late at night, the server sends the audio data to a generative AI, which identifies it as indicating a suspicious person. The user is immediately notified with an alert, and a robot is instructed to patrol the garden for further investigation. Real-time information allows quick and appropriate countermeasures to be taken. This entire process is recorded as a system operation log and used for future optimization.

[0697] Thus, the present invention realizes a highly efficient and accurate security system by combining a sensor device, a generative AI, and a robot.

[0698] The following describes the processing flow.

[0699] Step 1:

[0700] The server collects audio and image data in real time from sensors installed outside and inside houses and apartments. This includes surveillance cameras, microphones, and infrared sensors, and the collected data undergoes initial processing such as noise reduction and data format standardization.

[0701] Step 2:

[0702] The server sends the pre-processed data to the generative AI model. The generative AI model analyzes the data and executes speech recognition and image recognition algorithms to identify abnormal activity and sounds. This analysis detects any patterns that deviate from the normal state.

[0703] Step 3:

[0704] The generative AI model determines the presence or absence of anomalies based on the analysis results and performs a risk assessment. In this risk assessment, the degree of impact caused by the anomalies is calculated based on the type and frequency of the detected anomalies.

[0705] Step 4:

[0706] Based on feedback from the generative AI model, the server generates an alert if an anomaly is detected. The alert includes information such as the time, location, and risk level of the anomaly, and is immediately sent to users and security personnel.

[0707] Step 5:

[0708] The server instructs the robot to patrol the site as needed. The robot, upon receiving the instruction, patrols a pre-set path, collecting further data using sensors and cameras, and transmitting it to the server in real time.

[0709] Step 6:

[0710] The server re-evaluates the new data sent from the robot and reassesses the overall situation. If necessary, additional alerts are generated and further risk assessments are performed.

[0711] Step 7:

[0712] The server records system-wide operation logs in a database, which are then used for subsequent analysis and system optimization. Furthermore, the generative AI model uses these operation logs to improve accuracy through machine learning and enhance its threat recognition capabilities.

[0713] This processing flow allows the system to quickly and efficiently enhance the security of homes and apartments, providing residents with a sense of safety.
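The alert of Step 4, which carries the time, location, and risk level, and the reassessment of Step 6 can be sketched as follows. The three risk level names, the one-step adjustment rule, and the JSON message format are assumptions for illustration and are not specified in the source.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

RISK_LEVELS = ("low", "medium", "high")  # hypothetical level names


@dataclass(frozen=True)
class Alert:
    occurred_at: str   # ISO 8601 timestamp of the anomaly
    location: str      # where the anomaly was detected
    risk_level: str    # one of RISK_LEVELS


def make_alert(location: str, risk_level: str,
               occurred_at: Optional[datetime] = None) -> Alert:
    """Create an alert, defaulting the time to the current UTC time."""
    if risk_level not in RISK_LEVELS:
        raise ValueError(f"unknown risk level: {risk_level}")
    when = occurred_at or datetime.now(timezone.utc)
    return Alert(when.isoformat(), location, risk_level)


def reassess(alert: Alert, robot_confirms: bool) -> Alert:
    """Stand-in for Step 6: move the level one step up or down
    depending on whether the robot's new data confirms the anomaly."""
    index = RISK_LEVELS.index(alert.risk_level)
    if robot_confirms:
        index = min(index + 1, len(RISK_LEVELS) - 1)
    else:
        index = max(index - 1, 0)
    return Alert(alert.occurred_at, alert.location, RISK_LEVELS[index])


def to_message(alert: Alert) -> str:
    """Serialize the alert for delivery to users or security personnel."""
    return json.dumps(asdict(alert), sort_keys=True)
```

Making the alert immutable and returning a new one on reassessment keeps the original record intact for the operation log described in Step 7.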

[0714] (Example 1)

[0715] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0716] In recent years, security problems such as crime and intrusions have been increasing, particularly in urban areas. Conventional security systems can be slow to detect and respond to anomalies, highlighting the need for both immediacy and accuracy. The present invention aims to provide a security system that enables advanced anomaly detection and rapid response.

[0717] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0718] In this invention, the server includes means for collecting surrounding information using a remote information acquisition device, means for pre-processing the information, transmitting it to a generative artificial intelligence for analysis, and detecting anomalies, and means for evaluating the degree of risk based on the anomaly detection and creating a notification. This enables rapid and effective anomaly detection and response.

[0719] A "remote information acquisition device" is a device that senses surrounding information such as sound, video, and motion, and provides the data to a server.

[0720] "Preprocessing" is the process of removing unwanted noise and interference from collected information, preparing it for analysis.

[0721] "Generative artificial intelligence" is a type of artificial intelligence that uses machine learning algorithms to analyze input data and perform anomaly detection and pattern recognition.

[0722] "Assessing the degree of risk" means determining the risk level of an event based on the circumstances of its occurrence and its likelihood.

[0723] An "autonomous device" is a robot that automatically moves along a designated path based on programmed commands while monitoring its surroundings.

[0724] "Operational records" refer to data that records the system's operation history and processing results, and are used for later analysis.

[0725] "Continuous optimization" refers to the activity of progressively improving the system's performance and processing accuracy based on collected operational records.

[0726] To implement this invention, a sensor device, a server, a generative AI model, and an autonomous robot device are used.

[0727] Data collection:

[0728] The server collects data from sensor devices installed in houses and apartments. These include microphones to capture audio data, surveillance cameras to capture images, and infrared sensors to detect motion. These devices sense information about the surroundings and transmit the data to the server.

[0729] Data preprocessing and analysis:

[0730] The server performs preprocessing on the collected data, such as noise reduction and data normalization, and then sends it to the generative AI model. The generative AI model analyzes the collected data using machine learning algorithms and detects anomalies. This analysis identifies events that deviate from normal patterns.

[0731] Risk assessment and notification:

[0732] The server assesses the risk based on the results of anomaly detection. If the risk level is determined to be high, the server immediately generates a notification and issues an alert to the user or security company.

[0733] Robot patrol:

[0734] When an anomaly is detected, the server instructs the autonomous robot to patrol. The robot automatically moves within the designated area and acquires more detailed data. The acquired data is then sent back to the server for further analysis.

[0735] Continuous optimization:

[0736] The server records all operations that occur within the system and uses this record to optimize the entire system. The generative AI model continuously learns from this data to improve performance.

[0737] For example, if a suspicious sound occurs in the garden at night, the sound is collected by a microphone. The server sends the audio data to a generative AI model for analysis of the anomaly. If a potential intruder is detected, the server sends an alert notification to the user and instructs a robot to patrol the area. This process enables a quick and effective response.

[0738] As an example of a prompt, the generative AI model is given instructions such as, "Analyze data on unusual noises detected in the garden late at night and identify suspicious activity." This allows the generative AI model to perform the appropriate analysis.
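A prompt of this kind could be assembled from the details of a sensor event, as in the following sketch. The `SensorEvent` fields and the template mechanism are assumptions for illustration; only the wording of the resulting prompt comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class SensorEvent:
    signal: str        # what was detected, e.g. "unusual noises"
    location: str      # where it was detected, e.g. "garden"
    time_of_day: str   # when it was detected, e.g. "late at night"


# Template modeled on the example prompt in the text.
PROMPT_TEMPLATE = (
    "Analyze data on {signal} detected in the {location} {time_of_day} "
    "and identify suspicious activity."
)


def build_prompt(event: SensorEvent) -> str:
    """Fill the template with the details of one sensor event."""
    return PROMPT_TEMPLATE.format(
        signal=event.signal,
        location=event.location,
        time_of_day=event.time_of_day,
    )
```

Calling `build_prompt(SensorEvent("unusual noises", "garden", "late at night"))` reproduces the example prompt, and other events yield analogous instructions.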

[0739] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0740] Step 1:

[0741] Data collection

[0742] The server continuously collects ambient audio and image data from sensor devices installed in houses and apartments. Examples of sensor devices include microphones for audio acquisition and surveillance cameras for image acquisition. The input data consists of audio and image data, including noise. This data is transmitted to the server via the network.

[0743] Step 2:

[0744] Data preprocessing

[0745] The server performs noise reduction and data normalization on the collected audio and image data. For audio data, background noise is reduced and important audio signals are extracted. For image data, images are filtered to prevent unwanted light reflections and false detections. As a result of this preprocessing, clear audio and image data suitable for analysis is obtained.

[0746] Step 3:

[0747] Data analysis and anomaly detection

[0748] The server sends the pre-processed data, together with a prompt message, to a generative AI model to detect abnormal patterns. The generative AI model uses machine learning algorithms to identify deviations from normal patterns. The input is filtered, clear audio and image data, and the output is information about the presence or absence of anomalies and their patterns.

[0749] Step 4:

[0750] Risk assessment and notification generation

[0751] The server performs a risk assessment based on anomaly detection information from the generative AI model. Depending on the level of risk, the server creates and sends alert notifications to users and security companies in real time. The input is anomaly detection information, and the output is the content and priority of the notification.

[0752] Step 5:

[0753] Instructions for robot patrol

[0754] When the server detects an anomaly, it instructs the autonomous robot device to patrol the site. The robot automatically moves along the designated path and collects additional sensor data. This data is sent back to the server and re-analyzed by the generative AI model as needed. The inputs are the patrol instructions and sensor data, and the output is additional monitoring data.

[0755] Step 6:

[0756] Continuous optimization of the system

[0757] The server continuously records system operation logs and uses them to train the generative AI model. This improves anomaly detection capabilities in subsequent runs and optimizes the entire system. The input is the system's operation history, and the output is the performance of the improved generative AI model.
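The audio preprocessing of Step 2 (noise reduction followed by normalization) can be illustrated with a minimal sketch. A moving average and peak normalization are used here as simple stand-ins; the source does not state which noise reduction or normalization methods are actually applied, and the window size is an assumption.

```python
from typing import List, Sequence


def moving_average(samples: Sequence[float], window: int = 3) -> List[float]:
    """Smooth samples with a centered moving average (crude noise reduction).

    Near the edges the window shrinks to the samples that exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


def peak_normalize(samples: Sequence[float]) -> List[float]:
    """Scale samples so that the largest absolute value becomes 1.0."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0:
        return [0.0 for _ in samples]
    return [x / peak for x in samples]


def preprocess_audio(samples: Sequence[float], window: int = 3) -> List[float]:
    """Apply noise reduction, then normalization, as in Step 2."""
    return peak_normalize(moving_average(samples, window))
```

Normalizing after smoothing means that recordings from microphones with different gain settings reach the analysis stage on a comparable scale.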

[0758] (Application Example 1)

[0759] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0760] Existing security systems face challenges in detecting anomalies early and responding quickly. In particular, in residential and apartment buildings, flexible responses tailored to the specific conditions and environments of individual locations are required, but conventional sensors and monitoring technologies have limitations. Furthermore, there are insufficient means for users to check the status of their homes in real time and take appropriate action while away.

[0761] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0762] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information using an automated learning model to detect anomalies, and means for enabling real-time confirmation of anomalies from a mobile terminal. This makes it possible for users to quickly grasp anomalies no matter where they are and take necessary actions in real time.

[0763] "Various detection devices" refers to a diverse group of sensors installed to acquire external environmental information such as sound, video, temperature, and motion.

[0764] "Information" refers to all data collected by various detection devices, including audio data, video data, and other forms of environmental data.

[0765] An "automated learning model" is a machine learning system that includes algorithms for analyzing collected information and detecting anomalies.

[0766] An "abnormality" refers to a unique pattern or condition that deviates from the normal state, and includes situations that are predicted to pose a safety risk.

[0767] A "mobile device" is an electronic device that a user can carry with them, such as a smartphone or smart glasses.

[0768] "Means that enable the confirmation of anomalies in real time" refers to technology that allows users to instantly recognize situations occurring at their home or facility visually or audibly through their mobile devices.

[0769] The system for carrying out this invention consists of various detection devices, an automated learning model, a mobile terminal, and a server.

[0770] The server continuously collects information about the surroundings from various detection devices installed in homes and facilities. These include voice sensors, cameras, and temperature sensors, and the data obtained from these sensors is first aggregated on the server.

[0771] The server inputs the collected information into an automated learning model to detect anomalies. The automated learning model is built using software such as Python and TensorFlow, and it analyzes the information to identify patterns that deviate from the normal state. During this process, noise reduction and data preprocessing are performed using tools such as OpenCV.

[0772] When an anomaly is detected, the server sends a real-time notification to the mobile device, allowing the user to view live video from the scene via their smartphone or smart glasses. This enables the user to quickly grasp the details of the anomaly and immediately consider countermeasures.

[0773] For example, if suspicious activity is detected in a residential yard late at night, the server will determine this to be an anomaly and send a notification to the user's mobile device. The user can then view the real-time video through smart glasses to check for any suspicious individuals.

[0774] An example of a prompt message is, "An unusual sound has been detected in the garden of a house. Analyze the camera footage and identify the anomaly." This instruction is input into the automated learning model, supporting appropriate actions. This system allows users to live with peace of mind.

[0775] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0776] Step 1:

[0777] The server continuously collects information about the surroundings from various detection devices. This includes data from sound sensors, cameras, and other sources. The collected data is raw and requires noise reduction and formatting.

[0778] Step 2:

[0779] The server performs noise reduction and other necessary preprocessing on the collected information. This involves using image processing libraries such as OpenCV to reduce noise in video data and normalizing the sampling of audio data. This process improves data accuracy and facilitates analysis in the next step.

[0780] Step 3:

[0781] The server inputs the pre-processed information into an automated learning model and performs anomaly detection. The automated learning model is built using TensorFlow and identifies situations that deviate from normal by analyzing information patterns. Here, it is instructed to generate an alarm based on conditions that are judged to be anomalies.

[0782] Step 4:

[0783] If the server detects an anomaly, it immediately sends an alert to the mobile device. Upon receiving this alert, the user's device can request and view a real-time video feed from the site. The video data is sent to the device using streaming technology.

[0784] Step 5:

[0785] Users check the situation on-site via their mobile devices or smart glasses and carefully examine the content of alerts. Based on the confirmed information, they decide on appropriate actions. For example, after confirming an anomaly, they might contact the security company directly.

[0786] Step 6:

[0787] The server collects operation logs for each processing step and analyzes them to improve the entire system. The system's operational data is used to improve the accuracy of the automated learning model, enabling continuous optimization.
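The operation log collection and analysis of Step 6 can be sketched as follows. The log fields, the outcome label "false_alarm", and the chosen summary statistics are assumptions for illustration; the source says only that logs for each processing step are collected and analyzed.

```python
import json
from collections import defaultdict
from typing import Dict, List


def record(log: List[str], step: str, duration_ms: float, outcome: str) -> None:
    """Append one operation record to the log as a JSON line."""
    entry = {"step": step, "duration_ms": duration_ms, "outcome": outcome}
    log.append(json.dumps(entry))


def summarize(log: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute per-step count, mean duration, and false-alarm rate."""
    totals = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "false_alarms": 0})
    for line in log:
        entry = json.loads(line)
        stats = totals[entry["step"]]
        stats["count"] += 1
        stats["total_ms"] += entry["duration_ms"]
        if entry["outcome"] == "false_alarm":
            stats["false_alarms"] += 1
    return {
        step: {
            "count": stats["count"],
            "mean_ms": stats["total_ms"] / stats["count"],
            "false_alarm_rate": stats["false_alarms"] / stats["count"],
        }
        for step, stats in totals.items()
    }
```

A summary of this kind could indicate, for instance, which step is slow or how often detections turn out to be false alarms, which is the sort of information that could feed model improvement.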

[0788] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0789] The system according to the present invention combines various sensor devices, a generative AI, and an emotion engine to significantly enhance the security of houses and apartments and enable interaction with users. The embodiments thereof are described in detail below.

[0790] Data collection and analysis

[0791] The server continuously collects audio and image data from sensor devices installed in houses and apartments. This includes cameras, microphones, and temperature sensors. The collected data is preprocessed in real time and sent to a generative AI model. The generative AI model analyzes this data and performs speech and image recognition to determine if there are any unusual sounds or movements.

[0792] Anomaly detection and risk assessment

[0793] The server assesses the risk of the current situation based on the anomalies detected by the generative AI model. An alert is triggered according to the assessed risk, the type and frequency of the anomaly, and the security policy. Users and security personnel are notified of the alert immediately.

[0794] Robot control

[0795] If an anomaly is detected, the server issues a command to the robot to patrol the site. The robot autonomously moves along the designated route, continuously collecting data from its surroundings and providing real-time feedback to the server. This allows for seamless on-site verification and response to anomalies.

[0796] Emotion-based interaction

[0797] When a user interacts with the system, it utilizes an emotion engine. This engine analyzes the user's emotions from their tone of voice and facial expressions, and adjusts the system's response accordingly. For example, if a user is feeling anxious, the system can raise its alert level, providing more detailed monitoring and faster informational feedback.

[0798] Specific example: If a user notices a suspicious noise at night, the system immediately sends the audio data to the generative AI to detect the anomaly. If the user tells the system via intercom that they are "anxious," the emotion engine analyzes this input, raises the alert mode, and prompts the robots to patrol without delay. It also provides detailed alert information to offer reassurance. In this way, the system enables advanced interaction that responds to user needs in real time.

[0799] This entire system analyzes operation logs in detail and uses the data for subsequent system evaluation and optimization. This enables the provision of continuous and effective security.

[0800] The following describes the processing flow.

[0801] Step 1:

[0802] The server collects audio and image data in real time from sensor devices placed in the surrounding area, including surveillance cameras and microphones. The collected data undergoes preprocessing such as noise reduction so that changes in the environment are captured accurately.

[0803] Step 2:

[0804] Upon receiving pre-processed data, the server sends it to a generative AI model, which identifies unusual activity and sounds through speech and image recognition. The generative AI model can continuously process data to improve its accuracy by repeatedly learning from past data.

[0805] Step 3:

[0806] In response to detected anomalies, the server evaluates the alert level based on the risk assessment results calculated by the generative AI model. If the risk is determined to be high, the server immediately generates an alert and notifies users and security personnel of the relevant information.

[0807] Step 4:

[0808] When a user receives an alert and accesses the system to check the situation, the system uses its emotion engine to analyze the user's emotional state. It determines emotions from voice tone and facial expression data, and automatically raises the system's alert level if the user is feeling anxious.

[0809] Step 5:

[0810] The server instructs robots to patrol areas where anomalies have been detected. The robots move along pre-set paths, scanning their surroundings with built-in sensors and sending the latest status data to the server for further anomaly detection.

[0811] Step 6:

[0812] The server analyzes the data from the robots and reconfirms whether there are any anomalies. It issues additional alerts as needed to strengthen security.
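
Steps 5 and 6 can be sketched as a loop over a pre-set route followed by a server-side recheck. This is an illustrative simulation only: `patrol`, the waypoint coordinates, the simulated sound levels, and the 60 dB threshold are all assumptions, and a real robot would supply sensor readings through its own control interface.

```python
from typing import Callable, Dict, List, Tuple

Waypoint = Tuple[float, float]


def patrol(
    route: List[Waypoint],
    read_sensor: Callable[[Waypoint], float],
    threshold: float,
) -> List[Dict[str, object]]:
    """Visit each waypoint in order and report readings at or above threshold."""
    alerts: List[Dict[str, object]] = []
    for waypoint in route:
        reading = read_sensor(waypoint)  # stand-in for the robot's built-in sensors
        if reading >= threshold:         # server-side reconfirmation of the anomaly
            alerts.append({"waypoint": waypoint, "reading": reading})
    return alerts


# Simulated sound levels (dB) at three points on a pre-set route.
simulated_levels = {(0.0, 0.0): 35.0, (5.0, 0.0): 72.0, (5.0, 5.0): 40.0}
route = list(simulated_levels)
additional_alerts = patrol(route, lambda wp: simulated_levels[wp], threshold=60.0)
```

Passing the sensor as a callable keeps the route logic testable without robot hardware.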

[0813] Step 7:

[0814] The server records the operation logs collected at each processing step in a database. These logs are used to optimize the system and improve the learning accuracy of the generative AI model, resulting in continuous improvement of the entire system.
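
Recording operation logs per processing step might look like the following sketch, which uses Python's built-in `sqlite3` module. The table name `operation_log`, its columns, and the helper functions are hypothetical; the source does not name a database or schema.

```python
import sqlite3
import time
from typing import Dict, Optional


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS operation_log ("
        "id INTEGER PRIMARY KEY, ts REAL, step TEXT, detail TEXT)"
    )
    return conn


def record(conn: sqlite3.Connection, step: str, detail: str,
           ts: Optional[float] = None) -> None:
    """Append one log entry for a processing step."""
    conn.execute(
        "INSERT INTO operation_log (ts, step, detail) VALUES (?, ?, ?)",
        (time.time() if ts is None else ts, step, detail),
    )
    conn.commit()


def counts_by_step(conn: sqlite3.Connection) -> Dict[str, int]:
    """Summarize how many entries each step produced (input for later tuning)."""
    rows = conn.execute(
        "SELECT step, COUNT(*) FROM operation_log GROUP BY step"
    ).fetchall()
    return dict(rows)


conn = open_log()
record(conn, "collect", "camera frame received")
record(conn, "detect", "anomaly flagged")
record(conn, "detect", "no anomaly")
```

A per-step summary such as `counts_by_step` is one simple way the logs could feed back into system evaluation.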

[0815] Through this series of processes, the system efficiently ensures user safety and, by utilizing the emotion engine, enables fast and flexible interaction.

[0816] (Example 2)

[0817] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0818] In modern living spaces, there is a growing demand for both enhanced safety and improved convenience. In particular, there is a growing need for security systems that can quickly and accurately detect anomalies within homes and buildings and respond appropriately. However, existing systems have limitations in both anomaly detection and user response, making it difficult to provide users with a greater sense of security. Therefore, the present invention aims to provide a system that enhances anomaly detection capabilities while enabling flexible responses tailored to the user's emotions.

[0819] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0820] In this invention, the server includes means for collecting environmental data using various detection devices, means for formatting the data, transmitting it to an intelligent model for analysis and detection of anomalies, and means for analyzing the user's emotional state through audio and video and adjusting the system response. This makes it possible to improve the accuracy of anomaly detection and provide appropriate responses based on the user's emotions.

[0821] A "detection device" refers to various sensors used to collect data such as sound, images, and temperature from the environment.

[0822] An "intelligent model" refers to artificial intelligence technology that uses collected data to detect and analyze anomalies.

[0823] "Mobile units" refer to automated machines such as robots that autonomously patrol designated routes and monitor changes in the environment.

[0824] "Emotional state" refers to the psychological state of a user, as analyzed from their voice tone, facial expressions, and other factors.

[0825] "Operation records" refer to the overall operational history of the system, and serve as a source of information for optimizing the system based on this data.

[0826] "Risk assessment" is the process of analyzing the risks posed by detected anomalies and using that information to generate appropriate alarms.

[0827] An "alarm" is a warning and response instruction issued by the system in response to detected anomalies or dangers.

[0828] This invention is a system that enhances the security of homes and buildings and enables advanced user interaction. The system consists of several important elements, and its specific embodiments will be described below.

[0829] Hardware and software usage

[0830] The server is responsible for collecting data from various detection devices installed in houses and apartments. Specifically, these include cameras, microphones, and temperature sensors. These devices acquire image data, audio data, and ambient temperature data, and this information is used for anomaly detection.

[0831] The server preprocesses the collected data and sends it to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to analyze and detect abnormal sounds, movements, and temperature changes. The software employs advanced AI analysis algorithms and noise filtering tools.

[0832] Interaction and emotion analysis

[0833] When users interact directly with the system, they can communicate their state to the system through an emotion engine. The emotion engine analyzes the user's voice tone and facial expressions, and the system optimizes its response based on this analysis. This feature allows the system to respond flexibly according to the user's emotional state.

[0834] Specific example:

[0835] If a user reports hearing suspicious noises at night and expresses concern, the server sends the audio data to the generative AI to detect any abnormalities. If the audio analysis determines that the user is feeling anxious, the system raises its alert level, instructs robots to patrol, and checks the situation. The system also sends detailed monitoring information to the user via push notifications and email to provide reassurance.

[0836] Example of a prompt

[0837] "Please describe a system that detects abnormal sounds at night, analyzes the user's emotions, and then takes appropriate action."

[0838] Thus, by combining precise sensor technology, a generative AI model, and an emotion recognition engine, the system of the present invention can solve various challenges in modern residential security and provide users with peace of mind and safety.

[0839] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0840] Step 1:

[0841] The server collects environmental data from various detection devices. This includes image data from cameras, audio data from microphones, and temperature data from temperature sensors. The input data is sent to the server in its raw state and in various formats. Specifically, the camera periodically captures images of the gate and the interior of the room, the microphone records background sounds, and the temperature sensor measures the room temperature. The output indicates the completion of data collection.

[0842] Step 2:

[0843] The server preprocesses the collected raw data. Data processing is performed to remove noise, standardize data formats, and highlight important features. Specifically, background noise is removed from audio data, and the resolution of image data is adjusted. The input is the raw data collected in step 1, and the output is clear data ready for analysis.

[0844] Step 3:

[0845] The server sends the preprocessed data to the generative AI model. The generative AI model uses speech recognition and image recognition technologies to execute an anomaly detection algorithm. Specifically, the AI model analyzes the data and identifies abnormal movements and sounds. The input is the preprocessed data, and the output is the result of determining whether or not an anomaly exists.

[0846] Step 4:

[0847] The server assesses the risk based on the anomalies detected by the AI model. It analyzes the type and frequency of anomalies and generates alarms based on security policies. Specifically, data identified as anomalous is recorded, and users are notified based on this information. The input is the anomaly detection result from the AI model, and the output is alarm information.
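
One way to combine anomaly type, frequency, and a security policy is a sliding time window per anomaly type, sketched below. The policy table, the anomaly type names, and the `AlarmGenerator` class are hypothetical; the source does not define concrete policies or thresholds.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple

# Hypothetical security policy: anomaly type -> (count needed, window in seconds).
POLICY: Dict[str, Tuple[int, float]] = {
    "glass_break": (1, 60.0),
    "unusual_sound": (3, 300.0),
}


class AlarmGenerator:
    """Records anomalies and raises an alarm when a policy threshold is met."""

    def __init__(self, policy: Dict[str, Tuple[int, float]]) -> None:
        self.policy = policy
        self.history: Dict[str, Deque[float]] = {}

    def observe(self, anomaly_type: str, timestamp: float) -> Optional[str]:
        if anomaly_type not in self.policy:
            return None  # types outside the policy never trigger an alarm
        needed, window = self.policy[anomaly_type]
        events = self.history.setdefault(anomaly_type, deque())
        events.append(timestamp)
        # Drop events that have fallen out of the time window.
        while events and timestamp - events[0] > window:
            events.popleft()
        if len(events) >= needed:
            return f"ALARM: {anomaly_type} x{len(events)} within {window:.0f}s"
        return None
```

With this policy, a single glass-break event raises an alarm at once, while unusual sounds raise one only after three occurrences within five minutes.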

[0848] Step 5:

[0849] If an anomaly is detected, the server issues a patrol command to the mobile robot. The robot patrols the site according to the designated route and collects additional data. Specifically, the robot automatically moves through corridors and entrances, and feeds back newly recorded video and audio to the server in real time. The input is the patrol command from the server, and the output is the results of the site inspection.

[0850] Step 6:

[0851] When a user interacts with the system, the server uses an emotion engine to analyze this information. The emotion engine determines the user's emotional state based on their tone of voice and facial expressions, and optimizes the system's response. For example, if a user says "I'm anxious," the system analyzes this and raises the alert level. The input is the emotional feedback from the user, and the output is the adjusted system response.

[0852] (Application Example 2)

[0853] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0854] Improving the safety of homes and buildings requires rapid and accurate detection of anomalies, as well as appropriate responses that respond to user emotions. However, conventional systems suffer from insufficient anomaly detection accuracy and poor user interaction quality, resulting in a lack of safety and convenience. The challenge is to solve this problem and provide a higher level of security and user experience.

[0855] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0856] In this invention, the server includes means for collecting surrounding information using various detection devices, means for analyzing the information and detecting anomalies using a generative electronic brain, and means for interacting with the user using emotion recognition and adjusting the response based on the user's emotions. This enables immediate detection of anomalies, risk assessment, and optimal interaction according to the user's emotions.

[0857] A "detection device" is a device that collects information such as sound, images, and temperature from the external environment.

[0858] A "generative electronic brain" is a system that uses advanced artificial intelligence technology to analyze collected information and detect anomalies.

[0859] A "means for assessing risk" refers to a function that determines the risk of the current situation based on the results of anomaly detection and generates appropriate warnings.

[0860] "Machines" refers to devices such as robots that move autonomously, collect information, and patrol areas with abnormalities.

[0861] "Emotion recognition" is a technology that analyzes a user's emotions from their voice and facial expressions, making it possible to adjust the system's operation based on the user's emotions.

[0862] "Operation records" refer to logs that record the overall operating status and processing details of the system in chronological order, and analyzing these logs provides information that can be used to optimize the system.

[0863] The system for carrying out this invention comprises various detection devices, a generative electronic brain means, and an emotion recognition function, and is designed to enhance the safety of homes and buildings. Specific embodiments are shown below.

[0864] First, the server continuously collects information about the surroundings through various detection devices such as cameras, microphones, and temperature sensors. This data is processed in real time and sent to a cloud-based server. Services such as AWS Lambda and Google Cloud Functions are used for this process.

[0865] The server analyzes the collected data using the generative electronic brain to detect anomalies. The generative electronic brain uses a generative AI model and can identify anomalies by performing voice recognition and image recognition. Furthermore, it uses emotion recognition to analyze voice and text from the user to understand the user's emotional state.

[0866] If an anomaly is detected, the server assesses the risk and issues a command to the machine to patrol the area. The machine autonomously moves within the designated area, collecting and feeding back data in real time, and seamlessly confirming and responding to the anomaly.

[0867] When users interact with the system, it uses an emotion engine to perform emotion recognition. The emotion engine analyzes the user's emotions from their voice tone and facial expressions, and adjusts the system's response accordingly. For example, if a user feels anxious, the system can raise its alert level and provide detailed monitoring and rapid informational feedback.

[0868] For example, if a user reports a "suspicious noise" to the system, the emotion engine analyzes it, immediately intensifies the alert mode, and the machine patrols the area quickly. The user is also notified of detailed alert information, providing a sense of security.

[0869] Examples of prompt messages include the following:

[0870] "A suspicious noise has been detected in the user's home. Please demonstrate the process of using an emotion engine to alleviate the user's anxiety."

[0871] "Please explain in detail how you plan to enhance user confidence after detecting an anomaly."

[0872] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0873] Step 1:

[0874] The server collects audio data, image data, and temperature information from various detection devices. This input data is preprocessed, for example by normalization and filtering, to improve data quality and is then prepared as input for the generative AI model.

[0875] Step 2:

[0876] The server inputs pre-processed data into a generative AI model for speech and image recognition. The generative AI model performs pattern matching and feature extraction to detect anomalies and determines whether an anomaly exists. The output consists of a flag indicating the presence or absence of an anomaly and metadata about the characteristics of that anomaly.
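
The output described in this step, an anomaly flag plus metadata, can be sketched as a small data structure. Note the substitution: a z-score test against a baseline stands in for the generative AI model's pattern matching and feature extraction, purely to make the output shape concrete. `DetectionResult`, `detect`, and the threshold of 3.0 are assumptions.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Dict, List


@dataclass
class DetectionResult:
    is_anomaly: bool                                          # presence or absence of an anomaly
    metadata: Dict[str, float] = field(default_factory=dict)  # characteristics of the anomaly


def detect(baseline: List[float], value: float, z_threshold: float = 3.0) -> DetectionResult:
    """Flag a value that deviates strongly from the baseline readings."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A constant baseline: any different value counts as an extreme deviation.
        z = 0.0 if value == mu else float("inf")
    else:
        z = (value - mu) / sigma
    return DetectionResult(
        is_anomaly=abs(z) > z_threshold,
        metadata={"value": value, "baseline_mean": mu, "z_score": z},
    )


baseline_db = [40.0, 42.0, 38.0, 40.0]  # hypothetical normal sound levels
result = detect(baseline_db, 80.0)
```

Returning the metadata alongside the flag lets the later risk assessment step use the details of the deviation rather than only a yes or no.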

[0877] Step 3:

[0878] The server assesses the risk based on the results of anomaly detection. It receives anomaly flags and metadata as input and applies a rule-based algorithm to evaluate the risk level. The output is a numerical score indicating the risk level, which is used to determine the next step.
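
A rule-based evaluation of this kind could be sketched as follows. The rule table, the metadata keys (`type`, `at_night`, `repeat_count`), the point values, and the patrol threshold are all hypothetical choices made for illustration; the source only states that a rule-based algorithm yields a numerical score used to determine the next step.

```python
from typing import Dict

# Hypothetical rule table: base points per anomaly type.
TYPE_POINTS: Dict[str, int] = {"unusual_sound": 20, "motion": 30, "temperature": 40}


def risk_score(is_anomaly: bool, metadata: Dict[str, object]) -> int:
    """Return a risk score from 0 to 100 based on simple additive rules."""
    if not is_anomaly:
        return 0
    score = TYPE_POINTS.get(str(metadata.get("type")), 10)  # unknown types get 10
    if metadata.get("at_night"):
        score += 30  # rule: anomalies at night are treated as riskier
    if int(metadata.get("repeat_count", 0)) >= 3:
        score += 30  # rule: repeated anomalies are treated as riskier
    return min(score, 100)


def next_step(score: int, patrol_threshold: int = 50) -> str:
    """Decide the follow-up action from the numerical score."""
    return "dispatch_patrol" if score >= patrol_threshold else "log_only"
```

For example, under these assumed rules a motion anomaly at night scores 60 and leads to a patrol, while a single unusual sound in the daytime scores 20 and is only logged.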

[0879] Step 4:

[0880] If an anomaly or high risk is detected, the server issues a patrol command to the machine. Using patrol route information as input, the server instructs the machine to move, collects surrounding data in real time, and feeds it back to the server. This process complements on-site anomaly verification.

[0881] Step 5:

[0882] The server receives input from the user and performs emotion recognition. It takes user voice and text data as input and analyzes it using an emotion engine. This analysis uses natural language processing algorithms to identify the user's emotions. The output is information about the type and intensity of the emotion.
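
To make the input and output of this step concrete, the sketch below maps user text to an emotion type and intensity. It is a deliberately simple keyword lookup standing in for the natural language processing the emotion engine would perform; the lexicon, its weights, and `recognize_emotion` are hypothetical.

```python
import re
from typing import Dict, Tuple

# Hypothetical lexicon: word -> (emotion type, weight).
LEXICON: Dict[str, Tuple[str, float]] = {
    "anxious": ("anxious", 0.8),
    "scared": ("anxious", 1.0),
    "worried": ("anxious", 0.6),
    "fine": ("calm", 0.5),
    "relieved": ("calm", 0.7),
}


def recognize_emotion(text: str) -> Tuple[str, float]:
    """Return (emotion type, intensity in 0..1) from keyword matches."""
    strongest: Dict[str, float] = {}
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in LEXICON:
            kind, weight = LEXICON[word]
            # Keep the strongest cue seen so far for each emotion type.
            strongest[kind] = max(strongest.get(kind, 0.0), weight)
    if not strongest:
        return ("neutral", 0.0)
    kind = max(strongest, key=lambda k: strongest[k])
    return (kind, strongest[kind])
```

A production emotion engine would also use voice tone, as the text describes; this sketch covers only the text input path.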

[0883] Step 6:

[0884] Based on the results of the emotion recognition, the server adjusts the system's response. For example, if the user is feeling anxious, it raises the alert level and adjusts the system to collect more data to reassure the user. This process aims to enhance informational feedback to the user.
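
The adjustment in this step can be sketched as a function from the current response settings and a recognized emotion to new settings. The three-level alert scale, the sampling interval field, and the 0.5 intensity cutoff are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Emotion:
    kind: str         # e.g. "anxious" or "calm"
    intensity: float  # assumed range 0.0 to 1.0


@dataclass
class SystemResponse:
    alert_level: int          # assumed scale: 1 (normal) to 3 (highest)
    sampling_interval_s: int  # seconds between sensor collections


def adjust_response(current: SystemResponse, emotion: Emotion) -> SystemResponse:
    """Raise the alert level and collect data more often when the user is anxious."""
    if emotion.kind == "anxious" and emotion.intensity >= 0.5:
        return SystemResponse(
            alert_level=min(current.alert_level + 1, 3),
            sampling_interval_s=max(current.sampling_interval_s // 2, 1),
        )
    return current  # other emotions leave the response unchanged
```

Halving the sampling interval is one possible reading of "collect more data"; the source does not say how the increase is achieved.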

[0885] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input in response to the result of the specific processing. The control unit 46A transmits the audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0886] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AIs such as ChatGPT (Internet search, URL: https://openai.com/blog/chatgpt) and Gemini (Internet search, URL: https://gemini.google.com/?hl=ja). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0887] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0888] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0889] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0890] These emotions are distributed at the 3 o'clock position on the emotion map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0891] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion is located, the more visible it becomes (that is, the more it is expressed in actions).
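
The layout of the emotion map 400 described above can be illustrated by classifying a point on a circular map. This is only a geometric sketch: the coordinate convention (center at the origin, +x to the right, +y upward, unit outer radius) and the 0.5 radius cutoff for "expressed in action" are assumptions, not values from the source.

```python
import math
from typing import Dict


def _origin(x: float) -> str:
    """Left half: reactions in the brain. Right half: situational judgment."""
    if x > 0:
        return "situational judgment"
    if x < 0:
        return "reaction in the brain"
    return "both"  # directly above or below the center


def _valence(y: float) -> str:
    """Upper side: pleasure. Lower side: displeasure."""
    if y > 0:
        return "pleasure"
    if y < 0:
        return "displeasure"
    return "neutral"


def describe_position(x: float, y: float) -> Dict[str, object]:
    """Describe where a point sits on the assumed unit-radius emotion map."""
    radius = math.hypot(x, y)
    return {
        "origin": _origin(x),
        "valence": _valence(y),
        "radius": radius,
        # Assumed cutoff: the outer half of the map counts as visible action.
        "expressed_in_action": radius > 0.5,
    }
```

A point near the center on the lower left, for instance, would be read as a primitive, reaction-driven feeling of displeasure that is not yet expressed in action.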

[0892] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. Similarly, for robots, cars, motorcycles, and the like, emotions can be created based on various balances, such as posture and battery level, with deviation from the ideal producing discomfort and approach to the ideal producing pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "reaction," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0893] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0894] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
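
Once the neural network has produced an emotion value for each emotion, a final emotion has to be chosen. The sketch below assumes the simplest rule, picking the emotion with the highest value; the source does not state which rule the emotion identification model 59 uses, and the values shown are made up for illustration rather than model output.

```python
from typing import Dict, List, Tuple


def determine_emotion(emotion_values: Dict[str, float]) -> str:
    """Pick the emotion with the highest emotion value (assumed decision rule)."""
    if not emotion_values:
        raise ValueError("emotion_values must not be empty")
    return max(emotion_values, key=lambda name: emotion_values[name])


def top_emotions(emotion_values: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-valued emotions, strongest first."""
    ranked = sorted(emotion_values.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical emotion values: neighbors on the map have similar values.
values = {"reassured": 0.81, "calm": 0.78, "confident": 0.76, "anxious": 0.05}
```

Because neighboring emotions are trained to have similar values, inspecting the top few emotions rather than only the maximum may give a more stable picture of the user's state.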

[0895] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0896] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0897] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.

[0898] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0899] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0900] The following types of processors can be used as hardware resources to perform the specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource performing the specific processing by executing software, i.e., a program. Other examples include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), and ASICs (Application Specific Integrated Circuits), which are processors with circuit configurations designed specifically to perform particular processing. Each of these processors has built-in or connected memory and performs the specific processing by using that memory.

[0901] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0902] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0903] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0904] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of those aspects. For example, the above descriptions of structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as this does not deviate from the essence of the technical aspects of this disclosure. Furthermore, to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge that require no special explanation to enable implementation have been omitted from the descriptions and illustrations presented above.

[0905] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0906] The following is further disclosed regarding the embodiments described above.

[0907] (Claim 1)

[0908] A means of collecting surrounding data using various sensor devices,

[0909] A generative artificial intelligence means for analyzing the aforementioned data and detecting anomalies,

[0910] A means of evaluating risk and generating alerts based on anomaly detection,

[0911] A means of controlling a robot to patrol the site,

[0912] A means of analyzing and optimizing the operation logs of the entire system,

[0913] A system that includes this.

[0914] (Claim 2)

[0915] The system according to claim 1, characterized in that the generating artificial intelligence means detects anomalies by utilizing speech recognition and image recognition technology.

[0916] (Claim 3)

[0917] The system according to claim 1, characterized in that it continuously learns and improves the generating artificial intelligence means based on data from the sensor device.

[0918] "Example 1"

[0919] (Claim 1)

[0920] A means for collecting surrounding information using a remote information acquisition device,

[0921] A means for preprocessing the aforementioned information, transmitting it to a generative artificial intelligence for analysis, and detecting anomalies,

[0922] A means of evaluating the degree of risk based on anomaly detection and generating notifications,

[0923] A means of controlling an autonomous device to patrol a designated area,

[0924] A means to analyze the operational records of the entire system and perform continuous optimization,

[0925] A system that includes the above means.

[0926] (Claim 2)

[0927] The system according to claim 1, characterized in that the generative artificial intelligence detects anomalies by utilizing acoustic analysis and visual analysis techniques.

[0928] (Claim 3)

[0929] The system according to claim 1, characterized in that it performs continuous learning and improvement of generative artificial intelligence based on information from the remote information acquisition device.
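
The preprocessing and transmission step in this claim set can be pictured with a small Python sketch. It assumes, purely for illustration, that raw samples are reduced to a few summary features and sent as a JSON request; the function names, the feature set, the request fields, and the `send` callable standing in for the generative artificial intelligence endpoint are all hypothetical.

```python
import json
from typing import Callable, Dict, List


def preprocess(samples: List[float], window: int = 4) -> Dict[str, float]:
    """Reduce the most recent samples to summary features (assumed feature set)."""
    if not samples:
        raise ValueError("no samples to preprocess")
    recent = samples[-window:]
    return {
        "mean": sum(recent) / len(recent),
        "peak": max(abs(s) for s in recent),
        "count": len(recent),
    }


def build_request(device_id: str, features: Dict[str, float]) -> str:
    """Serialize the features into a request body (hypothetical schema)."""
    return json.dumps(
        {"device": device_id, "task": "detect_anomaly", "features": features},
        sort_keys=True,
    )


def analyze(device_id: str, samples: List[float],
            send: Callable[[str], str]) -> bool:
    """Preprocess, transmit via `send`, and read the anomaly flag from the reply."""
    reply = send(build_request(device_id, preprocess(samples)))
    return bool(json.loads(reply).get("anomaly", False))
```

Passing the transport in as `send` keeps the sketch independent of any particular model or network API, which the disclosure does not name.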

[0930] "Application Example 1"

[0931] (Claim 1)

[0932] A means of collecting surrounding information using various detection devices,

[0933] An automated learning model means for analyzing the aforementioned information and detecting anomalies,

[0934] A means for performing an evaluation based on anomaly detection and generating an alarm,

[0935] A means of controlling mobile devices to patrol the site,

[0936] A means of evaluating and optimizing the operation record of the entire system,

[0937] A means to enable real-time confirmation of the aforementioned anomaly from a mobile device,

[0938] A system that includes the above means.

[0939] (Claim 2)

[0940] The system according to claim 1, characterized in that the automated learning model means detects anomalies by utilizing acoustic analysis and video analysis technology.

[0941] (Claim 3)

[0942] The system according to claim 1, characterized in that the automated learning model means is continuously trained and improved based on information from the detection devices.
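
One way to picture the real-time confirmation means in this claim set is an event feed that a mobile client reads with a cursor. The Python sketch below is an assumption-laden illustration only: `AnomalyEvent`, `AnomalyFeed`, the sequence-number cursor, and the in-memory storage are hypothetical, and a real deployment would more likely use push notifications or a network API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AnomalyEvent:
    seq: int        # 1-based sequence number assigned on publish
    location: str
    summary: str


class AnomalyFeed:
    """In-memory feed; a mobile client asks for events newer than its cursor."""

    def __init__(self) -> None:
        self._events: List[AnomalyEvent] = []

    def publish(self, location: str, summary: str) -> AnomalyEvent:
        """Record a detected anomaly and return the stored event."""
        event = AnomalyEvent(len(self._events) + 1, location, summary)
        self._events.append(event)
        return event

    def since(self, cursor: int) -> List[AnomalyEvent]:
        """Events with seq greater than cursor; cursor 0 returns everything."""
        return self._events[max(cursor, 0):]
```

A client would remember the `seq` of the last event it displayed and pass that value to `since` on its next request.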

[0943] "Example 2 combined with an emotion engine"

[0944] (Claim 1)

[0945] A means of collecting environmental data using various detection devices,

[0946] A means for formatting the aforementioned data, transmitting it to an intelligent model for analysis, and detecting anomalies,

[0947] A means for evaluating risk based on anomaly detection and generating an alarm,

[0948] A means of controlling a mobile device to patrol the site and reconfirm the situation,

[0949] A means of analyzing the user's emotional state through audio and video and adjusting the system response,

[0950] A means of continuously improving by analyzing the operation records of the entire system,

[0951] A system that includes the above means.

[0952] (Claim 2)

[0953] The system according to claim 1, characterized in that the intelligent model detects anomalies using voice and video recognition technology.

[0954] (Claim 3)

[0955] The system according to claim 1, characterized in that it intermittently performs learning and improvement of an intelligent model based on data from the detection device.
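
The means of adjusting the system response to the user's emotional state could, under simple assumptions, look like the following Python sketch. The emotion labels, the tone names, the wording of the messages, and the function `adjust_response` are all hypothetical; the disclosure does not specify how emotions are classified or how responses are varied.

```python
from typing import Dict

# Hypothetical mapping from an estimated emotion label to a response tone.
TONE_BY_EMOTION: Dict[str, str] = {
    "anxious": "reassuring",
    "angry": "concise",
    "calm": "neutral",
}


def adjust_response(alert_text: str, emotion: str) -> Dict[str, str]:
    """Choose a tone for the estimated emotion and adapt the alert text."""
    tone = TONE_BY_EMOTION.get(emotion, "neutral")  # unknown labels fall back
    if tone == "reassuring":
        text = f"{alert_text} A patrol has been requested and updates will follow."
    elif tone == "concise":
        text = alert_text
    else:
        text = f"{alert_text} Open the app for details."
    return {"tone": tone, "text": text}
```

The emotion label itself would come from the audio and video analysis described in this claim set; here it is simply passed in as a string.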

[0956] "Application Example 2 combined with an emotion engine"

[0957] (Claim 1)

[0958] A means of collecting surrounding information using various detection devices,

[0959] A means of creating an electronic brain that analyzes the aforementioned information and detects anomalies,

[0960] A means for evaluating risk based on anomaly detection and generating a warning,

[0961] A means of controlling a machine to patrol the site,

[0962] A means of interacting with the user using emotion recognition and adjusting responses based on the user's emotions,

[0963] A means of analyzing and optimizing the operation records of the entire system,

[0964] A system that includes the above means.

[0965] (Claim 2)

[0966] The system according to claim 1, characterized in that the generated electronic brain means detects anomalies by utilizing voice recognition and image recognition technology, and analyzes the user's emotions to optimize the dialogue.

[0967] (Claim 3)

[0968] The system according to claim 1, characterized in that it continuously learns and improves the generated electronic brain means based on information from the detection device, and realizes optimal interaction in accordance with the user's emotions.

[Explanation of symbols]

[0969] 10, 210, 310, 410 Data Processing Systems
12 Data Processing Devices
14 Smart Devices
214 Smart Glasses
314 Headset-type terminal
414 Robots

Claims

1. A system comprising: a means of collecting surrounding information using various detection devices; an automated learning model means for analyzing the aforementioned information and detecting anomalies; a means for performing an evaluation based on anomaly detection and generating an alarm; a means of controlling mobile devices to patrol the site; a means of evaluating and optimizing the operation record of the entire system; and a means to enable real-time confirmation of the aforementioned anomaly from a mobile device.

2. The system according to claim 1, characterized in that the automated learning model means detects anomalies by utilizing acoustic analysis and video analysis technology.

3. The system according to claim 1, characterized in that the automated learning model means is continuously trained and improved based on information from the detection devices.