system

The system addresses the challenge of real-time risk assessment and reporting by using AI and emotion analysis to capture and analyze images for immediate feedback and automated responses, ensuring user safety and public security.

JP2026105320A · Pending · Publication Date: 2026-06-26 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: SOFTBANK GROUP CORP
Filing Date: 2024-12-16
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing systems fail to provide real-time risk assessment and prompt reporting of illegal activities, posing safety risks to citizens and hindering effective compliance with laws and regulations, especially in deteriorating public security environments.

Method used

A system utilizing an image acquisition device with AI analysis to capture and transmit images to a server for rapid risk assessment and automatic reporting, providing real-time feedback and warnings, and integrating an emotion engine for emotional state consideration.

Benefits of technology

Enables safe and effective compliance with laws by promptly identifying and reporting illegal activities, enhancing user safety and public security through real-time feedback and automated responses, while considering emotional states.

✦ Generated by Eureka AI based on patent content.


Abstract

To provide a system. [Solution] A system including: means for capturing video of the surroundings via an image acquisition device; means for transmitting the acquired video data to an analysis server via a communication device; means for analyzing, in the server, actions and objects in the video and comparing them with a preset illegal behavior database; means for performing a risk assessment based on the analysis result and notifying the user terminal of the assessment result; means for automatically reporting to external agencies as needed; means for storing the analysis results in a database and providing information for improving local public security; means for identifying dangers around the user and providing feedback that recommends safe actions; and means for cooperating with external public security maintenance organizations as needed.


Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] In recent years, local public security has deteriorated, yet when citizens use existing smartphones to collect evidence, they risk getting into disputes and endangering their personal safety. In addition, even when illegal or nuisance acts are actually discovered, it is difficult to report them promptly to the appropriate authorities, so a means of responding quickly is needed. There is therefore a need for a system that can effectively promote compliance with laws and regulations by local residents and contribute to the improvement of public security.

Means for Solving the Problems

[0005] This invention enables citizens to safely and effectively capture images of their surroundings using a dedicated device equipped with an image acquisition device, and to quickly transmit these images to an analysis server via a communication device. On the server, AI analyzes actions and objects within the images and automatically compares them against a pre-configured database of illegal activities. The analysis results allow for a rapid risk assessment, and automatic reporting is possible if necessary. Furthermore, the system provides real-time feedback and warnings to the user, and the stored data can be used to improve local security in the future.

[0006] An "image acquisition device" is a device used to visually capture a specified environment or surrounding conditions, and may include cameras and sensors.

[0007] A "communication device" is a device equipped with the function of transferring data to other devices or servers, and it generally uses wireless communication technology.

[0008] An "analysis server" is a computer system that processes received digital data and analyzes the information it provides.

[0009] An "AI analysis engine" is a software component that uses artificial intelligence technology to analyze data and make decisions in line with a specific purpose.

[0010] A "database of illegal activities" is a database that collects information related to acts that violate laws and regulations, and is used as a reference point for analysis.

[0011] "Risk assessment" is the process of measuring and judging the legal or safety risks associated with a particular event based on the results of an analysis.

[0012] "Automatic notification" refers to a process in which, when a system meets certain criteria, it automatically notifies a pre-configured external organization without manual intervention.

[0013] "Feedback" refers to information provided to users, used to communicate analysis results and current risks.

[0014] "Storage" refers to the act of recording data in a way that makes it accessible later, and this is done in storage systems such as databases.

Brief Description of the Drawings

[0015]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] This is a conceptual diagram showing an example of the essential functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] This shows an emotion map where multiple emotions are mapped.
[Figure 10] This shows an emotion map where multiple emotions are mapped.
[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Embodiment 2 when the emotion engine is combined.
[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when the emotion engine is combined.

Modes for Carrying Out the Invention

[0016] Hereinafter, an example of an embodiment of the system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0017] First, the terms used in the following description will be explained.

[0018] In the following embodiments, the numbered processor (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), etc.

[0019] In the following embodiments, the numbered RAM (Random Access Memory) is a memory in which information is temporarily stored and is used as a work memory by the processor.

[0020] In the following embodiments, the numbered storage is one or more non-volatile storage devices that store various programs and various parameters, etc. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), or magnetic tapes, etc.

[0021] In the following embodiments, the numbered communication interface (I/F) is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), or Bluetooth (registered trademark).

[0022] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."

[0023] [First Embodiment]

[0024] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0025] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0026] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0027] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0028] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0029] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0030] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0031] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0032] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0033] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0034] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0035] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0036] This invention provides a system that combines smart devices and AI analysis technology to enable users to safely promote legal compliance in their daily lives. Specific embodiments are described below.

[0037] First, the user attaches a dedicated image acquisition device. This device is equipped with high-performance cameras and sensors that can capture the surrounding environment in high resolution in real time.

[0038] The captured video is transmitted directly to the communication device via the terminal. This communication uses wireless technologies such as Bluetooth, ensuring fast and secure data transmission.

[0039] The transmitted data is accessed by a server in the cloud. The server integrates an AI analysis engine and begins processing the received video data immediately. Specifically, it detects the movement of objects and people in the video and compares it against a pre-configured database of illegal activities.

[0040] If the server performs an analysis and detects illegal activity or related risks, a risk assessment process is initiated. Based on the results of this assessment, feedback is immediately provided to the user. This feedback includes details of the relevant risk and recommended actions, and is communicated to the user through the application on their device.

[0041] Furthermore, depending on the risk assessment results, the server can activate an automatic notification function. This function transmits necessary information to external relevant organizations according to pre-configured criteria. The notification may include location information and video footage from the scene, enabling a rapid response.

[0042] Finally, the server saves the analysis results to a database. This saved data can be used for future analysis of local security and understanding trends, and can serve as basic data for policy decisions and security improvement measures.

[0043] As a concrete example, suppose a user is walking through the city at night and spots a suspicious group near a commercial facility. In this case, the image acquisition device automatically captures video footage, and the server analyzes their actions. Based on the analysis results, potential risks are fed back, allowing the user to choose safer actions. Furthermore, if a significant risk is detected, the server immediately notifies the police, contributing to local safety by promoting preventative measures.

[0044] The following describes the processing flow.

[0045] Step 1:

[0046] The user attaches the image acquisition device and starts capturing video. The device's sensors capture the surrounding environment and continuously generate video data.

[0047] Step 2:

[0048] The terminal (image acquisition device) transmits the captured video data to the smartphone using its built-in communication device. This communication is usually performed via Bluetooth connection.

[0049] Step 3:

[0050] The device (smartphone) uploads the video data received from the image acquisition device to an analysis server. The upload is performed over an internet connection, and the data is protected by a secure protocol.

[0051] Step 4:

[0052] The server inputs the received video data into an AI analysis engine. Here, the scenes and movements within the video are analyzed. The AI uses a pre-trained model to identify actions that appear to be abnormal or illegal.

[0053] Step 5:

[0054] The server determines whether there is a possibility of illegal activity based on the analysis results. Risk assessment is performed by comparing the analysis results with a pre-configured database of illegal activities.

[0055] Step 6:

[0056] The server performs a risk assessment and generates a feedback message. The feedback includes specific risk information and recommended actions based on the analysis results.

[0057] Step 7:

[0058] The device (smartphone) receives feedback messages and notifies the user. This notification allows the user to understand the current risks and take safe actions.

[0059] Step 8:

[0060] When the server detects a high-risk event, it activates an automatic notification function. Based on the settings, a notification message is sent to the relevant external organization.

[0061] Step 9:

[0062] The server records the analysis and reporting details in a database. This data is used for subsequent analysis and as planning material for improving local security.
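
The server-side portion of the steps above (Steps 5, 6, 8, and 9) can be read as a simple pipeline. The following is a minimal sketch under stated assumptions, not the implementation disclosed in this application: the action labels, the `ILLEGAL_ACTIVITY_DB` dictionary, the severity scores, and the `HIGH_RISK_THRESHOLD` value are all hypothetical, and the AI analysis engine of Step 4 is replaced by a list of labels that are assumed to have been detected already.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the pre-configured database of illegal activities:
# maps a detected action label to an illustrative severity score in [0, 1].
ILLEGAL_ACTIVITY_DB = {
    "vandalism": 0.6,
    "theft": 0.8,
    "assault": 1.0,
}

# Hypothetical threshold at or above which automatic notification is triggered.
HIGH_RISK_THRESHOLD = 0.75


@dataclass
class Assessment:
    matched: list
    risk: float
    feedback: str
    notified: bool = False


@dataclass
class Server:
    # Stands in for the database of Step 9.
    records: list = field(default_factory=list)
    # Stands in for messages sent to external organizations in Step 8.
    reports: list = field(default_factory=list)

    def process(self, detected_labels, location):
        # Step 5: compare the analysis results with the illegal-activity database.
        matched = [label for label in detected_labels if label in ILLEGAL_ACTIVITY_DB]
        # Step 6: assess risk (here, the highest matched severity) and build feedback.
        risk = max((ILLEGAL_ACTIVITY_DB[m] for m in matched), default=0.0)
        if matched:
            feedback = f"Possible {', '.join(matched)} nearby; consider moving away."
        else:
            feedback = "No known risk detected."
        result = Assessment(matched, risk, feedback)
        # Step 8: automatic notification for high-risk events.
        if risk >= HIGH_RISK_THRESHOLD:
            self.reports.append({"location": location, "events": matched, "risk": risk})
            result.notified = True
        # Step 9: record the analysis and reporting details.
        self.records.append(result)
        return result
```

Taking the maximum severity as the risk score is only one possible choice; a real system might weight several detections or use a trained model instead.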

[0063] (Example 1)

[0064] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0065] In modern society, ensuring individual safety and maintaining public order requires quickly and accurately understanding the environment and encouraging appropriate actions. However, conventional monitoring technologies have limitations in real-time risk assessment and rapid feedback to users, making it difficult to implement effective countermeasures. Furthermore, there is a need for a comprehensive system that can detect specific risk behaviors and notify relevant organizations in a timely manner.

[0066] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0067] In this invention, the server includes means for analyzing actions and objects in visual information and comparing them with a pre-configured database of inappropriate actions; means for performing a risk assessment based on the analysis results and notifying the user terminal; and means for providing the user with recommended actions according to specific situations and supporting action selection. This enables real-time risk assessment and rapid feedback, making it possible to improve local security while enhancing user safety.

[0068] An "image acquisition device" is a device that captures surrounding visual information at high resolution and is portable for personal use.

[0069] A "communication device" is a device that uses wireless technology to transmit acquired visual information to an analysis device.

[0070] An "analysis device" is a computer system that uses advanced artificial intelligence technology to analyze received visual information and determine specific actions.

[0071] The "inappropriate behavior database" is a database that stores information on pre-defined inappropriate behaviors and serves as a standard data storage used to compare analysis results.

[0072] "Risk assessment" is the process of evaluating the degree of risk of a specific action or situation based on analyzed information, and then providing advice to the user based on the results.

[0073] A "user terminal" is a personal electronic device designed to receive feedback and warning information.

[0074] "External organizations" refer to organizations that require a rapid response depending on the specific situation, such as the police or related public institutions.

[0075] "Recommended actions" are suggestions that encourage users to choose safe and appropriate actions based on the analysis results.

[0076] "Support for action choices" refers to a function that assists users in making decisions by recommending safe actions based on the results of risk assessments.

[0077] This invention is a system that promotes safe behavior by capturing and analyzing visual information from the surroundings through an image acquisition device worn by the user. The image acquisition device is equipped with a high-performance camera and sensors, and can record visual information in real time with high accuracy even while the user is moving. The acquired visual information is quickly transmitted via a communication device to an analysis device located remotely using wireless technology such as Bluetooth.

[0078] The server functions as a cloud-based analysis device, receiving visual information and executing advanced algorithms using artificial intelligence technology to analyze the actions and objects contained within that visual information. The analysis results are compared against an inappropriate behavior database, and a risk assessment of each action is performed. The server notifies the user's terminal of the assessment results and provides real-time feedback through an application on the terminal.

[0079] As a concrete example, consider a scenario where a user is walking near a commercial facility at night and image acquisition equipment detects unusual movement within a crowd. In this case, the server quickly analyzes the situation and notifies the user of the potential risk. Simultaneously, if the server determines that there is a significant risk, it will automatically report to external organizations such as the police, as necessary, based on pre-configured settings.

[0080] An example of a prompt message is, "Generate recommended safe actions for a user to take when they spot a suspicious group near a commercial facility, based on image analysis." Based on this message, the generative AI model creates action suggestions appropriate to the risk, prompting the user to make a safe choice.
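
A prompt of this kind could be assembled from the analysis results rather than written by hand. The sketch below is illustrative only: the function name `build_safety_prompt`, its parameters, and the field layout of the prompt are assumptions, since the application gives only the single example sentence above and does not specify a prompt format.

```python
def build_safety_prompt(detections, place, time_of_day):
    """Compose a prompt asking a generative model for recommended safe actions.

    All parameter names and the layout of the prompt are hypothetical.
    """
    if not detections:
        raise ValueError("at least one detection is required")
    observed = "; ".join(detections)
    return (
        "Generate recommended safe actions for a user, based on image analysis.\n"
        f"Location: {place}\n"
        f"Time: {time_of_day}\n"
        f"Observed: {observed}\n"
        "Answer with a short numbered list."
    )
```

For example, `build_safety_prompt(["suspicious group"], "near a commercial facility", "night")` yields a prompt that carries the same information as the example sentence, with each detail on its own line.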

[0081] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0082] Step 1:

[0083] The user wears an image acquisition device to acquire visual information of their surroundings in real time. The input is visual data of the user's environment, and the output is high-resolution visual data. This data is ready to be transmitted directly to a communication device. Specifically, the camera and sensors work together to continuously capture frames and digitize this data.

[0084] Step 2:

[0085] The terminal transmits the acquired visual data to the server via a communication device. The input is the visual data generated in step 1, and the output is the completion of secure data transfer using wireless technology. Communication is mainly carried out using the Bluetooth protocol, and the data is compressed before being transferred to the server.

[0086] Step 3:

[0087] The server analyzes the visual data it receives. The input is compressed visual data sent from the terminal, and the output is behavioral information and risk assessment data as a result of the analysis. Here, an AI analysis engine is used to perform object recognition and behavioral analysis, and the process is carried out by comparing it with an inappropriate behavior database.

[0088] Step 4:

[0089] The server performs a risk assessment based on the analysis results. The input is the behavioral and risk information obtained from the analysis in step 3, and the output is the details of the assessed risk and recommended actions. Specifically, the process involves quantifying the degree of risk based on the analyzed data and incorporating action suggestions.

[0090] Step 5:

[0091] The device receives notifications from the server and provides feedback to the user. Input is a risk assessment and recommended action sent from the server, while output is a specific warning message and action suggestion to the user. Specifically, this includes actions such as the application on the device displaying notifications in real time and providing information in a way that is easily understandable to the user.

[0092] Step 6:

[0093] The server automatically notifies external organizations if a significant risk is identified. Inputs are the risk assessment results and location information obtained in step 4, and output is a report to the relevant authorities. Specific actions include organizing information that meets the notification criteria and sending data via automated email or API.

[0094] Step 7:

[0095] The server stores the analysis results and the evaluation information based on them in a database. The input is the data and evaluation information from steps 3 to 6, and the output is an update to the database that can be used for future analysis and security improvement measures. Specifically, the data is stored securely using a database management system so that it can be analyzed as needed.
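
Two of the steps above, compressing data before transfer (Step 2) and storing results in a database (Step 7), can be sketched with the Python standard library. This is a hedged illustration, not the disclosed implementation: lossless `zlib` compression stands in for whatever codec a real device would use for video, an in-memory SQLite database stands in for the database management system, and the table name `assessments` and its columns are hypothetical.

```python
import json
import sqlite3
import zlib


def compress_frame(raw: bytes) -> bytes:
    """Step 2 (sketch): compress visual data before transfer."""
    return zlib.compress(raw)


def decompress_frame(payload: bytes) -> bytes:
    """Step 3 (sketch): restore the data on the server side."""
    return zlib.decompress(payload)


def open_store() -> sqlite3.Connection:
    """Step 7 (sketch): create an in-memory table for analysis results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE assessments ("
        "id INTEGER PRIMARY KEY, location TEXT, risk REAL, detail TEXT)"
    )
    return conn


def save_assessment(conn, location, risk, detail):
    """Insert one assessment; `detail` is a dictionary serialized as JSON text."""
    # A parameterized query avoids building SQL from untrusted strings.
    conn.execute(
        "INSERT INTO assessments (location, risk, detail) VALUES (?, ?, ?)",
        (location, risk, json.dumps(detail)),
    )
    conn.commit()
```

Stored rows can later be read back with an ordinary `SELECT`, which is the kind of access that later analysis of local security trends would need.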

[0096] (Application Example 1)

[0097] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0098] In modern society, it is difficult for individual users to accurately assess the safety of their current environment and take appropriate action, and in many cases, accidents and troubles occur due to a lack of caution. To improve this situation and enable users to live their daily lives with peace of mind, a system is needed that can assess environmental risks in real time and provide swift and concrete safety measures.

[0099] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0100] In this invention, the server includes means for providing feedback to the user to identify surrounding hazards and recommend safe actions, means for coordinating with external law enforcement organizations as needed, and means for generating feedback, including details, in real time when the analysis server detects suspicious behavior, and providing it immediately to the user terminal. This enables the user to continuously take safe and appropriate actions.

[0101] An "image acquisition device" is a device that captures surrounding visual information at high resolution and outputs it as digital data.

[0102] A "communication device" is a device used to transmit digital data to other devices or servers using wireless or wired technology.

[0103] An "analysis server" is a computer system installed to process received data and perform analysis based on specific conditions.

[0104] A "database of illegal activities" is a database that records information about actions and objects that violate laws and regulations that have been registered in advance.

[0105] "Risk assessment" is the process of determining potential risks related to the current situation based on the results of an analysis.

[0106] A "user terminal" is an electronic device used by a user to receive information and to receive feedback and recommendations through an interface.

[0107] "Automatic reporting" refers to a function where the system automatically reports the situation to external relevant organizations when certain criteria are met.

[0108] "External organizations" refer to groups such as the police and private security organizations that cooperate for the purpose of maintaining public order and ensuring safety.

[0109] "Feedback" refers to information provided based on analysis results to inform users about the situation they are facing and to guide them toward appropriate actions.

[0110] A "public safety organization" is a public or private group that works to ensure local security and compliance with the law.

[0111] In the system based on this invention, the user first utilizes an image acquisition device equipped with a high-performance camera and sensors. This image acquisition device collects surrounding visual information in real time and transmits the acquired data to an analysis server via a communication device. The communication uses wireless technologies such as Bluetooth and 4G/5G.

[0112] The server compares the received video data with a pre-configured database of illegal activities and performs analysis using an AI analysis engine. A cloud-based solution such as Google Cloud AI could be used as the AI analysis engine. Based on the analysis results, the server assesses the risk and immediately notifies the user's device. The user's device is a mobile device such as a smartphone or tablet, and feedback is displayed on its screen to provide the user with recommendations for safe behavior.

[0113] Furthermore, the server uses an automated reporting function as needed to report the situation to external law enforcement organizations. This function mitigates potential risks faced by users and enables a rapid response. Users can take actions to avoid risks based on feedback from the system.

[0114] As a concrete example, if a user is walking at night using their smartphone, the smartphone's AI security assistant will detect suspicious activity in the vicinity, issue a warning to the user, and suggest a safe route. Furthermore, if the situation is deemed particularly dangerous, it will immediately notify the police. In this way, users can move around with peace of mind.

[0115] An example of a prompt for a generative AI model might be: "Please describe a risk detection algorithm for ensuring safety at night. In particular, please describe in detail how to efficiently recognize suspicious activity in the surroundings and how to design a system that provides feedback to the user."

[0116] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0117] Step 1:

[0118] The user uses an image acquisition device to acquire high-resolution video data of the surroundings in real time. During this process, the camera and sensors of the image acquisition device capture visual information, and this data is transmitted to the communication device in digital format. The input is real-time acquired visual data, and the output is digital video data.

[0119] Step 2:

[0120] The communication device transmits the acquired digital video data to the server. Wireless technologies such as Bluetooth and LTE are used for fast and secure data transfer. The input is the video data generated in step 1, and the output is the data delivered to the server.

[0121] Step 3:

[0122] The server receives video data and processes it using an AI analysis engine. This AI analysis engine uses a cloud-based solution such as Google Cloud AI to analyze actions and objects in the video and compare them against a database of illegal activities. The input is the video data sent to the server, and the output is the analyzed information on actions and objects.

[0123] Step 4:

[0124] The server assesses the risks of the current environment based on the analyzed data. Using a risk assessment algorithm, it performs a data evaluation process, obtaining the level of risk as output. The input is the analysis result from step 3.

[0125] Step 5:

[0126] The server notifies the user terminal of the assessed risk results. The user terminal displays warnings and feedback on safety actions on the screen of a smartphone or tablet. The input is the risk assessment results obtained in step 4, and the output is the notification content to the user.

[0127] Step 6:

[0128] The server automatically notifies external law enforcement organizations of risk information when the risk exceeds a certain threshold. The notification includes location information and details of the risk, prompting a swift external response. The input is the risk assessment result from step 4, and the output is the reported risk information.

[0129] Step 7:

[0130] The user takes safe actions based on the feedback. They follow the provided safe routes and guidelines, and take specific actions to avoid risks. The input is the feedback information from step 5, and the output is the user's actions.
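Steps 4 to 6 above can be illustrated with a minimal sketch. The database contents, the severity values, the scoring rule (severity multiplied by detector confidence), and the value of `REPORT_THRESHOLD` are all assumptions made for illustration; the specification only states that a risk assessment algorithm is used and that reporting occurs when the risk exceeds a certain threshold.

```python
from dataclasses import dataclass

# Hypothetical illegal-activity database: action label -> severity in [0, 1].
ILLEGAL_ACTIVITY_DB = {"assault": 0.9, "theft": 0.7, "vandalism": 0.5}

# Assumed value; the specification only says "a certain threshold".
REPORT_THRESHOLD = 0.8


@dataclass
class Detection:
    label: str         # action or object label produced by the AI analysis engine
    confidence: float  # detector confidence in [0, 1]


def assess_risk(detections):
    """Step 4: highest severity-weighted confidence among database matches."""
    scores = [
        ILLEGAL_ACTIVITY_DB[d.label] * d.confidence
        for d in detections
        if d.label in ILLEGAL_ACTIVITY_DB
    ]
    return max(scores, default=0.0)


def handle_detections(detections, location):
    """Steps 4 to 6: assess risk, build the user notice, decide on reporting."""
    risk = assess_risk(detections)
    result = {"risk": risk, "user_message": None, "report": None}
    if risk > 0.0:
        # Step 5: warning and safety feedback for the user terminal.
        result["user_message"] = (
            f"Warning: risk level {risk:.2f}. Consider taking a safer route."
        )
    if risk > REPORT_THRESHOLD:
        # Step 6: the report carries location information and risk details.
        result["report"] = {"location": location, "risk": risk}
    return result
```

In this sketch a report is produced only when the score exceeds the threshold, while the user receives a warning for any database match, which mirrors the separation between Step 5 and Step 6.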

[0131] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0132] This invention provides a system that incorporates an emotion engine in addition to an image acquisition device and an AI analysis engine, thereby providing feedback and notifications that take into account the user's emotional state. This system ensures user safety while simultaneously improving local public safety by considering psychological factors.

[0133] First, the user puts on an image acquisition device. This device not only captures images of the surroundings but also has the function of detecting the user's own facial expressions and voice. The device acquires the user's real-time facial expression data and sends it to the emotion engine.

[0134] The device (usually a smartphone) transmits the acquired video data and facial expression data to an analysis server. This communication is conducted using a secure protocol.

[0135] The server inputs the received data into an AI analysis engine and an emotion engine. The AI analysis engine works to detect illegal activities in the area from the video data. Meanwhile, the emotion engine analyzes the user's emotional state from their facial expressions and voice tone to determine their stress and anxiety levels.

[0136] Based on the analysis results, the server assesses the overall risk. Here, the user's emotional state is reflected in the risk assessment and incorporated into the final feedback. For example, if the user is in a high-stress state, the tone of the feedback may be changed, or additional warnings may be issued.

[0137] The device receives feedback from the server and notifies the user. This feedback includes not only potential illegal activity but also emotionally sensitive advice and recommendations. If necessary, the server automatically reports to external agencies. These reports include spatial and temporal information, as well as warnings based on emotional data.

[0138] Ultimately, the server stores the analysis results and emotional state data in a database. This data is used as a foundation for analyzing the psychological and safety trends of the community and is utilized in developing future improvement plans.

[0139] As a concrete example, suppose a user is walking through a noisy shopping street when an image acquisition device detects a violent incident occurring nearby. When the emotion engine senses the user's anxiety, the server determines that the situation is high-risk. It issues a warning to the device and simultaneously automatically notifies the police, prompting a swift response. In this way, the system can perform comprehensive risk management, including the user's psychological state.

[0140] The following describes the processing flow.

[0141] Step 1:

[0142] The user puts on the image acquisition device and activates the system. The device is equipped with facial expression detection capabilities and monitors the user's facial movements and voice tone in real time.

[0143] Step 2:

[0144] The terminal (smartphone) receives video data and user facial expression data transmitted from the image acquisition device. The data is transferred using a secure wireless protocol.

[0145] Step 3:

[0146] The terminal uploads video data and facial expression data to an analysis server. The server receives this data in a single process.

[0147] Step 4:

[0148] The server inputs video data into an AI analysis engine to determine whether the people or movements in the video constitute illegal activity. This determination is made by comparing the data with a database of illegal activities.

[0149] Step 5:

[0150] Simultaneously, the server inputs the user's facial expression data into the emotion engine. The emotion engine analyzes the user's emotional state (e.g., stress, anxiety, tension). This analysis combines facial expression recognition algorithms with voice analysis.

[0151] Step 6:

[0152] The server integrates the analysis results from the AI analysis engine and the emotion engine to perform a comprehensive risk assessment. If the user's emotional state is determined to be higher risk than usual, special considerations are added to the assessment results.

[0153] Step 7:

[0154] The server generates feedback that takes the user's emotions into consideration, based on the analysis results. This feedback includes recommendations for specific actions and advice on emotional support.

[0155] Step 8:

[0156] The device receives the generated feedback and immediately notifies the user. The notification includes detailed information about the situation the user is facing and recommended actions.

[0157] Step 9:

[0158] If necessary, the server will trigger an automatic reporting system. The report will include details about the potential illegal activity, as well as an analysis of the user's emotional state.

[0159] Step 10:

[0160] The server stores all analysis results and sentiment data in a database, which will be used for long-term analysis of local security and the psychological state of residents. This data will contribute to planning future security improvement measures.
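The integration in Steps 6 and 7 above could be sketched as follows. The additive weighting, the `stress_weight` parameter, and the cutoff values are hypothetical; the specification does not disclose how the emotion engine's output is combined with the result of the AI analysis engine.

```python
def integrate_risk(activity_risk, stress_level, stress_weight=0.2):
    """Step 6: combine activity risk and user stress (both in [0, 1]).

    The additive rule and the default weight are illustrative assumptions.
    The result is capped at 1.0.
    """
    if not (0.0 <= activity_risk <= 1.0 and 0.0 <= stress_level <= 1.0):
        raise ValueError("activity_risk and stress_level must be in [0, 1]")
    return min(1.0, activity_risk + stress_weight * stress_level)


def build_feedback(overall_risk, stress_level, risk_cutoff=0.5, stress_cutoff=0.7):
    """Step 7: generate feedback lines that take the user's emotions into account."""
    lines = []
    if overall_risk >= risk_cutoff:
        lines.append("Potential danger nearby. Please move to a safer location.")
    else:
        lines.append("No significant risk detected.")
    if stress_level >= stress_cutoff:
        # Emotional-support advice is added only for a high-stress user.
        lines.append("Take a deep breath. Guidance will continue until you are safe.")
    return lines
```

For example, `build_feedback(integrate_risk(0.5, 1.0), 1.0)` yields both a relocation recommendation and an emotional-support line, whereas a calm user in the same surroundings would receive only the first.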

[0161] (Example 2)

[0162] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0163] Ensuring social safety and maintaining individual psychological health are crucial challenges in modern society. Conventional security systems only detect surrounding dangers, making it difficult to respond appropriately while considering the user's emotional state. Furthermore, they lack sufficient real-time notifications and warnings to respond quickly to abnormal situations. There is a need to realize a safe and psychologically secure society that takes into account emotional states and environmental changes.

[0164] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0165] In this invention, the server includes means for capturing ambient data via image and sound acquisition equipment, means for transmitting the acquired data to an analysis device via a communication device, and means for the analysis device to analyze the video and audio data and compare it with a pre-configured database to evaluate behavior or emotions. This enables comprehensive risk management that takes into account emotional states while ensuring user safety.

[0166] "Image and audio acquisition equipment" refers to devices that capture surrounding video and audio, and acquire the user's visual and auditory information in real time.

[0167] A "communication device" is a device used to transmit acquired data to other devices or servers, and it ensures the security of the data using security protocols.

[0168] An "analysis device" is a device that analyzes behavior and emotions based on received data, and uses AI technology to process the data and make predictions and judgments.

[0169] A "database" is a collection of information that serves as a reference standard for video and audio analysis, and it forms the basis for comparing data with past data and specified information.

[0170] "Assessment of behavior or emotion" is the process by which an analytical device analyzes a person's behavioral patterns and emotional state based on data, and determines their stress level and risk.

[0171] An "information terminal" refers to a device used to notify users of analysis results and feedback, and is a portable communication device.

[0172] "Automatic reporting to external agencies" is a process that automatically reports information to the relevant authorities when an anomaly or danger is detected, in order to facilitate a rapid response.

[0173] To implement this invention, the user first wears an image and audio acquisition device. This device is capable of capturing surrounding video and the user's own voice in real time. Next, the device transmits the acquired data to an analysis device. At this time, communication technologies such as Bluetooth and Wi-Fi are used, and the security of the data is ensured by security protocols.

[0174] The server inputs the received data into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes the video data to detect abnormal behavior. Meanwhile, the emotion analysis engine analyzes the user's emotional state based on audio data and facial expression data to determine the level of stress and anxiety. The analysis device compares the data with a pre-configured database to assess the risk.

[0175] Based on the evaluation results, the device provides feedback to the user. This feedback may include advice and action guidelines tailored to the user's emotional state. Furthermore, if necessary, the server automatically notifies external organizations to prompt a swift response. This information includes location and time data from the analysis device.

[0176] For example, if a user is walking in a noisy area, the image and audio acquisition device detects danger, and the server determines that the user is experiencing high stress, the device displays instructions to the user such as "move to a safe place" or "take a deep breath." By utilizing this system, user safety can be ensured, and real-time risk management becomes possible.

[0177] Examples of prompts for a generative AI model:

[0178] "Please explain the specific processing steps for a local security system that takes user emotions into consideration. In particular, please focus on the data analysis and feedback provision processes."

[0179] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0180] Step 1:

[0181] The user wears image and audio acquisition equipment. This equipment captures surrounding video and the user's own voice in real time and saves it as digital data. The input is ambient environmental data and the user's voice data, and the output at this stage is digitized video and audio data. Specifically, the camera and microphone become constantly active and begin capturing data.

[0182] Step 2:

[0183] The terminal receives the acquired digital data and transmits it to the analysis device. The data is transmitted via Bluetooth or Wi-Fi, with measures taken to keep it secure. The input is digitized video and audio data, and the output is secure data packets for transmission to the server. Specifically, the data is compressed and encrypted, and the packetized data is transmitted over the network.

[0184] Step 3:

[0185] The server inputs data received from the terminal into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes behavior and abnormal situations from video data, while the emotion analysis engine analyzes voice tone and facial expressions to determine the user's emotional state. The input is compressed data sent to the server, and the output is the behavior analysis results and the emotional state evaluation results. Specifically, the analysis engine uses an image recognition algorithm to detect suspicious activities and emotional patterns through data pattern matching.

[0186] Step 4:

[0187] The server performs a comprehensive risk assessment based on the analysis results. The user's emotional state is a key factor in this assessment. Inputs are behavioral analysis results and emotional state assessment results, while outputs are a risk score and recommended actions. Specifically, the server integrates the data and applies a risk algorithm to quantify the potential risk level.

[0188] Step 5:

[0189] The device receives risk assessment results from the server and provides feedback to the user. This feedback includes emotionally sensitive advice. Input is the risk score and recommended actions from the server, and output is notifications and recommended actions to the user. Specifically, the device displays alerts and draws attention through sound and vibration.

[0190] Step 6:

[0191] If necessary, the server will automatically notify external organizations. This notification will include the user's location and time information. The input is detailed data on situations deemed high-risk, and the output is the notification information sent to external organizations. Specifically, the notification protocol is activated, and a standardized message is sent to pre-configured emergency contacts.
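The compression and packetization mentioned in Step 2 above can be sketched as follows. The 6-byte header layout (sequence number, packet count, payload length) and the function names are assumptions for illustration and are not taken from the specification. Encryption is deliberately left out of this sketch because the Python standard library offers no symmetric cipher; in practice it would typically be provided by the transport (for example TLS) or a vetted cryptography library.

```python
import struct
import zlib

# Hypothetical header: sequence number, total packet count, payload length.
# Each field is an unsigned 16-bit integer in network byte order.
HEADER = struct.Struct("!HHH")


def packetize(data, max_payload=1024):
    """Compress the data and split it into header-prefixed packets."""
    if not 1 <= max_payload <= 0xFFFF:
        raise ValueError("max_payload must fit in 16 bits")
    compressed = zlib.compress(data)
    chunks = [
        compressed[i:i + max_payload]
        for i in range(0, len(compressed), max_payload)
    ]
    total = len(chunks)  # struct raises an error if this exceeds 65535
    return [
        HEADER.pack(seq, total, len(chunk)) + chunk
        for seq, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Reorder packets by sequence number and decompress the payload."""
    parts = {}
    total = None
    for packet in packets:
        seq, total, length = HEADER.unpack_from(packet)
        parts[seq] = packet[HEADER.size:HEADER.size + length]
    if total is None or len(parts) != total:
        raise ValueError("missing packets")
    return zlib.decompress(b"".join(parts[i] for i in range(total)))
```

Because each packet carries its own sequence number, the receiving side can restore the original data even when packets arrive out of order, and it can detect that a packet is missing before attempting decompression.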

[0192] (Application Example 2)

[0193] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0194] In modern society, deteriorating public safety and increasing psychological stress are serious problems. In public spaces, while prevention of illegal activities and swift response are required, it is also necessary to consider the psychological burden on individuals. However, conventional systems have difficulty accurately grasping the emotional state of users and conducting risk assessments accordingly, which has sometimes led to delays in response. This invention aims to solve these problems and realize both user safety and psychological consideration.

[0195] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0196] In this invention, the server includes means for capturing surrounding video via an image acquisition device, means equipped with an AI analysis engine for analyzing the acquired video and audio data, and means for evaluating the user's emotional state using an emotion analysis engine to determine risk. This enables early detection of illegal activities that also takes into account the user's emotional state, and provides feedback to reduce psychological burden.

[0197] An "image acquisition device" is a device that captures surrounding video footage and collects visual information about the user and the environment.

[0198] A "communication device" is a device used to transmit acquired data to a server for analysis, and it has the function of connecting using a secure protocol.

[0199] An "analysis server" is a computer system that processes received video and audio data to perform behavioral analysis, sentiment analysis, and risk assessment.

[0200] An "AI analysis engine" is artificial intelligence that analyzes actions and objects from video data and detects illegal activities by comparing them with a pre-configured database of illegal activities.

[0201] An "emotion analysis engine" is a system that uses voice and facial expression data to analyze the user's emotional state and evaluate their stress and anxiety levels.

[0202] A "database of illegal activities" is a collection of information that records past cases and behavioral patterns based on laws, and is used to cross-reference with analysis results.

[0203] "Risk assessment" is the process of evaluating the degree of potential danger and psychological burden based on analysis results and emotional assessments.

[0204] A "user terminal" is an electronic device that notifies users of analysis results and feedback, and is used for users to receive information.

[0205] "Automatic notification" is a function that sends alerts to external organizations based on analysis results and risk assessments as needed.

[0206] A "database" is a means of storing information that accumulates the results of analysis and is used for analyzing and providing information on local public safety and psychological trends.

[0207] To implement this invention, the following system configuration is primarily required. The user first wears an image acquisition device, which acquires video footage of the surroundings and their own facial expression data. The server receives this acquired video and audio data and performs analysis using an AI analysis engine and an emotion analysis engine. Specifically, the AI analysis engine analyzes actions and objects in the video and compares them with a database of illegal activities to assess potential risks.

[0208] The server uses an emotion analysis engine to analyze the user's emotional state from voice data and determine the level of stress and anxiety. The behavior analysis result and the emotional assessment are evaluated together to determine the risk level. If necessary, the server uses an automatic notification function to send an alert to an external organization. The evaluation result is also sent to the user's terminal, providing the user with feedback and warnings.

[0209] The software used includes TENSORFLOW® and PyTorch, which enable the functionality of the AI analysis engine. Data processing and calculations are performed through these frameworks. For emotion analysis, custom algorithms are used to analyze facial expressions and voice tone.

[0210] For example, if a user is walking through a busy area and the emotion engine detects potential illegal activity from surrounding noise and behavior and assesses the user's stress level as high, the system provides feedback such as, "There is a problem nearby. Please move to a safer location," and contacts the police if necessary.

[0211] An example of a prompt message is: "The user is walking through a busy area and feels uneasy about their surroundings. The emotion engine has assessed the stress level as high. Please provide a safety-conscious action plan for how to respond to this situation." This demonstrates how the system can generate questions based on the user's emotions and circumstances.
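One way such a prompt could be assembled from the emotion engine's output is sketched below. The template wording follows the example above; the function names, the numeric stress scale, and the cutoffs that map a score to "high", "moderate", or "low" are hypothetical.

```python
# Template based on the example prompt in the text; placeholders are assumptions.
PROMPT_TEMPLATE = (
    "The user is {situation} and feels {feeling}. "
    "The emotion engine has assessed the stress level as {stress_label}. "
    "Please provide a safety-conscious action plan for how to respond "
    "to this situation."
)


def stress_label(stress_level):
    """Map a stress score in [0, 1] to a label (cutoffs are illustrative)."""
    if stress_level >= 0.7:
        return "high"
    if stress_level >= 0.4:
        return "moderate"
    return "low"


def build_prompt(situation, feeling, stress_level):
    """Fill the template with the user's situation and assessed stress."""
    return PROMPT_TEMPLATE.format(
        situation=situation,
        feeling=feeling,
        stress_label=stress_label(stress_level),
    )
```

Calling `build_prompt("walking through a busy area", "uneasy about their surroundings", 0.9)` reproduces the example prompt given in the text.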

[0212] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0213] Step 1:

[0214] The user wears an image acquisition device. The device acquires video of the surroundings and the user's facial expressions, and simultaneously collects audio data. This input data is transmitted to a server via a communication device. The output here is video and audio data.

[0215] Step 2:

[0216] The server inputs the received video data into its AI analysis engine and begins analyzing behavior and objects. It then performs a process of comparing the data with an illegal activity database to determine whether or not illegal activity has occurred. The input here is video data, and the output is the analyzed behavior data and the evaluation results of illegal activity.

[0217] Step 3:

[0218] Simultaneously, the server inputs the voice data into an emotion analysis engine to evaluate the user's emotional state. This analysis focuses on determining stress and anxiety levels. The input is voice data, and the analyzed emotion data is the output.

[0219] Step 4:

[0220] The server integrates behavioral and emotional data to perform a risk assessment. Here, it determines the overall level of risk and decides on the necessary actions. Inputs are behavioral and emotional data, and output is the risk level assessment result.

[0221] Step 5:

[0222] The server generates and sends feedback to the user terminal based on the evaluation results. This feedback includes specific advice and warnings. The input is the risk assessment result, and the output is the content of the notification.

[0223] Step 6:

[0224] In some cases, the server uses an automated notification function to send an alert to the appropriate external organization. The input is the risk assessment result, and the output is the content of the notification.

[0225] Each step incorporates multiple evaluation processes and data analysis methods to enable rapid responses based on the user's safety and psychological state.
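Steps 2 to 4 above describe behavior analysis and emotion analysis running simultaneously before their results are integrated. A minimal sketch of that arrangement using a thread pool is shown below. Both analysis functions are toy stand-ins with assumed inputs (a list of action labels and a list of stress cues); they do not represent the actual AI analysis engine or emotion analysis engine, and the level rules are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy illegal-activity set used by the stand-in analysis function.
ILLEGAL_LABELS = {"assault", "theft"}


def analyze_behavior(video_labels):
    """Stand-in for Step 2: return labels that match the illegal-activity set."""
    return [label for label in video_labels if label in ILLEGAL_LABELS]


def analyze_emotion(voice_cues):
    """Stand-in for Step 3: average hypothetical stress cues in [0, 1]."""
    return sum(voice_cues) / len(voice_cues) if voice_cues else 0.0


def assess(video_labels, voice_cues, stress_cutoff=0.7):
    """Run both analyses concurrently, then integrate them (Step 4)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        behavior_future = pool.submit(analyze_behavior, video_labels)
        emotion_future = pool.submit(analyze_emotion, voice_cues)
        flagged = behavior_future.result()
        stress = emotion_future.result()
    if flagged and stress >= stress_cutoff:
        level = "high"
    elif flagged:
        level = "medium"
    else:
        level = "low"
    return {"flagged": flagged, "stress": stress, "level": level}
```

Threads are used here only to show the structure; a real deployment with compute-heavy models would more likely use separate processes or services for the two engines.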

[0226] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0227] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0228] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0229] [Second Embodiment]

[0230] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0231] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0232] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0233] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0234] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0235] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0236] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0237] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0238] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0239] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0240] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0241] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0242] This invention provides a system that combines smart devices and AI analysis technology to enable users to safely promote legal compliance in their daily lives. Specific embodiments are described below.

[0243] First, the user attaches a dedicated image acquisition device. This device is equipped with high-performance cameras and sensors that can capture the surrounding environment in high resolution in real time.

[0244] The captured video is transmitted directly to the communication device via the terminal. This communication uses wireless technologies such as Bluetooth, ensuring fast and secure data transmission.

[0245] The transmitted data is accessed by a server in the cloud. The server integrates an AI analysis engine and begins processing the received video data immediately. Specifically, it detects the movement of objects and people in the video and compares it against a pre-configured database of illegal activities.

[0246] If the server performs an analysis and detects illegal activity or related risks, a risk assessment process is initiated. Based on the results of this assessment, feedback is immediately provided to the user. This feedback includes details of the relevant risk and recommended actions, and is communicated to the user through the application on their device.

[0247] Furthermore, depending on the risk assessment results, the server can activate an automatic notification function. This function transmits necessary information to external relevant organizations according to pre-configured criteria. The notification may include location information and video footage from the scene, enabling a rapid response.

[0248] Finally, the server saves the analysis results to a database. This saved data can be used for future analysis of local security and understanding trends, and can serve as basic data for policy decisions and security improvement measures.

[0249] As a concrete example, suppose a user is walking through the city at night and spots a suspicious group near a commercial facility. In this case, the image acquisition device automatically captures video footage, and the server analyzes their actions. Based on the analysis results, potential risks are fed back, allowing the user to choose safer actions. Furthermore, if a significant risk is detected, the server immediately notifies the police, contributing to local safety by promoting preventative measures.

[0250] The following describes the processing flow.

[0251] Step 1:

[0252] The user attaches the image acquisition device and starts capturing video. The device's sensors capture the surrounding environment and continuously generate video data.

[0253] Step 2:

[0254] The terminal (image acquisition device) transmits the captured video data to the smartphone using its built-in communication device. This communication is usually performed via Bluetooth connection.

[0255] Step 3:

[0256] The device (smartphone) uploads the video data received from the smart glasses to an analysis server. The upload is performed using an internet connection, and the data is protected by a secure protocol.

[0257] Step 4:

[0258] The server inputs the received video data into an AI analysis engine. Here, the scenes and movements within the video are analyzed. The AI uses a pre-trained model to identify actions that appear to be abnormal or illegal.

[0259] Step 5:

[0260] The server determines whether there is a possibility of illegal activity based on the analysis results. Risk assessment is performed by comparing the analysis results with a pre-configured database of illegal activities.

[0261] Step 6:

[0262] The server performs a risk assessment and generates a feedback message. The feedback includes specific risk information and recommended actions based on the analysis results.

[0263] Step 7:

[0264] The device (smartphone) receives feedback messages and notifies the user. This notification allows the user to understand the current risks and take safe actions.

[0265] Step 8:

[0266] When the server detects a high-risk event, it activates an automatic notification function. Based on the settings, a notification message is sent to the relevant external organization.

[0267] Step 9:

[0268] The server records the analysis and reporting details in a database. This data is used for subsequent analysis and as planning material for improving local security.
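The storage and later trend analysis in Step 9 above could look like the following sketch, which uses SQLite from the Python standard library. The table name, the column set, and the per-area summary query are assumptions for illustration; the specification does not describe the database schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the (hypothetical) results database and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS analysis_results ("
        "id INTEGER PRIMARY KEY, "
        "recorded_at TEXT NOT NULL, "
        "area TEXT NOT NULL, "
        "risk REAL NOT NULL, "
        "reported INTEGER NOT NULL)"
    )
    return conn


def record_result(conn, recorded_at, area, risk, reported):
    """Store one analysis result together with whether it was reported."""
    conn.execute(
        "INSERT INTO analysis_results (recorded_at, area, risk, reported) "
        "VALUES (?, ?, ?, ?)",
        (recorded_at, area, risk, int(reported)),
    )
    conn.commit()


def area_summary(conn):
    """Per-area event count, average risk, and number of reports."""
    return conn.execute(
        "SELECT area, COUNT(*), AVG(risk), SUM(reported) "
        "FROM analysis_results GROUP BY area ORDER BY area"
    ).fetchall()
```

A summary of this kind is one possible form of the "planning material for improving local security" mentioned above: areas with many events or a high average risk stand out directly in the query result.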

[0269] (Example 1)

[0270] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0271] In modern society, ensuring individual safety and maintaining public order requires quickly and accurately understanding the environment and encouraging appropriate actions. However, conventional monitoring technologies have limitations in real-time risk assessment and rapid feedback to users, making it difficult to implement effective countermeasures. Furthermore, there is a need for a comprehensive system that can detect specific risk behaviors and notify relevant organizations in a timely manner.

[0272] The identification process performed by the identification processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0273] In this invention, the server includes means for analyzing actions and objects in visual information and comparing them with a pre-configured database of inappropriate actions; means for performing a risk assessment based on the analysis results and notifying the user terminal; and means for providing the user with recommended actions according to specific situations and supporting action selection. This enables real-time risk assessment and rapid feedback, making it possible to improve local security while enhancing user safety.

[0274] An "image acquisition device" is a device that captures surrounding visual information at high resolution and is portable for personal use.

[0275] A "communication device" is a device that uses wireless technology to transmit acquired visual information to an analysis device.

[0276] An "analysis device" is a computer system that uses advanced artificial intelligence technology to analyze received visual information and determine specific actions.

[0277] The "inappropriate behavior database" is a database that stores information on pre-defined inappropriate behaviors and serves as the reference data against which analysis results are compared.

[0278] "Risk assessment" is the process of evaluating the degree of risk of a specific action or situation based on analyzed information, and then providing advice to the user based on the results.

[0279] A "user terminal" is a personal electronic device designed to receive feedback and warning information.

[0280] "External organizations" refer to organizations that require a rapid response depending on the specific situation, such as the police or related public institutions.

[0281] "Recommended actions" are suggestions that encourage users to choose safe and appropriate actions based on the analysis results.

[0282] "Support for action selection" is a function that assists the user's decision-making by recommending safe actions to the user based on the risk assessment results.

[0283] This invention is a system that captures surrounding visual information through an image acquisition device worn by the user and analyzes it to promote safe actions. The image acquisition device is equipped with a high-performance camera and sensors, and can record visual information in real time with high precision even while the user is moving. Through a communication device, the acquired visual information is quickly transmitted to an analysis device located remotely using wireless technologies such as Bluetooth.

[0284] The server functions as a cloud-based analysis device, receives visual information, and analyzes the actions and objects contained in the visual information by executing an advanced algorithm using artificial intelligence technology. The analysis results are compared with an inappropriate action database, and a risk assessment of each action is performed. The server notifies the user's terminal of the evaluation results and provides real-time feedback through an application on the terminal.

[0285] As a specific example, suppose the user is walking near a commercial facility at night and the image acquisition device detects abnormal movements in a crowd. In this case, the server quickly performs an analysis and notifies the user of potential risks. At the same time, if the server determines that the risk is significant, it automatically reports to an external organization such as the police, according to the pre-configured settings.

[0286] An example of a prompt sentence is: "Please generate a method for recommending safe actions that the user should take based on image analysis when discovering a suspicious group near a commercial facility." Based on this sentence, the generative AI model creates action proposals suited to the risk and prompts the user to make a safe choice.

[0287] The flow of the specific process in Example 1 will be described using FIG. 11.

[0288] Step 1:

[0289] The user wears an image acquisition device to acquire visual information of their surroundings in real time. The input is visual data of the user's environment, and the output is high-resolution visual data. This data is ready to be transmitted directly to a communication device. Specifically, the camera and sensors work together to continuously capture frames and digitize this data.

[0290] Step 2:

[0291] The terminal transmits the acquired visual data to the server via a communication device. The input is the visual data generated in step 1, and the output is the completion of secure data transfer using wireless technology. Communication is mainly carried out using the Bluetooth protocol, and the data is compressed before being transferred to the server.

[0292] Step 3:

[0293] The server analyzes the visual data it receives. The input is compressed visual data sent from the terminal, and the output is behavioral information and risk assessment data as a result of the analysis. Here, an AI analysis engine is used to perform object recognition and behavioral analysis, and the process is carried out by comparing it with an inappropriate behavior database.

[0294] Step 4:

[0295] The server performs a risk assessment based on the analysis results. The input is the behavioral and risk information obtained from the analysis in step 3, and the output is the details of the assessed risk and recommended actions. Specifically, the process involves quantifying the degree of risk based on the analyzed data and incorporating action suggestions.
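One way to picture the quantification in Step 4 is the sketch below. It assumes that step 3 yields (confidence, severity) pairs for matched behaviors and scores the worst case. The `RiskAssessment` type, the thresholds 0.3 and 0.6, and the recommendation wording are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RiskAssessment:
    score: float          # 0.0 (no risk) to 1.0 (highest risk)
    level: str            # "low", "medium", or "high"
    recommendation: str   # action suggestion shown to the user


# Hypothetical wording; a real system would define its own messages.
RECOMMENDATIONS = {
    "low": "No action needed.",
    "medium": "Stay alert and keep your distance.",
    "high": "Leave the area and move to a safe place.",
}


def assess_risk(matches: List[Tuple[float, float]]) -> RiskAssessment:
    """Quantify risk from (confidence, severity) pairs produced in step 3."""
    # Worst case: the highest confidence-weighted severity among all matches.
    score = max((c * s for c, s in matches), default=0.0)
    if score >= 0.6:      # assumed threshold
        level = "high"
    elif score >= 0.3:    # assumed threshold
        level = "medium"
    else:
        level = "low"
    return RiskAssessment(score, level, RECOMMENDATIONS[level])


result = assess_risk([(0.8, 0.9), (0.5, 0.7)])
print(result.level, "-", result.recommendation)
```

Taking the maximum rather than an average keeps a single serious behavior from being diluted by many harmless ones; other aggregation rules are equally possible.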

[0296] Step 5:

[0297] The device receives notifications from the server and provides feedback to the user. The input is the risk assessment and recommended action sent from the server, and the output is a specific warning message and action suggestion for the user. Specifically, the application on the device displays notifications in real time and presents the information in a form the user can easily understand.

[0298] Step 6:

[0299] The server automatically notifies external organizations if a significant risk is identified. Inputs are the risk assessment results and location information obtained in step 4, and output is a report to the relevant authorities. Specific actions include organizing information that meets the notification criteria and sending data via automated email or API.
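The "organizing information that meets the notification criteria" part of Step 6 might look like the following sketch, which assembles a JSON report only when a score crosses a threshold. The field names, the `REPORT_THRESHOLD` value, and the `build_report` function are hypothetical. Actual delivery by email or API is not shown, since the specification does not define the receiving interface.

```python
import json
from datetime import datetime, timezone
from typing import Optional

REPORT_THRESHOLD = 0.6  # hypothetical notification criterion


def build_report(score: float, summary: str, lat: float, lon: float,
                 now: Optional[datetime] = None) -> Optional[str]:
    """Return a JSON report if the score meets the criterion, otherwise None."""
    if score < REPORT_THRESHOLD:
        return None
    # Use the supplied time if given (handy for testing), else the current UTC time.
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    report = {
        "risk_score": score,
        "summary": summary,
        "location": {"lat": lat, "lon": lon},
        "reported_at": timestamp,
    }
    return json.dumps(report)


payload = build_report(0.85, "possible assault detected", 35.68, 139.76)
if payload is not None:
    print(payload)  # this string would be handed to the email or API sender
```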

[0300] Step 7:

[0301] The server stores the analysis results and the evaluation information based on them in a database. The input is the data and evaluation information from steps 3 to 6, and the output is an update to the database that can be used for future analysis and security improvement measures. Specifically, the data is stored securely using a database management system so that it can be analyzed as needed.
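As an illustration of the storage in Step 7, the sketch below uses SQLite from the Python standard library as a stand-in for the database management system. The table name, columns, and helper functions are hypothetical. Security measures such as encryption at rest and access control are outside the scope of this sketch.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) results table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS assessments (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               recorded_at TEXT NOT NULL,
               label TEXT NOT NULL,
               risk_score REAL NOT NULL,
               reported INTEGER NOT NULL
           )"""
    )
    return conn


def save_assessment(conn: sqlite3.Connection, recorded_at: str, label: str,
                    risk_score: float, reported: bool) -> None:
    """Insert one analysis result using bound parameters."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO assessments (recorded_at, label, risk_score, reported) "
            "VALUES (?, ?, ?, ?)",
            (recorded_at, label, risk_score, int(reported)),
        )


def high_risk_count(conn: sqlite3.Connection, threshold: float = 0.6) -> int:
    """Example of later analysis: count stored results at or above a threshold."""
    row = conn.execute(
        "SELECT COUNT(*) FROM assessments WHERE risk_score >= ?", (threshold,)
    ).fetchone()
    return row[0]


conn = open_store()
save_assessment(conn, "2025-01-01T21:30:00Z", "theft", 0.7, True)
save_assessment(conn, "2025-01-01T21:45:00Z", "loitering", 0.2, False)
print(high_risk_count(conn))  # 1
```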

[0302] (Application Example 1)

[0303] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0304] In modern society, it is difficult for individual users to accurately assess the safety of their current environment and take appropriate action, and in many cases, accidents and troubles occur due to a lack of caution. To improve this situation and enable users to live their daily lives with peace of mind, a system is needed that can assess environmental risks in real time and provide swift and concrete safety measures.

[0305] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0306] In this invention, the server includes means for identifying surrounding dangers to the user and providing feedback recommending safe actions; means for collaborating with external public order maintenance organizations as necessary; and means for generating, in real time, feedback that includes the details of suspicious behavior detected by the analysis server and immediately providing it to the user terminal. This makes it possible for the user to continue acting safely and appropriately.

[0307] An "image acquisition device" is a device for capturing surrounding visual information at high resolution and outputting it as digital data.

[0308] A "communication device" is a device for transmitting digital data to other devices or servers by means of wireless or wired technology.

[0309] An "analysis server" is a computer system installed for processing received data and performing analysis based on specific conditions.

[0310] A "law violation database" is a database recording information on actions or objects violating pre-registered laws and regulations.

[0311] "Risk assessment" is a process of judging potential risks related to the current situation based on analysis results.

[0312] A "user terminal" is an electronic device through whose interface the user receives information, feedback, and recommendations.

[0313] "Automatic notification" is a function for the system to automatically report the situation to external related organizations when specific criteria are met.

[0314] "External organizations" refer to groups such as the police and private security organizations that cooperate for the purpose of maintaining public order and ensuring safety.

[0315] "Feedback" refers to information provided based on analysis results to inform users about the situation they are facing and to guide them toward appropriate actions.

[0316] A "public safety organization" is a public or private group that works to ensure local security and compliance with the law.

[0317] In the system based on this invention, the user first utilizes an image acquisition device equipped with a high-performance camera and sensors. This image acquisition device collects surrounding visual information in real time and transmits this acquired data to an analysis server via a communication device. Wireless communication technologies such as Bluetooth and 4G / 5G are used as the communication technology.

[0318] The server compares the received video data with a pre-configured database of illegal activities and performs analysis using an AI analysis engine. A cloud-based solution such as Google Cloud AI could be used as the AI analysis engine. Based on the analysis results, the server assesses the risk and immediately notifies the user's device. The user's device is a mobile device such as a smartphone or tablet, and feedback is displayed on its screen to provide the user with recommendations for safe behavior.

[0319] Furthermore, the server uses an automated reporting function as needed to report the situation to external law enforcement organizations. This function mitigates potential risks faced by users and enables a rapid response. Users can take actions to avoid risks based on feedback from the system.

[0320] As a concrete example, if a user is walking at night using their smartphone, the smartphone's AI security assistant will detect suspicious activity in the vicinity, issue a warning to the user, and suggest a safe route. Furthermore, if the situation is deemed particularly dangerous, it will immediately notify the police. In this way, users can move around with peace of mind.

[0321] An example of a prompt for a generative AI model might be: "Please describe a risk detection algorithm for ensuring safety at night. In particular, please describe in detail how to efficiently recognize suspicious activity in the surroundings and how to design a system that provides feedback to the user."

[0322] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0323] Step 1:

[0324] The user uses an image acquisition device to acquire high-resolution video data of the surroundings in real time. During this process, the camera and sensors of the image acquisition device capture visual information, and this data is transmitted to the communication device in digital format. The input is real-time acquired visual data, and the output is digital video data.

[0325] Step 2:

[0326] The communication device transmits the acquired digital video data to the server. Wireless technologies such as Bluetooth and LTE are used for fast and secure data transfer. The input is the video data generated in step 1, and the output is the data delivered to the server.

[0327] Step 3:

[0328] The server receives video data and processes it using an AI analysis engine. This AI analysis engine uses a cloud-based solution such as Google Cloud AI to analyze actions and objects in the video and compare them against a database of illegal activities. The input is the video data sent to the server, and the output is the analyzed information on actions and objects.

[0329] Step 4:

[0330] The server assesses the risks of the current environment based on the analyzed data. Using a risk assessment algorithm, it performs a data evaluation process, obtaining the level of risk as output. The input is the analysis result from step 3.

[0331] Step 5:

[0332] The server notifies the user terminal of the assessed risk results. The user terminal displays warnings and feedback on safety actions on the screen of a smartphone or tablet. The input is the risk assessment results obtained in step 4, and the output is the notification content to the user.

[0333] Step 6:

[0334] The server automatically notifies external law enforcement organizations of risk information when the risk exceeds a certain threshold. The notification includes location information and details of the risk, prompting a swift external response. The input is the risk assessment result from step 4, and the output is the reported risk information.

[0335] Step 7:

[0336] The user takes safe actions based on the feedback. They follow the provided safe routes and guidelines, and take specific actions to avoid risks. The input is the feedback information from step 5, and the output is the user's actions.

[0337] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the identification processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform identification processing using the user's emotions.

[0338] This invention provides a system that incorporates an emotion engine in addition to an image acquisition device and an AI analysis engine, thereby providing feedback and notifications that take into account the user's emotional state. This system ensures user safety while simultaneously improving local public safety by considering psychological factors.

[0339] First, the user puts on an image acquisition device. This device not only captures images of the surroundings but also has the function of detecting the user's own facial expressions and voice. The device acquires the user's real-time facial expression data and sends it to the emotion engine.

[0340] The device (usually a smartphone) transmits the acquired video data and facial expression data to an analysis server. This communication is conducted using a secure protocol.

[0341] The server inputs the received data into an AI analysis engine and an emotion engine. The AI analysis engine works to detect illegal activities in the area from the video data. Meanwhile, the emotion engine analyzes the user's emotional state from their facial expressions and voice tone to determine their stress and anxiety levels.

[0342] Based on the analysis results, the server assesses the overall risk. Here, the user's emotional state is reflected in the risk assessment and incorporated into the final feedback. For example, if the user is in a high-stress state, the tone of the feedback may be changed, or additional warnings may be issued.

[0343] The device receives feedback from the server and notifies the user. This feedback includes not only potential illegal activity but also emotionally sensitive advice and recommendations. If necessary, the server automatically reports to external agencies. These reports include spatial and temporal information, as well as warnings based on emotional data.

[0344] Ultimately, the server stores the analysis results and emotional state data in a database. This data is used as a foundation for analyzing the psychological and safety trends of the community and is utilized in developing future improvement plans.

[0345] As a concrete example, suppose a user is walking through a noisy shopping street when an image acquisition device detects a violent incident occurring nearby. When the emotion engine senses the user's anxiety, the server determines that the situation is high-risk. It issues a warning to the device and simultaneously automatically notifies the police, prompting a swift response. In this way, the system can perform comprehensive risk management, including the user's psychological state.

[0346] The following describes the processing flow.

[0347] Step 1:

[0348] The user puts on the image acquisition device and activates the system. The device is equipped with facial expression detection capabilities and monitors the user's facial movements and voice tone in real time.

[0349] Step 2:

[0350] The terminal (smartphone) receives video data and user facial expression data transmitted from the image acquisition device. The data is transferred using a secure wireless protocol.

[0351] Step 3:

[0352] The terminal uploads video data and facial expression data to an analysis server. The server receives this data in a single process.

[0353] Step 4:

[0354] The server inputs video data into an AI analysis engine to determine whether the people or movements in the video constitute illegal activity. This determination is made by comparing the data with a database of illegal activities.

[0355] Step 5:

[0356] Simultaneously, the server inputs the user's facial expression data into the emotion engine. The emotion engine analyzes the user's emotional state (e.g., stress, anxiety, tension). This analysis combines facial expression recognition algorithms with voice analysis.
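The combination of facial expression recognition and voice analysis in Step 5 could, for example, be a late fusion of per-emotion scores. The sketch below assumes each modality already yields a score in [0, 1] per emotion label; the `fuse_emotions` function and the default equal weighting are hypothetical, and the specification does not state how the two analyses are combined.

```python
from typing import Dict


def fuse_emotions(face: Dict[str, float], voice: Dict[str, float],
                  face_weight: float = 0.5) -> Dict[str, float]:
    """Weighted average of per-emotion scores from two modalities.

    An emotion missing from one modality is treated as a score of 0.0 there.
    """
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must be in [0, 1]")
    labels = sorted(set(face) | set(voice))
    return {
        label: face_weight * face.get(label, 0.0)
        + (1.0 - face_weight) * voice.get(label, 0.0)
        for label in labels
    }


# Assumed outputs of the facial expression and voice analyzers.
face_scores = {"stress": 0.8, "tension": 0.5}
voice_scores = {"stress": 0.4, "anxiety": 0.6}
print(fuse_emotions(face_scores, voice_scores))
```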

[0357] Step 6:

[0358] The server integrates the analysis results from the AI analysis engine and the emotion engine to perform a comprehensive risk assessment. If the user's emotional state is determined to be higher risk than usual, special considerations are added to the assessment results.
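The integration in Step 6 can be sketched as a score adjustment in which the user's emotional state can raise, but never lower, the risk found by the AI analysis engine. This is only one plausible reading of "special considerations are added"; the `integrate_risk` function, the formula, and the `emotion_weight` value are assumptions made for illustration.

```python
def integrate_risk(behavior_risk: float, stress_level: float,
                   emotion_weight: float = 0.3) -> float:
    """Combine behavior risk and user stress (both in [0, 1]) into one score.

    The stress term closes part of the remaining gap between the behavior
    risk and 1.0, so the result stays in [0, 1] and is never below the
    behavior risk itself.
    """
    if not (0.0 <= behavior_risk <= 1.0 and 0.0 <= stress_level <= 1.0):
        raise ValueError("inputs must be in [0, 1]")
    boosted = behavior_risk + emotion_weight * stress_level * (1.0 - behavior_risk)
    return min(1.0, boosted)


print(integrate_risk(0.5, 0.0))  # calm user: the behavior risk is unchanged
print(integrate_risk(0.5, 1.0))  # highly stressed user: the score is raised
```

Note that under this formula a highly stressed user yields a non-zero score even when no risky behavior is detected, which may or may not be desirable depending on how the score is used.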

[0359] Step 7:

[0360] The server generates feedback that takes the user's emotions into consideration, based on the analysis results. This feedback includes recommendations for specific actions and advice on emotional support.
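A minimal sketch of emotion-aware feedback generation follows. It changes the tone of the message when the estimated stress is high, in line with the earlier remark that the tone may be changed or additional warnings issued. The message texts, the 0.7 stress cutoff, and the `generate_feedback` function are hypothetical.

```python
def generate_feedback(risk_level: str, stress_level: float) -> str:
    """Build a feedback message whose tone depends on the user's stress."""
    actions = {
        "low": "You can continue on your way.",
        "medium": "Please stay alert and keep to well-lit streets.",
        "high": "Please move to a safe place now.",
    }
    if risk_level not in actions:
        raise ValueError(f"unknown risk level: {risk_level}")
    message = actions[risk_level]
    # Assumed cutoff: above it, wrap the recommendation in calming advice.
    if stress_level >= 0.7:
        message = "Take a deep breath. " + message + " Help is available if you need it."
    return message


print(generate_feedback("high", 0.9))
print(generate_feedback("low", 0.1))
```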

[0361] Step 8:

[0362] The device receives the generated feedback and immediately notifies the user. The notification includes detailed information about the situation the user is facing and recommended actions.

[0363] Step 9:

[0364] If necessary, the server will trigger an automatic reporting system. The report will include details about the potential illegal activity, as well as an analysis of the user's emotional state.

[0365] Step 10:

[0366] The server stores all analysis results and emotional state data in a database, which will be used for long-term analysis of local security and the psychological state of residents. This data will contribute to planning future security improvement measures.

[0367] (Example 2)

[0368] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0369] Ensuring social safety and maintaining individual psychological health are crucial challenges in modern society. Conventional security systems only detect surrounding dangers, making it difficult to respond appropriately while considering the user's emotional state. Furthermore, they lack sufficient real-time notifications and warnings to respond quickly to abnormal situations. There is a need to realize a safe and psychologically secure society that takes into account emotional states and environmental changes.

[0370] The identification process performed by the identification processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0371] In this invention, the server includes means for capturing ambient data via image and audio acquisition equipment, means for transmitting the acquired data to an analysis device via a communication device, and means for the analysis device to analyze the video and audio data and compare it with a pre-configured database to evaluate behavior or emotions. This enables comprehensive risk management that takes into account emotional states while ensuring user safety.

[0372] "Image and audio acquisition equipment" refers to devices that capture surrounding video and audio, and acquire the user's visual and auditory information in real time.

[0373] A "communication device" is a device used to transmit acquired data to other devices or servers, and it ensures the security of the data using security protocols.

[0374] An "analysis device" is a device that analyzes behavior and emotions based on received data, and uses AI technology to process the data and make predictions and judgments.

[0375] A "database" is a collection of information that serves as a reference standard for video and audio analysis, and it forms the basis for comparing data with past data and specified information.

[0376] "Assessment of behavior or emotion" is the process by which an analytical device analyzes a person's behavioral patterns and emotional state based on data, and determines their stress level and risk.

[0377] An "information terminal" refers to a device used to notify users of analysis results and feedback, and is a portable communication device.

[0378] "Automatic reporting to external agencies" is a process that automatically reports information to the relevant authorities when an anomaly or danger is detected, in order to facilitate a rapid response.

[0379] To implement this invention, the user first wears an image and audio acquisition device. This device is capable of capturing surrounding video and the user's own voice in real time. Next, the device transmits the acquired data to an analysis device. At this time, communication technologies such as Bluetooth and Wi-Fi are used, and the security of the data is ensured by security protocols.

[0380] The server inputs the received data into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes the video data to detect abnormal behavior. Meanwhile, the emotion analysis engine analyzes the user's emotional state based on audio data and facial expression data to determine the level of stress and anxiety. The analysis device compares the data with a pre-configured database to assess the risk.

[0381] Based on the evaluation results, the device provides feedback to the user. This feedback may include advice and action guidelines tailored to the user's emotional state. Furthermore, if necessary, the server automatically notifies external organizations to prompt a swift response. This information includes location and time data from the analysis device.

[0382] For example, if a user is walking in a noisy area, the image and audio acquisition device detects danger, and the server determines that the user is experiencing high stress, the device will display instructions to the user such as "move to a safe place" or "take a deep breath." By utilizing this system, user safety can be ensured, and real-time risk management becomes possible.

[0383] Examples of prompts for a generative AI model:

[0384] "Please explain the specific processing steps for a local security system that takes user emotions into consideration. In particular, please focus on the data analysis and feedback provision processes."

[0385] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0386] Step 1:

[0387] The user wears image and audio acquisition equipment. This equipment captures surrounding video and the user's own voice in real time and saves it as digital data. The input is ambient environmental data and the user's voice data, and the output at this stage is digitized video and audio data. Specifically, the camera and microphone become constantly active and begin capturing data.

[0388] Step 2:

[0389] The terminal receives the acquired digital data and transmits it to the analysis device. The data is transmitted via Bluetooth or Wi-Fi and is protected by a security protocol. The input is digitized video and audio data, and the output is secure data packets for transmission to the server. Specifically, the data is compressed and encrypted, and the packetized data is transmitted over the network.
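The compression and packetization in Step 2 can be sketched as follows. The packet header layout (CRC32, sequence number, packet count), the chunk size, and the function names are hypothetical. Encryption is deliberately left out of this sketch because the Python standard library has no general-purpose symmetric cipher; a real system would rely on a protected transport such as TLS or on a vetted cryptographic library.

```python
import struct
import zlib
from typing import Dict, List

# Assumed header: CRC32 of the body, sequence number, total number of packets.
# The 16-bit fields limit one message to 65535 packets.
HEADER = struct.Struct("!IHH")


def packetize(data: bytes, chunk_size: int = 1024) -> List[bytes]:
    """Compress data and split it into packets with a small header."""
    compressed = zlib.compress(data)
    chunks = [compressed[i:i + chunk_size]
              for i in range(0, len(compressed), chunk_size)]
    total = len(chunks)
    return [HEADER.pack(zlib.crc32(chunk), seq, total) + chunk
            for seq, chunk in enumerate(chunks)]


def reassemble(packets: List[bytes]) -> bytes:
    """Verify, reorder, and decompress packets on the receiving side."""
    if not packets:
        raise ValueError("no packets received")
    parts: Dict[int, bytes] = {}
    expected_total = 0
    for packet in packets:
        crc, seq, total = HEADER.unpack(packet[:HEADER.size])
        body = packet[HEADER.size:]
        if zlib.crc32(body) != crc:
            raise ValueError(f"packet {seq} is corrupted")
        parts[seq] = body
        expected_total = total
    if len(parts) != expected_total:
        raise ValueError("missing packets")
    return zlib.decompress(b"".join(parts[i] for i in range(expected_total)))


# Placeholder bytes standing in for digitized video and audio data.
sample = bytes(range(256)) * 40
packets = packetize(sample, chunk_size=64)
assert reassemble(list(reversed(packets))) == sample  # order does not matter
```

The CRC32 check only detects accidental corruption; it is not a substitute for the cryptographic protection mentioned in the step.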

[0390] Step 3:

[0391] The server inputs data received from the terminal into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes behavior and abnormal situations from video data, while the emotion analysis engine analyzes voice tone and facial expressions to determine the user's emotional state. The input is compressed data sent to the server, and the output is the behavior analysis results and the emotional state evaluation results. Specifically, the analysis engine uses an image recognition algorithm to detect suspicious activities and emotional patterns through data pattern matching.

[0392] Step 4:

[0393] The server performs a comprehensive risk assessment based on the analysis results. The user's emotional state is a key factor in this assessment. Inputs are behavioral analysis results and emotional state assessment results, while outputs are a risk score and recommended actions. Specifically, the server integrates the data and applies a risk algorithm to quantify the potential risk level.

[0394] Step 5:

[0395] The device receives risk assessment results from the server and provides feedback to the user. This feedback includes emotionally sensitive advice. Input is the risk score and recommended actions from the server, and output is notifications and recommended actions to the user. Specifically, the device displays alerts and draws attention through sound and vibration.

[0396] Step 6:

[0397] If necessary, the server will automatically notify external organizations. This notification will include the user's location and time information. The input is detailed data on situations deemed high-risk, and the output is the notification information sent to external organizations. Specifically, the notification protocol is activated, and a standardized message is sent to pre-configured emergency contacts.

[0398] (Application Example 2)

[0399] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0400] In modern society, deteriorating public safety and increasing psychological stress are serious problems. In public spaces, while prevention of illegal activities and swift response are required, it is also necessary to consider the psychological burden on individuals. However, conventional systems have difficulty accurately grasping the emotional state of users and conducting risk assessments accordingly, which has sometimes led to delays in response. This invention aims to solve these problems and realize both user safety and psychological consideration.

[0401] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0402] In this invention, the server includes means for capturing surrounding video via an image acquisition device, means equipped with an AI analysis engine for analyzing the acquired video and audio data, and means for evaluating the user's emotional state using an emotion analysis engine to determine risk. This enables early detection of illegal activities while taking the user's emotional state into account, and provides feedback that reduces psychological burden.

[0403] An "image acquisition device" is a device that captures surrounding video footage and collects visual information about the user and the environment.

[0404] A "communication device" is a device used to transmit acquired data to a server for analysis, and it has the function of connecting using a secure protocol.

[0405] An "analysis server" is a computer system that processes received video and audio data to perform behavioral analysis, emotion analysis, and risk assessment.

[0406] An "AI analysis engine" is artificial intelligence that analyzes actions and objects from video data and detects illegal activities by comparing them with a pre-configured database of illegal activities.

[0407] An "emotion analysis engine" is a system that uses voice and facial expression data to analyze the user's emotional state and evaluate their stress and anxiety levels.

[0408] A "database of illegal activities" is a collection of information that records past cases and behavioral patterns based on laws, and is used to cross-reference with analysis results.

[0409] "Risk assessment" is the process of evaluating the degree of potential danger and psychological burden based on analysis results and emotional assessments.

[0410] A "user terminal" is an electronic device that notifies users of analysis results and feedback, and is used for users to receive information.

[0411] "Automatic notification" is a function that sends alerts to external organizations based on analysis results and risk assessments as needed.

[0412] A "database" is a means of storing information that accumulates the results of analysis and is used for analyzing and providing information on local public safety and psychological trends.

[0413] To implement this invention, the following system configuration is primarily required. The user first wears an image acquisition device, which acquires video footage of the surroundings and their own facial expression data. The server receives this acquired video and audio data and performs analysis using an AI analysis engine and an emotion analysis engine. Specifically, the AI analysis engine analyzes actions and objects in the video and compares them with a database of illegal activities to assess potential risks.

[0414] The server uses an emotion analysis engine to analyze the user's emotional state from voice data and determine the level of stress and anxiety. The behavior analysis result and the emotional assessment are evaluated together to determine the risk level. If necessary, the server uses an automatic notification function to send an alert to an external organization. The evaluation result is also sent to the user's terminal, providing the user with feedback and warnings.

[0415] The software used includes TensorFlow and PyTorch, which enable the functionality of the AI analysis engine. Data processing and calculations are performed through these frameworks. For emotion analysis, custom algorithms are used to analyze facial expressions and voice tone.

[0416] For example, if a user is walking through a busy area and the emotion engine detects potential illegal activity from surrounding noise and behavior, and assesses the user's high stress level, the system will provide feedback such as, "There is a problem nearby. Please move to a safer location," and will contact the police if necessary.

[0417] An example of a prompt message is: "The user is walking through a busy area and feels uneasy about their surroundings. The emotion engine has assessed the stress level as high. Please provide a safety-conscious action plan for how to respond to this situation." This demonstrates how the system can generate questions based on the user's emotions and circumstances.
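Building such a prompt from the user's situation and the emotion engine's output can be sketched with a simple template. The `build_prompt` function and its parameters are hypothetical; the template wording follows the example prompt message above.

```python
def build_prompt(situation: str, feeling: str, stress_level: str) -> str:
    """Fill a prompt template with the user's situation and emotion estimate."""
    return (
        f"The user is {situation} and feels {feeling} about their surroundings. "
        f"The emotion engine has assessed the stress level as {stress_level}. "
        "Please provide a safety-conscious action plan for how to respond "
        "to this situation."
    )


prompt = build_prompt("walking through a busy area", "uneasy", "high")
print(prompt)
```

The resulting string would then be passed to the generative AI model; that call is not shown because the specification does not name a particular model interface.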

[0418] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0419] Step 1:

[0420] The user wears an image acquisition device. The device acquires video of the surroundings and the user's facial expressions, and simultaneously collects audio data. This input data is transmitted to a server via a communication device. The output here is video and audio data.

[0421] Step 2:

[0422] The server inputs the received video data into its AI analysis engine and begins analyzing behavior and objects. It then performs a process of comparing the data with an illegal activity database to determine whether or not illegal activity has occurred. The input here is video data, and the output is the analyzed behavior data and the evaluation results of illegal activity.

[0423] Step 3:

[0424] Simultaneously, the server inputs the voice data into an emotion analysis engine to evaluate the user's emotional state. This analysis focuses on determining stress and anxiety levels. The input is voice data, and the analyzed emotion data is the output.

[0425] Step 4:

[0426] The server integrates behavioral and emotional data to perform a risk assessment. Here, it determines the overall level of risk and decides on the necessary actions. Inputs are behavioral and emotional data, and output is the risk level assessment result.

[0427] Step 5:

[0428] The server generates and sends feedback to the user terminal based on the evaluation results. This feedback includes specific advice and warnings. The input is the risk assessment result, and the output is the content of the notification.

[0429] Step 6:

[0430] In some cases, the server uses an automated notification function to send an alert to the appropriate external organization. The input is the risk assessment result, and the output is the content of the notification.

[0431] Each step incorporates multiple evaluation processes and data analysis methods to enable rapid responses based on the user's safety and psychological state.
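The integration of behavioral and emotional data in Steps 4 to 6 can be illustrated as follows. This is a minimal sketch under stated assumptions: the score ranges, the weights (0.7 and 0.3), the thresholds, and the names `Assessment` and `assess_risk` are all hypothetical and do not appear in this disclosure, which does not specify how the two results are combined.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    risk_level: str        # "low", "medium", or "high"
    notify_user: bool      # Step 5: send feedback to the user terminal
    notify_external: bool  # Step 6: alert an external organization


def assess_risk(behavior_score: float, stress_score: float) -> Assessment:
    """Combine a behavior score and a stress score (both in [0, 1]).

    The weighted sum and the thresholds are illustrative assumptions only.
    """
    if not (0.0 <= behavior_score <= 1.0 and 0.0 <= stress_score <= 1.0):
        raise ValueError("scores must be in [0, 1]")
    combined = 0.7 * behavior_score + 0.3 * stress_score
    if combined >= 0.75:
        return Assessment("high", True, True)
    if combined >= 0.4:
        return Assessment("medium", True, False)
    return Assessment("low", False, False)


result = assess_risk(behavior_score=0.9, stress_score=0.8)
print(result.risk_level, result.notify_external)
```

In this sketch the behavior score dominates, so a stressed user in a harmless scene does not by itself trigger an external alert; a different weighting could equally be chosen.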

[0432] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0433] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0434] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0435] [Third Embodiment]

[0436] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0437] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0438] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0439] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0440] The microphone 238 accepts instructions from the user 20 by capturing the voice uttered by the user 20. The microphone 238 converts the captured voice into audio data and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0441] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0442] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0443] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0444] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0445] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0446] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0447] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0448] This invention provides a system that combines smart devices and AI analysis technology to enable users to safely promote legal compliance in their daily lives. Specific embodiments are described below.

[0449] First, the user attaches a dedicated image acquisition device. This device is equipped with high-performance cameras and sensors that can capture the surrounding environment in high resolution in real time.

[0450] The captured video is transmitted directly to the communication device via the terminal. This communication uses wireless technologies such as Bluetooth, ensuring fast and secure data transmission.

[0451] The transmitted data is received by a server in the cloud. The server integrates an AI analysis engine and begins processing the received video data immediately. Specifically, it detects the movement of objects and people in the video and compares it against a pre-configured database of illegal activities.

[0452] If the server performs an analysis and detects illegal activity or related risks, a risk assessment process is initiated. Based on the results of this assessment, feedback is immediately provided to the user. This feedback includes details of the relevant risk and recommended actions, and is communicated to the user through the application on their device.

[0453] Furthermore, depending on the risk assessment results, the server can activate an automatic notification function. This function transmits necessary information to external relevant organizations according to pre-configured criteria. The notification may include location information and video footage from the scene, enabling a rapid response.

[0454] Finally, the server saves the analysis results to a database. This saved data can be used for future analysis of local security and understanding trends, and can serve as basic data for policy decisions and security improvement measures.

[0455] As a concrete example, suppose a user is walking through the city at night and spots a suspicious group near a commercial facility. In this case, the image acquisition device automatically captures video footage, and the server analyzes their actions. Based on the analysis results, potential risks are fed back, allowing the user to choose safer actions. Furthermore, if a significant risk is detected, the server immediately notifies the police, contributing to local safety by promoting preventative measures.

[0456] The following describes the processing flow.

[0457] Step 1:

[0458] The user attaches the image acquisition device and starts capturing video. The device's sensors capture the surrounding environment and continuously generate video data.

[0459] Step 2:

[0460] The terminal (image acquisition device) transmits the captured video data to the smartphone using its built-in communication device. This communication is usually performed via Bluetooth connection.

[0461] Step 3:

[0462] The device (smartphone) uploads the video data received from the smart glasses to an analysis server. The upload is performed using an internet connection, and the data is protected by a secure protocol.

[0463] Step 4:

[0464] The server inputs the received video data into an AI analysis engine. Here, the scenes and movements within the video are analyzed. The AI uses a pre-trained model to identify actions that appear to be abnormal or illegal.

[0465] Step 5:

[0466] The server determines whether there is a possibility of illegal activity based on the analysis results. Risk assessment is performed by comparing the analysis results with a pre-configured database of illegal activities.

[0467] Step 6:

[0468] The server performs a risk assessment and generates a feedback message. The feedback includes specific risk information and recommended actions based on the analysis results.

[0469] Step 7:

[0470] The device (smartphone) receives feedback messages and notifies the user. This notification allows the user to understand the current risks and take safe actions.

[0471] Step 8:

[0472] When the server detects a high-risk event, it activates an automatic notification function. Based on the settings, a notification message is sent to the relevant external organization.

[0473] Step 9:

[0474] The server records the analysis and reporting details in a database. This data is used for subsequent analysis and as planning material for improving local security.
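The comparison against the pre-configured database of illegal activities in Steps 4 and 5 can be sketched as a lookup of detected labels. This is an illustrative sketch only: the `Detection` type, the labels, the severity values, the confidence threshold of 0.6, and the in-memory dictionary standing in for the database are all assumptions, since this disclosure does not specify the database format or the output of the pre-trained model.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    label: str         # action or object label produced by the analysis engine
    confidence: float  # model confidence in [0, 1]


# Hypothetical stand-in for the pre-configured illegal activity database:
# maps an activity label to a severity from 1 (minor) to 3 (serious).
ILLEGAL_ACTIVITY_DB = {"assault": 3, "vandalism": 2, "pickpocketing": 2}


def match_illegal_activities(detections, db=ILLEGAL_ACTIVITY_DB, min_confidence=0.6):
    """Return (label, severity) for confident detections found in the database."""
    return [
        (d.label, db[d.label])
        for d in detections
        if d.confidence >= min_confidence and d.label in db
    ]


detections = [
    Detection("walking", 0.95),    # not in the database, ignored
    Detection("vandalism", 0.82),  # in the database and confident
    Detection("assault", 0.40),    # below the confidence threshold, ignored
]
matches = match_illegal_activities(detections)
print(matches)
```

Filtering by confidence before the lookup is one way to limit false reports from uncertain detections; the threshold would need to be tuned for any real model.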

[0475] (Example 1)

[0476] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0477] In modern society, ensuring individual safety and maintaining public order requires quickly and accurately understanding the environment and encouraging appropriate actions. However, conventional monitoring technologies have limitations in real-time risk assessment and rapid feedback to users, making it difficult to implement effective countermeasures. Furthermore, there is a need for a comprehensive system that can detect specific risk behaviors and notify relevant organizations in a timely manner.

[0478] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0479] In this invention, the server includes means for analyzing actions and objects in visual information and comparing them with a pre-configured database of inappropriate actions; means for performing a risk assessment based on the analysis results and notifying the user terminal; and means for providing the user with recommended actions according to specific situations and supporting action selection. This enables real-time risk assessment and rapid feedback, making it possible to improve local security while enhancing user safety.

[0480] An "image acquisition device" is a device that captures surrounding visual information at high resolution and is portable for personal use.

[0481] A "communication device" is a device that uses wireless technology to transmit acquired visual information to an analysis device.

[0482] An "analysis device" is a computer system that uses advanced artificial intelligence technology to analyze received visual information and determine specific actions.

[0483] The "inappropriate behavior database" is a database that stores information on pre-defined inappropriate behaviors and serves as a standard data storage used to compare analysis results.

[0484] "Risk assessment" is the process of evaluating the degree of risk of a specific action or situation based on analyzed information, and then providing advice to the user based on the results.

[0485] A "user terminal" is a personal electronic device designed to receive feedback and warning information.

[0486] "External organizations" refer to organizations that require a rapid response depending on the specific situation, such as the police or related public institutions.

[0487] "Recommended actions" are suggestions that encourage users to choose safe and appropriate actions based on the analysis results.

[0488] "Support for action choices" refers to a function that assists users in making decisions by recommending safe actions based on the results of risk assessments.

[0489] This invention is a system that promotes safe behavior by capturing and analyzing visual information from the surroundings through an image acquisition device worn by the user. The image acquisition device is equipped with a high-performance camera and sensors, and can record visual information in real time with high accuracy even while the user is moving. The acquired visual information is quickly transmitted via a communication device to an analysis device located remotely using wireless technology such as Bluetooth.

[0490] The server functions as a cloud-based analysis device, receiving visual information and executing advanced algorithms using artificial intelligence technology to analyze the actions and objects contained within that visual information. The analysis results are compared against an inappropriate behavior database, and a risk assessment of each action is performed. The server notifies the user's terminal of the assessment results and provides real-time feedback through an application on the terminal.

[0491] As a concrete example, consider a scenario where a user is walking near a commercial facility at night and image acquisition equipment detects unusual movement within a crowd. In this case, the server quickly analyzes the situation and notifies the user of the potential risk. Simultaneously, if the server determines that there is a significant risk, it will automatically report to external organizations such as the police, as necessary, based on pre-configured settings.

[0492] An example of a prompt message is, "Generate recommended safe actions for a user to take when they spot a suspicious group near a commercial facility, based on image analysis." Based on this message, the generative AI model creates action suggestions appropriate to the risk, prompting the user to make a safe choice.

[0493] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0494] Step 1:

[0495] The user wears an image acquisition device to acquire visual information of their surroundings in real time. The input is visual data of the user's environment, and the output is high-resolution visual data. This data is ready to be transmitted directly to a communication device. Specifically, the camera and sensors work together to continuously capture frames and digitize this data.

[0496] Step 2:

[0497] The terminal transmits the acquired visual data to the server via a communication device. The input is the visual data generated in step 1, and the output is the completion of secure data transfer using wireless technology. Communication is mainly carried out using the Bluetooth protocol, and the data is compressed before being transferred to the server.

[0498] Step 3:

[0499] The server analyzes the visual data it receives. The input is compressed visual data sent from the terminal, and the output is behavioral information and risk assessment data as a result of the analysis. Here, an AI analysis engine is used to perform object recognition and behavioral analysis, and the process is carried out by comparing it with an inappropriate behavior database.

[0500] Step 4:

[0501] The server performs a risk assessment based on the analysis results. The input is the behavioral and risk information obtained from the analysis in step 3, and the output is the details of the assessed risk and recommended actions. Specifically, the process involves quantifying the degree of risk based on the analyzed data and incorporating action suggestions.

[0502] Step 5:

[0503] The device receives notifications from the server and provides feedback to the user. Input is a risk assessment and recommended action sent from the server, while output is a specific warning message and action suggestion to the user. Specifically, this includes actions such as the application on the device displaying notifications in real time and providing information in a way that is easily understandable to the user.

[0504] Step 6:

[0505] The server automatically notifies external organizations if a significant risk is identified. Inputs are the risk assessment results and location information obtained in step 4, and output is a report to the relevant authorities. Specific actions include organizing information that meets the notification criteria and sending data via automated email or API.

[0506] Step 7:

[0507] The server stores the analysis results and the evaluation information based on them in a database. The input is the data and evaluation information from steps 3 to 6, and the output is an update to the database that can be used for future analysis and security improvement measures. Here, the data is stored securely using a database management system so that it can be analyzed as needed.
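The storage operation in Step 7 can be sketched with a relational database. The example below uses Python's standard `sqlite3` module purely for illustration; the table name `assessments`, its columns, and the helper functions are hypothetical, and this disclosure does not name a particular database management system or schema.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) assessments table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS assessments (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               recorded_at TEXT NOT NULL,
               behavior TEXT NOT NULL,
               risk_score REAL NOT NULL,
               reported INTEGER NOT NULL
           )"""
    )
    return conn


def save_assessment(conn, recorded_at, behavior, risk_score, reported):
    """Store one analysis result together with its evaluation information."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO assessments (recorded_at, behavior, risk_score, reported) "
            "VALUES (?, ?, ?, ?)",
            (recorded_at, behavior, risk_score, int(reported)),
        )


def high_risk_count(conn, threshold: float) -> int:
    """Count stored assessments at or above a risk threshold."""
    row = conn.execute(
        "SELECT COUNT(*) FROM assessments WHERE risk_score >= ?", (threshold,)
    ).fetchone()
    return row[0]


conn = open_store()
save_assessment(conn, "2025-01-01T21:00:00", "loitering", 0.35, False)
save_assessment(conn, "2025-01-01T21:05:00", "fighting", 0.90, True)
print(high_risk_count(conn, 0.8))
```

Parameterized queries are used so that values derived from analysis results are never interpolated into SQL text. A query such as `high_risk_count` is one example of the later analysis that the stored data could support.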

[0508] (Application Example 1)

[0509] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0510] In modern society, it is difficult for individual users to accurately assess the safety of their current environment and take appropriate action, and in many cases, accidents and troubles occur due to a lack of caution. To improve this situation and enable users to live their daily lives with peace of mind, a system is needed that can assess environmental risks in real time and provide swift and concrete safety measures.

[0511] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0512] In this invention, the server includes means for identifying hazards surrounding the user and providing feedback that recommends safe actions, means for coordinating with external law enforcement organizations as needed, and means for generating detailed feedback in real time when the analysis server detects suspicious behavior and providing it immediately to the user terminal. This enables the user to continuously take safe and appropriate actions.

[0513] An "image acquisition device" is a device that captures surrounding visual information at high resolution and outputs it as digital data.

[0514] A "communication device" is a device used to transmit digital data to other devices or servers using wireless or wired technology.

[0515] An "analysis server" is a computer system installed to process received data and perform analysis based on specific conditions.

[0516] A "database of illegal activities" is a database that records information about actions and objects that violate laws and regulations that have been registered in advance.

[0517] "Risk assessment" is the process of determining potential risks related to the current situation based on the results of an analysis.

[0518] A "user terminal" is an electronic device through whose interface a user receives information, feedback, and recommendations.

[0519] "Automatic reporting" refers to a function where the system automatically reports the situation to external relevant organizations when certain criteria are met.

[0520] "External organizations" refer to groups such as the police and private security organizations that cooperate for the purpose of maintaining public order and ensuring safety.

[0521] "Feedback" refers to information provided based on analysis results to inform users about the situation they are facing and to guide them toward appropriate actions.

[0522] A "public safety organization" is a public or private group that works to ensure local security and compliance with the law.

[0523] In the system based on this invention, the user first utilizes an image acquisition device equipped with a high-performance camera and sensors. This image acquisition device collects surrounding visual information in real time and transmits this acquired data to an analysis server via a communication device. Wireless communication technologies such as Bluetooth and 4G / 5G are used as the communication technology.

[0524] The server compares the received video data with a pre-configured database of illegal activities and performs analysis using an AI analysis engine. A cloud-based solution such as Google Cloud AI could be used as the AI analysis engine. Based on the analysis results, the server assesses the risk and immediately notifies the user's device. The user's device is a mobile device such as a smartphone or tablet, and feedback is displayed on its screen to provide the user with recommendations for safe behavior.

[0525] Furthermore, the server uses an automated reporting function as needed to report the situation to external law enforcement organizations. This function mitigates potential risks faced by users and enables a rapid response. Users can take actions to avoid risks based on feedback from the system.

[0526] As a concrete example, if a user is walking at night using their smartphone, the smartphone's AI security assistant will detect suspicious activity in the vicinity, issue a warning to the user, and suggest a safe route. Furthermore, if the situation is deemed particularly dangerous, it will immediately notify the police. In this way, users can move around with peace of mind.

[0527] An example of a prompt for a generative AI model might be: "Please describe a risk detection algorithm for ensuring safety at night. In particular, please describe in detail how to efficiently recognize suspicious activity in the surroundings and how to design a system that provides feedback to the user."

[0528] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0529] Step 1:

[0530] The user uses an image acquisition device to acquire high-resolution video data of the surroundings in real time. During this process, the camera and sensors of the image acquisition device capture visual information, and this data is transmitted to the communication device in digital format. The input is real-time acquired visual data, and the output is digital video data.

[0531] Step 2:

[0532] The communication device transmits the acquired digital video data to the server. Wireless technologies such as Bluetooth and LTE are used for fast and secure data transfer. The input is the video data generated in step 1, and the output is the data delivered to the server.

[0533] Step 3:

[0534] The server receives video data and processes it using an AI analysis engine. This AI analysis engine uses a cloud-based solution such as Google Cloud AI to analyze actions and objects in the video and compare them against a database of illegal activities. The input is the video data sent to the server, and the output is the analyzed information on actions and objects.

[0535] Step 4:

[0536] The server assesses the risks of the current environment based on the analyzed data. Using a risk assessment algorithm, it performs a data evaluation process, obtaining the level of risk as output. The input is the analysis result from step 3.

[0537] Step 5:

[0538] The server notifies the user terminal of the assessed risk results. The user terminal displays warnings and feedback on safety actions on the screen of a smartphone or tablet. The input is the risk assessment results obtained in step 4, and the output is the notification content to the user.

[0539] Step 6:

[0540] The server automatically notifies external law enforcement organizations of risk information when the risk exceeds a certain threshold. The notification includes location information and details of the risk, prompting a swift external response. The input is the risk assessment result from step 4, and the output is the reported risk information.

[0541] Step 7:

[0542] The user takes safe actions based on the feedback. They follow the provided safe routes and guidelines, and take specific actions to avoid risks. The input is the feedback information from step 5, and the output is the user's actions.
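The threshold-based reporting in Step 6 can be sketched as follows. This is a hedged illustration, not the reporting interface of this disclosure: the threshold value of 0.8, the function name `build_report`, and the JSON field names are assumptions, since the source states only that a report is made when the risk exceeds a certain threshold and that it includes location information and details of the risk.

```python
import json
from typing import Optional

REPORT_THRESHOLD = 0.8  # assumed value; the source says only "a certain threshold"


def build_report(risk_score: float, details: str, lat: float, lon: float) -> Optional[str]:
    """Return a JSON report when the risk exceeds the threshold, else None."""
    if risk_score <= REPORT_THRESHOLD:
        return None
    return json.dumps(
        {
            "risk_score": risk_score,
            "details": details,
            "location": {"lat": lat, "lon": lon},
        },
        sort_keys=True,
    )


report = build_report(0.9, "suspicious group near a commercial facility", 35.0, 139.0)
print(report)
```

Returning `None` below the threshold keeps the decision to report separate from the transport used to deliver the report, which this sketch deliberately leaves out.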

[0543] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0544] This invention provides a system that incorporates an emotion engine in addition to an image acquisition device and an AI analysis engine, thereby providing feedback and notifications that take into account the user's emotional state. This system ensures user safety while simultaneously improving local public safety by considering psychological factors.

[0545] First, the user puts on an image acquisition device. This device not only captures images of the surroundings but also has the function of detecting the user's own facial expressions and voice. The device acquires the user's real-time facial expression data and sends it to the emotion engine.

[0546] The device (usually a smartphone) transmits the acquired video data and facial expression data to an analysis server. This communication is conducted using a secure protocol.

[0547] The server inputs the received data into an AI analysis engine and an emotion engine. The AI analysis engine works to detect illegal activities in the area from the video data. Meanwhile, the emotion engine analyzes the user's emotional state from their facial expressions and voice tone to determine their stress and anxiety levels.

[0548] Based on the analysis results, the server assesses the overall risk. Here, the user's emotional state is reflected in the risk assessment and incorporated into the final feedback. For example, if the user is in a high-stress state, the tone of the feedback may be changed, or additional warnings may be issued.

[0549] The device receives feedback from the server and notifies the user. This feedback includes not only potential illegal activity but also emotionally sensitive advice and recommendations. If necessary, the server automatically reports to external agencies. These reports include spatial and temporal information, as well as warnings based on emotional data.

[0550] Ultimately, the server stores the analysis results and emotional state data in a database. This data is used as a foundation for analyzing the psychological and safety trends of the community and is utilized in developing future improvement plans.

[0551] As a concrete example, suppose a user is walking through a noisy shopping street when an image acquisition device detects a violent incident occurring nearby. When the emotion engine senses the user's anxiety, the server determines that the situation is high-risk. It issues a warning to the device and simultaneously automatically notifies the police, prompting a swift response. In this way, the system can perform comprehensive risk management, including the user's psychological state.

[0552] The following describes the processing flow.

[0553] Step 1:

[0554] The user puts on the image acquisition device and activates the system. The device is equipped with facial expression detection capabilities and monitors the user's facial movements and voice tone in real time.

[0555] Step 2:

[0556] The terminal (smartphone) receives video data and user facial expression data transmitted from the image acquisition device. The data is transferred using a secure wireless protocol.

[0557] Step 3:

[0558] The terminal uploads video data and facial expression data to an analysis server. The server receives this data in a single process.

[0559] Step 4:

[0560] The server inputs video data into an AI analysis engine to determine whether the people or movements in the video constitute illegal activity. This determination is made by comparing the data with a database of illegal activities.

[0561] Step 5:

[0562] Simultaneously, the server inputs the user's facial expression data into the emotion engine. The emotion engine analyzes the user's emotional state (e.g., stress, anxiety, tension). This analysis combines facial expression recognition algorithms with voice analysis.

[0563] Step 6:

[0564] The server integrates the analysis results from the AI analysis engine and the emotion engine to perform a comprehensive risk assessment. If the user's emotional state is determined to be higher risk than usual, special considerations are added to the assessment results.

[0565] Step 7:

[0566] The server generates feedback that takes the user's emotions into consideration, based on the analysis results. This feedback includes recommendations for specific actions and advice on emotional support.

[0567] Step 8:

[0568] The device receives the generated feedback and immediately notifies the user. The notification includes detailed information about the situation the user is facing and recommended actions.

[0569] Step 9:

[0570] If necessary, the server will trigger an automatic reporting system. The report will include details about the potential illegal activity, as well as an analysis of the user's emotional state.

[0571] Step 10:

[0572] The server stores all analysis results and sentiment data in a database, which will be used for long-term analysis of local security and the psychological state of residents. This data will contribute to planning future security improvement measures.
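The integration described in Steps 6 to 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: the 0-to-1 score scale, the weight, the two thresholds, and every name (`BehaviorResult`, `EmotionResult`, `assess_risk`) are hypothetical and were introduced only for this example.

```python
from dataclasses import dataclass

# All numeric values are illustrative assumptions, not values from the disclosure.
EMOTION_WEIGHT = 0.3     # how much the user's emotional state raises the score
WARN_THRESHOLD = 0.5     # at or above this score, warn the user
REPORT_THRESHOLD = 0.8   # at or above this score, also trigger automatic reporting


@dataclass
class BehaviorResult:
    illegal_activity_score: float  # 0.0 to 1.0, output of the AI analysis engine


@dataclass
class EmotionResult:
    stress: float   # 0.0 to 1.0, output of the emotion engine
    anxiety: float  # 0.0 to 1.0, output of the emotion engine


@dataclass
class RiskAssessment:
    score: float
    warn_user: bool
    report_to_authorities: bool


def assess_risk(behavior: BehaviorResult, emotion: EmotionResult) -> RiskAssessment:
    """Combine both engine outputs into a single risk score (Step 6)."""
    # Use the stronger of the two emotional signals as the "emotional load".
    emotional_load = max(emotion.stress, emotion.anxiety)
    # The emotional load raises the behavior-based score; the result is capped at 1.0.
    score = min(1.0, behavior.illegal_activity_score + EMOTION_WEIGHT * emotional_load)
    return RiskAssessment(
        score=score,
        warn_user=score >= WARN_THRESHOLD,                # Steps 7 and 8
        report_to_authorities=score >= REPORT_THRESHOLD,  # Step 9
    )
```

With these assumed values, the same observed behavior (for example a score of 0.6) leads only to a warning for a calm user, but also to an automatic report when the emotion engine indicates high stress.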

[0573] (Example 2)

[0574] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0575] Ensuring social safety and maintaining individual psychological health are crucial challenges in modern society. Conventional security systems only detect surrounding dangers, making it difficult to respond appropriately while considering the user's emotional state. Furthermore, they lack sufficient real-time notifications and warnings to respond quickly to abnormal situations. There is a need to realize a safe and psychologically secure society that takes into account emotional states and environmental changes.

[0576] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0577] In this invention, the server includes means for capturing ambient data via image and audio acquisition equipment, means for transmitting the acquired data to an analysis device via a communication device, and means for the analysis device to analyze the video and audio data and compare it with a pre-configured database to evaluate behavior or emotions. This enables comprehensive risk management that takes into account emotional states while ensuring user safety.

[0578] "Image and audio acquisition equipment" refers to devices that capture surrounding video and audio, and acquire the user's visual and auditory information in real time.

[0579] A "communication device" is a device used to transmit acquired data to other devices or servers, and it ensures the security of the data using security protocols.

[0580] An "analysis device" is a device that analyzes behavior and emotions based on received data, and uses AI technology to process the data and make predictions and judgments.

[0581] A "database" is a collection of information that serves as a reference standard for video and audio analysis, and it forms the basis for comparing data with past data and specified information.

[0582] "Assessment of behavior or emotion" is the process by which an analytical device analyzes a person's behavioral patterns and emotional state based on data, and determines their stress level and risk.

[0583] An "information terminal" refers to a device used to notify users of analysis results and feedback, and is a portable communication device.

[0584] "Automatic reporting to external agencies" is a process that automatically reports information to the relevant authorities when an anomaly or danger is detected, in order to facilitate a rapid response.

[0585] To implement this invention, the user first wears an image and audio acquisition device. This device is capable of capturing surrounding video and the user's own voice in real time. Next, the device transmits the acquired data to an analysis device. At this time, communication technologies such as Bluetooth and Wi-Fi are used, and the security of the data is ensured by security protocols.

[0586] The server inputs the received data into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes the video data to detect abnormal behavior. Meanwhile, the emotion analysis engine analyzes the user's emotional state based on audio data and facial expression data to determine the level of stress and anxiety. The analysis device compares the data with a pre-configured database to assess the risk.

[0587] Based on the evaluation results, the device provides feedback to the user. This feedback may include advice and action guidelines tailored to the user's emotional state. Furthermore, if necessary, the server automatically notifies external organizations to prompt a swift response. This information includes location and time data from the analysis device.

[0588] For example, if a user is walking in a noisy area, the image and audio acquisition device detects danger, and the server determines that the user is experiencing high stress, the device displays instructions to the user such as "move to a safe place" or "take a deep breath." By utilizing this system, user safety can be ensured, and real-time risk management becomes possible.
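The selection of instructions in this example can be sketched as follows. The function name `build_instructions`, the 0-to-1 stress scale, and the threshold of 0.7 are assumptions made for illustration; only the two instruction texts come from the description above.

```python
from typing import List


def build_instructions(danger_detected: bool, stress_level: float,
                       high_stress: float = 0.7) -> List[str]:
    """Pick the on-device instructions from the two server-side findings.

    `high_stress` is an assumed cut-off on an assumed 0.0-1.0 stress scale.
    """
    instructions: List[str] = []
    if danger_detected:
        # Safety guidance comes first when a surrounding danger was detected.
        instructions.append("Move to a safe place.")
    if stress_level >= high_stress:
        # Emotional support is added when the user's stress is judged high.
        instructions.append("Take a deep breath.")
    return instructions
```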

[0589] Examples of prompts for a generative AI model:

[0590] "Please explain the specific processing steps for a local security system that takes user emotions into consideration. In particular, please focus on the data analysis and feedback provision processes."

[0591] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0592] Step 1:

[0593] The user wears image and audio acquisition equipment. This equipment captures surrounding video and the user's own voice in real time and saves it as digital data. The input is ambient environmental data and the user's voice data, and the output at this stage is digitized video and audio data. Specifically, the camera and microphone become constantly active and begin capturing data.

[0594] Step 2:

[0595] The terminal receives the acquired digital data and transmits it to the analysis device. The data is transmitted via Bluetooth or Wi-Fi, and its security is ensured by security protocols. The input is digitized video and audio data, and the output is secure data packets for transmission to the server. Specifically, the data is compressed and encrypted, and the packetized data is transmitted over the network.

[0596] Step 3:

[0597] The server inputs data received from the terminal into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes behavior and abnormal situations from video data, while the emotion analysis engine analyzes voice tone and facial expressions to determine the user's emotional state. The input is compressed data sent to the server, and the output is the behavior analysis results and the emotional state evaluation results. Specifically, the analysis engine uses an image recognition algorithm to detect suspicious activities and emotional patterns through data pattern matching.

[0598] Step 4:

[0599] The server performs a comprehensive risk assessment based on the analysis results. The user's emotional state is a key factor in this assessment. Inputs are behavioral analysis results and emotional state assessment results, while outputs are a risk score and recommended actions. Specifically, the server integrates the data and applies a risk algorithm to quantify the potential risk level.

[0600] Step 5:

[0601] The device receives risk assessment results from the server and provides feedback to the user. This feedback includes emotionally sensitive advice. Input is the risk score and recommended actions from the server, and output is notifications and recommended actions to the user. Specifically, the device displays alerts and draws attention through sound and vibration.

[0602] Step 6:

[0603] If necessary, the server will automatically notify external organizations. This notification will include the user's location and time information. The input is detailed data on situations deemed high-risk, and the output is the notification information sent to external organizations. Specifically, the notification protocol is activated, and a standardized message is sent to pre-configured emergency contacts.
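The compress, encrypt, and packetize sequence of Step 2 can be sketched as follows. This is only an illustration under assumptions: the 10-byte header layout, the payload size, and the names `packetize` and `reassemble` are hypothetical, and the cipher is deliberately left as a caller-supplied function because the description does not name one.

```python
import struct
import zlib
from typing import Callable, Dict, List

# Assumed header layout: sequence number, total packet count, payload length.
HEADER = struct.Struct("!IIH")


def packetize(data: bytes, encrypt: Callable[[bytes], bytes],
              max_payload: int = 1024) -> List[bytes]:
    """Compress, encrypt, then split the result into numbered packets."""
    body = encrypt(zlib.compress(data))
    chunks = [body[i:i + max_payload] for i in range(0, len(body), max_payload)] or [b""]
    total = len(chunks)
    return [HEADER.pack(seq, total, len(chunk)) + chunk
            for seq, chunk in enumerate(chunks)]


def reassemble(packets: List[bytes], decrypt: Callable[[bytes], bytes]) -> bytes:
    """Server side: reorder packets by sequence number, decrypt, decompress."""
    parts: Dict[int, bytes] = {}
    total = 0
    for packet in packets:
        seq, total, length = HEADER.unpack_from(packet)
        parts[seq] = packet[HEADER.size:HEADER.size + length]
    body = b"".join(parts[i] for i in range(total))
    return zlib.decompress(decrypt(body))
```

Because each packet carries its own sequence number, the server can restore the original byte stream even if packets arrive out of order. In practice `encrypt` and `decrypt` would wrap an authenticated cipher from a vetted cryptography library.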

[0604] (Application Example 2)

[0605] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0606] In modern society, deteriorating public safety and increasing psychological stress are serious problems. In public spaces, while prevention of illegal activities and swift response are required, it is also necessary to consider the psychological burden on individuals. However, conventional systems have difficulty accurately grasping the emotional state of users and conducting risk assessments accordingly, which has sometimes led to delays in response. This invention aims to solve these problems and realize both user safety and psychological consideration.

[0607] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0608] In this invention, the server includes means for capturing surrounding video via an image acquisition device, means equipped with an AI analysis engine for analyzing the acquired video and audio data, and means for evaluating the user's emotional state using an emotion analysis engine to determine risk. This enables early detection of illegal activities that also take into account the user's emotional state, and provides feedback to reduce psychological burden.

[0609] An "image acquisition device" is a device that captures surrounding video footage and collects visual information about the user and the environment.

[0610] A "communication device" is a device used to transmit acquired data to a server for analysis, and it has the function of connecting using a secure protocol.

[0611] An "analysis server" is a computer system that processes received video and audio data to perform behavioral analysis, sentiment analysis, and risk assessment.

[0612] An "AI analysis engine" is artificial intelligence that analyzes actions and objects from video data and detects illegal activities by comparing them with a pre-configured database of illegal activities.

[0613] An "emotion analysis engine" is a system that uses voice and facial expression data to analyze the user's emotional state and evaluate their stress and anxiety levels.

[0614] A "database of illegal activities" is a collection of information that records past cases and behavioral patterns based on laws, and is used to cross-reference with analysis results.

[0615] "Risk assessment" is the process of evaluating the degree of potential danger and psychological burden based on analysis results and emotional assessments.

[0616] A "user terminal" is an electronic device that notifies users of analysis results and feedback, and is used for users to receive information.

[0617] "Automatic notification" is a function that sends alerts to external organizations based on analysis results and risk assessments as needed.

[0618] A "database" is a means of storing information that accumulates the results of analysis and is used for analyzing and providing information on local public safety and psychological trends.

[0619] To implement this invention, the following system configuration is primarily required. The user first wears an image acquisition device, which acquires video footage of the surroundings and their own facial expression data. The server receives this acquired video and audio data and performs analysis using an AI analysis engine and an emotion analysis engine. Specifically, the AI analysis engine analyzes actions and objects in the video and compares them with a database of illegal activities to assess potential risks.

[0620] The server uses an emotion analysis engine to analyze the user's emotional state from voice data and determine the level of stress and anxiety. The behavior analysis result and the emotional assessment are evaluated together to determine the risk level. If necessary, the server uses an automatic notification function to send an alert to an external organization. The evaluation result is also sent to the user's terminal, providing the user with feedback and warnings.

[0621] The software used includes TensorFlow and PyTorch, which enable the functionality of the AI analysis engine. Data processing and calculations are performed through these frameworks. For sentiment analysis, custom algorithms are used to analyze facial expressions and voice tone.

[0622] For example, if a user is walking through a busy area, potential illegal activity is detected from surrounding noise and behavior, and the emotion engine assesses the user's stress level as high, the system will provide feedback such as, "There is a problem nearby. Please move to a safer location," and will contact the police if necessary.

[0623] An example of a prompt message is: "The user is walking through a busy area and feels uneasy about their surroundings. The emotion engine has assessed the stress level as high. Please provide a safety-conscious action plan for how to respond to this situation." This demonstrates how the system can generate questions based on the user's emotions and circumstances.
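How such a prompt might be assembled from the user's situation and the emotion engine's output can be sketched as follows. The function `build_prompt` and its parameters are hypothetical; the wording of the template is taken from the example prompt above.

```python
def build_prompt(situation: str, stress_level: str) -> str:
    """Fill the example prompt with the current situation and assessed stress level."""
    return (
        f"The user is {situation}. "
        f"The emotion engine has assessed the stress level as {stress_level}. "
        "Please provide a safety-conscious action plan for how to respond to this situation."
    )
```

Calling `build_prompt("walking through a busy area and feels uneasy about their surroundings", "high")` reproduces the example prompt quoted above.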

[0624] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0625] Step 1:

[0626] The user wears an image acquisition device. The device acquires video of the surroundings and the user's facial expressions, and simultaneously collects audio data. This input data is transmitted to a server via a communication device. The output here is video and audio data.

[0627] Step 2:

[0628] The server inputs the received video data into its AI analysis engine and begins analyzing behavior and objects. It then performs a process of comparing the data with an illegal activity database to determine whether or not illegal activity has occurred. The input here is video data, and the output is the analyzed behavior data and the evaluation results of illegal activity.

[0629] Step 3:

[0630] Simultaneously, the server inputs the voice data into an emotion analysis engine to evaluate the user's emotional state. This analysis focuses on determining stress and anxiety levels. The input is voice data, and the analyzed emotion data is the output.

[0631] Step 4:

[0632] The server integrates behavioral and emotional data to perform a risk assessment. Here, it determines the overall level of risk and decides on the necessary actions. Inputs are behavioral and emotional data, and output is the risk level assessment result.

[0633] Step 5:

[0634] The server generates and sends feedback to the user terminal based on the evaluation results. This feedback includes specific advice and warnings. The input is the risk assessment result, and the output is the content of the notification.

[0635] Step 6:

[0636] In some cases, the server uses an automated notification function to send an alert to the appropriate external organization. The input is the risk assessment result, and the output is the content of the notification.

[0637] Each step incorporates multiple evaluation processes and data analysis methods to enable rapid responses based on the user's safety and psychological state.
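Steps 2 and 3 are described as running simultaneously. One way to express that is sketched below with Python's standard `concurrent.futures` module. The two engines are passed in as plain callables because the description does not fix their interfaces; `analyze_concurrently` and its parameter names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Tuple


def analyze_concurrently(
    video: bytes,
    audio: bytes,
    behavior_engine: Callable[[bytes], Any],
    emotion_engine: Callable[[bytes], Any],
) -> Tuple[Any, Any]:
    """Run behavior analysis (Step 2) and emotion analysis (Step 3) in parallel."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        behavior_future = pool.submit(behavior_engine, video)
        emotion_future = pool.submit(emotion_engine, audio)
        # Step 4 needs both results, so wait for both before returning.
        return behavior_future.result(), emotion_future.result()
```

Threads are a reasonable fit when the engines spend their time in I/O or in native inference code that releases the interpreter lock; a process pool would be an alternative for pure-Python workloads.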

[0638] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0639] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0640] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0641] [Fourth Embodiment]

[0642] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0643] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0644] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0645] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0646] The microphone 238 receives instructions and other input from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0647] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0648] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0649] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0650] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0651] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0652] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0653] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0654] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0655] This invention provides a system that combines smart devices and AI analysis technology to enable users to safely promote legal compliance in their daily lives. Specific embodiments are described below.

[0656] First, the user attaches a dedicated image acquisition device. This device is equipped with high-performance cameras and sensors that can capture the surrounding environment in high resolution in real time.

[0657] The captured video is transmitted directly to the communication device via the terminal. This communication uses wireless technologies such as Bluetooth, ensuring fast and secure data transmission.

[0658] The transmitted data is received by a server in the cloud. The server integrates an AI analysis engine and begins processing the received video data immediately. Specifically, it detects the movement of objects and people in the video and compares it against a pre-configured database of illegal activities.

[0659] If the server performs an analysis and detects illegal activity or related risks, a risk assessment process is initiated. Based on the results of this assessment, feedback is immediately provided to the user. This feedback includes details of the relevant risk and recommended actions, and is communicated to the user through the application on their device.

[0660] Furthermore, depending on the risk assessment results, the server can activate an automatic notification function. This function transmits necessary information to external relevant organizations according to pre-configured criteria. The notification may include location information and video footage from the scene, enabling a rapid response.

[0661] Finally, the server saves the analysis results to a database. This saved data can be used for future analysis of local security and understanding trends, and can serve as basic data for policy decisions and security improvement measures.

[0662] As a concrete example, suppose a user is walking through the city at night and spots a suspicious group near a commercial facility. In this case, the image acquisition device automatically captures video footage, and the server analyzes their actions. Based on the analysis results, potential risks are fed back, allowing the user to choose safer actions. Furthermore, if a significant risk is detected, the server immediately notifies the police, contributing to local safety by promoting preventative measures.

[0663] The following describes the processing flow.

[0664] Step 1:

[0665] The user attaches the image acquisition device and starts capturing video. The device's sensors capture the surrounding environment and continuously generate video data.

[0666] Step 2:

[0667] The terminal (image acquisition device) transmits the captured video data to the smartphone using its built-in communication device. This communication is usually performed via Bluetooth connection.

[0668] Step 3:

[0669] The device (smartphone) uploads the video data received from the image acquisition device to an analysis server. The upload is performed using an internet connection, and the data is protected by a secure protocol.

[0670] Step 4:

[0671] The server inputs the received video data into an AI analysis engine. Here, the scenes and movements within the video are analyzed. The AI uses a pre-trained model to identify actions that appear to be abnormal or illegal.

[0672] Step 5:

[0673] The server determines whether there is a possibility of illegal activity based on the analysis results. Risk assessment is performed by comparing the analysis results with a pre-configured database of illegal activities.

[0674] Step 6:

[0675] The server performs a risk assessment and generates a feedback message. The feedback includes specific risk information and recommended actions based on the analysis results.

[0676] Step 7:

[0677] The device (smartphone) receives feedback messages and notifies the user. This notification allows the user to understand the current risks and take safe actions.

[0678] Step 8:

[0679] When the server detects a high-risk event, it activates an automatic notification function. Based on the settings, a notification message is sent to the relevant external organization.

[0680] Step 9:

[0681] The server records the analysis and reporting details in a database. This data is used for subsequent analysis and as planning material for improving local security.
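The comparison in Step 5 between the analysis results and a pre-configured database of illegal activities can be sketched as a simple lookup. The database contents, the action labels, the severity values, and the function name `match_against_database` are all hypothetical placeholders; the description does not specify how the database is structured.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical database: action label -> assumed severity on a 0.0-1.0 scale.
ILLEGAL_ACTIVITY_DB: Dict[str, float] = {
    "assault": 0.9,
    "pickpocketing": 0.7,
    "vandalism": 0.6,
}


def match_against_database(
    detected_actions: Iterable[str],
    database: Dict[str, float] = ILLEGAL_ACTIVITY_DB,
) -> Tuple[Optional[str], float]:
    """Return the most severe detected action found in the database, with its severity.

    Returns (None, 0.0) when none of the detected actions is registered.
    """
    matches = [(action, database[action]) for action in detected_actions
               if action in database]
    if not matches:
        return None, 0.0
    return max(matches, key=lambda match: match[1])
```

A real system would more likely compare feature vectors or model outputs than exact labels; the lookup only illustrates the "analyze, then compare against a registered set" structure of Steps 4 and 5.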

[0682] (Example 1)

[0683] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0684] In modern society, ensuring individual safety and maintaining public order requires quickly and accurately understanding the environment and encouraging appropriate actions. However, conventional monitoring technologies have limitations in real-time risk assessment and rapid feedback to users, making it difficult to implement effective countermeasures. Furthermore, there is a need for a comprehensive system that can detect specific risk behaviors and notify relevant organizations in a timely manner.

[0685] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0686] In this invention, the server includes means for analyzing actions and objects in visual information and comparing them with a pre-configured database of inappropriate actions; means for performing a risk assessment based on the analysis results and notifying the user terminal; and means for providing the user with recommended actions according to specific situations and supporting action selection. This enables real-time risk assessment and rapid feedback, making it possible to improve local security while enhancing user safety.

[0687] An "image acquisition device" is a device that captures surrounding visual information at high resolution and is portable for personal use.

[0688] A "communication device" is a device that uses wireless technology to transmit acquired visual information to an analysis device.

[0689] An "analysis device" is a computer system that uses advanced artificial intelligence technology to analyze received visual information and determine specific actions.

[0690] The "inappropriate behavior database" is a database that stores information on pre-defined inappropriate behaviors and serves as a standard data storage used to compare analysis results.

[0691] "Risk assessment" is the process of evaluating the degree of risk of a specific action or situation based on analyzed information, and then providing advice to the user based on the results.

[0692] A "user terminal" is a personal electronic device designed to receive feedback and warning information.

[0693] "External organizations" refer to organizations that require a rapid response depending on the specific situation, such as the police or related public institutions.

[0694] "Recommended actions" are suggestions that encourage users to choose safe and appropriate actions based on the analysis results.

[0695] "Support for action choices" refers to a function that assists users in making decisions by recommending safe actions based on the results of risk assessments.

[0696] This invention is a system that promotes safe behavior by capturing and analyzing visual information from the surroundings through an image acquisition device worn by the user. The image acquisition device is equipped with a high-performance camera and sensors, and can record visual information in real time with high accuracy even while the user is moving. The acquired visual information is quickly transmitted via a communication device to an analysis device located remotely using wireless technology such as Bluetooth.

[0697] The server functions as a cloud-based analysis device, receiving visual information and executing advanced algorithms using artificial intelligence technology to analyze the actions and objects contained within that visual information. The analysis results are compared against an inappropriate behavior database, and a risk assessment of each action is performed. The server notifies the user's terminal of the assessment results and provides real-time feedback through an application on the terminal.

[0698] As a concrete example, consider a scenario where a user is walking near a commercial facility at night and image acquisition equipment detects unusual movement within a crowd. In this case, the server quickly analyzes the situation and notifies the user of the potential risk. Simultaneously, if the server determines that there is a significant risk, it will automatically report to external organizations such as the police, as necessary, based on pre-configured settings.

[0699] An example of a prompt message is, "Generate recommended safe actions for a user to take when they spot a suspicious group near a commercial facility, based on image analysis." Based on this message, the generative AI model creates action suggestions appropriate to the risk, prompting the user to make a safe choice.

[0700] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0701] Step 1:

[0702] The user wears an image acquisition device to acquire visual information of their surroundings in real time. The input is visual data of the user's environment, and the output is high-resolution visual data. This data is ready to be transmitted directly to a communication device. Specifically, the camera and sensors work together to continuously capture frames and digitize this data.

[0703] Step 2:

[0704] The terminal transmits the acquired visual data to the server via a communication device. The input is the visual data generated in step 1, and the output is the completion of secure data transfer using wireless technology. Communication is mainly carried out using the Bluetooth protocol, and the data is compressed before being transferred to the server.

[0705] Step 3:

[0706] The server analyzes the visual data it receives. The input is compressed visual data sent from the terminal, and the output is behavioral information and risk assessment data as a result of the analysis. Here, an AI analysis engine is used to perform object recognition and behavioral analysis, and the process is carried out by comparing it with an inappropriate behavior database.

[0707] Step 4:

[0708] The server performs a risk assessment based on the analysis results. The input is the behavioral and risk information obtained from the analysis in step 3, and the output is the details of the assessed risk and recommended actions. Specifically, the process involves quantifying the degree of risk based on the analyzed data and incorporating action suggestions.

[0709] Step 5:

[0710] The device receives notifications from the server and provides feedback to the user. Input is a risk assessment and recommended action sent from the server, while output is a specific warning message and action suggestion to the user. Specifically, this includes actions such as the application on the device displaying notifications in real time and providing information in a way that is easily understandable to the user.

[0711] Step 6:

[0712] The server automatically notifies external organizations if a significant risk is identified. Inputs are the risk assessment results and location information obtained in step 4, and output is a report to the relevant authorities. Specific actions include organizing information that meets the notification criteria and sending data via automated email or API.

[0713] Step 7:

[0714] The server stores the analysis results and the evaluation information based on them in a database. The input is the data and evaluation information from steps 3 to 6, and the output is an updated database that can be used for future analysis and security improvement measures. Specifically, a database management system is used to store the data securely and to make it available for analysis as needed.
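As a non-limiting illustration, the quantification of risk and the attachment of an action suggestion in step 4 might be sketched as follows. The behavior labels, severity weights, thresholds, and message texts are hypothetical values chosen only for this sketch and are not specified in this disclosure; the dictionary merely stands in for the inappropriate behavior database.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical stand-in for the inappropriate behavior database:
# behavior label -> severity weight in [0, 1].
BEHAVIOR_DB = {"loitering_group": 0.4, "vandalism": 0.7, "assault": 1.0}

@dataclass
class Detection:
    label: str         # behavior label produced by the analysis in step 3
    confidence: float  # recognition confidence in [0, 1]

def assess_risk(detections: Iterable[Detection]) -> Tuple[float, str, str]:
    """Return (risk score, risk level, recommended action)."""
    # The score is the largest confidence-weighted severity among
    # detections that match an entry in the database (0.0 if none match).
    score = max(
        (d.confidence * BEHAVIOR_DB[d.label]
         for d in detections if d.label in BEHAVIOR_DB),
        default=0.0,
    )
    # Hypothetical thresholds mapping the score to a level and a suggestion.
    if score >= 0.7:
        return score, "high", "Leave the area and contact the authorities."
    if score >= 0.3:
        return score, "medium", "Stay alert and keep your distance."
    return score, "low", "No action needed."
```

For example, a single detection of "assault" with confidence 0.9 would be assessed as "high" under these assumed weights, while an empty detection list would be assessed as "low".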

[0715] (Application Example 1)

[0716] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0717] In modern society, it is difficult for individual users to accurately assess the safety of their current environment and take appropriate action, and in many cases, accidents and troubles occur due to a lack of caution. To improve this situation and enable users to live their daily lives with peace of mind, a system is needed that can assess environmental risks in real time and provide swift and concrete safety measures.

[0718] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0719] In this invention, the server includes means for providing feedback to the user to identify surrounding hazards and recommend safe actions, means for coordinating with external law enforcement organizations as needed, and means for generating feedback, including details, in real time when the analysis server detects suspicious behavior, and providing it immediately to the user terminal. This enables the user to continuously take safe and appropriate actions.

[0720] An "image acquisition device" is a device that captures surrounding visual information at high resolution and outputs it as digital data.

[0721] A "communication device" is a device used to transmit digital data to other devices or servers using wireless or wired technology.

[0722] An "analysis server" is a computer system installed to process received data and perform analysis based on specific conditions.

[0723] A "database of illegal activities" is a database that records information about actions and objects that violate laws and regulations that have been registered in advance.

[0724] "Risk assessment" is the process of determining potential risks related to the current situation based on the results of an analysis.

[0725] A "user terminal" is an electronic device used by a user to receive information and to receive feedback and recommendations through an interface.

[0726] "Automatic reporting" refers to a function where the system automatically reports the situation to external relevant organizations when certain criteria are met.

[0727] "External organizations" refer to groups such as the police and private security organizations that cooperate for the purpose of maintaining public order and ensuring safety.

[0728] "Feedback" refers to information provided based on analysis results to inform users about the situation they are facing and to guide them toward appropriate actions.

[0729] A "public safety organization" is a public or private group that works to ensure local security and compliance with the law.

[0730] In the system based on this invention, the user first utilizes an image acquisition device equipped with a high-performance camera and sensors. This image acquisition device collects surrounding visual information in real time and transmits this acquired data to an analysis server via a communication device. Wireless communication technologies such as Bluetooth and 4G / 5G are used as the communication technology.

[0731] The server compares the received video data with a pre-configured database of illegal activities and performs analysis using an AI analysis engine. A cloud-based solution such as Google Cloud AI could be used as the AI analysis engine. Based on the analysis results, the server assesses the risk and immediately notifies the user's device. The user's device is a mobile device such as a smartphone or tablet, and feedback is displayed on its screen to provide the user with recommendations for safe behavior.

[0732] Furthermore, the server uses an automated reporting function as needed to report the situation to external law enforcement organizations. This function mitigates potential risks faced by users and enables a rapid response. Users can take actions to avoid risks based on feedback from the system.

[0733] As a concrete example, if a user is walking at night using their smartphone, the smartphone's AI security assistant will detect suspicious activity in the vicinity, issue a warning to the user, and suggest a safe route. Furthermore, if the situation is deemed particularly dangerous, it will immediately notify the police. In this way, users can move around with peace of mind.

[0734] An example of a prompt for a generative AI model might be: "Please describe a risk detection algorithm for ensuring safety at night. In particular, please describe in detail how to efficiently recognize suspicious activity in the surroundings and how to design a system that provides feedback to the user."

[0735] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0736] Step 1:

[0737] The user uses an image acquisition device to acquire high-resolution video data of the surroundings in real time. During this process, the camera and sensors of the image acquisition device capture visual information, and this data is transmitted to the communication device in digital format. The input is real-time acquired visual data, and the output is digital video data.

[0738] Step 2:

[0739] The communication device transmits the acquired digital video data to the server. Wireless technologies such as Bluetooth and LTE are used for fast and secure data transfer. The input is the video data generated in step 1, and the output is the data delivered to the server.

[0740] Step 3:

[0741] The server receives video data and processes it using an AI analysis engine. This AI analysis engine uses a cloud-based solution such as Google Cloud AI to analyze actions and objects in the video and compare them against a database of illegal activities. The input is the video data sent to the server, and the output is the analyzed information on actions and objects.

[0742] Step 4:

[0743] The server assesses the risks of the current environment based on the analyzed data. Using a risk assessment algorithm, it performs a data evaluation process, obtaining the level of risk as output. The input is the analysis result from step 3.

[0744] Step 5:

[0745] The server notifies the user terminal of the assessed risk results. The user terminal displays warnings and feedback on safety actions on the screen of a smartphone or tablet. The input is the risk assessment results obtained in step 4, and the output is the notification content to the user.

[0746] Step 6:

[0747] The server automatically notifies external law enforcement organizations of risk information when the risk exceeds a certain threshold. The notification includes location information and details of the risk, prompting a swift external response. The input is the risk assessment result from step 4, and the output is the reported risk information.

[0748] Step 7:

[0749] The user takes safe actions based on the feedback. They follow the provided safe routes and guidelines, and take specific actions to avoid risks. The input is the feedback information from step 5, and the output is the user's actions.
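As a non-limiting illustration, the threshold check and report assembly of step 6 might be sketched as follows. The threshold value, the field names, and the JSON format are assumptions made for this sketch only; the actual transmission to an external organization is outside its scope.

```python
import json
from typing import Optional

REPORT_THRESHOLD = 0.8  # hypothetical threshold; not specified in this disclosure

def build_report(risk_level: float, latitude: float, longitude: float,
                 details: str) -> Optional[str]:
    """Return a JSON report when the risk exceeds the threshold, else None."""
    if risk_level <= REPORT_THRESHOLD:
        return None  # risk does not exceed the threshold; nothing is reported
    # The report carries the location information and the details of the risk.
    return json.dumps({
        "risk_level": risk_level,
        "location": {"lat": latitude, "lon": longitude},
        "details": details,
    })
```

Returning None below the threshold keeps the reporting decision in a single place, so the caller only transmits when a report is actually produced.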

[0750] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0751] This invention provides a system that incorporates an emotion engine in addition to an image acquisition device and an AI analysis engine, thereby providing feedback and notifications that take into account the user's emotional state. This system ensures user safety while simultaneously improving local public safety by considering psychological factors.

[0752] First, the user puts on an image acquisition device. This device not only captures images of the surroundings but also has the function of detecting the user's own facial expressions and voice. The device acquires the user's real-time facial expression data and sends it to the emotion engine.

[0753] The device (usually a smartphone) transmits the acquired video data and facial expression data to an analysis server. This communication is conducted using a secure protocol.

[0754] The server inputs the received data into an AI analysis engine and an emotion engine. The AI analysis engine works to detect illegal activities in the area from the video data. Meanwhile, the emotion engine analyzes the user's emotional state from their facial expressions and voice tone to determine their stress and anxiety levels.

[0755] Based on the analysis results, the server assesses the overall risk. Here, the user's emotional state is reflected in the risk assessment and incorporated into the final feedback. For example, if the user is in a high-stress state, the tone of the feedback may be changed, or additional warnings may be issued.

[0756] The device receives feedback from the server and notifies the user. This feedback includes not only potential illegal activity but also emotionally sensitive advice and recommendations. If necessary, the server automatically reports to external agencies. These reports include spatial and temporal information, as well as warnings based on emotional data.

[0757] Ultimately, the server stores the analysis results and emotional state data in a database. This data is used as a foundation for analyzing the psychological and safety trends of the community and is utilized in developing future improvement plans.
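As a non-limiting illustration, the storage of analysis results and emotional state data, and a simple aggregate over them, might be sketched as follows using SQLite. The table layout, column names, and the per-area average are assumptions made for this sketch; this disclosure does not specify a particular database management system or schema.

```python
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per assessed event.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "recorded_at TEXT, area TEXT, risk REAL, stress REAL)"
    )

def store_result(conn: sqlite3.Connection, recorded_at: str, area: str,
                 risk: float, stress: float) -> None:
    # Parameterized query; values are never interpolated into the SQL text.
    conn.execute("INSERT INTO results VALUES (?, ?, ?, ?)",
                 (recorded_at, area, risk, stress))
    conn.commit()

def area_trend(conn: sqlite3.Connection, area: str):
    """Return (average risk, average stress) for an area; (None, None) if empty."""
    return conn.execute(
        "SELECT AVG(risk), AVG(stress) FROM results WHERE area = ?", (area,)
    ).fetchone()
```

A usage example under these assumptions: after opening a connection with sqlite3.connect(":memory:") and calling init_db, each assessed event is recorded with store_result, and area_trend returns the averages used for trend analysis.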

[0758] As a concrete example, suppose a user is walking through a noisy shopping street when an image acquisition device detects a violent incident occurring nearby. When the emotion engine senses the user's anxiety, the server determines that the situation is high-risk. It issues a warning to the device and simultaneously automatically notifies the police, prompting a swift response. In this way, the system can perform comprehensive risk management, including the user's psychological state.

[0759] The following describes the processing flow.

[0760] Step 1:

[0761] The user puts on the image acquisition device and activates the system. The device is equipped with facial expression detection capabilities and monitors the user's facial movements and voice tone in real time.

[0762] Step 2:

[0763] The terminal (smartphone) receives video data and user facial expression data transmitted from the image acquisition device. The data is transferred using a secure wireless protocol.

[0764] Step 3:

[0765] The terminal uploads video data and facial expression data to an analysis server. The server receives this data in a single process.

[0766] Step 4:

[0767] The server inputs video data into an AI analysis engine to determine whether the people or movements in the video constitute illegal activity. This determination is made by comparing the data with a database of illegal activities.

[0768] Step 5:

[0769] Simultaneously, the server inputs the user's facial expression data into the emotion engine. The emotion engine analyzes the user's emotional state (e.g., stress, anxiety, tension). This analysis combines facial expression recognition algorithms with voice analysis.

[0770] Step 6:

[0771] The server integrates the analysis results from the AI analysis engine and the emotion engine to perform a comprehensive risk assessment. If the user's emotional state is determined to be higher risk than usual, special considerations are added to the assessment results.

[0772] Step 7:

[0773] The server generates feedback that takes the user's emotions into consideration, based on the analysis results. This feedback includes recommendations for specific actions and advice on emotional support.

[0774] Step 8:

[0775] The device receives the generated feedback and immediately notifies the user. The notification includes detailed information about the situation the user is facing and recommended actions.

[0776] Step 9:

[0777] If necessary, the server will trigger an automatic reporting system. The report will include details about the potential illegal activity, as well as an analysis of the user's emotional state.

[0778] Step 10:

[0779] The server stores all analysis results and sentiment data in a database, which will be used for long-term analysis of local security and the psychological state of residents. This data will contribute to planning future security improvement measures.
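As a non-limiting illustration, the integration in step 6 of the behavior analysis result with the estimated emotional state might be sketched as follows. The stress baseline, the adjustment factor of 0.2, and the tone labels are hypothetical and serve only to show one way in which an elevated emotional state could add special considerations to the assessment.

```python
def integrate(behavior_risk: float, stress: float,
              stress_baseline: float = 0.5) -> dict:
    """Combine a behavior risk in [0, 1] with a stress level in [0, 1]."""
    elevated = stress > stress_baseline
    risk = behavior_risk
    notes = []
    if elevated:
        # Hypothetical rule: raise the risk in proportion to the excess
        # stress, capped at 1.0, and attach an additional warning.
        risk = min(1.0, behavior_risk + 0.2 * (stress - stress_baseline))
        notes.append("User stress is above the usual level.")
    return {
        "risk": risk,
        "tone": "calming" if elevated else "neutral",  # tone of the feedback
        "notes": notes,
    }
```

Under these assumptions, a user at or below the baseline receives the behavior risk unchanged with neutral wording, while a highly stressed user receives a slightly higher risk, calming wording, and an additional note.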

[0780] (Example 2)

[0781] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0782] Ensuring social safety and maintaining individual psychological health are crucial challenges in modern society. Conventional security systems only detect surrounding dangers, making it difficult to respond appropriately while considering the user's emotional state. Furthermore, they lack sufficient real-time notifications and warnings to respond quickly to abnormal situations. There is a need to realize a safe and psychologically secure society that takes into account emotional states and environmental changes.

[0783] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0784] In this invention, the server includes means for capturing ambient data via image and sound acquisition equipment, means for transmitting the acquired data to an analysis device via a communication device, and means for the analysis device to analyze the video and audio data and compare it with a pre-configured database to evaluate behavior or emotions. This enables comprehensive risk management that takes into account emotional states while ensuring user safety.

[0785] "Image and audio acquisition equipment" refers to devices that capture surrounding video and audio, and acquire the user's visual and auditory information in real time.

[0786] A "communication device" is a device used to transmit acquired data to other devices or servers, and it ensures the security of the data using security protocols.

[0787] An "analysis device" is a device that analyzes behavior and emotions based on received data, and uses AI technology to process the data and make predictions and judgments.

[0788] A "database" is a collection of information that serves as a reference standard for video and audio analysis, and it forms the basis for comparing data with past data and specified information.

[0789] "Assessment of behavior or emotion" is the process by which an analytical device analyzes a person's behavioral patterns and emotional state based on data, and determines their stress level and risk.

[0790] An "information terminal" refers to a device used to notify users of analysis results and feedback, and is a portable communication device.

[0791] "Automatic reporting to external agencies" is a process that automatically reports information to the relevant authorities when an anomaly or danger is detected, in order to facilitate a rapid response.

[0792] To implement this invention, the user first wears an image and audio acquisition device. This device is capable of capturing surrounding video and the user's own voice in real time. Next, the device transmits the acquired data to an analysis device. At this time, communication technologies such as Bluetooth and Wi-Fi are used, and the security of the data is ensured by security protocols.

[0793] The server inputs the received data into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes the video data to detect abnormal behavior. Meanwhile, the emotion analysis engine analyzes the user's emotional state based on audio data and facial expression data to determine the level of stress and anxiety. The analysis device compares the data with a pre-configured database to assess the risk.

[0794] Based on the evaluation results, the device provides feedback to the user. This feedback may include advice and action guidelines tailored to the user's emotional state. Furthermore, if necessary, the server automatically notifies external organizations to prompt a swift response. This information includes location and time data from the analysis device.

[0795] For example, if a user is walking in a noisy area, the image and audio acquisition equipment detects danger, and the server determines that the user is experiencing high stress, the device will display instructions to the user such as "move to a safe place" or "take a deep breath." By utilizing this system, user safety can be ensured, and real-time risk management becomes possible.

[0796] Examples of prompts for a generative AI model:

[0797] "Please explain the specific processing steps for a local security system that takes user emotions into consideration. In particular, please focus on the data analysis and feedback provision processes."

[0798] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0799] Step 1:

[0800] The user wears image and audio acquisition equipment. This equipment captures surrounding video and the user's own voice in real time and saves it as digital data. The input is ambient environmental data and the user's voice data, and the output at this stage is digitized video and audio data. Specifically, the camera and microphone become constantly active and begin capturing data.

[0801] Step 2:

[0802] The terminal receives the acquired digital data and transmits it to the analysis device. The data is transmitted via Bluetooth or Wi-Fi, with security measures applied. The input is digitized video and audio data, and the output is secure data packets for transmission to the server. Specifically, the data is compressed and encrypted, and the packetized data is transmitted over the network.

[0803] Step 3:

[0804] The server inputs data received from the terminal into the AI analysis engine and the emotion analysis engine. The AI analysis engine analyzes behavior and abnormal situations from video data, while the emotion analysis engine analyzes voice tone and facial expressions to determine the user's emotional state. The input is compressed data sent to the server, and the output is the behavior analysis results and the emotional state evaluation results. Specifically, the analysis engine uses an image recognition algorithm to detect suspicious activities and emotional patterns through data pattern matching.

[0805] Step 4:

[0806] The server performs a comprehensive risk assessment based on the analysis results. The user's emotional state is a key factor in this assessment. Inputs are behavioral analysis results and emotional state assessment results, while outputs are a risk score and recommended actions. Specifically, the server integrates the data and applies a risk algorithm to quantify the potential risk level.

[0807] Step 5:

[0808] The device receives risk assessment results from the server and provides feedback to the user. This feedback includes emotionally sensitive advice. Input is the risk score and recommended actions from the server, and output is notifications and recommended actions to the user. Specifically, the device displays alerts and draws attention through sound and vibration.

[0809] Step 6:

[0810] If necessary, the server will automatically notify external organizations. This notification will include the user's location and time information. The input is detailed data on situations deemed high-risk, and the output is the notification information sent to external organizations. Specifically, the notification protocol is activated, and a standardized message is sent to pre-configured emergency contacts.
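As a non-limiting illustration, the compression and packetization described in step 2 might be sketched as follows. The packet header layout (sequence number and total packet count) and the chunk size are assumptions made for this sketch. Encryption is deliberately omitted here, because it would normally rely on a dedicated cryptographic library or a secure transport; the sketch covers only compression, packetization, and reassembly.

```python
import struct
import zlib
from typing import List

# Hypothetical header: sequence number and total packet count,
# each an unsigned 16-bit integer in network byte order.
HEADER = struct.Struct("!HH")

def packetize(payload: bytes, chunk_size: int = 1024) -> List[bytes]:
    """Compress the payload and split it into header-prefixed packets."""
    compressed = zlib.compress(payload)
    chunks = [compressed[i:i + chunk_size]
              for i in range(0, len(compressed), chunk_size)]
    return [HEADER.pack(seq, len(chunks)) + chunk
            for seq, chunk in enumerate(chunks)]

def reassemble(packets: List[bytes]) -> bytes:
    """Reorder packets by sequence number and restore the original payload."""
    ordered = sorted(packets, key=lambda p: HEADER.unpack(p[:HEADER.size])[0])
    return zlib.decompress(b"".join(p[HEADER.size:] for p in ordered))
```

Carrying the sequence number in each packet allows the server side to restore the payload even when packets arrive out of order.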

[0811] (Application Example 2)

[0812] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0813] In modern society, deteriorating public safety and increasing psychological stress are serious problems. In public spaces, while prevention of illegal activities and swift response are required, it is also necessary to consider the psychological burden on individuals. However, conventional systems have difficulty accurately grasping the emotional state of users and conducting risk assessments accordingly, which has sometimes led to delays in response. This invention aims to solve these problems and realize both user safety and psychological consideration.

[0814] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0815] In this invention, the server includes means for capturing surrounding video via an image acquisition device, means equipped with an AI analysis engine for analyzing the acquired video and audio data, and means for evaluating the user's emotional state using an emotion analysis engine to determine risk. This enables early detection of illegal activities that also take into account the user's emotional state, and provides feedback to reduce psychological burden.

[0816] An "image acquisition device" is a device that captures surrounding video footage and collects visual information about the user and the environment.

[0817] A "communication device" is a device used to transmit acquired data to a server for analysis, and it has the function of connecting using a secure protocol.

[0818] An "analysis server" is a computer system that processes received video and audio data to perform behavioral analysis, sentiment analysis, and risk assessment.

[0819] An "AI analysis engine" is artificial intelligence that analyzes actions and objects from video data and detects illegal activities by comparing them with a pre-configured database of illegal activities.

[0820] An "emotion analysis engine" is a system that uses voice and facial expression data to analyze the user's emotional state and evaluate their stress and anxiety levels.

[0821] A "database of illegal activities" is a collection of information that records past cases and behavioral patterns based on laws, and is used to cross-reference with analysis results.

[0822] "Risk assessment" is the process of evaluating the degree of potential danger and psychological burden based on analysis results and emotional assessments.

[0823] A "user terminal" is an electronic device that notifies users of analysis results and feedback, and is used for users to receive information.

[0824] "Automatic notification" is a function that sends alerts to external organizations based on analysis results and risk assessments as needed.

[0825] A "database" is a means of storing information that accumulates the results of analysis and is used for analyzing and providing information on local public safety and psychological trends.

[0826] To implement this invention, the following system configuration is primarily required. The user first wears an image acquisition device, which acquires video footage of the surroundings and their own facial expression data. The server receives this acquired video and audio data and performs analysis using an AI analysis engine and an emotion analysis engine. Specifically, the AI analysis engine analyzes actions and objects in the video and compares them with a database of illegal activities to assess potential risks.

[0827] The server uses an emotion analysis engine to analyze the user's emotional state from voice data and determine the level of stress and anxiety. The behavior analysis result and the emotional assessment are evaluated together to determine the risk level. If necessary, the server uses an automatic notification function to send an alert to an external organization. The evaluation result is also sent to the user's terminal, providing the user with feedback and warnings.

[0828] The software used includes TensorFlow and PyTorch, which enable the functionality of the AI analysis engine. Data processing and calculations are performed through these frameworks. For sentiment analysis, custom algorithms are used to analyze facial expressions and voice tone.

[0829] For example, if a user is walking through a busy area and the emotion engine detects potential illegal activity from surrounding noise and behavior and assesses the user's stress level as high, the system will provide feedback such as, "There is a problem nearby. Please move to a safer location," and will contact the police if necessary.

[0830] An example of a prompt message is: "The user is walking through a busy area and feels uneasy about their surroundings. The emotion engine has assessed the stress level as high. Please provide a safety-conscious action plan for how to respond to this situation." This demonstrates how the system can generate questions based on the user's emotions and circumstances.
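As a non-limiting illustration, the assembly of such a prompt message from the user's circumstances and the assessed stress level might be sketched as follows. The function name, the parameters, and the set of permitted stress levels are hypothetical; only the wording of the message follows the example above.

```python
ALLOWED_LEVELS = ("low", "medium", "high")  # hypothetical set of stress levels

def build_prompt(situation: str, stress_level: str) -> str:
    """Compose a prompt for the generative AI model from situation and stress."""
    if stress_level not in ALLOWED_LEVELS:
        raise ValueError(f"unknown stress level: {stress_level!r}")
    return (
        f"{situation} "
        f"The emotion engine has assessed the stress level as {stress_level}. "
        "Please provide a safety-conscious action plan for how to respond "
        "to this situation."
    )
```

For example, passing the situation "The user is walking through a busy area and feels uneasy about their surroundings." together with the level "high" reproduces the prompt message given above.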

[0831] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0832] Step 1:

[0833] The user wears an image acquisition device. The device acquires video of the surroundings and the user's facial expressions, and simultaneously collects audio data. This input data is transmitted to a server via a communication device. The output here is video and audio data.

[0834] Step 2:

[0835] The server inputs the received video data into its AI analysis engine and begins analyzing behavior and objects. It then performs a process of comparing the data with an illegal activity database to determine whether or not illegal activity has occurred. The input here is video data, and the output is the analyzed behavior data and the evaluation results of illegal activity.

[0836] Step 3:

[0837] Simultaneously, the server inputs the voice data into an emotion analysis engine to evaluate the user's emotional state. This analysis focuses on determining stress and anxiety levels. The input is voice data, and the analyzed emotion data is the output.

[0838] Step 4:

[0839] The server integrates behavioral and emotional data to perform a risk assessment. Here, it determines the overall level of risk and decides on the necessary actions. Inputs are behavioral and emotional data, and output is the risk level assessment result.

[0840] Step 5:

[0841] The server generates and sends feedback to the user terminal based on the evaluation results. This feedback includes specific advice and warnings. The input is the risk assessment result, and the output is the content of the notification.

[0842] Step 6:

[0843] In some cases, the server uses an automated notification function to send an alert to the appropriate external organization. The input is the risk assessment result, and the output is the content of the notification.

[0844] Each step incorporates multiple evaluation processes and data analysis methods to enable rapid responses based on the user's safety and psychological state.
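As a non-limiting illustration, the simultaneous execution of the behavior analysis in step 2 and the emotion analysis in step 3 might be sketched as follows. The two analysis functions are passed in as parameters because their implementations are not part of this sketch; the use of a thread pool is an assumption, and other forms of parallel execution are equally possible.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

def analyze_in_parallel(
    video: bytes,
    audio: bytes,
    behavior_fn: Callable[[bytes], float],  # stand-in for the AI analysis engine
    emotion_fn: Callable[[bytes], float],   # stand-in for the emotion analysis engine
) -> Dict[str, float]:
    """Run both analyses concurrently and collect their results for step 4."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        behavior_future = pool.submit(behavior_fn, video)
        emotion_future = pool.submit(emotion_fn, audio)
        # result() blocks until each analysis has finished.
        return {
            "behavior": behavior_future.result(),
            "emotion": emotion_future.result(),
        }
```

The returned dictionary corresponds to the behavioral and emotional data that serve as the input to the risk assessment in step 4.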

[0845] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0846] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0847] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0848] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0849] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0850] These emotions are distributed at the 3 o'clock position on the emotion map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0851] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion lies, the more visible it becomes (that is, the more it is expressed in actions).
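As an illustrative sketch only, the layout described for the emotion map 400 can be modelled by giving each emotion a polar position. The class name `MappedEmotion`, the angle convention (0 degrees at the 3 o'clock position, measured counterclockwise), the ring numbering, and the sample placements are assumptions made for this example and are not taken from Figure 9.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MappedEmotion:
    """Hypothetical record for one emotion placed on a concentric-circle map."""
    name: str
    ring: int          # 0 = innermost (most primitive); larger = closer to visible action
    angle_deg: float   # assumed convention: 0 = 3 o'clock, counterclockwise

    def position(self) -> Tuple[float, float]:
        """Cartesian position with +x to the right and +y upward."""
        radius = self.ring + 1
        theta = math.radians(self.angle_deg)
        return (radius * math.cos(theta), radius * math.sin(theta))

    def side(self) -> str:
        """Left half: brain reactions. Right half: situational judgment."""
        x, _ = self.position()
        if abs(x) < 1e-9:
            return "both"
        return "situational_judgment" if x > 0 else "brain_reaction"

    def valence(self) -> str:
        """Upper half is pleasure, lower half is displeasure."""
        _, y = self.position()
        if abs(y) < 1e-9:
            return "neutral"
        return "pleasure" if y > 0 else "displeasure"


def distance(a: MappedEmotion, b: MappedEmotion) -> float:
    """Euclidean distance on the map; small values mean 'likely to co-occur'."""
    ax, ay = a.position()
    bx, by = b.position()
    return math.hypot(ax - bx, ay - by)


# Placeholder placements, chosen only to exercise the functions.
calm = MappedEmotion("calm", ring=1, angle_deg=10.0)
reassured = MappedEmotion("reassured", ring=1, angle_deg=20.0)
primitive_negative = MappedEmotion("placeholder", ring=0, angle_deg=200.0)
```

With these placeholder values, `calm` and `reassured` fall in the right, upper part of the map and lie close together, while the third entry falls in the left, lower part near the center.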

[0852] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level: deviation from the ideal results in discomfort, and approach to the ideal results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
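The balance-based view of pleasure and discomfort described in this paragraph can be illustrated with a minimal sketch. The function name `balance_valence`, the normalisation of each balance to the range 0 to 1, the ideal values, and the equal weighting of the balances are all assumptions for illustration; the disclosure does not specify a formula.

```python
from typing import Dict


def balance_valence(readings: Dict[str, float], ideals: Dict[str, float]) -> float:
    """Return a score in [-1, 1].

    Positive values (near the ideal) stand for pleasure and negative values
    (far from the ideal) stand for discomfort. Both dictionaries map a balance
    name to a value normalised to [0, 1].
    """
    if not readings:
        raise ValueError("at least one balance reading is required")
    total_deviation = 0.0
    for name, value in readings.items():
        total_deviation += abs(value - ideals[name])
    mean_deviation = total_deviation / len(readings)   # lies in [0, 1]
    return 1.0 - 2.0 * mean_deviation                  # 0 deviation -> +1, full deviation -> -1


# Hypothetical robot balances: posture and battery level, ideal at 1.0.
ideals = {"posture": 1.0, "battery_level": 1.0}
upright_and_charged = balance_valence({"posture": 0.95, "battery_level": 0.9}, ideals)
tilted_and_low = balance_valence({"posture": 0.3, "battery_level": 0.1}, ideals)
```

Here the first reading is close to the ideal and yields a positive score, while the second deviates strongly and yields a negative one.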

[0853] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0854] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
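The inference step described above can be sketched as follows. This is not the disclosed network: the single dense layer, the feature vector, the placeholder weights, the rule of picking the largest value, and the `neighbour_penalty` term (one possible way to encourage similar values for emotions located close together) are all assumptions for illustration.

```python
import math
from typing import List

EMOTIONS = ["reassured", "calm", "confident", "anxious"]   # small subset for illustration
# Hypothetical index pairs of emotions assumed to sit close together on the map.
NEIGHBOURS = [(0, 1), (1, 2)]


def emotion_values(features: List[float], weights: List[List[float]],
                   biases: List[float]) -> List[float]:
    """Single dense layer with a sigmoid: one value in (0, 1) per emotion."""
    values = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, features)) + bias
        values.append(1.0 / (1.0 + math.exp(-z)))
    return values


def dominant_emotion(values: List[float]) -> str:
    """Assumed decision rule: pick the emotion with the largest value."""
    best = max(range(len(values)), key=lambda i: values[i])
    return EMOTIONS[best]


def neighbour_penalty(values: List[float]) -> float:
    """Extra loss term that is small when neighbouring emotions have similar values."""
    return sum((values[i] - values[j]) ** 2 for i, j in NEIGHBOURS)


# Placeholder parameters; a trained model would learn these from the training data.
weights = [[1.0, -1.0], [0.9, -1.0], [0.8, -0.9], [-1.0, 1.2]]
biases = [0.0, 0.0, 0.0, 0.0]
features = [0.8, 0.1]   # hypothetical features extracted from user input

values = emotion_values(features, weights, biases)
user_emotion = dominant_emotion(values)
penalty = neighbour_penalty(values)
```

With these placeholder parameters, the three neighbouring emotions receive similar values, so the penalty is small, and the most strongly activated emotion is taken as the user's emotion.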

[0855] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0856] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0857] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0858] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0859] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0860] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0861] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0862] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0863] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0864] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as doing so does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0865] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0866] The following is further disclosed regarding the embodiments described above.

[0867] (Claim 1)

[0868] Means for capturing surrounding images via an image acquisition device,

[0869] A means for transmitting acquired video data to an analysis server via a communication device,

[0870] The server includes means for analyzing actions and objects in the video and comparing them with a pre-configured database of illegal activities,

[0871] A means of performing a risk assessment based on the analysis results and notifying the user terminal of the assessment results,

[0872] A means of automatically notifying external organizations as needed,

[0873] A means of storing the analysis results in a database and providing information for improving local security,

[0874] A system including the above.

[0875] (Claim 2)

[0876] The system according to claim 1, further comprising means for generating real-time feedback, including details thereof, when an analysis server detects a suspected illegal activity, and immediately providing it to the user terminal.

[0877] (Claim 3)

[0878] The system according to claim 1, further comprising means for detecting specific disruptive behaviors within acquired video data and sending a warning to the user based on such behavior.
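As a rough sketch of how the server-side steps listed above could fit together (comparison with a database, risk assessment, user notification, external notification as needed, and storage of results), consider the following. Everything concrete here is hypothetical: the labels in `ILLEGAL_ACTIVITY_DB`, the risk weights, `REPORT_THRESHOLD`, and the class and function names. The video analysis itself is outside the sketch, so its output is represented as a plain list of detected labels.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical pre-configured database: activity label -> risk weight in [0, 1].
ILLEGAL_ACTIVITY_DB: Dict[str, float] = {"vandalism": 0.6, "assault": 0.95}
REPORT_THRESHOLD = 0.9   # assumed level at which an external organization is notified


@dataclass
class Assessment:
    matched: List[str]
    risk: float
    reported: bool


@dataclass
class AnalysisServer:
    history: List[Assessment] = field(default_factory=list)   # stands in for the results database

    def assess(self, detected_labels: List[str]) -> Assessment:
        """Compare detections with the database, score the risk, and store the result."""
        matched = [label for label in detected_labels if label in ILLEGAL_ACTIVITY_DB]
        risk = max((ILLEGAL_ACTIVITY_DB[label] for label in matched), default=0.0)
        result = Assessment(matched, risk, reported=risk >= REPORT_THRESHOLD)
        self.history.append(result)
        return result


def user_notification(assessment: Assessment) -> str:
    """Build the message sent to the user terminal."""
    if not assessment.matched:
        return "No matching activity detected."
    message = f"Possible {', '.join(assessment.matched)} nearby (risk {assessment.risk:.2f})."
    if assessment.reported:
        message += " An external organization has been notified."
    return message


server = AnalysisServer()
result = server.assess(["walking", "assault"])   # labels assumed to come from video analysis
message = user_notification(result)
```

Taking the maximum weight among matched labels is only one possible scoring rule; a deployed system would need a rule and thresholds chosen for its own context.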

[0879] "Example 1"

[0880] (Claim 1)

[0881] A means of capturing surrounding visual information via an image acquisition device,

[0882] A means for transmitting acquired visual information to an analysis device via a communication device,

[0883] The analysis device includes means for analyzing actions and objects within visual information and comparing them with a pre-configured database of inappropriate behaviors,

[0884] A means of performing a risk assessment based on the analysis results and notifying the assessment results to the user's terminal,

[0885] A means of automatically reporting to external organizations as needed,

[0886] A means of storing the analysis results in a database and providing information for improving local security,

[0887] A means of providing users with recommended actions based on specific situations and supporting their action choices,

[0888] A system including the above.

[0889] (Claim 2)

[0890] The system according to claim 1, further comprising means for generating real-time feedback, including details thereof, when the analysis device detects the possibility of inappropriate behavior, and immediately providing it to the user terminal.

[0891] (Claim 3)

[0892] The system according to claim 1, comprising means for detecting specific nuisance behaviors within acquired visual information and transmitting a warning to the user based on that.

[0893] "Application Example 1"

[0894] (Claim 1)

[0895] Means for capturing surrounding images via an image acquisition device,

[0896] A means for transmitting acquired video data to an analysis server via a communication device,

[0897] The server includes means for analyzing actions and objects in the video and comparing them with a pre-configured database of illegal activities,

[0898] A means of performing a risk assessment based on the analysis results and notifying the user terminal of the assessment results,

[0899] A means of automatically notifying external organizations as needed,

[0900] A means of storing the analysis results in a database and providing information for improving local security,

[0901] A means of identifying surrounding hazards for the user and providing feedback that recommends safe actions,

[0902] A means of cooperating with external security organizations as needed,

[0903] A system including the above.

[0904] (Claim 2)

[0905] The system according to claim 1, further comprising means for generating real-time feedback, including details thereof, when an analysis server detects suspicious behavior, and immediately providing it to the user terminal.

[0906] (Claim 3)

[0907] The system according to claim 1, further comprising means for detecting specific risks in acquired video data and transmitting warnings and information on safe routes to the user based on those risks.

[0908] "Example 2 combined with an emotion engine"

[0909] (Claim 1)

[0910] Means for capturing surrounding data via image and sound acquisition devices,

[0911] A means for transmitting the acquired data to an analysis device via a communication device,

[0912] The analysis device includes means for analyzing video and audio data and comparing it with a pre-configured database to evaluate behavior or emotion,

[0913] A means for performing a risk assessment based on the analysis results and the user's emotional state, and for notifying the assessment results to an information terminal,

[0914] A means of automatically notifying external organizations as needed,

[0915] A means of accumulating the analysis results and providing information for improving public safety,

[0916] A system including the above.

[0917] (Claim 2)

[0918] The system according to claim 1, further comprising means for generating real-time feedback, including details thereof, when the analysis device detects suspected illegal activity or a high-stress state, and immediately providing it to an information terminal.

[0919] (Claim 3)

[0920] The system according to claim 1, further comprising means for detecting specific behaviors or psychological states within acquired data and sending a warning to the user based thereon.
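One way to picture a risk assessment that takes both the analysis result and the user's emotional state into account is sketched below. The blending formula, the default `stress_weight`, and the two thresholds are assumptions made for this example; the disclosure does not state how the two inputs are combined.

```python
HIGH_STRESS = 0.7            # assumed threshold for a high-stress state
SUSPICION_THRESHOLD = 0.5    # assumed threshold for suspected illegal activity


def combined_risk(behaviour_risk: float, stress_level: float,
                  stress_weight: float = 0.3) -> float:
    """Raise a behaviour-based risk according to the user's stress level.

    Both inputs are expected in [0, 1]; the result also stays in [0, 1]
    because stress only fills part of the remaining headroom.
    """
    for name, value in (("behaviour_risk", behaviour_risk), ("stress_level", stress_level)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0, 1]")
    return behaviour_risk + stress_weight * stress_level * (1.0 - behaviour_risk)


def needs_feedback(behaviour_risk: float, stress_level: float) -> bool:
    """Trigger real-time feedback on suspected illegal activity or a high-stress state."""
    return behaviour_risk >= SUSPICION_THRESHOLD or stress_level >= HIGH_STRESS


calm_user = combined_risk(0.5, 0.0)       # stress adds nothing
stressed_user = combined_risk(0.5, 0.8)   # same scene, higher assessed risk
```

In this sketch the same observed behaviour is assessed as riskier for a stressed user, and feedback is triggered either by the behaviour or by the stress level alone.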

[0921] "Application Example 2 combined with an emotion engine"

[0922] (Claim 1)

[0923] Means for capturing surrounding images via an image acquisition device,

[0924] A means for transmitting acquired video and audio data to an analysis server via a communication device,

[0925] The server includes means for analyzing actions and objects in the video and comparing them with a pre-configured database of illegal activities,

[0926] A means for analyzing the user's emotional state using an emotion analysis engine and evaluating stress and anxiety levels,

[0927] A means for performing a risk assessment based on the analysis results and sentiment evaluation, and notifying the user terminal of the assessment results,

[0928] A means of automatically notifying external organizations as needed,

[0929] A means of storing the analysis results in a database and providing information on the improvement of local public safety and on psychological trends,

[0930] A system including the above.

[0931] (Claim 2)

[0932] The system according to claim 1, further comprising means for generating and immediately providing to the user terminal detailed and emotional feedback in real time when the analysis server detects suspected illegal activity and confirms the user is in a high-stress state.

[0933] (Claim 3)

[0934] The system according to claim 1, further comprising means for detecting specific disruptive behaviors within acquired video data and sending warnings and advice to the user according to the user's emotional state.

[Explanation of Symbols]

[0935]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. A system comprising: means for capturing surrounding video via an image acquisition device; means for transmitting the acquired video data to an analysis server via a communication device; means, in the server, for analyzing actions and objects in the video and comparing them with a pre-configured database of illegal activities; means for performing a risk assessment based on the analysis results and notifying the user terminal of the assessment results; means for automatically notifying external organizations as needed; means for storing the analysis results in a database and providing information for improving local security; means for identifying surrounding hazards for the user and providing feedback that recommends safe actions; and means for cooperating with external security organizations as needed.

2. The system according to claim 1, further comprising means for generating real-time feedback, including details thereof, when an analysis server detects suspicious behavior, and immediately providing it to the user terminal.

3. The system according to claim 1, further comprising means for detecting specific risks in acquired video data and transmitting warnings and information on safe routes to the user based on those risks.