An AI vision-based intelligent investigation system for security risks in smart cities
By integrating multi-source data and AI models, a comprehensive security data resource pool is constructed, which accurately identifies and dynamically manages urban security risks. This solves the problems of incomplete coverage, insufficient real-time performance, and poor adaptability of traditional security systems, and achieves efficient and intelligent risk management.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: 深圳柏成科技有限公司
- Filing Date: 2026-03-04
- Publication Date: 2026-06-30
AI Technical Summary
Traditional urban security systems suffer from incomplete data coverage, insufficient real-time performance, low level of intelligence, poor system linkage, and insufficient adaptability, making it difficult to achieve full coverage and accurate identification of urban security risks.
By integrating video surveillance, IoT sensors, geographic information, and meteorological data, a multi-dimensional security data resource pool is constructed. The spatiotemporal curvature attention model is used to identify abnormal behavior, and the physical verification module is used to filter out false judgments. Furthermore, the AI prediction and early warning model is used to dynamically predict the development trend of potential hazards, forming a closed-loop management system for potential hazards across the entire chain.
It achieves full coverage of urban security systems, accurately identifies abnormal behavior, reduces false alarm rates, ensures the timeliness and effectiveness of hazard handling, adapts to security needs in different scenarios, and provides more efficient and intelligent solutions.
Figure CN121766939B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of smart city security technology, specifically to an intelligent system for identifying potential security risks in smart cities based on AI vision.
Background Technology
[0002] With the acceleration of urbanization, the scale of cities is constantly expanding, and the density of population and facilities is increasing significantly. Urban security is facing increasingly complex challenges. Traditional security systems mainly rely on a single data source for manual or semi-automatic analysis, which has problems such as incomplete data coverage, insufficient real-time performance, and low level of intelligence. At the same time, urban security risks are characterized by spatiotemporal dynamics and multi-factor coupling, making it difficult for traditional technologies to achieve full coverage and accurate identification. In recent years, the rapid development of AI vision, IoT, and big data technologies has provided new solutions for smart city security. Through multi-source data fusion and intelligent analysis, real-time perception, anomaly identification, and trend prediction of urban security scenarios can be achieved, providing more efficient and intelligent technical means for urban security management.
[0003] Traditional urban security technologies suffer from the following shortcomings: First, data sources are limited, relying on video surveillance or manual patrols, making it difficult to comprehensively cover security scenarios across the entire city and easily overlooking hidden or sudden security risks. Second, analytical methods are outdated, often relying on post-event manual review or simple rule matching, failing to identify abnormal behavior in real time and predict the development trend of potential hazards, resulting in delayed response. Third, system interoperability is poor, with each module operating independently and lacking a closed-loop management mechanism, making it difficult to form a collaborative response capability. Finally, adaptability is insufficient; traditional model parameters are fixed and cannot dynamically adapt to the security needs of different scenarios, leading to high false alarm or false negative rates. These problems restrict the application effectiveness of traditional security technologies in smart cities and urgently require breakthroughs through technological innovation.
Summary of the Invention
[0004] The purpose of this invention is to overcome the shortcomings of existing technologies and provide an intelligent security hazard investigation system for smart cities based on AI vision. By integrating video surveillance, IoT sensors, geographic information and meteorological data, a multi-dimensional security data resource pool is constructed. The system uses a spatiotemporal curvature attention model to accurately identify abnormal behavior, combines a physical verification module to filter out false judgments, and relies on an AI prediction and early warning model to dynamically predict the development trend of hazards. Finally, through a response linkage module, closed-loop management of hazards is achieved, forming a full-chain solution of "perception-identification-early warning-handling".
[0005] To solve the above-mentioned technical problems, the present invention provides the following technical solution: an intelligent system for investigating security risks in smart cities based on AI vision, the system comprising: a data acquisition and fusion module, a spatiotemporal anomaly identification module, a physical verification module, an AI prediction and early warning module, and a response and linkage module;
[0006] The data acquisition and fusion module: collects video surveillance data, IoT sensor data, geographic information data, and meteorological data, and, through standardized conversion, an association mapping algorithm, and a dynamic entropy weight fusion model, constructs a multi-dimensionally correlated security data resource pool covering the entire city;
[0007] The spatiotemporal anomaly identification module: based on the data output by the data acquisition and fusion module, uses the spatiotemporal curvature attention model to process the video stream, extract spatiotemporal features, identify and mark abnormal behaviors and scenes in urban security scenarios, and generate an abnormal sample list;
[0008] The physical verification module: calls the multi-domain coupling verification rule set, combines it with real-time data from IoT sensors, uses a physical consistency verification algorithm to verify the abnormal samples output by the spatiotemporal anomaly identification module, filters out false judgment information, and outputs a list of credible anomalies;
[0009] The AI prediction and early warning module: based on the causal emergence prediction model, learns the semantics and correlations of multi-source data, predicts the development trend of potential hazards for the credible anomalies processed by the physical verification module, and outputs standardized early warning information;
[0010] The response and linkage module receives early warning information output by the AI prediction and early warning module, links with the city security command platform, executes hazard handling operations, records information throughout the entire handling process, and forms a closed-loop management of hazards.
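The five-module chain described in [0006] to [0010] can be pictured as a simple hand-off of data from one stage to the next. The following Python sketch is illustrative only: the function name `run_pipeline`, the stage parameter names, and the toy stand-in stages are assumptions made for this example and do not come from the patent.

```python
from typing import Any, Callable, Dict, List


def run_pipeline(
    raw_inputs: Dict[str, Any],
    fuse: Callable[[Dict[str, Any]], Any],
    identify: Callable[[Any], List[Any]],
    verify: Callable[[List[Any], Any], List[Any]],
    predict: Callable[[List[Any], Any], List[Any]],
    respond: Callable[[List[Any]], List[Any]],
) -> List[Any]:
    """Chain the five modules in the order the text describes."""
    pool = fuse(raw_inputs)              # data acquisition and fusion
    samples = identify(pool)             # spatiotemporal anomaly identification
    credible = verify(samples, pool)     # physical verification
    warnings = predict(credible, pool)   # AI prediction and early warning
    return respond(warnings)             # response and linkage


# Toy stand-ins (hypothetical), only to show what is handed from stage to stage.
handled = run_pipeline(
    {"video": ["frame-1"], "vibration_hz": 2.4},
    fuse=lambda raw: dict(raw),
    identify=lambda pool: ["bearing_vibration"] if pool["vibration_hz"] > 2.0 else [],
    verify=lambda samples, pool: [s for s in samples if "vibration_hz" in pool],
    predict=lambda credible, pool: [{"hazard": c, "level": "high"} for c in credible],
    respond=lambda warnings: [dict(w, status="dispatched") for w in warnings],
)
print(handled)
```

Passing each stage as a callable keeps the sketch independent of how any single module is actually realized.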
[0011] Furthermore, in the data acquisition and fusion module, the data acquisition methods and the acquired data are as follows: Video surveillance data: acquired through ultra-high-definition network cameras deployed at urban road intersections, square centers, building entrances and exits, and bridge sections; the acquired content includes the movement trajectories of pedestrians, vehicles, and moving objects, as well as image information of building exteriors, road markings, and facility layouts. Internet of Things sensor data: air quality sensors are deployed at the boundaries of industrial areas, along both sides of main traffic arteries, and in public areas of residential areas to collect PM2.5, PM10, sulfur dioxide, and nitrogen oxide concentrations; water level sensors are installed at river embankments, underground parking garage entrances, and low-lying water accumulation points to collect real-time water level data; vibration sensors are attached to bridge bearings, building load-bearing walls, and the outer walls of underground utility tunnels to collect vibration frequency and acceleration values. Geographic information data: obtained from the urban surveying database at a scale of 1:500, including coordinate information of road red lines, building outlines, and pipeline routes. Meteorological data: collected by connecting to the real-time database of the municipal meteorological observation station, including temperature, relative humidity, wind speed, wind direction, and precipitation data updated every 10 minutes, as well as the level, release time, and impact range of meteorological warning signals.
[0012] Furthermore, in the data acquisition and fusion module, the calculation formula for the dynamic entropy weight fusion model is as follows: $F(t) = \sum_{i=1}^{n} w_i(t)\, X_i(t)$, where $F(t)$ is the fused data vector at time t; $w_i(t)$ is the dynamic weight of the i-th class of security data at time t, determined from the information entropy $H_i(t)$ and the coefficient $\lambda$; $X_i(t)$ is the i-th class of raw security data collected at time t; $H_i(t)$ is the information entropy of the i-th class of data at time t; $\lambda$ is the entropy weight adjustment coefficient; and $n$ is the number of security data categories involved in the fusion.
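As a rough illustration of entropy-weighted fusion, the Python sketch below computes a normalized Shannon entropy for each data category, turns the entropies into weights, and forms the fused vector as a weighted sum of the category vectors. The weight form used here, proportional to exp(-lam * entropy), is an assumption chosen for the example and is not stated in the source; the function names and the requirement that all category vectors are pre-normalized and equally long are likewise hypothetical.

```python
import math
from typing import Dict, List


def shannon_entropy(values: List[float]) -> float:
    """Normalized Shannon entropy (0..1) of non-negative values treated as a distribution."""
    total = sum(values)
    if total <= 0 or len(values) < 2:
        return 0.0
    probs = [v / total for v in values if v > 0]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(len(values))


def dynamic_weights(entropies: List[float], lam: float = 1.0) -> List[float]:
    """ASSUMED weight form: proportional to exp(-lam * entropy), normalized to sum to 1."""
    scores = [math.exp(-lam * h) for h in entropies]
    total = sum(scores)
    return [s / total for s in scores]


def fuse(sources: Dict[str, List[float]], lam: float = 1.0) -> List[float]:
    """Weighted sum of equally sized, pre-normalized category vectors."""
    names = sorted(sources)
    weights = dynamic_weights([shannon_entropy(sources[n]) for n in names], lam)
    length = len(sources[names[0]])
    return [
        sum(w * sources[n][k] for w, n in zip(weights, names))
        for k in range(length)
    ]
```

Under this assumed form, a category whose values are more concentrated (lower entropy) receives a larger weight, and a larger `lam` sharpens that preference.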
[0013] Furthermore, in the spatiotemporal anomaly identification module, the spatiotemporal curvature attention model includes a spatial feature extraction unit, a temporal feature learning unit, a curvature calculation unit, and a weight allocation unit. The spatial feature extraction unit uses a convolutional neural network to extract the contours, positions, and dimensions of people, vehicles, and objects in video frames, generating a spatial feature map. The temporal feature learning unit uses a long short-term memory network to learn the speed, direction, and trajectory parameters of target motion in a continuous video frame sequence, constructing a temporal feature chain. The curvature calculation unit calculates the curvature value and rate of change of curvature of the target's motion trajectory based on the spatial feature map and the temporal feature chain. The weight allocation unit assigns attention weights to targets in different regions and time periods within the video frame according to the curvature value, rate of change of curvature, and preset scene parameters, with weight values ranging from 0 to 1.
[0014] Furthermore, in the spatiotemporal anomaly identification module, the spatial feature extraction unit of the spatiotemporal curvature attention model employs a convolutional neural network containing 5-8 convolutional layers. The first 3 layers are basic feature extraction layers, using 3×3 convolutional kernels with a stride of 1; the remaining 2-5 layers are advanced feature fusion layers, using 5×5 convolutional kernels with a stride of 2. Each convolutional layer is followed by batch normalization and a ReLU activation function, generating a spatial feature map with a resolution of 1/4 to 1/8 of the input video frame. The temporal feature learning unit of the spatiotemporal curvature attention model employs a long short-term memory network containing 2-4 hidden layers, each with 128-512 neurons. The input is a sequence of spatial feature maps from 16-32 consecutive video frames. A gating mechanism preserves the long-term dependencies of the target motion, and the output temporal feature chain includes parameters such as the target's velocity change, direction deflection angle, and trajectory offset distance. In the curvature calculation unit of the spatiotemporal curvature attention model, the curvature value $\kappa_t$ of the target motion trajectory is calculated from the target's position coordinates in three consecutive frames, $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$, where $x_1$, $x_2$, $x_3$ are the target's x-axis coordinates and $y_1$, $y_2$, $y_3$ are the target's y-axis coordinates in those three frames. The rate of change of curvature is calculated as $\dot{\kappa}_t = (\kappa_t - \kappa_{t-1}) / \Delta t$, where $\kappa_{t-1}$ is the trajectory curvature value of the target at frame t-1, $\kappa_t$ is the trajectory curvature value at frame t, and $\Delta t$ is the time interval between frame t and frame t-1. The attention weight allocation process of the weight allocation unit of the spatiotemporal curvature attention model is as follows: the curvature value is standardized to the range of 0-0.4, the curvature change rate is standardized to the range of 0-0.3, and the preset scene parameter takes values in the range of 0.2-0.5; the three are weighted and summed according to the weight ratio of 0.4:0.3:0.3 to obtain the initial weight value; the initial weight value is then min-max normalized so that the final attention weight value is mapped to the range of 0-1. The larger the curvature value, the higher the curvature change rate, and the larger the preset scene parameter value, the larger the corresponding attention weight value.
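To make the curvature and weighting steps concrete, here is a minimal Python sketch. It uses the Menger curvature of three points (the reciprocal of the radius of the circle through them) as a stand-in for the three-frame curvature value; that choice is an assumption of this example, not a formula taken from the source. The function names, the use of the absolute curvature change rate, and the scaling of each term by the largest value in the batch are also hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def menger_curvature(a: Point, b: Point, c: Point) -> float:
    """ASSUMED curvature form: 4 * triangle area / product of the three side lengths."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    ab = math.hypot(b[0] - a[0], b[1] - a[1])
    bc = math.hypot(c[0] - b[0], c[1] - b[1])
    ca = math.hypot(a[0] - c[0], a[1] - c[1])
    denom = ab * bc * ca
    if denom == 0:
        return 0.0  # coincident points: treat the trajectory as straight
    return 2.0 * abs(cross) / denom


def curvature_rate(k_now: float, k_prev: float, dt: float) -> float:
    """Finite-difference rate of change of curvature between two frames."""
    return (k_now - k_prev) / dt


def attention_weights(
    curvatures: List[float], rates: List[float], scene_params: List[float]
) -> List[float]:
    """Scale terms to 0-0.4 and 0-0.3, combine at 0.4:0.3:0.3, then min-max normalize."""
    k_max = max(curvatures) or 1.0
    r_max = max(abs(r) for r in rates) or 1.0
    initial = [
        0.4 * (0.4 * k / k_max) + 0.3 * (0.3 * abs(r) / r_max) + 0.3 * s
        for k, r, s in zip(curvatures, rates, scene_params)
    ]
    lo, hi = min(initial), max(initial)
    if hi == lo:
        return [1.0 for _ in initial]  # arbitrary choice when all targets tie
    return [(v - lo) / (hi - lo) for v in initial]
```

With this sketch, three points on a unit circle give a curvature close to 1, and collinear points give 0, which matches the intuition that sharp swerves should attract more attention than straight-line motion.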
[0015] Furthermore, the physical verification module includes a multi-domain coupled verification rule set comprising: structural safety verification rules, environmental correlation verification rules, and traffic operation verification rules. The structural safety verification rules cover building load-bearing limit thresholds, bridge vibration frequency ranges, and pipeline pressure tolerance ranges, and are associated with building material strength parameters and structural design parameters. The environmental correlation verification rules include the correspondence between temperature and humidity and road surface conditions, the matching standard between precipitation and water level rise rate, and the correlation table between wind force and outdoor facility stability coefficients, and are associated with meteorological parameters and geographical feature parameters. The traffic operation verification rules cover the range of vehicle braking distances under different road surface conditions, the correspondence between road curvature and safe vehicle speed, and the benchmark value of intersection traffic efficiency, and are associated with traffic flow parameters and road attribute parameters.
[0016] Furthermore, in the physical verification module, the physical consistency verification algorithm computes a physical verification deviation rate $\delta$ over the $m$ verification parameters, where $P_k^{AI}$ is the physical parameter identified by AI, $P_k^{phy}$ is the corresponding value calculated using a cross-scale physical model, $m$ is the number of verification parameters, $k$ is the index of the verification parameter, used to distinguish different types of physical verification parameters, and $\varepsilon$ is the confidence threshold. When $\delta < \varepsilon$, the abnormal sample is deemed credible, that is, the sample is determined to be a genuine anomaly.
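One way to picture the consistency check is as a comparison between AI-identified values and model-calculated values, parameter by parameter. The Python sketch below uses the mean relative deviation as the deviation rate; that specific form, the default threshold of 0.1, and the function and parameter names are assumptions made for illustration and are not given in the source.

```python
from typing import Dict


def deviation_rate(ai_values: Dict[str, float], model_values: Dict[str, float]) -> float:
    """ASSUMED form: mean relative deviation over the shared verification parameters."""
    keys = [k for k in ai_values if k in model_values]
    if not keys:
        raise ValueError("no shared verification parameters")
    total = 0.0
    for k in keys:
        ref = model_values[k]
        # Fall back to the absolute deviation when the reference value is zero.
        denom = abs(ref) if ref != 0 else 1.0
        total += abs(ai_values[k] - ref) / denom
    return total / len(keys)


def is_credible(
    ai_values: Dict[str, float],
    model_values: Dict[str, float],
    threshold: float = 0.1,  # hypothetical confidence threshold
) -> bool:
    """An abnormal sample is kept only if its deviation rate is below the threshold."""
    return deviation_rate(ai_values, model_values) < threshold
```

For example, an AI-estimated bearing vibration of 2.2 Hz against a modeled 2.0 Hz, with a matching braking distance, yields a mean relative deviation of about 0.05 and would pass a 0.1 threshold in this sketch.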
[0017] Furthermore, in the AI prediction and early warning module, the causal emergence prediction model learns the semantics and correlations of multi-source data and predicts the development trend of potential hazards through the following steps:
[0018] Data preprocessing: receive the credible anomaly data output from the physical verification module and combine it with the fused data $F(t)$ from the data acquisition and fusion module; extract semantic features from the data, including anomaly type semantics, spatial location semantics, and time series semantics; and construct a semantic feature vector library;
[0019] Causal relationship mining: the semantic feature vector library and the scene causal chain $C(t)$ at time t are input into a causal emergence network based on the Transformer architecture. A multi-head attention mechanism captures implicit correlations between the data and mines the causal relationship of "abnormal causes - evolutionary paths - potential consequences" to generate a causal relationship graph, in which the nodes contain abnormal events and related factors and the edge weights are the causal strengths;
[0020] Trend prediction calculation: weighted calculations are performed on the causal relationship graph using the risk amplification matrix $A(t)$ at time t to predict, for the future time interval $\tau$, the probability distribution of hazard development, the trend of the expanding impact range, and the possible derivative secondary risks, and a risk prediction vector $R(t+\tau)$ is output;
[0021] Result transformation output: the risk prediction vector $R(t+\tau)$ is converted into standardized early warning information, including the development level of potential hazards, key influencing factors, and suggestions for handling priorities, and then pushed to the response and linkage module.
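The trend prediction step can be illustrated with a toy causal graph whose edge weights are scaled by amplification coefficients. The Python sketch below does not implement the Transformer-based network; it takes the causal strengths as given inputs and reduces the risk amplification matrix to one coefficient per factor. Both simplifications, and all names and numbers used, are assumptions of this example.

```python
from typing import Dict, Tuple

# (factor, hazard) -> causal strength in [0, 1]; a hypothetical graph encoding.
CausalGraph = Dict[Tuple[str, str], float]


def risk_vector(graph: CausalGraph, amplification: Dict[str, float]) -> Dict[str, float]:
    """Sum amplified causal strengths per hazard; factors without a coefficient use 1.0."""
    risks: Dict[str, float] = {}
    for (factor, hazard), strength in graph.items():
        coef = amplification.get(factor, 1.0)
        risks[hazard] = risks.get(hazard, 0.0) + coef * strength
    return risks


graph: CausalGraph = {
    ("rainfall", "deck_water"): 0.6,
    ("traffic_flow", "deck_water"): 0.1,
    ("bearing_age", "bearing_vibration"): 0.5,
}
# Heavy rain amplifies every edge that starts at the rainfall factor.
risks = risk_vector(graph, {"rainfall": 1.5})
print(risks)
```

Here the rainfall edge contributes more to the water accumulation score than it would under neutral conditions, which is the effect the amplification step is meant to capture.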
[0022] Furthermore, in the AI prediction and early warning module, the calculation formula for the causal emergence prediction model is as follows: $R(t+\tau) = \mathcal{G}\big(F(t),\, C(t),\, A(t)\big)$, where $R(t+\tau)$ is the risk prediction vector after time $\tau$; $\mathcal{G}$ is the causal emergence network; $F(t)$ is the fused data obtained by fusing multi-source heterogeneous data at time t; $C(t)$ is the scene causal chain at time t; $A(t)$ is the risk amplification matrix at time t; and $\tau$ is the prediction time interval. Let the modulus of $R(t+\tau)$ be $\lVert R(t+\tau) \rVert$. When $\lVert R(t+\tau) \rVert < \theta_1$, the hazard is determined to be low risk, where $\theta_1$ is the risk level threshold determined by statistical analysis of historical security incident data; when $\theta_1 \le \lVert R(t+\tau) \rVert < \theta_2$, the hazard is determined to be medium risk, where $\theta_2$ is the critical threshold between medium and high risk; when $\lVert R(t+\tau) \rVert \ge \theta_2$, the hazard is determined to be high risk and the highest level of early warning response is triggered. The thresholds $\theta_1$ and $\theta_2$ support dynamic updates.
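The three-level decision on the modulus of the risk prediction vector can be sketched in a few lines of Python. The sketch assumes the modulus is the Euclidean norm and that the lower boundary of each band is inclusive; the function name, the parameter names `theta1` and `theta2`, and the example threshold values are hypothetical.

```python
import math
from typing import Sequence


def risk_level(prediction: Sequence[float], theta1: float, theta2: float) -> str:
    """Map the Euclidean norm of a risk prediction vector to low / medium / high."""
    if not 0 <= theta1 < theta2:
        raise ValueError("thresholds must satisfy 0 <= theta1 < theta2")
    norm = math.sqrt(sum(v * v for v in prediction))
    if norm < theta1:
        return "low"
    if norm < theta2:
        return "medium"
    return "high"  # would trigger the highest level of early warning response


# Thresholds are passed in on every call, so they can be updated dynamically.
print(risk_level([0.3, 0.4], theta1=0.6, theta2=0.9))
```

Keeping the thresholds as arguments rather than constants mirrors the statement that they are re-estimated from historical incident data over time.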
[0023] Furthermore, the standardized early warning information output by the AI prediction and early warning module includes GIS annotation information such as the name of the hazard, risk level, predicted occurrence time, and affected area, as well as risk descriptions after semantic parsing and conversion, and multi-department collaborative handling suggestions related to the emergency response plan knowledge base.
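A standardized early warning record of this kind could be represented as a small data structure that serializes to JSON for the command platform. The Python sketch below is illustrative: the class name `EarlyWarning`, its field names, the polygon encoding of the affected area, and the sample values are all assumptions and do not come from the source.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class EarlyWarning:
    hazard_name: str
    risk_level: str                            # "low", "medium", or "high"
    predicted_time: str                        # ISO 8601 timestamp
    affected_area: List[Tuple[float, float]]   # polygon vertices as (lon, lat)
    description: str                           # risk description after semantic parsing
    suggestions: List[str] = field(default_factory=list)  # multi-department handling advice

    def to_json(self) -> str:
        """Serialize to JSON, keeping non-ASCII text readable."""
        return json.dumps(asdict(self), ensure_ascii=False)


warning = EarlyWarning(
    hazard_name="bridge bearing vibration anomaly",
    risk_level="high",
    predicted_time="2026-07-01T08:00:00+08:00",
    affected_area=[(114.05, 22.54), (114.06, 22.54), (114.06, 22.55)],
    description="Vibration frequency outside the expected range.",
    suggestions=["dispatch bridge maintenance unit", "set up traffic warning signs"],
)
payload = warning.to_json()
```

Note that JSON has no tuple type, so the polygon vertices come back as two-element lists after a round trip.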
[0024] Compared with existing technologies, this AI vision-based intelligent system for identifying potential security risks in smart cities has the following advantages:
[0025] First, this invention integrates video surveillance, IoT sensors, geographic information, and meteorological data to construct a security data resource pool covering the entire city. This breaks through the limitations of traditional security systems that rely on a single data source. Based on the spatiotemporal curvature attention model, the system can deeply extract spatiotemporal features from video streams, accurately identify abnormal behaviors and scenes, and generate a highly reliable list of abnormal samples. This combination of multi-source data fusion and intelligent analysis not only improves the comprehensiveness of anomaly detection but also significantly reduces the false alarm rate, providing more reliable technical support for urban security.
[0026] Second, this invention filters out false positives through a physical verification module, combines an AI prediction and early warning model to predict trends in credible anomalies, outputs standardized early warning information, and links with the city security command platform to execute handling operations. This process ensures the timeliness and effectiveness of hazard handling, while recording information throughout the entire process to form closed-loop management. In addition, the causal emergence prediction model can dynamically update risk level thresholds to adapt to security needs in different scenarios, further improving the system's flexibility and adaptability. This full-chain, dynamic management approach provides a more efficient and intelligent solution for smart city security.
[0027] Other advantages, objectives and features of the invention will be set forth in part in the description which follows, and in part will be apparent to those skilled in the art from the following examination or study, or may be learned from the practice of the invention.
Description of the Drawings
[0028] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are merely some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without any creative effort.
[0029] Figure 1 is a flowchart of the core module workflow of the AI vision-based intelligent system for identifying security risks in smart cities.
[0030] Figure 2 is a data flow framework diagram of the AI vision-based intelligent system for identifying security risks in smart cities.
[0031] Figure 3 is a flowchart of the entire process of the AI vision-based intelligent system for identifying security risks in smart cities.
Detailed Implementation
[0032] To further illustrate the technical means and effects of the present invention in achieving its intended purpose, the following detailed description of the specific implementation methods, structures, features, and effects of the present invention, in conjunction with the accompanying drawings and preferred embodiments, is provided below.
[0033] Example 1:
[0034] Investigation of potential safety hazards in urban bridge structures.
[0035] The data acquisition and fusion module initiates multi-source data acquisition. In terms of video surveillance data, ultra-high-definition network cameras deployed at both ends of the bridge entrances and exits, the middle section of the bridge deck, and near the bridge supports continuously collect the movement trajectories of pedestrians and passing vehicles on the bridge deck, as well as image information of the bridge appearance, road markings on the bridge deck, and the layout of bridge support facilities.
[0036] During the data collection process of IoT sensors, air quality sensors are deployed on both sides of the main traffic arteries around the bridge to collect air quality data related to the surrounding area; water level sensors are installed on the embankment of the river under the bridge to collect the river water level in real time; and vibration sensors are attached to the load-bearing structures of the bridge bearings and the main body of the bridge to collect vibration data related to the bridge under the influence of vehicle traffic and the natural environment.
[0037] Geographic information data is obtained from the city's surveying and mapping database, which provides 1:500 scale GIS vector data of the bridge and its surrounding area. This data includes the bridge's outline coordinates, the road red line coordinates on the bridge deck, and the coordinates of the surrounding pipeline network. Meteorological data is obtained by connecting to the real-time database of the municipal meteorological observation station, which collects temperature, relative humidity, wind speed, wind direction, and precipitation data of the area where the bridge is located every 10 minutes. At the same time, the level, release time, and impact range of the meteorological warning signals in the area are also obtained.
[0038] After collecting the aforementioned data, the data acquisition and fusion module first standardizes and transforms the data, converting data of different formats and units into a standard format that the system can recognize and process. Then, it establishes relationships between the various data types through association mapping. Finally, it uses a dynamic entropy weight fusion model to perform fusion calculations on the processed multi-source data. The calculation formula for the dynamic entropy weight fusion model is as follows: $F(t) = \sum_{i=1}^{n} w_i(t)\, X_i(t)$, where $F(t)$ is the fused data vector at time t; $w_i(t)$ is the dynamic weight of the i-th class of security data at time t, determined from the information entropy $H_i(t)$ and the coefficient $\lambda$; $X_i(t)$ is the i-th class of raw security data collected at time t; $H_i(t)$ is the information entropy of the i-th class of data at time t; $\lambda$ is the entropy weight adjustment coefficient; and $n$ is the number of security data categories involved in the fusion. This constructs a security data resource pool that covers the bridge and its surrounding area and is correlated across multiple dimensions, as shown in Figure 1, providing data support for subsequent modules.
[0039] As shown in Figure 2, the spatiotemporal anomaly identification module retrieves bridge-related safety data from the resource pool output by the data acquisition and fusion module and carries out anomaly identification based on the spatiotemporal curvature attention model.
[0040] The spatial feature extraction unit of this model uses a convolutional neural network to process video frames in video surveillance data, extract parameters such as the appearance of bridge supports, the outline, position, and size of vehicles and pedestrians on the bridge deck, and generate a spatial feature map that reflects the spatial condition of the bridge.
[0041] The temporal feature learning unit uses a long short-term memory network to analyze continuous video frame sequences, learn the speed, direction, and trajectory change parameters of vehicles moving on the bridge, as well as the appearance change parameters of the bridge in different time periods, and construct a temporal feature chain that reflects the temporal change pattern of the bridge and surrounding targets.
[0042] The curvature calculation unit combines the generated spatial feature map with the temporal feature chain to calculate the curvature value and rate of change of curvature of the vehicle motion trajectory on the bridge deck. The curvature value $\kappa_t$ is calculated from the target's position coordinates in three consecutive frames, $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$, where $x_1$, $x_2$, $x_3$ are the target's x-axis coordinates and $y_1$, $y_2$, $y_3$ are the target's y-axis coordinates in those three frames. The rate of change of curvature is calculated as $\dot{\kappa}_t = (\kappa_t - \kappa_{t-1}) / \Delta t$, where $\kappa_{t-1}$ is the trajectory curvature value of the target at frame t-1, $\kappa_t$ is the trajectory curvature value at frame t, and $\Delta t$ is the time interval between frame t and frame t-1. The unit also calculates the curvature-related parameters corresponding to the morphological changes of key parts of the bridge (such as supports and load-bearing structures) in different time periods.
[0043] The weight allocation unit takes the curvature value, the curvature change rate, and the preset bridge safety scenario parameters (such as the safety importance coefficient of different parts of the bridge and the safety judgment criteria under different traffic flows) and performs a weighted summation according to a specific weight ratio to obtain the initial weight value. The initial weight value is then min-max normalized to map the final attention weight value to the 0-1 range. Higher attention weights are assigned to key areas of the bridge in the video frame (such as supports and load-bearing sections of the bridge deck) and key time periods (such as peak traffic hours and severe weather periods).
[0044] Through the above processing, the spatiotemporal anomaly recognition module can identify and mark abnormal behaviors and scenarios in bridge security scenarios, such as vehicles changing lanes abnormally or braking suddenly on the bridge deck, and abnormal displacement or cracks appearing in bridge supports, generating an anomaly sample list containing this abnormal information.
[0045] The physical verification module calls the bridge-related verification rules in the multi-domain coupled verification rule set, including structural safety verification rules, environmental correlation verification rules, and traffic operation verification rules.
[0046] The structural safety verification rules cover the bridge's load-bearing limit threshold, the vibration frequency range of bridge bearings and main load-bearing structures, and the pressure tolerance range of bridge-related pipelines (such as fire protection pipelines), while also relating to the strength parameters of bridge building materials and bridge structural design parameters. The environmental correlation verification rules include the correspondence between temperature and humidity and bridge deck conditions (such as whether it is icy or waterlogged), the matching standard between precipitation and the rate of rise in river level below the bridge, the correlation table between wind force and the stability coefficient of bridge outdoor facilities (such as bridge railings and signs), and the correlation between meteorological parameters and bridge and surrounding geographical feature parameters. The traffic operation verification rules cover the range of vehicle braking distance under different bridge deck conditions (dry, wet, icy), the correspondence between bridge deck curvature and safe vehicle speed, the benchmark value of traffic efficiency at intersections at both ends of the bridge, and the correlation between traffic flow parameters and bridge deck attribute parameters.
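As one concrete example of a traffic operation rule, the braking distance range under different deck conditions can be estimated with the idealized constant-friction relation d = v^2 / (2 * mu * g). In the Python sketch below, the friction coefficients for dry, wet, and icy surfaces are typical illustrative values, not figures from the source, and the function names and the 30% tolerance are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2

# ASSUMED illustrative friction coefficients, not values from the source.
FRICTION = {"dry": 0.7, "wet": 0.4, "icy": 0.1}


def braking_distance(speed_mps: float, surface: str) -> float:
    """Idealized stopping distance d = v^2 / (2 * mu * g) on a level deck."""
    mu = FRICTION[surface]
    return speed_mps ** 2 / (2 * mu * G)


def braking_is_plausible(
    observed_m: float, speed_mps: float, surface: str, tolerance: float = 0.3
) -> bool:
    """Check an observed braking distance against the rule's expected value."""
    expected = braking_distance(speed_mps, surface)
    return abs(observed_m - expected) <= tolerance * expected
```

A vehicle braking from 20 m/s on a dry deck would be expected to stop in roughly 29 m under these assumptions, so an observed distance of 30 m is consistent with a dry surface but not with an icy one.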
[0047] By combining real-time data on bridge vibration, water level, and air quality collected by IoT sensors, a physical consistency verification algorithm is used to verify the abnormal samples output by the spatiotemporal anomaly identification module. The algorithm computes a physical verification deviation rate $\delta$ over the $m$ verification parameters, where $P_k^{AI}$ is the physical parameter identified by AI, $P_k^{phy}$ is the corresponding value calculated using a cross-scale physical model, $m$ is the number of verification parameters, $k$ is the index of the verification parameter, used to distinguish different types of physical verification parameters, and $\varepsilon$ is the confidence threshold. The algorithm compares the physical parameters related to bridge anomalies identified by AI (such as bridge bearing vibration frequency and vehicle braking distance on the bridge deck) with the corresponding parameters calculated by the cross-scale physical model to obtain the physical verification deviation rate. If the physical verification deviation rate is less than the preset confidence threshold ($\delta < \varepsilon$), the anomaly sample is determined to be credible; if it is greater than or equal to the confidence threshold, the sample is determined to be misjudged information and filtered out. Finally, a list of credible anomalies is output.
[0048] The AI prediction and early warning module operates based on a causal emergence prediction model. The calculation formula for the causal emergence prediction model is as follows: $R(t+\tau) = \mathcal{G}\big(F(t),\, C(t),\, A(t)\big)$, where $R(t+\tau)$ is the risk prediction vector after time $\tau$; $\mathcal{G}$ is the causal emergence network; $F(t)$ is the fused data obtained by fusing multi-source heterogeneous data at time t; $C(t)$ is the scene causal chain at time t; $A(t)$ is the risk amplification matrix at time t; and $\tau$ is the prediction time interval. Let the modulus of $R(t+\tau)$ be $\lVert R(t+\tau) \rVert$. When $\lVert R(t+\tau) \rVert < \theta_1$, the hazard is determined to be low risk, where $\theta_1$ is the risk level threshold determined by statistical analysis of historical security incident data; when $\theta_1 \le \lVert R(t+\tau) \rVert < \theta_2$, the hazard is determined to be medium risk, where $\theta_2$ is the critical threshold between medium and high risk; when $\lVert R(t+\tau) \rVert \ge \theta_2$, the hazard is determined to be high risk and the highest level of early warning response is triggered. The thresholds $\theta_1$ and $\theta_2$ support dynamic updates. The system first performs data preprocessing: it receives the credible bridge anomaly data output from the physical verification module and combines it with the fused data at time t output from the data acquisition and fusion module. Semantic features are then extracted from the data, including bridge anomaly type semantics (such as bearing vibration anomaly and bridge water accumulation anomaly), anomaly spatial location semantics (such as specific bearing location coordinates and the range of the bridge water accumulation area), and anomaly time series semantics (such as anomaly start time, duration, and trend), and a semantic feature vector library is constructed.
[0049] In the causal relationship mining stage, a causal emergence network based on the Transformer architecture is input with a pre-constructed semantic feature vector library and the causal chain of the bridge scene at time t (such as "rainfall - bridge water accumulation - longer vehicle braking distance - increased risk of traffic accidents" and "bridge bearing aging - abnormal vibration frequency - aggravated structural safety hazards"). Through a multi-head attention mechanism, the implicit correlations between data are captured, and the causal relationship of "abnormal cause - evolution path - potential consequences" is mined to generate a causal relationship graph. The graph nodes contain abnormal bridge events (such as abnormal bearing vibration) and related factors (such as rainfall, bearing service life, and traffic flow). The edge weights are the causal strength between each factor and the abnormal event.
[0050] Entering the trend prediction calculation stage, the risk amplification matrix at time t (which considers the amplification coefficient of risk due to current weather conditions, traffic flow, and bridge service life) is used to perform weighted calculations on the causal relationship graph. This predicts the probability distribution of bridge hazard development in the next τ time period (e.g., the probability of abnormal bearing vibration intensification, the probability of traffic accidents caused by bridge water accumulation), the trend of influence range expansion (e.g., whether the bridge water accumulation area will expand, whether abnormal bearing vibration will affect the surrounding structure), and the potential for secondary risk generation (e.g., whether bridge water accumulation will lead to water accumulation on the road below the bridge, whether abnormal bearing will cause other load-bearing structure problems). The risk prediction vector is then output.
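By way of non-limiting illustration, the weighted calculation on the causal relationship graph can be pictured as a sum over graph edges. The sketch below is an assumption rather than the formula of the invention: it treats the risk amplification matrix as diagonal (one coefficient per factor), and the names `weighted_risk`, `activation`, and the sample factor and event names are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (related factor, abnormal event)


def weighted_risk(edges: Dict[Edge, float],
                  activation: Dict[str, float],
                  amplification: Dict[str, float],
                  events: List[str]) -> List[float]:
    """Return one risk score per event, in the order given by `events`.

    Each edge contributes causal strength * factor activation *
    amplification coefficient. Factors without an amplification entry
    default to 1.0 (no amplification); factors without an activation
    entry contribute nothing.
    """
    scores = {event: 0.0 for event in events}
    for (factor, event), strength in edges.items():
        if event in scores:
            scores[event] += (strength
                              * activation.get(factor, 0.0)
                              * amplification.get(factor, 1.0))
    return [scores[event] for event in events]


# Hypothetical bridge-scene graph: edge weights are causal strengths.
edges = {
    ("rainfall", "water_accumulation"): 0.8,
    ("bearing_aging", "abnormal_vibration"): 0.5,
    ("traffic_flow", "abnormal_vibration"): 0.25,
}
risk_vector = weighted_risk(
    edges,
    activation={"rainfall": 0.5, "bearing_aging": 1.0, "traffic_flow": 0.5},
    amplification={"rainfall": 2.0},  # e.g. heavy-rain weather condition
    events=["water_accumulation", "abnormal_vibration"],
)
```

The resulting list plays the role of the risk prediction vector whose modulus is later graded into low, medium, or high risk.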
[0051] Finally, the results are transformed and output: the risk prediction vector is converted into standardized early warning information, including the bridge hazard development level (low risk, medium risk, high risk), key influencing factors (such as rainfall, traffic flow, and bearing aging), and disposal priority suggestions (such as prioritizing the disposal of high-risk hazards and prioritizing the dispatch of bridge maintenance personnel and traffic control personnel), and this information is pushed to the response and linkage module.
[0052] After receiving standardized early warning information related to bridges from the AI prediction and early warning module, the response and linkage module immediately links with the city's security command platform.
[0053] If the early warning information indicates that the abnormal vibration of the bridge bearings is of high risk, the city's security command platform will dispatch the bridge maintenance unit to arrange professional maintenance personnel with relevant testing equipment to the bridge site for further testing and repair. At the same time, it will dispatch the traffic management department to set up traffic warning signs at both ends of the bridge and take traffic control measures such as flow restriction, speed limit or even temporary closure of the bridge as appropriate to guide vehicles and pedestrians to detour.
[0054] During the handling process, the response and linkage module records the entire process information in real time, including the handling unit, handling personnel, handling time, handling measures, handling progress, and handling results. Once the abnormal vibration problem of the bridge bearing is resolved and the bridge is confirmed to be back to a safe state after testing, the traffic control measures are lifted, and the relevant handling information is compiled and archived, forming a closed-loop management of bridge hazards to ensure that bridge safety hazards are handled in a timely and effective manner.
[0055] In summary, in the scenario of investigating potential safety hazards in urban bridge structures, the system integrates multi-source data through a data acquisition and fusion module and processes it using a dynamic entropy weight fusion model to build a comprehensive safety data resource pool, laying the foundation for subsequent work. The spatiotemporal anomaly identification module accurately identifies anomalies based on a spatiotemporal curvature attention model, the physical verification module filters out false positives using a physical consistency verification algorithm, the AI prediction and early warning module predicts trends based on a causal emergence prediction model, and the response and linkage module links with the platform to achieve closed-loop management. The entire process is progressive, efficiently investigating potential bridge hazards and ensuring bridge safety and urban traffic order.
[0056] Example 2:
[0057] Fire safety hazard investigation in urban residential areas.
[0058] The data acquisition and fusion module initiates multi-source data acquisition for urban residential areas. Video surveillance data is collected through ultra-high-definition network cameras deployed at entrances and exits of residential areas, main roads, building entrances and exits, parking lots, and fire lanes. The content includes the activity trajectories of pedestrians in residential areas, the driving and parking trajectories of vehicles, as well as image information of the exterior of residential buildings, fire-fighting facilities (such as fire extinguishers, fire hydrants, and emergency indicator lights) in building corridors, and image information on the unobstructed status of fire lanes.
[0059] In terms of IoT sensor data acquisition, air quality sensors are deployed in public areas of residential areas, around garbage stations, and near kitchen exhaust vents to collect PM2.5, PM10, sulfur dioxide, and nitrogen oxide concentrations, while also monitoring for leaks of combustible gases (such as natural gas and liquefied petroleum gas). Temperature sensors are installed in residential building corridors, power distribution rooms, cable wells, garbage stations, and other areas prone to fire hazards to collect ambient temperature in real time. Smoke sensors are deployed in residential building corridors, residents' homes (with residents' consent), power distribution rooms, cable wells, and other locations to collect smoke concentration. Geographic information data is obtained from the city's surveying and mapping database, using 1:500 scale GIS vector data of the residential area, including the outlines of residential buildings, road red lines, fire lane directions, fire hydrant locations, and coordinate information of the distribution of power distribution rooms and cable wells. Meteorological data is connected to the real-time database of the municipal meteorological observation station, collecting temperature, relative humidity, wind speed, wind direction, and precipitation data for the residential area every 10 minutes, as well as information on the level, release time, and impact range of meteorological warning signals (such as high temperature, drought, and strong wind warnings).
[0060] After data acquisition is complete, the data acquisition and fusion module standardizes and transforms the various types of data, unifying data formats and units. It establishes relationships between different data sources through association mapping (e.g., associating smoke sensor data with the corresponding video surveillance data). It then uses a dynamic entropy weight fusion model (calculation formula: [formula not reproduced]) to fuse the multi-source data and construct a fire safety data resource pool that covers the entire residential area and links multiple dimensions, as shown in Figure 3.
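By way of non-limiting illustration, an entropy-weighted fusion of several data classes may be sketched as follows. This follows the general entropy weight idea (a class whose recent readings have lower information entropy receives a larger weight) and is not the formula of the invention. The softmax-style use of `alpha` as a stand-in for the entropy weight adjustment coefficient, the sliding windows, and all function names are assumptions.

```python
import math
from typing import List, Sequence


def normalized_entropy(window: Sequence[float]) -> float:
    """Shannon entropy of a window of non-negative readings, scaled to 0..1.

    An empty, single-value, or all-zero window is treated as carrying
    no information (entropy 1.0).
    """
    total = sum(window)
    if len(window) < 2 or total <= 0:
        return 1.0
    h = -sum((v / total) * math.log(v / total) for v in window if v > 0)
    return max(0.0, h / math.log(len(window)))


def dynamic_weights(windows: Sequence[Sequence[float]],
                    alpha: float = 1.0) -> List[float]:
    """One weight per data class; lower entropy gives a larger weight."""
    raw = [math.exp(-alpha * normalized_entropy(w)) for w in windows]
    total = sum(raw)
    return [r / total for r in raw]


def fuse(current: Sequence[float],
         windows: Sequence[Sequence[float]],
         alpha: float = 1.0) -> float:
    """Weighted sum of the current standardized reading of each class."""
    weights = dynamic_weights(windows, alpha)
    return sum(w * x for w, x in zip(weights, current))


# Two hypothetical classes: a flat smoke-sensor window and a spiking one.
windows = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 4.0]]
weights = dynamic_weights(windows)
fused = fuse([0.2, 0.8], windows)
```

Recomputing the weights from the most recent window at each time t is what makes the weighting dynamic in this sketch.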
[0061] The spatiotemporal anomaly identification module calls upon data from the residential area fire safety data resource pool and uses a spatiotemporal curvature attention model for anomaly identification.
[0062] The spatial feature extraction unit uses a convolutional neural network to extract the outline, location, and size parameters of targets in the video frames, such as the appearance of fire-fighting facilities in residential building corridors (e.g., whether fire extinguishers are missing or fire hydrants are intact), whether fire lanes are blocked (e.g., whether there are parked vehicles or piles of debris), and whether there are illegal fire and electricity use behaviors in the residential area (e.g., open burning of debris or illegal parking and charging of electric vehicles). This generates a spatial feature map that clearly presents the spatial safety status of each area in the residential area.
[0063] The temporal feature learning unit uses a long short-term memory network to learn parameters such as the speed of pedestrian activities in residential areas (e.g., whether people are running quickly in the corridor, which may indicate an emergency), the speed and direction of vehicles (e.g., whether vehicles are illegally occupying fire lanes and parked for a long time), and the changing trends of smoke sensor data (e.g., whether the smoke concentration rises rapidly in a short period of time) in a continuous video frame sequence, and constructs a temporal feature chain to reflect the changes of each target over time.
[0064] The curvature calculation unit, based on the spatial feature maps and temporal feature chains, calculates the curvature values and curvature change rates of the trajectories of abnormal targets (such as illegally parked vehicles or rapidly moving people) within residential areas. It also analyzes the curvature-related parameters of the data curves showing changes in smoke concentration and temperature, in order to determine whether there are abnormal fluctuations in the data. The calculation formulas for the curvature value and for the rate of change of curvature are: [formulas not reproduced].
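By way of non-limiting illustration, a trajectory curvature computed from the target's position in three consecutive frames, and its rate of change between adjacent frames, may be sketched as follows. The Menger curvature (reciprocal of the radius of the circle through three points) and the finite-difference rate are swapped-in standard techniques, not necessarily the formulas of the invention; the function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position of the target in one frame


def menger_curvature(p0: Point, p1: Point, p2: Point) -> float:
    """Curvature of the circle through three consecutive positions.

    Returns 0.0 for collinear or coincident points (straight or
    stationary trajectory).
    """
    # Twice the signed triangle area, via the 2D cross product.
    cross = ((p1[0] - p0[0]) * (p2[1] - p0[1])
             - (p1[1] - p0[1]) * (p2[0] - p0[0]))
    sides = math.dist(p0, p1) * math.dist(p1, p2) * math.dist(p0, p2)
    if sides == 0.0:
        return 0.0
    return 2.0 * abs(cross) / sides


def curvature_rate(k_prev: float, k_curr: float, dt: float) -> float:
    """Finite-difference rate of change of curvature between two frames."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return (k_curr - k_prev) / dt


straight = menger_curvature((0.0, 0.0), (1.0, 0.0), (2.0, 0.0))
turning = menger_curvature((1.0, 0.0), (0.0, 1.0), (-1.0, 0.0))  # unit circle
rate = curvature_rate(straight, turning, dt=0.04)  # 25 frames per second
```

The same two functions can be applied to a sensor curve by treating (time, value) pairs as points, which is one way to read the analysis of smoke concentration and temperature curves.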
[0065] The weight allocation unit takes the calculated curvature value and curvature change rate together with the preset residential fire safety scenario parameters (such as the importance weight of fire lanes, the safety weight of key areas such as power distribution rooms, and the weight of peak periods of resident activity), and obtains the initial weight value by weighting and summing them according to the prescribed weight ratio. After min-max normalization, the attention weight value is mapped to the 0-1 range, so that higher attention weights are assigned to key areas such as fire lanes, power distribution rooms, and building entrances and exits, as well as to peak periods of resident fire and electricity use (such as cooking time).
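By way of non-limiting illustration, the weight allocation may be sketched with the numeric ranges stated in claim 5: curvature standardized to 0-0.4, curvature change rate to 0-0.3, preset scene parameter in 0.2-0.5, a 0.4:0.3:0.3 weighted sum, then min-max normalization to 0-1. How the standardization is performed is not stated, so the per-batch min-max scaling in `scale_to` is an assumption, as are the function names.

```python
from typing import List, Sequence


def scale_to(values: Sequence[float], upper: float) -> List[float]:
    """Min-max scale a batch of values into the range 0..upper.

    A batch with no spread maps to all zeros.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [upper * (v - lo) / (hi - lo) for v in values]


def attention_weights(curvatures: Sequence[float],
                      rates: Sequence[float],
                      scene_params: Sequence[float]) -> List[float]:
    """Attention weight in 0..1 for each target in a batch."""
    if not (len(curvatures) == len(rates) == len(scene_params)):
        raise ValueError("inputs must have the same length")
    if any(not 0.2 <= s <= 0.5 for s in scene_params):
        raise ValueError("scene parameters must lie in 0.2-0.5")
    k = scale_to(curvatures, 0.4)   # curvature standardized to 0-0.4
    r = scale_to(rates, 0.3)        # change rate standardized to 0-0.3
    initial = [0.4 * ki + 0.3 * ri + 0.3 * si
               for ki, ri, si in zip(k, r, scene_params)]
    return scale_to(initial, 1.0)   # final min-max normalization


# Three hypothetical targets; the last one is on a fire lane (0.5).
w = attention_weights([0.0, 0.5, 1.0], [0.0, 1.0, 2.0], [0.2, 0.3, 0.5])
```

In this sketch the target with the largest curvature, change rate, and scene parameter receives weight 1, consistent with the monotonic relationship stated in claim 5.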
[0066] Through the above processing, abnormal behaviors and scenarios in residential fire safety scenarios are identified and marked, such as fire lanes being blocked by vehicles, abnormally high smoke concentration in corridors, and residents using fire illegally in public areas, generating an abnormal sample list.
[0067] The physical verification module calls the rules related to fire protection in residential areas from the multi-domain coupled verification rule set, including structural safety verification rules, environmental correlation verification rules, and traffic operation verification rules (which mainly involve traffic and fire lanes within residential areas).
[0068] The structural safety verification rules cover the integrity standards of fire protection facilities (such as fire hydrants and fire extinguishers) in residential buildings, the load-bearing limits of building corridors and fire access routes, the fire resistance rating and temperature tolerance range of power distribution rooms and cable wells, and the fire performance parameters of building materials and the design standard parameters of fire protection facilities. The environmental correlation verification rules include the correspondence between temperature and humidity and the combustion risk of flammable items (such as garbage and wooden furniture) in residential areas, the matching standards between wind force and fire spread speed, the correlation table between smoke concentration and fire development stages (initial stage, development stage, and intense stage), and the correlation between meteorological parameters and the distribution characteristics of buildings and facilities in residential areas. The traffic operation verification rules cover the standard for the passage width of fire access routes in residential areas, the speed and turning radius requirements for fire trucks in residential areas, the benchmark values for the passage efficiency of entrances and exits and main roads in residential areas in emergency situations, and the correlation between traffic flow parameters and road and passage attribute parameters within residential areas.
[0069] By combining real-time data on residential area temperature, smoke concentration, and combustible gas concentration collected by IoT sensors, a physical consistency verification algorithm is used to verify the abnormal samples output by the spatiotemporal anomaly identification module. The calculation formula for the physical consistency verification algorithm is as follows: [formula not reproduced]; it yields the physical verification deviation rate, which is compared with the confidence threshold. To determine the credibility of an abnormal sample, the relevant physical parameters identified by the AI (such as the width of the fire lane being occupied, and the smoke concentration in the corridor) are compared with the corresponding parameters calculated by the cross-scale physical model (such as the standard passage width of the fire lane and the baseline value of the smoke concentration in the corridor under normal conditions), and the physical verification deviation rate is calculated. If the physical verification deviation rate is less than the confidence threshold, the abnormal sample is determined to be credible; otherwise, it is considered a misjudgment and filtered out. Finally, a list of credible anomalies is output, such as the width of the fire lane being occupied exceeding the safety standard, the smoke concentration in the corridor exceeding the normal range, and the detection of combustible gas leakage.
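By way of non-limiting illustration, the deviation-rate check may be sketched as follows. The text defines the deviation rate in terms of AI-identified parameters, model-calculated values, and the number of verification parameters, but the formula itself is not reproduced; the mean relative deviation used here is therefore an assumption, and the function names are hypothetical.

```python
from typing import Sequence


def deviation_rate(identified: Sequence[float],
                   modelled: Sequence[float]) -> float:
    """Mean relative deviation between AI-identified physical parameters
    and the values calculated by the physical model (assumed form)."""
    if len(identified) != len(modelled) or not identified:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for ai_value, model_value in zip(identified, modelled):
        if model_value == 0:
            raise ValueError("model value must be non-zero")
        total += abs(ai_value - model_value) / abs(model_value)
    return total / len(identified)


def is_credible(identified: Sequence[float],
                modelled: Sequence[float],
                threshold: float) -> bool:
    """A sample is credible when its deviation rate is below the threshold."""
    return deviation_rate(identified, modelled) < threshold


# Hypothetical sample: two verification parameters, one 10% off the model.
rate = deviation_rate([2.0, 110.0], [2.0, 100.0])
```

Samples for which `is_credible` returns False would be filtered out as misjudgments before the list of credible anomalies is output.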
[0070] The AI prediction and early warning module operates based on a causal emergence prediction model. The calculation formula for the causal emergence prediction model is: [Formula omitted for brevity]. The modulus of the risk prediction vector is compared with two risk level thresholds: when the modulus is below the first threshold, the hazard is determined to be low risk, where the first threshold is determined by statistical analysis of historical security incident data; when the modulus lies between the first and second thresholds, the hazard is determined to be medium risk, where the second threshold is the critical threshold between medium and high risk; when the modulus is above the second threshold, the hazard is determined to be high risk and the highest level of early warning response is triggered. Both thresholds support dynamic updates. The system first performs data preprocessing, receiving the credible fire safety anomaly data from the physical verification module and combining it with the fused data at time t from the data acquisition and fusion module. Semantic features are extracted from the data, including anomaly type semantics (such as fire lane occupancy, abnormal smoke concentration, and combustible gas leakage), anomaly spatial location semantics (such as the specific location of fire lane occupancy, the building number of the building with abnormal smoke concentration, and the approximate area of combustible gas leakage), and anomaly time series semantics (such as the time of anomaly occurrence, duration, and trend of changes in concentration or occupancy). A semantic feature vector library is then constructed.
[0071] In the causal relationship mining stage, the semantic feature vector library and the causal chains of the residential fire scene at time t (such as "combustible gas leak - encounter with open flame - explosion - fire", "fire lane obstruction - fire trucks cannot enter - fire rescue delay - increased losses", "increased smoke concentration - obstructed personnel visibility - difficult evacuation - increased risk of casualties") are input into a causal emergence network based on the Transformer architecture. Through a multi-head attention mechanism, implicit associations between data are captured, and causal relationships of "abnormal causes - evolution path - potential consequences" are mined to generate a causal relationship graph. The graph nodes include abnormal fire events in residential areas (such as combustible gas leak, fire lane obstruction) and related factors (such as residents' safety awareness, aging gas pipelines, property management, and weather conditions). The edge weights represent the causal strength between each related factor and the abnormal event.
[0072] Entering the trend prediction calculation stage, the risk amplification matrix at time t is combined with the causal relationship graph to perform weighted calculations, predicting the probability distribution of fire hazards in residential areas within the next τ time period (such as the probability of combustible gas leaks causing explosions, the probability of smoke spreading to other corridors), the expansion trend of the scope of influence (such as the building range that may be spread if a fire occurs, the number of households affected by smoke), and the possible secondary risks (such as fire causing building collapse, toxic smoke causing residents to be poisoned), and outputting a risk prediction vector.
[0073] Finally, the results are transformed and output, converting the risk prediction vector into standardized early warning information, including the development level of fire hazards in residential areas (low, medium, and high risk), key influencing factors (such as the amount of combustible gas leakage, the time fire lanes are occupied, and wind speed), and priority suggestions for handling (such as prioritizing the handling of hazards with high concentrations of combustible gas leakage, and prioritizing the dispatch of fire departments, gas companies, and residential property management), and pushing them to the response and linkage module.
[0074] After receiving standardized fire safety warning information for residential areas from the AI prediction and early warning module, the response and linkage module immediately links with the city's security command platform.
[0075] If an early warning message indicates that there is a high concentration of flammable gas leaking in a stairwell of a residential building, and it is determined to be a high-risk hazard, the city's security command platform will immediately dispatch the fire department to send fire trucks and personnel to the residential area; at the same time, it will dispatch the gas company to arrange professional maintenance personnel with detection and maintenance equipment to rush to the scene; and notify the property management department of the residential area to organize staff to evacuate residents in and around the stairwell who may be affected, and guide fire and gas maintenance vehicles to enter the residential area quickly at the entrance.
[0076] After firefighters arrive at the scene, they inspect the leak area and take emergency measures such as cutting off the gas supply and diluting the concentration of combustible gas. Gas maintenance personnel inspect the gas pipeline, locate the leak point, and carry out repairs. During the handling process, the response and linkage module records the entire handling process information in real time, including the arrival time of each handling unit, the handling measures taken, data changes during the handling process (such as the decrease in combustible gas concentration), the evacuation of residents, and the handling results.
[0077] Once the combustible gas leak is resolved and the area is confirmed to be safe through testing, residents are notified that they can return to their homes. The property management department then cleans up the site and compiles and archives the relevant information, forming a closed-loop management system for fire safety hazards in the residential area to protect the lives and property of residents.
[0078] In summary, in the scenario of investigating fire safety hazards in urban residential areas, the data acquisition and fusion module collects various types of fire-related data, processes it to build a fire safety data resource pool, the spatiotemporal anomaly identification module uses a spatiotemporal curvature attention model to capture fire anomalies, the physical verification module filters credible anomalies based on multi-domain coupling verification rules and physical consistency verification algorithms, the AI prediction and early warning module predicts the development of hazards through a causal emergence prediction model, and the response and linkage module coordinates with the platform for rapid response. All modules work together to accurately investigate fire hazards in residential areas and protect the lives and property of residents.
[0079] The above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Although the present invention has been disclosed above with reference to preferred embodiments, it is not intended to limit the present invention. Any person skilled in the art can make some modifications or alterations to the above-disclosed technical content to create equivalent embodiments without departing from the scope of the present invention. Any simple modifications, equivalent changes and alterations made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention shall still fall within the scope of the present invention.
Claims
1. A smart city security hazard investigation system based on AI vision, characterized in that, The system includes: a data acquisition and fusion module, a spatiotemporal anomaly identification module, a physical verification module, an AI prediction and early warning module, and a response and linkage module; The data acquisition and fusion module collects video surveillance data, IoT sensor data, geographic information data, and meteorological data. Through standardized conversion, correlation mapping algorithms, and dynamic entropy weight fusion models, it constructs a multi-dimensional, interconnected, and secure data resource pool covering the entire city. The spatiotemporal anomaly identification module: Based on the data output by the data acquisition and fusion module, it uses the spatiotemporal curvature attention model to process the video stream, extract spatiotemporal features, identify and mark abnormal behaviors and scenes in urban security scenarios, and generate an abnormal sample list; The physical verification module calls the multi-domain coupling verification rule set, combines it with real-time data from IoT sensors, and uses a physical consistency verification algorithm to verify the abnormal samples output by the spatiotemporal anomaly identification module, filters out false judgment information, and outputs a list of credible anomalies. The AI prediction and early warning module: Based on the causal emergence prediction model, it learns the semantics and correlations of multi-source data, predicts the development trend of potential hazards for credible anomalies processed by the physical verification module, and outputs standardized early warning information. 
The response and linkage module receives the early warning information output by the AI prediction and early warning module, links with the city security command platform, executes hazard handling operations, records information throughout the entire handling process, and forms a closed-loop management system for hazards. In the AI prediction and early warning module, the causal emergence prediction model learns the semantics and correlations of multi-source data and predicts the development trend of potential hazards through the following steps: Data preprocessing: receive the credible anomaly data output by the physical verification module, combine it with the fused data from the data acquisition and fusion module, extract semantic features from the data, including anomaly type semantics, spatial location semantics, and time series semantics, and construct a semantic feature vector library; Causal relationship mining: the semantic feature vector library and the scene causal chain at time t are input into the causal emergence network based on the Transformer architecture, a multi-head attention mechanism captures implicit correlations between data, and the causal relationship of "abnormal causes - evolutionary paths - potential consequences" is mined to generate a causal relationship graph, in which the graph nodes contain abnormal events and related factors and the edge weights are the causal strength; Trend prediction calculation: the risk amplification matrix at time t is combined with the causal relationship graph in a weighted calculation to predict, for the future time period τ, the probability distribution of hazard development, the trend of expansion of the scope of influence, and the possible derivative secondary risks, and a risk prediction vector is output; Result transformation output: the risk prediction vector is converted into standardized early warning information, including the development level of potential hazards, key influencing factors, and suggestions for handling priorities, which is then pushed to the response and linkage module.
2. The intelligent smart city security hazard investigation system based on AI vision according to claim 1, characterized in that, The data acquisition and fusion module collects data in the following ways: Video surveillance data: collected by ultra-high-definition network cameras deployed at urban road intersections, square centers, building entrances and exits, and bridge sections, the collected content including the movement trajectories of pedestrians, vehicles, and moving objects, as well as image information of building exteriors, road markings, and facility layouts; IoT sensor data: air quality sensors are deployed at industrial zone boundaries, along main roads, and in public areas of residential areas to collect PM2.5, PM10, sulfur dioxide, and nitrogen oxide concentrations; water level sensors are installed on river embankments and dams, at underground parking garage entrances, and on low-lying road sections prone to water accumulation to collect real-time water level data; vibration sensors are attached to bridge bearings, building load-bearing walls, and the outer walls of underground utility tunnels to collect vibration frequency and acceleration values; Geographic information data: obtained from the urban surveying and mapping database at a scale of 1:500, including coordinate information of road red lines, building outlines, and pipeline routes; Meteorological data: collected by connecting to the real-time database of the municipal meteorological observation station, including temperature, relative humidity, wind speed, wind direction, and precipitation data updated every 10 minutes, as well as information on the level, release time, and impact range of meteorological warning signals.
3. The intelligent smart city security hazard investigation system based on AI vision according to claim 1, characterized in that, In the data acquisition and fusion module, the calculation formula for the dynamic entropy weight fusion model is as follows: [formula not reproduced], in which the quantities involved are: the fused data vector at time t; the dynamic weight of a given class of security data at time t; the raw collected security data of that class at time t; the information entropy of that class of data at time t; the entropy weight adjustment coefficient; and the number of security data categories involved in the fusion.
4. The intelligent smart city security hazard investigation system based on AI vision according to claim 1, characterized in that, The spatiotemporal anomaly recognition module includes a spatiotemporal curvature attention model comprising a spatial feature extraction unit, a temporal feature learning unit, a curvature calculation unit, and a weight allocation unit. The spatial feature extraction unit uses a convolutional neural network to extract the contours, positions, and dimensions of people, vehicles, and objects in video frames, generating a spatial feature map. The temporal feature learning unit uses a long short-term memory network to learn the speed, direction, and trajectory parameters of target motion in a continuous video frame sequence, constructing a temporal feature chain. The curvature calculation unit calculates the curvature value and rate of change of curvature of the target's motion trajectory based on the spatial feature map and the temporal feature chain. The weight allocation unit assigns attention weights to targets in different regions and time periods within the video frames based on the curvature value, rate of change of curvature, and preset scene parameters, with weight values ranging from 0 to 1.
5. The intelligent smart city security hazard investigation system based on AI vision according to claim 4, characterized in that, In the spatiotemporal anomaly identification module, the spatial feature extraction unit of the spatiotemporal curvature attention model uses a convolutional neural network containing 5-8 convolutional layers. The first 3 layers are basic feature extraction layers, using 3×3 convolutional kernels with a stride of 1; the last 2-5 layers are advanced feature fusion layers, using 5×5 convolutional kernels with a stride of 2. Each convolutional layer is followed by batch normalization and a ReLU activation function, generating a spatial feature map with a resolution of 1/4 to 1/8 of the input video frame. The temporal feature learning unit of the spatiotemporal curvature attention model uses a long short-term memory network containing 2-4 hidden layers, with 128-512 neurons in each hidden layer. The input is a sequence of spatial feature maps from 16-32 consecutive video frames. A gating mechanism preserves the long-term dependencies of the target motion, and the output temporal feature chain includes parameters such as the target motion's velocity change, direction deflection angle, and trajectory offset distance. In the curvature calculation unit of the spatiotemporal curvature attention model, the curvature value of the target motion trajectory is calculated from the position coordinates of the target in three consecutive frames, namely the x-axis coordinates and the y-axis coordinates of the target in each of the three frames (calculation formula: [formula not reproduced]); the rate of change of curvature is calculated from the trajectory curvature values of the target at frame t-1 and at frame t and the time interval between frame t and frame t-1 (calculation formula: [formula not reproduced]); the attention weight allocation process of the weight allocation unit of the spatiotemporal curvature attention model is as follows: the curvature value is standardized to the range of 0-0.4, the curvature change rate is standardized to the range of 0-0.3, the preset scene parameter value range is 0.2-0.5, and the three are weighted and summed according to the weight ratio of 0.4:0.3:0.3 to obtain the initial weight value; the initial weight values are normalized by min-max normalization to map the final attention weight values to the 0-1 range. The larger the curvature value, the higher the rate of curvature change, and the larger the preset scene parameter value, the larger the corresponding attention weight value.
6. The intelligent smart city security hazard investigation system based on AI vision according to claim 1, characterized in that, The physical verification module includes a multi-domain coupled verification rule set comprising: structural safety verification rules, environmental correlation verification rules, and traffic operation verification rules. The structural safety verification rules cover building load-bearing limit thresholds, bridge vibration frequency ranges, and pipeline pressure tolerance ranges, and are associated with building material strength parameters and structural design parameters. The environmental correlation verification rules include the correspondence between temperature and humidity and road surface conditions, the matching standard between precipitation and water level rise rate, and the correlation table between wind force and outdoor facility stability coefficients, and are associated with meteorological parameters and geographical feature parameters. The traffic operation verification rules cover the range of vehicle braking distances under different road surface conditions, the correspondence between road curvature and safe vehicle speed, and the benchmark value of intersection traffic efficiency, and are associated with traffic flow parameters and road attribute parameters.
7. The intelligent smart city security hazard investigation system based on AI vision according to claim 1, characterized in that, in the physical verification module, the physical consistency verification algorithm is computed by a formula defined over the following quantities: the physical verification deviation rate; the physical parameters identified by AI; the values calculated using a cross-scale physical model; the number of verification parameters; an index of the verification parameters, used to distinguish different types of physical verification parameters; and a confidence threshold. When the deviation rate satisfies the confidence threshold condition, the abnormal sample is deemed credible and the sample is determined to be genuinely abnormal.
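The formula itself is not reproduced in the text above, so the following Python sketch shows only one plausible reading of the listed quantities. Two points are assumptions rather than statements from the claim: that the deviation rate is the mean relative absolute deviation between AI-identified values and model-calculated values, and that a sample is credible when the deviation rate does not exceed the confidence threshold. The names `deviation_rate` and `is_credible_anomaly` are hypothetical.

```python
def deviation_rate(ai_values, model_values):
    """Mean relative absolute deviation over n verification parameters.

    Assumption: this aggregate stands in for the claim's formula,
    which is not available in the text.
    """
    if len(ai_values) != len(model_values) or not ai_values:
        raise ValueError("need two non-empty sequences of equal length")
    if any(m == 0 for m in model_values):
        raise ValueError("model values must be non-zero for a relative deviation")
    n = len(ai_values)
    return sum(abs(a - m) / abs(m) for a, m in zip(ai_values, model_values)) / n


def is_credible_anomaly(ai_values, model_values, threshold):
    """Assumption: the sample is credible when deviation <= threshold."""
    return deviation_rate(ai_values, model_values) <= threshold


# Two verification parameters: one matches the model, one is off by 20%.
rate = deviation_rate([10.0, 20.0], [10.0, 25.0])
credible = is_credible_anomaly([10.0, 20.0], [10.0, 25.0], threshold=0.15)
```

A relative deviation is used here so that parameters with different units (for example pressure and vibration frequency) can be averaged together; the actual aggregation in the patent may differ.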
8. The intelligent smart city security hazard investigation system based on AI vision according to claim 7, characterized in that, in the AI prediction and early warning module, the causal emergence prediction model is computed by a formula defined over the following quantities: the risk prediction vector after the prediction time interval; the causal emergence network; the fused data obtained by fusing multi-source heterogeneous data at time t; the causal chain of the scene at time t; the risk amplification matrix at time t; and the prediction time interval. The modulus length of the risk prediction vector is compared against two risk level thresholds: when it is below the lower threshold, the risk is determined to be low, the lower threshold being determined by statistical analysis of historical security incident data; when it lies between the two thresholds, the risk is determined to be medium, the upper threshold being the boundary between medium and high risk; when it is above the upper threshold, the risk is determined to be high and the highest level of early warning response is triggered. Both thresholds support dynamic updates.
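The risk grading step can be sketched in Python as follows. This is a minimal illustration that covers only the threshold comparison, not the causal emergence network that produces the risk prediction vector. It assumes that "modulus length" means the Euclidean norm and that each threshold value belongs to the higher of the two levels it separates; the claim text does not settle how boundary values are handled. The function name `risk_level` and the parameter names are hypothetical.

```python
import math


def risk_level(risk_vector, low_threshold, high_threshold):
    """Grade a risk prediction vector as 'low', 'medium', or 'high'.

    Assumptions: modulus length is the Euclidean norm, and a norm
    equal to a threshold falls into the higher level.
    """
    if not low_threshold < high_threshold:
        raise ValueError("low_threshold must be smaller than high_threshold")
    norm = math.sqrt(sum(x * x for x in risk_vector))
    if norm < low_threshold:
        return "low"
    if norm < high_threshold:
        return "medium"
    return "high"  # the caller would trigger the highest warning response here


# The vector [3, 4] has norm 5; moving the thresholds changes its grade.
level_a = risk_level([3.0, 4.0], low_threshold=6.0, high_threshold=10.0)
level_b = risk_level([3.0, 4.0], low_threshold=4.0, high_threshold=6.0)
level_c = risk_level([3.0, 4.0], low_threshold=2.0, high_threshold=4.0)
```

Passing the thresholds as arguments rather than constants reflects the claim's statement that they support dynamic updates, for example after re-running the statistical analysis on newer incident data.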
9. The intelligent smart city security hazard investigation system based on AI vision according to claim 1, characterized in that the standardized early warning information output by the AI prediction and early warning module includes the hazard name, the risk level, the predicted occurrence time, GIS annotation information for the affected area, a risk description obtained by semantic parsing and conversion, and multi-department collaborative handling suggestions linked to the emergency plan knowledge base.