A seismic visual early warning method combining monocular depth estimation and optical flow

By combining monocular depth estimation with optical flow, ground motion is inverted using streetlight-type visual sensing equipment, accelerating the deployment of earthquake early warning systems and solving the problems of high cost and insufficient coverage density of traditional seismographs, thus achieving cost-effective earthquake monitoring at the city level.

CN120802340B · Active · Publication Date: 2026-06-23 · HARBIN INST OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HARBIN INST OF TECH
Filing Date
2025-07-02
Publication Date
2026-06-23


Abstract

The application provides a seismic visual early warning method combining monocular depth estimation and an optical flow method. The method realizes inversion of ground acceleration at a base of a street lamp type device by fusing monocular depth estimation and a robust optical flow algorithm suitable for a traffic scene, and can construct a distributed seismic monitoring network based on inversion data, solves the industry pain points of high deployment cost and insufficient coverage density of a traditional seismograph, and provides a high cost-effective solution for city-level earthquake early warning.

Description

Technical Field

[0001] This invention relates to the field of earthquake early warning technology, and in particular to an earthquake visual early warning method that combines monocular depth estimation and optical flow method.

Background Technology

[0002] Earthquakes are the sudden release of enormous energy from the Earth's interior, typically originating from the movement or fracturing of tectonic plates. This violent energy release propagates outwards in the form of elastic waves, causing immense damage when they strike the Earth's surface. Strong earthquakes can directly destroy buildings, bridges, and other infrastructure, and trigger severe secondary disasters such as landslides, liquefaction, tsunamis, and fires, posing a significant threat to human life and property. Earthquakes primarily propagate through seismic waves, which include P-waves (high-speed waves that mainly cause vertical shaking), S-waves (lower-speed waves that mainly cause horizontal shaking and have strong destructive power), and surface waves (slow-speed but large-amplitude waves that are the main factor causing severe structural swaying and damage).

[0003] With breakthroughs in seismic wave propagation theory, P-wave first-arrival detection algorithms, and real-time data transmission technology, earthquake early warning systems have achieved second-level response capabilities, and their technical framework is becoming increasingly mature. (P-waves propagate faster but are less destructive, while S-waves and surface waves propagate more slowly but are more destructive; by detecting P-waves and quickly estimating seismic parameters, a warning can be issued before the S-waves arrive.) However, system performance is limited by the coverage density of the sensor network and by its sustainable operation and maintenance capabilities, and the current mainstream deployment model faces significant bottlenecks. Although traditional broadband seismographs are highly sensitive, their unit procurement cost is as high as tens of thousands of yuan, and they require professional seismic-resistant equipment rooms and continuous power supply, resulting in high station construction, operation, and maintenance costs. Although microelectromechanical systems (MEMS) sensors are gradually being applied to earthquake monitoring networks because of their cost advantage (unit price reduced to the thousands of yuan), the actual deployment spacing remains within the range of 15 kilometers because of limitations in the accuracy of signal denoising algorithms and the supporting requirements for station power supply and communication. In addition, the low-frequency performance of MEMS sensors is limited by electronic noise and mechanical thermal drift, requiring high-precision signal amplification and filtering. As a result, they often perform poorly in low-frequency signal monitoring (such as long-period ground motion and the vibration of high-rise buildings, bridges, and dams).

[0004] Traditional seismic observation technologies are limited by high cost, low density, and poor environmental adaptability, while the large-scale application of MEMS sensors still faces bottlenecks in power consumption and intelligence. Furthermore, due to the impact of urbanization and other production activities, the observation environment of a significant number of seismic ground motion observation stations has been affected to varying degrees, resulting in a significant decline in observation quality. However, with the development of new-generation information technologies such as artificial intelligence, it has become possible to create "low-power, miniaturized, and intelligent" seismic observation technologies and equipment based on computer vision. Currently, the camera coverage density in key cities exceeds 150 units / km² (and even exceeds 500 units / km² in areas such as Shanghai and Shenzhen). The dense near-ground deployment and the complete power supply / communication infrastructure provide a new solution for low-cost, high-resolution seismic ground motion monitoring. Relying on urban camera networks to build a "de-specialized" monitoring model is expected to significantly shorten the spacing between seismic monitoring stations.

Summary of the Invention

[0005] The purpose of this invention is to solve the problems in the prior art and to propose an earthquake visual early warning method that combines monocular depth estimation and optical flow method.

[0006] This invention is achieved through the following technical solution: This invention proposes an earthquake visual early warning method combining monocular depth estimation and optical flow method, the method comprising:

[0007] Step 1: Construct a single-degree-of-freedom motion model for a streetlight-type visual sensing device: Taking an outdoor streetlight-type monitoring device as the object, establish a simplified rigid connection model between the device and the streetlight to eliminate relative displacement interference; combine the calibrated camera intrinsic parameters to establish a spatial reference for visual perception and provide physical constraints for vibration calculation.

[0008] Step 2: Equipment vibration identification by integrating monocular depth estimation and optical flow: Based on the monitoring video stream, monocular depth estimation is applied to solve the scene depth distribution, and combined with the adaptive robust optical flow algorithm designed for traffic scenes, the acceleration and displacement vector of the camera in three-dimensional space are solved to achieve accurate perception of the carrier vibration.

[0009] Step 3: Retrieve the true ground acceleration based on visually obtained vibration perception: Filter non-seismic interference signals through the vibration identification system and extract the frequency domain characteristics of equipment vibration; based on the kinematic relationship of the single-degree-of-freedom motion model, retrieve the time history curve of the ground input acceleration to form a reliable characterization of the ground motion parameters retrieved at a single point.

[0010] Step 4: Construct a multi-device collaborative earthquake early warning system: Deploy a distributed visual sensing device network based on ground motion acceleration data retrieved from a single point, and synchronously collect the first arrival time of P-waves and peak acceleration values at each station; combine the STA / LTA first arrival picking algorithm to achieve automatic phase identification, use the time difference of arrival of P-waves between stations to retrieve the source location, and integrate the acceleration amplitude characteristics of multiple nodes to calculate the magnitude; finally, based on the difference in propagation speed between S-waves and P-waves, generate regional ground motion intensity prediction and early warning time windows to drive second-level automated early warning response.

[0011] Furthermore, in step one, it is first clarified that the main body of the monitoring camera is fixedly installed on the top of the street light pole; secondly, the connection between the camera and the light pole is simplified to a rigid connection, ignoring the actual possible flexibility and vibration; then, the camera's intrinsic parameters are accurately calibrated with the help of a calibration plate to ensure the geometric accuracy of the mapping between the image and the physical space; when performing ground vibration inversion analysis, the overall system including the camera and the light pole support structure is equivalent to a single-degree-of-freedom structure.

[0012] Furthermore, in step two, the adaptive robust optical flow algorithm is used to stably extract pixel-level background key point displacements characterizing equipment vibration from the traffic scene video stream, specifically including two stages: robust feature point extraction and robust displacement data filtering.

[0013] Furthermore, the robust feature point extraction specifically involves: firstly, using edge detection technology to identify the outline of objects in the image, then using a high-precision line segment detection method to extract significant straight line segments from the outline, and accurately selecting the endpoints of the line segments as key feature points.

[0014] Furthermore, the robust screening of displacement data specifically involves: for a large number of extracted feature points, calculating their pixel displacements in consecutive video frames, using the K-means clustering algorithm to perform statistical analysis on the physical displacements calculated for all feature points, effectively eliminating outliers, and selecting the average value of the feature point displacements within the largest cluster as the final effective structural displacement corresponding to that video frame.

[0015] Furthermore, in step two, the spatial distance of key static reference points is calculated in real time using a pre-trained monocular depth network, and then spatiotemporally fused with optical flow displacement. Based on the calibrated camera intrinsic parameters and imaging geometry principles, the camera's acceleration and displacement vectors in three-dimensional space are calculated; these are the direct kinematic response of the structure to ground vibration. This motion information serves as the core input, driving the single-degree-of-freedom motion model to complete the ground acceleration inversion, after which the source parameters are inverted.

[0016] Furthermore, in step three, the simplified model and vibration equation of the streetlight-type visual sensing device are used to invert the ground acceleration, specifically including:

[0017] Model parameter identification: Using the time-domain signal of the structural vibration response generated by the environmental dynamic load excitation under normal operating conditions, the frequency domain characteristics are extracted by Fast Fourier Transform (FFT), and then the key parameters of the single-degree-of-freedom motion model are identified, specifically including the natural frequency and damping ratio.

[0018] Inversion calculation: By combining the identified model parameters with the monitored structural vibration response, and based on the single-degree-of-freedom vibration equation, the ground acceleration time history at the equipment base, i.e., the installation point, is calculated and output:

[0019] (1).
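The image for formula (1) is not reproduced in this text. For orientation only, the textbook equation of motion of a base-excited single-degree-of-freedom oscillator is written out below. The symbols are assumptions and the patent's own formula may be written differently: u(t) is the displacement of the pole top relative to its base, ξ the damping ratio, ω_n the natural circular frequency, and ü_g(t) the ground acceleration.

```latex
\ddot{u}(t) + 2\xi\omega_n\,\dot{u}(t) + \omega_n^{2}\,u(t) = -\ddot{u}_g(t)
\quad\Longrightarrow\quad
\ddot{u}_g(t) = -\left[\ddot{u}(t) + 2\xi\omega_n\,\dot{u}(t) + \omega_n^{2}\,u(t)\right]
```

Read this way, the inversion needs only the identified ξ and ω_n plus the measured response and its time derivatives.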

[0020] Furthermore, in step four, based on the P-wave arrival time difference obtained from multiple spatially distributed stations, the spatial location of the earthquake source is accurately calculated and determined using the inter-station P-wave arrival time difference inversion method. Further, combining the maximum acceleration amplitude characteristics after P-wave arrival measured at each station, multi-node amplitude data is integrated for comprehensive analysis to calculate the magnitude, characterizing the intensity of earthquake energy release. Based on the determined source location and magnitude, and according to the inherent difference in propagation speed between S-waves and P-waves in the strata, a predicted distribution map of peak ground intensity at different locations in the target area is calculated, while corresponding S-wave warning time windows are generated for each location. Finally, based on the generated earthquake intensity prediction and warning time window information, the warning system drives an automated response mechanism to issue warning signals to the affected area within a second-level timescale.
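The inter-station P-wave arrival-time-difference location described above could be sketched as a simple grid search. This is an illustrative sketch, not the patent's algorithm: it assumes a flat 2D geometry, a uniform P-wave speed, and noise-free picks, and the function name and parameters are hypothetical.

```python
from math import hypot

def locate_epicenter(stations, arrivals, vp_km_s, grid_km, step_km):
    """Grid-search an epicentre from P-wave arrival times.

    stations: list of (x, y) positions in km; arrivals: P arrival times in s.
    Assumes a uniform P-wave speed vp_km_s (an illustrative simplification).
    Returns (x, y, origin_time) of the best grid node.
    """
    n = len(stations)
    steps = int(round(grid_km / step_km))
    best, best_cost = None, float("inf")
    for ix in range(-steps, steps + 1):
        for iy in range(-steps, steps + 1):
            x, y = ix * step_km, iy * step_km
            # Origin time implied by each station; all agree at the true source,
            # so only arrival-time differences between stations matter.
            t0 = [t - hypot(x - sx, y - sy) / vp_km_s
                  for (sx, sy), t in zip(stations, arrivals)]
            mean_t0 = sum(t0) / n
            cost = sum((v - mean_t0) ** 2 for v in t0)
            if cost < best_cost:
                best, best_cost = (x, y, mean_t0), cost
    return best
```

Removing the mean implied origin time makes the cost depend only on inter-station time differences, which matches the description; a production system would use a layered velocity model and a finer or iterative search.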

[0021] The present invention also proposes an electronic device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the earthquake visual early warning method combining monocular depth estimation and optical flow method.

[0022] The present invention also proposes a computer-readable storage medium for storing computer instructions, which, when executed by a processor, implement the steps of the earthquake visual early warning method combining monocular depth estimation and optical flow.

[0023] The beneficial effects of this invention are:

[0024] This invention proposes an earthquake visual early warning method that combines monocular depth estimation and optical flow. Its core innovation lies in the fact that by integrating monocular depth estimation with a robust optical flow algorithm suitable for traffic scenarios, it achieves the inversion of ground acceleration at the base of street light equipment. Based on the inversion data, a distributed earthquake monitoring network can be constructed, solving the industry pain points of high deployment cost and insufficient coverage density of traditional seismographs, and providing a cost-effective solution for city-level earthquake early warning.

[0025] This invention utilizes streetlight-type visual sensing devices widely deployed in cities as earthquake monitoring nodes, eliminating the need for additional dedicated earthquake sensors. This solves the problems of high deployment costs, low deployment density, and difficulty in achieving high coverage across the entire city associated with traditional seismographs (strong-motion meters). A simple software upgrade is all that's needed to convert a large number of existing streetlights into monitoring points, significantly improving the spatial resolution and coverage of the earthquake monitoring network.

Attached Figure Description

[0026] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0027] Figure 1 is a schematic diagram of the street light pole model and the simplified single-degree-of-freedom model of the present invention.

[0028] Figure 2 is a schematic diagram of the robust optical flow method combining line segment detection proposed in this invention.

[0029] Figure 3 is a schematic diagram of the simulation device in an embodiment of the present invention.

[0030] Figure 4 is a schematic diagram of the calibration results in an embodiment of the present invention.

[0031] Figure 5 is a schematic diagram of the depth estimation results in an embodiment of the present invention.

[0032] Figure 6 is a comparison chart of the pole-top acceleration time history results in an embodiment of the present invention.

[0033] Figure 7 is a comparison chart of the ground acceleration time history results in an embodiment of the present invention.

[0034] The attached figures are labeled as follows: 1-Camera, 2-Scaled model of a street lamp pole, 3-Earthquake simulation shaking table, 4-Simulated traffic scene, 5-Acceleration sensor.

Detailed Implementation

[0035] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0036] This invention proposes an earthquake visual early warning method that combines monocular depth estimation and optical flow. This method combines monocular depth estimation and optical flow to identify outdoor ground motion, and obtains ground motion inversion results through a physical model of a street lamp-type outdoor visual sensing device. Based on this, a vision-based ground acceleration monitoring network is established. Using this monitoring network, combined with existing P-wave first arrival detection algorithms, the epicenter can be quickly located and the magnitude estimated, thereby issuing an earthquake warning.

[0037] Specifically, with reference to Figures 1-7, this invention proposes an earthquake visual early warning method combining monocular depth estimation and optical flow method, the method comprising:

[0038] Step 1: Construct a single-degree-of-freedom motion model for a streetlight-type visual sensing device: Taking an outdoor streetlight-type monitoring device as the object, establish a simplified rigid connection model between the device and the streetlight to eliminate relative displacement interference; combine the calibrated camera intrinsic parameters to establish a spatial reference for visual perception and provide physical constraints for vibration calculation.

[0039] This invention proposes a simplified physical modeling method for ground vibration inversion applications, targeting streetlight-type monitoring devices widely deployed in urban spaces. In this embodiment, the research object is a typical outdoor visual sensing device for streetlights. First, it is clarified that the main body of the monitoring camera is fixedly installed on the top of the streetlight pole; second, the connection between the camera and the pole is simplified to a rigid connection, ignoring any potential flexibility and vibration; then, the intrinsic parameters of the camera (focal length, principal point, distortion coefficient, etc.) are accurately calibrated using a calibration plate to ensure the geometric accuracy of the image-physical space mapping; during ground vibration inversion analysis, the overall system including the camera and the pole support structure is treated as equivalent to a single-degree-of-freedom (SDOF) structure, as shown in Figure 1. This simplified model aims to efficiently characterize the dominant dynamic behavior of the system under ground vibration excitation, serving as the physical basis for subsequent source parameter inversion using the device's visual sensing data.

[0040] Step 2: Equipment vibration identification by integrating monocular depth estimation and optical flow: Based on the monitoring video stream, monocular depth estimation is applied to solve the scene depth distribution, and combined with the adaptive robust optical flow algorithm designed for traffic scenes, the acceleration and displacement vector of the camera in three-dimensional space are solved to achieve accurate perception of the carrier vibration.

[0041] The earthquake early warning system constructed in this invention is based on the simplified physical model (single-degree-of-freedom model) of the aforementioned streetlight-type visual perception device, and integrates monocular depth estimation with optical flow. To address the complexity of traffic monitoring scenarios (such as interference from moving objects), a highly robust optical flow algorithm that incorporates accurate line segment detection was developed for traffic scenes, as shown in Figure 2. The algorithm stably extracts pixel-level background key point displacements representing equipment vibration from traffic scene video streams, and includes two stages: robust feature point extraction and robust displacement data filtering.

[0042] The robust feature point extraction method is as follows: First, edge detection technology is used to identify the outline of objects in the image. Then, a high-precision line segment detection method is used to extract significant straight line segments in the outline and the endpoints of the line segments are accurately selected as key feature points. Compared with traditional corner point or pixel point features, this method has stronger resistance to motion interference and lighting changes in traffic scenes.

[0043] The robust displacement data screening process involves: for a large number of extracted feature points, calculating their pixel displacements in consecutive video frames; using the K-means clustering algorithm to statistically analyze the physical displacements calculated for all feature points, effectively eliminating outliers (usually caused by interference from moving objects in the scene); and selecting the average displacement of feature points within the largest cluster as the final effective structural displacement corresponding to that video frame. This process significantly improves the accuracy and stability of displacement measurements in complex dynamic traffic environments.
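The screening step above can be illustrated with a small, dependency-free k-means. This is a sketch under stated assumptions, not the patent's implementation: the number of clusters (k = 2), the farthest-first initialisation, and the function names are all hypothetical choices made for the example.

```python
from math import dist

def kmeans(points, k=2, iters=100):
    """Lloyd's k-means on small lists of (dx, dy) tuples.

    Centres are initialised farthest-first so the result is deterministic
    (an illustrative choice; any standard initialisation could be used).
    """
    centers = [tuple(points[0])]
    while len(centers) < k:
        centers.append(tuple(max(
            points, key=lambda p: min(dist(p, c) for c in centers))))
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters

def dominant_displacement(displacements, k=2):
    """Mean displacement of the largest cluster, taken as the structural motion."""
    largest = max(kmeans(displacements, k), key=len)
    return tuple(sum(axis) / len(largest) for axis in zip(*largest))
```

Feature points on moving vehicles fall into a small, distant cluster and are discarded, while the majority cluster of static background points supplies the per-frame displacement.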

[0044] In step two, a pre-trained monocular depth network is used to calculate the spatial distance of key static reference points in real time, and this distance is spatiotemporally fused with optical flow displacement. Based on the calibrated camera intrinsic parameters and imaging geometry principles, the camera's acceleration and displacement vectors in three-dimensional space are calculated; these are the direct kinematic response of the structure to ground vibration. This motion information serves as the core input, driving the single-degree-of-freedom motion model to complete the ground acceleration inversion, after which the source parameters are inverted.
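A minimal sketch of how depth and optical flow might be fused under a pinhole camera model follows. It assumes pure lateral camera translation, a static reference point at a known depth, and small motions; the function names and the sign convention are illustrative assumptions rather than details taken from the patent.

```python
from itertools import accumulate

def pixel_to_metric(du_px, depth_m, focal_px):
    """Lateral camera translation (m) implied by a static point's image shift.

    Pinhole model: a point at depth Z shifts by du = -f * dX / Z pixels when the
    camera translates by dX, so dX = -du * Z / f (assumed sign convention).
    """
    return -du_px * depth_m / focal_px

def camera_displacement(flow_px, depth_m, focal_px):
    """Cumulative camera displacement (m) from frame-to-frame pixel shifts."""
    return list(accumulate(pixel_to_metric(du, depth_m, focal_px) for du in flow_px))

def second_difference(x, dt):
    """Central second difference: acceleration at the interior samples of x."""
    return [(x[i + 1] - 2 * x[i] + x[i - 1]) / dt ** 2 for i in range(1, len(x) - 1)]
```

For example, a 2-pixel shift of a point 10 m away seen through a 1000-pixel focal length corresponds to about 2 cm of camera motion. Differentiating measured displacement twice amplifies noise, so some smoothing or band-pass filtering would normally precede this step.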

[0045] Step 3: Retrieve the true ground acceleration based on visually obtained vibration perception: Filter out non-seismic interference signals (such as vehicle traffic) through the vibration recognition system and extract the frequency domain characteristics of equipment vibration; based on the kinematic relationship of the single-degree-of-freedom motion model, retrieve the time history curve of the ground input acceleration to form a reliable representation of the ground motion parameters retrieved at a single point.

[0046] In step three, the simplified model (single-degree-of-freedom motion model) and its vibration equation (formula (1)) of the street lamp-type visual sensing device are used to invert the ground acceleration, specifically including:

[0047] Model parameter identification: Using the time-domain signal of the structural vibration response generated by the equipment under normal operating conditions by environmental dynamic loads (such as wind, vehicle traffic, and other natural or operational excitations), the frequency domain characteristics are extracted by Fast Fourier Transform (FFT), thereby identifying the key parameters of the single-degree-of-freedom motion model, specifically including the natural frequency and damping ratio.
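One common way to read natural frequency and damping ratio off a spectrum is peak picking plus the half-power bandwidth method; the sketch below uses it as an assumed stand-in, since the text does not say which estimator is applied after the FFT. A naive DFT keeps the example dependency-free (in practice a library FFT would be used), and all function names are hypothetical.

```python
import cmath
import math

def amplitude_spectrum(x, fs, f_max):
    """Naive DFT amplitudes for bins from 0 Hz up to f_max."""
    n = len(x)
    df = fs / n
    k_max = min(int(f_max / df), n // 2)
    freqs, amps = [], []
    for k in range(k_max + 1):
        w = -2j * math.pi * k / n
        freqs.append(k * df)
        amps.append(abs(sum(v * cmath.exp(w * i) for i, v in enumerate(x))))
    return freqs, amps

def _crossing(freqs, amps, i_in, i_out, level):
    """Linearly interpolate the frequency at which the amplitude equals level."""
    frac = (amps[i_in] - level) / (amps[i_in] - amps[i_out])
    return freqs[i_in] + frac * (freqs[i_out] - freqs[i_in])

def identify_sdof(x, fs, f_max):
    """Return (natural frequency in Hz, damping ratio) from a response record.

    Assumes a single, well-resolved spectral peak between DC and f_max.
    """
    freqs, amps = amplitude_spectrum(x, fs, f_max)
    k = max(range(1, len(amps)), key=amps.__getitem__)  # skip the DC bin
    level = amps[k] / math.sqrt(2)  # half-power amplitude
    lo = k
    while lo > 0 and amps[lo] > level:
        lo -= 1
    hi = k
    while hi < len(amps) - 1 and amps[hi] > level:
        hi += 1
    f1 = _crossing(freqs, amps, lo + 1, lo, level)
    f2 = _crossing(freqs, amps, hi - 1, hi, level)
    return freqs[k], (f2 - f1) / (2 * freqs[k])
```

The half-power estimate is only reliable for light damping and when the record is long enough that the frequency resolution is finer than the peak width.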

[0048] Inversion calculation: The identified model parameters (natural frequency, damping ratio) are combined with the monitored structural vibration response (as known input), and the ground acceleration time history at the equipment base, i.e. the installation point, is calculated and output according to the single-degree-of-freedom vibration equation (formula (1)):

[0049] (1).
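A minimal time-domain sketch of the inversion step, assuming the textbook base-excited SDOF relation ü + 2ξω_n·u̇ + ω_n²·u = −ü_g, where u is the pole-top displacement relative to the base. Whether formula (1) of the patent is written in exactly this form cannot be confirmed from this text, and the function name is hypothetical.

```python
import math

def invert_ground_acceleration(u, dt, fn_hz, xi):
    """Ground acceleration from relative displacement u sampled every dt seconds.

    Uses central differences for velocity and acceleration, so the result
    covers the interior samples u[1:-1] only (two samples shorter than u).
    """
    wn = 2 * math.pi * fn_hz
    a_g = []
    for i in range(1, len(u) - 1):
        vel = (u[i + 1] - u[i - 1]) / (2 * dt)
        acc = (u[i + 1] - 2 * u[i] + u[i - 1]) / dt ** 2
        # Rearranged SDOF equation: a_g = -(acc + 2*xi*wn*vel + wn^2 * u)
        a_g.append(-(acc + 2 * xi * wn * vel + wn ** 2 * u[i]))
    return a_g
```

If the measured quantity is instead the absolute pole-top acceleration, a frequency-domain division by the SDOF transmissibility would be the analogous step; the time-domain form is shown here only because it is the shortest to state.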

[0050] Step 4: Construct a multi-device collaborative earthquake early warning system: Deploy a distributed visual sensing device network based on ground motion acceleration data retrieved from a single point, and synchronously collect the first arrival time of P-waves and peak acceleration values at each station; combine the STA / LTA first arrival picking algorithm to achieve automatic phase identification, use the time difference of arrival of P-waves between stations to retrieve the source location, and integrate the acceleration amplitude characteristics of multiple nodes to calculate the magnitude; finally, based on the difference in propagation speed between S-waves and P-waves, generate regional ground motion intensity prediction and early warning time windows to drive second-level automated early warning response.

[0051] This invention utilizes a single-device ground acceleration inversion method to collaboratively construct an earthquake early warning system using multiple devices. The system includes a distributed visual sensing device network deployed at various monitoring stations, used to synchronously acquire and transmit ground motion acceleration data recorded by each station in real time, extracting key waveform features, including the first arrival time of the P-wave and the peak acceleration value. The system processes the acquired data in real time using a short-time average / long-time average (STA / LTA) first arrival picking algorithm, achieving automatic identification and accurate picking of seismic phases, especially the arrival of the P-wave.

[0052] Specifically, in step four, based on the P-wave arrival time difference obtained from multiple spatially distributed stations, the spatial location of the earthquake source is accurately calculated and determined using the inter-station P-wave arrival time difference inversion method. Furthermore, combining the maximum acceleration amplitude characteristics after P-wave arrival measured at each station, multi-node amplitude data is integrated for comprehensive analysis to calculate the magnitude, characterizing the intensity of earthquake energy release. Based on the determined source location and magnitude, and according to the inherent difference in propagation speed between S-waves and P-waves in the strata, a predicted distribution map of peak ground intensity at different locations in the target area is calculated, while corresponding S-wave warning time windows are generated for each location. Finally, based on the generated earthquake intensity prediction and warning time window information, the warning system drives an automated response mechanism to issue warning signals to the affected area within a second-level timescale, thereby significantly improving the timeliness of earthquake warnings.
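The S-wave warning time window follows from simple travel-time arithmetic. The sketch below assumes uniform wave speeds (6.0 km/s for P and 3.5 km/s for S, typical crustal values used here only as placeholders) and a fixed processing latency; these numbers and the function name are illustrative assumptions, not values from the patent.

```python
def s_wave_lead_time(site_dist_km, trigger_dist_km,
                     vp_km_s=6.0, vs_km_s=3.5, latency_s=1.0):
    """Seconds of warning at a site before the S-wave arrives.

    site_dist_km: distance from the source to the site being warned.
    trigger_dist_km: distance from the source to the station that triggers the alert.
    The alert is issued when the P-wave reaches the triggering station plus a
    processing latency; lead time is clipped at zero inside the blind zone.
    """
    t_alert = trigger_dist_km / vp_km_s + latency_s
    t_s_arrival = site_dist_km / vs_km_s
    return max(0.0, t_s_arrival - t_alert)
```

With these placeholder values, a site 70 km from the source gets about 17 s of warning when the nearest station is 12 km from the source, while a site 5 km away gets none. Denser stations shorten the trigger distance, which is the practical benefit of the camera network described here.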

[0053] Example

[0054] The earthquake visual early warning method combining monocular depth estimation and optical flow proposed in this invention will be described in detail below with reference to the accompanying drawings. This embodiment illustrates, through a shaking table simulation, how the invention inverts ground acceleration using a street lamp-type visual sensing device, and further explains the implementation process of the earthquake early warning method with a flowchart. The simulation device is shown in Figure 3 (1 is a camera, 2 is a scaled-down model of a street lamp pole, 3 is an earthquake simulation shaking table, 4 is a simulated traffic scene, and 5 is an acceleration sensor).

[0055] This invention proposes an earthquake visual early warning method combining monocular depth estimation and optical flow method, the method comprising:

[0056] Step 1: Construct a single-degree-of-freedom motion model for a streetlight-type visual sensing device: Taking an outdoor streetlight-type monitoring device as the object, establish a simplified rigid connection model between the device and the streetlight to eliminate relative displacement interference; combine the calibrated camera intrinsic parameters to establish a spatial reference for visual perception and provide physical constraints for vibration calculation.

[0057] Step 2: Equipment vibration identification by integrating monocular depth estimation and optical flow: Based on the monitoring video stream, monocular depth estimation is applied to solve the scene depth distribution, and combined with the adaptive robust optical flow algorithm designed for traffic scenes, the acceleration and displacement vector of the camera in three-dimensional space are solved to achieve accurate perception of the carrier vibration.

[0058] Step 3: Retrieve the true ground acceleration based on visually obtained vibration perception: Filter non-seismic interference signals through the vibration identification system and extract the frequency domain characteristics of equipment vibration; based on the kinematic relationship of the single-degree-of-freedom motion model, retrieve the time history curve of the ground input acceleration to form a reliable characterization of the ground motion parameters retrieved at a single point.

[0059] Step 4: Construct a multi-device collaborative earthquake early warning system: Deploy a distributed visual sensing device network based on ground motion acceleration data retrieved from a single point, and synchronously collect the first arrival time of P-waves and peak acceleration values at each station; combine the STA / LTA first arrival picking algorithm to achieve automatic phase identification, use the time difference of arrival of P-waves between stations to retrieve the source location, and integrate the acceleration amplitude characteristics of multiple nodes to calculate the magnitude; finally, based on the difference in propagation speed between S-waves and P-waves, generate regional ground motion intensity prediction and early warning time windows to drive second-level automated early warning response.

[0060] During the inversion process, camera intrinsic parameters are obtained using a calibration board, and scene scale is perceived through depth estimation. Specifically, a 9×12 checkerboard calibration board is selected. The calibration results are shown in Figure 4, and the resulting camera intrinsic parameter matrix and distortion coefficients are given in Equation (2).

[0061] (2)
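The numerical values of Equation (2) are not available in this text. For reference, checkerboard calibration with a pinhole model commonly yields an intrinsic matrix and a distortion vector of the general form below; the symbols (focal lengths f_x, f_y and principal point c_x, c_y in pixels, radial terms k_1, k_2, k_3, tangential terms p_1, p_2) and the ordering follow a widely used convention and are assumptions about how the patent's result is laid out.

```latex
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix},
\qquad
D = \begin{bmatrix} k_1 & k_2 & p_1 & p_2 & k_3 \end{bmatrix}
```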

[0062] A pre-trained depth estimation model is used to perceive the scene scale; the resulting depth map of the camera image is shown in Figure 5.

[0063] During the inversion process, the response of the structure under excitation is obtained, and the damping ratio and natural frequency of the tested structure are obtained by applying the method of the present invention. Specifically, a broadband random white noise excitation signal with a preset intensity and frequency range is input to the earthquake simulation shaking table through the vibration control system. The pole-top acceleration time history returned by the acceleration sensor is transformed into the frequency domain by fast Fourier transform, and the natural frequency of the structure is obtained as 31.36.

[0064] During the inversion process, a simulated earthquake is applied and the ground acceleration is inverted. Specifically, an amplitude-modulated earthquake record is input to the earthquake simulation shaking table through the vibration control system. The method of this invention is used to identify the acceleration response at the top of the pole, which is then compared with the data measured by the sensors; the comparison is shown in Figure 6. Furthermore, the acceleration time history of the shaking table surface is solved from the pole-top response using the simplified single-degree-of-freedom model and compared with the sensor data; the comparison is shown in Figure 7. The results show that the method of the present invention correctly identifies the acceleration response at the top of the pole and, through the simplified physical model, accurately reflects the ground acceleration.
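The single-degree-of-freedom inversion can be illustrated with the textbook equation of motion u'' + 2ζω_n u' + ω_n² u = -a_g, where u is the displacement of the pole top relative to its base. The sketch below assumes that this relative displacement is the available input and that the damping ratio ζ and natural frequency have already been identified; whether the invention uses exactly this form is an assumption, and the function name is hypothetical.

```python
import math

def invert_ground_acceleration(u, dt, fn_hz, zeta):
    """Recover ground acceleration from the relative pole-top displacement.

    u:     relative displacement samples (m)
    dt:    sampling interval (s)
    fn_hz: natural frequency of the equivalent SDOF system (Hz)
    zeta:  damping ratio
    Returns a_g at the interior samples (len(u) - 2 values), using
    a_g = -(u'' + 2*zeta*wn*u' + wn^2*u) with central differences.
    """
    wn = 2.0 * math.pi * fn_hz
    ag = []
    for i in range(1, len(u) - 1):
        vel = (u[i + 1] - u[i - 1]) / (2.0 * dt)
        acc = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / (dt * dt)
        ag.append(-(acc + 2.0 * zeta * wn * vel + wn * wn * u[i]))
    return ag
```

With noisy visual measurements the finite differences would typically be preceded by band-pass filtering around the frequency range of interest.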

[0065] In implementing this method, a distributed network is first constructed within the seismic risk area. Specifically, this network consists of ground monitoring stations formed by multiple streetlight-type visual sensing devices with known camera parameters. These stations are spatially distributed and achieve μs-level time synchronization through high-precision timing technologies such as GPS or BeiDou, laying the foundation for subsequent collaborative analysis. Each station continuously and synchronously acquires ground motion acceleration data at high frequency; the data are computed on the edge devices and transmitted in real time to the central processing system. Secondly, the central processing system automatically processes the real-time acceleration waveform data received from each station to estimate the source parameters. Specifically, the core of this process is to scan the input waveform with the short-time average / long-time average (STA/LTA) first-arrival picking algorithm. This algorithm calculates the ratio of the average signal amplitude within a short time window (STA, e.g., 0.2 seconds) to the average signal amplitude within a longer time window (LTA, e.g., 4 seconds). When this ratio exceeds a preset threshold (e.g., 3.5), the system automatically marks the time of the first arrival of the P-wave and extracts the corresponding initial acceleration value and the subsequent peak ground acceleration (PGA). Once the system confirms that at least three spatially distributed monitoring stations have identified and reported the first arrival time of the P-wave, it enters the source location stage.
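The STA/LTA picking rule can be sketched as follows, using the example window lengths and threshold quoted in the text as defaults. This is a minimal illustration on absolute amplitudes with both windows ending at the current sample; the function name `sta_lta_pick` is hypothetical, and production pickers often use squared amplitudes or recursive averages instead.

```python
def sta_lta_pick(signal, fs, sta_win=0.2, lta_win=4.0, threshold=3.5):
    """Return the index of the first sample whose STA/LTA ratio exceeds
    `threshold`, or None if no sample triggers.

    signal:  acceleration samples
    fs:      sampling rate in Hz
    sta_win: short window length in s
    lta_win: long window length in s
    """
    nsta = max(1, int(round(sta_win * fs)))
    nlta = max(nsta + 1, int(round(lta_win * fs)))
    # Prefix sums of |x| make every window average an O(1) lookup.
    csum = [0.0]
    for s in signal:
        csum.append(csum[-1] + abs(s))
    for i in range(nlta - 1, len(signal)):
        sta = (csum[i + 1] - csum[i + 1 - nsta]) / nsta
        lta = (csum[i + 1] - csum[i + 1 - nlta]) / nlta
        if lta > 0 and sta / lta > threshold:
            return i
    return None
```

Dividing the returned index by `fs` gives the pick time in seconds, and the PGA can then be taken as the largest absolute sample after the pick.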
Using the differences in the first arrival times of the P-waves recorded by these stations, along with the known geographical coordinates of each station, the system applies an inter-station P-wave arrival time difference inversion method to establish and solve a set of equations. Based on a P-wave propagation velocity model of the strata, this set of equations yields the spatial source location (latitude, longitude, and depth) of the earthquake within the target area, while simultaneously determining the absolute time of earthquake occurrence. Based on the source location information, the system integrates the maximum acceleration amplitude (PGA) recorded after the arrival of the P-wave at the various monitoring stations. By fusing and analyzing these PGA values distributed across different spatial locations, and based on a pre-defined empirical model of the relationship between regional earthquake intensity and PGA, the system calculates the magnitude, which characterizes the overall energy release of the earthquake. Next, the central processing system estimates the spatial distribution of earthquake intensity and the warning time window from the established earthquake information. Specifically, based on the determined hypocenter location and magnitude, and utilizing the inherent difference in propagation speed between S-waves and P-waves in the strata (S-waves are typically slower), the system performs two key predictive calculations: firstly, it predicts and maps the earthquake intensity expected at specific geographical locations within the target area; secondly, it calculates the remaining time before the more destructive S-wave arrives at each specific location (the S-wave warning time window). Finally, tiered warnings are issued.
Specifically, based on the generated earthquake intensity prediction distribution map and the corresponding S-wave warning time window information for each location, the system immediately activates an automated emergency response mechanism. Within seconds of the earthquake's impact, the system efficiently disseminates differentiated and tiered early warning information to designated terminals in potentially affected areas (such as emergency broadcast systems, high-speed train control systems, and urban power and gas grid control systems). This mechanism aims to provide crucial advance action time for the protection of life and property.
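The S-wave warning time window for a site can be sketched as the predicted S-wave arrival time minus the current time. The sketch assumes a uniform S-wave velocity and straight ray paths, which is a simplification of a real velocity model; the function name `s_wave_warning_time` and its parameters are hypothetical.

```python
import math

def s_wave_warning_time(source, origin_time, site, vs, now):
    """Remaining seconds before the S-wave reaches `site`.

    source:      (x, y, depth) of the hypocenter in km
    origin_time: earthquake origin time in s
    site:        (x, y) surface location in km
    vs:          assumed uniform S-wave velocity in km/s
    now:         current time in s, on the same clock as origin_time
    Returns 0.0 if the S-wave has already arrived.
    """
    dist = math.sqrt(
        (site[0] - source[0]) ** 2
        + (site[1] - source[1]) ** 2
        + source[2] ** 2
    )
    return max(0.0, origin_time + dist / vs - now)
```

Evaluating this function over a grid of sites gives the spatial distribution of warning times; sites close to the epicenter return zero, which corresponds to the blind zone where no advance warning is possible.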

[0066] The present invention also proposes an electronic device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the earthquake visual early warning method combining monocular depth estimation and optical flow method.

[0067] The present invention also proposes a computer-readable storage medium for storing computer instructions, which, when executed by a processor, implement the steps of the earthquake visual early warning method combining monocular depth estimation and optical flow.

[0068] The memory in this application embodiment can be volatile memory or non-volatile memory, or it can include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus RAM (DR RAM). It should be noted that the memory used in the methods described in this invention is intended to include, but is not limited to, these and any other suitable types of memory.

[0069] The above embodiments can be implemented, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, they can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., digital video discs (DVDs)), or semiconductor media (e.g., solid-state disks (SSDs)).

[0070] In implementation, each step of the above method can be completed by integrated logic circuits in the processor's hardware or by instructions in software. The steps of the method disclosed in the embodiments of this application can be directly implemented by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method. To avoid repetition, detailed descriptions are omitted here.

[0071] It should be noted that the processor in the embodiments of this application can be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method embodiments can be completed by the integrated logic circuitry in the processor's hardware or by instructions in software form. The processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly embodied as execution by a hardware decoding processor, or as a combination of hardware and software modules in the decoding processor. The software modules can be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory, and the processor reads the information in the memory and, in conjunction with its hardware, completes the steps of the above methods.

[0072] The above provides a detailed description of the earthquake visual early warning method combining monocular depth estimation and optical flow proposed in this invention. Specific examples have been used to illustrate the principle and implementation of this invention. The description of the above embodiments is only intended to help in understanding the method and core idea of this invention. At the same time, those skilled in the art may, based on the idea of this invention, make changes to the specific implementation and scope of application. Therefore, the content of this specification should not be construed as a limitation of this invention.

Claims

1. An earthquake visual early warning method combining monocular depth estimation and optical flow method, characterized in that the method includes:
Step 1: Construct a single-degree-of-freedom motion model for a streetlight-type visual sensing device: Taking an outdoor streetlight-type monitoring device as the object, establish a simplified rigid connection model between the device and the streetlight to eliminate relative displacement interference; combine the calibrated camera intrinsic parameters to establish a spatial reference for visual perception and provide physical constraints for vibration calculation.
Step 2: Equipment vibration identification by integrating monocular depth estimation and optical flow: Based on the monitoring video stream, monocular depth estimation is applied to solve the scene depth distribution, and combined with the adaptive robust optical flow algorithm designed for traffic scenes, the acceleration and displacement vectors of the camera in three-dimensional space are solved to achieve accurate perception of the carrier vibration.
Step 3: Invert the true ground acceleration based on the visually obtained vibration perception: Filter non-seismic interference signals through the vibration identification system and extract the frequency domain characteristics of equipment vibration; based on the kinematic relationship of the single-degree-of-freedom motion model, invert the time history curve of the ground input acceleration to form a reliable characterization of the ground motion parameters inverted at a single point.
Step 4: Construct a multi-device collaborative earthquake early warning system: Based on the ground motion acceleration data inverted at each single point, deploy a distributed network of visual sensing devices and synchronously collect the P-wave first-arrival time and peak acceleration value at each station; apply the STA/LTA first-arrival picking algorithm to achieve automatic phase identification, use the inter-station differences in P-wave arrival time to invert the source location, and integrate the acceleration amplitude characteristics of multiple nodes to calculate the magnitude; finally, based on the difference in propagation speed between S-waves and P-waves, generate regional ground motion intensity predictions and early warning time windows to drive an automated early warning response within seconds.

2. The method according to claim 1, characterized in that, in Step 1, it is first determined that the main body of the surveillance camera is fixedly installed on the top of the streetlight pole; secondly, the connection between the camera and the pole is simplified to a rigid connection, ignoring any flexibility and vibration that may actually exist; then, the camera's intrinsic parameters are accurately calibrated with a calibration board to ensure the geometric accuracy of the mapping between the image and the physical space; when performing ground vibration inversion analysis, the overall system including the camera and the pole support structure is treated as equivalent to a single-degree-of-freedom structure.

3. The method according to claim 1, characterized in that, in Step 2, the adaptive robust optical flow algorithm is used to stably extract, from traffic scene video streams, the pixel-level displacements of background key points that characterize equipment vibration; specifically, it includes two stages: robust feature point extraction and robust displacement data filtering.

4. The method according to claim 3, characterized in that the robust feature point extraction specifically involves: firstly, using edge detection to identify the outlines of objects in the image; then using a high-precision line segment detection method to extract significant straight line segments from the outlines; and selecting the endpoints of the line segments as key feature points.

5. The method according to claim 4, characterized in that the robust displacement data filtering specifically involves: for the large number of extracted feature points, calculating their pixel displacements in consecutive video frames; using the K-means clustering algorithm to perform statistical analysis on the physical displacements calculated for all feature points, thereby eliminating outliers; and selecting the average displacement of the feature points within the largest cluster as the final effective structural displacement corresponding to that video frame.

6. The method according to claim 5, characterized in that, in Step 2, a pre-trained monocular depth network is used to calculate the spatial distance of key static reference points in real time, and this distance is spatiotemporally fused with the optical flow displacement; based on the calibrated camera intrinsic parameters and imaging geometry principles, the camera's acceleration and displacement vectors in three-dimensional space are calculated, which constitute the direct kinematic response of the structure to ground vibration; this pose information serves as the core input, driving the single-degree-of-freedom motion model to complete the ground acceleration inversion, after which the source parameters are inverted.

7. The method according to claim 1, characterized in that, in Step 4, based on the P-wave first-arrival time differences obtained from multiple spatially distributed stations, the spatial location of the earthquake source is calculated using the inter-station P-wave arrival time difference inversion method; furthermore, by combining the maximum acceleration amplitude measured at each station after the arrival of the P-wave, multi-node amplitude data are integrated to calculate the magnitude, which characterizes the intensity of earthquake energy release; based on the determined source location and magnitude, and according to the inherent difference in the propagation speed of S-waves and P-waves in the strata, the predicted distribution map of ground motion intensity at different locations in the target area is calculated, and a corresponding S-wave early warning time window is generated for each location; finally, based on the generated earthquake intensity prediction and warning time window information, the early warning system drives an automated response mechanism to issue early warning signals to the affected areas within seconds.

8. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1-7.

9. A computer-readable storage medium for storing computer instructions, characterized in that, when the computer instructions are executed by a processor, they implement the steps of the method according to any one of claims 1-7.