Event camera based in-cabin intrusion detection method, apparatus, device and medium

By combining event cameras with grayscale frames and lightweight neural networks, the problem of insufficient accuracy in cabin intrusion detection under low power consumption scenarios is solved, achieving efficient and accurate intrusion detection, which is suitable for vehicle security systems.

CN122244782A · Pending · Publication Date: 2026-06-19 · ARCSOFT CORP LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: ARCSOFT CORP LTD
Filing Date: 2026-02-10
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing technologies struggle to guarantee accuracy in cabin intrusion detection under low-power scenarios. Traditional RGB and IR cameras experience performance degradation at night or during sudden changes in lighting, and are susceptible to environmental interference, resulting in insufficient real-time performance and accuracy.

Method used

Event frames are acquired using an event camera for preliminary analysis, noise is filtered out and connected component analysis is performed, feature recognition is performed by combining grayscale frames, and secondary verification is performed through a lightweight neural network model to ensure the accuracy of intrusion detection.

Benefits of technology

It achieves high-precision intrusion detection under low power consumption conditions, reduces false alarm rate, and improves the real-time performance and reliability of detection, making it suitable for vehicle security systems.

✦ Generated by Eureka AI based on patent content.

Abstract

This application relates to a method, apparatus, device, and medium for cabin intrusion detection based on an event camera. The method includes: acquiring an event stream output by the event camera within a preset time window to obtain event frames; performing preliminary analysis based on the event frames to determine whether a suspected intrusion event exists; when a suspected intrusion event is determined to exist, triggering the event camera to output a grayscale frame corresponding to the suspected intrusion event; performing feature recognition on the grayscale frames; and determining whether an intrusion event exists based on the feature recognition results. This application enables low-power preliminary analysis of the event frames from the event camera, followed by secondary verification of the suspected intrusion event based on feature recognition using the grayscale frames from the event camera. This achieves low-power intrusion detection while maintaining intrusion detection accuracy, solving the problem of difficulty in ensuring intrusion detection accuracy in low-power scenarios.

Description

Technical Field

[0001] This application relates to the field of vehicle technology, and in particular to a method, apparatus, device and medium for cabin intrusion detection based on an event camera.

Background Technology

[0002] In-cabin intrusion detection specifically refers to the technology that automatically identifies and warns of unauthorized entry or suspicious behavior inside the enclosed cabins of vehicles, aircraft, ships, etc. Its core application scenario is vehicle security, aiming to prevent theft of property, illegal stay or concealment of personnel, and acts of vandalism against the vehicle itself.

[0003] Existing single-modal detection solutions often suffer from the following shortcomings: For example, traditional RGB cameras rely on ample lighting, experiencing a sharp performance drop at night or during sudden changes in lighting conditions. Furthermore, their fixed frame rate and continuous high-power operation make it difficult to capture fast-moving targets and meet the requirements for long-term low-power operation. While IR (Infrared) cameras can image in low light, their images are susceptible to interference from ambient heat sources. Similarly limited by their inherent frame rate, they exhibit delays in capturing fast-moving or low-temperature targets, affecting the real-time performance and accuracy of detection. Therefore, it is currently difficult to guarantee the accuracy of intrusion detection in low-power scenarios.

[0004] There is currently no effective solution to the problem of ensuring intrusion detection accuracy in low-power scenarios in related technologies.

Summary of the Invention

[0005] In view of the above technical problems, it is therefore necessary to provide an in-cabin intrusion detection method, apparatus, device, and medium based on an event camera that achieve low power consumption while ensuring intrusion detection accuracy.

[0006] In a first aspect, this embodiment provides an in-cabin intrusion detection method based on an event camera, including:

[0007] acquiring the event stream output by the event camera within a preset time window to obtain event frames;

[0008] performing a preliminary analysis based on the event frames to determine whether a suspected intrusion event exists;

[0009] when a suspected intrusion event is determined to exist, triggering the event camera to output a grayscale frame corresponding to the suspected intrusion event; and

[0010] performing feature recognition on the grayscale frame, and determining whether an intrusion event exists based on the feature recognition result.

[0011] In some embodiments, the preliminary analysis based on the event frame to determine whether a suspected intrusion event exists includes:

[0012] Noise filtering is performed on the event frames;

[0013] Perform connected component analysis on the event frames after noise filtering, and determine whether there are target connected components with an area greater than or equal to a first preset threshold.

[0014] If the target connected component exists, then a suspected intrusion event is determined to exist.

[0015] In some embodiments, the preliminary analysis based on the event frame to determine whether a suspected intrusion event exists includes:

[0016] Perform connected component analysis on the event frames after noise filtering, and determine whether there are target connected components with an area greater than or equal to a first preset threshold.

[0017] The event frames are classified into scenes to obtain scene classification results;

[0018] If the target connected component exists and the scene classification result is an intrusion scene, then it is determined that a suspected intrusion event exists.

[0019] In some embodiments, the step of classifying the event frame to obtain a scene classification result includes:

[0020] The event frames after noise removal are input into the fully trained first neural network model for scene classification, and the scene classification results of intrusion scene or interference scene are output.

[0021] In some embodiments, the step of performing feature recognition on the grayscale frame and determining whether an intrusion event exists based on the feature recognition results includes:

[0022] The grayscale frame is input into a fully trained second neural network model for human feature recognition, and the recognition result of whether a human body exists in the grayscale frame is obtained.

[0023] When the identification result shows that a human body is present in the grayscale frame, an intrusion event is confirmed.

[0024] In some embodiments, the step of inputting the grayscale frame into a fully trained second neural network model for human feature recognition further includes:

[0025] Based on the position of the target connected component in the event frame, the corresponding region of interest is determined on the grayscale frame;

[0026] The region of interest is input into the second neural network model for human feature recognition.

[0027] In some embodiments, the step of performing feature recognition on the grayscale frame and determining whether an intrusion event exists based on the feature recognition results further includes:

[0028] Extract human morphological features from the grayscale frame, including head and shoulder contours and human posture;

[0029] Based on the human morphological characteristics, human movements are determined, and by combining the human morphological characteristics and the human movements, it is confirmed whether it is an intrusion event.

[0030] In a second aspect, this embodiment provides an in-cabin intrusion detection apparatus based on an event camera, comprising:

[0031] a preliminary analysis module, used to acquire the event stream output by the event camera within a preset time window to obtain event frames, and to perform a preliminary analysis based on the event frames to determine whether a suspected intrusion event exists; and

[0032] a secondary verification module, used to trigger the event camera to output a grayscale frame corresponding to the suspected intrusion event when a suspected intrusion event is determined to exist, to perform feature recognition on the grayscale frame, and to determine whether an intrusion event exists based on the feature recognition result.

[0033] In a third aspect, this embodiment provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the in-cabin intrusion detection method based on an event camera described in the first aspect above.

[0034] In a fourth aspect, this embodiment provides a storage medium storing a computer program that, when executed by a processor, implements the in-cabin intrusion detection method based on an event camera described in the first aspect above.

[0035] Compared with related technologies, the intrusion detection method, apparatus, device, and medium based on an event camera provided in this embodiment obtains event frames by acquiring the event stream output by the event camera within a preset time window; performs preliminary analysis based on the event frames to determine whether a suspected intrusion event exists; when a suspected intrusion event is determined to exist, the event camera is triggered to output a grayscale frame corresponding to the suspected intrusion event; feature recognition is performed on the grayscale frame, and the existence of an intrusion event is determined based on the feature recognition result. This embodiment allows for low-power preliminary analysis of the event frames from the event camera, followed by secondary verification of the suspected intrusion event based on feature recognition using the grayscale frame from the event camera. This achieves low-power intrusion detection while maintaining intrusion detection accuracy, solving the problem of difficulty in ensuring intrusion detection accuracy in low-power scenarios.

[0036] Details of one or more embodiments of this application are set forth in the following drawings and description to make other features, objects and advantages of this application more readily apparent.

Description of the Drawings

[0037] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:

[0038] Figure 1 is a hardware structure block diagram of the terminal for an in-cabin intrusion detection method based on an event camera in one embodiment;

[0039] Figure 2 is a flowchart of an in-cabin intrusion detection method based on an event camera in one embodiment;

[0040] Figure 3 is a flowchart of an in-cabin intrusion detection method based on an event camera in another embodiment;

[0041] Figure 4 is a flowchart illustrating the intrusion detection process in one embodiment;

[0042] Figure 5 is a structural block diagram of an in-cabin intrusion detection apparatus based on an event camera in one embodiment.

[0043] In the drawings: 102, processor; 104, memory; 106, transmission device; 108, input/output device; 10, preliminary analysis module; 20, secondary verification module.

Detailed Implementation

[0044] To better understand the purpose, technical solution, and advantages of this application, the application is described and illustrated below in conjunction with the accompanying drawings and embodiments.

[0045] Unless otherwise defined, the technical or scientific terms used in this application shall have the general meaning understood by one of ordinary skill in the art to which this application pertains. Words such as “a,” “an,” “the,” and “these” used in this application do not indicate quantitative limitation and may be singular or plural. The terms “comprising,” “including,” “having,” and any variations thereof used in this application are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a series of steps or modules (units) is not limited to the listed steps or modules (units) but may include steps or modules (units) not listed, or may include other steps or modules (units) inherent to such a process, method, system, product, or device. Words such as “connected,” “linked,” and “coupled” used in this application are not limited to physical or mechanical connections but may include electrical connections, whether direct or indirect. “Multiple” used in this application refers to two or more. “And/or” describes the relationship between related objects, indicating that three relationships may exist; for example, “A and/or B” can represent: A alone, A and B simultaneously, and B alone. Normally, the character “/” indicates that the objects before and after it are in an “or” relationship. The terms “first,” “second,” “third,” etc., used in this application are merely to distinguish similar objects and do not represent a specific order of objects.

[0046] The method embodiments provided in this example can be executed on a terminal, a computer, or a similar computing device. Taking execution on a terminal as an example, Figure 1 is a hardware structure block diagram of the terminal for the in-cabin intrusion detection method based on an event camera in this embodiment. As shown in Figure 1, the terminal may include one or more processors 102 (only one is shown in Figure 1) and a memory 104 for storing data. The processor 102 may be, but is not limited to, a microprocessor (MCU) or a programmable logic device (FPGA). The terminal may also include a transmission device 106 for communication functions and an input/output device 108. Those skilled in the art will understand that the structure shown in Figure 1 is for illustrative purposes only and does not limit the structure of the terminal described above. For example, the terminal may include more or fewer components than shown in Figure 1, or have a configuration different from that shown in Figure 1.

[0047] The memory 104 can be used to store computer programs, such as application software programs and modules, like the computer program corresponding to the intrusion detection method based on an event camera in this embodiment. The processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, thereby implementing the aforementioned method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 104 may further include memory remotely located relative to the processor 102, and these remote memories can be connected to the terminal via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

[0048] The transmission device 106 is used to receive or send data via a network. This network includes a wireless network provided by the terminal's communication provider. In one example, the transmission device 106 includes a Network Interface Controller (NIC), which can connect to other network devices via a base station to communicate with the Internet. In another example, the transmission device 106 can be a Radio Frequency (RF) module used for wireless communication with the Internet.

[0049] To address the issue that current single-modal in-cabin intrusion detection solutions struggle to guarantee accuracy in low-power scenarios, the following embodiments provide an in-cabin intrusion detection method applicable to vehicle cabin security monitoring and adaptable to various vehicle cabin environments.

[0050] This embodiment provides an in-cabin intrusion detection method based on an event camera. Figure 2 is a flowchart of the in-cabin intrusion detection method based on an event camera in this embodiment. As shown in Figure 2, the method includes the following steps:

[0051] Step S201: Obtain the event stream output by the event camera within a preset time window to obtain event frames.

[0052] Specifically, a preset time window (e.g., 10ms) is set for the event camera. The event camera continuously collects dynamic information inside the cabin and outputs a discrete event stream. The discrete events in each time window are accumulated according to their coordinates to generate a binary or grayscale event frame, which reflects the dynamic changes within this time window.
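The accumulation described in [0052] can be illustrated with a minimal sketch. The event tuple layout `(x, y, timestamp_us, polarity)`, the function name `accumulate_events`, and the half-open window convention are assumptions made for illustration; the application does not specify a data format.

```python
from typing import Iterable, List, Tuple

# Hypothetical event layout: (x, y, timestamp in microseconds, polarity).
Event = Tuple[int, int, int, int]


def accumulate_events(
    events: Iterable[Event],
    width: int,
    height: int,
    t_start_us: int,
    window_us: int = 10_000,  # 10 ms, the example window from the text
    binary: bool = True,
) -> List[List[int]]:
    """Accumulate events falling in [t_start_us, t_start_us + window_us) by coordinate."""
    frame = [[0] * width for _ in range(height)]
    t_end_us = t_start_us + window_us
    for x, y, t_us, _polarity in events:
        inside_window = t_start_us <= t_us < t_end_us
        inside_sensor = 0 <= x < width and 0 <= y < height
        if inside_window and inside_sensor:
            # Binary frame: mark the pixel; count frame: add one per event.
            frame[y][x] = 1 if binary else frame[y][x] + 1
    return frame


events = [(1, 0, 100, 1), (1, 0, 200, -1), (2, 1, 9_999, 1), (0, 0, 10_000, 1)]
frame = accumulate_events(events, width=3, height=2, t_start_us=0)
count_frame = accumulate_events(events, width=3, height=2, t_start_us=0, binary=False)
```

The last event carries a timestamp equal to the end of the window, so under the half-open convention assumed here it belongs to the next event frame.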

[0053] Step S202: Perform preliminary analysis based on the event frames to determine whether there is a suspected intrusion event.

[0054] Specifically, a preliminary analysis of moving object detection in event frames can be performed using simple image processing algorithms or feature extraction algorithms to assess whether a moving object exists in the event frame.

[0055] Since the event stream signal of an EVS (Event-Based Vision Sensor) consists of events triggered by a large number of pixels, although it has advantages in capturing dynamic information, it inevitably generates a large number of relatively uniform, low-density background noise events, such as thermal noise and random triggering of the photosensitive device itself. Therefore, before performing moving object detection, noise in the event frames can be removed to improve detection accuracy and reduce redundant computation.

[0056] Step S203: When it is determined that there is a suspected intrusion event, the event camera is triggered to output a grayscale frame corresponding to the suspected intrusion event.

[0057] Specifically, after the event camera receives a judgment signal of a suspected intrusion event, it directly triggers the APS (Active Pixel Sensor) module to output a grayscale frame corresponding to the current judgment time. The grayscale frame contains rich scene texture information and is only output during the suspected intrusion into the cabin.

[0058] Step S204: Perform feature recognition on the grayscale frame, and determine whether an intrusion event exists based on the feature recognition results.

[0059] Specifically, the suspected intrusion events obtained in the above steps may include both real intrusion behavior and environmental interference, and grayscale frames provide relatively stable and reliable image information with richer texture features than event frames. Therefore, after a suspected intrusion event is determined based on the event frames, a finer secondary verification is performed on the grayscale frame. Feature recognition (including morphological feature extraction, human action judgment, etc.) is performed on the grayscale frame to obtain a feature recognition result indicating whether a human intrusion is detected. For example, the identified features are compared with a feature library, and the degree of feature matching is used as the feature recognition result. If the recognition result indicates a human intrusion, an intrusion event is confirmed and the vehicle security mechanism is triggered (e.g., sounding an in-cabin audible and visual alarm, pushing warning information to the owner's mobile device, and starting high-definition video recording and storage). If the recognition result indicates no human intrusion, meaning the trigger was environmental interference such as debris or external light and shadow, the process is terminated and the event camera continues to output event streams for continuous monitoring.

[0060] Through the above steps, a preliminary low-power analysis is first performed on the event frames of the event camera, fully utilizing the high temporal resolution and low power consumption characteristics of the event camera in dynamic scenes. Then, a secondary verification of feature recognition is performed by combining the grayscale frames of the event camera, fully utilizing the texture image features of the grayscale frames. This achieves low-power intrusion detection while ensuring intrusion detection accuracy. Compared with related technologies, event cameras have the advantages of high temporal resolution and low power consumption, but their signals are easily affected by environmental changes such as lighting, leading to a large number of false alarms in intrusion detection. This embodiment, through the coordinated use of event frames and grayscale frames, logically implements a two-level detection mechanism from rapid preliminary analysis to accurate secondary verification, achieving reliable and accurate intrusion detection in complex vehicle environments and solving the problem of difficulty in ensuring intrusion detection accuracy in low-power scenarios.
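The control flow of steps S202 to S204 can be sketched as a two-stage gate. This is an illustrative outline only: `detect_intrusion` and its three callables are hypothetical names standing in for the preliminary analysis, the APS grayscale trigger, and the feature recognition, none of which are defined as code in the application.

```python
from typing import Callable, List

Frame = List[List[int]]


def detect_intrusion(
    event_frame: Frame,
    is_suspected: Callable[[Frame], bool],   # stage 1: cheap event-frame analysis
    capture_grayscale: Callable[[], Frame],  # triggers the grayscale output on demand
    is_human: Callable[[Frame], bool],       # stage 2: feature recognition
) -> bool:
    """Return True only if both the preliminary and the secondary stage agree."""
    if not is_suspected(event_frame):
        # No suspected intrusion: the grayscale frame is never requested.
        return False
    gray_frame = capture_grayscale()
    return is_human(gray_frame)


calls = []


def fake_capture() -> Frame:
    calls.append("aps")  # record each grayscale trigger
    return [[128]]


def suspected(frame: Frame) -> bool:
    return sum(map(sum, frame)) >= 3  # toy rule: at least 3 active pixels


quiet = [[0, 0], [0, 0]]
busy = [[1, 1], [1, 1]]
r_quiet = detect_intrusion(quiet, suspected, fake_capture, lambda g: True)
r_human = detect_intrusion(busy, suspected, fake_capture, lambda g: True)
r_other = detect_intrusion(busy, suspected, fake_capture, lambda g: False)
```

In this sketch the quiet frame never reaches the second stage, which mirrors the power-saving argument of [0060]: the costlier verification runs only when the first stage fires.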

[0061] In some embodiments, step S202 above involves preliminary analysis based on event frames to determine whether a suspected intrusion event exists, including the following steps:

[0062] Noise is filtered out from the event frames; connected component analysis is performed on the noise-filtered event frames, and it is determined whether there is a target connected component with an area greater than or equal to a first preset threshold; if a target connected component exists, it is determined that there is a suspected intrusion event.

[0063] Specifically, since the event stream signal of the event camera consists of a large number of pixel-triggered events, it generates a large number of relatively uniform, low-density background noise events. In this embodiment, during the initial analysis, noise filtering is first performed on the event frames. The noise filtering methods include, but are not limited to, image processing operations and deep learning denoising methods. Considering that the main requirements of the vehicle-mounted scenario are low power consumption and real-time performance, basic image processing operations can be used to filter noise from the event frames, such as morphological operations (e.g., opening and closing operations), connected component analysis, neighborhood filtering, and frequency domain filtering. Preferably, morphological opening and closing operations can be used, which can efficiently filter out isolated noise points and small noise clusters while maintaining the integrity of the potential target contour. Specifically, median filtering is first used to filter out the background noise of the event frames, then opening operations are used to filter out local noise, and finally closing operations are used to retain the potential target contour.

[0064] For the noise-filtered event frames, a connected component analysis algorithm is used to traverse all non-zero pixels in the event frame, dividing adjacent non-zero pixels into the same connected component, and calculating the actual area (number of pixels) of each connected component by pixel counting. A first preset threshold is calibrated based on the in-vehicle cabin scene. If at least one target connected component in the event frame has an area greater than or equal to the first preset threshold, it is determined that a suspected intrusion event exists, triggering subsequent processes; if the area of all connected components is less than the first preset threshold, it is determined that there is no intrusion, the process terminates, and the event stream continues to collect data. The first preset threshold can be calibrated using the corresponding pixel area of the smallest human intrusion area (such as an arm or head) within the cabin.

[0065] By employing morphological denoising and connected component area thresholding in this embodiment, a large number of minor environmental disturbances can be quickly filtered out at very low computational cost. This enables a low-power preliminary analysis of whether a suspected intrusion event exists, thereby significantly reducing how often the subsequent processing steps are triggered.
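The area test of [0064] can be sketched in plain Python. Two points here are assumptions rather than the application's method: a simple neighborhood filter stands in for the preferred median, opening, and closing chain, and adjacency is taken to mean 8-connectivity, which the text leaves open. The function names are hypothetical.

```python
from collections import deque
from typing import List

Frame = List[List[int]]
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if dy or dx]


def neighbourhood_filter(frame: Frame, min_neighbours: int = 1) -> Frame:
    """Drop active pixels with fewer than min_neighbours active 8-neighbours."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not frame[y][x]:
                continue
            n = sum(
                1
                for dy, dx in OFFSETS
                if 0 <= y + dy < h and 0 <= x + dx < w and frame[y + dy][x + dx]
            )
            if n >= min_neighbours:
                out[y][x] = 1
    return out


def component_areas(frame: Frame) -> List[int]:
    """Pixel count of each 8-connected component, in raster-scan order."""
    h, w = len(frame), len(frame[0])
    seen = [[False] * w for _ in range(h)]
    areas = []
    for y in range(h):
        for x in range(w):
            if not frame[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            area = 0
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                area += 1
                for dy, dx in OFFSETS:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and frame[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def has_target_component(frame: Frame, area_threshold: int) -> bool:
    """True if any denoised component has area >= the first preset threshold."""
    return any(a >= area_threshold for a in component_areas(neighbourhood_filter(frame)))


frame = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
areas_after_filter = component_areas(neighbourhood_filter(frame))
```

On an embedded target the same steps would more likely be run with an optimized image-processing library; the pure-Python version is only meant to make the logic explicit.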

[0066] In some embodiments, step S202 above involves preliminary analysis based on event frames to determine whether a suspected intrusion event exists, including the following steps:

[0067] Perform connected component analysis on the noise-filtered event frames and determine whether there are target connected components with an area greater than or equal to a first preset threshold; classify the event frames into scenes and obtain scene classification results; if there are target connected components and the scene classification result is an intrusion scene, then it is determined that there is a suspected intrusion event.

[0068] Specifically, considering that event stream signals are extremely sensitive to changes in lighting—for example, flickering interior lights or external light sources can trigger numerous false events, leading to false alarms in intrusion detection—semantic-level filtering is further added after event domain noise filtering. This involves classifying suspicious intrusion scenarios based on event frames to quickly distinguish between intrusion scenarios and interference scenarios caused by moving objects outside the vehicle or changes in lighting. This can be achieved by inputting the noise-filtered event frames into the classification network.

[0069] The method described in the above embodiments is used to determine whether a target connected component with an area greater than or equal to a first preset threshold exists. A suspected intrusion event is only determined when a target connected component exists and the scene classification result is an intrusion scene.

[0070] By introducing scene classification after determining the existence of a target connected component in this embodiment, interference inside and outside the cabin can be further distinguished. This allows most false detections caused by environmental interference to be filtered out early, based on the event frames alone, effectively reducing how often secondary verification and actual alarm reporting are triggered, thereby further reducing power consumption while ensuring detection accuracy.

[0071] In some embodiments, the above-described scene classification of event frames to obtain scene classification results includes the following steps:

[0072] The event frames after noise removal are input into the fully trained first neural network model for scene classification, and the scene classification results of intrusion scene or interference scene are output.

[0073] Specifically, the first neural network model is a classification network, for example a lightweight convolutional neural network (CNN). During the training phase, large-scale simulated environmental-change data is introduced, including event frame signal responses under different light intensities, flicker interference, and background dynamic noise, as well as data augmented with lighting and noise variations. The diversity of the dataset improves the robustness of the model.

[0074] The event frames after noise removal are input into the first neural network model. Features are extracted through multiple convolutions and pooling. Finally, a two-dimensional vector is output through a fully connected layer, representing the probabilities of "intrusion scene" and "interference scene". The category with the higher probability is taken as the scene classification result.
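The final decision step of [0074] can be sketched as follows. The text states only that the fully connected layer yields a two-dimensional probability vector; the use of a softmax to obtain it, the label order, and the function names are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Assumed label order for the two outputs of the fully connected layer.
LABELS = ("intrusion scene", "interference scene")


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_scene(logits: Sequence[float]) -> Tuple[str, List[float]]:
    """Pick the category with the higher probability as the classification result."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


label, probs = classify_scene([2.0, 0.5])
other_label, _ = classify_scene([0.1, 3.0])
```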

[0075] By employing a lightweight neural network in this embodiment, the feature information of event frames can be effectively utilized, and computing resources can be saved to meet the requirements of low-power operation and ensure the computing efficiency on the vehicle embedded platform.

[0076] In some embodiments, step S204 above involves feature recognition of the grayscale frame and determining whether an intrusion event exists based on the feature recognition results, including the following steps:

[0077] The grayscale frame is input into the trained second neural network model for human feature recognition, and the recognition result of whether a human body exists in the grayscale frame is obtained; when the recognition result is that a human body exists in the grayscale frame, an intrusion event is confirmed.

[0078] Specifically, the single-channel grayscale frames captured on demand by the event camera are directly input into the fully trained second neural network model. The model extracts global and local detail features (such as human contours and textures) of the image through an attention mechanism and outputs the recognition result of whether a human body exists. When a human body is detected, an intrusion event is confirmed; when no human body is detected, it is determined to be non-human interference, the process terminates, and the event camera resumes event stream-only acquisition.

[0079] The second neural network model employs a lightweight Vision Transformer (ViT) model, balancing global feature capture capabilities with low power consumption. Compared to traditional CNNs, it achieves higher accuracy in recognizing human features in grayscale images. During the training phase, grayscale frame sample images are labeled by image category: images containing intruding human body parts are labeled as positive samples, and images without an intruding human body, such as backgrounds with only lighting changes or human figures moving outside the cabin, are labeled as negative samples. Furthermore, rich real-world data and data augmentation techniques, such as enhancing image contrast or reducing noise, are used to train the classification model to learn and distinguish the features of an intruding human body.

[0080] The introduction of a lightweight ViT network in this embodiment better balances intrusion recall against the real-time and low-power requirements of the scenario. Analyzing grayscale frames ensures detection accuracy and effectively excludes non-human intrusion situations. Furthermore, through data augmentation training, the model is robust to noise and to light and shadow interference in grayscale images, and can accurately distinguish between human and non-human targets, solving the problem of false alarms in detection that relies on the event stream alone.

[0081] Furthermore, human morphological features are extracted from the grayscale frames, including head and shoulder contours and human posture; human actions are determined based on human morphological features, and the combination of human morphological features and human actions is used to confirm whether it is an intrusion event.

[0082] Specifically, based on human body recognition, this embodiment further adds a detection dimension for behavior discrimination. When performing feature recognition on grayscale frames, it not only determines whether a human body exists, but also extracts more refined human morphological features. This can be achieved by adding or switching branches in the second neural network model, for example, outputting the coordinates of key human body points (e.g., head, shoulder, and elbow joints), or segmenting the human body contour. Based on these features, it is possible to analyze whether the head and shoulder contours are complete and stable, or to analyze human posture (e.g., upright, bent, and extended). Furthermore, by combining morphological feature changes across multiple consecutive frames, human actions (e.g., climbing, knocking, waving) can be determined. Finally, by comprehensively considering human morphological features and human actions, accurate identification of intrusion events with high-risk intrusion behaviors can be achieved.

[0083] Taking the second neural network model (lightweight ViT) as an example, while extracting global features it outputs human morphological feature vectors through its intermediate feature layers. Using the model's attention weight distribution, it locates the key feature points of the head and shoulders (the top of the head and the left and right acromions) and fits these points to form the circumscribed polygon of the head-and-shoulder contour. It extracts key skeletal points of the torso and limbs (e.g., neck, waist, elbow, knee) to construct posture feature vectors that reflect the overall posture of the human body (e.g., upright, bent, or extended). Based on the extracted morphological features and posture vectors, and taking into account the characteristics of intrusion scenarios in the vehicle cabin, it sets action determination rules (common logic for action recognition) for actions such as intrusion, bending, and climbing, and performs dual verification by combining morphological features and human actions to confirm whether an intrusion event has occurred.
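As an illustration of how a posture label might be derived from skeletal key points, the following is a minimal sketch and not the method of this application. It assumes the key points have already been produced by the model as (x, y) image coordinates with y growing downward. The function names, the 30-degree bend threshold, and the arm-reach ratio are hypothetical values chosen only for the example.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y grows downward


def torso_angle_deg(neck: Point, waist: Point) -> float:
    """Angle between the neck-to-waist vector and the downward vertical, in degrees."""
    dx = waist[0] - neck[0]
    dy = waist[1] - neck[1]
    return math.degrees(math.atan2(abs(dx), dy))


def classify_posture(
    keypoints: Dict[str, Point],
    bent_angle_deg: float = 30.0,  # hypothetical threshold
    reach_ratio: float = 1.0,      # hypothetical threshold
) -> str:
    """Return 'upright', 'bent', 'extended', or 'unknown' from three key points."""
    neck = keypoints["neck"]
    waist = keypoints["waist"]
    elbow = keypoints["elbow"]
    torso_len = math.dist(neck, waist)
    if torso_len == 0:
        return "unknown"
    # An elbow far to the side of the neck, relative to torso length, is
    # treated here as an extended (reaching) posture.
    if abs(elbow[0] - neck[0]) / torso_len > reach_ratio:
        return "extended"
    if torso_angle_deg(neck, waist) > bent_angle_deg:
        return "bent"
    return "upright"
```

A real posture feature vector would likely use more key points (e.g., knee and shoulder positions); three are used here only to keep the geometry easy to follow.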

[0084] In this embodiment, a dual verification is performed by combining human morphological characteristics and human movements to confirm whether a suspected intrusion event is a real intrusion event, thereby further reducing false alarms.

[0085] In some embodiments, the above steps of inputting grayscale frames into a fully trained second neural network model for human feature recognition also include the following steps:

[0086] Based on the position of the target connected component in the event frame, the corresponding region of interest is determined on the grayscale frame; the region of interest is then input into the second neural network model for human feature recognition.

[0087] Specifically, after triggering the output grayscale frame, based on the position coordinates of the target connected components identified in the preliminary analysis of the event frame, and combined with the pixel alignment relationship between the event frame and the grayscale frame within the event camera, a corresponding region of interest (ROI) is mapped onto the grayscale frame. The ROI can be slightly larger than the original connected component bounding box to ensure complete target capture. Subsequently, the ROI is cropped and extracted from the grayscale frame, scaled to the input size specified by the model, and then fed into the second neural network model for human feature recognition.
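The ROI mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions: the event frame and the grayscale frame are taken to share the same pixel grid, frames are plain nested lists of intensity values, the connected component box is non-empty, and the 20% margin and the 8x8 model input size are hypothetical. Nearest-neighbor scaling is used only to keep the example dependency-free.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); x1 and y1 are exclusive
Image = List[List[int]]          # rows of grayscale values


def expand_box(box: Box, width: int, height: int, margin: float = 0.2) -> Box:
    """Grow the box by `margin` of its size on every side, clamped to the frame."""
    x0, y0, x1, y1 = box
    dx = int(round((x1 - x0) * margin))
    dy = int(round((y1 - y0) * margin))
    return (max(0, x0 - dx), max(0, y0 - dy), min(width, x1 + dx), min(height, y1 + dy))


def crop(frame: Image, box: Box) -> Image:
    """Cut the box out of the frame."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def resize_nearest(img: Image, out_w: int, out_h: int) -> Image:
    """Scale an image to out_w x out_h with nearest-neighbor sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def extract_roi(gray: Image, component_box: Box,
                model_size: Tuple[int, int] = (8, 8), margin: float = 0.2) -> Image:
    """Map a connected-component box onto the grayscale frame and prepare model input."""
    height, width = len(gray), len(gray[0])
    box = expand_box(component_box, width, height, margin)
    return resize_nearest(crop(gray, box), model_size[0], model_size[1])
```

Clamping the expanded box to the frame keeps the crop valid when the target touches the image border, which is common for a hand or head entering through a window.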

[0088] In this embodiment, determining the region of interest in the grayscale frame for human body recognition enables the model to focus on the region of interest for recognition, avoiding full-image traversal calculations, improving detection efficiency, and eliminating interference from non-target areas (such as static backgrounds like seats and dashboards), reducing the amount of calculation for invalid areas, making it more suitable for scenarios with limited resources in vehicle terminals.

[0089] In some embodiments, step S204 above, in which feature recognition is performed on the grayscale frame and the existence of an intrusion event is determined based on the feature recognition results, further includes the following steps:

[0090] Human morphological features are extracted from grayscale frames, including head and shoulder contours and human posture. Human actions are determined based on human morphological features, and the combination of human morphological features and human actions is used to confirm whether it is an intrusion event.

[0091] Specifically, human morphological features in grayscale frames can be extracted based on deep learning, traditional visual processing, or a combination of deep learning and traditional visual processing. For example, a lightweight feature extraction network can be used to directly regress human keypoints, pose vectors, or semantic feature maps from grayscale frames; or, a predefined human geometric model (e.g., head-shoulder model and torso-limb model) can be used to fit the model in the image and extract features such as keypoint coordinates, contour parameters, and pose angles through algorithms such as edge detection, contour analysis, and template matching.

[0092] Based on these human morphological features, we can analyze whether the head and shoulder contours are complete and stable, or analyze human posture (e.g., upright, bent, and extended). Furthermore, by combining morphological feature changes across multiple frames, we can determine human actions (e.g., climbing, knocking, waving). Finally, by integrating human morphological features and human actions, and considering the characteristics of intrusion scenarios within a vehicle cabin, we establish action determination rules (common logic for action recognition), such as intrusion actions, bending over actions, and climbing actions. By combining morphological features and human actions for dual verification, we analyze human behavior and confirm whether it is an intrusion event.
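The dual verification described above can be illustrated with a small rule-based sketch. It is an assumption-laden example, not the rules of this application: the per-frame fields, the action names, the 15-pixel head-rise threshold, the 80% contour stability ratio, and the choice of which actions count as high-risk are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameFeatures:
    contour_complete: bool  # head-and-shoulder contour found and complete
    posture: str            # e.g. "upright", "bent", "extended"
    head_y: float           # head position in image rows (y grows downward)


# Hypothetical set of actions treated as high-risk in this sketch.
HIGH_RISK_ACTIONS = {"climbing", "bending_over"}


def determine_action(frames: List[FrameFeatures], min_rise: float = 15.0) -> str:
    """Infer an action from how morphological features change across frames."""
    if len(frames) < 2:
        return "none"
    rise = frames[0].head_y - frames[-1].head_y  # positive when the head moves up
    bent_count = sum(f.posture == "bent" for f in frames)
    if rise >= min_rise and bent_count * 2 >= len(frames):
        return "climbing"
    if frames[0].posture == "upright" and frames[-1].posture == "bent":
        return "bending_over"
    return "none"


def morphology_stable(frames: List[FrameFeatures], min_ratio: float = 0.8) -> bool:
    """True when the contour is complete in at least `min_ratio` of the frames."""
    if not frames:
        return False
    return sum(f.contour_complete for f in frames) / len(frames) >= min_ratio


def is_intrusion(frames: List[FrameFeatures]) -> bool:
    """Dual verification: stable morphology and a high-risk action are both required."""
    return morphology_stable(frames) and determine_action(frames) in HIGH_RISK_ACTIONS
```

Requiring both conditions means that a briefly visible figure with no high-risk action, or an apparent action without a stable human contour, is not confirmed as an intrusion event.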

[0093] This embodiment extends the feature recognition method for grayscale frames, enabling the determination of human body movements based on human morphological characteristics, and further confirming the existence of an intrusion event from the perspective of whether there are high-risk intrusion behaviors.

[0094] The present embodiment will now be described and illustrated through preferred embodiments.

[0095] Figure 3 is a flowchart of the cabin intrusion detection method based on an event camera in this embodiment. As shown in Figure 3, the method includes the following steps:

[0096] Step S301: Obtain the event stream output by the event camera within a preset time window, obtain the event frame, and perform noise filtering on the event frame.

[0097] Step S302: Perform connected component analysis on the event frame after noise filtering, and determine whether there is a target connected component with an area greater than or equal to a first preset threshold.

[0098] Step S303: Input the noise-filtered event frame into the fully trained first neural network model for scene classification, and output a scene classification result of either intrusion scene or interference scene.

[0099] Step S304: If a target connected component exists and the scene classification result is an intrusion scene, then it is determined that a suspected intrusion event exists.

[0100] Step S305: When a suspected intrusion event is detected, the event camera is triggered to output a grayscale frame corresponding to the suspected intrusion event.

[0101] Step S306: Based on the position of the target connected component in the event frame, determine the corresponding region of interest in the grayscale frame; input the region of interest into the second neural network model for human feature recognition, and obtain the recognition result of whether a human body exists in the grayscale frame, so as to determine whether an intrusion event exists.
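Steps S301 and S302 can be sketched in a few lines. The sketch below is illustrative only and makes several assumptions not stated in this application: events are (x, y, t, polarity) tuples, the event frame is a binary image, noise filtering removes active pixels with no active 8-neighbor, and connectivity is 8-connected. All function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

Event = Tuple[int, int, float, int]  # (x, y, timestamp, polarity)
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def accumulate_events(events: List[Event], width: int, height: int,
                      t_start: float, t_end: float) -> List[List[int]]:
    """Build a binary event frame from events inside the window [t_start, t_end)."""
    frame = [[0] * width for _ in range(height)]
    for x, y, t, _polarity in events:
        if t_start <= t < t_end and 0 <= x < width and 0 <= y < height:
            frame[y][x] = 1
    return frame


def filter_isolated(frame: List[List[int]], min_neighbors: int = 1) -> List[List[int]]:
    """Drop active pixels that have fewer than `min_neighbors` active neighbors."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if frame[y][x]:
                n = sum(frame[y + dy][x + dx] for dy, dx in NEIGHBORS
                        if 0 <= y + dy < h and 0 <= x + dx < w)
                if n >= min_neighbors:
                    out[y][x] = 1
    return out


def connected_components(frame: List[List[int]]) -> List[Dict]:
    """Label 8-connected components; return each one's area and bounding box."""
    h, w = len(frame), len(frame[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not frame[y][x] or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            area, x0, y0, x1, y1 = 0, x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                area += 1
                x0, y0, x1, y1 = min(x0, cx), min(y0, cy), max(x1, cx), max(y1, cy)
                for dy, dx in NEIGHBORS:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and frame[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append({"area": area, "box": (x0, y0, x1 + 1, y1 + 1)})
    return components


def find_target_components(frame: List[List[int]], area_threshold: int) -> List[Dict]:
    """Keep components whose area is at least the first preset threshold."""
    return [c for c in connected_components(frame) if c["area"] >= area_threshold]
```

The bounding box returned for each target component is the position information that step S306 uses to locate the region of interest on the grayscale frame.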

[0102] Figure 4 is a flowchart illustrating the cabin intrusion detection process in this embodiment. As shown in Figure 4, after noise is filtered from the input event frame, the system enters the first stage, moving object detection, and determines whether there is a target connected component with an area greater than or equal to a first preset threshold. If no such component exists, it is determined that there is no intrusion. If one exists, the system enters the second stage, scene classification, to determine whether the event frame shows an intrusion scene or an interference scene. If it is an interference scene, it is determined that there is no intrusion. If it is an intrusion scene, the system enters the third stage, human feature detection, and performs human feature recognition on the grayscale frame. If no human body is recognized, it is determined that there is environmental interference. If a human body is recognized, it is determined that there is a real intrusion event.
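The three-stage decision flow can be expressed as a short early-exit cascade. The sketch below is a hedged illustration: the four callables stand in for the connected component check, the first neural network model, the grayscale frame trigger, and the second neural network model, and their names and signatures are hypothetical.

```python
from enum import Enum
from typing import Any, Callable


class Verdict(Enum):
    NO_INTRUSION = "no intrusion"
    ENVIRONMENTAL_INTERFERENCE = "environmental interference"
    INTRUSION = "intrusion"


def detect_intrusion(
    event_frame: Any,
    has_target_component: Callable[[Any], bool],  # stage 1: moving object detection
    classify_scene: Callable[[Any], str],         # stage 2: "intrusion" or "interference"
    capture_gray_frame: Callable[[], Any],        # triggers the grayscale output
    contains_human: Callable[[Any], bool],        # stage 3: human feature detection
) -> Verdict:
    """Run the stages in order and stop at the first one that rules out intrusion."""
    if not has_target_component(event_frame):
        return Verdict.NO_INTRUSION
    if classify_scene(event_frame) != "intrusion":
        return Verdict.NO_INTRUSION
    # The grayscale frame is requested only after both event-frame stages pass,
    # so the more expensive path runs only for suspected intrusion events.
    gray_frame = capture_gray_frame()
    if not contains_human(gray_frame):
        return Verdict.ENVIRONMENTAL_INTERFERENCE
    return Verdict.INTRUSION
```

Passing the stages in as callables keeps the control flow separate from the models, so each stage can be replaced or tested on its own.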

[0103] In this embodiment, rapid noise filtering and low-power target detection are achieved based on event frames. A fine-grained secondary verification is then performed by recognizing the features of an intruding human body in the grayscale frames. Together these form a three-level detection mechanism, which not only significantly reduces the false detection rate and power consumption, but also enables accurate identification of, and timely response to, intrusion events, thereby improving the overall reliability and practicality of the cabin security monitoring system.

[0104] It should be noted that the steps shown in the above process or in the flowchart of the accompanying figures can be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in a different order than that shown here.

[0105] This embodiment also provides an intrusion detection device based on an event camera, which is used to implement the above embodiments and preferred embodiments; details already described will not be repeated. The terms "module," "unit," "subunit," etc., used below refer to combinations of software and / or hardware that perform predetermined functions. Although the device described in the following embodiments is preferably implemented in software, hardware implementation, or a combination of software and hardware, is also possible and contemplated.

[0106] Figure 5 is a structural block diagram of the cabin intrusion detection device based on an event camera in this embodiment. As shown in Figure 5, the device includes:

[0107] The preliminary analysis module 10 is used to acquire the event stream output by the event camera within a preset time window to obtain event frames, and to perform preliminary analysis based on the event frames to determine whether a suspected intrusion event exists.

[0108] The secondary verification module 20 is used to trigger the event camera to output a grayscale frame corresponding to the suspected intrusion event when a suspected intrusion event is detected; to perform feature recognition on the grayscale frame, and to determine whether an intrusion event exists based on the feature recognition results.

[0109] The device provided in this embodiment first performs a low-power preliminary analysis of the event frames from the event camera, making full use of the event camera's high temporal resolution and low power consumption in dynamic scenes. It then uses the grayscale frames from the event camera for further feature recognition and secondary verification, making full use of the texture information in the grayscale frames. This achieves low-power intrusion detection while ensuring intrusion detection accuracy. Event cameras offer high temporal resolution and low power consumption, but their signals are easily affected by environmental changes such as lighting, which leads to a large number of false alarms in intrusion detection. Through the coordinated use of event frames and grayscale frames, this embodiment implements a two-level detection mechanism that proceeds from rapid preliminary analysis to accurate secondary verification, achieving reliable and accurate intrusion detection in complex vehicle environments and solving the problem of difficulty in ensuring intrusion detection accuracy in low-power scenarios.

[0110] It should be noted that the above modules can be functional modules or program modules, and can be implemented through software or hardware. For modules implemented through hardware, the above modules can reside in the same processor; or the above modules can be located in different processors in any combination.

[0111] This embodiment also provides a computer device, including a memory and a processor, wherein the memory stores a computer program and the processor is configured to run the computer program to perform the steps in any of the above method embodiments.

[0112] Optionally, the computer device may further include a transmission device and an input / output device, wherein the transmission device is connected to the processor and the input / output device is connected to the processor.

[0113] It should be noted that the specific examples in this embodiment can refer to the examples described in the above embodiments and optional implementations, and will not be repeated in this embodiment.

[0114] Furthermore, in conjunction with the event camera-based cabin intrusion detection method provided in the above embodiments, this embodiment can also provide a storage medium for implementation. This storage medium stores a computer program; when executed by a processor, the computer program implements any of the event camera-based cabin intrusion detection methods described in the above embodiments.

[0115] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties.

[0116] It should be understood that the specific embodiments described herein are merely illustrative of the application and not intended to limit it. All other embodiments derived by those skilled in the art based on the embodiments provided in this application without inventive effort are within the scope of protection of this application.

[0117] Obviously, the accompanying drawings are merely some examples or embodiments of this application. Those skilled in the art can apply this application to other similar situations based on these drawings without any creative effort. Furthermore, it is understood that although the work done in this development process may be complex and lengthy, for those skilled in the art, certain design, manufacturing, or production modifications made based on the technical content disclosed in this application are merely conventional technical means and should not be considered as insufficient disclosure of this application.

[0118] The term "embodiment" in this application refers to a specific feature, structure, or characteristic described in connection with an embodiment that may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily imply the same embodiment, nor does it imply that it is mutually exclusive with or alternative to other embodiments. It will be clearly or implicitly understood by those skilled in the art that the embodiments described in this application may be combined with other embodiments without conflict.

[0119] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of patent protection. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the appended claims.

Claims

1. A cabin intrusion detection method based on an event camera, characterized in that the method comprises: acquiring an event stream output by the event camera within a preset time window to obtain event frames; performing preliminary analysis based on the event frames to determine whether a suspected intrusion event exists; when a suspected intrusion event is determined to exist, triggering the event camera to output a grayscale frame corresponding to the suspected intrusion event; and performing feature recognition on the grayscale frame, and determining whether an intrusion event exists based on the feature recognition results.

2. The cabin intrusion detection method based on an event camera according to claim 1, characterized in that performing preliminary analysis based on the event frames to determine whether a suspected intrusion event exists comprises: performing noise filtering on the event frames; performing connected component analysis on the noise-filtered event frames, and determining whether there is a target connected component with an area greater than or equal to a first preset threshold; and if the target connected component exists, determining that a suspected intrusion event exists.

3. The cabin intrusion detection method based on an event camera according to claim 2, characterized in that performing preliminary analysis based on the event frames to determine whether a suspected intrusion event exists comprises: performing connected component analysis on the noise-filtered event frames, and determining whether there is a target connected component with an area greater than or equal to a first preset threshold; performing scene classification on the event frames to obtain a scene classification result; and if the target connected component exists and the scene classification result is an intrusion scene, determining that a suspected intrusion event exists.

4. The cabin intrusion detection method based on an event camera according to claim 3, characterized in that performing scene classification on the event frames to obtain a scene classification result comprises: inputting the noise-filtered event frames into a fully trained first neural network model for scene classification, and outputting a scene classification result of either intrusion scene or interference scene.

5. The cabin intrusion detection method based on an event camera according to any one of claims 1 to 4, characterized in that performing feature recognition on the grayscale frame and determining whether an intrusion event exists based on the feature recognition results comprises: inputting the grayscale frame into a fully trained second neural network model for human feature recognition to obtain a recognition result indicating whether a human body exists in the grayscale frame; and when the recognition result indicates that a human body exists in the grayscale frame, determining that an intrusion event exists.

6. The cabin intrusion detection method based on an event camera according to claim 5, characterized in that inputting the grayscale frame into the fully trained second neural network model for human feature recognition further comprises: determining, based on the position of the target connected component in the event frame, the corresponding region of interest on the grayscale frame; and inputting the region of interest into the second neural network model for human feature recognition.

7. The cabin intrusion detection method based on an event camera according to claim 1, characterized in that performing feature recognition on the grayscale frame and determining whether an intrusion event exists based on the feature recognition results further comprises: extracting human morphological features from the grayscale frame, the human morphological features including head and shoulder contours and human posture; and determining human actions based on the human morphological features, and confirming whether an intrusion event exists by combining the human morphological features and the human actions.

8. A cabin intrusion detection device based on an event camera, characterized in that the device comprises: a preliminary analysis module, used to acquire an event stream output by the event camera within a preset time window to obtain event frames, and to perform preliminary analysis based on the event frames to determine whether a suspected intrusion event exists; and a secondary verification module, used to trigger the event camera to output a grayscale frame corresponding to the suspected intrusion event when a suspected intrusion event is determined to exist, to perform feature recognition on the grayscale frame, and to determine whether an intrusion event exists based on the feature recognition results.

9. A computer device, comprising a memory and a processor, characterized in that the memory stores a computer program, and the processor is configured to run the computer program to perform the cabin intrusion detection method based on an event camera according to any one of claims 1 to 7.

10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the cabin intrusion detection method based on an event camera according to any one of claims 1 to 7.