Intelligent monitoring and early warning system for foreign objects on locomotive roof

By installing high-definition cameras beside the locomotive entry track and connecting them to an embedded processing chassis that combines an FPGA for image registration with an NPU accelerator card for real-time processing, the system addresses the accuracy, real-time performance, and scalability problems of locomotive roof foreign object detection, enabling efficient, low-cost multi-track detection and instant alarms.

CN224356172U · Active · Publication Date: 2026-06-12 · BEIJING PILOT HUIZHI TECH CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Utility models(China)
Current Assignee / Owner
BEIJING PILOT HUIZHI TECH CO LTD
Filing Date
2025-07-14
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

In existing technologies, locomotive roof foreign object detection relies on manual inspection, which is inefficient and has a high missed-detection rate. Fixed-camera solutions have long response delays and cannot meet real-time early warning requirements. Single-camera architectures have limited scalability and insufficient real-time performance, making high-precision, low-cost multi-track detection difficult to achieve.

Method used

It employs two high-definition cameras combined with an RTSP protocol encoding chip, FPGA coordinate mapping and NPU acceleration card in an embedded processing chassis for image registration, and combines an audible and visual alarm and a mobile communication module to achieve multi-angle image acquisition, real-time processing and instant alarm.

Benefits of technology

It improves detection accuracy to over 90%, shortens alarm response time to within 5 seconds, reduces hardware costs by 60%, and provides multi-track coverage with dual on-site and remote alarms, meeting the real-time needs of railway traffic dispatching.

✦ Generated by Eureka AI based on patent content.

Figure CN224356172U_ABST

Abstract

The utility model relates to an intelligent monitoring and early warning system for foreign objects on a locomotive roof, belonging to the technical field of railway safety monitoring. The system includes: an image acquisition component comprising at least two high-definition cameras, each with a built-in RTSP protocol encoding chip and outputting a compressed video stream; an embedded processing chassis connected to the cameras through an industrial Ethernet switch, the chassis integrating a physical storage module that pre-stores standard vehicle-model image data, a spatial registration module in which an FPGA chip implements a coordinate mapping circuit that takes the real-time video stream as input and outputs a registration image signal, and a heterogeneous computing module carrying an NPU accelerator card that takes the registration image signal as input and outputs a digital foreign-object position signal; and an alarm execution terminal comprising an audible and visual alarm that triggers an on-site alarm and a mobile communication module that converts the digital foreign-object position signal into wireless network data packets. The utility model provides a highly robust, efficient, and expandable solution that replaces manual inspection and inefficient traditional static detection.

Description

Technical Field

[0001] This utility model relates to the field of railway safety monitoring technology, and in particular to an intelligent monitoring and early warning system for foreign objects on the roof of a locomotive.

Background Technology

[0002] In the field of railway locomotive safety inspection, detection of foreign objects on the locomotive roof currently relies mainly on two methods: manual inspection and review of fixed-camera footage. Manual inspection suffers from a high missed-detection rate (over 40% in rainy and foggy weather), low efficiency, and an inability to cover nighttime operations. The fixed-camera solution records images but requires manual frame-by-frame review and analysis, resulting in significant response delay (more than 10 minutes on average), which cannot meet the second-level real-time early warning requirements of train dispatching.

[0003] Existing automated systems generally adopt a single-camera static comparison architecture, which has three main technical shortcomings. First, insufficient environmental robustness: lacking dedicated noise suppression hardware modules, traditional image processing algorithms suffer a surge in false alarm rates in low light or rain / fog conditions (nighttime detection accuracy is less than 60%). For example, the literature reports foreign object positioning deviations exceeding 20 pixels under complex lighting conditions. Second, limited hardware scalability: the single-camera architecture cannot adapt to multi-track hangar scenarios, and each new track requires a separately deployed detection terminal, resulting in equipment redundancy and increased costs. Third, insufficient real-time performance: the processing speed rarely exceeds 15 frames per second, far below the real-time threshold of 30 frames per second.

Summary of the Invention

[0004] Therefore, it is necessary to provide an intelligent monitoring and early warning system for foreign objects on the roof of a locomotive to address the aforementioned problems of insufficient robustness, limited hardware scalability, and insufficient real-time performance.

[0005] This utility model provides an intelligent monitoring and early warning system for foreign objects on the roof of a locomotive, comprising:

[0006] The image acquisition component includes at least two high-definition cameras fixed to the side brackets of the locomotive entry track. Each camera has a built-in RTSP protocol encoding chip and outputs a compressed video stream.

[0007] An embedded processing chassis connects to the cameras via an industrial Ethernet switch. The chassis integrates:

[0008] The physical storage module pre-stores standard image data of the vehicle model;

[0009] The spatial registration module, a coordinate mapping circuit implemented by an FPGA chip, takes a real-time video stream as input and outputs a registration image signal.

[0010] The heterogeneous computing module is an accelerator card equipped with an NPU chip, which takes in the registration image signal and outputs the digital signal of the foreign object position;

[0011] Alarm execution terminals include:

[0012] Audible and visual alarm, triggered on-site;

[0013] The mobile communication module converts the digital signal of the foreign object's location into wireless network data packets and sends them to the user terminal.
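The signal chain listed in paragraphs [0006] to [0013] (camera frame, spatial registration, foreign object detection, dual alarm) can be pictured in software terms with the minimal sketch below. It is only an illustration of the data flow: the claimed system implements these stages in hardware (FPGA, NPU card, alarm terminal), and every name here (`MonitoringPipeline`, `find_bright_pixel`, the brightness threshold of 200) is a hypothetical stand-in, not part of the utility model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Frame = List[List[int]]           # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]   # (x, y, width, height) in image pixels


@dataclass
class MonitoringPipeline:
    """Hypothetical software view of the described signal chain."""
    register: Callable[[Frame], Frame]         # stands in for the spatial registration module
    detect: Callable[[Frame], Optional[Box]]   # stands in for the heterogeneous computing module
    local_alarm: Callable[[Box], None]         # stands in for the audible and visual alarm
    remote_alarm: Callable[[Box], None]        # stands in for the mobile communication module

    def process(self, frame: Frame) -> Optional[Box]:
        registered = self.register(frame)
        box = self.detect(registered)
        if box is not None:
            # Dual alarm: both the on-site and the remote path are triggered.
            self.local_alarm(box)
            self.remote_alarm(box)
        return box


def find_bright_pixel(frame: Frame) -> Optional[Box]:
    """Toy detector (assumption): report the first pixel brighter than 200."""
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value > 200:
                return (x, y, 1, 1)
    return None


events = []
pipeline = MonitoringPipeline(
    register=lambda f: f,  # identity registration, for illustration only
    detect=find_bright_pixel,
    local_alarm=lambda b: events.append(("local", b)),
    remote_alarm=lambda b: events.append(("remote", b)),
)
```

Calling `pipeline.process(frame)` returns the detected box, or `None` when nothing is found, and records one local and one remote event per detection.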

[0014] In one embodiment, the image acquisition component includes a high-definition camera mounted on an adjustable-angle bracket with a magnetic base at the bottom. The camera lens is covered with a nano-hydrophobic coating to suppress rain and fog interference.

[0015] In one embodiment, the physical storage module uses a hot-swappable SSD hard drive bay with an external status indicator light. In the spatial registration module, the FPGA chip integrates a pixel mapping lookup table circuit, and real-time registration is achieved through pre-stored vehicle coordinate mapping relationships.

[0016] In one embodiment, the audible and visual alarm has a built-in frequency division control circuit that drives a buzzer to output intermittent alarm sounds.

[0017] In one embodiment, the mobile communication module integrates a data compression chip to convert the digital signal of the object's coordinates into low-bandwidth network data packets.

[0018] In one embodiment, the industrial Ethernet switch is connected to an expansion interface board, which has a reserved RJ45 port to support expansion to up to 8 camera inputs.

[0019] In one embodiment, the chassis has an expansion slot for adding a camera access interface.

[0020] In one embodiment, the mobile communication module has a built-in SIM card slot.

[0021] In the above intelligent monitoring and early warning system for foreign objects on the roof of a locomotive, the high-definition cameras of the image acquisition component capture multi-angle images of the locomotive roof, and the FPGA coordinate mapping circuit in the embedded processing chassis performs spatial transformation and pixel alignment on the real-time video stream, eliminating viewing-angle deviations and environmental interference. This raises detection accuracy from less than 60% with traditional methods to over 90% and significantly reduces missed foreign object detections. To address insufficient real-time performance, the NPU accelerator card in the heterogeneous computing module processes the registered image signals through an optimized pipeline architecture at 30 frames per second, reducing the alarm response time to less than 5 seconds and meeting the real-time requirements of railway traffic dispatching. The RTSP protocol encoding chips of the image acquisition component support concurrent input of multiple video streams; combined with the physical connection through the industrial Ethernet switch, a single system can cover multiple tracks in the same hangar, reducing hardware costs by 60% and avoiding the equipment redundancy of the traditional single-camera architecture. The audible and visual alarm and the mobile communication module of the alarm execution terminal ensure immediate feedback of abnormal information through dual on-site and remote alarms. The result is a highly robust, efficient, and scalable solution that replaces the inefficient modes of manual inspection and traditional static detection.

Attached Figure Description

[0022] To more clearly illustrate the technical solutions in this utility model or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this utility model. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0023] Figure 1 is a schematic diagram of the intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to this utility model.

[0024] Reference numerals:

[0025] 110. Image acquisition component; 112. Camera; 120. Embedded processing chassis; 122. Physical storage module; 124. Spatial registration module; 126. Heterogeneous computing module; 130. Alarm execution terminal; 132. Audible and visual alarm; 134. Mobile communication module.

Detailed Implementation

[0026] To make the objectives, technical solutions, and advantages of the embodiments of this utility model clearer, the technical solutions of the embodiments of this utility model will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this utility model, not all embodiments. Based on the embodiments of this utility model, all other embodiments obtained by those skilled in the art without creative effort are within the protection scope of this utility model.

[0027] It should be noted that when a component is referred to as being "fixed to" or "set on" another component, it can be directly on the other component or there may be an intermediate component. When a component is considered to be "connected to" another component, it can be directly connected to the other component or there may be an intermediate component present. The terms "vertical," "horizontal," "upper," "lower," "left," "right," and similar expressions used in this specification are for illustrative purposes only and do not represent the only possible implementation.

[0028] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of that feature. In the description of this utility model, "a plurality of" means at least two, such as two, three, etc., unless otherwise explicitly specified.

[0029] In this utility model, unless otherwise explicitly specified and limited, a first feature being "above" or "below" a second feature can mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediate medium. Furthermore, a first feature being "above," "on top of," or "over" a second feature can mean that the first feature is directly above or diagonally above the second feature, or simply that the first feature is at a higher level than the second feature. A first feature being "below," "beneath," or "under" a second feature can mean that the first feature is directly below or diagonally below the second feature, or simply that the first feature is at a lower level than the second feature.

[0030] Unless otherwise defined, all technical and scientific terms used in this specification have the same meaning as commonly understood by one of ordinary skill in the art to which this specification belongs. The terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and / or" as used in this specification includes any and all combinations of one or more of the associated listed items.

[0031] In the field of railway locomotive safety inspection, detection of foreign objects on the locomotive roof currently relies mainly on two methods: manual inspection and review of fixed-camera footage. Manual inspection suffers from a high missed-detection rate (over 40% in rainy and foggy weather), low efficiency, and an inability to cover nighttime operations. The fixed-camera solution records images but requires manual frame-by-frame review and analysis, resulting in significant response delay (more than 10 minutes on average), which cannot meet the second-level real-time early warning requirements of train dispatching.

[0032] Existing automated systems generally employ a single-camera static comparison architecture, which suffers from three main technical shortcomings. First, insufficient environmental robustness: lacking dedicated noise suppression hardware modules, traditional image processing algorithms experience a surge in false alarm rates under low light or rain / fog conditions (nighttime detection accuracy is less than 60%). For example, the literature reports foreign object positioning deviations exceeding 20 pixels under complex lighting conditions. Second, limited hardware scalability: the single-camera architecture cannot adapt to multi-track hangar scenarios, and each new track requires a separately deployed detection terminal, resulting in equipment redundancy and increased costs. Third, insufficient real-time performance: the algorithm and hardware are disconnected. Although deep learning models such as YOLOv8 have high accuracy potential, they are not deeply coupled with embedded accelerators (such as GPUs / NPUs), so throughput rarely exceeds 15 frames per second, far below the real-time threshold of 30 frames per second. Together, these bottlenecks make it difficult for existing systems to balance accuracy, efficiency, and cost, which calls for new detection systems that support multi-source dynamic registration, environment-adaptive processing, and lightweight acceleration.

[0033] The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to this utility model is described below with reference to Figure 1.

[0034] As shown in Figure 1, in one embodiment, an intelligent monitoring and early warning system for foreign objects on the roof of a locomotive includes an image acquisition component 110, an embedded processing chassis 120, and an alarm execution terminal 130.

[0035] The image acquisition component 110 is installed on the side bracket of the track by which locomotives enter the depot. It uses two 2-megapixel high-definition cameras 112. Each camera 112 has a built-in RTSP protocol encoding chip and outputs an H.265 compressed video stream.

[0036] Camera 112 is mounted on an adjustable-angle bracket, with a magnetic base at the bottom of the bracket adhering to the metal bracket on the side of the rail. The lens is covered with a nano-hydrophobic coating, which, in actual tests, reduces noise interference from rain and fog by 35%. The camera's tilt angle adjustment range is ±30°, covering the detection area on the locomotive roof (length × width = 20m × 3m).
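To relate the pixel figures in this description to physical size, the sketch below converts the stated detection area and image resolution into an approximate ground sampling distance. It assumes, purely for illustration, that the 1920-pixel image width of one camera spans the full 20 m length of the detection area; the description does not state how the area is divided between the two cameras, and lens distortion and perspective are ignored. The function name is hypothetical.

```python
def mm_per_pixel(span_m: float, pixels: int) -> float:
    """Approximate physical size of one pixel, in millimetres."""
    return span_m * 1000.0 / pixels


# Figures taken from the description: 20 m roof length, 1920-pixel image width.
gsd_mm = mm_per_pixel(20.0, 1920)

# The +/-2 pixel mapping accuracy quoted for the lookup table in [0039],
# expressed as a physical tolerance under the same assumption.
mapping_tolerance_mm = 2 * gsd_mm

print(f"{gsd_mm:.1f} mm per pixel, +/-{mapping_tolerance_mm:.1f} mm mapping tolerance")
```

Under this assumption one pixel corresponds to roughly a centimetre along the roof, so the ±2 pixel mapping accuracy is on the order of two centimetres.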

[0037] The embedded processing chassis 120 is connected to the camera via an industrial Ethernet switch. The chassis integrates a physical storage module 122, a spatial registration module 124, and a heterogeneous computing module 126.

[0038] The physical storage module 122 is equipped with two hot-swappable SSD hard drive bays, pre-stored with a standard image database of eight mainstream locomotive models (resolution: 1920×1080). Each hard drive bay has an external three-color status indicator light (green: normal; yellow: insufficient capacity; red: fault).
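The three-color indicator logic described above can be written as a small decision function. This is a sketch only: the description gives the meaning of each color but not the capacity level at which the yellow state is raised, so the 10% free-space threshold and the names `Led` and `bay_led` are assumptions made for illustration.

```python
from enum import Enum


class Led(Enum):
    GREEN = "normal"
    YELLOW = "insufficient capacity"
    RED = "fault"


def bay_led(healthy: bool, free_fraction: float,
            low_space_threshold: float = 0.10) -> Led:
    """Pick the indicator color for one drive bay.

    low_space_threshold is a hypothetical value; the source does not specify it.
    A fault takes priority over a low-capacity warning.
    """
    if not healthy:
        return Led.RED
    if free_fraction < low_space_threshold:
        return Led.YELLOW
    return Led.GREEN
```

For example, `bay_led(True, 0.05)` yields `Led.YELLOW`, while any unhealthy drive yields `Led.RED` regardless of free space.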

[0039] The spatial registration module 124 uses an FPGA chip, optionally from the Xilinx Artix-7 series, with a built-in pixel lookup table circuit. Implementing a pixel lookup table in BRAM (Block RAM) is one of the most common applications of Artix-7 FPGAs in image processing (such as cameras, displays, and industrial vision). The module calls the pre-stored vehicle roof coordinate mapping table (mapping accuracy: ±2 pixels) to spatially align the real-time video with the standard image, converting the real-time video stream into a registration image signal.

The heterogeneous computing module 126 is equipped with a Huawei Atlas 300 NPU accelerator card, which takes the registration image signal as input and outputs a digital signal indicating the location of foreign objects. The NPU accelerator card executes a lightweight YOLOv8 algorithm (inference speed: 28 ms / frame) to identify foreign objects and output their coordinates. Its three-stage pipeline architecture performs the following processing: a Backbone computing unit extracts feature maps through a 3×3 convolutional circuit (output channels: 256); a Head fusion unit fuses multi-scale features using a feature pyramid circuit; and a Detect output unit outputs the coordinates of the foreign object (format: XML digital signal) through a bounding box regression circuit. Implementing backbone feature extraction, head multi-scale fusion, and detect coordinate output as a three-stage pipeline on a Huawei Atlas 300 NPU accelerator card is already standard technology, with its core components supported by the hardware architecture, toolchain adaptation, and large-scale application cases.

The Huawei Atlas 300 is based on the Da Vinci architecture, and its AI Core includes a three-stage pipeline of computing units (Cube Unit, Vector Unit, and Scalar Unit), which can carry the entire process of convolutional computation, feature fusion, and coordinate regression.
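The lookup-table registration performed by the spatial registration module 124 can be illustrated in software. In the sketch below, each output pixel reads its value from a source coordinate stored in a precomputed table, which is the same idea as a pixel lookup table held in FPGA block RAM. It is a pure-Python illustration under stated assumptions, not the FPGA circuit itself: the table layout, the nearest-pixel sampling, the simple shift used to build an example table, and the names `build_shift_lut` and `remap` are all hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]
Lut = List[List[Tuple[int, int]]]   # lut[y][x] = (source_x, source_y)


def build_shift_lut(width: int, height: int, dx: int, dy: int) -> Lut:
    """Example table: a pure translation. A real table would encode the
    pre-stored camera-to-reference mapping for a given vehicle model."""
    return [[(x + dx, y + dy) for x in range(width)] for y in range(height)]


def remap(src: Image, lut: Lut, fill: int = 0) -> Image:
    """Produce the registered image by one table lookup per output pixel."""
    src_h, src_w = len(src), len(src[0])
    out = []
    for lut_row in lut:
        row = []
        for sx, sy in lut_row:
            inside = 0 <= sx < src_w and 0 <= sy < src_h
            # Pixels that map outside the source frame get a fill value.
            row.append(src[sy][sx] if inside else fill)
        out.append(row)
    return out


frame = [[1, 2, 3],
         [4, 5, 6]]
registered = remap(frame, build_shift_lut(3, 2, dx=1, dy=0))
print(registered)  # [[2, 3, 0], [5, 6, 0]]
```

Because the table is computed ahead of time, the per-frame work reduces to one memory read per pixel, which is why this kind of mapping suits a fixed-function circuit.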

[0040] Backbone Computation Unit: The 3×3 convolutional circuit can be hardware-accelerated by the 3D matrix operation unit of the Cube Unit. The TBE DSL API of the Ascend CANN framework supports the development of 3D convolutional operators, and the storage conversion unit of the Da Vinci architecture can automatically complete data format conversion (such as Img2Col), avoiding the software overhead of traditional CPUs / GPUs. Test data show that the Atlas 300 achieves a processing speed of 2000 FPS (INT8 precision) on the 3×3 convolutional layers of the ResNet-50 model.

[0041] Head Fusion Unit: Multi-scale feature fusion of the Feature Pyramid (FPN) can be achieved through parallel computation of the Vector Unit. The Vector Unit of Ascend AI Core supports FP16 / FP32 mixed precision operations. In the FPN module of the YOLOv8 model, the latency of multi-scale feature concatenation and upsampling operations can be controlled within 1.2ms.

[0042] Detect Output Unit: The bounding box regression circuit completes coordinate transformation and post-processing through the Scalar Unit. The Ascend SDK provides a Protobuf serialization tool, which can convert raw coordinate data into XML format. Testing on the Huawei Atlas 500 AI edge station shows that the latency of the entire process from coordinate decoding to XML generation is less than 5 ms.
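The description states that foreign object coordinates are output in XML format but does not give a schema. The sketch below shows one possible encoding using Python's standard `xml.etree.ElementTree` module; it does not use or represent the Ascend SDK, and the element and attribute names (`detections`, `object`, `bbox`, `score`) are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

# (x, y, width, height, confidence score); this layout is an assumption
Detection = Tuple[int, int, int, int, float]


def boxes_to_xml(frame_id: int, boxes: Iterable[Detection]) -> str:
    """Serialize bounding boxes for one frame as an XML string."""
    root = ET.Element("detections", frame=str(frame_id))
    for x, y, w, h, score in boxes:
        obj = ET.SubElement(root, "object", score=f"{score:.2f}")
        ET.SubElement(obj, "bbox", x=str(x), y=str(y),
                      width=str(w), height=str(h))
    return ET.tostring(root, encoding="unicode")


xml_text = boxes_to_xml(7, [(10, 20, 30, 40, 0.912)])
print(xml_text)
```

A receiver can recover the values with `ET.fromstring(xml_text)`, so the same schema serves both the local alarm logic and the remote data packet.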

[0043] The alarm execution terminal 130 includes an audible and visual alarm 132 and a mobile communication module 134. The audible and visual alarm 132 has a built-in frequency division control circuit that drives a buzzer to output an intermittent alarm sound with a cadence of 0.5 seconds on and 1 second off. The mobile communication module 134 integrates a data compression chip (compression ratio: 1:8) to convert the coordinates of the foreign object into low-bandwidth data packets (average packet size: 2 KB), which are transmitted over a 4G network, using a SIM card in the built-in SIM card slot, to the dispatch center or user terminal.
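The 0.5 s on / 1 s off buzzer cadence can be modelled as a counter driven by a base clock, which is roughly what a frequency division circuit does. The sketch below is a software simulation under stated assumptions: the 10 Hz tick rate and the function name `buzzer_states` are hypothetical, and the actual circuit design is not disclosed in the description.

```python
from typing import List


def buzzer_states(n_ticks: int, tick_hz: int = 10,
                  on_s: float = 0.5, off_s: float = 1.0) -> List[bool]:
    """Return the buzzer on/off state for each clock tick.

    A counter wraps every (on + off) period; the buzzer is driven
    while the counter is within the 'on' part of the period.
    """
    on_ticks = round(on_s * tick_hz)
    period = on_ticks + round(off_s * tick_hz)
    return [(tick % period) < on_ticks for tick in range(n_ticks)]


states = buzzer_states(30)  # 3 seconds at the assumed 10 Hz tick rate
print("".join("#" if s else "." for s in states))
```

With these defaults each 1.5 s period contains 5 "on" ticks followed by 10 "off" ticks, so the buzzer sounds for one third of the time.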

[0044] In the intelligent monitoring and early warning system of this embodiment, the high-definition cameras of the image acquisition component 110 capture multi-angle images of the locomotive roof, and the FPGA coordinate mapping circuit in the embedded processing chassis performs spatial transformation and pixel alignment on the real-time video stream, eliminating viewing-angle deviations and environmental interference (such as rain, fog, or low-light conditions). This raises detection accuracy from less than 60% with traditional methods to over 90% and significantly reduces missed foreign object detections. To address insufficient real-time performance, the NPU accelerator card in the heterogeneous computing module processes the registered image signal through an optimized pipeline architecture at 30 frames per second. Thanks to the parallel acceleration of computing tasks by the NPU chip, the alarm response time is shortened to less than 5 seconds, meeting the real-time requirements of railway traffic dispatching. The RTSP protocol encoding chips of the image acquisition component support concurrent input of multiple video streams; combined with the physical connection through the industrial Ethernet switch, a single system can cover multiple tracks in the same hangar, reducing hardware costs by 60% and avoiding the equipment redundancy of the traditional single-camera architecture. The audible and visual alarm 132 and the mobile communication module 134 of the alarm execution terminal 130 ensure immediate feedback of abnormal information through dual on-site and remote alarms. The result is a highly robust, efficient, and scalable solution that replaces the inefficient modes of manual inspection and traditional static detection.
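As a rough consistency check on the figures quoted in this description, the 28 ms per-frame inference time from paragraph [0039] can be compared with the 30 frames-per-second target. The sketch below is illustrative arithmetic only: it assumes inference dominates per-frame latency and ignores decoding, registration, and transmission time, which the description does not quantify. The function names are hypothetical.

```python
def frames_per_second(latency_ms: float) -> float:
    """Throughput if frames are processed one after another."""
    return 1000.0 / latency_ms


def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame at a given frame rate."""
    return 1000.0 / target_fps


inference_ms = 28.0   # from the description
target_fps = 30.0     # real-time threshold stated in the description

achievable_fps = frames_per_second(inference_ms)
headroom_ms = frame_budget_ms(target_fps) - inference_ms

print(f"{achievable_fps:.1f} frames/s achievable, {headroom_ms:.2f} ms headroom per frame")
```

Under this assumption the quoted inference time leaves a few milliseconds of slack per frame at 30 frames per second, which is consistent with the stated real-time claim but leaves little margin for other per-frame work.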

[0045] In this embodiment, the number of camera inputs can be increased to 8 via the expansion interface board (with a reserved RJ45 port) of the industrial Ethernet switch, adapting the system to multi-track hangar scenarios. The chassis has two reserved PCIe expansion slots to support additional camera access interface cards (compatible with PoE power supply). The NPU accelerator card has a built-in temperature sensor (operating range: 0 to 85 ℃) and automatically reduces its clock frequency when the temperature exceeds the limit.
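The temperature-triggered frequency reduction can be expressed as a simple policy function. The description gives only the 85 ℃ upper limit; the nominal clock, the size of the reduction, and the function name below are hypothetical values chosen for illustration, and the real card's firmware behaviour may differ.

```python
def throttled_clock_mhz(temp_c: float,
                        nominal_mhz: float = 1000.0,
                        limit_c: float = 85.0,
                        reduced_fraction: float = 0.5) -> float:
    """Return the clock to run at for a given sensor reading.

    nominal_mhz and reduced_fraction are assumptions; only limit_c
    comes from the description.
    """
    if temp_c > limit_c:
        return nominal_mhz * reduced_fraction
    return nominal_mhz


for reading in (60.0, 85.0, 90.0):
    print(reading, throttled_clock_mhz(reading))
```

A practical implementation would likely add hysteresis so the clock does not oscillate when the temperature hovers near the limit; that refinement is omitted here.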

[0046] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0047] The above-described embodiments are merely illustrative of several implementations of this utility model, and while the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of this utility model. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this utility model, and these all fall within the protection scope of this utility model. Therefore, the protection scope of this utility model should be determined by the appended claims.

Claims

1. An intelligent monitoring and early warning system for foreign objects on the roof of a locomotive, characterized in that it comprises: an image acquisition component including at least two high-definition cameras fixed to the side brackets of the locomotive entry track, each camera having a built-in RTSP protocol encoding chip and outputting a compressed video stream; an embedded processing chassis connected to the cameras via an industrial Ethernet switch, the chassis integrating: a physical storage module that pre-stores standard image data of the vehicle model; a spatial registration module, in which a coordinate mapping circuit implemented by an FPGA chip takes a real-time video stream as input and outputs a registration image signal; and a heterogeneous computing module carrying an accelerator card with an NPU chip, which takes the registration image signal as input and outputs a digital signal of the foreign object position; and an alarm execution terminal including: an audible and visual alarm that triggers an alarm on site; and a mobile communication module that converts the digital signal of the foreign object position into wireless network data packets and sends them to a user terminal.

2. The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to claim 1, characterized in that, in the image acquisition component, each high-definition camera is mounted on an adjustable-angle bracket with a magnetic base at the bottom of the bracket, and the camera lens is covered with a nano-hydrophobic coating to suppress rain and fog interference.

3. The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to claim 1, characterized in that the physical storage module uses a hot-swappable SSD hard drive bay with an external status indicator light, and, in the spatial registration module, the FPGA chip integrates a pixel mapping lookup table circuit that achieves real-time registration through pre-stored vehicle coordinate mapping relationships.

4. The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to claim 1, characterized in that the audible and visual alarm has a built-in frequency division control circuit that drives a buzzer to output intermittent alarm sounds.

5. The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to claim 1, characterized in that the mobile communication module integrates a data compression chip to convert the digital signal of the foreign object's coordinates into low-bandwidth network data packets.

6. The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to claim 1, characterized in that the industrial Ethernet switch is connected to an expansion interface board, which has a reserved RJ45 port to support expansion to up to 8 camera inputs.

7. The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to claim 1, characterized in that the chassis has reserved expansion slots for adding camera access interfaces.

8. The intelligent monitoring and early warning system for foreign objects on the roof of a locomotive according to claim 1, characterized in that the mobile communication module has a built-in SIM card slot.