Rail transit station vertical domain intelligent agent inspection system and method

By combining adaptive polling inspection and large-model anomaly recognition with knowledge-base collaborative reasoning, the intelligent agent inspection system addresses the low efficiency of manual inspection and the lack of multimodal analysis in rail transit stations, enabling efficient and reliable anomaly recognition and intelligent decision-making in subway stations.

CN122244779A · Pending · Publication Date: 2026-06-19 · ZHEJIANG SUPCON INFORMATION TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: ZHEJIANG SUPCON INFORMATION TECH CO LTD
Filing Date: 2025-12-02
Publication Date: 2026-06-19

Application Information


IPC: G06V20/52; G06V10/764; G06V20/40; G06N5/04; G06V10/82; G06N3/0442

AI Tagging

Application Domain

Character and pattern recognition; Biological models


AI Technical Summary

Technical Problem

The existing rail transit station inspection system relies on manual inspection, which has problems such as high human resource investment, limited frequency, delayed discovery of hidden dangers, and untimely emergency response. In addition, multimodal analysis lacks the ability to integrate understanding and contextual semantic reasoning, making it difficult to accurately identify and handle complex abnormal events.

Method used

The invention adopts a vertical-domain intelligent agent inspection system for rail transit stations. It monitors and collects data through cameras and sensors, combines an adaptive polling inspection mechanism with an adaptive image enhancement algorithm to dynamically adjust the inspection intensity, uses a large model anomaly identification module for multimodal data analysis, and draws on a knowledge base for intelligent response processing.

Benefits of technology

It has improved the quality and timeliness of multi-source sensing data of subway stations, significantly enhanced the system's adaptability in different scenarios, improved the interpretability of diagnostic results and the value of operation and maintenance guidance, and improved the safety and response efficiency of subway operation.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a vertical-domain intelligent agent inspection system and method for rail transit stations. The system includes a data acquisition module that monitors the station area, personnel, and equipment using cameras and sensors, collects monitoring data and environmental data, and sends them to a data preprocessing module. It also includes a scheduling module that controls the polling frequency and order of the data acquisition module, as well as the scheduling priority for subsequent large model inference, based on the inherent risk level of each area covered by the data acquisition module, real-time pedestrian density, historical frequency of equipment failures, and current event status. The data preprocessing module performs image preprocessing on the monitoring data and dynamically constructs prompts for input to the large model anomaly recognition module. The anomaly response module classifies anomalies according to the anomaly recognition results and outputs handling suggestions in conjunction with a knowledge base. By introducing an adaptive polling inspection mechanism and an adaptive image enhancement algorithm, the quality and timeliness of multi-source sensing data are effectively improved.


Description

Technical Field

[0001] This invention belongs to the field of urban rail transit operation and maintenance technology, and particularly relates to a vertical-domain intelligent agent inspection system and method for rail transit stations.

Background Technology

[0002] Urban rail transit, as a crucial component of modern transportation networks, offers large capacity, high efficiency, and environmental friendliness. With the acceleration of urbanization in China, an increasing number of cities are constructing and operating large and complex rail transit networks. Under long-term high-load operation, the operational status, service life, and maintenance level of electromechanical equipment within subway stations directly affect operational safety and service quality. Currently, subway system inspection and maintenance still rely primarily on manual inspections and periodic maintenance, resulting in high human resource investment, limited inspection frequency, delayed hazard detection, and untimely emergency response, which fails to meet the demands of intelligent and refined operation management. Especially in complex scenarios such as sudden equipment failures, passenger accidents (e.g., falls, delays), and fire hazards, the traditional model lacks real-time perception and intelligent decision-making capabilities, which easily leads to delayed responses.

[0003] The prior art patent CN120047907B discloses an emergency response method and system for urban rail transit based on a large visual model, including: collecting visual data, sound data, and equipment operation data at urban rail transit platforms and carriages, preprocessing them, and synchronizing them with timestamps; performing dynamic frame differential processing on continuous video frames of the visual data to generate a two-dimensional motion heat map and performing spatiotemporal aggregation; constructing a three-dimensional digital model based on engineering drawings of the platforms and carriages, and establishing a mapping relationship between the two-dimensional motion heat map and the three-dimensional digital model; detecting abnormal behavior based on the mapping results, performing multimodal fusion analysis in combination with sound data and equipment operation data to determine the severity of abnormal events; formulating emergency response strategies according to the severity of abnormal events, sending control commands to rail transit equipment through an industrial protocol interface, and pushing abnormal event information to the control center, inspection personnel, and passenger terminals.

Summary of the Invention

[0004] In the existing technology, automated inspection systems based on computer vision and sensor monitoring are gradually being applied, but most systems are still limited to single-modal analysis or rule-based judgment, lacking the ability to fuse and understand multi-source information and contextual semantic reasoning, making it difficult to achieve accurate identification of complex abnormal events and generate handling suggestions.

[0005] To address the aforementioned technical problems, the present invention provides the following technical solution: a vertical-domain intelligent agent inspection system for rail transit stations, comprising a data acquisition module that monitors station areas, personnel, and equipment using cameras and sensors, and collects monitoring data and environmental data, which are then sent to a data preprocessing module; a scheduling module that controls the polling frequency and order of the data acquisition module, as well as the scheduling priority for subsequent large model inference, based on the inherent risk level of each area covered by the data acquisition module, real-time pedestrian density, historical frequency of equipment failures, and current event status; the data preprocessing module, which performs image preprocessing on the monitoring data and dynamically constructs prompts for input to the large model anomaly recognition module; and an anomaly response module, which performs hierarchical processing of anomalies based on the anomaly recognition results and outputs handling suggestions in conjunction with a knowledge base.

[0006] Specifically, the data acquisition module collects monitoring video streams, audio signals, environmental parameters, and equipment operating status information in real time. After local preprocessing of the data stream through edge computing nodes, it transmits it to the station center server, stores it synchronously in the multimodal data warehouse according to timestamps, and synchronizes it to the data preprocessing module.

[0007] Specifically, the scheduling module divides the entire station into multiple areas based on the deployment location of the cameras. Based on the inherent risk level, real-time passenger flow density, historical frequency of equipment failures, and current event status of each monitoring area, it calculates the area scheduling weight and dynamically adjusts the camera polling frequency of the data acquisition module according to the weight.

[0008] Specifically, the scheduling module constructs a differentiable scheduling decision function. Through a gating fusion mechanism, it performs nonlinear interaction of multi-dimensional features of the inherent security risks of the monitored area, dynamic indicators of pedestrian flow and its changing trends, abnormal event memory decay indicators, and current event state incentives to construct scheduling priorities. The scheduling priorities determine the polling order of cameras / sensors, the allocation density of AI analysis resources, and the frequency of model inference in the next scheduling cycle for that area.

[0009] Specifically, the data preprocessing module performs selective image enhancement processing on the acquired real-time video frames, with the enhancement intensity determined by a weighted average of the regional risk level and real-time pedestrian flow. Combining sensor data and contextual information, it generates structured prompt segments and constructs a unified multimodal input format containing images, metadata, and prompt text, which is then used as input to the large model and sent to the large model anomaly recognition module.

[0010] Specifically, the large model anomaly identification module performs semantic understanding and preliminary identification of abnormal equipment status and abnormal personnel behavior, and outputs image description, anomaly type, location and confidence level; if an anomaly is identified and the confidence level is greater than the threshold, the anomaly response module performs corresponding anomaly response processing; if an anomaly is identified but the confidence level is lower than the threshold, an early warning log is generated.

[0011] Specifically, the anomaly response module uses the anomaly identification results output by the large model as query conditions, automatically extracts the anomaly type, occurrence location, and contextual features, determines the anomaly event level based on the intelligent rule base, applies level-specific mechanisms to coordinate device responses, and generates suggestions through collaborative reasoning with the knowledge base.

[0012] Specifically, the anomaly response module comprehensively judges the event level based on the anomaly type, risk area, and dynamic situation, and builds an intelligent rule base to achieve "context-aware" anomaly classification.

[0013] Specifically, while the anomaly response module coordinates with the equipment within the station to respond, the knowledge base generates specific accident analyses and detailed handling suggestions for staff. Through keyword matching and semantic expansion mechanisms, it retrieves knowledge bases on rail transit equipment faults, personnel safety incidents, emergency plans, and maintenance and operation manuals. Combining historical work order records, equipment ledger information, and historical handling cases of similar personnel safety incidents in the target area or equipment, it performs multi-source information fusion analysis to generate structured and actionable response suggestions.

[0014] A vertical-domain intelligent agent inspection method for rail transit stations, applied to the aforementioned inspection system, includes the following steps:
S1. The data acquisition module monitors the station area, personnel, and equipment through cameras and sensors.
S2. The scheduling module allocates the polling frequency and order, and the scheduling priority of subsequent large model inference, based on the regional data collected by the data acquisition module.
S3. The data preprocessing module performs image preprocessing on the monitoring data and dynamically constructs prompts.
S4. The large model anomaly identification module identifies anomalies based on the output of the data preprocessing module.
S5. The anomaly response module performs hierarchical processing of anomalies and outputs handling suggestions in conjunction with the knowledge base.

[0015] The beneficial effects of this invention are as follows. By introducing an adaptive polling inspection mechanism and an adaptive image enhancement algorithm, inspection intensity is allocated according to the characteristics of each area. This improves the quality and timeliness of multi-source sensing data, provides reliable input for subsequent intelligent analysis, reduces the missed and false detections caused by image blur and low illumination in the complex subway environment, and enhances the system's adaptability across scenarios. The hybrid intelligent architecture of "large model guidance + knowledge base collaborative reasoning", unlike a purely data-driven black-box model, integrates the generalized recognition capability of multimodal large models with professional knowledge of the rail transit field. It moves from "perceiving anomalies" to "understanding the cause" and then to "guiding actions", improving the interpretability of diagnostic results and their value for operation and maintenance guidance. The invention also provides a post-processing mechanism based on multi-frame consistency verification and confidence checking, and constructs a dynamic risk scoring model for intelligent event classification, overcoming the high false alarm rate and rigid responses caused by diagnostic bias or hard-coded keyword matching in existing technologies. The system can dynamically adjust its judgment sensitivity based on context such as regional risk level and real-time passenger flow, and implements differentiated response strategies for equipment malfunctions and personnel incidents. All event information is pushed to the control center simultaneously, forming a comprehensive intelligent monitoring and collaborative handling capability that improves the safety, response efficiency, and intelligence level of subway operations.

Attached Figure Description

[0016] Figure 1 is a flowchart of the present invention.

Detailed Implementation

[0017] The present invention will now be described in detail with reference to the accompanying drawings and embodiments.

[0018] Example 1: A vertical-domain intelligent agent inspection system for rail transit stations includes a data acquisition module that monitors the station area, personnel, and equipment using cameras and sensors, and sends the collected monitoring data and environmental data to a data preprocessing module. It also includes a scheduling module that controls the polling frequency and order of the data acquisition module, as well as the scheduling priority for subsequent large model inference, based on the inherent risk level of each area covered by the data acquisition module, real-time pedestrian density, historical frequency of equipment failures, and current event status. The data preprocessing module performs image preprocessing on the monitoring data and dynamically constructs prompts for input to the large model anomaly recognition module. The anomaly response module classifies anomalies according to the anomaly recognition results and outputs handling suggestions in conjunction with a knowledge base.

[0019] The data acquisition module collects real-time video streams, audio signals, electromechanical equipment vibration data, environmental parameters (such as temperature and humidity), and operating condition information through cameras, vibration sensors, temperature and humidity sensors, smoke detectors, and various equipment operating status interfaces deployed in various areas of the subway station. All data streams are preprocessed locally at edge computing nodes and then transmitted to the station's central server via industrial Ethernet or wireless network. They are also stored synchronously in a multimodal data warehouse according to timestamps to ensure input consistency, which is used for subsequent statistical analysis of safety incidents, diagnosis of electromechanical equipment faults, and other tasks.
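
The timestamp-based synchronization described above can be illustrated with a minimal Python sketch. It assumes each reading arrives as a `(source, timestamp_seconds, value)` tuple and groups readings into fixed-width time buckets; the function name `align_by_timestamp`, the tuple layout, and the one-second bucket width are hypothetical choices for illustration, not details given in the patent.

```python
from collections import defaultdict


def align_by_timestamp(records, bucket_s=1.0):
    """Group (source, timestamp_s, value) readings into time buckets.

    Readings whose timestamps fall into the same bucket are treated as
    simultaneous and stored together, keyed by source name. If one source
    reports twice in a bucket, the later reading overwrites the earlier one.
    """
    buckets = defaultdict(dict)
    for source, ts, value in records:
        buckets[int(ts // bucket_s)][source] = value
    # Return plain dicts ordered by bucket index for stable downstream use.
    return {index: dict(row) for index, row in sorted(buckets.items())}


readings = [
    ("camera", 10.2, "frame-0001"),
    ("temperature", 10.7, 32.0),
    ("vibration", 11.1, 0.02),
]
aligned = align_by_timestamp(readings)
```

A real deployment would more likely align on a shared clock at the edge nodes; the bucket approach is only the simplest way to show readings from different modalities ending up in one record.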

[0020] The scheduling module controls the polling frequency and order of the data acquisition module, as well as the scheduling priority for subsequent large model inference. Within a station, conditions differ across areas. For example, during peak hours, large crowds converge at the main passenger entrances and exits, while off-peak hours see a significant decrease in passenger flow. At equipment room entrances and exits, although fewer people pass through, their importance necessitates focused monitoring to prevent risks. Therefore, the inspection intensity for each area of the station needs to be allocated rationally to maximize inspection efficiency; that is, the inspection frequency over a period of time must take into account factors such as the area's risk level and the frequency of abnormal events. Based on the camera deployment locations, the entire station is divided into multiple areas. Based on the inherent risk level, real-time passenger flow density, historical equipment failure frequency, and current event status of each monitored area, an area scheduling weight is calculated. The camera polling frequency is dynamically adjusted according to the weight, enabling high-frequency, sustained monitoring of high-risk and high-passenger-flow areas by automatically shortening their polling cycle. The scheduling cycle (polling interval) allocated to each area is inversely proportional to its weight; that is, the higher the weight, the more frequently the area is inspected and scheduled.
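
As an illustration of the weight-to-cycle relationship described above, the following Python sketch computes an area scheduling weight and a polling interval that is inversely proportional to it. The patent does not give the weighting formula, so the linear combination, the coefficients in `coeffs`, the proportionality constant `k_s`, and the clamping bounds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class AreaState:
    name: str
    inherent_risk: float      # 0..1, preset per area
    crowd_density: float      # 0..1, normalized real-time passenger flow density
    failure_frequency: float  # 0..1, normalized historical equipment failure frequency
    event_active: bool        # True while an abnormal event is open in the area


def area_weight(area, coeffs=(0.4, 0.3, 0.2, 0.1)):
    """Hypothetical linear combination of the four factors named in the text."""
    c_risk, c_crowd, c_fail, c_event = coeffs
    return (c_risk * area.inherent_risk
            + c_crowd * area.crowd_density
            + c_fail * area.failure_frequency
            + c_event * (1.0 if area.event_active else 0.0))


def polling_interval(weight, k_s=10.0, min_s=5.0, max_s=120.0):
    """Interval in seconds, inversely proportional to the weight and clamped."""
    if weight <= 0:
        return max_s
    return min(max(k_s / weight, min_s), max_s)


power_room = AreaState("power distribution room", 0.9, 0.1, 0.5, False)
rest_area = AreaState("rest area", 0.3, 0.2, 0.0, False)
```

With these assumed coefficients the power distribution room is polled roughly every 20 seconds and the rest area roughly every 56 seconds, which matches the stated intent that higher-weight areas are inspected more often.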

[0021] Furthermore, after determining the polling frequency, it is also necessary to divide the polling scheduling order for different areas. In this invention, the adaptive polling priority coefficient for each monitoring area of the station is defined as a dynamically differentiable function, the output of which determines the camera / sensor polling order, AI analysis resource allocation density, and model inference frequency for that area in the next scheduling cycle. The scheduling priority calculation function is constructed from a differentiable scheduling decision function, through a gating fusion mechanism to achieve nonlinear interaction of features across various dimensions.

[0022] The specific scheduling priority function's hidden state is constructed by concatenating the outputs of four context-aware encoders, including: The inherent risk characterization encoder reflects the quantitative characteristics of the inherent security risks in the monitored area. For example, the original risk of a power distribution room is higher than that of a normal passage. An embedding layer is used to map the area type (such as "power distribution room" or "platform edge") into a high-dimensional semantic vector, and a prior risk knowledge graph is introduced as a constraint. The real-time pedestrian flow dynamic density function is a dynamic indicator that reflects the pedestrian flow and its changing trend in a region in real time. It is derived by weighting the current normalized value of pedestrian flow, the pedestrian flow change rate (reflecting sudden gatherings), and the continuous exposure time of pedestrian flow, reflecting the cumulative effect of "long-term high density" risk. The abnormal event memory decay function is a decay-weighted memory value based on the frequency and severity of historical abnormal events in the region. It reflects the fault tendency and is calculated from the number of historical faults, time, fault severity level, and equipment type. It realizes the exponential forgetting mechanism of fault memory and reflects the "recency effect". The event state incentive function represents the immediate boost to scheduling priority of the current abnormal event. The higher the level, the stronger the incentive. For continuous events, a time penalty term is introduced to reflect the escalation of risk caused by "failure to handle in a timely manner".
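
The three dynamic terms above can be sketched in Python under explicit assumptions. The patent names the inputs of each function but not their formulas, so the weights, the exposure cap, the half-life used for exponential forgetting, the per-level boosts, and the per-minute time penalty below are all hypothetical values chosen only to make the behavior concrete.

```python
import math

# Hypothetical boost per event level (1 = most severe).
LEVEL_BOOST = {1: 1.0, 2: 0.6, 3: 0.3}


def crowd_dynamic_index(density, change_rate, exposure_s,
                        weights=(0.5, 0.3, 0.2), exposure_cap_s=600.0):
    """Weighted mix of normalized density, its change rate, and exposure time."""
    w_density, w_rate, w_exposure = weights
    rate = max(0.0, min(change_rate, 1.0))            # only sudden increases count
    exposure = min(exposure_s / exposure_cap_s, 1.0)  # long-term high-density effect
    return w_density * density + w_rate * rate + w_exposure * exposure


def fault_memory(events, now_s, half_life_s=7 * 24 * 3600):
    """Decay-weighted sum over (timestamp_s, severity) pairs.

    Each event loses half of its contribution every half_life_s seconds,
    so recent faults dominate (the recency effect).
    """
    decay = math.log(2) / half_life_s
    return sum(severity * math.exp(-decay * (now_s - ts)) for ts, severity in events)


def event_incentive(level, unresolved_s, penalty_per_min=0.05):
    """Immediate boost for an open event plus a penalty that grows while it is unhandled."""
    if level is None:
        return 0.0
    return LEVEL_BOOST[level] + penalty_per_min * unresolved_s / 60.0
```

For instance, a single severity-2 fault recorded exactly one half-life ago contributes 1.0 to the memory value, and a Level 1 event left unhandled for two minutes yields an incentive of 1.1 under these assumed constants.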

[0023] The scheduling priority function also uses system resource constraint gating terms to limit the available computing and bandwidth resources of the system in real time to prevent scheduling overload.
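
One possible reading of the gated fusion with a resource constraint is sketched below in plain Python. Each feature gets a sigmoid gate computed from all four features, so the features interact nonlinearly, and the fused score is scaled by a resource gate in the range 0 to 1. The parameter names (`gate_weights`, `gate_bias`, `out_weights`, `resource_available`) and the multiplicative form of the resource gate are assumptions; the patent does not disclose the actual function.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_priority(features, gate_weights, gate_bias, out_weights, resource_available):
    """Fuse [risk, crowd, memory, incentive] features into one priority score.

    gate_i = sigmoid(W[i] . features + b[i]) depends on every feature, which
    gives the nonlinear cross-feature interaction. The result is scaled by the
    fraction of compute and bandwidth currently available, so that priorities
    shrink when the system is close to overload.
    """
    gates = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(gate_weights, gate_bias)
    ]
    fused = sum(v * g * f for v, g, f in zip(out_weights, gates, features))
    resource_gate = max(0.0, min(resource_available, 1.0))
    return fused * resource_gate


zero_gates = [[0.0] * 4 for _ in range(4)]
baseline = gated_priority([1.0, 1.0, 1.0, 1.0], zero_gates, [0.0] * 4, [1.0] * 4, 1.0)
```

Sorting areas by this score in descending order would give the polling order for the next scheduling cycle. Apart from the final clamp, the function is built only from smooth operations, so its parameters could in principle be fitted by gradient-based methods.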

[0024] The data preprocessing module performs selective image enhancement on the acquired real-time video frames. The enhancement intensity is determined by a weighted average of the regional risk level and real-time pedestrian flow. Combining sensor data and contextual information, it generates structured prompt segments and constructs a unified multimodal input format containing images, metadata, and prompt text as input to the large model.

[0025] In multimodal large-scale model-based visual inspection systems for rail transit, the monitoring environment is complex and variable (e.g., insufficient lighting, smoke obstruction, crowd occlusion), and directly inputting raw images into the large model may lead to a decrease in the accuracy of safety hazard identification. Existing technologies typically employ fixed enhancement strategies or no enhancement at all, lacking dynamic perception of the importance of the scene. Selective image enhancement processing can adaptively enhance key areas of the image based on risk and crowd density perception. Without modifying the structure of the large model, it improves the quality of the input image through intelligent preprocessing, enhancing the large model's ability to identify abnormal events in key areas.

[0026] At the input front end of the multimodal large model, the data preprocessing module comprehensively assesses regional risks and pedestrian flow patterns. First, a risk score between 0 and 1 is preset for each monitored area within the subway station, set by the expert system based on historical accident data, equipment importance, and personnel activity characteristics. For example, highly confidential areas such as power distribution rooms and machine rooms are set to 0.9; dangerous and prone areas such as escalators and stairwells are set to 0.7; and rest areas and shops are set to 0.3. This score can be dynamically updated based on historical event statistics to overcome the shortcomings of traditional "static risk scores" that cannot reflect actual operational changes (e.g., frequent incidents of children climbing on a certain escalator recently). The mechanism is as follows: at fixed time intervals (e.g., daily, weekly), the system automatically counts the number of abnormal events (e.g., equipment failure, falls, smoke alarms, etc.) occurring in each monitored area within that period, and accordingly weights and corrects the original risk score, achieving dynamic evolution of the risk score. The data preprocessing module calls the image editing module (with a built-in object detection model, such as YOLOv8) to perform crowd detection on video frames of each region, and outlines key areas to specify the range for subsequent enhancement, avoiding full-image processing and improving efficiency. It calculates and normalizes the population density per unit area. To ensure the accuracy of the density calculation, a moving average can be taken from multiple consecutive frames of the video to reduce jitter. Subsequently, an adjustable fusion coefficient that automatically switches between monitoring requirements for different time periods is set to fuse the risk level with real-time crowd flow, and the enhancement weight is calculated. 
Based on the final enhancement weight, the system decides whether image enhancement is needed and which enhancement algorithm to apply. For example, when the weight is less than 0.3, no image enhancement is performed and the original image is input directly; when the weight is at least 0.3 and less than 0.6, moderate enhancement is applied to the image (e.g., contrast +10%, brightness +5%); when the weight is at least 0.6, strong enhancement is applied to the image (e.g., CLAHE contrast equalization plus sharpening filtering). Enhancement algorithms may include brightness / contrast adjustment, histogram equalization (CLAHE), dehazing algorithms, etc.
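
The enhancement decision described in this paragraph can be sketched in Python as follows. The thresholds 0.3 and 0.6 come from the text; the moving-average window, the default fusion coefficient `alpha`, and the linear form of the fusion are assumptions, since the patent only states that an adjustable coefficient fuses the risk score with the crowd density.

```python
from collections import deque


class DensitySmoother:
    """Moving average of normalized crowd density over the last few frames."""

    def __init__(self, window=5):
        self.values = deque(maxlen=window)

    def update(self, density):
        self.values.append(density)
        return sum(self.values) / len(self.values)


def enhancement_weight(risk_score, crowd_density, alpha=0.5):
    """Assumed linear fusion: alpha weights risk, (1 - alpha) weights density."""
    return alpha * risk_score + (1.0 - alpha) * crowd_density


def enhancement_tier(weight):
    """Map the enhancement weight to the three tiers given in the text."""
    if weight < 0.3:
        return "none"      # pass the original frame through
    if weight < 0.6:
        return "moderate"  # e.g. contrast +10%, brightness +5%
    return "strong"        # e.g. CLAHE plus sharpening


smoother = DensitySmoother(window=5)
for raw_density in (0.2, 0.4):
    smoothed = smoother.update(raw_density)
tier = enhancement_tier(enhancement_weight(0.9, 0.5))
```

A larger `alpha` could be used off-peak, when area risk matters more than crowding, and a smaller one during rush hours; that switching policy is left out of the sketch.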

[0027] The large model anomaly recognition module uses the selectively enhanced video frames as the main visual input for multimodal analysis. This ensures that the images retain clear details even in complex environments such as low light, smoke obscuration, and backlighting, enhancing the large model's ability to perceive safety risks. To help the large model focus accurately, reduce hallucinations, and improve diagnostic efficiency, a "context-aware prompt generation engine" is constructed. Based on the type of the current monitored area, device status, and operating period, it automatically generates targeted and semantically clear prompts that guide the large model to focus on key anomalies. First, a prompt template library is built according to scene categories. Then, context parameters are dynamically read from the system, and priority matching logic is used to generate natural language prompt fragments. These fragments are generated dynamically from area labels, sensor data, and time strategies, achieving "scene-adaptive prompts." For example, in the daytime power distribution room area: "You are a rail transit equipment inspection expert. Please check the image for smoke, sparks, overheated cables, or abnormal indicator lights. The current ambient temperature is 32°C, slightly higher than normal." In the evening rush hour platform area: "Please identify if there are passengers approaching the yellow line, falling, or items falling onto the tracks at the platform edge. It is currently the evening rush hour, with high passenger volume; please pay special attention to the behavior of children and the elderly." The prompt design follows a "role + task + constraint" structure, conforming to the instruction tuning paradigm. Additionally, a few examples can be embedded in the prompts to approximate the effect of instruction tuning. 
The enhanced image, dynamic prompts, and structured metadata are integrated to construct a unified input format, typically using JSON or embedded text, guiding the model to output structured or semi-structured judgment results.
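
A minimal Python sketch of such a prompt engine and unified JSON input is shown below. The template keys, the field names in the JSON payload, the 30°C comparison value, and the role sentence for the platform template are hypothetical; only the quoted task sentences are taken from the examples in the text.

```python
import json

# Hypothetical template library keyed by area type ("role + task" parts).
PROMPT_TEMPLATES = {
    "power_distribution_room": {
        "role": "You are a rail transit equipment inspection expert.",
        "task": ("Please check the image for smoke, sparks, overheated cables, "
                 "or abnormal indicator lights."),
    },
    "platform": {
        "role": "You are a rail transit station safety inspector.",
        "task": ("Please identify if there are passengers approaching the yellow line, "
                 "falling, or items falling onto the tracks at the platform edge."),
    },
}


def build_prompt(area_type, temperature_c=None, normal_max_c=30.0, period=None):
    """Assemble role, task, and context-dependent constraint fragments."""
    template = PROMPT_TEMPLATES[area_type]
    parts = [template["role"], template["task"]]
    if temperature_c is not None:
        note = ("slightly higher than normal" if temperature_c > normal_max_c
                else "within the normal range")
        parts.append(f"The current ambient temperature is {temperature_c:g}°C, {note}.")
    if period == "evening_rush":
        parts.append("It is currently the evening rush hour, with high passenger volume; "
                     "please pay special attention to the behavior of children and the elderly.")
    return " ".join(parts)


def build_model_input(image_ref, area_id, area_type, timestamp, **context):
    """Bundle image reference, metadata, and prompt into one JSON document."""
    payload = {
        "image": image_ref,
        "metadata": {"area_id": area_id, "area_type": area_type, "timestamp": timestamp},
        "prompt": build_prompt(area_type, **context),
    }
    return json.dumps(payload, ensure_ascii=False)


room_prompt = build_prompt("power_distribution_room", temperature_c=32)
platform_input = build_model_input("frame-0001.jpg", "A-03", "platform",
                                   "2026-01-01T18:05:00", period="evening_rush")
```

Keeping the templates as data rather than code makes it straightforward to add scene categories or few-shot examples without touching the assembly logic.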

[0028] After constructing the multimodal input layer, the model performs semantic understanding and preliminary identification of equipment status anomalies (such as smoke, indicator light malfunctions, loose parts) and personnel behavior anomalies (such as falls, lingering, crowding), outputting image descriptions, anomaly types, locations, and confidence scores. The confidence score indicates the model's "confidence" in accurately judging the anomaly. When generating each anomaly judgment, the model calculates the Softmax probability distribution of that token (e.g., "smoke"); the confidence score is typically the maximum probability value of that anomaly category in the output distribution.
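
The confidence definition above can be illustrated with a short Python sketch. It assumes access to one raw score (logit) per candidate anomaly category, which is a simplification: in an actual large model the probabilities would be read from the token-level output distribution. The category labels and logit values are made up for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_confidence(labels, logits):
    """Return the most probable label and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


label, confidence = classify_with_confidence(["normal", "smoke", "fall"], [0.0, 2.0, 0.0])
```

Here the confidence for "smoke" is exp(2) / (exp(2) + 2), roughly 0.79, which would fall below a threshold such as 0.85 and therefore lead to an early warning log rather than a full response.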

[0029] Judgment based on a single frame is susceptible to noise and occlusion, so stability is improved by verifying confidence across multiple consecutive frames. The multi-frame consistency verification and confidence check mechanism works as follows: the model outputs for N consecutive frames (e.g., N=5) from the same monitoring area are aggregated, and an anomaly is accepted only if at least K frames (e.g., K=4) report the same anomaly type and the average confidence score is greater than a set threshold (e.g., 0.85). Furthermore, to ensure the spatiotemporal continuity of anomalies, region code matching can be used to check that the anomaly locations in consecutive frames are close to one another. Whether to trigger the in-depth knowledge base query is determined by comparing the confidence score with the threshold: if the confidence score is greater than the threshold, processing proceeds to the next step; otherwise, an early warning log is generated for staff to review.
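
The verification rule can be sketched in Python as below, using the example values K=4 and threshold 0.85 from the text. Two details are assumptions: the average confidence is taken over the agreeing frames only, and region code matching is simplified to requiring an identical region code in all agreeing frames.

```python
from collections import Counter


def verify_anomaly(frames, k=4, threshold=0.85):
    """Check N consecutive frame results for a consistent anomaly.

    frames: list of (anomaly_type_or_None, confidence, region_code).
    Returns (anomaly_type, average_confidence) when the anomaly is confirmed,
    otherwise None.
    """
    counts = Counter(kind for kind, _, _ in frames if kind is not None)
    if not counts:
        return None
    anomaly, votes = counts.most_common(1)[0]
    if votes < k:
        return None
    hits = [(conf, region) for kind, conf, region in frames if kind == anomaly]
    if len({region for _, region in hits}) != 1:
        return None  # locations do not match across frames
    average = sum(conf for conf, _ in hits) / len(hits)
    return (anomaly, average) if average > threshold else None


window = [
    ("smoke", 0.90, "A1"),
    ("smoke", 0.88, "A1"),
    (None, 0.00, "A1"),
    ("smoke", 0.92, "A1"),
    ("smoke", 0.90, "A1"),
]
confirmed = verify_anomaly(window)
```

A caller would route a confirmed result to the anomaly response module and write an early warning log when an anomaly was reported in some frames but `verify_anomaly` returns `None`.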

[0030] The anomaly response module comprehensively judges the event level based on anomaly type, risk area, and dynamic situation, constructs an intelligent rule base, and achieves "context-aware" anomaly classification. Simultaneously, it uses the anomaly identification results output by the large model as query conditions to automatically extract anomaly type, occurrence location, and contextual features, and determines the anomaly event level based on the intelligent rule base. After classification, different mechanisms are adopted to coordinate device responses, and suggestions are generated through collaborative reasoning using the knowledge base.

[0031] When constructing the intelligent rule base, the comprehensive risk score S of the event is calculated from the severity of the abnormal event, the inherent risk level of the area, the real-time pedestrian flow density, and the confidence level of the large model output. The event is then classified according to the range of S: for example, S≥3.0 is a Level 1 (emergency) event; 2.0≤S<3.0 is a Level 2 (alarm) event; and S<2.0 is a Level 3 (notification) event.
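
Because the patent lists the inputs of S but not its formula, the following Python sketch uses a hypothetical additive form: the severity (1 to 3) scaled by the model confidence, plus the area risk and the crowd density (each 0 to 1). The example thresholds 3.0 and 2.0 are taken from the text, and the boundary values themselves are assigned to the more severe level, which is also an assumption.

```python
def risk_score(severity, confidence, crowd_density, area_risk):
    """Hypothetical comprehensive score S; the real weighting is not disclosed."""
    return severity * confidence + area_risk + crowd_density


def event_level(score, level1_min=3.0, level2_min=2.0):
    """Map S to Level 1 (emergency), Level 2 (alarm), or Level 3 (notification)."""
    if score >= level1_min:
        return 1
    if score >= level2_min:
        return 2
    return 3


# Fire (severity 3) detected with confidence 0.9 in a power distribution room.
fire_score = risk_score(severity=3, confidence=0.9, crowd_density=0.1, area_risk=0.9)
fire_level = event_level(fire_score)
```

Under these assumptions the fire example scores about 3.7 and is classified as Level 1, while the same detection with low severity in a low-risk, empty area would drop to Level 3.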

[0032] In this embodiment, the anomaly response module triggers corresponding automated response and collaborative handling mechanisms for events of different levels. Level 1 events (emergency): These apply to serious equipment failures (such as fire, explosion, power outage) or major personnel safety incidents (such as people falling onto tracks, stampedes, or sudden illness with loss of responsiveness). The system immediately triggers an automatic alarm, activates the fire protection system, performs an emergency stop of elevators, cuts off power to the relevant areas, and broadcasts preset evacuation instructions through the public address system; at the same time, it sends the highest-priority alarm to the control center, emergency duty team, and nearby personnel, starts the emergency plan process, automatically retrieves surrounding video surveillance, and reserves rescue channels.

[0033] Level 2 events (alarms): These cover moderate equipment anomalies (such as equipment overheating, indicator light alarms, loose parts) and abnormal personnel behavior (such as prolonged lingering, climbing equipment, children being unsupervised, passenger disputes, etc.). The system will push alarm information to staff in real time, automatically generate maintenance or inspection work orders, and suggest handling priorities; for personnel incidents, it will trigger targeted reminders from station staff, track nearby cameras, and can pre-activate local passenger flow intervention measures (such as adjusting the direction of turnstiles and guiding detours).

[0034] Level 3 events (notification): These mainly cover low-risk equipment status changes (such as ambient temperature and humidity exceeding limits, lighting malfunctions) or general passenger flow phenomena (such as localized short-term congestion, incomplete passenger clearing, areas with concentrated passenger inquiries). The system generates an operation log, pushes notification information to the control center's large screen and duty desk, activates intelligent passenger flow guidance strategies, and issues friendly reminders or route directions through the PIS (passenger information system) display screens and broadcasts; it can also trigger preventive inspection plans.

[0035] Event information at all levels is simultaneously pushed to the control center's integrated monitoring platform, enabling full-lifecycle recording of events, visual display, and multi-departmental collaborative response, and safeguarding both equipment operational safety and the passenger service experience.
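
The level-specific responses above can be sketched as a lookup table. This is only an illustration: the action names are hypothetical labels for the responses listed in the text, not interfaces defined by the patent, and a real deployment would bind them to station control systems.

```python
from typing import Dict, List

# Hypothetical action labels derived from the responses described per level.
LEVEL_ACTIONS: Dict[int, List[str]] = {
    1: [
        "trigger_automatic_alarm",
        "activate_fire_protection_system",
        "emergency_stop_elevators",
        "cut_power_to_affected_area",
        "broadcast_evacuation_instructions",
        "send_highest_priority_alarm",
        "activate_emergency_plan",
    ],
    2: [
        "push_alarm_to_staff",
        "generate_work_order",
        "suggest_handling_priority",
    ],
    3: [
        "write_operation_log",
        "push_notification_to_duty_desk",
        "activate_passenger_flow_guidance",
        "show_reminder_on_pis",
    ],
}

# Events of every level are also pushed to the integrated monitoring platform.
COMMON_ACTION = "push_to_integrated_monitoring_platform"


def plan_response(level: int) -> List[str]:
    """Return the ordered actions for an event level (1, 2, or 3)."""
    if level not in LEVEL_ACTIONS:
        raise ValueError(f"unknown event level: {level}")
    return LEVEL_ACTIONS[level] + [COMMON_ACTION]


if __name__ == "__main__":
    print(plan_response(2))
```

Keeping the mapping as data rather than branching code makes it straightforward to review or adjust the response plan per station.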

[0036] While the equipment within the station responds in a coordinated manner, the knowledge base needs to generate specific accident analyses and detailed handling suggestions for staff. Through keyword matching and semantic expansion mechanisms, the system retrieves knowledge bases on rail transit equipment faults, personnel safety incidents, emergency plans, and maintenance and operation manuals. Combining historical work order records, equipment ledger information, and historical handling cases of similar personnel safety incidents for the target area or equipment, it performs multi-source information fusion analysis to generate structured and actionable response suggestions. For equipment anomalies, the output includes fault cause analysis, anomaly severity, required spare part models, standard operating procedures, and reference documents. For personnel safety incidents (such as falls, entrapment, crowding, climbing, etc.), an emergency response plan is generated based on historical event response measures and handling procedures, including personnel rescue steps, broadcast guidance scripts, job responsibility assignments, coordinated control commands (such as suspending escalators and opening evacuation routes), and follow-up suggestions, improving the professionalism, standardization, and response efficiency of operation and maintenance decisions. Finally, the historical events are updated: each event chain is recorded in the log, the historical version information of event chain nodes is maintained, and the knowledge base is dynamically updated.
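
As an illustration of keyword matching with semantic expansion, the sketch below widens the query terms with a small synonym table and ranks documents by the number of matching terms. The synonym table, document identifiers, and function names are all hypothetical; the patent does not specify how expansion or ranking is implemented, and a production system might use embeddings instead of substring matching.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical synonym table standing in for the semantic expansion step.
SYNONYMS: Dict[str, Set[str]] = {
    "overheating": {"overheat", "high temperature"},
    "fall": {"fell", "slip"},
}


def expand_terms(keywords: Iterable[str]) -> Set[str]:
    """Lower-case the keywords and add their known synonyms."""
    expanded: Set[str] = set()
    for keyword in keywords:
        key = keyword.lower()
        expanded.add(key)
        expanded |= SYNONYMS.get(key, set())
    return expanded


def retrieve(keywords: Iterable[str], documents: Dict[str, str],
             top_k: int = 3) -> List[str]:
    """Return up to top_k document ids, ranked by matching term count."""
    terms = expand_terms(keywords)
    scored = []
    for doc_id, text in documents.items():
        lowered = text.lower()
        hits = sum(1 for term in terms if term in lowered)
        if hits:
            scored.append((hits, doc_id))
    # Most hits first; ties broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


if __name__ == "__main__":
    docs = {
        "sop-12": "Escalator motor high temperature handling procedure",
        "plan-3": "Passenger slip on platform: rescue steps",
    }
    print(retrieve(["overheating"], docs))
```

Substring matching is deliberately simple here and can over-match (for example, "fall" inside "rainfall"); it only shows where expansion sits in the retrieval flow.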

[0037] This embodiment also provides a vertical domain intelligent agent inspection method for rail transit stations, applied to the above-mentioned rail transit station vertical domain intelligent agent inspection system, including the following steps: S1, the data acquisition module monitors the station area, personnel, and equipment through cameras and sensors; S2, the scheduling module sets the polling frequency and order, and the scheduling priority of subsequent large model inference, based on the regional data collected by the data acquisition module; S3, the data preprocessing module performs image preprocessing on the monitoring data and dynamically constructs prompt words; S4, the large model anomaly recognition module identifies anomalies based on the output of the data preprocessing module; S5, the anomaly response module grades the anomalies and collaborates with the knowledge base to output handling suggestions.
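
The order of steps S1 to S5 can be sketched as one inspection cycle in which each module is passed in as a callable. This is a structural illustration only: the `Detection` type, the function `run_inspection_cycle`, and the callable signatures are assumptions, not part of the disclosed system.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Detection:
    """Hypothetical result record from the anomaly recognition step."""
    area: str
    anomaly_type: str
    confidence: float


def run_inspection_cycle(
    acquire: Callable[[], Dict[str, Dict[str, Any]]],
    schedule: Callable[[Dict[str, Dict[str, Any]]], List[str]],
    preprocess: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    recognize: Callable[[Dict[str, Any]], Optional[Detection]],
    respond: Callable[[Detection], Any],
) -> List[Any]:
    """Run one cycle: S1 acquire, S2 schedule, S3 preprocess,
    S4 recognize, S5 respond."""
    readings = acquire()                      # S1: per-area monitoring data
    polling_order = schedule(readings)        # S2: areas in priority order
    outcomes = []
    for area in polling_order:
        model_input = preprocess(area, readings[area])  # S3: image + prompt
        detection = recognize(model_input)              # S4: large model
        if detection is not None:
            outcomes.append(respond(detection))         # S5: grade, advise
    return outcomes
```

For example, passing a `schedule` callable that sorts areas by pedestrian density makes the busiest area reach the recognition step first in each cycle.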

[0038] This embodiment constructs a full-process closed-loop technical system that runs from active inspection through anomaly recognition, intelligent diagnosis, and decision generation to linked handling. It is applicable to subway stations of different types and scales, and is compatible with multi-vendor, multi-type electromechanical equipment (such as fans, water pumps, escalators, and distribution cabinets) and with personnel safety monitoring scenarios. Through an adaptive polling inspection mechanism, the system drives multi-source sensing devices to collect heterogeneous data such as video, infrared, vibration, and sound, synchronized by timestamp. An adaptive image enhancement algorithm is applied to low-quality images to improve the usability of the input data. Based on a multimodal large model, anomalies in equipment operating state (such as smoke, overheating, and abnormal noise) and personnel safety events (such as falls, lingering, climbing, and crowding) are identified and preliminarily judged together. Structured information is extracted through post-processing logic, and recognition reliability is improved by combining multi-frame consistency verification with a confidence check. An intelligent event grading rule base is then constructed, and events are dynamically divided into Level 1 (emergency), Level 2 (alarm), and Level 3 (notification) according to the comprehensive risk score. Guided by the output of the large model, the rail transit knowledge base is automatically matched and retrieved, and historical work order records and equipment ledger information are integrated to generate structured operation and maintenance suggestions or emergency response plans, including fault cause analysis, handling suggestions, spare part recommendations, standard operating procedures, personnel rescue procedures, broadcast scripts, and linkage control instructions. Finally, different response actions are triggered according to the event level: Level 1 events trigger automatic alarms, activation of the fire protection system, and evacuation broadcasts; Level 2 events push alarms to the operation and maintenance terminal and generate work orders; Level 3 events activate the passenger flow guidance strategy and notify passengers through the PIS screens and broadcasts. All event information is pushed to the control center in real time, enabling visual monitoring of the whole station's operating status and multi-department collaborative handling. The present invention achieves active discovery, reliable recognition, in-depth diagnosis, and intelligent decision-making under unattended or minimally staffed conditions, significantly improves the maintenance efficiency of subway station electromechanical equipment, reduces operational risk and reliance on manual work, and provides strong technical support for the safe, intelligent, and efficient operation of urban rail transit.
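
One way to combine multi-frame consistency verification with a confidence check is sketched below. The voting rule, the parameters `min_agree` and `conf_threshold`, and their default values are assumptions chosen for illustration; the text only states that the two mechanisms are combined, and that low-confidence detections produce an early warning log rather than a response.

```python
from collections import Counter
from typing import List, Optional, Tuple

Frame = Tuple[Optional[str], float]  # (anomaly type or None, confidence)


def verify_frames(frames: List[Frame], min_agree: int = 3,
                  conf_threshold: float = 0.7) -> Tuple[str, Optional[str]]:
    """Decide how to treat per-frame results from consecutive frames.

    Returns ("respond", label) when enough frames agree on one anomaly
    and their mean confidence exceeds the threshold,
    ("early_warning_log", label) when an anomaly is seen but not confirmed,
    and ("normal", None) when no frame reports an anomaly.
    """
    labels = [label for label, _ in frames if label is not None]
    if not labels:
        return ("normal", None)
    # Most frequently reported anomaly type across the frames.
    label, count = Counter(labels).most_common(1)[0]
    confidences = [conf for lab, conf in frames if lab == label]
    mean_conf = sum(confidences) / len(confidences)
    # A mean confidence exactly at the threshold is treated as unconfirmed
    # (an assumption; the source does not define the boundary case).
    if count >= min_agree and mean_conf > conf_threshold:
        return ("respond", label)
    return ("early_warning_log", label)


if __name__ == "__main__":
    window = [("smoke", 0.9), ("smoke", 0.8), (None, 0.0), ("smoke", 0.85)]
    print(verify_frames(window))
```

Requiring agreement across several frames trades a short delay for fewer false alarms caused by single-frame misreadings.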

Claims

1. A vertical domain intelligent agent inspection system for rail transit stations, characterized in that it includes: a data acquisition module that monitors the station area, personnel, and equipment using cameras and sensors, collects monitoring and environmental data, and sends it to the data preprocessing module; a scheduling module that controls the polling frequency and order of the data acquisition module, as well as the scheduling priority of subsequent large model inference, based on the inherent risk level of the area collected by the data acquisition module, real-time pedestrian density, historical frequency of equipment failure, and current event status; a data preprocessing module that performs image preprocessing on the monitoring data and dynamically constructs prompt words for input to the large model anomaly recognition module; and an anomaly response module that classifies anomalies according to the anomaly recognition results and outputs handling suggestions in collaboration with the knowledge base.

2. The vertical domain intelligent agent inspection system for rail transit stations according to claim 1, characterized in that the data acquisition module collects monitoring video streams, audio signals, environmental parameters, and equipment operating status information in real time; after local preprocessing of the data stream at edge computing nodes, it transmits the data to the station center server, stores it in the multimodal data warehouse in timestamp order, and synchronizes it to the data preprocessing module.

3. The vertical domain intelligent agent inspection system for rail transit stations according to claim 1, characterized in that the scheduling module divides the entire station into multiple areas based on the deployment locations of the cameras; based on the inherent risk level, real-time passenger flow density, historical frequency of equipment failures, and current event status of each monitoring area, it calculates an area scheduling weight and dynamically adjusts the camera polling frequency of the data acquisition module according to that weight.

4. The vertical domain intelligent agent inspection system for rail transit stations according to claim 1 or 3, characterized in that the scheduling module constructs a differentiable scheduling decision function; through a gating fusion mechanism, it performs nonlinear interaction among the multi-dimensional features of the inherent security risk of the monitored area, the dynamic indicators of pedestrian flow and its changing trend, the abnormal event memory decay indicator, and the current event state incentive to construct the scheduling priority; the scheduling priority determines the polling order of cameras/sensors, the allocation density of AI analysis resources, and the model inference frequency for that area in the next scheduling cycle.

5. The vertical domain intelligent agent inspection system for rail transit stations according to claim 1, characterized in that the data preprocessing module performs selective image enhancement on the acquired real-time video frames, the enhancement intensity being determined by a weighted average of the regional risk level and the real-time pedestrian flow; combining sensor data and contextual information, it generates structured prompt segments and constructs a unified multimodal input format containing images, metadata, and prompt text, which is sent to the large model anomaly recognition module as the input to the large model.

6. The vertical domain intelligent agent inspection system for rail transit stations according to claim 1, characterized in that the large model anomaly recognition module performs semantic understanding and preliminary identification of abnormal equipment status and abnormal personnel behavior, and outputs an image description, anomaly type, location, and confidence level; if an anomaly is identified and the confidence level is greater than the threshold, the anomaly response module performs the corresponding anomaly response processing; if an anomaly is identified but the confidence level is lower than the threshold, an early warning log is generated.

7. The vertical domain intelligent agent inspection system for rail transit stations according to claim 1, characterized in that the anomaly response module uses the anomaly recognition results output by the large model as query conditions, automatically extracts the anomaly type, occurrence location, and contextual features, determines the anomaly event level based on the intelligent rule base, applies level-specific mechanisms after grading to trigger linked device responses, and generates suggestions through collaborative reasoning with the knowledge base.

8. The vertical domain intelligent agent inspection system for rail transit stations according to claim 7, characterized in that the anomaly response module comprehensively judges the event level based on the anomaly type, risk area, and dynamic situation, and builds an intelligent rule base to achieve "context-aware" anomaly classification.

9. The vertical domain intelligent agent inspection system for rail transit stations according to claim 7, characterized in that, while the anomaly response module coordinates with the equipment within the station to respond, the knowledge base generates specific accident analyses and detailed handling suggestions for staff; through keyword matching and semantic expansion mechanisms, it retrieves knowledge bases on rail transit equipment faults, personnel safety incidents, emergency plans, and maintenance and operation manuals; combining historical work order records, equipment ledger information, and historical handling cases of similar personnel safety incidents for the target area or equipment, it performs multi-source information fusion analysis to generate structured and actionable response suggestions.

10. A vertical domain intelligent agent inspection method for rail transit stations, applied to the rail transit station vertical domain intelligent agent inspection system described in any one of claims 1-9, characterized in that it includes the following steps: S1, the data acquisition module monitors the station area, personnel, and equipment through cameras and sensors; S2, the scheduling module sets the polling frequency and order, and the scheduling priority of subsequent large model inference, based on the regional data collected by the data acquisition module; S3, the data preprocessing module performs image preprocessing on the monitoring data and dynamically constructs prompt words; S4, the large model anomaly recognition module identifies anomalies based on the output of the data preprocessing module; S5, the anomaly response module grades the anomalies and outputs handling suggestions in conjunction with the knowledge base.