Optimization method for controlling epidemic transmission on subway train services based on a surrogate model and probabilistic search
By constructing an epidemic transmission control method for subway train services based on a surrogate model and probabilistic search, the invention addresses the difficulty of running high-fidelity transmission simulations under limited computational resources and obtains an efficient train service control scheme, thereby improving the efficiency of epidemic transmission control in the subway system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- FUDAN UNIVERSITY
- Filing Date
- 2026-03-13
- Publication Date
- 2026-06-23
AI Technical Summary
Given the high computational cost of high-fidelity propagation simulation and the large number of candidate train control combinations, existing technologies struggle to quickly search for high-quality train control schemes within limited computational resources, resulting in low propagation control efficiency.
A method for controlling the spread of an epidemic in subway trains based on a surrogate model and probabilistic search is constructed. By reconstructing the co-occurrence contact network of carriages in a continuous time period, an event-driven SEIR propagation simulation model is established. The early system infection load is defined as an evaluation index. A surrogate prediction model that is insensitive to the order of the train set is trained, and an iterative probabilistic update search strategy is adopted to generate an optimized train control set.
A highly efficient train control scheme was achieved under a limited simulation budget, which significantly improved the efficiency of epidemic transmission control in the subway system, reduced computational costs, and improved the transmission control effect.
Figure CN122266818A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of complex network propagation control technology, and specifically relates to an optimization method for controlling epidemic transmission on subway train services based on a surrogate model and probabilistic search.
Background Technology
[0002] In recent years, urban subway ridership has continued to grow. The relatively enclosed space of subway carriages, close-range co-riding among passengers, and high crowding during peak commuting hours create a high risk of respiratory infectious disease transmission in subway settings. Compared with fixed settings such as homes or offices, subway travel is characterized by strong time constraints and high-frequency, short-duration contact. To assess transmission risk and plan interventions in public transportation scenarios without relying on individual health data, smart card entry and exit records and timetables can be used to reconstruct passenger journeys and co-presence contact within carriages. On this basis, a transmission model can be built for simulation-based evaluation, thereby characterizing system-level transmission dynamics and intervention effects.
[0003] In actual operation and management, measures such as station closures and passenger flow restrictions can cause significant social disruption. Engineering measures such as enhanced carriage ventilation and ultraviolet disinfection can serve as carriage-level spatial interventions and can be rolled out gradually through vehicle operation and maintenance plans. Because equipment and maintenance resources are limited, operators typically need to select, under a train service budget constraint $k$ (i.e., the number of train services allowed to receive control measures first), a subset of train services or vehicles for priority upgrades. Using the train service ("train number") as the intervention unit therefore has clear engineering and management advantages: on the one hand, a train service naturally carries attributes such as line, direction, and departure time, which makes it easy to compile an executable list; on the other hand, upgrades can be carried out by train service in batches or focused on key services, with minimal disruption to operations.
[0004] In existing train service priority control strategies, the most common engineering practice is to select train services directly by passenger flow and prioritize them for measures such as enhanced ventilation, disinfection, or equipment upgrades, on the intuitive grounds that train services with high passenger flow generate greater potential exposure. In academic and engineering practice, a single risk indicator is often designed to rank train services by risk, for example by the intensity of co-presence contact within the carriage (such as the number of co-riders or the number of segments traveled), by the centrality of the train service in the transfer coupling network, or by other rules, and interventions are then applied to the top-ranked train services. Such methods are simple to implement, but they essentially score each train service independently, which makes it difficult to capture the combined effect of a train service control set in terms of spatial coverage, complementarity, and redundancy. In addition, under the constraints of transfer chains and timetables, the effects of different train service combinations on system-level propagation superimpose nonlinearly, so a combination obtained by ranking on a single indicator does not necessarily achieve better control performance when evaluated by propagation simulation.
[0005] When the total number of train services reaches several thousand and the budget reaches several hundred, the number of candidate train service control combinations grows exponentially. Meanwhile, high-fidelity propagation simulations (such as event-driven SEIR models on continuous-time co-presence contact networks) require significant time and computational resources for repeated runs, so exhaustive methods or conventional greedy algorithms cannot search the vast combination space within limited computational resources. There is an urgent need for an optimization algorithm that can automatically search for high-quality train service control schemes under a limited simulation budget, so that the loop of expensive propagation simulation, combinatorial search, and policy output can be closed within a time and computational budget acceptable in engineering practice.
Summary of the Invention
[0006] The purpose of this invention is to propose an optimization method for the control of epidemic spread in subway trains based on a surrogate model and probabilistic search, in order to improve the efficiency of epidemic spread control in subway systems, given the high computational cost of high-fidelity propagation simulation and the large number of candidate train control combinations.
[0007] The overall process of the proposed optimization method for controlling epidemic transmission on subway train services is shown in Figure 1 and includes: reconstructing a continuous-time carriage co-presence contact network at the train service scale from smart card data and timetables; establishing an event-driven SEIR propagation simulation model on the co-presence contact network, and defining the early system infection load (SIL) as the index for evaluating the effectiveness of transmission control; constructing train service features covering passenger flow, exposure, and transfer network structure; under a train service budget constraint $k$, generating candidate train service control sets and the corresponding label data, and training a surrogate prediction model that is insensitive to the order of train services within a set; based on the surrogate model, applying an iterative probability-update search strategy of sampling, filtering, and probability updating to obtain an optimized train service control set; and finally verifying and outputting the results through the high-fidelity propagation simulation model. The specific steps are as follows:
[0008] Step 1: Construct a propagation simulation model to evaluate the effectiveness of the train propagation control scheme. This includes identifying the entire set of trains in the subway system based on smart card data and timetables, reconstructing the continuous-time passenger co-occurrence contact network at the train number scale, and establishing a simulation evaluation model to evaluate the propagation control effect for any train number control set; specifically:
[0009] Step 1-1: Smart Card Data Preprocessing and Travel Record Generation (Entry and Exit Pairing). Enter and exit events are extracted from the smart card transaction records and cleaned to form passenger travel records in units of "entry-exit".
[0010] Step 1-2: Timetable parameter extraction and train schedule calculation. For each line and direction, extract the first and last departure times and the time-varying departure interval function, and recursively generate all train services of the day and their planned arrival times at each station, yielding the train service set $\mathcal{T}$;
[0011] Step 1-3: Trip segmentation and train allocation. Each passenger trip is decomposed into several single-line, single-direction trip segments according to the subway network topology; when the trip involves transfers, multiple trip segments are formed with the transfer stations as dividing points, and each trip segment is uniquely assigned to a train service based on the relationship between the passenger's arrival time at the platform and the passing times of adjacent train services.
[0012] Step 1-4: Reconstructing the continuous-time co-presence contact network. For each train service, generate a time-ordered sequence of co-presence snapshots based on passenger boarding and alighting events within the segments between adjacent stations. Each snapshot records the train service ID, segment ID, snapshot start and end times, and passenger set.
[0013] Step 1-5: Event-driven SEIR propagation simulation. Based on the co-presence snapshot sequence, propagation events and intra-host state transition events are executed, and the system infection ratio curve is output;
[0014] Step 1-6: Train service control mechanism. Given a train service control set $C \subseteq \mathcal{T}$, where $\mathcal{T}$ denotes the set of all train services, when the train service $v$ to which a snapshot belongs satisfies $v \in C$, the contact infection probability within that snapshot is reduced to
[0015]
[0016] where the expression is given in terms of the base contact infection probability and a coefficient that characterizes the reduction in transmission probability caused by engineering measures such as enhanced carriage ventilation and ultraviolet disinfection;
[0017] Step 1-7: SIL calculation. The SIL is obtained by integrating the infection ratio curve over a preset early time window:
[0018] $$\mathrm{SIL} = \int_{0}^{T_e} \frac{I(t)}{N}\,\mathrm{d}t$$
[0019] where $I(t)$ is the number of infected individuals at time $t$, $N$ is the total population, and $T_e$ is the preset termination time of the early propagation observation window. For a given train service control set $C$, the corresponding label is recorded as $y(C) = \mathrm{SIL}(C)$.
[0020] Step 2: Construct multi-source features of train numbers for propagation control decisions. This includes calculating multi-source features related to propagation risk at the train number scale. These multi-source features include at least passenger flow features, co-existence exposure features, and inter-train number coupling network structure features induced by transfer behavior, forming a train number feature vector; specifically:
[0021] Step 2-1: Calculation of Basic Train Operation and Passenger Flow Characteristics. Calculate the basic operational attributes of each train, such as total travel time, number of stops, and route category, as well as the passenger flow characteristic (Flow). Flow represents the number of passengers on each train, which can be calculated from train allocation records.
[0022] Step 2-2: Train service exposure feature calculation. Based on the co-presence snapshots, calculate the exposure features Contacts and Intervals of each train service, where Contacts is the average number of distinct co-riders each passenger encounters on that train service, and Intervals is the average number of segments a passenger traverses on that train service. For a train service $v$, let its segment set be $S_v$; for a segment $s \in S_v$, let $P_{v,s}$ be the set of passengers on board obtained from the snapshots. The total passenger set of the train service is $P_v = \bigcup_{s \in S_v} P_{v,s}$. For a passenger $u \in P_v$, the set of segments traversed is $S_v(u) = \{ s \in S_v : u \in P_{v,s} \}$, and the set of distinct co-riders encountered on train service $v$ is $N_v(u) = \big( \bigcup_{s \in S_v(u)} P_{v,s} \big) \setminus \{u\}$. Contacts and Intervals are defined as follows:
[0023] $$\mathrm{Contacts}(v) = \frac{1}{|P_v|} \sum_{u \in P_v} |N_v(u)|, \qquad \mathrm{Intervals}(v) = \frac{1}{|P_v|} \sum_{u \in P_v} |S_v(u)|$$
[0024] Step 2-3: Constructing the train service transfer network. A directed weighted transfer network $G = (V, E, W)$ is constructed with train services as nodes, where $V$ is the set of train services, $E$ is the set of directed edges, and $W = [w_{ij}]$ is the adjacency matrix. For any passenger trip with train service sequence $(v_1, v_2, \dots, v_m)$, where $m$ is the number of train services included in the trip, a directed edge $(v_r, v_{r+1})$ is established for every $r \in \{1, \dots, m-1\}$, and the edge weight $w_{v_r v_{r+1}}$ accumulates the number of times this pair of consecutive train services appears over all trips;
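[0025] Step 2-4: Calculation of train service transfer network structure features. Centrality features such as weighted degree, closeness centrality, and betweenness centrality are calculated on the transfer network to form the structural feature vector of each train service, wherein:
[0026] The weighted degree is defined as:
[0027] $$s_i = s_i^{\mathrm{in}} + s_i^{\mathrm{out}},$$
[0028] where $s_i^{\mathrm{in}}$ and $s_i^{\mathrm{out}}$ are the weighted in-degree and weighted out-degree, specifically:
[0029] $$s_i^{\mathrm{in}} = \sum_{j} w_{ji}, \qquad s_i^{\mathrm{out}} = \sum_{j} w_{ij}$$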
[0030] Edge weights are converted into costs $c_{ij} = 1/(w_{ij} + \varepsilon)$, where $\varepsilon$ is a constant that prevents division by zero, and the outward shortest-path distance $d(i, j)$ is defined on these costs. Closeness centrality is then defined as:
[0031]
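[0032] Betweenness centrality is defined as:
[0033] $$B_i = \sum_{s \neq i \neq t} \frac{\sigma_{st}(i)}{\sigma_{st}}$$
[0034] where $\sigma_{st}$ is the number of shortest paths from node $s$ to node $t$, and $\sigma_{st}(i)$ is the number of those shortest paths that pass through node $i$.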
[0035] Step 2-5: Feature vector formation and normalization. After normalization, the passenger flow features, exposure features, and transfer network structure features of each train service $v$ are concatenated to obtain its $d$-dimensional feature vector $x_v \in \mathbb{R}^d$.
[0036] Step 3: Train the surrogate prediction model. This includes generating several candidate train control sets under the aforementioned train quantity budget constraint, obtaining corresponding propagation control effect labels using the propagation simulation model described in Step 1, and training a surrogate prediction model that is insensitive to the order of trains within the set based on the multi-source features of trains described in Step 2, so as to quickly predict the propagation control effect of any candidate train control set; specifically:
[0037] Step 3-1: Training sample generation. Under the budget constraint $k$, generate $n$ candidate train service control set samples $\{C_j\}_{j=1}^{n}$. For each $C_j$, the label $y_j = \mathrm{SIL}(C_j)$ is obtained by calling the propagation simulation of Step 1, forming the training set $\mathcal{D} = \{(C_j, y_j)\}_{j=1}^{n}$;
[0038] Step 3-2: Define the set-based surrogate model. Construct a surrogate prediction model that is insensitive to the order of train services within a set:
[0039] $$\hat{y}(C) = g\Big( \operatorname{Pool}_{v \in C}\, \phi(x_v) \Big)$$
[0040] where $\phi$ is the train service embedding network, which maps the features $x_v$ of an individual train service to a low-dimensional embedding; $g$ is the regression network, which maps the set-pooled embedding vector to the predicted propagation control effect; and $\operatorname{Pool}$ denotes the permutation-invariant pooling over the set. Both $\phi$ and $g$ can be built as multilayer perceptrons. For this permutation-invariant set structure, see the Deep Sets model [2].
[0041] Step 3-3: Model training. The networks $\phi$ and $g$ are trained with mean squared error as the objective, which allows the $\mathrm{SIL}$ of any set $C$ to be predicted rapidly.
[0042] Step 4: Iterative search and output verification. This includes initializing the train service selection probability distribution; applying a probability-based iterative update search that cyclically performs candidate set sampling, effect prediction and filtering, and probability updating, and outputs an optimized train service control set; and inputting the optimized control set into the propagation simulation model of Step 1 for verification, generating a priority control list of train services for engineering implementation. Specifically:
[0043] Step 4-1: Initialize the probability distribution. Let the total number of train services be $M$. Initialize the selection probability vector $p^{(0)}$, and set the number of samples $N_s$, the elite ratio $\rho$, the smoothing coefficient $\alpha$, the maximum number of iterations $T_{\max}$, and the convergence threshold $\delta$;
[0044] Step 4-2: Iterative sampling and elite selection. In iteration $t$, sampling without replacement according to $p^{(t)}$ generates $N_s$ candidate sets, each containing $k$ distinct train services. The surrogate model computes the predicted value $\hat{y}(C)$ of each candidate set, and the $\rho N_s$ candidate sets with the smallest predicted values are selected as the elite set $\mathcal{E}_t$;
[0045] Step 4-3: Probability update. The occurrence frequency of each train service is computed from the elite set, and a smoothed update is used to suppress oscillations:
[0046] $$f^{(t)}_i = \frac{1}{|\mathcal{E}_t|} \sum_{C \in \mathcal{E}_t} \mathbb{1}[i \in C], \qquad p^{(t+1)}_i = (1 - \alpha)\, p^{(t)}_i + \alpha\, f^{(t)}_i$$
[0047] Step 4-4: Convergence and output. If $\lVert p^{(t+1)} - p^{(t)} \rVert < \delta$ or the maximum number of iterations $T_{\max}$ is reached, the iteration stops, and the $k$ train services with the highest probability form the optimized train service control set $C^{*}$;
[0048] Step 4-5: High-fidelity verification and strategy comparison. The optimized control set $C^{*}$ is input into the simulation model of Step 1 to obtain its true $\mathrm{SIL}(C^{*})$, which is compared under the same budget $k$ with baseline train service control sets (Flow ranking, Contacts ranking, weighted degree / closeness ranking) to output the relative improvement and the final train service list.
[0049] The present invention also includes a system for optimizing epidemic transmission control on subway train services, comprising four modules: a propagation simulation model construction module, a train service multi-source feature construction module, a surrogate prediction model training module, and an iterative search and output verification module, which respectively execute the four steps of the above optimization method.
[0050] The innovation of this invention lies in the following: event-driven propagation simulation on a continuous-time carriage co-presence contact network is used as the basis for evaluating propagation control effect; the structural features of the train service transfer network are combined with the exposure features of the train services; a surrogate prediction model that is insensitive to the order of train services within a set is constructed to quickly evaluate candidate control schemes; and a search strategy based on iterative probability updates automatically generates, from a large number of candidate combinations, a priority control scheme that outperforms single-indicator ranking methods, thereby achieving efficient optimization under limited simulation computing resources.
Description of the Drawings
[0051] Figure 1 is an overall flowchart of the automatic search framework of the present invention.
[0052] Figure 2 compares the infection ratio curves obtained in the real propagation simulation under the same budget and different train service control strategies.
[0053] Figure 3 illustrates the iterative update and convergence of the probability vector during the cross-entropy probability distribution search.
Detailed Implementation
[0054] To make the above-mentioned objectives and innovations of the present invention easier to understand, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0055] This embodiment takes the 2015 Shanghai Metro system as an example, with the total number of train services in the system denoted $M$. The budgeted number of train services prioritized for control measures is denoted $k$, and the early system infection load $\mathrm{SIL}$ is adopted as the evaluation index for transmission control effectiveness.
[0056] Step 1: Construction of the propagation simulation evaluation model.
[0057] Step 1-1: Smart Card Data Preprocessing and Travel Record Generation (Entry / Exit Pairing). The input is a smart card transaction record, including fields such as anonymous passenger identifier, transaction time, route, station, and transaction amount. Entry and exit events are distinguished by transaction amount: entry amount is 0, exit amount is positive. Abnormal records are removed, such as identical entry and exit stations, missing fields, etc. A single travel record is formed by pairing the entry and exit points of the same passenger, denoted as [example record]. The anonymous passenger identifier is used only for station entry / exit matching and route reconstruction, and does not involve any identifiable information, thus meeting data privacy protection requirements;
[0058] Step 1-2: Timetable parameter extraction and train operation time calculation. For each line $l$ and direction $d$, extract the first departure time $t^{\mathrm{first}}_{l,d}$, the last departure time $t^{\mathrm{last}}_{l,d}$, and the time-varying departure interval function $h_{l,d}(t)$. The departure times of all train services of the day are generated by the recursion:
[0059] $$t^{(1)}_{l,d} = t^{\mathrm{first}}_{l,d}, \qquad t^{(n+1)}_{l,d} = t^{(n)}_{l,d} + h_{l,d}\big(t^{(n)}_{l,d}\big)$$
[0060] where $n$ is the departure sequence number of the day within that line and direction (i.e., $n = 1$ is the first departure), and $h_{l,d}(t)$ is the departure interval of that line and direction at time $t$. The recursion terminates when the generated departure time exceeds the last departure time $t^{\mathrm{last}}_{l,d}$. Each train service is recorded as $v = (l, d, n)$. If the station sequence of a line in a given direction is $(s_1, s_2, \dots, s_J)$ and the travel time from $s_j$ to $s_{j+1}$ is $\tau_j$, then the passing time of train service $v$ at station $s_j$ is:
[0061] $$T_v(s_j) = t^{(n)}_{l,d} + \sum_{i=1}^{j-1} \tau_i$$
[0062] This gives the planned passing time of any train service at any station.
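The timetable recursion of Step 1-2 can be illustrated with a minimal Python sketch. The function names, the example headway profile, and the use of minutes after midnight as the time unit are assumptions made for illustration only; the source does not specify them, and dwell times at stations are ignored here.

```python
from typing import Callable, List


def departure_times(first: float, last: float,
                    headway: Callable[[float], float]) -> List[float]:
    """Generate departures t_{n+1} = t_n + h(t_n), stopping once a time exceeds `last`."""
    times: List[float] = []
    t = first
    while t <= last:
        times.append(t)
        step = headway(t)
        if step <= 0:
            raise ValueError("headway must be positive")
        t += step
    return times


def passing_times(departure: float, run_times: List[float]) -> List[float]:
    """Planned passing time at each station: departure plus cumulative run time."""
    out = [departure]
    for tau in run_times:
        out.append(out[-1] + tau)
    return out


def example_headway(t: float) -> float:
    """Hypothetical headway: 5 min during 07:00-09:00, 10 min otherwise."""
    return 5.0 if 420 <= t < 540 else 10.0


# Times are minutes after midnight (an assumed convention).
deps = departure_times(first=400.0, last=560.0, headway=example_headway)
stops = passing_times(deps[0], [3.0, 2.5, 4.0])
```

Each element of `deps` corresponds to one train service of a given line and direction, and `stops` lists its planned passing times at four consecutive stations.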
[0063] Step 1-3: Trip segmentation and train allocation. Each trip is decomposed into single-line, single-direction trip segments according to the subway network topology. For the first segment, the passenger's arrival time at the platform is estimated as the station entry time plus $\Delta_{\mathrm{in}}$; for a segment after a transfer, it is estimated as the arrival time of the previous segment plus $\Delta_{\mathrm{tr}}$. Here $\Delta_{\mathrm{in}}$ is the average walking time from the station entrance to the platform, and $\Delta_{\mathrm{tr}}$ is the average walking time for a transfer. Let the passing times of two adjacent train services at a station be $T_{n-1}$ and $T_n$, and let $a$ be the passenger's platform arrival time; when $T_{n-1} < a \le T_n$, the trip segment is allocated to the train service passing at $T_n$. This yields the passenger-to-train-service allocation records;
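As an illustration of the allocation rule in Step 1-3, the following Python sketch assigns a trip segment to the first train service that passes the station at or after the passenger reaches the platform. The function names and the boundary convention (a passenger arriving exactly at a passing time boards that train) are assumptions; the source only states that allocation depends on the arrival time relative to adjacent passing times.

```python
from bisect import bisect_left
from typing import List, Optional


def platform_arrival(event_time: float, walk_time: float) -> float:
    """Estimated platform arrival: entry (or previous-segment arrival) time plus walking time."""
    return event_time + walk_time


def assign_train(arrival: float, passing: List[float]) -> Optional[int]:
    """Index of the first train passing at or after `arrival`; None if no train remains.

    `passing` must be sorted in ascending order.
    """
    idx = bisect_left(passing, arrival)
    return idx if idx < len(passing) else None


# Three train services pass the station at minutes 100, 110, and 120.
passing = [100.0, 110.0, 120.0]
first_segment = assign_train(platform_arrival(103.0, 2.0), passing)  # boards the train at 110
```

Binary search keeps each assignment logarithmic in the number of train services at the station, which matters when every trip segment of a day is processed.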
[0064] Step 1-4: Reconstruction of the continuous-time co-presence contact network. For each train service $v$, its operating path is divided into segments between adjacent stations, and the set of these segments is denoted $S_v$. For each segment $s \in S_v$, the passenger set $P_{v,s}$ is obtained from the allocation records. A new co-presence snapshot is generated whenever the passenger set changes due to boarding or alighting events. Each snapshot contains the train service ID, segment ID, snapshot start and end times, and passenger set; the snapshot duration is the difference between two adjacent change times. Within the same snapshot, any two passengers in the co-presence set $P_{v,s}$ are considered potential contacts;
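A minimal sketch of the snapshot construction in Step 1-4 is given below. It assumes that boarding and alighting happen only at stations, so the passenger set is constant within each inter-station segment and one snapshot per non-empty segment suffices. The `Snapshot` type, the field names, and the representation of a ride as a (boarding station index, alighting station index) pair are hypothetical.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Tuple


class Snapshot(NamedTuple):
    train_id: str
    segment: int                 # segment j runs from station j to station j + 1
    start: float
    end: float
    passengers: FrozenSet[str]


def build_snapshots(train_id: str, passing: List[float],
                    rides: Dict[str, Tuple[int, int]]) -> List[Snapshot]:
    """Build one co-presence snapshot per non-empty inter-station segment."""
    snapshots: List[Snapshot] = []
    for seg in range(len(passing) - 1):
        on_board = frozenset(p for p, (board, alight) in rides.items()
                             if board <= seg < alight)
        if on_board:
            snapshots.append(Snapshot(train_id, seg, passing[seg],
                                      passing[seg + 1], on_board))
    return snapshots


# Passenger "a" rides stations 0 -> 3, "b" rides 1 -> 2, "c" rides 2 -> 3.
snaps = build_snapshots("L1-up-001", [0.0, 3.0, 5.0, 9.0],
                        {"a": (0, 3), "b": (1, 2), "c": (2, 3)})
```

The resulting list is already ordered by start time for a single train service; snapshots from all train services would be merged and sorted before simulation.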
[0065] Step 1-5: Event-driven SEIR propagation simulation. The SEIR propagation process is established, with individual states including at least four types: Susceptible (S), Exposed or latent (E), Infectious (I), and Removed (R). Simulation events include:
[0066] (1) Co-presence infection events: for each infectious-susceptible pair within a co-presence snapshot, an infection attempt is made with the base contact infection probability; if the train service to which the snapshot belongs is in the control set, the infection probability is reduced accordingly; when a susceptible individual is infected for the first time, the individual enters the latent state (E);
[0067] (2) Intra-host state transition events: the latent period and the infectious period are sampled when the individual enters the E and I states, respectively, and the transitions $E \to I$ and $I \to R$ are triggered at the corresponding times;
[0068] (3) External introduction event: In order to characterize the external input of the system, an external infection can be introduced with a preset extremely low probability. This mechanism does not affect the relative comparison conclusion between different train control strategies.
[0069] The simulation processes snapshot events and state transition events in chronological order and outputs a system infection rate curve. This embodiment uses respiratory infectious diseases as the application scenario, and the relevant transmission parameters can be selected within the common parameter range of typical respiratory infectious diseases.
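The event handling described above can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the source: snapshots are reduced to (start time, train service ID, passenger set) tuples, latent and infectious periods are drawn from exponential distributions, the control effect is modeled as a multiplicative factor `control_factor` on the base probability `beta`, and the external introduction event is omitted.

```python
import heapq
import random
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

SnapshotT = Tuple[float, str, FrozenSet[str]]


def simulate_seir(snapshots: List[SnapshotT], population: Iterable[str],
                  seeds: Iterable[str], controlled: Set[str], beta: float,
                  control_factor: float, latent_mean: float,
                  infectious_mean: float, rng: random.Random):
    """Process snapshots and state transitions in time order; return final states and I(t)."""
    state: Dict[str, str] = {p: "S" for p in population}
    events: list = []          # heap of (time, tie_breaker, person, new_state)
    counter = 0

    def schedule(time: float, person: str, new_state: str) -> None:
        nonlocal counter
        heapq.heappush(events, (time, counter, person, new_state))
        counter += 1

    seeds = list(seeds)
    for p in seeds:
        state[p] = "I"
        schedule(rng.expovariate(1.0 / infectious_mean), p, "R")
    curve = [(0.0, len(seeds))]  # (time, number of infectious individuals)

    def apply_transitions(until: float) -> None:
        while events and events[0][0] <= until:
            time, _, person, new_state = heapq.heappop(events)
            state[person] = new_state
            if new_state == "I":
                schedule(time + rng.expovariate(1.0 / infectious_mean), person, "R")
            curve.append((time, sum(1 for s in state.values() if s == "I")))

    for start, train_id, passengers in sorted(snapshots, key=lambda s: s[0]):
        apply_transitions(start)
        # Assumed control model: multiplicative reduction on controlled train services.
        p_inf = beta * control_factor if train_id in controlled else beta
        infectious = [p for p in passengers if state[p] == "I"]
        for q in passengers:
            if state[q] != "S":
                continue
            # One infection attempt per infectious co-rider in this snapshot.
            if any(rng.random() < p_inf for _ in infectious):
                state[q] = "E"
                schedule(start + rng.expovariate(1.0 / latent_mean), q, "I")
    apply_transitions(float("inf"))
    return state, curve
```

Dividing the second element of each `curve` entry by the population size gives an infection ratio curve of the kind used for the SIL label.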
[0070] Step 1-6: Train service control mechanism. Given a train service control set $C$, when the train service $v$ to which a snapshot belongs satisfies $v \in C$, the contact infection probability within that snapshot is reduced to
[0071]
[0072] where the coefficient applied to the base contact infection probability characterizes the reduction in transmission probability caused by engineering measures such as enhanced carriage ventilation and ultraviolet disinfection;
[0073] Step 1-7: SIL calculation. The infection ratio curve is integrated up to the preset early termination time $T_e$:
[0074] $$\mathrm{SIL} = \int_{0}^{T_e} \frac{I(t)}{N}\,\mathrm{d}t$$
[0075] where $I(t)$ is the number of infected individuals at time $t$, $N$ is the total population, and $T_e$ is the preset early observation termination time. For a given train service control set $C$, the corresponding label is recorded as $y(C) = \mathrm{SIL}(C)$. In the embodiments, $T_e$ can be chosen as the fastest growth day or another preset early window termination time.
[0076] During implementation, a comparative analysis of different evaluation methods was conducted. The results showed that when the "whole-process infection ratio curve" and the "SIL value corresponding to the fastest growth day" were used as evaluation criteria, the ranking of the control strategies for each train remained consistent. Therefore, without changing the strategy superiority / inferiority judgment results, in order to reduce the simulation computational burden and improve optimization stability, this embodiment selected the SIL value corresponding to the fastest growth day as the optimization label.
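The SIL label can be computed numerically from a sampled infection curve. The sketch below uses the trapezoidal rule with linear interpolation at the window end; the choice of quadrature rule and the function name are assumptions, since the source only states that the infection ratio curve is integrated up to the early termination time.

```python
from typing import Sequence


def system_infection_load(times: Sequence[float], infected: Sequence[float],
                          population: float, t_end: float) -> float:
    """Trapezoidal integral of I(t)/N from times[0] up to t_end."""
    total = 0.0
    for k in range(len(times) - 1):
        t0, t1 = times[k], times[k + 1]
        i0, i1 = infected[k], infected[k + 1]
        if t0 >= t_end:
            break
        if t1 > t_end:
            # Linearly interpolate the curve at the end of the early window.
            i1 = i0 + (i1 - i0) * (t_end - t0) / (t1 - t0)
            t1 = t_end
        total += 0.5 * (i0 + i1) / population * (t1 - t0)
    return total


# Daily samples of a hypothetical curve with 100 people in total.
sil = system_infection_load([0, 1, 2, 3], [0, 10, 20, 30], population=100, t_end=2)
```

A smaller value indicates a lower early infection burden, which is why the search in Step 4 keeps the candidate sets with the smallest predicted values.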
[0077] Step 2: Construction of multi-source features for train number propagation control decisions.
[0078] Step 2-1: Calculation of basic train operation and passenger flow characteristics. Calculate the passenger flow characteristic Flow of the train, where Flow is the number of passengers on the train, which can be calculated from the train allocation records; calculate the basic operation characteristics of the train, such as the total travel time, the number of stations the train passes through, and the route category to which the train belongs. The route category can be converted into numerical characteristics using category coding.
[0079] Step 2-2: Train service exposure feature calculation. For a train service $v$, let its segment set be $S_v$; for a segment $s \in S_v$, let $P_{v,s}$ be the set of passengers on board obtained from the snapshots. The total passenger set of the train service is $P_v = \bigcup_{s \in S_v} P_{v,s}$. For a passenger $u \in P_v$, the set of segments traversed is $S_v(u) = \{ s \in S_v : u \in P_{v,s} \}$, and the set of distinct co-riders encountered on train service $v$ is $N_v(u) = \big( \bigcup_{s \in S_v(u)} P_{v,s} \big) \setminus \{u\}$. The exposure feature Contacts is the average number of distinct co-riders per passenger on that train service, and Intervals is the average number of segments a passenger traverses on that train service. They are defined as follows:
[0080] $$\mathrm{Contacts}(v) = \frac{1}{|P_v|} \sum_{u \in P_v} |N_v(u)|, \qquad \mathrm{Intervals}(v) = \frac{1}{|P_v|} \sum_{u \in P_v} |S_v(u)|$$
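The two exposure features can be computed directly from the per-segment passenger sets of one train service. The following sketch follows the verbal definitions above (average number of distinct co-riders, average number of segments traversed); the function name and the dictionary-based input format are assumptions.

```python
from typing import Dict, Set, Tuple


def exposure_features(segment_passengers: Dict[int, Set[str]]) -> Tuple[float, float]:
    """Return (Contacts, Intervals) for one train service.

    segment_passengers maps a segment ID to the set of passengers on board.
    """
    co_riders: Dict[str, Set[str]] = {}
    segments_ridden: Dict[str, int] = {}
    for pax in segment_passengers.values():
        for p in pax:
            co_riders.setdefault(p, set()).update(pax - {p})
            segments_ridden[p] = segments_ridden.get(p, 0) + 1
    n = len(co_riders)
    if n == 0:
        return 0.0, 0.0
    contacts = sum(len(s) for s in co_riders.values()) / n
    intervals = sum(segments_ridden.values()) / n
    return contacts, intervals


contacts, intervals = exposure_features({0: {"a", "b"}, 1: {"a", "b", "c"}, 2: {"a", "c"}})
```

In this toy example every passenger meets two distinct co-riders, and passenger "a" rides three segments while "b" and "c" ride two each.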
[0081] Step 2-3: Constructing the train service transfer network. Let the set of all train services of the day be $\mathcal{T}$. Construct a directed weighted graph $G = (V, E, W)$ with train services as nodes, where $V = \mathcal{T}$, $E$ is the set of directed edges, and $W = [w_{ij}]$ is the adjacency matrix. For any passenger trip, let the assigned train service sequence be $(v_1, v_2, \dots, v_m)$, where $m$ is the number of train services included in the trip (i.e., the sequence length). For every $r \in \{1, \dots, m-1\}$, a directed edge $(v_r, v_{r+1})$ is added to the network, and the edge weights are accumulated by frequency of occurrence to obtain the elements of the adjacency matrix:
[0082] $$w_{ij} = \sum_{\text{trips}} \sum_{r=1}^{m-1} \mathbb{1}[v_r = i,\ v_{r+1} = j]$$
[0083] where $w_{ij}$ is the number of times, over all passenger trips, that train service $i$ directly precedes train service $j$. It reflects the cross-train propagation coupling strength induced by transfer chains, so this construction characterizes the inter-train propagation correlation induced by passenger transfers.
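The accumulation of edge weights over passenger trips can be sketched in a few lines. The sparse dictionary representation keyed by (preceding, following) train service pairs and the function name are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple


def build_transfer_weights(trips: Iterable[Sequence[str]]) -> Dict[Tuple[str, str], int]:
    """Count how often one train service directly precedes another within a trip."""
    weights: Dict[Tuple[str, str], int] = defaultdict(int)
    for sequence in trips:
        for preceding, following in zip(sequence, sequence[1:]):
            weights[(preceding, following)] += 1
    return dict(weights)


# Two passengers transfer from A to B; one of them continues on C; one rides D only.
w = build_transfer_weights([["A", "B", "C"], ["A", "B"], ["D"]])
```

Trips without a transfer contribute no edges, so train services that are never part of a transfer chain appear as isolated nodes.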
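[0084] Step 2-4: Calculation of the structural features of the train service transfer network.
[0085] The weighted in-degree, weighted out-degree, and weighted degree are defined as follows:
[0086] $$s_i^{\mathrm{in}} = \sum_{j} w_{ji}, \qquad s_i^{\mathrm{out}} = \sum_{j} w_{ij}, \qquad s_i = s_i^{\mathrm{in}} + s_i^{\mathrm{out}}$$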
[0087] Edge weights are converted into costs $c_{ij} = 1/(w_{ij} + \varepsilon)$, where $\varepsilon$ is a constant that prevents division by zero, and the outward shortest-path distance $d(i, j)$ is defined on these costs. The outward closeness is then:
[0088]
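[0089] Betweenness centrality is defined as:
[0090] $$B_i = \sum_{s \neq i \neq t} \frac{\sigma_{st}(i)}{\sigma_{st}}$$
[0091] where $\sigma_{st}$ is the number of shortest paths from node $s$ to node $t$, and $\sigma_{st}(i)$ is the number of those shortest paths that pass through node $i$.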
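The weighted degrees and the outward shortest-path distances can be computed with the standard library alone, as sketched below. The cost conversion 1/(w + eps), the value of `eps`, and in particular the closeness normalization (number of reachable nodes divided by the sum of their distances) are assumptions; the source's exact closeness formula is not reproduced here. Betweenness centrality is omitted from the sketch.

```python
import heapq
from typing import Dict, Tuple

Weights = Dict[Tuple[str, str], float]


def weighted_degrees(w: Weights) -> Dict[str, Tuple[float, float, float]]:
    """Return node -> (weighted in-degree, weighted out-degree, weighted degree)."""
    s_in: Dict[str, float] = {}
    s_out: Dict[str, float] = {}
    for (u, v), weight in w.items():
        s_out[u] = s_out.get(u, 0) + weight
        s_in[v] = s_in.get(v, 0) + weight
    nodes = set(s_in) | set(s_out)
    return {n: (s_in.get(n, 0), s_out.get(n, 0), s_in.get(n, 0) + s_out.get(n, 0))
            for n in nodes}


def outward_distances(w: Weights, source: str, eps: float = 1e-6) -> Dict[str, float]:
    """Dijkstra over outgoing edges with cost 1 / (weight + eps)."""
    adjacency: Dict[str, list] = {}
    for (u, v), weight in w.items():
        adjacency.setdefault(u, []).append((v, 1.0 / (weight + eps)))
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adjacency.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def outward_closeness(w: Weights, source: str, eps: float = 1e-6) -> float:
    """Assumed normalization: reachable node count over total outward distance."""
    reachable = [d for node, d in outward_distances(w, source, eps).items()
                 if node != source]
    return len(reachable) / sum(reachable) if reachable else 0.0


w = {("A", "B"): 3, ("B", "C"): 1, ("A", "C"): 1}
degrees = weighted_degrees(w)
closeness_a = outward_closeness(w, "A")
```

Because frequent transfer pairs get small costs, a train service whose passengers often continue onto other services ends up "close" to many nodes.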
[0092] Step 2-5: Feature vector formation and normalization. The above features are concatenated to form the train service feature vector, and each feature dimension is normalized for use in training the set-based surrogate model.
[0093] Step 3: Train the set-based surrogate model.
[0094] Step 3-1: Generation of the training sample set. Under the train service budget constraint $k$, train service control set samples $C_j$ are constructed; the total number of samples in this embodiment is denoted $n$. The data set is randomly partitioned into training, validation, and test sets for model training, hyperparameter selection, and independent performance evaluation. The sample sets are generated using a multi-strategy hybrid approach with random perturbation, so that the training samples are representative across different levels of propagation control effectiveness. For each set $C_j$, the label $y_j = \mathrm{SIL}(C_j)$ is obtained by calling the propagation simulation of Step 1, forming the training set $\mathcal{D} = \{(C_j, y_j)\}$;
[0095] Step 3-2: Set-based surrogate model structure. To ensure insensitivity to the ordering of set elements, a network structure that satisfies permutation invariance is adopted:
[0096] $$\hat{y}(C) = g\Big( \operatorname{Pool}_{v \in C}\, \phi(x_v) \Big)$$
[0097] where $\phi$ is the train service embedding network, $g$ is the regression network, $x_v$ is the feature vector of train service $v$, and $\operatorname{Pool}$ denotes the permutation-invariant pooling over the set. In this embodiment, $\phi$ adopts a multilayer perceptron structure consisting of an input layer, several fully connected layers, and nonlinear activation layers, and maps the train service feature vector to a fixed-dimensional embedding. $g$ also adopts a multilayer perceptron structure, consisting of several fully connected layers and an output layer, and outputs a single scalar prediction.
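The permutation invariance of this structure can be demonstrated with a small, untrained forward pass in plain Python. Sum pooling, the layer sizes, and the random weight initialization are assumptions for illustration; the sketch does not include training and is not the model used in the embodiment.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def make_mlp(sizes: Sequence[int], rng: random.Random) -> Callable[[Vector], Vector]:
    """Build a small MLP with random weights; ReLU on all layers except the last."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        bias = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((weights, bias))

    def forward(x: Vector) -> Vector:
        for idx, (weights, bias) in enumerate(layers):
            x = [sum(wi * xi for wi, xi in zip(row, x)) + b
                 for row, b in zip(weights, bias)]
            if idx < len(layers) - 1:
                x = [max(0.0, value) for value in x]
        return x

    return forward


def predict_set(features: Sequence[Vector], phi, g) -> float:
    """Embed each train service, sum-pool the embeddings, then regress to a scalar."""
    embeddings = [phi(x) for x in features]          # assumes a non-empty set
    pooled = [sum(column) for column in zip(*embeddings)]
    return g(pooled)[0]


rng = random.Random(0)
phi = make_mlp([3, 8, 4], rng)    # embedding network (hypothetical sizes)
g = make_mlp([4, 8, 1], rng)      # regression network (hypothetical sizes)
control_set = [[0.2, 0.5, 0.1], [0.9, 0.3, 0.7], [0.4, 0.4, 0.6]]
prediction = predict_set(control_set, phi, g)
prediction_shuffled = predict_set(list(reversed(control_set)), phi, g)
```

The two predictions agree up to floating-point rounding because the pooled vector does not depend on the order in which the train services are listed.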
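[0098] Step 3-3: Train the surrogate model. The mean squared error is used as the training objective function:
[0099] $$\mathcal{L} = \frac{1}{n} \sum_{j=1}^{n} \big( \hat{y}(C_j) - y_j \big)^2$$ where $\hat{y}(C_j)$ is the surrogate prediction for the $j$-th training set $C_j$, $y_j$ is its simulated label, and $n$ is the number of training samples.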
[0100] After training, the surrogate model parameters are fixed and used for rapid evaluation of the candidate set in step 4.
[0101] Independent evaluation on the test set showed that the Spearman correlation coefficient reached [value missing]. Kendall's coefficient is The mean square error is This indicates that the surrogate model has high consistency and reliability in judging the quality of sets, and can be used as a rapid evaluation model in the subsequent combinatorial search stage.
[0102] Step 4: Search and verification output based on probability-based iterative updates.
[0103] Step 4-1: Initialization and Parameter Setting. Assume the total number of train trips is... ,initialization:
[0104] in To define in the set of train numbers probability distribution on ), used for sampling without replacement to generate a scale of The candidate set. Set the number of samples. Elite ratio Smoothing coefficient Maximum number of iterations Convergence threshold Example setup: , , , , .
[0105] Step 4-2: Candidate set sampling. In each iteration, candidate sets are drawn from the current probability vector by sampling without replacement; the number of candidate sets equals the preset number of samples, and the size of each candidate set equals the train budget;
[0106] Step 4-3: Surrogate evaluation and elite selection. The set surrogate model computes the predicted SIL of each candidate set. The candidates are sorted in ascending order of predicted value, and the leading candidates, in the proportion given by the elite ratio, are taken as the elite set;
[0107] Step 4-4: Probability Update. Update the probability vector based on the frequency of train occurrences in the elite set:
[0108]
[0109] Here the indicator function equals 1 when the train belongs to the elite candidate set and 0 otherwise; the updated probability vector is then normalized.
[0110] Step 4-5: Termination and output. Iteration stops when the change in the probability vector between two adjacent rounds falls below the convergence threshold or the number of iterations reaches the maximum. The trains with the highest probabilities, as many as the train budget, are output as the optimized train control set;
[0111] Step 4-6: High-fidelity verification and strategy comparison. The optimized train control set is input into the simulation model of step 1 to obtain its real SIL;
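Steps 4-1 to 4-5 follow the pattern of the cross-entropy method [3], and the loop can be sketched as below. This is an illustration under stated assumptions, not the embodiment's implementation: the uniform initial vector, the default parameter values, the small sampling floor `1e-12`, normalization to unit sum, and the maximum absolute change as the convergence measure are all choices made here. `surrogate` is a hypothetical callable standing in for the trained model.

```python
import random

def sample_without_replacement(probs, k, rng):
    """Draw k distinct train indices, each draw weighted by the remaining probabilities."""
    remaining = list(range(len(probs)))
    weights = [probs[i] + 1e-12 for i in remaining]  # floor keeps sampling well defined
    chosen = []
    for _ in range(k):
        pick = rng.choices(range(len(remaining)), weights=weights)[0]
        chosen.append(remaining.pop(pick))
        weights.pop(pick)
    return frozenset(chosen)

def probability_search(surrogate, n_trains, k, n_samples=200, elite_ratio=0.1,
                       alpha=0.7, max_iters=50, tol=1e-3, seed=0):
    rng = random.Random(seed)
    probs = [1.0 / n_trains] * n_trains              # uniform start (assumed)
    n_elite = max(1, int(elite_ratio * n_samples))
    for _ in range(max_iters):
        candidates = [sample_without_replacement(probs, k, rng)
                      for _ in range(n_samples)]
        # Smaller predicted SIL is better, so the elite set is the ascending head.
        elite = sorted(candidates, key=surrogate)[:n_elite]
        freq = [sum(i in s for s in elite) / n_elite for i in range(n_trains)]
        # Smoothed update toward the elite frequencies, then normalization.
        updated = [(1 - alpha) * p + alpha * f for p, f in zip(probs, freq)]
        total = sum(updated)
        updated = [u / total for u in updated]
        change = max(abs(u - p) for u, p in zip(updated, probs))
        probs = updated
        if change < tol:
            break
    best = sorted(range(n_trains), key=lambda i: probs[i], reverse=True)[:k]
    return sorted(best), probs

# Toy surrogate: controlling higher-indexed trains lowers the predicted value more.
toy_gain = list(range(10))
best_set, final_probs = probability_search(
    lambda s: -sum(toy_gain[i] for i in s), n_trains=10, k=3)
print(best_set)
```

The returned set would then be passed once to the high-fidelity simulation of step 1, as in step 4-6, so that the expensive simulation is called for verification rather than inside the search loop.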
[0112] To verify the effectiveness of the present invention, the following comparative control strategies were constructed under the same train budget k: (1) no intervention; (2) random selection of k trains (Random@k); (3) selection of the top k trains ranked by the passenger flow index Flow (Flow@k); (4) selection of the top k trains ranked by Closeness in the train transfer network (Closeness@k); (5) selection of the top k trains ranked by the co-occurrence exposure index Contacts (Contacts@k). The train control sets obtained from each strategy are input into the propagation simulation model to obtain the corresponding infection ratio curves and SIL values, and the strategies are compared in terms of the infection ratio curve, the peak infection ratio, the peak occurrence time, the SIL, and the proportional reduction in SIL relative to no control. The comparison results are shown in Figure 2 and Table 1.
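The ranking baselines above amount to a top-k selection on a single per-train index. A minimal sketch follows; the dictionaries `flow`, `closeness`, and `contacts`, the train IDs, and all numeric values are made up for illustration, and `relative_reduction` assumes the reduction is measured against the no-control SIL.

```python
import random

def top_k(scores, k):
    """Return the k train IDs with the largest score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

def baseline_sets(flow, closeness, contacts, k, seed=0):
    """Build the five comparison control sets under the same budget k."""
    rng = random.Random(seed)
    trains = sorted(flow)
    return {
        "No intervention": [],
        "Random@k": rng.sample(trains, k),
        "Flow@k": top_k(flow, k),
        "Closeness@k": top_k(closeness, k),
        "Contacts@k": top_k(contacts, k),
    }

def relative_reduction(sil_controlled, sil_no_control):
    """Proportional SIL reduction relative to the no-control case."""
    return (sil_no_control - sil_controlled) / sil_no_control

flow = {"T1": 120, "T2": 300, "T3": 80, "T4": 210}           # illustrative values
closeness = {"T1": 0.4, "T2": 0.2, "T3": 0.9, "T4": 0.5}
contacts = {"T1": 35.0, "T2": 50.0, "T3": 12.0, "T4": 61.0}
for name, control_set in baseline_sets(flow, closeness, contacts, k=2).items():
    print(name, control_set)
```

Each of these sets would be evaluated with the same propagation simulation as the optimized set, so that differences in SIL reflect the selection strategy alone.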
[0113] In this embodiment, the objective value of the optimal set predicted by the surrogate model during the cross-entropy search phase is [value missing]. This value is a surrogate prediction; the value obtained from the actual propagation simulation is shown in Table 1.
[0114] The results show that the method of the present invention outperforms the above comparison methods in terms of the infection ratio curve, the peak infection ratio, the delay of the transmission peak, and the SIL indicator.
[0115] The computational advantage of this invention stems from the combination of a surrogate model and a probabilistic iterative search framework. In this embodiment, a single propagation simulation takes approximately 14 minutes, while the search phase requires the evaluation of approximately ... candidate sets. If all of them were evaluated with real propagation simulations, a large amount of computational resources and time would be consumed. This invention instead uses the surrogate model for approximate evaluation, which reduces the time of a single search round to approximately 0.29 seconds, so that the whole search completes within seconds to minutes. This significantly reduces the computational cost of the combinatorial optimization stage and makes the optimization of large-scale train control schemes feasible in engineering practice.
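The scale of the saving can be illustrated with simple arithmetic on the two timings stated above. The evaluation count and the number of search rounds used below are hypothetical placeholders, since the embodiment's actual counts are not given here.

```python
SIMULATION_MINUTES = 14.0        # per propagation simulation, from the embodiment
SURROGATE_ROUND_SECONDS = 0.29   # per search round, from the embodiment

def simulation_hours(n_evaluations):
    """Wall-clock hours if every candidate set were simulated sequentially."""
    return n_evaluations * SIMULATION_MINUTES / 60.0

def surrogate_seconds(n_rounds):
    """Wall-clock seconds for a surrogate-based search of n_rounds rounds."""
    return n_rounds * SURROGATE_ROUND_SECONDS

hypothetical_evaluations = 1000  # placeholder, not a figure from the embodiment
hypothetical_rounds = 50         # placeholder, not a figure from the embodiment
print(round(simulation_hours(hypothetical_evaluations), 1), "hours by simulation")
print(round(surrogate_seconds(hypothetical_rounds), 1), "seconds by surrogate search")
```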
[0116] Table 1
[0117] .
[0118] In this embodiment, the time at which the infection ratio grows fastest under no control is selected as the end point of the early window.
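The choice of the early window and the resulting SIL can be sketched as follows. The sketch assumes a uniformly sampled infection ratio curve, takes the end of the steepest sampling step as the window end, and uses the trapezoidal rule for the integral; these discretization choices and the sample values are assumptions.

```python
def early_window_end(times, infection_ratio):
    """Time at which the no-control curve shows its largest one-step increase."""
    increments = [b - a for a, b in zip(infection_ratio, infection_ratio[1:])]
    steepest = max(range(len(increments)), key=increments.__getitem__)
    return times[steepest + 1]   # end of the steepest step (assumed convention)

def sil(times, infection_ratio, t_end):
    """Trapezoidal integral of the infection ratio from the first sample up to t_end."""
    total = 0.0
    samples = list(zip(times, infection_ratio))
    for (t0, y0), (t1, y1) in zip(samples, samples[1:]):
        if t1 > t_end:
            break
        total += 0.5 * (y0 + y1) * (t1 - t0)
    return total

times = [0.0, 1.0, 2.0, 3.0, 4.0]               # days, illustrative
no_control = [0.0, 0.1, 0.4, 0.5, 0.55]         # illustrative infection ratios
window_end = early_window_end(times, no_control)
print(window_end, sil(times, no_control, window_end))
```

The same window end, fixed from the no-control curve, would be reused when integrating the curve of every controlled scenario so that the SIL values are comparable.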
[0119] Figure 2 shows the infection ratio over time under different train control strategies within the same budget. The horizontal axis represents time (days), with a data recording time step of 30 minutes; the vertical axis represents the system infection ratio. From Figure 2 it can be observed that the infection ratio curve of the method of this invention lies generally below those of the other strategies, the peak infection ratio is significantly reduced, the peak occurs later, and the growth in the early stage of transmission is significantly slower. These observations indicate that, by optimizing the selected trains jointly as a set, the method of this invention weakens the key coupling paths in the transmission network more effectively and thereby achieves a better system-level transmission control effect.
[0120] Figure 3 illustrates the evolution of the probability vector and of the objective value during the search. It includes three metrics: the optimal objective value, i.e., the surrogate-predicted SIL of the best candidate set in each iteration; the probability change index, which measures the difference between the probability vectors of two adjacent rounds; and the entropy of the probability distribution, which characterizes how concentrated the distribution is. It can be observed that the optimal objective value gradually decreases and stabilizes as the iterations proceed; the probability change index gradually decreases and approaches the preset threshold; and the entropy gradually decreases, indicating that the probability distribution becomes more concentrated. These results show that the probabilistic iterative search converges stably.
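The two probability-based diagnostics can be sketched as below. The exact definitions used for Figure 3 are not given here, so Shannon entropy of the normalized vector and the maximum absolute per-train change are assumptions chosen for illustration.

```python
import math

def distribution_entropy(probs):
    """Shannon entropy of the normalized probability vector (natural log)."""
    total = sum(probs)
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)

def change_index(previous_probs, probs):
    """Largest absolute per-train change between two adjacent rounds."""
    return max(abs(a - b) for a, b in zip(previous_probs, probs))

uniform = [0.25, 0.25, 0.25, 0.25]
concentrated = [0.85, 0.05, 0.05, 0.05]
print(distribution_entropy(uniform) > distribution_entropy(concentrated))
print(change_index(uniform, concentrated))
```

A falling entropy together with a change index below the threshold corresponds to the stable convergence behaviour described above.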
[0121] References:
[0122] [1] Hethcote H W. The mathematics of infectious diseases[J]. SIAM Review, 2000, 42(4): 599-653.
[0123] [2] Zaheer M, Kottur S, Ravanbakhsh S, et al. Deep Sets. NeurIPS, 2017. [3] de Boer P T, Kroese D P, Mannor S, Rubinstein R Y. A tutorial on the cross-entropy method. Annals of Operations Research, 2005.
Claims
1. An optimization method for controlling the spread of epidemics in subway train services based on a surrogate model and probabilistic search, characterized in that the method comprises: reconstructing a continuous-time carriage co-occurrence contact network at the train scale based on smart card data and timetables; establishing an event-driven SEIR propagation simulation model on the co-occurrence contact network, and defining the early system infection load (SIL) as the indicator for evaluating the effectiveness of transmission control; constructing train passenger flow features, exposure features, and transfer network structure features; generating candidate train control sets and the corresponding label data under the train budget constraint, and training a surrogate prediction model that is insensitive to the order of the trains in a set; based on the surrogate model, applying a search strategy with iterative probability updates that performs sampling, screening, and probability updating to obtain an optimized train control set; and finally verifying and outputting the result through the high-fidelity propagation simulation model. The specific steps are as follows:
Step 1: Construct a propagation simulation model for evaluating the effectiveness of a train propagation control scheme, including identifying the full set of trains in the metro system based on smart card data and timetables, reconstructing the continuous-time passenger co-occurrence contact network at the train scale, and establishing a simulation evaluation model that evaluates the propagation control effect of any train control set.
Step 2: Construct multi-source train features for propagation control decisions, including calculating multi-source features related to propagation risk at the train scale. The multi-source features include at least passenger flow features, co-occurrence exposure features, and structural features of the inter-train coupling network induced by transfer behavior, and together form the train feature vector.
Step 3: Train the surrogate prediction model, including generating several candidate train control sets under the train budget constraint and obtaining the corresponding propagation control effect labels with the propagation simulation model of Step 1; based on the multi-source train features of Step 2, train a surrogate prediction model that is insensitive to the order of the trains in a set, so as to rapidly predict the propagation control effect of any candidate train control set.
Step 4: Iterative search and output verification, including initializing the train selection probability distribution; using a search method based on iterative probability updates to cyclically perform candidate set sampling, effect prediction and screening, and probability updating; outputting the optimized train control set; inputting the optimized train control set into the propagation simulation model of Step 1 for verification; and generating a list of priority control trains for engineering implementation.
2. The method for optimizing the control of epidemic transmission in subway train services according to claim 1, characterized in that the specific process of step 1 is as follows:
Step 1-1: Smart card data preprocessing and travel record generation; entry and exit events are extracted from smart card transaction records and cleaned to form passenger travel records in units of "entry-exit";
Step 1-2: Timetable parameter extraction and train operation time calculation; for each line and direction, the first and last train times and the time-varying departure interval function are extracted, and all trains of the day and their planned passing times at each station are generated recursively, yielding the set of all trains;
Step 1-3: Trip segmentation and train allocation; each passenger trip is decomposed into several "single-line, single-direction" trip segments according to the subway network topology; when the trip involves transfers, multiple trip segments are formed with the transfer stations as dividing points, and each trip segment is uniquely assigned to a train based on the relationship between the passenger's arrival time at the platform and the passing times of adjacent trains;
Step 1-4: Reconstruction of the continuous-time co-occurrence contact network; for each train, a time-ordered sequence of co-occurrence snapshots is generated from passenger boarding and alighting events in each interval between adjacent stations; each snapshot records the train ID, the interval ID, the snapshot start and end times, and the passenger set;
Step 1-5: Event-driven SEIR propagation simulation; based on the co-occurrence snapshot sequence, propagation events, within-host state transition events, and other events are executed, and the system infection ratio curve is output;
Step 1-6: Train control mechanism; given a train control set that is a subset of the set of all trains, when the train of a snapshot belongs to the control set, the contact infection probability within that snapshot is reduced from the baseline contact infection probability by a factor that characterizes the reduction in transmission probability caused by engineering measures such as enhanced carriage ventilation and ultraviolet disinfection;
Step 1-7: SIL calculation; the infection ratio curve, i.e., the number of infected people at each moment divided by the total number of people, is integrated over a preset early time window ending at the preset early propagation observation termination time to obtain the SIL; the SIL corresponding to a given train control set is recorded as its label.
3. The method for optimizing the control of epidemic transmission in subway train services according to claim 2, characterized in that the specific process of step 2 is as follows:
Step 2-1: Calculation of basic operation and passenger flow features of trains; the basic operation attributes of each train, such as total operating time, number of stations passed, and line category, are calculated together with the passenger flow feature Flow; Flow is the number of passengers carried by each train, counted from the train allocation records;
Step 2-2: Calculation of train exposure features; the exposure features Contacts and Intervals are calculated from the co-occurrence snapshots, where Contacts is the average number of distinct passengers that each passenger on the train encounters within that train, and Intervals is the average number of intervals traversed by the passengers on the train; for each train, the passenger set of each of its intervals is obtained from the snapshots, the total passenger set of the train is formed from these interval passenger sets, and for each passenger the set of distinct co-riders encountered on the train is formed from the passenger sets of the intervals that the passenger traversed; Contacts and Intervals are then the averages, over the total passenger set of the train, of the number of distinct co-riders and of the number of intervals traversed, respectively;
Step 2-3: Construction of the train transfer network; a directed weighted train transfer network is built with trains as nodes, comprising a node set, a directed edge set, and a weighted adjacency matrix; for the ordered sequence of trains taken in any passenger's trip, a directed edge is established from each train to the next train in the sequence, and the edge weight accumulates the number of times this "preceding train to following train" transfer appears over all trips;
Step 2-4: Calculation of train transfer network structure features; the weighted degree, closeness centrality, and betweenness centrality are calculated on the train transfer network to form the train structure feature vector, where: the weighted degree is defined from the weighted in-degree and the weighted out-degree of the node; edge weights are converted into costs, with a constant included to prevent division by zero, and the outgoing shortest-path distances are defined on these costs; closeness centrality is defined from these shortest-path distances; and betweenness centrality is defined from the number of shortest paths between each pair of nodes and the number of those shortest paths that pass through the node;
Step 2-5: Feature vector formation and normalization; the train passenger flow features, train exposure features, and train transfer network structure features are normalized and then concatenated to obtain the feature vector of each train.
4. The method for optimizing the control of epidemic transmission in subway train services according to claim 3, characterized in that the specific process of step 3 is as follows:
Step 3-1: Training sample generation; several groups of candidate train control set samples are generated under the train budget constraint; for each sample, the SIL label is obtained by calling the propagation simulation of step 1, and the samples and labels form the training set;
Step 3-2: Definition of the set surrogate model; a set surrogate prediction model that is insensitive to the order of the trains in a set is constructed, consisting of a train embedding network, which maps the features of an individual train into a low-dimensional embedded representation, and a regression network, which maps the pooled embedding vectors to the predicted propagation control effect; both networks are built with a multilayer perceptron architecture;
Step 3-3: Model training; the model is trained with mean squared error as the objective, enabling rapid prediction of the SIL of any train control set.
5. The method for optimizing the control of epidemic transmission in subway train services according to claim 4, characterized in that the specific process of step 4 is as follows:
Step 4-1: Probability distribution initialization; given the total number of trains, the train selection probability vector is initialized, and the number of samples, the elite ratio, the smoothing coefficient, the maximum number of iterations, and the convergence threshold are set;
Step 4-2: Iterative sampling and elite selection; in each iteration, sampling without replacement is performed according to the current probability vector to generate the preset number of candidate sets, each containing as many distinct trains as the train budget; the surrogate model computes the predicted SIL of each candidate set, and the candidates with the smallest predicted values, in the proportion given by the elite ratio, are selected as the elite set;
Step 4-3: Probability update; the occurrence frequency of each train in the elite set is computed, and a smoothed update weighted by the smoothing coefficient is applied to the probability vector to suppress oscillations;
Step 4-4: Convergence and output; iteration stops when the convergence threshold is met or the maximum number of iterations is reached, and the trains with the highest probabilities, as many as the train budget, constitute the optimized train control set;
Step 4-5: High-fidelity verification and strategy comparison; the optimized train control set is input into the simulation model of step 1 to obtain its real SIL, which is compared with that of the baseline train control sets under the same budget, and the relative improvement and the final train list are output.