Infectious disease pre-checking and protective equipment issuing method and system based on deep learning
By using deep learning technology to quantitatively assess symptom and contact history data, and combining this with regional situation data, the priority of protective equipment distribution is dynamically calculated. This solves the problem of unreasonable resource allocation in infectious disease prevention and control, realizes intelligent and multi-objective optimized distribution of protective equipment, and improves the effectiveness of prevention and control.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- THE FIRST MEDICAL CENT CHINESE PLA GENERAL HOSPITAL
- Filing Date
- 2026-03-12
- Publication Date
- 2026-06-12
AI Technical Summary
In infectious disease control, existing technologies fail to dynamically respond to changes in the epidemic situation in the distribution of protective equipment, resulting in unreasonable resource allocation and a lack of self-iterative optimization capabilities, making it difficult to achieve dynamic optimization allocation based on multiple objectives.
A deep learning-based infectious disease pre-detection method is adopted. The risk assessment network is used to quantitatively evaluate symptom descriptions and contact history data. Combined with regional situation data, the priority of protective equipment distribution is dynamically calculated. Distribution instructions are generated through a multi-objective optimization algorithm, and the priority is adjusted in real time to optimize resource allocation.
It has enabled intelligent and dynamic scheduling of protective equipment, ensuring that high-risk individuals and severely affected areas have priority access to resources, improving resource allocation efficiency and overall prevention and control effectiveness, and forming a closed-loop management system.
Figure CN122201837A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to deep learning technology, and more particularly to a method and system for infectious disease pre-detection and distribution of protective equipment based on deep learning.
Background Technology
[0002] In the field of infectious disease control, pre-screening and triage, along with the rational allocation of protective equipment, are crucial links in breaking the chain of transmission and ensuring the efficient use of public health resources. Current technologies typically rely on a combination of manual consultation and simple scoring cards to collect and preliminarily assess information such as the epidemiological history and clinical symptoms of patients or screening subjects. Based on this assessment, staff categorize individuals into different risk levels according to established, relatively static classification standards, and accordingly decide whether to distribute protective equipment and what type, such as masks, gloves, or face shields, to them. In this process, regional epidemic trends, such as the number of local cases and the level of transmission risk, are sometimes considered as macro-level background information, but are rarely deeply quantified and integrated into real-time decision-making for individual risk assessment and resource allocation. The priority of material distribution is mainly based on an established risk level order, or, when inventory is tight, simple rules such as first-come, first-served or allocation according to need are adopted.
[0003] However, the aforementioned conventional practices have significant limitations. The distribution strategies for protective equipment lack intelligent coordination with the dynamically changing epidemic situation and resource inventory. Static allocation rules cannot respond to real-time fluctuations in the intensity of epidemic transmission, nor can they achieve dynamic optimization based on multiple objectives (such as maximizing overall protection effectiveness, prioritizing key populations, and balancing regional risks) when resources are scarce. This easily leads to unreasonable resource allocation, creating protection gaps in certain areas or among certain populations, while also resulting in resource waste in other areas. Furthermore, the existing processes lack sufficient closed-loop functionality. Post-distribution feedback, such as the recipients' subsequent health status, is rarely systematically collected and used to evaluate and correct the accuracy of the initial risk prediction models, rendering the system lacking in self-iteration and optimization capabilities.
Summary of the Invention
[0004] This invention provides a method and system for infectious disease pre-detection and distribution of protective equipment based on deep learning, which can solve the problems in the prior art.
[0005] A first aspect of this invention provides a method for infectious disease pre-detection and distribution of protective equipment based on deep learning, comprising: acquiring symptom description data and contact history data of the subjects to be tested, inputting them into the risk assessment network, and outputting risk assessment results; determining the risk classification identifier and transmission intensity index of each subject based on the risk assessment results and regional situation data; based on the risk classification identifier, the transmission intensity index, and the protective equipment inventory status, dynamically calculating the distribution priority of each subject using a multi-objective optimization algorithm and generating distribution instructions for protective equipment; executing the distribution operation according to the distribution instructions and collecting distribution feedback data; and performing time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results to extract prediction deviation features, updating the parameters of the risk assessment network based on the prediction deviation features, and generating a confidence correction value that is used when calculating the distribution priority of subsequent subjects.
[0006] The steps for obtaining symptom description data and exposure history data of the subjects to be tested, inputting them into the risk assessment network, and outputting risk assessment results include: Semantic features are extracted from the symptom description data, and the extracted symptom semantic features are matched with a preset infectious disease symptom feature database to generate a symptom matching degree vector; The contact history data is analyzed in a spatiotemporal manner to extract contact time nodes, contact location coordinates, and contact object identifiers, and a spatiotemporal contact relationship map is constructed. The symptom matching vector and the spatiotemporal contact relationship map are input into the risk assessment network. The symptom features and contact relationship features are cross-correlated and encoded through a multi-layer feature fusion module to generate a fused feature representation. Based on the fusion feature representation, the infection probability value and transmission risk level are calculated through the risk assessment output layer as the risk assessment result.
[0007] Based on the risk assessment results and regional situation data, the steps for determining the risk classification identifier and transmission intensity index of the subject to be tested include: The infection probability value from the risk assessment results is compared with a preset set of risk thresholds to determine the preliminary risk level identifier of the subject. Spatiotemporal distribution analysis is performed on the regional situation data to extract the number of confirmed cases and the population density parameter at the location of the subject, and a regional transmission trend feature vector is constructed. The preliminary risk level identifier, the transmission risk level in the risk assessment result, and the regional transmission trend feature vector are weighted and fused, with the number of confirmed cases used as the regional risk weight coefficient to dynamically correct the preliminary risk level identifier, thereby generating the risk classification identifier of the subject. Based on the risk classification identifier and the regional transmission trend feature vector, the virus transmission rate and infection spread radius at the location of the subject are calculated, with the product of the number of confirmed cases and the population density parameter used as a transmission acceleration factor in the calculation of the infection spread radius, to generate the transmission intensity index.
[0008] Based on the distribution priority sorting sequence and the inventory constraint set, the distribution type and quantity of each object to be tested are determined, and a distribution instruction for protective equipment is generated.
[0009] The distribution sequences for high-risk groups, medium-risk groups, and low-risk groups are merged according to group priority to generate a distribution priority sorting sequence.
[0010] Extract the symptom feature combinations, contact relationship patterns, and regional situation parameters corresponding to the deviation samples, construct a set of deviation feature vectors, and statistically analyze the occurrence frequency and correlation strength of various deviation features as predicted deviation features.
[0011] When the similarity exceeds the preset similarity threshold, when dynamically calculating the distribution priority of the object to be tested based on the risk classification identifier, transmission intensity index and protective equipment inventory status, the confidence correction value is used as the priority adjustment coefficient to correct the calculated distribution priority, and a confidence-corrected distribution priority is generated.
[0012] A second aspect of the present invention provides a deep learning-based infectious disease pre-detection and protective equipment distribution system, comprising: The data input unit is used to acquire symptom description data and contact history data of the subject to be tested, input them into the risk assessment network, and output the risk assessment results. The risk classification unit is used to determine the risk classification identifier and transmission intensity index of the object to be detected based on the risk assessment results and regional situation data. The priority calculation unit is used to dynamically calculate the distribution priority of each object to be tested based on the risk classification identifier, the transmission intensity index and the inventory status of protective equipment, and generate the distribution instruction of protective equipment. The issuance execution unit is used to execute the issuance operation according to the issuance instruction and collect issuance feedback data; The network optimization unit is used to perform time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results, extract prediction deviation features, update the parameters of the risk assessment network based on the prediction deviation features and generate a confidence correction value, and use the confidence correction value for subsequent distribution priority calculation of the target objects.
[0013] A third aspect of the present invention provides an electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0014] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0015] This invention inputs symptom descriptions and contact history data into a risk assessment network, directly outputting quantitative risk assessment results. This avoids the subjectivity and inconsistency of traditional manual screening, significantly improving the efficiency and objectivity of initial screening. This method enables intelligent dynamic scheduling of protective equipment distribution. Based on risk classification, transmission intensity, and real-time inventory status, a multi-objective optimization algorithm calculates the distribution priority for each individual, thereby generating distribution instructions. This mechanism ensures that, under limited resource conditions, protective equipment is prioritized for allocation to high-risk individuals and severely affected areas, optimizing resource allocation efficiency and improving overall prevention and control effectiveness. Feedback data is automatically collected after the distribution operation is executed, forming a closed-loop management system.
Attached Figure Description
[0016] Figure 1 is a flowchart of the method for infectious disease pre-detection and distribution of protective equipment based on deep learning, as described in an embodiment of the present invention. Figure 2 is a flowchart for determining the risk classification identifier and transmission intensity index of the target object.
Detailed Implementation
[0017] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0018] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes will not be repeated in some embodiments.
[0019] Figure 1 is a flowchart of a deep learning-based method for infectious disease pre-detection and protective equipment distribution, as described in an embodiment of the present invention. As shown in Figure 1, the method includes: acquiring symptom description data and contact history data of the subjects to be tested, inputting them into the risk assessment network, and outputting risk assessment results; determining the risk classification identifier and transmission intensity index of each subject based on the risk assessment results and regional situation data; based on the risk classification identifier, the transmission intensity index, and the protective equipment inventory status, dynamically calculating the distribution priority of each subject using a multi-objective optimization algorithm and generating distribution instructions for protective equipment; executing the distribution operation according to the distribution instructions and collecting distribution feedback data; and performing time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results to extract prediction deviation features, updating the parameters of the risk assessment network based on the prediction deviation features, and generating a confidence correction value that is used when calculating the distribution priority of subsequent subjects.
[0020] In one optional implementation, the steps of acquiring symptom description data and exposure history data of the subject to be tested, inputting them into the risk assessment network, and outputting risk assessment results include: Semantic features are extracted from the symptom description data, and the extracted symptom semantic features are matched with a preset infectious disease symptom feature database to generate a symptom matching degree vector; The contact history data is analyzed in a spatiotemporal manner to extract contact time nodes, contact location coordinates, and contact object identifiers, and a spatiotemporal contact relationship map is constructed. The symptom matching vector and the spatiotemporal contact relationship map are input into the risk assessment network. The symptom features and contact relationship features are cross-correlated and encoded through a multi-layer feature fusion module to generate a fused feature representation. Based on the fusion feature representation, the infection probability value and transmission risk level are calculated through the risk assessment output layer as the risk assessment result.
[0021] For example, symptom description data of the subjects to be tested is collected through mobile terminals or self-service terminals. This data is input in text form and includes subjective or objective descriptions such as body temperature, cough frequency, degree of difficulty breathing, and fatigue. A pre-trained BERT model is used to encode the input text. The BERT model is a bidirectional language representation model based on the Transformer architecture, which possesses powerful semantic understanding capabilities after pre-training on large-scale corpora. The encoding process converts the text sequence into a fixed-dimensional vector representation, with the vector dimension set to 256 or 512 dimensions, each dimension corresponding to a semantic feature component of the text.
[0022] The pre-built infectious disease symptom feature database stores typical symptom vectors for various infectious diseases such as influenza and tuberculosis. Each infectious disease corresponds to a set of standard feature vectors, which are generated by training on symptom data from historical confirmed cases. The symptom semantic feature vector is matched against the symptom feature vectors of each infectious disease in the database using cosine similarity. The similarity value ranges from 0 to 1, with values closer to 1 indicating higher similarity. The symptom matching degree vector contains one matching degree component per infectious disease type, forming a multi-dimensional quantitative representation. For example, the symptom matching degree vector of a certain target object is [0.85, 0.32, 0.15], where each component is the matching degree for one infectious disease type, such as influenza and tuberculosis.
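The matching step above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names, the toy 3-dimensional vectors, and the two-entry database are assumptions made for the example. Cosine similarity in general lies between -1 and 1, so the sketch clips negative values to 0 to match the 0 to 1 range stated in the text; that clipping is also an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, clipped to [0, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no direction to compare against
    # Assumption: negative similarities are clipped to match the stated range.
    return max(0.0, dot / (norm_a * norm_b))


def symptom_matching_vector(symptom_vec, disease_db):
    """One matching degree per disease, in the database's insertion order."""
    return [cosine_similarity(symptom_vec, ref) for ref in disease_db.values()]


# Hypothetical toy database; real vectors would have 256 or 512 dimensions.
disease_db = {
    "influenza": [1.0, 0.0, 0.0],
    "tuberculosis": [0.0, 1.0, 0.0],
}
match = symptom_matching_vector([3.0, 4.0, 0.0], disease_db)
```

With these toy vectors, `match` holds one component per database entry, in the same order as the entries.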
[0023] For processing contact history data, the timestamp information, GPS coordinate information, and contact object identification information in the input data are parsed. Timestamp information is accurate to the minute, recording specific contact time points; GPS coordinates are represented using latitude and longitude and then converted into standardized spatial grid codes, with grid sizes set to 50m×50m or 100m×100m for easy subsequent spatial clustering analysis; contact object identification uses anonymized numbering to protect privacy. When constructing the spatiotemporal contact relationship map, the object to be detected is used as the central node, and objects it has contacted are used as associated nodes. The edge weight between nodes is determined by both contact duration and contact distance, calculated as: Edge weight = Contact duration (minutes) × Distance attenuation coefficient, where the attenuation coefficient is 1.0 for distances less than 2 meters, 0.5 for distances between 2 and 5 meters, and 0.1 for distances greater than 5 meters. Contact relationships with a duration exceeding 15 minutes and a distance less than 2 meters are assigned higher weight values. The graph data structure uses adjacency matrix or sparse matrix to improve computational efficiency. Node features include the object's historical health status labels, such as confirmed, suspected, or healthy.
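As a rough sketch of the edge weight rule above, the following Python computes the weight from contact duration and distance. The function names are illustrative, and the handling of the exact boundaries (2 m and 5 m both fall in the middle band) is an assumption, since the text does not say which band the boundary values belong to.

```python
def distance_attenuation(distance_m):
    """Attenuation coefficient for a contact at the given distance in meters."""
    if distance_m < 2.0:
        return 1.0
    if distance_m <= 5.0:  # assumption: 2 m and 5 m belong to the middle band
        return 0.5
    return 0.1


def edge_weight(duration_min, distance_m):
    """Edge weight = contact duration (minutes) x distance attenuation coefficient."""
    return duration_min * distance_attenuation(distance_m)


# Sparse adjacency for one central subject; contact IDs are anonymized placeholders.
contacts = [("c001", 20, 1.5), ("c002", 20, 3.0), ("c003", 20, 10.0)]
adjacency = {cid: edge_weight(minutes, meters) for cid, minutes, meters in contacts}
```

A dictionary keyed by contact ID is one simple sparse representation; an adjacency matrix would serve equally well for small graphs.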
[0024] The risk assessment network comprises two parallel processing pathways: a symptom feature encoding branch and a contact relationship encoding branch. The symptom feature encoding branch receives the symptom matching degree vector as input and performs a nonlinear transformation through three fully connected layers. Each layer has 128, 64, and 32 neurons, respectively, and uses the ReLU activation function to introduce nonlinearity. The weight matrix of the fully connected layers is obtained through backpropagation training. The contact relationship encoding branch employs a Graph Convolutional Network (GCN) to aggregate features from the spatiotemporal contact relationship graph. Graph convolution operations update node feature representations by aggregating node neighborhood information. Specifically, two or three graph convolutional layers are used, with the kernel size of each layer adaptively determined based on the average degree of the graph. Graph convolution extracts the neighborhood structure features and propagation chain patterns of nodes.
[0025] The multi-layer feature fusion module receives feature vectors from two branches and performs weighted fusion using an attention mechanism. The attention mechanism calculates the importance weights of the two branch features. The calculation process is as follows: first, the two feature vectors are linearly transformed using a learnable weight matrix; then, they are normalized using a softmax function to obtain the attention weights. The attention weights are dynamically adjusted based on the feature distribution of the current input data. When symptom features are prominent, the symptom branch has a higher weight; when contact relationships are complex, the contact branch has a higher weight. The fusion method uses weighted concatenation, which involves weighting the feature vectors of the two branches according to the attention weights and concatenating them into a longer vector. The fused feature representation has a dimension of 128 or 256, comprehensively including symptom information and propagation chain information.
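The attention-weighted concatenation can be illustrated with a small pure-Python sketch. It assumes each branch is scored by a dot product with its own learnable vector (`w_symptom`, `w_contact`), which is one simple reading of "linearly transformed using a learnable weight matrix"; the text does not specify the exact form, so treat the scoring function and all names here as hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(symptom_feat, contact_feat, w_symptom, w_contact):
    """Weight each branch by its attention score, then concatenate."""
    # Assumption: one scalar score per branch via a learnable scoring vector.
    s1 = sum(w * x for w, x in zip(w_symptom, symptom_feat))
    s2 = sum(w * x for w, x in zip(w_contact, contact_feat))
    a1, a2 = softmax([s1, s2])
    fused = [a1 * x for x in symptom_feat] + [a2 * x for x in contact_feat]
    return fused, (a1, a2)


# Zero scoring vectors give equal scores, so both branches get weight 0.5.
fused, weights = attention_fuse([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0])
```

The fused vector has length equal to the sum of the two branch dimensions, which is consistent with the "longer vector" described above.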
[0026] The risk assessment output layer comprises two parallel fully connected sublayers that output the infection probability value and the transmission risk level, respectively. The infection probability sublayer contains one fully connected layer and a sigmoid activation function, mapping the fused features to a probability range of 0 to 1. The output value represents the likelihood that the subject is infected: a probability value higher than 0.7 indicates high risk, between 0.3 and 0.7 indicates medium risk, and below 0.3 indicates low risk. The transmission risk level sublayer contains one fully connected layer and a softmax activation function, outputting a probability distribution over the low, medium, and high levels. The level with the highest probability value is selected as the final level identifier. The infection probability value and the transmission risk level together constitute the output risk assessment result, providing a quantitative basis for subsequent risk classification and protective equipment distribution decisions.
[0027] This invention achieves accurate assessment of infectious disease risk by fusing symptom features and contact relationship features through deep learning networks, solving the problem of insufficient assessment accuracy caused by traditional methods relying on only a single data source.
[0028] In one optional implementation, the steps of determining the risk classification identifier and transmission intensity index of the subject to be tested based on the risk assessment results and regional situation data include: The infection probability value from the risk assessment results is compared with a preset set of risk thresholds to determine the preliminary risk level identifier of the subject. Spatiotemporal distribution analysis is performed on the regional situation data to extract the number of confirmed cases and the population density parameter at the location of the subject, and a regional transmission trend feature vector is constructed. The preliminary risk level identifier, the transmission risk level in the risk assessment result, and the regional transmission trend feature vector are weighted and fused, with the number of confirmed cases used as the regional risk weight coefficient to dynamically correct the preliminary risk level identifier, thereby generating the risk classification identifier of the subject. Based on the risk classification identifier and the regional transmission trend feature vector, the virus transmission rate and infection spread radius at the location of the subject are calculated, with the product of the number of confirmed cases and the population density parameter used as a transmission acceleration factor in the calculation of the infection spread radius, to generate the transmission intensity index.
[0029] The flow for determining the risk classification identifier and transmission intensity index of the target object is explained below with reference to Figure 2. After the risk assessment results of the target object are obtained, the infection probability value is extracted. This value is output by the infection probability sublayer of the aforementioned risk assessment network and is a floating-point number between 0 and 1, representing the likelihood that the target object is infected. The preset risk threshold set contains two boundary values: a low-risk threshold of 0.3 and a high-risk threshold of 0.7. The infection probability value of the target object is compared stepwise with these two thresholds. If the infection probability value is less than 0.3, it is marked as low risk; between 0.3 and 0.7, it is marked as medium risk; and above 0.7, it is marked as high risk. This marking result serves as the preliminary risk level identifier, providing an initial basis for subsequent risk classification.
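The threshold comparison described above amounts to a two-boundary lookup. A minimal Python sketch follows; the constant and function names are illustrative, and placing the exact boundary values 0.3 and 0.7 in the medium band is an assumption taken from the phrase "between 0.3 and 0.7".

```python
LOW_RISK_THRESHOLD = 0.3
HIGH_RISK_THRESHOLD = 0.7


def preliminary_risk_level(infection_probability):
    """Map an infection probability in [0, 1] to 'low', 'medium', or 'high'."""
    if not 0.0 <= infection_probability <= 1.0:
        raise ValueError("infection probability must lie in [0, 1]")
    if infection_probability < LOW_RISK_THRESHOLD:
        return "low"
    if infection_probability > HIGH_RISK_THRESHOLD:
        return "high"
    return "medium"  # assumption: 0.3 and 0.7 themselves count as medium


levels = [preliminary_risk_level(p) for p in (0.1, 0.5, 0.9)]
```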
[0030] For the geographical coordinates of the person being tested, the number of confirmed cases within a 3-kilometer radius of that location is retrieved from the regional situation database. This database stores dynamic epidemic information for each geographical region, with data sources including confirmed cases reported by medical institutions, epidemic reports issued by disease control centers, and data collected by real-time monitoring systems. The data is updated hourly to ensure timeliness and includes case distribution information, infection type distribution, and time-series trends for each sampling point. Simultaneously, the population density parameter for that location is extracted. This parameter is derived by statistically analyzing the number of mobile terminals using mobile signaling data and real-time data collection of pedestrian flow from public place monitoring devices, expressed in people per square kilometer, reflecting the population density of a specific area. The number of confirmed cases, the population density parameter, and the case growth rate over the past 72 hours are combined to form a 3-dimensional regional transmission trend feature vector: the first dimension is the number of confirmed cases, the second is the population density, and the third is the case growth rate.
[0031] The preliminary risk level identifiers are mapped to numeric values to facilitate subsequent calculations: low risk is assigned a value of 1, medium risk 2, and high risk 3. A transmission risk level parameter is extracted from the risk assessment results. This parameter, output by the transmission risk level sublayer of the aforementioned risk assessment network, reflects how dangerous the subject's position in the transmission chain is, as assessed from the contact history data. It also uses three levels, low, medium, and high, assigned values of 1, 2, and 3 respectively, with higher levels indicating a more dangerous position in the transmission chain. The number of confirmed cases is normalized and used as the regional risk weight coefficient. The normalized value is the current number of confirmed cases in the region divided by the region's historical maximum number of cases. It ranges from 0 to 1; the more confirmed cases, the larger the weight coefficient.
[0032] The final risk classification identifier is calculated using a weighted fusion method: the initial risk level identifier is multiplied by a weight of 0.4, the transmission risk level by a weight of 0.3, and the first dimension of the regional transmission trend feature vector (i.e., the normalized value of confirmed cases) by a weight of 0.3. These three factors are summed to obtain a basic fusion score. This basic fusion score is then dynamically adjusted by multiplying it by the regional risk weight coefficient, quantifying the impact of regional epidemic severity on individual risk. If the number of confirmed cases in a region exceeds the warning threshold of 50, indicating a high-risk state, an additional correction value of 0.5 is added to the adjusted score to raise the risk level. The calculation results are then reclassified into three risk levels based on thresholds of 1.5 and 2.5: less than 1.5 is low risk, 1.5 to 2.5 is medium risk, and greater than 2.5 is high risk, generating risk classification identifiers for the subjects to be tested.
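The weighted fusion and reclassification can be traced with a short Python sketch that follows the paragraph literally. Function and parameter names are illustrative; `historical_max_cases` stands for the region's historical maximum used for normalization, and treating a score of exactly 1.5 or 2.5 as medium is an assumption.

```python
def risk_classification(prelim_level, transmission_level, confirmed_cases,
                        historical_max_cases, warning_threshold=50):
    """Return (score, label) for levels coded 1=low, 2=medium, 3=high."""
    # Regional risk weight coefficient: normalized confirmed-case count in [0, 1].
    if historical_max_cases > 0:
        regional_weight = min(confirmed_cases / historical_max_cases, 1.0)
    else:
        regional_weight = 0.0
    base = 0.4 * prelim_level + 0.3 * transmission_level + 0.3 * regional_weight
    score = base * regional_weight          # dynamic regional adjustment
    if confirmed_cases > warning_threshold:
        score += 0.5                        # additional correction above the warning level
    if score < 1.5:
        label = "low"
    elif score <= 2.5:                      # assumption: boundaries count as medium
        label = "medium"
    else:
        label = "high"
    return score, label


score_a, label_a = risk_classification(3, 3, 100, 100)   # severe region, high individual risk
score_b, label_b = risk_classification(1, 1, 10, 100)    # quiet region, low individual risk
score_c, label_c = risk_classification(3, 2, 60, 100)    # intermediate case
```

Read literally, the basic fusion score is at most 2.4 and the regional weight is at most 1, so under these weights the high band is only reachable once the 0.5 warning correction applies.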
[0033] The transmission rate reflects the speed at which the virus spreads in a specific area and is estimated using an exponential growth method. The case growth rate data is extracted from the regional transmission trend feature vector, representing the relative increase in the number of cases over the past 72 hours. This data is multiplied by the population density parameter to obtain the basic transmission coefficient, reflecting the accelerating effect of population density on the virus transmission speed. The basic transmission coefficient is then multiplied by the risk weight corresponding to the risk classification identifier: 1.5 for high risk, 1.0 for medium risk, and 0.6 for low risk. The higher the risk level, the greater the weight, resulting in the virus transmission rate for that area, expressed in terms of the number of infections per day.
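A minimal sketch of the transmission rate calculation described above, with illustrative names. The text does not spell out how the product of a relative growth rate and a density in people per square kilometer becomes infections per day, so the example numbers below are arbitrary and only show the arithmetic.

```python
# Risk weights per risk classification identifier, as given in the text.
RISK_WEIGHT = {"high": 1.5, "medium": 1.0, "low": 0.6}


def transmission_rate(case_growth_rate, population_density, risk_label):
    """Basic transmission coefficient (growth rate x density) scaled by risk weight."""
    basic_coefficient = case_growth_rate * population_density
    return basic_coefficient * RISK_WEIGHT[risk_label]


# Hypothetical inputs: 2% growth over 72 hours, 1000 people per square kilometer.
rate_high = transmission_rate(0.02, 1000.0, "high")
rate_low = transmission_rate(0.02, 1000.0, "low")
```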
[0034] When calculating the infection spread radius, the number of confirmed cases is multiplied by the population density parameter to obtain the transmission acceleration factor. The number of confirmed cases reflects the number of sources of infection, and the population density reflects the density of the susceptible population; their product quantifies the acceleration effect on transmission. A baseline spread radius of 500 meters is set, based on the typical transmission distance of respiratory infectious diseases. The transmission acceleration factor is normalized by dividing it by a preset standardization constant of 1000, and the normalized value is used as the amplification factor of the baseline spread radius. When the transmission acceleration factor exceeds the threshold of 1000, indicating extremely severe transmission conditions, the spread radius instead grows according to a logarithmic function to avoid excessively large values that could distort the calculation: Spread radius = 500 × log10(Transmission acceleration factor / 100). The calculated spread radius is divided by the preset maximum spread radius of 5000 meters, giving a normalized spread radius between 0 and 1. The virus transmission rate is divided by the preset maximum transmission rate of 100 people per day, giving a normalized transmission rate between 0 and 1. The two normalized values are then weighted and summed: Transmission intensity index = Normalized spread radius × 0.6 + Normalized transmission rate × 0.4. The result is mapped to a five-level classification from 1 to 5, where 1 corresponds to extremely low transmission intensity and 5 to extremely high transmission intensity. The larger the value, the higher the transmission intensity of the area where the object to be detected is located, and the more urgently protective resources need to be allocated.
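The spread radius and intensity calculation described above can be sketched as below. The constants (500 m, 1000, 5000 m, 100 per day, 0.6/0.4) are taken from the text; the function names, the clipping to the range 0 to 1, and the equal-width binning into five levels are assumptions, since the text does not specify how the five-level mapping is performed.

```python
import math


def spread_radius_m(confirmed_cases, population_density):
    """Spread radius in meters from the transmission acceleration factor."""
    factor = confirmed_cases * population_density
    if factor > 1000:
        # Logarithmic growth above the threshold, per the stated formula.
        return 500 * math.log10(factor / 100)
    # Below the threshold: baseline radius scaled by factor / 1000.
    return 500 * (factor / 1000)


def intensity_index(radius_m, rate_per_day):
    """Weighted sum of normalized radius and normalized transmission rate."""
    norm_radius = min(max(radius_m / 5000, 0.0), 1.0)   # clipping is an assumption
    norm_rate = min(max(rate_per_day / 100, 0.0), 1.0)  # clipping is an assumption
    return 0.6 * norm_radius + 0.4 * norm_rate


def intensity_level(index):
    """Map an index in [0, 1] to levels 1..5 (equal-width bins, assumed)."""
    return min(5, int(index * 5) + 1)
```

Both branches of `spread_radius_m` give 500 m at a factor of exactly 1000, so the radius is continuous at the threshold.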
[0035] This invention deeply integrates individual risk assessment results with the regional epidemic situation, enabling precise determination of risk classification markers and transmission intensity indicators. The risk classification markers comprehensively consider individual infection probability, contact history risk, and the regional epidemic situation, avoiding the one-sidedness of single-dimensional assessments.
[0036] In one optional implementation, the step of dynamically calculating the distribution priority of each target object to be tested and generating a distribution instruction for protective equipment based on the risk classification identifier, transmission intensity index, and protective equipment inventory status includes: Obtain the current inventory status of protective equipment, extract the remaining quantity of each type of protective equipment, and construct a set of inventory constraints; The risk classification identifier is mapped to a basic priority score, and the transmission intensity index is converted into an urgency coefficient. The two are then multiplied to generate a preliminary priority score. The infection probability value from the risk assessment results is combined with the preliminary priority score to generate a comprehensive priority assessment value. Based on the comprehensive priority evaluation value and the set of inventory constraints, a multi-objective optimization function containing resource utilization and distribution fairness objectives is constructed using a multi-objective optimization algorithm. The distribution priority of each object to be detected is iteratively solved to generate a distribution priority ranking sequence, wherein the remaining quantity is used for constraint calculation of the resource utilization objective. Based on the distribution priority sorting sequence and the inventory constraint set, the distribution type and quantity of each object to be tested are determined, and a distribution instruction for protective equipment is generated.
[0037] For example, by establishing a real-time data connection with the inventory management database, the system reads the inventory status information of various protective supplies such as masks, protective suits, goggles, and disinfectants. During the extraction process, three key parameters are calculated for each type of protective supply: current inventory value, reserved quantity, and safety stock threshold. The current inventory value represents the real-time available quantity of this type of supply, the reserved quantity represents the locked quantity that has been allocated but not actually distributed, and the safety stock threshold represents the minimum inventory level required to maintain emergency reserves. When constructing the set of inventory constraints, the allocable quantity of each type of supply is set as the upper limit of the constraint. The formula for calculating the allocable quantity is the current inventory value minus the reserved quantity and then minus the safety stock threshold. When the remaining quantity of a certain type of supply is less than 1.2 times the safety stock threshold, it indicates that this type of supply is close to a shortage. The system sets an emergency replenishment flag for this type of supply, triggering the procurement process, and adds a retention coefficient of 0.15 to the constraints, that is, 15% is reserved as an emergency reserve based on the allocable quantity to ensure that protective resources are still available in case of emergencies.
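A minimal sketch of the inventory constraint construction in this example follows. The field names are hypothetical, and two points are assumptions because the text leaves them open: the "remaining quantity" compared against 1.2 times the safety threshold is taken to be the current inventory value, and the allocable quantity is floored at zero.

```python
def inventory_constraint(current, reserved, safety_threshold):
    """Build the constraint entry for one type of protective supply."""
    # Allocable quantity = current inventory - reserved - safety stock.
    allocable = max(0, current - reserved - safety_threshold)  # floor is assumed
    # Near-shortage test: remaining quantity below 1.2 x safety threshold.
    # Treating "remaining" as the current inventory value is an assumption.
    near_shortage = current < 1.2 * safety_threshold
    # Retention coefficient of 0.15: hold back 15% of the allocable quantity.
    upper_limit = allocable * (1 - 0.15) if near_shortage else allocable
    return {
        "upper_limit": upper_limit,
        "emergency_replenishment": near_shortage,
    }
```

For instance, with 1000 units in stock, 100 reserved and a safety threshold of 200, the upper limit is 700 and no replenishment flag is set.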
[0038] The risk classification system includes three levels: low, medium, and high risk, consistent with the aforementioned risk classification process. When mapping to basic priority scores, high risk corresponds to a score of 80-100, medium risk to 40-79, and low risk to 20-39; the higher the risk level, the higher the basic priority score. The transmission intensity index reflects the activity level of epidemic transmission in the area where the tested individual is located. This index is calculated using the aforementioned steps and is a weighted sum of the normalized values of the diffusion radius and transmission rate, then mapped to a numerical range of 0.5 to 2.0. The mapping method involves multiplying the calculation result from 0 to 1 by 1.5 and then adding 0.5; the higher the transmission intensity, the higher the value. The transmission intensity index is used as an urgency coefficient and multiplied by the basic priority score to generate a preliminary priority score. This score comprehensively reflects the individual's risk level and the severity of the epidemic in their area.
[0039] During the fusion calculation, the infection probability value output from the risk assessment results is extracted. This probability value is the continuous value between 0 and 1 obtained after the infection probability sublayer output of the aforementioned risk assessment network and passing through the sigmoid activation function, consistent with the aforementioned risk assessment process. The preliminary priority score is normalized by dividing it by the theoretical maximum value of 200. The theoretical maximum value of 200 is the product of the upper limit of the basic priority score of 100 and the upper limit of the transmission intensity index of 2.0. The normalized preliminary priority score ranges from 0 to 1, ensuring that the infection probability value and the normalized preliminary priority score are on the same numerical scale for easy fusion calculation. The infection probability value and the normalized preliminary priority score are calculated using a weighted fusion method, with weight coefficients set to 0.3 and 0.7, respectively. The calculation formula is: the comprehensive priority assessment value equals the infection probability value multiplied by 0.3 plus the normalized preliminary priority score multiplied by 0.7, resulting in a comprehensive priority assessment value. This value ranges from 0 to 1, with a higher value indicating a higher priority for the protective resource needs of the object to be tested.
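The scoring chain of the two preceding paragraphs (urgency coefficient, preliminary score, normalization by 200, weighted fusion) can be written out as a short sketch. The function and parameter names are illustrative assumptions; the constants are those stated in the text.

```python
def urgency_coefficient(intensity):
    """Map a transmission intensity result in [0, 1] to the range [0.5, 2.0]."""
    return 0.5 + 1.5 * intensity


def comprehensive_priority(base_score, intensity, infection_probability):
    """Fuse the preliminary priority score with the infection probability.

    base_score: basic priority score (20-100) mapped from the risk level.
    intensity: transmission intensity result in [0, 1] before mapping.
    infection_probability: sigmoid output of the risk assessment network.
    """
    preliminary = base_score * urgency_coefficient(intensity)
    normalized = preliminary / 200  # theoretical maximum = 100 x 2.0
    return 0.3 * infection_probability + 0.7 * normalized
```

A high-risk object with a base score of 80, an intensity of 0.5 and an infection probability of 0.6 obtains 0.3 × 0.6 + 0.7 × (80 × 1.25 / 200) = 0.53.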
[0040] The multi-objective optimization algorithm optimizes the allocation scheme of protective equipment (PE) based on a comprehensive priority evaluation value and a set of inventory constraints. It employs a Pareto front-based optimization strategy to simultaneously optimize two objectives: resource utilization and distribution fairness. The resource utilization objective is defined as the ratio of the sum of all allocated PE quantities to the total available inventory, where the total available inventory is the sum of the allocatable quantities of each type of PE. The objective is to maximize this ratio while avoiding excessive consumption of a single type of PE that leads to stockpiling of other types. The distribution fairness objective is measured by calculating the variance of the resource value obtained by the tested objects within each risk level. The resource value is obtained by multiplying the standard price of each type of PE by the allocated quantity and then summing the results. The objective is to minimize the variance to ensure a relatively balanced resource allocation for objects at the same risk level, avoiding situations where some objects receive excessive resources while others receive insufficient resources.
[0041] After constructing an optimization function with two objectives, the following constraints are set: First, the total allocation of each type of protective equipment shall not exceed the allocatable quantity set in the inventory constraint set to ensure inventory safety; Second, the quantity of supplies obtained by a single object must meet the minimum configuration standard for basic protection needs, with high-risk objects having at least 3 N95 masks and 1 set of protective clothing, medium-risk objects having at least 5 medical surgical masks, and low-risk objects having at least 3 ordinary medical masks; Third, objects with higher comprehensive priority evaluation values shall be given priority in resource allocation, and under the premise of satisfying the first two constraints, protective equipment shall be allocated in descending order of comprehensive priority evaluation values.
[0042] The iterative solution process employs the non-dominated sorting genetic algorithm NSGA-II for 200 generations of optimization. Each generation generates 100 candidate solutions, each representing a complete protective equipment allocation scheme. Population evolution is performed using a crossover probability of 0.8 and a mutation probability of 0.1. The crossover operation exchanges the allocation schemes of some objects between two parent solutions, while the mutation operation randomly adjusts the allocation quantity of certain objects. Fitness evaluation considers both resource utilization and distribution fairness, calculating the performance of each candidate solution on both objectives. Non-dominated sorting categorizes candidate solutions into different levels, selecting a Pareto optimal solution set. Solutions in this set achieve relatively good performance on both objectives without significant superiority or inferiority relationships. The solution with resource utilization exceeding 85% and the smallest fairness variance is selected from the optimal solution set as the final scheme, generating a distribution priority ranking sequence. This sequence arranges all identifiers of the objects to be tested according to their comprehensive priority evaluation value from high to low. If the comprehensive priority evaluation values are the same, they are ranked according to their infection probability value from high to low, ensuring that the most urgent objects receive protective resources first.
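The final selection step can be illustrated in isolation. The sketch below is not NSGA-II itself; it only shows a plain non-dominated filter over already-evaluated candidates followed by the selection rule from the text (utilization above 85%, smallest fairness variance). Representing each candidate as a `(utilization, variance)` pair is an assumption made for brevity.

```python
def dominates(a, b):
    """True if candidate a dominates b.

    Candidates are (utilization, variance) pairs: higher utilization and
    lower fairness variance are better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


def select_final(candidates, min_utilization=0.85):
    """Pick the front member above the utilization bar with least variance."""
    eligible = [c for c in pareto_front(candidates) if c[0] > min_utilization]
    return min(eligible, key=lambda c: c[1]) if eligible else None
```

In a full implementation the candidates would be complete allocation schemes evolved over the 200 generations; here they are reduced to their two objective values.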
[0043] When determining the distribution type and quantity based on the distribution priority sequence, the system sequentially traverses the object identifier numbers in the sequence. High-risk individuals are provided with a complete protective kit, including 5 N95 masks, 2 sets of disposable protective clothing, 1 pair of goggles, and 500 ml of hand sanitizer. This standard configuration meets the protection needs of high-risk individuals for 7 days in high-exposure environments. Medium-risk individuals are provided with a standard protective kit, including 10 surgical masks, 20 pairs of disposable gloves, 2 packs of disinfectant wipes, and 200 ml of hand sanitizer. Low-risk individuals are provided with a basic protective kit, including 5 ordinary medical masks and 1 bottle of portable disinfectant spray.
[0044] During configuration, inventory constraints are checked in real time. Before allocating supplies to each object, the system checks whether the remaining allocable quantity of each type of supply is sufficient. When the inventory of a certain type of supply is insufficient to meet the standard configuration of the current object, adjustments are made according to the principle of prioritizing core protective equipment. The priority order of core protective equipment is masks, disinfectant, gloves, goggles, and protective clothing. For example, if there is not enough protective clothing for a high-risk object, it can be compensated for by increasing the number of N95 masks to 8 and the amount of disinfectant to 800 ml, and the substitution is noted in the distribution instruction. The generated distribution instruction includes the object identification number, the distribution type code (1 for the complete protective kit, 2 for the standard protective kit, 3 for the basic protective kit), the specific name and quantity of each type of supply, the distribution time window (completion within 24 hours is recommended), and the geographical coordinates of the designated distribution point. It is output in a structured data format, JSON or XML, for the execution module to call. According to the distribution instruction, the execution module either organizes logistics and delivery or notifies the object to collect the protective equipment at the designated location.
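One possible shape of such a distribution instruction in JSON is sketched below. All field names and the helper `build_instruction` are hypothetical; the kit contents, the type codes, the 24-hour window and the protective clothing substitution are taken from the text, with the 800 ml of disinfectant assumed to refer to the hand sanitizer in the kit.

```python
import json

# Standard kit contents per distribution type code (from the text).
KITS = {
    1: {"n95_masks": 5, "protective_clothing_sets": 2,
        "goggles_pairs": 1, "hand_sanitizer_ml": 500},
    2: {"surgical_masks": 10, "disposable_gloves_pairs": 20,
        "disinfectant_wipes_packs": 2, "hand_sanitizer_ml": 200},
    3: {"ordinary_medical_masks": 5, "disinfectant_spray_bottles": 1},
}


def build_instruction(object_id, kit_code, latitude, longitude,
                      clothing_in_stock=True):
    """Return a JSON distribution instruction for one object."""
    items = dict(KITS[kit_code])
    note = None
    if kit_code == 1 and not clothing_in_stock:
        # Substitution described in the text for high-risk objects.
        items.pop("protective_clothing_sets")
        items["n95_masks"] = 8
        items["hand_sanitizer_ml"] = 800
        note = "protective clothing replaced by extra N95 masks and disinfectant"
    instruction = {
        "object_id": object_id,
        "distribution_type_code": kit_code,
        "items": items,
        "distribution_window_hours": 24,
        "distribution_point": {"lat": latitude, "lon": longitude},
        "substitution_note": note,
    }
    return json.dumps(instruction)
```

Calling `build_instruction("obj-001", 1, 39.9, 116.4, clothing_in_stock=False)` yields an instruction whose item list carries 8 N95 masks and no protective clothing, with the substitution recorded in the note field.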
[0045] This invention enables precise allocation of protective equipment based on comprehensive priority assessment values and inventory constraints. The multi-objective optimization algorithm strikes a balance between resource utilization and distribution fairness, avoiding resource waste and unfair allocation.
[0046] In one alternative implementation, the step of generating the issuance priority order sequence includes: The current period's epidemic transmission rate, confirmed case growth rate, and inventory depletion rate are extracted. An adaptive weight adjustment mechanism is used to calculate the target weight coefficients for resource utilization efficiency and distribution fairness. The epidemic transmission rate and the confirmed case growth rate are used as positive correlation factors to increase the target weight coefficient for resource utilization efficiency. All objects to be tested are divided into high-risk, medium-risk and low-risk groups according to the risk classification identifier, and the resource quota ratio of each group is calculated based on the resource utilization target weight coefficient; Pareto optimization is performed on the high-risk group to generate a high-risk group distribution sequence. The remaining resources are then allocated to the medium-risk group to generate a medium-risk group distribution sequence. Finally, the remaining resources are allocated to the low-risk group to generate a low-risk group distribution sequence. The distribution fairness objective weight coefficient of the low-risk group is higher than that of the high-risk group. The distribution sequences for high-risk groups, medium-risk groups, and low-risk groups are merged according to group priority to generate a distribution priority sorting sequence.
[0047] For example, in the dynamic optimization process of generating the priority ranking sequence for distribution, three core indicators are first obtained: the epidemic transmission rate, the confirmed case growth rate, and the inventory depletion rate for the current period. The epidemic transmission rate is calculated by the ratio of the number of new cases to the number of existing cases per unit time, expressed as a daily percentage. For example, if there are 100 new cases and 1000 existing cases on a given day, the transmission rate is 10%. The confirmed case growth rate is quantified using the moving average growth trend over the past 7 days. It is calculated by dividing the arithmetic mean of the daily new cases over the past 7 days by the number of existing cases 7 days ago, reflecting the medium-term epidemic development trend. The inventory depletion rate is determined by the ratio of real-time outbound records of protective equipment to the remaining inventory. The formula is the total outbound quantity of various types of protective equipment in the past 24 hours divided by the current total remaining inventory of various types of protective equipment, expressed as a daily percentage. This indicator reflects the degree of scarcity of protective resources.
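The three indicators defined above reduce to simple ratios. The following sketch uses illustrative function names and returns fractions rather than percentages (an assumption about representation only).

```python
def epidemic_transmission_rate(new_cases, existing_cases):
    """New cases divided by existing cases for the unit time (one day)."""
    return new_cases / existing_cases


def confirmed_case_growth_rate(daily_new_last_7_days, existing_7_days_ago):
    """Mean of the last 7 daily new-case counts over the cases 7 days ago."""
    mean_new = sum(daily_new_last_7_days) / len(daily_new_last_7_days)
    return mean_new / existing_7_days_ago


def inventory_depletion_rate(outbound_last_24h, remaining_inventory):
    """Total outbound quantity in 24 hours over total remaining inventory."""
    return outbound_last_24h / remaining_inventory
```

With 100 new cases and 1000 existing cases, `epidemic_transmission_rate` returns 0.1, matching the 10% example in the text.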
[0048] After receiving the above three indicators, the adaptive weight adjustment mechanism inputs the epidemic transmission rate and the confirmed case growth rate as positive correlation factors into the weight calculation formula. The formula is based on an exponential function, and the resource utilization rate target weight coefficient is calculated as: Resource utilization rate weight = 0.5 + 0.3 × [1 - exp(-epidemic transmission rate × 5)] + 0.2 × [1 - exp(-confirmed case growth rate × 2)]. This formula ensures that when the epidemic transmission rate or the confirmed case growth rate increases, the resource utilization rate weight increases accordingly. When the epidemic transmission rate exceeds the preset threshold of 0.15 and the confirmed case growth rate is higher than 20%, the resource utilization rate target weight coefficient calculated according to the above formula rises above 0.7 (approximately 0.72 at exactly these two values). At this time, the resource utilization rate weight dominates in the multi-objective optimization algorithm, prioritizing the efficiency of material supply to high-risk groups and ensuring that protective resources are quickly delivered to those who need them most.
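The weight formula can be evaluated directly, as in the sketch below. The function name is illustrative, and both rates are assumed to be passed as fractions (for example 0.15 rather than 15).

```python
import math


def utilization_weight(transmission_rate, growth_rate):
    """Resource utilization rate target weight coefficient.

    Implements: 0.5 + 0.3 * (1 - exp(-5 * transmission_rate))
                    + 0.2 * (1 - exp(-2 * growth_rate))
    """
    transmission_term = 0.3 * (1 - math.exp(-5 * transmission_rate))
    growth_term = 0.2 * (1 - math.exp(-2 * growth_rate))
    return 0.5 + transmission_term + growth_term
```

For non-negative inputs the formula starts at 0.5 when both rates are zero and approaches 1.0 as both rates grow; with a transmission rate of 0.15 and a growth rate of 0.20 it evaluates to about 0.724.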
[0049] The inventory depletion rate, as a measure of resource scarcity, is used in weight adjustments. When the inventory depletion rate exceeds 30% per day, it indicates that protective equipment is being consumed too quickly, leading to inventory shortages. In this case, the resource utilization rate weight is increased by an additional 0.1 to further enhance efficient resource utilization. Conversely, when the epidemic is in a stable phase (i.e., the transmission rate is below 0.05 and the confirmed case growth rate is below 5%), and the inventory depletion rate is below 15% per day, indicating sufficient inventory, the weight coefficient for the distribution fairness objective gradually increases. The calculation formula is: Distribution Fairness Weight = 1 - Resource Utilization Rate Weight, ensuring the sum of the two weights equals 1. When the resource utilization rate weight drops below 0.4, the distribution fairness weight correspondingly rises to above 0.6. At this point, the optimization algorithm prioritizes ensuring that all individuals at each risk level receive basic protective measures, avoiding excessive resource concentration on high-risk groups.
[0050] The dynamic adjustment of weight coefficients employs an exponential smoothing algorithm to achieve a continuous transition. The smoothing formula is: Current weight = α × Calculated weight + (1-α) × Weight at the previous moment, where the smoothing coefficient α is set to 0.3, controlling the speed of weight updates. Through exponential smoothing, the weight adjustment presents a gentle transition curve, avoiding sudden weight changes caused by short-term fluctuations in epidemic data, and ensuring the stability and predictability of the resource allocation strategy. The system updates epidemic indicator data and recalculates weight coefficients hourly, achieving near real-time adaptive adjustment of weights.
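The depletion adjustment and the exponential smoothing step can be combined in a short sketch. The function names are illustrative, and capping the adjusted weight at 1.0 is an assumption the text does not state.

```python
def adjust_for_depletion(weight, depletion_rate):
    """Add 0.1 to the utilization weight when depletion exceeds 30% per day."""
    if depletion_rate > 0.30:
        return min(1.0, weight + 0.1)  # cap at 1.0 is an assumption
    return weight


def smooth_weight(calculated, previous, alpha=0.3):
    """Current weight = alpha * calculated + (1 - alpha) * previous."""
    return alpha * calculated + (1 - alpha) * previous


def fairness_weight(utilization_weight):
    """Distribution fairness weight, so that the two weights sum to 1."""
    return 1 - utilization_weight
```

If the previous weight was 0.5 and the newly calculated weight is 0.8, the smoothed weight for this hour is 0.3 × 0.8 + 0.7 × 0.5 = 0.59, so a sudden jump in the indicators moves the weight only part of the way.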
[0051] All individuals to be tested are divided into three risk groups according to the aforementioned risk classification identifiers, consistent with the aforementioned risk classification process. The high-risk group includes individuals with a high-risk classification identifier, i.e., those with obvious core symptoms and a history of close contact with confirmed cases, or those living in areas with severe epidemic situations. The medium-risk group includes individuals with a medium-risk classification identifier, covering those with some matching symptoms and a history of indirect contact, or those living in areas with moderate transmission intensity, mild symptoms, or unclear contact history. The low-risk group includes individuals with a low-risk classification identifier, i.e., those with mild symptoms or no clear contact history, and those living in areas with stable epidemic situations.
[0052] Based on the current resource utilization target weight coefficient, the system dynamically allocates resource quota ratios to each risk group. A baseline ratio is set at 50% for the high-risk group, 30% for the medium-risk group, and 20% for the low-risk group. This baseline ratio applies when the resource utilization target weight is 0.5. When the resource utilization target weight coefficient increases, the quota ratio for the high-risk group increases accordingly, calculated as: High-risk group quota ratio = 50% + 30% × (Resource utilization weight - 0.5). The quota ratios for other risk groups are reduced proportionally to maintain a total of 100%. For example, when the resource utilization target weight coefficient reaches 0.8, the high-risk group quota ratio automatically increases to 59%, the medium-risk group quota ratio decreases to 25%, and the low-risk group quota ratio decreases to 16%. Conversely, when the resource utilization weight decreases to 0.3, the high-risk group quota ratio decreases to 44%, while the low-risk group quota ratio increases to 22%, improving fairness. This dynamic adjustment of quota ratios ensures priority protection for high-risk groups during severe periods of the epidemic, while balancing the protection needs of all levels of individuals during periods of stability.
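The quota calculation can be reproduced as follows. The function name is illustrative; splitting the non-high-risk remainder in the baseline 30:20 proportion is this sketch's reading of "reduced proportionally".

```python
def quota_ratios(utilization_weight):
    """Return (high, medium, low) resource quota ratios summing to 1."""
    # High-risk quota = 50% + 30% x (weight - 0.5), per the text.
    high = 0.50 + 0.30 * (utilization_weight - 0.5)
    remainder = 1 - high
    # Medium and low keep their baseline 30:20 proportion (assumed reading).
    medium = remainder * 0.6
    low = remainder * 0.4
    return high, medium, low
```

At a weight of 0.8 this gives 59%, 24.6% and 16.4%, which round to the 59%, 25% and 16% quoted in the text; at a weight of 0.3 it gives 44%, 33.6% and 22.4%.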
[0053] When performing Pareto optimization on the high-risk group, the transmission intensity index and the infection probability value from the risk assessment results are used as the core variables of the optimization objective function. The optimization objective is defined as maximizing the weighted average of the transmission intensity index and the weighted average of the infection probability value under the constraint of limited resource quotas for the high-risk group. The weighted average is calculated using the recipients of protective equipment as samples, with the weights representing the value of the allocated resources. The optimization process uses the NSGA-II non-dominated sorting genetic algorithm to construct a non-dominated solution set. During the iteration process, multiple candidate allocation schemes are generated, each representing a resource allocation combination for different recipients within the high-risk group. Through multiple generations of iteration, the solution that simultaneously satisfies high transmission intensity coverage (i.e., prioritizing resource allocation to recipients in high transmission intensity areas) and high infection probability guarantee (i.e., prioritizing resource allocation to recipients with high infection probability) is selected as the optimal allocation scheme. The weight coefficient for the fairness objective in the allocation of the high-risk group is set to 0.3, and the weight coefficient for the resource utilization objective is set to 0.7. Priority is given to ensuring that resources are concentrated on the highest-risk recipients, while allowing a certain degree of allocation difference within the high-risk group to guarantee the most urgent recipients. 
After the solution is completed, a high-risk group distribution sequence is generated, which includes the object identification number, the type and quantity of protective equipment allocated, and the objects in the sequence are arranged from high to low according to the comprehensive priority assessment value.
[0054] The remaining resources are allocated to the medium-risk group. The Pareto optimization for the medium-risk group, while ensuring basic risk coverage, increases the weighting coefficient for the fairness of distribution objective to 0.5, and correspondingly adjusts the weighting coefficient for the resource utilization objective to 0.5. The optimization objective, while ensuring coverage of transmission intensity and infection probability, adds a constraint on balanced regional distribution. The constraint is that the total amount of resources received by medium-risk individuals within the same geographic grid should be proportional to the number of medium-risk individuals within that grid, avoiding supply gaps for medium-risk individuals in the same region due to resource concentration in the high-risk group. The optimization algorithm simultaneously considers the geographical proximity of each individual, using spatial clustering to group geographically close individuals, resulting in relatively balanced resource allocation within each group. A minimum guarantee constraint is introduced during the optimization process to ensure that each medium-risk individual receives at least 50% of the resources in the standard protective kit. Based on this, differentiated allocation is performed according to the comprehensive priority evaluation value, generating a relatively balanced distribution sequence for the medium-risk group within the region.
[0055] Resource allocation for the low-risk group adopts a maximum fairness strategy, with the weighting coefficient for fairness in distribution increased to 0.7 and the weighting coefficient for resource utilization decreased to 0.3. The optimization objective focuses on ensuring full coverage of minimum protection needs, employing a minimum variance allocation algorithm to ensure each low-risk individual receives at least a basic quota of protective equipment, consisting of 5 ordinary medical masks and 1 bottle of portable disinfectant spray. The allocation algorithm calculates the variance of the resource value received by all low-risk individuals, iteratively adjusting the allocation quantity for each individual to minimize the variance, achieving a high degree of fairness within the low-risk group. A randomization perturbation factor is introduced during the generation of the low-risk group's distribution sequence. Individuals with similar comprehensive priority assessment values are randomly sorted, with the perturbation range set to individuals whose comprehensive priority assessment value difference is less than 0.05. This avoids the perceived unfairness caused by a fixed sorting pattern and enhances the system's perceived fairness.
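The randomization perturbation can be sketched as a shuffle within runs of near-equal priority. The text only gives the 0.05 window; the grouping rule used here (a run extends while the value stays within 0.05 of the run's highest member) and all names are assumptions.

```python
import random


def perturbed_order(priorities, window=0.05, rng=None):
    """Order object ids by priority, shuffling ids within near-equal runs.

    priorities: dict mapping object id -> comprehensive priority value.
    window: ids whose value is less than this below the first member of
            the current run are treated as similar and shuffled together.
    """
    rng = rng or random.Random()
    ranked = sorted(priorities, key=priorities.get, reverse=True)
    result, run = [], []
    for obj in ranked:
        if run and priorities[run[0]] - priorities[obj] >= window:
            rng.shuffle(run)      # randomize the finished run
            result.extend(run)
            run = []
        run.append(obj)
    rng.shuffle(run)              # randomize the last run
    result.extend(run)
    return result
```

Objects whose values differ by 0.05 or more keep their relative order, so the perturbation never moves a clearly lower-priority object ahead of a clearly higher one.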
[0056] Finally, the distribution sequences of the three risk groups are merged sequentially according to group priority. During merging, the high-risk group sequence is placed at the front to ensure that the highest-risk individuals receive resources first, the medium-risk group sequence is in the middle, and the low-risk group sequence is placed at the end. During the merging process, a priority label is added to the distribution instructions for each individual: P1 for high-risk groups, P2 for medium-risk groups, and P3 for low-risk groups, facilitating identification and processing by the execution module. The merged distribution priority sequence is output in list form, containing the identification number of all individuals to be tested, priority label, details of allocated protective equipment, and suggested distribution time. This directly drives the automated sorting and delivery process of protective equipment. The execution module processes distribution tasks sequentially according to the priority labels, ensuring that protective equipment for high-priority individuals is delivered in the shortest possible time.
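The merge step amounts to concatenating the three group sequences in priority order and tagging each entry. In the sketch below, each entry is assumed to be a dictionary and the key name `priority_label` is hypothetical.

```python
def merge_sequences(high, medium, low):
    """Concatenate group sequences and attach P1/P2/P3 priority labels."""
    merged = []
    for label, sequence in (("P1", high), ("P2", medium), ("P3", low)):
        for entry in sequence:
            # Copy the entry so the group sequences are left untouched.
            merged.append({**entry, "priority_label": label})
    return merged
```

Because each group sequence is already ordered internally, the merged list preserves both the group priority and the within-group ranking.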
[0057] This invention enables dynamic adjustment of resource allocation schemes according to the epidemic situation.
[0058] In one optional implementation, the step of performing time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results to extract prediction bias features includes: The distribution operation is executed according to the distribution instruction, and distribution feedback data is collected; The distribution feedback data is timestamped, and the health status tracking records of each subject after the distribution of protective equipment are extracted, including the time points of symptom changes, test results, and disease progression trajectory. The health status tracking records are time-series aligned with the risk assessment results of the corresponding subjects to construct a prediction-actual control sequence. The risk assessment results include the infection probability value and transmission risk level at the time of distribution. A deviation analysis is performed on the prediction-actual control sequence. When the actual confirmed infection time is earlier than the time window predicted by the risk assessment results, or when no actual infection is detected but the infection probability value is higher than the set infection threshold, the corresponding sample is marked as a deviation sample. The symptom feature combinations, contact relationship patterns, and regional situation parameters corresponding to the deviation samples are extracted to construct a set of deviation feature vectors, and the occurrence frequency and correlation strength of the various deviation features are statistically analyzed to serve as the prediction bias features.
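The marking rule for deviation samples can be expressed as a small predicate. The names are illustrative, and the default infection threshold of 0.5 is purely an assumption, since the text refers only to a "set infection threshold" without giving its value.

```python
def is_deviation_sample(infection_probability, window_start_hours,
                        confirmed_infection_hours, infection_threshold=0.5):
    """Decide whether a prediction-actual pair is a deviation sample.

    window_start_hours: lower bound of the predicted infection time window,
        in hours after distribution.
    confirmed_infection_hours: hours after distribution at which infection
        was confirmed, or None if no infection was detected.
    infection_threshold: assumed value; the text does not specify it.
    """
    if confirmed_infection_hours is not None:
        # Infection confirmed earlier than the predicted window.
        return confirmed_infection_hours < window_start_hours
    # No infection detected although the predicted probability was high.
    return infection_probability > infection_threshold
```

A subject confirmed at hour 48 against a window starting at hour 72 is marked, as is an uninfected subject whose predicted probability was 0.9 under the assumed threshold.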
[0059] For example, the data collection process is initiated immediately after the distribution of protective equipment is completed. The collection terminal adds a Unix timestamp to each distribution feedback data, accurate to the millisecond, recording the time of distribution instruction generation, actual delivery time, and receipt confirmation time.
[0060] The health status of those receiving protective equipment is continuously tracked for 14 days, with body temperature and symptom self-assessment data collected twice daily, at 8:00 AM and 8:00 PM. Body temperature is measured as axillary temperature using an electronic thermometer with an accuracy of 0.1 degrees Celsius. The symptom self-assessment covers eight categories: fever, cough, sore throat, fatigue, difficulty breathing, muscle aches, headache, and diarrhea, each scored on a Likert scale from 0 to 10, where 0 indicates no symptoms and 10 indicates extremely severe symptoms. Symptom self-assessment data is collected via a mobile application, in which participants select their current symptom types and rate their severity.
[0061] The system records key milestones such as the first fever (when body temperature first exceeds 37.3 degrees Celsius), cough duration (the total time from the onset of cough to its disappearance in hours), and the onset of shortness of breath (the first time the subject reports difficulty breathing or shortness of breath). When a subject undergoes laboratory diagnostic testing, the system automatically acquires the testing time, results, and pathogen concentration data. The testing time records two points: sample collection and result issuance. Results are categorized as positive, negative, and indeterminate. A positive result indicates detection of the target pathogen, a negative result indicates no detection, and an indeterminate result indicates an unsuitable sample quality or a sample within the detection sensitivity threshold. Pathogen concentration data is obtained through quantitative detection methods and expressed as a pathogen load parameter in copies per milliliter. Higher values indicate higher pathogen loads and stronger infectivity.
[0062] The disease progression trajectory is recorded in segments, marking the transition points from mild to moderate to severe. A mild case is defined as having only mild respiratory symptoms such as sore throat and mild cough, with normal body temperature or low-grade fever below 38 degrees Celsius, and a symptom self-assessment score of less than 15. A moderate case is defined as having persistent fever above 38 degrees Celsius accompanied by significant cough and fatigue, with a symptom self-assessment score between 15 and 30, but without dyspnea and with normal blood oxygen saturation. A severe case is defined as having dyspnea, blood oxygen saturation below 93%, imaging showing lung lesions covering more than 50% of the lungs, a symptom self-assessment score exceeding 30, or a need for oxygen therapy. The times at which each subject transitions from mild to moderate and from moderate to severe are recorded, together with the changes in the symptom intensity value over each time period. The symptom intensity value is a weighted sum of the individual symptom self-assessment scores, with weights determined by how strongly each symptom indicates disease severity. Fever and dyspnea have the higher weights of 0.25 and 0.3, respectively, while the other symptoms have weights ranging from 0.05 to 0.15.
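The symptom intensity value can be illustrated with a weighted sum. Only the weights for fever (0.25) and dyspnea (0.3) are given in the text; the individual weights chosen below for the other six symptoms are assumptions picked from the stated 0.05 to 0.15 range so that all weights sum to 1.

```python
# Fever and dyspnea weights are from the text; the rest are assumed values
# within the stated 0.05-0.15 range, chosen so that the weights sum to 1.
SYMPTOM_WEIGHTS = {
    "fever": 0.25,
    "difficulty_breathing": 0.30,
    "cough": 0.15,         # assumed
    "fatigue": 0.10,       # assumed
    "sore_throat": 0.05,   # assumed
    "muscle_aches": 0.05,  # assumed
    "headache": 0.05,      # assumed
    "diarrhea": 0.05,      # assumed
}


def symptom_intensity(scores):
    """Weighted sum of 0-10 self-assessment scores; missing symptoms count 0."""
    return sum(weight * scores.get(name, 0)
               for name, weight in SYMPTOM_WEIGHTS.items())
```

Under these assumed weights the intensity stays on the same 0 to 10 scale as the individual scores, which makes values from different time periods directly comparable.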
[0063] When performing time-series alignment, the time of distribution is used as the baseline zero point, i.e., the zero point of the timeline. The infection probability value, transmission risk level, and predicted infection time window output from the risk assessment stage are extracted. The infection probability value ranges from 0 to 1, and this value is output by the aforementioned risk assessment network, maintaining consistency with the aforementioned risk assessment process. The transmission risk level is divided into three levels: low, medium, and high. This level is also output by the transmission risk level sub-layer of the aforementioned risk assessment network. Low risk corresponds to a probability value of 0 to 0.3, medium risk to 0.3 to 0.7, and high risk to 0.7 to 1.0. The predicted infection time window is calculated based on a statistical model of the incubation period of infectious diseases. The incubation period model is established based on historical epidemiological data, using a Weibull distribution to fit the incubation period distribution curve. The model parameters are a shape parameter of 2.3 and a scale parameter of 6.4 days. The 5th and 95th percentiles are calculated based on this model as the lower and upper bounds of the time window. The predicted infection time window is typically set to the 3rd to 10th day after distribution, indicating that if the person being tested is infected, symptoms are most likely to appear within this time period.
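The percentile calculation described above can be sketched as follows. This is a minimal illustration that assumes a standard two-parameter Weibull distribution with the stated shape (2.3) and scale (6.4 days); the function names are hypothetical. The bounds it returns depend only on those two parameters, so they need not coincide with the typical day-3-to-day-10 window quoted in the text.

```python
import math


def weibull_quantile(p: float, shape: float, scale: float) -> float:
    """Inverse CDF of a two-parameter Weibull distribution."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def infection_time_window(shape: float = 2.3, scale: float = 6.4,
                          lower_p: float = 0.05, upper_p: float = 0.95):
    """Return the (lower, upper) bounds of the time window, in days."""
    return (weibull_quantile(lower_p, shape, scale),
            weibull_quantile(upper_p, shape, scale))


if __name__ == "__main__":
    lower, upper = infection_time_window()
    print(f"window: day {lower:.2f} to day {upper:.2f}")
```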
[0064] The health tracking records are expanded along a timeline, with the origin at the time the distribution instruction was issued and the positive direction representing the time elapsed since issuance. Each tracking record is marked with its time offset relative to the origin, in hours. A predicted-actual control sequence is constructed and stored in an N×T×4-dimensional dual-channel tensor, where N is the total number of subjects, T is the number of hourly time steps, 336 (14 days × 24 hours), and the four entries of the last dimension are the predicted time point, the actual symptom onset time, the predicted probability value, and the actual test result. The predicted time point is the midpoint of the predicted infection time window, calculated by summing the lower and upper bounds of the window and dividing by 2. For example, if the window is from day 3 to day 10, the predicted time point is day 6.5, or 156 hours after issuance. The actual symptom onset time point is the time difference between the moment the subject first reports any of the core symptoms (fever, cough, or difficulty breathing) and the time of issuance. The predicted probability value is the aforementioned infection probability value. The actual test result is the result of the first laboratory test during the tracking period; a positive result is recorded as 1, a negative result as 0, and no test or an indeterminate result as -1.
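As a small illustration of the four values stored per subject, the sketch below computes the predicted time point and encodes the test result. The function and field names are hypothetical, and how the four values are laid out across the 336 hourly steps of the tensor is not specified in the text, so only a single per-subject record is built here.

```python
from typing import Dict, Optional


def predicted_time_point_hours(lower_day: float, upper_day: float) -> float:
    """Midpoint of the predicted infection time window, in hours after issuance."""
    return (lower_day + upper_day) / 2 * 24


def encode_test_result(result: Optional[str]) -> int:
    """Positive -> 1, negative -> 0, no test or indeterminate -> -1."""
    return {"positive": 1, "negative": 0}.get(result, -1)


def build_record(lower_day: float, upper_day: float,
                 onset_hours: float, probability: float,
                 result: Optional[str]) -> Dict[str, float]:
    """Assemble the four predicted/actual values for one subject."""
    return {
        "predicted_time_h": predicted_time_point_hours(lower_day, upper_day),
        "actual_onset_h": onset_hours,
        "predicted_probability": probability,
        "actual_test_result": encode_test_result(result),
    }


if __name__ == "__main__":
    print(build_record(3, 10, 120.0, 0.72, "positive"))
```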
[0065] The deviation analysis module calculates two core indicators: time deviation and probability deviation. Time deviation is defined as the difference between the actual symptom onset time and the predicted time, measured in days. A positive value indicates that the actual symptom onset was later than predicted, while a negative value indicates that the actual symptom onset was earlier than predicted. A time deviation threshold of -2 days is set. When the time deviation is less than -2 days, it is considered an early-onset deviation, indicating that the risk assessment model's prediction of the infection risk for this individual is lagging, and the actual infection time is at least 2 days earlier than the lower bound of the model's predicted time window.
[0066] The probability deviation is calculated by comparing the infection probability value with the actual infection status. The formula is: probability deviation = infection probability value - actual infection status. The actual infection status is determined from the test results, with 1 for positive and 0 for negative. An infection threshold of 0.6 is set to separate high and low infection risk. When the infection probability value is greater than 0.6 (i.e., the model predicts a high infection risk) but laboratory tests remain negative throughout the follow-up period (i.e., at least three consecutive negative results, with an interval of at least 48 hours between tests), the sample is considered to show an overestimation deviation. In this case, the probability deviation is positive and has a large absolute value, indicating that the model overestimated the infection risk of that subject.
[0067] Samples meeting either of the following deviation criteria—a time deviation less than -2 days or an absolute probability deviation greater than 0.4—are automatically marked as biased samples. The deviation type, deviation magnitude, and confidence decay coefficient are recorded. Deviation types include four categories: early-onset, overestimation, underestimation, and composite. Composite samples simultaneously meet both the time deviation and probability deviation criteria. The deviation magnitude is defined as the normalized weighted sum of the time deviation and probability deviation. The formula is: deviation magnitude = absolute value of time deviation divided by 14 days multiplied by 0.5 + absolute value of probability deviation multiplied by 0.5. After normalization, the deviation magnitude ranges from 0 to 1; a larger value indicates a more severe deviation from reality. The confidence decay coefficient is calculated based on the deviation magnitude. The formula is: confidence decay coefficient = 1 minus deviation magnitude multiplied by 0.8; a larger deviation magnitude results in a smaller confidence decay coefficient.
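The deviation indicators and the two formulas above can be sketched as follows. The thresholds and weights are taken from the text; the function names are hypothetical, and treating a probability deviation below -0.4 as "underestimation" is an assumption, since the text names that category without defining it.

```python
from typing import Optional

TIME_THRESHOLD_DAYS = -2.0  # early-onset threshold
PROB_THRESHOLD = 0.4        # absolute probability deviation threshold
TRACKING_DAYS = 14.0        # normalization constant for the time deviation


def time_deviation(actual_onset_days: float, predicted_days: float) -> float:
    """Actual onset time minus predicted time, in days."""
    return actual_onset_days - predicted_days


def probability_deviation(infection_probability: float, infected: int) -> float:
    """Infection probability value minus actual infection status (1 or 0)."""
    return infection_probability - infected


def deviation_type(t_dev: float, p_dev: float) -> Optional[str]:
    """Classify a sample, or return None if it meets neither criterion."""
    early = t_dev < TIME_THRESHOLD_DAYS
    prob = abs(p_dev) > PROB_THRESHOLD
    if early and prob:
        return "composite"
    if early:
        return "early-onset"
    if prob:
        # Assumption: negative probability deviation means underestimation.
        return "overestimation" if p_dev > 0 else "underestimation"
    return None


def deviation_magnitude(t_dev: float, p_dev: float) -> float:
    """Normalized weighted sum of the two deviations."""
    return abs(t_dev) / TRACKING_DAYS * 0.5 + abs(p_dev) * 0.5


def confidence_decay(magnitude: float) -> float:
    """Confidence decay coefficient = 1 - deviation magnitude x 0.8."""
    return 1.0 - magnitude * 0.8
```

For instance, a subject whose symptoms appear 3 days earlier than predicted, with a predicted probability of 0.2 and a positive test, is a composite sample with a deviation magnitude of roughly 0.51 and a decay coefficient of roughly 0.59.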
[0068] The feature extraction of deviation samples covers three dimensions. For the symptom feature combination, a symptom description vector is extracted in which eight parameters (fever intensity, cough frequency, fatigue index, dyspnea intensity, sore throat intensity, muscle ache intensity, headache intensity, and diarrhea frequency) are coded by grade. Fever intensity is graded by temperature range: normal (36.0-37.2 degrees Celsius) is coded as 0, low-grade fever (37.3-38.0 degrees Celsius) as 1, moderate fever (38.1-39.0 degrees Celsius) as 2, and high fever (39.1 degrees Celsius and above) as 3. Cough frequency is graded by the number of coughs per hour: no cough is coded as 0, occasional cough (1-3 times) as 1, frequent cough (4-10 times) as 2, and severe cough (more than 10 times) as 3. The other symptom parameters use a similar grading and coding method. After grading, each parameter is represented by one-hot encoding and occupies multiple dimensions of the vector. The eight symptom parameters together form a 128-dimensional symptom feature vector, in which 1 indicates the presence of the corresponding symptom level and 0 indicates its absence.
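The grading and encoding of the two parameters whose ranges are given can be sketched as below. This is illustrative only: the function names are hypothetical, the handling of temperatures that fall between the listed ranges (for example 38.05) is an assumption, and the text does not state how the 128 dimensions are divided among the eight parameters, so the sketch uses four dimensions per parameter.

```python
from typing import List


def fever_grade(temp_c: float) -> int:
    """Grade body temperature: 0 normal, 1 low-grade, 2 moderate, 3 high."""
    if temp_c < 37.3:
        return 0
    if temp_c <= 38.0:
        return 1
    if temp_c <= 39.0:
        return 2
    return 3


def cough_grade(coughs_per_hour: int) -> int:
    """Grade cough frequency: 0 none, 1 occasional, 2 frequent, 3 severe."""
    if coughs_per_hour <= 0:
        return 0
    if coughs_per_hour <= 3:
        return 1
    if coughs_per_hour <= 10:
        return 2
    return 3


def one_hot(grade: int, levels: int = 4) -> List[int]:
    """Return a vector with a single 1 at the position of the grade."""
    if not 0 <= grade < levels:
        raise ValueError("grade out of range")
    return [1 if i == grade else 0 for i in range(levels)]


def encode_symptoms(temp_c: float, coughs_per_hour: int) -> List[int]:
    """Concatenate the one-hot codes of the graded parameters."""
    return one_hot(fever_grade(temp_c)) + one_hot(cough_grade(coughs_per_hour))
```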
[0069] The contact relationship pattern statistics include: the number of close contacts (those within 2 meters and with contact lasting more than 15 minutes); the types of contact locations (homes, workplaces, public transportation, restaurants, medical institutions, etc.); the contact duration distribution (the duration of each contact, including minimum, maximum, average, and standard deviation); and the infection confirmation rate (the number of confirmed infections among contacts divided by the total number of contacts). These statistics are normalized: the number of close contacts is normalized by dividing by 100; the number of contact location types is normalized by dividing by 10; and the contact duration distribution statistics are normalized by dividing by 240 minutes (4 hours). The infection confirmation rate, being a ratio of 0 to 1, does not require normalization. The normalized statistics, combined with graph structural features such as node degree centrality and clustering coefficients in the contact relationship graph, constitute a 64-dimensional contact relationship vector.
[0070] The regional situation parameters extracted include: case growth rate (the average daily number of new cases in the past 7 days divided by the number of existing cases 7 days ago); medical resource occupancy rate (the number of isolation beds used in medical institutions in the region divided by the total number of beds); population mobility intensity (the total number of people flowing in and out of the region according to mobile phone signaling data divided by the number of permanent residents in the region); and epidemic prevention policy level (the intensity of the prevention and control measures implemented in the region is divided into four levels: low 1, medium 2, high 3, and strict 4). The case growth rate is multiplied by 100 and normalized to the range of 0 to 1. The medical resource occupancy rate and population mobility intensity are proportional values, and the epidemic prevention policy level is normalized by dividing by 4. These four normalized parameters, along with the regional historical epidemic fluctuation coefficient and the normalized value of regional population density, are collectively encoded into a 32-dimensional regional situation vector.
[0071] The three types of vectors are concatenated to form a 224-dimensional deviation feature vector, and all deviation samples constitute an M×224-dimensional deviation feature vector set.
[0072] Statistical analysis was performed on the set of deviation feature vectors. The chi-square test was used to identify feature dimensions significantly associated with deviation. The null hypothesis of the chi-square test is that there is no significant difference in the distribution of deviation samples and normal samples along a certain feature dimension. The null hypothesis was rejected when the chi-square value was greater than 3.841 (significance level 0.05), and this feature dimension was identified as a key deviation feature. The mutual information method was used to quantify the association strength. Mutual information measures the statistical dependence between two random variables. A mutual information value greater than 0.3 indicates a strong association between the feature and the deviation type, and feature pairs with a mutual information value greater than 0.3 are marked as strongly associated feature pairs.
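For a single binary feature dimension, the two statistics can be sketched from a 2×2 contingency table (deviation versus normal samples, feature present versus absent). The function names are hypothetical. The 3.841 cutoff is the chi-square critical value for one degree of freedom at the 0.05 level, which matches this 2×2 case; computing mutual information with the natural logarithm is an assumption, since the text does not state the unit behind the 0.3 threshold.

```python
import math

CHI2_CRITICAL = 3.841  # threshold stated in the text
MI_STRONG = 0.3        # strong-association threshold stated in the text


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the table [[a, b], [c, d]].

    a, b: deviation samples with / without the feature
    c, d: normal samples with / without the feature
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def mutual_information_2x2(a: int, b: int, c: int, d: int) -> float:
    """Mutual information (in nats) between the row and column variables."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    mi = 0.0
    for count, i, j in ((a, 0, 0), (b, 0, 1), (c, 1, 0), (d, 1, 1)):
        if count == 0:
            continue  # empty cells contribute nothing
        mi += (count / n) * math.log(count * n / (rows[i] * cols[j]))
    return mi


def is_key_deviation_feature(a: int, b: int, c: int, d: int) -> bool:
    """Reject the null hypothesis when the statistic exceeds the cutoff."""
    return chi_square_2x2(a, b, c, d) > CHI2_CRITICAL
```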
[0073] The statistical results generate a ranking list of bias features by importance (arranged in descending order of chi-square value) and a 224×224-dimensional feature correlation matrix (matrix elements are mutual information values). The top 20 feature dimensions are marked as high-importance features and used as the output for predicting bias features, guiding the direction of risk assessment network parameter updates and the calculation of confidence correction values.
[0074] This invention achieves a quantitative assessment of the accuracy of risk assessment prediction by using time-series correlation matching and deviation feature extraction.
[0075] In one optional implementation, the step of updating the parameters of the risk assessment network and generating a confidence correction value based on the prediction deviation features, and using the confidence correction value for the subsequent distribution priority calculation of the subjects to be tested, includes the following. Using the set of deviation feature vectors as training samples, gradient descent optimization is performed on the weight parameters of the feature fusion module and the risk assessment output layer in the risk assessment network. Based on the updated risk assessment network, the set of deviation feature vectors is re-predicted, and the improvement in prediction accuracy before and after the update is calculated. The improvement in prediction accuracy is then mapped to a confidence correction value; the greater the improvement in prediction accuracy, the higher the confidence correction value. For a newly added subject to be tested, after the risk assessment results are output by the risk assessment network, the subject's symptom feature combination and contact relationship pattern are extracted and matched for similarity against the deviation feature vector set. When the similarity exceeds the preset similarity threshold, the confidence correction value is used as a priority adjustment coefficient during the dynamic calculation of the subject's distribution priority (which is based on the risk classification identifier, transmission intensity index, and protective equipment inventory status), correcting the calculated distribution priority and generating a confidence-corrected distribution priority.
[0076] For example, each sample in the bias feature vector set is labeled with its actual risk level. The actual risk level is determined based on the test results during the tracking period of that sample. Positive samples are labeled as high risk, negative samples with persistent symptoms are labeled as medium risk, and negative samples with no symptoms are labeled as low risk. This constitutes the training dataset for supervised learning. The training dataset is divided into a training set and a validation set in an 8:2 ratio. The training set is used for parameter updates, and the validation set is used to monitor overfitting.
[0077] The backpropagation algorithm is used to optimize the parameters of the feature fusion module and the risk assessment output layer of the risk assessment network. The feature fusion module includes a symptom feature encoding branch and a contact relationship encoding branch. The symptom feature encoding branch adopts the aforementioned three fully connected layer structure, and the contact relationship encoding branch adopts the aforementioned graph convolutional network structure, consistent with the aforementioned risk assessment process. The feature vectors output from the two branches are weighted and fused through an attention mechanism and then fed into the risk assessment output layer, which includes an infection probability sublayer and a transmission risk level sublayer.
[0078] During gradient descent, the learning rate is set between 0.001 and 0.01, with the specific value adaptively adjusted based on the descent rate of the loss function. The Adam optimizer is used for weight updates, with its momentum parameter β1 set to 0.9, second-order momentum parameter β2 set to 0.999, and epsilon parameter set to 10^-8. This optimizer can adaptively adjust the learning rate of each parameter, accelerating convergence. The loss function uses cross-entropy loss, calculated as: Loss = -∑(Actual Label × log(Predicted Probability)). The total loss is calculated by averaging the loss values of all training samples. The number of training iterations is determined based on the convergence of the validation set loss function. The validation set loss is calculated after each epoch. Training stops when the validation set loss no longer decreases or decreases by less than 0.001 for five consecutive epochs to avoid overfitting.
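The loss formula and the stopping rule can be sketched in plain Python as below. The function names are hypothetical. The `floor` argument that guards against log(0) is an assumption and is unrelated to the optimizer's epsilon. Measuring improvement against the best validation loss seen so far is one reading of the stopping rule; comparing consecutive epochs would be another.

```python
import math
from typing import List, Sequence


def cross_entropy(labels: Sequence[float], probs: Sequence[float],
                  floor: float = 1e-12) -> float:
    """Loss = -sum(actual label x log(predicted probability)) for one sample."""
    return -sum(y * math.log(max(p, floor)) for y, p in zip(labels, probs))


def mean_loss(batch_labels: List[Sequence[float]],
              batch_probs: List[Sequence[float]]) -> float:
    """Average the per-sample losses over all training samples."""
    losses = [cross_entropy(y, p) for y, p in zip(batch_labels, batch_probs)]
    return sum(losses) / len(losses)


def should_stop(val_losses: Sequence[float], patience: int = 5,
                min_delta: float = 0.001) -> bool:
    """True once the validation loss has failed to improve by at least
    min_delta for `patience` consecutive epochs."""
    if not val_losses:
        return False
    best = val_losses[0]
    stale = 0
    for loss in val_losses[1:]:
        if best - loss >= min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return True
    return False
```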
[0079] During parameter updates, the weight matrix W and bias term b of the fully connected layer in the feature fusion module are updated using the gradients ∂loss/∂W and ∂loss/∂b, respectively. The update formulas are: W_new = W_old - learning rate × ∂loss/∂W and b_new = b_old - learning rate × ∂loss/∂b. The convolution kernel parameters of the graph convolutional network are likewise updated with gradients computed through backpropagation. The weight parameters of the infection probability sublayer and the transmission risk level sublayer in the risk assessment output layer are updated synchronously with these gradients to ensure end-to-end optimization of the entire network.
[0080] After the parameters are updated, the set of biased feature vectors is re-inputted into the updated risk assessment network to obtain new prediction results. The number of correctly predicted samples before and after the update is counted; a correct prediction is defined as the predicted risk level matching the actual labeled risk level. The accuracy before the update is calculated as: (Number of correctly predicted samples before update) / (Total number of biased samples). The accuracy after the update is calculated as: (Number of correctly predicted samples after update) / (Total number of biased samples). The improvement in prediction accuracy, ΔP, is obtained by subtracting the accuracy before the update from the accuracy after the update. The value typically ranges from 0 to 0.3; a larger ΔP indicates a more significant improvement in the network's ability to predict biased samples after the parameter update.
[0081] The accuracy improvement ΔP is converted into a confidence correction value using a piecewise linear mapping function. The mapping rules are set as follows: when ΔP is less than 0.05, the confidence correction value is 1.0, indicating almost no improvement and no correction is made; when ΔP is between 0.05 and 0.15, the confidence correction value increases linearly from 1.0 to 1.1; when ΔP is between 0.15 and 0.25, the confidence correction value increases linearly from 1.1 to 1.2; and when ΔP is greater than 0.25, the confidence correction value is fixed at 1.25 to avoid over-correction. The specific calculation formula is as follows: when 0.05 ≤ ΔP < 0.15, the confidence correction value = 1.0 + (ΔP - 0.05) × 1.0; when 0.15 ≤ ΔP < 0.25, the confidence correction value = 1.1 + (ΔP - 0.15) × 1.0; and when ΔP ≥ 0.25, the confidence correction value = 1.25. The greater the improvement, the higher the correction coefficient, indicating that the network's ability to judge similar samples is enhanced after correction. The confidence correction value is used as an adjustment coefficient for subsequent priority calculation.
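The accuracy comparison and the piecewise linear mapping can be sketched as follows. The function names are hypothetical; the breakpoints and slopes are those given in the formulas above, with ΔP of exactly 0.25 falling in the fixed 1.25 branch.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Share of samples whose predicted risk level matches the label."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def confidence_correction(delta_p: float) -> float:
    """Map the accuracy improvement delta_p to a confidence correction value."""
    if delta_p < 0.05:
        return 1.0                             # almost no improvement
    if delta_p < 0.15:
        return 1.0 + (delta_p - 0.05) * 1.0    # rises from 1.0 to 1.1
    if delta_p < 0.25:
        return 1.1 + (delta_p - 0.15) * 1.0    # rises from 1.1 to 1.2
    return 1.25                                # capped to avoid over-correction


if __name__ == "__main__":
    labels = ["high", "low", "medium", "low"]
    before = accuracy(["low", "low", "low", "low"], labels)
    after = accuracy(["high", "low", "low", "low"], labels)
    print(confidence_correction(after - before))
```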
[0082] For newly added individuals to be tested, the updated risk assessment network first outputs initial risk assessment results, including infection probability values and transmission risk levels. Symptom feature combinations and contact relationship patterns are then extracted from the individual's original input data. Symptom feature combinations include continuous or discrete features such as fever duration in hours, cough frequency in times per hour, and dyspnea score of 0-10, consistent with the aforementioned symptom feature extraction method, and are encoded into a 128-dimensional feature vector. Contact relationship patterns include elements such as the time interval between confirmed contacts in days, the type of contact location, and the duration of contact in minutes, consistent with the aforementioned contact relationship feature extraction method, forming a 64-dimensional contact history feature vector.
[0083] The symptom feature vector and the contact history feature vector are concatenated into a 192-dimensional comprehensive feature vector. This vector is a subset of the aforementioned 224-dimensional deviation feature vector, containing only the symptom and contact relationship components, excluding the regional situation component. Cosine similarity is calculated between this comprehensive feature vector and the first 192 dimensions (symptom and contact relationship components) of each sample in the deviation feature vector set. The cosine similarity formula is: Similarity = (Vector A · Vector B) / (||Vector A|| × ||Vector B||), where · represents the vector dot product, and || represents the Euclidean norm of the vectors. The calculated value ranges from -1 to 1; the closer the value is to 1, the more similar the two vectors are.
[0084] The preset similarity threshold is set to 0.75. This threshold represents the cosine value of the angle between the feature vectors, which corresponds to an angle of approximately 41 degrees. When the similarity between the comprehensive feature vector of a new object and any sample in the set of deviation feature vectors exceeds 0.75, the new object is determined to belong to the population with similar historical deviation patterns. The feature pattern of the object is highly similar to the samples that have shown deviations in historical predictions, and confidence correction needs to be applied.
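The similarity matching can be sketched as below. The function names are hypothetical, and returning 0 for a zero-length vector is an assumption, since the formula is undefined in that case. Stored deviation vectors are truncated to their first 192 dimensions before comparison, as described above.

```python
import math
from typing import Iterable, Sequence

SIMILARITY_THRESHOLD = 0.75


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """(A . B) / (||A|| x ||B||), using the Euclidean norm."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat an all-zero vector as dissimilar
    return dot / (norm_a * norm_b)


def matches_deviation_pattern(new_vector: Sequence[float],
                              deviation_set: Iterable[Sequence[float]],
                              dims: int = 192,
                              threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """True if any stored deviation sample exceeds the similarity threshold."""
    return any(cosine_similarity(new_vector, sample[:dims]) > threshold
               for sample in deviation_set)


if __name__ == "__main__":
    # The 0.75 threshold corresponds to an angle of about 41 degrees.
    print(round(math.degrees(math.acos(SIMILARITY_THRESHOLD)), 1))
```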
[0085] In the process of calculating the distribution priority, the original priority calculation method is consistent with the aforementioned comprehensive priority assessment value calculation method. Specifically, the risk classification identifier is mapped to a basic priority score of 80-100 points for high risk, 40-79 points for medium risk, and 20-39 points for low risk. The transmission intensity index is converted into an urgency coefficient of 0.5-2.0. The two are multiplied to generate a preliminary priority score, which is then combined with the infection probability value with weights of 0.7 and 0.3 to calculate the original priority, with a value range of 0 to 200.
[0086] When a new recipient is identified as belonging to a group with a similar pattern to the deviation sample, the corresponding confidence correction value is used as a multiplication factor and multiplied by the original priority to obtain the corrected distribution priority. The calculation formula is: Corrected Priority = Original Priority × Confidence Correction Value. For example, if a recipient's original priority is 85 points, and its similarity to the deviation sample is 0.82, exceeding the threshold of 0.75, the corresponding confidence correction value is 1.12. Then, the corrected priority = 85 × 1.12 = 95.2 points. This recipient's distribution priority increases by 10.2 points, moving them to a higher position in the distribution priority ranking sequence, giving them priority in receiving protective equipment allocation.
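The correction step and the worked example above can be reproduced with a short sketch. The function names and the recipient tuples are hypothetical; recipients whose similarity does not exceed the threshold keep their original priority.

```python
from typing import List, Tuple

SIMILARITY_THRESHOLD = 0.75


def corrected_priority(original: float, similarity: float,
                       correction: float,
                       threshold: float = SIMILARITY_THRESHOLD) -> float:
    """Corrected priority = original priority x confidence correction value,
    applied only when the similarity exceeds the threshold."""
    if similarity > threshold:
        return original * correction
    return original


def rank(recipients: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort (recipient id, priority) pairs from highest to lowest priority."""
    return sorted(recipients, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    a = corrected_priority(85, 0.82, 1.12)  # similar to a deviation sample
    b = corrected_priority(90, 0.40, 1.12)  # not similar, left unchanged
    print(rank([("A", a), ("B", b)]))
```

In this example recipient A moves from 85 to about 95.2 points and therefore ranks ahead of recipient B, who stays at 90.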
[0087] The logical basis for the correction operation is that historical deviation samples reflect inaccurate predictions by the risk assessment network under specific characteristic patterns. While parameter updates improve network performance, caution is still needed for new objects highly similar to the deviation samples. The application of confidence correction values essentially adds safety redundancy to the risk prediction of such objects, ensuring that even if the network prediction still has some error, increasing the allocation priority guarantees that the object receives protection resources in a timely manner, reducing the risk of protection gaps due to inaccurate predictions. The corrected allocation priority then enters the aforementioned allocation priority ranking sequence generation process, participating in multi-objective optimization and hierarchical resource allocation, ultimately outputting the corrected allocation instruction.
[0088] The confidence level correction values and their corresponding deviation feature patterns are persistently stored to form a deviation knowledge base. This knowledge base continuously accumulates as the system runs. Each time a new deviation sample is generated, the deviation feature vector set is updated and the network is retrained to calculate a new confidence level correction value. The knowledge base supports version management, retaining historical correction values for traceability and auditing. When the number of deviation samples accumulates to more than 100 or the accuracy improvement ΔP changes by more than 0.1, a full retraining of the network is triggered to ensure that the risk assessment network remains adaptable to the current epidemic situation and population characteristics.
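The retraining trigger can be sketched as a simple predicate. The function and parameter names are hypothetical. Reading "ΔP changes by more than 0.1" as the absolute difference between the current and previous ΔP values is an assumption, as is counting the deviation samples accumulated in the knowledge base rather than those added since the last retraining.

```python
def needs_full_retraining(n_deviation_samples: int,
                          delta_p: float,
                          previous_delta_p: float,
                          sample_limit: int = 100,
                          delta_limit: float = 0.1) -> bool:
    """True when more than `sample_limit` deviation samples have accumulated
    or the accuracy improvement has shifted by more than `delta_limit`."""
    too_many_samples = n_deviation_samples > sample_limit
    large_shift = abs(delta_p - previous_delta_p) > delta_limit
    return too_many_samples or large_shift


if __name__ == "__main__":
    print(needs_full_retraining(101, 0.10, 0.10))  # sample count exceeded
    print(needs_full_retraining(50, 0.12, 0.10))   # neither condition met
```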
[0089] This invention achieves continuous learning and optimization of the risk assessment network through a parameter update and confidence correction mechanism driven by deviation features, and dynamically adjusts the allocation priority for similar groups of people with historical prediction deviations, thereby improving the accuracy of resource allocation.
[0090] A second aspect of the present invention provides a deep learning-based infectious disease pre-detection and protective equipment distribution system, comprising: a data input unit, used to acquire symptom description data and contact history data of the subject to be tested, input them into the risk assessment network, and output the risk assessment results; a risk classification unit, used to determine the risk classification identifier and transmission intensity index of the subject to be tested based on the risk assessment results and regional situation data; a priority calculation unit, used to dynamically calculate the distribution priority of each subject to be tested based on the risk classification identifier, the transmission intensity index, and the inventory status of protective equipment, and to generate the distribution instruction for protective equipment; a distribution execution unit, used to execute the distribution operation according to the distribution instruction and collect distribution feedback data; and a network optimization unit, used to perform time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results, extract prediction deviation features, update the parameters of the risk assessment network and generate a confidence correction value based on the prediction deviation features, and use the confidence correction value for the subsequent distribution priority calculation of the subjects to be tested.
[0091] A third aspect of the present invention provides an electronic device, comprising: a processor; and a memory used to store processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0092] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0093] This invention can be a method, apparatus, system, and / or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of the invention.
[0094] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for infectious disease pre-detection and distribution of protective equipment based on deep learning, characterized in that it comprises: acquiring symptom description data and exposure history data of the subjects to be tested, inputting them into the risk assessment network, and outputting risk assessment results; determining the risk classification identifier and transmission intensity index of each subject to be tested based on the risk assessment results and regional situation data; dynamically calculating, using a multi-objective optimization algorithm, the distribution priority of each subject to be tested based on the risk classification identifier, the transmission intensity index, and the protective equipment inventory status, and generating distribution instructions for protective equipment; executing the distribution operation according to the distribution instructions and collecting distribution feedback data; and performing time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results to extract prediction deviation features, updating the parameters of the risk assessment network and generating a confidence correction value based on the prediction deviation features, and using the confidence correction value in the distribution priority calculation of subsequent subjects to be tested.
2. The method according to claim 1, characterized in that, The steps for obtaining symptom description data and exposure history data of the subjects to be tested, inputting them into the risk assessment network, and outputting risk assessment results include: Semantic features are extracted from the symptom description data, and the extracted symptom semantic features are matched with a preset infectious disease symptom feature database to generate a symptom matching degree vector; The contact history data is analyzed in a spatiotemporal manner to extract contact time nodes, contact location coordinates, and contact object identifiers, and a spatiotemporal contact relationship map is constructed. The symptom matching vector and the spatiotemporal contact relationship map are input into the risk assessment network. The symptom features and contact relationship features are cross-correlated and encoded through a multi-layer feature fusion module to generate a fused feature representation. Based on the fusion feature representation, the infection probability value and transmission risk level are calculated through the risk assessment output layer as the risk assessment result.
3. The method according to claim 1, characterized in that, the step of determining the risk classification identifier and transmission intensity index of the subject to be tested based on the risk assessment results and regional situation data includes: comparing the infection probability value in the risk assessment results with a preset set of risk thresholds to determine the preliminary risk level identifier of the subject; performing spatiotemporal distribution analysis on the regional situation data to extract the number of confirmed cases and the population density parameters at the subject's location, and constructing a regional transmission situation feature vector; performing weighted fusion of the preliminary risk level identifier, the transmission risk level in the risk assessment results, and the regional transmission situation feature vector, with the number of confirmed cases used as the regional risk weight coefficient to dynamically correct the preliminary risk level identifier, thereby generating the risk classification identifier of the subject to be tested; and calculating, based on the risk classification identifier and the regional transmission situation feature vector, the virus transmission rate and infection spread radius at the subject's location, and generating the transmission intensity index.
4. The method according to claim 1, characterized in that... Based on the risk classification identifier, transmission intensity index, and protective equipment inventory status, the steps for dynamically calculating the distribution priority of each target object and generating a distribution instruction for protective equipment using a multi-objective optimization algorithm include: Obtain the current inventory status of protective equipment, extract the remaining quantity of each type of protective equipment, and construct a set of inventory constraints; The risk classification identifier is mapped to a basic priority score, and the transmission intensity index is converted into an urgency coefficient. The two are then multiplied to generate a preliminary priority score. The infection probability value from the risk assessment results is combined with the preliminary priority score to generate a comprehensive priority assessment value. Based on the comprehensive priority evaluation value and the set of inventory constraints, a multi-objective optimization function containing resource utilization and distribution fairness objectives is constructed using a multi-objective optimization algorithm. The distribution priority of each object to be detected is iteratively solved to generate a distribution priority ranking sequence, wherein the remaining quantity is used for constraint calculation of the resource utilization objective. Based on the distribution priority sorting sequence and the inventory constraint set, the distribution type and quantity of each object to be tested are determined, and a distribution instruction for protective equipment is generated.
5. The method according to claim 4, characterized in that the step of generating the distribution priority ranking sequence includes: extracting the epidemic transmission rate, the confirmed case growth rate, and the inventory depletion rate for the current period; calculating, using an adaptive weight adjustment mechanism, the objective weight coefficients for resource utilization and distribution fairness, wherein the epidemic transmission rate and the confirmed case growth rate act as positively correlated factors that increase the resource utilization objective weight coefficient; dividing all subjects to be tested into a high-risk group, a medium-risk group, and a low-risk group according to the risk classification identifier, and calculating the resource quota ratio of each group based on the resource utilization objective weight coefficient; performing Pareto optimization on the high-risk group to generate a high-risk group distribution sequence, allocating the remaining resources to the medium-risk group to generate a medium-risk group distribution sequence, and then allocating the remaining resources to the low-risk group to generate a low-risk group distribution sequence, wherein the distribution fairness objective weight coefficient of the low-risk group is higher than that of the high-risk group; and merging the high-risk, medium-risk, and low-risk group distribution sequences according to group priority to generate the distribution priority ranking sequence.
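A minimal sketch of the adaptive weighting and group-by-group allocation in claim 5 is given below, under several simplifying assumptions: rates are normalized to [0, 1], each subject receives one unit, sorting by score within a group stands in for Pareto optimization, and the quota formula is invented for illustration. The inventory depletion rate and the per-group fairness weights are left out because the claim does not state how they enter the calculation.

```python
def objective_weights(transmission_rate, case_growth_rate):
    """Return (w_utilization, w_fairness); both rates are assumed to lie in [0, 1]."""
    # Higher transmission and case growth raise the utilization weight, capped at 0.9.
    w_util = min(0.9, 0.5 + 0.2 * transmission_rate + 0.2 * case_growth_rate)
    return w_util, 1.0 - w_util


def group_quotas(w_util, total):
    """Hypothetical quota rule: high-risk share follows the utilization weight."""
    high = round(total * w_util)
    medium = round((total - high) * 2 / 3)
    return {"high": high, "medium": medium, "low": total - high - medium}


def ranking_sequence(people, total_units, w_util):
    """people: list of (id, group, score). Returns ids in distribution order."""
    quotas = group_quotas(w_util, total_units)
    carry, sequence = 0, []
    for group in ("high", "medium", "low"):       # merge order follows group priority
        budget = quotas[group] + carry            # unused units cascade to the next group
        members = sorted(
            (p for p in people if p[1] == group),
            key=lambda p: p[2],
            reverse=True,
        )
        served = members[:budget]
        sequence.extend(p[0] for p in served)
        carry = budget - len(served)
    return sequence
```

If the high-risk group is smaller than its quota, the unused units pass to the medium-risk group, which mirrors the "remaining resources" wording of the claim.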
6. The method according to claim 1, characterized in that the step of performing time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results to extract prediction deviation features includes: timestamping the distribution feedback data, and extracting the health status tracking records of each subject to be tested after the distribution of protective equipment; aligning the health status tracking records in time series with the risk assessment results of the corresponding subjects to be tested to construct a predicted-actual comparison sequence, wherein the risk assessment results include the infection probability value at the time of distribution; performing deviation analysis on the predicted-actual comparison sequence, wherein, when the actual diagnosis time is earlier than the time window predicted by the risk assessment results, or when the infection probability value at the time of distribution is higher than the set infection threshold but no infection actually occurred, the corresponding sample is marked as a deviation sample; and extracting the symptom feature combinations, contact relationship patterns, and regional situation parameters corresponding to the deviation samples, constructing a set of deviation feature vectors, and statistically analyzing the occurrence frequency and correlation strength of each type of deviation feature to obtain the prediction deviation features.
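The two deviation conditions in claim 6 can be expressed compactly. The sketch below assumes a hypothetical `Record` structure in which times are counted in days after distribution and features are plain labels; it computes only the occurrence frequency of each feature among deviation samples and leaves out the correlation strength, which the claim does not define.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Assumed predicted-actual comparison entry for one subject."""
    subject_id: str
    predicted_prob: float          # infection probability at the time of distribution
    window_start_day: int          # start of the predicted infection time window
    diagnosis_day: Optional[int]   # actual diagnosis day, None if never infected
    features: tuple                # symptom, contact and region labels


def is_deviation(r: Record, threshold: float = 0.5) -> bool:
    """True when the record meets either deviation condition of the claim."""
    diagnosed_early = r.diagnosis_day is not None and r.diagnosis_day < r.window_start_day
    false_alarm = r.predicted_prob > threshold and r.diagnosis_day is None
    return diagnosed_early or false_alarm


def deviation_feature_frequency(records, threshold: float = 0.5) -> dict:
    """Share of deviation samples in which each feature label occurs."""
    deviations = [r for r in records if is_deviation(r, threshold)]
    if not deviations:
        return {}
    counts = Counter(f for r in deviations for f in r.features)
    return {f: c / len(deviations) for f, c in counts.items()}
```

A subject predicted at 0.9 who never became infected and a subject diagnosed on day 2 against a window starting on day 5 are both flagged, while a subject diagnosed inside the predicted window is not.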
7. The method according to claim 6, characterized in that the step of updating the parameters of the risk assessment network based on the prediction deviation features, generating a confidence correction value, and using the confidence correction value for subsequent distribution priority calculation of the subjects to be tested includes: using the set of deviation feature vectors as training samples, performing gradient descent optimization on the weight parameters of the feature fusion module and the risk assessment output layer in the risk assessment network; re-predicting the set of deviation feature vectors with the updated risk assessment network, calculating the improvement in prediction accuracy before and after the update, and mapping the improvement in prediction accuracy to the confidence correction value, wherein the greater the improvement in prediction accuracy, the higher the confidence correction value; for a newly added subject to be tested, after the risk assessment results are output by the risk assessment network, extracting the symptom feature combination and contact relationship pattern of the subject to be tested and performing similarity matching against the set of deviation feature vectors; and, when the similarity exceeds a preset similarity threshold, using the confidence correction value as a priority adjustment coefficient to correct the distribution priority calculated for the subject to be tested, so as to generate a confidence-corrected distribution priority.
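To make the feedback loop of claim 7 concrete, the following sketch replaces the risk assessment network with a single logistic layer, which is a deliberate simplification and not the claimed architecture. The mapping from accuracy gain to correction value, the use of cosine similarity, and all function names are assumptions for illustration.

```python
import math


def predict(w, x):
    """Logistic output of a single linear layer (stand-in for the network)."""
    return 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))


def accuracy(w, samples):
    """Fraction of (features, label) samples classified correctly at 0.5."""
    return sum((predict(w, x) >= 0.5) == bool(y) for x, y in samples) / len(samples)


def fine_tune(w, samples, lr=0.5, epochs=200):
    """Stochastic gradient descent on log loss over the deviation samples."""
    w = list(w)
    for _ in range(epochs):
        for x, y in samples:
            err = predict(w, x) - y  # gradient of log loss with respect to the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def confidence_correction(acc_before, acc_after):
    """Assumed monotone mapping: larger accuracy gain gives a larger coefficient."""
    return 1.0 + max(0.0, acc_after - acc_before)


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def corrected_priority(priority, x, deviation_vectors, correction, sim_threshold=0.9):
    """Apply the correction only when x resembles a known deviation vector."""
    if any(cosine(x, d) > sim_threshold for d in deviation_vectors):
        return priority * correction
    return priority
```

A new subject whose feature vector closely matches a stored deviation vector has its priority scaled by the correction value, while a dissimilar subject keeps the originally calculated priority.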
8. A deep learning-based infectious disease pre-detection and protective equipment distribution system, used to implement the method of any one of claims 1 to 7, characterized by comprising: a data input unit, configured to acquire symptom description data and contact history data of the subject to be tested, input them into the risk assessment network, and output the risk assessment results; a risk classification unit, configured to determine the risk classification identifier and the transmission intensity index of the subject to be tested based on the risk assessment results and the regional situation data; a priority calculation unit, configured to dynamically calculate the distribution priority of each subject to be tested based on the risk classification identifier, the transmission intensity index, and the protective equipment inventory status, and to generate the distribution instruction for protective equipment; a distribution execution unit, configured to execute the distribution operation according to the distribution instruction and to collect distribution feedback data; and a network optimization unit, configured to perform time-series correlation matching between the health status tracking information in the distribution feedback data and the risk assessment results, extract prediction deviation features, update the parameters of the risk assessment network based on the prediction deviation features, generate a confidence correction value, and use the confidence correction value for subsequent distribution priority calculation of the subjects to be tested.
9. An electronic device, characterized by comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 7.