Evaluation method for infectious disease transmission trend deduction data based on large model assistance
By parsing multi-source event texts using a large language model and combining it with a time decay kernel function, a causal logic assessment and physical compliance assessment of infectious disease transmission trends are generated. This solves the problem of the lack of physical constraints in infectious disease inference models and improves the accuracy and reliability of the assessment.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- INST OF AUTOMATION CHINESE ACAD OF SCI
- Filing Date
- 2026-05-15
- Publication Date
- 2026-06-12
AI Technical Summary
Existing models for predicting the spread of infectious diseases lack physical constraints, are prone to generating irrational data, and are difficult to cope with interference from real-world events. The existing assessment system is unable to effectively evaluate the rationality of the predicted data, resulting in low accuracy and credibility of the assessment results.
We use a pre-trained large language model to perform semantic parsing and encoding on multi-source event texts, extract event parameter tuples, combine them with time decay kernel functions to generate event impact coefficient sequences, and generate reasonableness assessment results through causal logic evaluation and physical compliance evaluation.
This improves the accuracy and reliability of infectious disease transmission projection data, provides a reliable basis for assessment, and ensures the scientific validity and rationality of the assessment results.
Figure CN122196460A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, and more specifically, to an evaluation method for infectious disease transmission trend projection data based on large model assistance.
Background Technology
[0002] Accurate projection of infectious disease transmission trends is central to public health monitoring and decision-making. It not only provides a scientific basis for intervention strategy formulation but also serves as a crucial support for optimizing medical resource allocation and assessing social risks. In recent years, with the deep integration of computational epidemiology and artificial intelligence technologies, infectious disease modeling paradigms are undergoing a profound evolution from classical dynamic models (such as susceptible-infected-recovered models and susceptible-latent-infected-recovered models) to data-driven models represented by deep learning. These high-dimensional nonlinear models, with their powerful feature extraction and sequence fitting capabilities, have demonstrated significant predictive performance when processing routine transmission data with complex spatiotemporal dependencies.
[0003] However, in practical deployments, the conflict between this modeling approach based on "probability mapping" and "consistency with physical mechanisms" is becoming increasingly prominent. While data-driven models perform well in statistical metrics, they are essentially "black box" mappings lacking inherent constraints. They often overemphasize fitting historical observations while neglecting the intrinsic dynamics of infectious disease development. This lack of physical consistency easily leads to "data illusions," where the generated predicted sequences may contain non-physical negative values, or even predict abnormal fluctuations that violate the basic reproduction number limit without external disturbances, producing deductions that contradict real-world logic. Furthermore, infectious disease transmission, as an open and complex system subject to dynamic adjustments in intervention measures, pathogen biological variations, and dramatic disturbances in social behavior, exhibits typical non-stationarity in its evolutionary trajectory. Most related predictive models rely on the "statistical inertia" of historical data for extrapolation, making it difficult to achieve real-time perception and dynamic mapping of environmental disturbance signals and complex exogenous factors. Consequently, when faced with structural breakpoints such as changes in response strategies or viral mutations, these models often fail to capture causal logic, resulting in lagging or misleading trend predictions.
[0004] More seriously, current evaluation methods for projection results remain at the stage of simply measuring statistical errors, making it difficult to effectively determine the "intrinsic rationality" of the predicted data. Related evaluation systems have not yet effectively integrated complex environmental factors and domain-specific prior constraints, often making it difficult for decision-makers to select scientifically reliable solutions from numerous projection results. Therefore, constructing a rationality evaluation system that does not rely on a specific model architecture and can automatically identify and correct non-physical projection biases has become an urgent need to improve the scientific decision-making level of public health emergency response.
Summary of the Invention
[0005] In view of this, this application provides an evaluation method, apparatus, and device for infectious disease transmission trend projection data based on large model assistance.
[0006] One aspect of this application provides an evaluation method for infectious disease transmission trend projection data based on large model assistance, comprising: acquiring an infectious disease transmission projection data sequence to be evaluated, the sequence including multiple infectious disease transmission prediction data for a target area arranged in chronological order within a predetermined time window; obtaining text semantic feature vectors by semantically parsing and encoding multi-source event texts of the target area using a pre-trained large language model, and extracting event parameter tuples for multiple events from the multi-source event texts by calculating the relative projection distance between the text semantic feature vectors and a preset extreme value semantic benchmark vector, each event parameter tuple including the event occurrence time, the event type, and an event intensity characterizing the strength of the event's impact on infectious disease transmission; for each event, using a time decay kernel function matched to the event type to generate, based on the event intensity, the event onset time, and the range of event impact coefficient values corresponding to the event type, an event impact coefficient sequence characterizing the impact of the event on infectious disease transmission, where the event onset time is determined from the event occurrence time and the lag parameter corresponding to the event type; taking the comprehensive event impact coefficient sequence, obtained by linearly superimposing and fusing the multiple event impact coefficient sequences and then applying nonlinear truncation, as a causal prior for the fluctuation trend of the effective reproduction number, and generating a causal logic evaluation result by comparing the dynamic consistency between the projection data sequence and the comprehensive event impact coefficient sequence at key feature points; and generating a rationality evaluation result based on the causal logic evaluation result, a smoothness evaluation result characterizing the smoothness of the projection data sequence, and a physical compliance evaluation result characterizing whether the projection data sequence satisfies physical boundary constraints.
[0007] Another aspect of this application provides an evaluation device for infectious disease transmission trend projection data based on large model assistance, comprising: a data acquisition module for acquiring an infectious disease transmission projection data sequence to be evaluated, the sequence including multiple infectious disease transmission prediction data for a target area arranged in chronological order within a predetermined time window; a parsing and extraction module for obtaining text semantic feature vectors by semantically parsing and encoding multi-source event texts of the target area using a pre-trained large language model, and for extracting event parameter tuples for multiple events from the multi-source event texts by calculating the relative projection distance between the text semantic feature vectors and a preset extreme value semantic benchmark vector, each event parameter tuple including the event occurrence time, the event type, and an event intensity characterizing the strength of the event's impact on infectious disease transmission; an impact determination module for generating, for each event and using a time decay kernel function matched to the event type, an event impact coefficient sequence characterizing the impact of the event on infectious disease transmission, based on the event intensity, the event onset time, and the range of event impact coefficient values corresponding to the event type, where the event onset time is determined from the event occurrence time and the lag parameter corresponding to the event type; an impact assessment module for taking the comprehensive event impact coefficient sequence, obtained by linearly superimposing and fusing the multiple event impact coefficient sequences and then applying nonlinear truncation, as a causal prior for the fluctuation trend of the effective reproduction number, and for generating a causal logic assessment result by comparing the dynamic consistency between the projection data sequence and the comprehensive event impact coefficient sequence at key feature points; and a result generation module for generating a rationality assessment result based on the causal logic assessment result, a smoothness assessment result characterizing the smoothness of the projection data sequence, and a physical compliance assessment result characterizing whether the projection data sequence satisfies physical boundary constraints.
[0008] Another aspect of this application provides an electronic device comprising: one or more processors; and a memory for storing one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more programs cause the one or more processors to perform the method as described above.
[0009] Another aspect of this application provides a computer-readable storage medium storing computer-executable instructions that, when executed, are used to implement the method described above.
[0010] Another aspect of this application provides a computer program product comprising computer-executable instructions which, when executed, are used to implement the method described above.
[0011] According to embodiments of this application, a pre-trained large language model is used to semantically parse and encode unstructured, multi-source exogenous event information for a target region and to extract from it the event occurrence time, event type, and event intensity of each event. Using a time decay kernel function and a range of event impact coefficient values matched to the event type, an event impact coefficient sequence representing the event's impact on infectious disease transmission is determined based on the event intensity and the event occurrence time. A comprehensive event impact coefficient sequence is obtained by fusing and truncating multiple event impact coefficient sequences. By comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points, a causal logic assessment of the fluctuation trend of the projection data sequence is achieved. This allows the assessment process to incorporate multi-source exogenous event information that influences infectious disease transmission. In addition to the causal logic assessment, smoothness and physical compliance assessments are performed on the projection data. This multi-dimensional assessment of the rationality of the projection data sequence indirectly evaluates the rationality of the projection method that produced it, which improves the accuracy and credibility of the assessment results and provides a reliable basis for evaluating infectious disease transmission projection methods.
Brief Description of the Drawings
[0012] The above and other objects, features and advantages of this application will become clearer from the following description of embodiments of this application with reference to the accompanying drawings.
[0013] Figure 1 shows an application scenario of the evaluation method for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application.
[0014] Figure 2 shows a flowchart of the evaluation method for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application.
[0015] Figure 3 shows a schematic diagram of determining the event impact coefficient sequence according to an embodiment of this application.
[0016] Figure 4 shows a schematic diagram of determining the rationality assessment result according to an embodiment of this application.
[0017] Figure 5 shows a block diagram of an evaluation apparatus for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application.
[0018] Figure 6 shows a block diagram of an electronic device suitable for implementing the evaluation method for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application.
Detailed Implementation
[0019] The embodiments of this application will now be described with reference to the accompanying drawings. However, it should be understood that these descriptions are exemplary only and are not intended to limit the scope of this application. In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the embodiments of this application for ease of explanation. However, it will be apparent that one or more embodiments may be implemented without these specific details. Furthermore, descriptions of well-known structures and technologies are omitted in the following description to avoid unnecessarily obscuring the concepts of this application.
[0020] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of this application. The terms “comprising,” “including,” etc., as used herein indicate the presence of the stated features, steps, operations, and / or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.
[0021] All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein are to be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.
[0022] When using expressions such as "at least one of A, B and C", they should generally be interpreted in accordance with the meaning that is commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" should include, but is not limited to, a system having A alone, a system having B alone, a system having C alone, a system having A and B, a system having A and C, a system having B and C, and / or a system having A, B and C, etc.).
[0023] In the embodiments of this application, the collection, updating, analysis, processing, use, transmission, provision, application, and storage of data (e.g., including but not limited to user personal information) comply with relevant laws and regulations, are used for legitimate purposes, and do not violate public order and good morals. In particular, necessary measures have been taken to prevent unauthorized access to user personal information data and to safeguard user personal information security, network security, and national security.
[0024] In the embodiments of this application, the user's authorization or consent was obtained before obtaining or collecting the user's personal information.
[0025] With the development of big data and artificial intelligence, infectious disease simulation has evolved from classical dynamic models to deep learning models with stronger predictive performance. However, these data-driven models lack physical mechanism constraints due to over-reliance on statistical fitting, making them prone to generating irrational data (such as negative values, out of touch with reality, etc.) and unable to cope with interference from real-world events. In addition, the existing evaluation system also focuses on statistical errors while neglecting in-depth measurement of the rationality of the simulation data, which leads to challenges to the accuracy and credibility of the overall simulation and evaluation.
[0026] Therefore, embodiments of this application provide an evaluation method for infectious disease transmission trend projection data based on large model assistance, comprising: acquiring an infectious disease transmission projection data sequence to be evaluated, the sequence including multiple infectious disease transmission prediction data for a target area arranged in chronological order within a predetermined time window; obtaining text semantic feature vectors by semantically parsing and encoding multi-source event texts of the target area using a pre-trained large language model, and extracting event parameter tuples for each event from the multi-source event texts by calculating the relative projection distance between the text semantic feature vectors and a preset extreme value semantic benchmark vector, each event parameter tuple including the event occurrence time, the event type, and an event intensity characterizing the strength of the event's impact on infectious disease transmission; for each event, using a time decay kernel function matched to the event type to generate, based on the event intensity, the event onset time, and the range of event impact coefficient values corresponding to the event type, an event impact coefficient sequence characterizing the impact of the event on infectious disease transmission, where the event onset time is determined from the event occurrence time and the lag parameter corresponding to the event type; taking the comprehensive event impact coefficient sequence, obtained by linearly superimposing and fusing the multiple event impact coefficient sequences and then applying nonlinear truncation, as a causal prior for the fluctuation trend of the effective reproduction number, and generating a causal logic evaluation result by comparing the dynamic consistency between the projection data sequence and the comprehensive event impact coefficient sequence at key feature points; and generating a rationality evaluation result based on the causal logic evaluation result, a smoothness evaluation result characterizing the smoothness of the projection data sequence, and a physical compliance evaluation result characterizing whether the projection data sequence satisfies physical boundary constraints.
[0027] The embodiments of this application use a pre-trained large language model to semantically parse and encode unstructured, multi-source exogenous event information for a target region and to extract from it the event occurrence time, event type, and event intensity of each event. Using a time decay kernel function and a range of event impact coefficient values matched to the event type, an event impact coefficient sequence characterizing the event's impact on infectious disease transmission is determined based on the event intensity and the event occurrence time. A comprehensive event impact coefficient sequence is obtained by fusing and truncating multiple event impact coefficient sequences. By comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points, a causal logic assessment of the fluctuation trend of the projection data sequence is achieved. This incorporates multi-source exogenous event information that influences infectious disease transmission into the assessment process. In addition to the causal logic assessment, smoothness and physical compliance assessments are performed on the projection data. This multi-dimensional assessment of the rationality of the projection data sequence indirectly evaluates the rationality of the projection method that produced it, which improves the accuracy and credibility of the assessment results and provides a reliable basis for evaluating infectious disease transmission projection methods.
[0028] Figure 1 shows an application scenario of the evaluation method for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application. It should be noted that Figure 1 shows merely an example of an application scenario to which the embodiments of this application can be applied, in order to help those skilled in the art understand the technical content of this application; it does not mean that the embodiments of this application cannot be used in other devices, systems, environments, or scenarios.
[0029] As shown in Figure 1, the application scenario according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 serves as a medium for providing a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired and / or wireless communication links, etc.
[0030] Users can use the first terminal device 101, the second terminal device 102, and the third terminal device 103 to interact with the server 105 via the network 104 to receive or send messages, etc. Various communication client applications can be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and / or social media platform software, etc. (for example only).
[0031] The first terminal device 101, the second terminal device 102, and the third terminal device 103 can be various electronic devices with displays and support web browsing, including but not limited to smartphones, tablets, laptops, and desktop computers.
[0032] Server 105 can be a server that provides various services, such as a backend management server that supports websites browsed by users using the first terminal device 101, the second terminal device 102, and the third terminal device 103 (this is just an example). The backend management server can analyze and process data such as received user requests, and feed back the processing results (such as web pages, information, or data obtained or generated according to user requests) to the terminal devices.
[0033] It should be noted that the evaluation method for infectious disease transmission trend projection data based on large model assistance provided in this application embodiment can generally be executed by server 105. Correspondingly, the evaluation device for infectious disease transmission trend projection data based on large model assistance provided in this application embodiment can generally be located in server 105. The evaluation method for infectious disease transmission trend projection data based on large model assistance provided in this application embodiment can also be executed by a server or server cluster that is different from server 105 and capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and / or server 105. Correspondingly, the evaluation device for infectious disease transmission trend projection data based on large model assistance provided in this application embodiment can also be located in a server or server cluster that is different from server 105 and capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and / or server 105. Alternatively, the evaluation method for infectious disease transmission trend projection data based on large model assistance provided in this application embodiment can also be executed by the first terminal device 101, the second terminal device 102, or the third terminal device 103, or by other terminal devices different from the first terminal device 101, the second terminal device 102, or the third terminal device 103. 
Correspondingly, the evaluation device for infectious disease transmission trend projection data based on large model assistance provided in this application embodiment can also be located in the first terminal device 101, the second terminal device 102, or the third terminal device 103, or in other terminal devices different from the first terminal device 101, the second terminal device 102, or the third terminal device 103.
[0034] It should be understood that the numbers of first terminal devices, second terminal devices, third terminal devices, networks, and servers shown in Figure 1 are merely illustrative. Depending on implementation needs, any number of first terminal devices, second terminal devices, third terminal devices, networks, and servers can be included.
[0035] Figure 2 shows a flowchart of the evaluation method for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application.
[0036] As shown in Figure 2, the method includes operations S210~S250.
[0037] In operation S210, the infectious disease transmission projection data sequence to be evaluated is acquired.
[0038] In operation S220, based on the text semantic feature vector obtained by semantic parsing and encoding the multi-source event text of the target region using a pre-trained large language model, the event parameter tuples of multiple events are extracted from multiple multi-source event texts by calculating the relative projection distance between the text semantic feature vector and the preset extreme value semantic benchmark vector.
[0039] In operation S230, for each event, a sequence of event impact coefficients is generated, which characterizes the impact of each event on the spread of infectious diseases, based on the event intensity, the event onset time, and the range of event impact coefficient values corresponding to the event type, using a time decay kernel function that matches the event type.
[0040] In operation S240, the comprehensive event impact coefficient sequence obtained by linear superposition and nonlinear truncation of multiple event impact coefficient sequences is used as the causal prior of the effective reproduction number fluctuation trend. By comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points, the causal logic evaluation result is generated.
[0041] In operation S250, a rationality assessment result is generated based on the causal logic assessment result, the smoothness assessment result characterizing the smoothness of the infectious disease transmission projection data sequence, and the physical compliance assessment result characterizing whether the infectious disease transmission projection data sequence meets the physical boundary constraints.
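Operations S230 and S240 can be illustrated with a minimal Python sketch. It is not the formula of this application: the exponential half-life kernel, the linear mapping from event intensity to a peak coefficient, and all names and parameters (`event_impact_sequence`, `combine_sequences`, `half_life`, `coeff_at_zero`, `coeff_at_full`, `lower`, `upper`) are assumptions chosen only to show the shape of the computation.

```python
import math
from typing import List


def event_impact_sequence(intensity: float, occurrence_day: int, lag_days: int,
                          horizon: int, half_life: float,
                          coeff_at_zero: float, coeff_at_full: float) -> List[float]:
    """Impact coefficients of one event over `horizon` days (hypothetical form)."""
    # Onset time = occurrence time + lag parameter of the event type.
    onset_day = occurrence_day + lag_days
    # Assumed linear mapping of intensity into the coefficient value range.
    peak = coeff_at_zero + intensity * (coeff_at_full - coeff_at_zero)
    sequence = []
    for day in range(horizon):
        if day < onset_day:
            sequence.append(0.0)  # no impact before onset
        else:
            # Assumed kernel: exponential decay with the given half-life.
            decay = math.exp(-math.log(2.0) * (day - onset_day) / half_life)
            sequence.append(peak * decay)
    return sequence


def combine_sequences(sequences: List[List[float]],
                      lower: float, upper: float) -> List[float]:
    """Linear superposition followed by truncation (clipping) to [lower, upper]."""
    summed = [sum(values) for values in zip(*sequences)]
    return [max(lower, min(upper, value)) for value in summed]


# Hypothetical events: a control measure (negative impact) and a gathering.
control = event_impact_sequence(1.0, occurrence_day=2, lag_days=1, horizon=6,
                                half_life=2.0, coeff_at_zero=0.0, coeff_at_full=-0.6)
gathering = event_impact_sequence(0.5, occurrence_day=0, lag_days=0, horizon=6,
                                  half_life=1.0, coeff_at_zero=0.0, coeff_at_full=0.4)
combined = combine_sequences([control, gathering], lower=-0.5, upper=0.5)
```

In this sketch the clipping step plays the role of the nonlinear truncation: when several events overlap, their summed coefficient cannot exceed the assumed bounds.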
[0042] The infectious disease projection data sequence includes multiple infectious disease transmission prediction data for a target area, arranged chronologically within a predetermined time window. The predetermined time window can be a prediction time window set when performing infectious disease trend projection.
[0043] In some embodiments, the infectious disease transmission projection data sequence can be produced from real-world scenario data by projection or prediction models such as statistical models, agent-based models, complex network models, and neural network models. The performance of the projection model can therefore be evaluated indirectly by evaluating the projection data sequence it produces.
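As a rough illustration of the smoothness and physical compliance evaluations named in operation S250, the following Python sketch scores a projection sequence independently of the model that produced it. The specific checks are assumptions, not the criteria of this application: compliance is taken as the fraction of values within `[0, population]`, and roughness as the mean absolute second difference. The function names and the `population` bound are hypothetical.

```python
from typing import Sequence


def physical_compliance(cases: Sequence[float], population: float) -> float:
    """Fraction of points satisfying the assumed bounds 0 <= value <= population."""
    if not cases:
        raise ValueError("projection sequence is empty")
    compliant = sum(1 for value in cases if 0.0 <= value <= population)
    return compliant / len(cases)


def roughness(cases: Sequence[float]) -> float:
    """Mean absolute second difference; larger values mean a less smooth sequence."""
    if len(cases) < 3:
        return 0.0  # too short to define a second difference
    second_diffs = [abs(cases[i + 1] - 2 * cases[i] + cases[i - 1])
                    for i in range(1, len(cases) - 1)]
    return sum(second_diffs) / len(second_diffs)


# A projection containing one non-physical negative value.
compliance_score = physical_compliance([10, 20, -5, 40], population=1000)
linear_roughness = roughness([1, 2, 3, 4])
spike_roughness = roughness([0, 10, 0])
```

Because both functions take only the sequence itself, they apply equally to output from a statistical model or a neural network, which matches the model-agnostic intent of the evaluation.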
[0044] Because methods for extrapolating the spread of infectious diseases typically rely solely on the statistical characteristics of historical data, neglecting the impact of real-world events on the spread of infectious diseases, they fail to capture the causal logic between real-world events and the spread trends of infectious diseases, resulting in insufficient rationality in the generated infectious disease extrapolation data sequences. Furthermore, evaluation methods for these data sequences often overly rely on measuring statistical errors, making it difficult to effectively assess their rationality. Consequently, the evaluation results fail to reflect the rationality of the extrapolation method, leading to low accuracy and reliability of the evaluation results.
[0045] Therefore, this application utilizes unstructured multi-source exogenous event information about the target region when evaluating infectious disease transmission projection data sequences. Based on the causal logic between the events and the spread of the infectious disease, it assesses the fluctuation trend of the infectious disease transmission projection data sequences, effectively evaluating the rationality of the projection data sequences and thus achieving an effective assessment of the rationality of the projection method. The multi-source event text typically includes factors and events that cause abrupt changes in the trend of infectious disease transmission.
[0046] When integrating multi-source unstructured exogenous events for propagation assessment, traditional extraction methods struggle to effectively capture deep semantic relationships between different sources due to the high heterogeneity and semantic diffusion of multi-source event texts. Therefore, a pre-trained large language model is employed to map the disordered raw text into a structured event parameter stream space with epidemiological priors, ensuring high physical consistency and semantic confidence of the input features for subsequent causal logic assessment.
[0047] Pre-trained large language models possess powerful natural language processing capabilities, enabling them to automatically identify and understand key information such as entities, relationships, and events in unstructured multi-source exogenous event information, and transform it into structured data formats. These formats include event parameter tuples containing the event's occurrence time, type, and intensity, where event intensity characterizes the event's impact on the spread of infectious diseases. This structured data not only facilitates subsequent processing and analysis but also effectively improves the accuracy and efficiency of assessments.
[0048] For example, the acquired collection of raw multi-source event texts, including news reports and social media data, can be defined as $D = \{d_1, d_2, \dots, d_n\}$, where $d_i$ denotes a multi-source event text and $n$ denotes the number of multi-source event texts in the set. By constructing a specific instruction stream, the pre-trained large language model is used as a high-dimensional semantic feature extraction operator, which induces the model to perform knowledge retrieval and reasoning within its latent space. Specifically, a nonlinear mapping function $\Phi_{\mathrm{LLM}}: D \to E$ is defined to transform the disordered text information $D$ into a structured event parameter stream space $E$, where $\Phi_{\mathrm{LLM}}$ denotes the large language model and $e_j = (t_j, c_j, s_j) \in E$ denotes the event parameter tuple of event $j$. Here $t_j$ is the time at which the event occurred, $c_j$ is the event type (imposition of non-pharmaceutical interventions, lifting of non-pharmaceutical interventions, virus mutation, population gathering, etc.), and $s_j$ is the normalized event intensity. Furthermore, to ensure the scientific rigor of the extraction process, a prompt engineering architecture based on Chain of Thought (CoT) can be adopted.
[0049] Under this mapping mechanism, the large language model deeply weights spatiotemporal entities and action semantics in the multi-source event texts through its self-attention mechanism, thereby identifying intervention signals with physical meaning. The extraction of event intensity can be formalized as a semantic confidence-weighted metric process based on probability distribution sampling. Let the semantic feature vector of a text after LLM encoding be $\mathbf{h}$. A semantic alignment operator based on cosine similarity computes the relative projective distance of $\mathbf{h}$ with respect to two preset semantic benchmark vectors: the extreme-value vector $\mathbf{v}_{\max}$ (e.g., the semantic centroid corresponding to the "fully intervened state") and the baseline vector $\mathbf{v}_{0}$ (e.g., the semantic centroid corresponding to the "normal state without intervention"). In mathematical terms, the calculation of the event intensity can be expressed as shown in the following formula (1):
[0050] (1);
[0051] where the normalization function ensures that the output value is strictly mapped to the closed interval [0, 1]. Through iterative application of this operator, the event parameter tuple sequence $\{e_1, e_2, \dots, e_m\}$ for the target region is finally obtained, where $m$ denotes the number of event parameter tuples in the sequence. Each tuple $e_j$ not only captures the objective physical properties of an exogenous event but also, through semantic consistency alignment, provides epidemiological priors for the subsequent construction of the causal-logic-driven time decay model.
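The semantic alignment step can be illustrated with a minimal sketch. The exact form of formula (1) is not reproduced in this text, so the normalization below is an assumption: the intensity is taken as the share of the cosine distance to the "no intervention" centroid in the total cosine distance to both centroids. The names `cosine_similarity`, `event_intensity`, `v_full`, and `v_none` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def event_intensity(h, v_full, v_none):
    """Map an encoded text vector h to an intensity in [0, 1].

    ASSUMED form (not the patent's formula (1)): relative cosine distance
    between the 'fully intervened' centroid v_full and the 'no intervention'
    centroid v_none.
    """
    # Clamp at zero to absorb tiny negative values from rounding.
    d_full = max(0.0, 1.0 - cosine_similarity(h, v_full))
    d_none = max(0.0, 1.0 - cosine_similarity(h, v_none))
    total = d_full + d_none
    if total == 0.0:
        raise ValueError("benchmark vectors must point in different directions")
    # Close to v_full gives a value near 1; close to v_none gives a value near 0.
    return d_none / total
```

Under this assumed form, a text whose embedding coincides with the "fully intervened" centroid receives intensity 1, and one that coincides with the "no intervention" centroid receives intensity 0.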
[0052] Given the significant heterogeneous decay characteristics of the time-varying impacts of different types of exogenous events on the dynamics of infectious disease transmission, embodiments of this application project discrete, heterogeneous events onto a continuous temporal evolution dimension using a time decay kernel function to generate a quantitative sequence characterizing the impact of external interventions on the dynamics of infectious disease transmission. For example, for release-type or suppression-type events, a first-order exponential decay function can be introduced as the time decay kernel function to capture the immediacy of the response and the marginal diminishing characteristics of subsequent effects; for short-term, large-scale aggregation and other impulsive events, a Gaussian distribution function can be introduced as the time decay kernel function to simulate the non-monotonic fluctuation effects generated by external disturbances. By matching specific types of time decay kernel functions, the dynamic evolution process of the event intensity response to the transmission rate can be characterized with high fidelity.
[0053] After determining a suitable time decay kernel function for each event, the event intensity and the event's effective time can be used as input parameters and substituted into the time decay kernel function for calculation. Combined with the event impact coefficient value range corresponding to the event type, a sequence of event impact coefficients arranged chronologically within a predetermined time window is obtained. Each event impact coefficient characterizes the degree of influence of that event on the spread of infectious diseases at the corresponding time point.
[0054] The effective time of an event is determined based on the event's occurrence time and a lag parameter corresponding to the event type. The lag parameter characterizes the time delay between the event's occurrence and its initial impact on the spread of the infectious disease. Different types of events may have different lag parameters; for example, some events may immediately impact the spread of infectious diseases, while some environmental change events may require a period of time to manifest their effects. Therefore, by setting appropriate lag parameters for different types of events, the effective time of the event can be determined more accurately.
[0055] For example, for an event parameter tuple $e_j = (t_j, c_j, s_j)$ comprising the occurrence time $t_j$, the event type $c_j$, and the event intensity $s_j$, a time-delay operator based on epidemiological priors is first introduced to define the effective time of the event, as shown in formula (2) below:
[0056] $t_j^{\mathrm{eff}} = t_j + \tau_{c_j}$ (2);
[0057] where $\tau_{c_j}$ is the lag parameter matched to the event type $c_j$, which models the physical time that elapses between the occurrence of an event and a real change in population contact patterns or virus transmission efficiency.
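The time-delay step can be sketched as follows. This is only an illustration: the lag values in `LAG_DAYS` and the event-type labels are hypothetical placeholders, not values given in this application, which states that such parameters are fitted from historical surveillance data.

```python
from datetime import date, timedelta

# Hypothetical lag parameters in days, one per event type (illustrative only).
LAG_DAYS = {"suppression": 3, "release": 2, "impulse": 5}


def effective_time(occurred_on, event_type, lag_days=LAG_DAYS):
    """Effective time = occurrence time + lag parameter of the event type."""
    if event_type not in lag_days:
        raise KeyError(f"no lag parameter configured for event type {event_type!r}")
    return occurred_on + timedelta(days=lag_days[event_type])
```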
[0058] The range of values for the event impact coefficient is determined based on the positive and negative impacts of different types of events on the spread of infectious diseases. For example, for release-type events, the event impact coefficient is usually set to a positive range, indicating that it may lead to an increase in the transmission rate; while for suppression-type events, the event impact coefficient is usually set to a negative range, indicating that it may lead to a decrease in the transmission rate.
[0059] To comprehensively evaluate the synergistic effect of multidimensional exogenous event factors on the epidemic trend, a weighted superposition mapping mechanism can be constructed. By performing linear superposition and nonlinear truncation on the event impact coefficient sequences of all events, a comprehensive event impact coefficient sequence can be obtained. The specific process is shown in the following formula (3):
[0060] $\alpha(t) = \mathrm{Clip}\left(\sum_{j=1}^{m} \alpha_j(t)\right)$ (3);
[0061] where $\alpha(t)$ denotes the comprehensive event impact coefficient sequence, which in effect constitutes a causal prior estimate of the fluctuation trend of the effective reproduction number of the infectious disease; $\mathrm{Clip}(\cdot)$ denotes the nonlinear truncation; and $\alpha_j(t)$ denotes the $j$-th event impact coefficient sequence.
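The superposition and truncation step can be sketched as below. The truncation bounds of -1 and 1 are an assumption made for illustration, since the bounds are not stated in this text, and `combine_impacts` is a hypothetical name.

```python
def combine_impacts(impact_sequences, lower=-1.0, upper=1.0):
    """Sum per-event impact sequences pointwise, then clip to [lower, upper].

    The bounds are ASSUMED defaults, not values from the application.
    """
    if not impact_sequences:
        raise ValueError("at least one event impact sequence is required")
    length = len(impact_sequences[0])
    if any(len(seq) != length for seq in impact_sequences):
        raise ValueError("all sequences must cover the same time window")
    combined = []
    for values_at_t in zip(*impact_sequences):
        total = sum(values_at_t)                        # linear superposition
        combined.append(min(upper, max(lower, total)))  # nonlinear truncation
    return combined
```

Clipping keeps several simultaneous events from producing a combined coefficient outside the range that a single coefficient is allowed to take.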
[0062] In some embodiments, the fluctuation trend of the comprehensive event impact coefficient sequence can be compared with the fluctuation trend of the infectious disease transmission projection data sequence. If the similarity between the two is high, the causal logic assessment result is determined to characterize the infectious disease transmission projection data sequence as satisfying causal logic.
[0063] After obtaining the causal logic assessment results, the reasonableness assessment results for the infectious disease transmission projection data sequence can be determined by combining other assessment results, such as smoothness assessment results and physical compliance assessment results. For example, if the smoothness assessment results indicate that the infectious disease transmission projection data sequence meets the smoothness requirements, the physical compliance assessment results indicate that the infectious disease transmission projection data sequence meets the physical compliance requirements, and the causal logic assessment results indicate that the infectious disease transmission projection data sequence meets the causal logic requirements, then the reasonableness assessment result indicates that the infectious disease transmission projection data sequence meets the reasonableness requirements. If any of these results are not met, then the reasonableness assessment result indicates that the infectious disease transmission projection data sequence does not meet the reasonableness requirements.
[0064] In some embodiments, smoothness can be measured by calculating the rate of change between adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence. If the rate of change fluctuates within a preset reasonable range, the infectious disease transmission projection data sequence is considered to meet the smoothness requirements. Physical boundary constraints can be set according to the biological characteristics of infectious disease transmission and the actual conditions of the target area, such as the upper limit of the number of infected persons and the threshold of transmission speed. When none of the infectious disease transmission prediction data in the infectious disease transmission projection data sequence exceeds these boundary values, it is determined that the physical boundary constraints are met.
[0065] According to embodiments of this application, a pre-trained large language model is used to semantically parse, encode, and extract unstructured multi-source exogenous event information for a target region, yielding the event occurrence time, event type, and event intensity of each event. Using a time decay kernel function matched to the event type and the event impact coefficient value range corresponding to the event type, an event impact coefficient sequence representing the event's impact on infectious disease transmission is determined based on the event intensity and occurrence time. A comprehensive event impact coefficient sequence is obtained by fusing and truncating multiple event impact coefficient sequences. By comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points, a causal logic assessment of the fluctuation trend of the projection data sequence is achieved, so that the assessment process incorporates multi-source exogenous event information influencing infectious disease transmission. In addition to the causal logic assessment, smoothness and physical compliance assessments are performed on the projection data sequence. This multi-dimensional assessment of the rationality of the projection data sequence indirectly evaluates the rationality of the projection method, improves the accuracy and credibility of the assessment results, and provides a reliable basis for evaluating infectious disease transmission projection methods.
[0066] According to embodiments of this application, the evaluation method further includes: when the event type characterizes the event as an inhibitory event that leads to obstruction of transmission or a release event that leads to accelerated transmission, using an exponential decay function as a time decay kernel function to simulate the gradual weakening of the event's impact on the spread of infectious diseases over time; when the event type characterizes the event as a pulse-type event, using a Gaussian distribution function as a time decay kernel function to simulate the fluctuation of the event's impact on the spread of infectious diseases, where the impact duration on the spread of infectious diseases is less than a predetermined impact duration threshold, and the predetermined impact duration threshold is less than the duration of a predetermined time window.
[0067] In the embodiments of this application, event types are classified into inhibition events, release events, and impulse events. Inhibition events can be understood as events that can slow down the spread of infectious diseases or limit the spread of infectious diseases. Since these events gradually show their inhibitory effect on the spread of infectious diseases over time after they occur, and this inhibitory effect usually weakens over time, it is suitable to use an exponential decay function to simulate the change of their effect over time.
[0068] Release events refer to events that accelerate the spread of infectious diseases. Like suppression events, these events affect the spread gradually over time after they occur, but their effect is one of acceleration, and this acceleration gradually weakens. Therefore, release events can also be simulated using an exponential decay function. However, the parameters of the exponential decay function for release events should differ from those for suppression events to reflect their accelerating effect on transmission.
[0069] For example, for suppression or release events, a first-order exponential decay model can be used to simulate the immediacy of the response of the transmission rate and the diminishing marginal effect that follows. The event impact coefficient sequence $\alpha_j(t)$ of a suppression or release event can be expressed as shown in the following formula (4):
[0070] $\alpha_j(t) = \pm\, s_j\, e^{-\lambda\,(t - t_j^{\mathrm{eff}})},\quad t \ge t_j^{\mathrm{eff}}$ (4);
[0071] where $t$ denotes time, $s_j$ is the event intensity, $t_j^{\mathrm{eff}}$ is the effective time of the event, and $\alpha_j(t) = 0$ when $t < t_j^{\mathrm{eff}}$; $\lambda$ is the attenuation coefficient characterizing the biological effect; and the sign is determined by the event impact coefficient value range corresponding to the event type.
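A first-order exponential decay kernel of this kind can be sketched as follows. The sketch assumes that the coefficient is zero before the effective time and that the sign is passed explicitly as -1 for suppression events and +1 for release events; the function name and parameter names are hypothetical.

```python
import math


def exponential_impact(times, t_eff, intensity, decay_rate, sign):
    """Event impact coefficients for a suppression (sign=-1) or release (sign=+1) event.

    ASSUMPTIONS: zero impact before t_eff; full intensity at t_eff;
    exponential weakening afterwards with a positive decay_rate.
    """
    if decay_rate <= 0:
        raise ValueError("decay_rate must be positive")
    if sign not in (-1, 1):
        raise ValueError("sign must be -1 (suppression) or +1 (release)")
    sequence = []
    for t in times:
        if t < t_eff:
            sequence.append(0.0)  # the event has not taken effect yet
        else:
            sequence.append(sign * intensity * math.exp(-decay_rate * (t - t_eff)))
    return sequence
```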
[0072] Impulse events refer to those events that have a brief but strong impact on the spread of infectious diseases. Because the duration of their impact is usually short, less than a predetermined threshold, and their influence initially rises and then falls, exhibiting a fluctuating pattern, a Gaussian distribution function is suitable for simulating the time-varying impact of impulse events on the spread of infectious diseases. The Gaussian distribution function effectively captures this fluctuating characteristic of rising and falling, thus providing a more accurate assessment of the impact of impulse events on the spread of infectious diseases. Examples of impulse events include large-scale concerts and sporting events.
[0073] For example, for impulse events, a Gaussian distribution function can be introduced as the time decay kernel function to simulate the fluctuation effect of external disturbances on the spread of infectious diseases, which first rises and then falls. The event impact coefficient sequence $\alpha_j(t)$ of an impulse event can be expressed as shown in the following formula (5):
[0074] $\alpha_j(t) = s_j \exp\left(-\dfrac{\left(t - t_j^{\mathrm{eff}} - \Delta\right)^2}{2\sigma^2}\right)$ (5);
[0075] where $s_j$ is the event intensity, $t_j^{\mathrm{eff}}$ is the effective time of the event, $\Delta$ denotes the time span from the moment the event takes effect until its impact peaks, and $\sigma$ denotes the Gaussian bandwidth. It should be noted that the time decay kernel function parameters (such as the attenuation coefficient $\lambda$, the Gaussian bandwidth $\sigma$, and the lag parameter $\tau$) are obtained by fitting historical epidemiological surveillance data using maximum likelihood estimation.
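The Gaussian kernel for impulse events can be sketched in the same style. Setting the coefficient to zero before the effective time is an assumption of this sketch, and the names `gaussian_impact`, `peak_delay`, and `bandwidth` are hypothetical labels for the time-to-peak span and the Gaussian bandwidth described above.

```python
import math


def gaussian_impact(times, t_eff, intensity, peak_delay, bandwidth):
    """Event impact coefficients for an impulse event (rise, peak, then fall).

    ASSUMPTION: impact is zero before t_eff; the peak occurs at
    t_eff + peak_delay with height equal to the event intensity.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    peak_time = t_eff + peak_delay
    sequence = []
    for t in times:
        if t < t_eff:
            sequence.append(0.0)
        else:
            exponent = -((t - peak_time) ** 2) / (2.0 * bandwidth ** 2)
            sequence.append(intensity * math.exp(exponent))
    return sequence
```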
[0076] According to embodiments of this disclosure, by determining appropriate time decay models for suppression events, release events, and impulse events respectively, the determined time decay models can more accurately reflect the spread trend of infectious diseases, improve the accuracy of the event influence coefficient sequence determined by the time decay models, and thus improve the accuracy and credibility of the rationality assessment results.
[0077] According to embodiments of this application, using a time decay kernel function matched to the event type to generate an event impact coefficient sequence characterizing the impact of each event on the spread of infectious diseases, based on the event intensity, the event effective time, and the event impact coefficient value range corresponding to the event type, includes: determining the event impact coefficient value range to be negative when the event type characterizes the event as a suppression event; determining the event impact coefficient value range to be positive when the event type characterizes the event as a release event; and, taking the event effective time as the starting point of the time decay kernel function on the time axis, calculating the event impact coefficient at each time point based on the event intensity and the event impact coefficient value range to obtain the event impact coefficient sequence.
[0078] Since suppressive events slow the spread of infectious diseases, their impact coefficients are set to negative values when determining their range. Conversely, release events accelerate the spread of infectious diseases, so their impact coefficients are set to positive values. In practice, the decay coefficient of the time decay kernel function corresponding to suppressive events can be set to negative values, while the decay coefficient of the time decay kernel function corresponding to release events can be set to positive values.
[0079] When calculating the event impact coefficients over time, the event's effective time can be used as the starting point of the time decay kernel function on the time axis. This means that from the effective time onward, the event's impact on the spread of infectious diseases changes according to the selected time decay kernel function. Then, based on the event intensity, the event impact coefficients at each time point are calculated, resulting in a sequence of event impact coefficients arranged chronologically within a predetermined time window.
[0080] According to embodiments of this disclosure, by determining the range of event impact coefficient values based on event type and using the event's effective time as the effective point of the time decay model, the starting time of the event's impact on the spread of infectious diseases and its subsequent changes over time can be simulated more accurately. This further improves the accuracy of the event impact coefficient sequence and enhances the effectiveness and credibility of the assessment results in evaluating the rationality of the infectious disease spread projection method.
[0081] Figure 3 shows a schematic diagram of determining the event impact coefficient sequence according to an embodiment of this application.
[0082] As shown in Figure 3, semantic parsing, encoding, and extraction are performed on the multi-source event text 301 to obtain the event occurrence time 302, event type 303, and event intensity 304. After determining the time decay kernel function 305 that matches the event type 303, the event effective time 306 is determined based on the event occurrence time 302 and the lag parameter corresponding to the event type 303. The event effective time 306 and the event intensity 304 are used as parameters of the time decay kernel function 305 to obtain the event impact coefficient sequence 307.
[0083] According to embodiments of this application, taking the comprehensive event impact coefficient sequence, obtained by linearly superimposing and fusing multiple event impact coefficient sequences and performing nonlinear truncation, as a causal prior for the fluctuation trend of the effective reproduction number, and generating a causal logic evaluation result by comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points, includes: when the comprehensive event impact coefficient sequence indicates the presence of a target suppression event, determining the impact time window of the target suppression event based on the comprehensive event impact coefficient sequence, the target suppression event being a suppression event with an event intensity greater than a predetermined event intensity threshold; when the mean of the first derivative of the infectious disease transmission projection data sequence within the impact time window is less than zero or the sequence shows a monotonically decreasing trend, determining that the causal logic evaluation result indicates that the infectious disease transmission projection data sequence satisfies causal logic; when the comprehensive event impact coefficient sequence indicates the presence of a target impulse event, determining the theoretical peak time based on the comprehensive event impact coefficient sequence, the target impulse event being an impulse event with an event intensity greater than a predetermined event intensity threshold; and when it is determined that the infectious disease transmission projection data sequence has a local maximum within the neighborhood time window of the theoretical peak time, determining that the causal logic evaluation result indicates that the infectious disease transmission projection data sequence satisfies causal logic, where at the local maximum the first derivative is zero and the second derivative is less than zero.
[0084] Before evaluating the fluctuation trend of the infectious disease transmission projection data sequence using the comprehensive event impact coefficient sequence, it is possible to identify whether there are inhibitory events in the sequence whose intensity exceeds a predetermined event intensity threshold. If so, the specific time window in which this event affects the spread of the infectious disease is further determined, i.e., the impact time window. Subsequently, the changes in the first derivative of the infectious disease transmission projection data sequence within this impact time window are analyzed. If the mean of the first derivative is less than zero or shows a monotonically decreasing trend, it indicates that the spread of the infectious disease is suppressed during this period, which is consistent with the impact of the target inhibitory event. At this point, the causal logic assessment result can be determined that the infectious disease transmission projection data sequence satisfies causal logic.
[0085] For example, causal rationality is determined by comparing the dynamic consistency, at key feature points, between the infectious disease transmission projection data sequence to be evaluated and the comprehensive event impact coefficient sequence. When the comprehensive event impact coefficient sequence exhibits a significant negative (suppressive) characteristic within a specific time window (the impact time window), that is, when its values are negative over that window, the first derivative of the infectious disease transmission projection data sequence within the impact time window is calculated. If the mean of this first derivative is significantly less than zero or the sequence shows a monotonically decreasing trend, the projection sequence is considered self-consistent in terms of suppression logic. Similarly, if the comprehensive event impact coefficient sequence exhibits a significant positive (promoting) characteristic within the impact time window, the mean of the first derivative should be significantly greater than zero or the sequence should show a monotonically increasing trend.
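The window-based trend check can be sketched as follows. The sketch approximates the first derivative by first differences and uses a plain comparison with zero, because the text does not specify how "significantly" is tested; `trend_consistent` and its parameters are hypothetical names.

```python
def first_differences(series):
    """Discrete approximation of the first derivative."""
    return [b - a for a, b in zip(series, series[1:])]


def trend_consistent(projection, window, expected_sign):
    """Check the projection trend inside an impact time window.

    window: (start, end) inclusive indices into the projection sequence.
    expected_sign: -1 for a suppressive window, +1 for a promoting window.
    ASSUMPTION: a plain sign test stands in for the unspecified significance test.
    """
    start, end = window
    diffs = first_differences(projection[start:end + 1])
    if not diffs:
        raise ValueError("the window must contain at least two points")
    mean_diff = sum(diffs) / len(diffs)
    if expected_sign < 0:
        # A strictly decreasing window implies a negative mean; both tests
        # are kept to mirror the two criteria named in the text.
        return mean_diff < 0 or all(d < 0 for d in diffs)
    return mean_diff > 0 or all(d > 0 for d in diffs)
```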
[0086] Before evaluating the fluctuation trend of the infectious disease transmission projection data sequence using the comprehensive event impact coefficient sequence, it is possible to identify whether there are impulsive events in the comprehensive event impact coefficient sequence whose intensity exceeds a predetermined event intensity threshold. If so, the theoretical peak time of the event, i.e., the time point when the event's impact reaches its maximum, is determined based on the comprehensive event impact coefficient sequence. Next, it is detected whether there are local maxima in the infectious disease transmission projection data sequence near the theoretical peak time. If so, it indicates that the transmission of the infectious disease is significantly affected by the impulsive event at that moment, and the transmission speed or range experiences a brief increase, consistent with theoretical expectations. Therefore, the causal logic assessment result can be confirmed that the infectious disease transmission projection data sequence satisfies causal logic.
[0087] For example, for the theoretical peak time induced by an impulse event, a local maximum can be searched for within a neighborhood of that time in the infectious disease transmission projection data sequence. If there exists a time point within the neighborhood at which the first derivative of the sequence is zero and the second derivative is less than zero, the infectious disease transmission projection data sequence is deemed to accurately capture the epidemiological fluctuation characteristics triggered by the exogenous disturbance, and a causal logic evaluation result is generated indicating that the infectious disease transmission projection data sequence satisfies causal logic.
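The peak search can be sketched for a discretely sampled sequence. Because the derivative of sampled data is rarely exactly zero, the sketch uses a discrete analogue as an assumption: the slope changes from positive to negative at the candidate point, which also makes the second difference negative. `has_local_peak` and `radius` are hypothetical names.

```python
def has_local_peak(projection, peak_index, radius):
    """Look for a local maximum within `radius` steps of the theoretical peak.

    ASSUMPTION: 'first derivative zero, second derivative negative' is
    approximated by a positive-to-negative change in the first difference.
    """
    lo = max(1, peak_index - radius)
    hi = min(len(projection) - 2, peak_index + radius)
    for i in range(lo, hi + 1):
        rising_in = projection[i] - projection[i - 1]
        falling_out = projection[i + 1] - projection[i]
        if rising_in > 0 and falling_out < 0:
            return True
    return False
```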
[0088] According to the embodiments of this application, by utilizing key information in the comprehensive event impact coefficient sequence, the fluctuation trend of the infectious disease transmission projection data sequence is accurately assessed. This not only considers the different impact mechanisms of different types of events on the spread of infectious diseases, but also combines the specific intensity and timing of the events, making the assessment results more comprehensive and accurate.
[0089] According to embodiments of this application, a rationality assessment result is generated based on causal logic assessment results, smoothness assessment results characterizing the smoothness of the infectious disease transmission projection data sequence, and physical compliance assessment results characterizing whether the infectious disease transmission projection data sequence meets physical boundary constraints. This includes: determining an infectious disease transmission change rate threshold and an infectious disease transmission data threshold within a predetermined time window based on the statistical distribution characteristics of historical infectious disease transmission data in the target area; assessing the change rate of every two adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence based on the infectious disease transmission change rate threshold to obtain a smoothness assessment result; assessing each infectious disease transmission prediction data in the infectious disease transmission projection data sequence based on the infectious disease transmission data threshold, and evaluating the degree of deviation of the infectious disease transmission projection data sequence from the historical epidemic curve manifold through a significance test to obtain a physical compliance assessment result; and performing a high-order nonlinear integration of the smoothness assessment result, physical compliance assessment result, and causal logic assessment result using a weighted entropy method to generate a rationality assessment result for the infectious disease transmission projection data sequence.
[0090] After completing the causal logic assessment based on exogenous events, statistical topological constraints based on historical prior distributions can be introduced to assess whether the infectious disease transmission projection data sequence meets compliance requirements at the levels of dynamic smoothness and physical boundaries.
[0091] When assessing smoothness, a threshold for the rate of change in infectious disease transmission is used to evaluate the rate of change between every two adjacent predicted data points in the infectious disease transmission projection data sequence. If the rate of change is within the threshold range, it indicates that the data sequence is relatively smooth; otherwise, abnormal fluctuations may exist.
[0092] When conducting the physical compliance assessment, the infectious disease transmission data threshold is used to evaluate each infectious disease transmission prediction data in the infectious disease transmission projection data sequence. If a prediction falls within the threshold range, it conforms to physical reality; otherwise, it may be unreasonable. Furthermore, a significance test is used to assess the degree of deviation of the projection data sequence from the historical epidemic curve manifold. If the deviation is small, the data conforms to physical reality; otherwise, it may be unreasonable.
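The threshold part of the physical compliance assessment can be sketched as a simple bounds check. The significance test against the historical epidemic curve manifold is not covered here, since the text does not specify which test is used. The lower bound of zero and the name `within_physical_bounds` are assumptions.

```python
def within_physical_bounds(projection, upper, lower=0.0):
    """Return (is_compliant, violating_indices) for a projection sequence.

    ASSUMPTION: counts cannot be negative, so the default lower bound is 0;
    `upper` would come from historical data for the target area.
    """
    violations = [i for i, value in enumerate(projection)
                  if not (lower <= value <= upper)]
    return (not violations, violations)
```

Returning the violating indices alongside the pass/fail flag makes it possible to report which time points breach the boundary.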
[0093] When determining the rationality assessment result, a high-order nonlinear integration is performed on the aforementioned causal logic assessment result, smoothness assessment result, and physical compliance assessment result. To reflect the heterogeneity of the contribution of each assessment result to the rationality assessment, a weighted entropy method is used to generate the final rationality assessment result. In mathematical terms, this evaluation framework can be defined as the joint projection of the infectious disease transmission projection data sequence into semantic space, statistical space, and physical space. The calculation of the rationality assessment result is shown in the following formula (6):
[0094] (6);
[0095] where the three-argument operator (·,·,·) denotes the high-order nonlinear integration, and the remaining terms denote, for each index i, the i-th evaluation result, the weight of the i-th evaluation result, and the sensitivity scaling factor used to adjust the i-th evaluation result. Through the above calculation of the rationality evaluation result, the embodiments of this application can effectively identify projection results that, although they satisfy statistical fitting metrics, are defective in their dynamic mechanism, response to the external environment, or physical boundary consistency, thereby providing the decision-making system with a high-confidence basis for risk assessment.
[0096] According to embodiments of this disclosure, by comprehensively considering the smoothness assessment results, physical compliance assessment results, and previously obtained causal logic assessment results, a rationality assessment result for the infectious disease transmission projection data sequence is generated. This rationality assessment process also incorporates the statistical characteristics of historical infectious disease transmission data, thereby enabling a more comprehensive and accurate assessment of the rationality of the infectious disease transmission projection data sequence.
[0097] Figure 4 shows a schematic diagram of determining the rationality assessment result according to an embodiment of this application.
[0098] As shown in Figure 4, a causal logic evaluation of the infectious disease transmission projection data sequence 401 is performed using the comprehensive event impact coefficient sequence, resulting in a causal logic evaluation result 402. A smoothness evaluation of the infectious disease transmission projection data sequence 401 is performed using the infectious disease transmission change rate threshold, resulting in a smoothness evaluation result 403. A physical compliance evaluation of the infectious disease transmission projection data sequence 401 is performed using the infectious disease transmission data threshold, resulting in a physical compliance evaluation result 404. Finally, a high-order nonlinear integration of the causal logic evaluation result 402, the smoothness evaluation result 403, and the physical compliance evaluation result 404 is performed using the weighted entropy method, resulting in a rationality evaluation result 405.
[0099] According to an embodiment of this application, determining the infectious disease transmission change rate threshold within a predetermined time window based on the statistical distribution characteristics of historical infectious disease transmission data in the target area includes: performing a first-order difference calculation on multiple historical infectious disease transmission data arranged in chronological order to obtain a historical rate of change sequence; determining the mean and standard deviation of the historical rate of change sequence; and determining the infectious disease transmission change rate threshold based on the mean and a preset multiple of the standard deviation, so as to constrain the degree of abrupt change between every two adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence.
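The threshold derivation in this embodiment can be sketched as follows. The multiple `k` and the use of the population standard deviation are assumptions of the sketch, as the text leaves both as preset choices; `change_rate_thresholds` and `is_smooth` are hypothetical names.

```python
import statistics


def change_rate_thresholds(history, k=3.0):
    """Lower and upper change-rate thresholds as mean -/+ k standard deviations.

    history: chronologically ordered historical transmission data.
    ASSUMPTIONS: k is a preset multiple; population standard deviation is used.
    """
    rates = [b - a for a, b in zip(history, history[1:])]  # first-order differences
    if len(rates) < 2:
        raise ValueError("at least three historical observations are required")
    mean = statistics.fmean(rates)
    std = statistics.pstdev(rates)
    return mean - k * std, mean + k * std


def is_smooth(projection, lower, upper):
    """True if every adjacent change in the projection stays within the thresholds."""
    return all(lower <= b - a <= upper
               for a, b in zip(projection, projection[1:]))
```

A larger `k` tolerates sharper day-to-day changes, while a smaller `k` flags more projections as abnormally volatile.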
[0100] In the embodiments of this application, the historical infectious disease transmission data can first be arranged chronologically to obtain a historical infectious disease transmission data sequence. A first-order difference calculation can then be performed on this sequence to obtain a historical rate of change sequence. The first-order difference reflects the change in the historical infectious disease transmission data between adjacent time points and characterizes the trend of the infectious disease transmission speed. That is, for the historical infectious disease transmission data sequence, the difference between every two adjacent data points is calculated, and the resulting differences form the historical rate of change sequence.
[0101] After the historical rate of change sequence is obtained, its mean and standard deviation are calculated. To quantify abnormal deviations in a non-stationary process, the k-sigma principle can be used to determine the infectious disease transmission change rate threshold: the sum of the mean and a preset multiple of the standard deviation is used as the upper limit threshold, and the difference between the mean and the preset multiple of the standard deviation is used as the lower limit threshold. The mean represents the average level of the historical infectious disease transmission speed, while the standard deviation reflects the degree of fluctuation of that speed. The threshold can therefore constrain the degree of abrupt change between every two adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence, providing statistical support for the Lipschitz continuity between every two adjacent prediction data and allowing a smoothness assessment result to be generated. The infectious disease transmission change rate threshold can be determined by the following formula (7):
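[0102] $\theta_{\Delta} = \mu_{\Delta} \pm k\,\sigma_{\Delta}$ (7);
[0103] where $\theta_{\Delta}$ is the infectious disease transmission change rate threshold (the plus sign gives the upper limit and the minus sign the lower limit), $\mu_{\Delta}$ is the mean of the historical rate of change sequence, $k$ is the preset multiple, calibrated based on the peak-weighted characteristics of historical infectious disease transmission data fluctuations in the target area, and $\sigma_{\Delta}$ is the standard deviation of the historical rate of change sequence.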
[0104] According to an embodiment of this application, by using a historical rate of change sequence obtained by first-order difference calculation of multiple historical infectious disease transmission data arranged in chronological order, the threshold of the infectious disease transmission rate of change is determined. This takes into account both the average level of historical infectious disease transmission data and the degree of fluctuation of historical infectious disease transmission data, which can more accurately reflect the range of changes in the speed of infectious disease transmission, improve the accuracy of smoothness assessment, and thus improve the accuracy of rationality assessment results.
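The threshold derivation and the resulting smoothness check can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the default multiple k = 3, the use of the population standard deviation, and the sample data are all hypothetical and are not taken from the embodiment.

```python
import statistics

def change_rate_bounds(history, k=3.0):
    """Return (lower, upper) k-sigma bounds on the first-order difference."""
    if len(history) < 3:
        raise ValueError("need at least three historical points")
    # First-order difference: change between adjacent time points.
    rates = [b - a for a, b in zip(history, history[1:])]
    mean = statistics.fmean(rates)
    std = statistics.pstdev(rates)  # population std; a sample std is an equally valid choice
    return mean - k * std, mean + k * std

def smoothness_violations(projection, bounds):
    """Indices i where the step from projection[i] to projection[i + 1] leaves the bounds."""
    lower, upper = bounds
    return [i for i, (a, b) in enumerate(zip(projection, projection[1:]))
            if not lower <= b - a <= upper]

history = [100, 110, 125, 135, 150, 160, 172, 180]  # hypothetical daily case counts
bounds = change_rate_bounds(history, k=3.0)
projection = [180, 190, 203, 400, 410]              # contains one abrupt jump
violations = smoothness_violations(projection, bounds)
```

An empty `violations` list would correspond to a projection data sequence that passes the smoothness check; how violations are turned into a smoothness assessment result is left open here.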
[0105] According to an embodiment of this application, determining the infectious disease transmission data threshold within the predetermined time window, based on the statistical distribution characteristics of historical infectious disease transmission data in the target area, includes: determining historical infectious disease transmission peak data from multiple historical infectious disease transmission data; and determining the minimum of the historical infectious disease transmission peak data and the total population of the target area as the infectious disease transmission data threshold.
[0106] In determining the infectious disease transmission data threshold, the first step is to select the historical infectious disease transmission peak data from the multiple collected historical infectious disease transmission data. The historical infectious disease transmission peak data characterizes the maximum value of infectious disease transmission in the target area over a past period.
[0107] In some embodiments, the historical infectious disease transmission peak data can be used directly as the infectious disease transmission data threshold. However, to further improve the accuracy of the physical compliance assessment, this application also introduces the total population of the target area when determining the threshold. Specifically, the infectious disease transmission data threshold is defined based on the total population of the target area and the historical infectious disease transmission peak data, as shown in the following formula (8):
[0108] (8);
[0109] where the proportional coefficient characterizes the herd immunity threshold, and the generational coefficient characterizes the enhanced properties of the strain.
[0110] By performing an inclusion test against the infectious disease transmission data threshold on each infectious disease transmission prediction data in the infectious disease transmission projection data sequence, non-physical hallucinations that exceed the carrying capacity of the area or violate basic counting logic can be identified and eliminated, thereby obtaining the physical compliance assessment result.
[0111] According to the embodiments of this application, the threshold for infectious disease transmission data is determined by combining historical peak data of infectious disease transmission with the total population of the target area. This not only considers the historical maximum value of infectious disease transmission but also takes into account the population size of the target area, making the assessment results closer to the actual situation. As a result, the accuracy and reliability of physical compliance assessment are significantly improved, thereby enhancing the accuracy of rationality assessment.
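The inclusion test can be illustrated with the short sketch below. It assumes, for illustration only, that the generational coefficient scales the historical peak and that the proportional coefficient scales the total population; both default to 1.0, which reduces the threshold to the plain minimum of peak and population described above. The function names and sample values are hypothetical.

```python
def transmission_data_threshold(history, total_population,
                                immunity_ratio=1.0, strain_factor=1.0):
    """Upper bound on plausible values: min of scaled historical peak and scaled population.

    How the two coefficients enter the bound is an assumption of this sketch.
    """
    peak = max(history)
    return min(strain_factor * peak, immunity_ratio * total_population)

def physical_violations(projection, threshold):
    """Indices of predictions that are negative or exceed the threshold."""
    return [i for i, value in enumerate(projection)
            if value < 0 or value > threshold]

history = [1200, 3400, 5000, 2100]       # hypothetical historical counts
projection = [1200, 4800, 5200, -30]     # hypothetical projection to be checked

threshold = transmission_data_threshold(history, total_population=1_000_000)
violations = physical_violations(projection, threshold)

# A smaller population caps the threshold below the historical peak.
small_area_threshold = transmission_data_threshold(history, total_population=4000)
```

Negative values are flagged here as one simple reading of "violating basic counting logic"; other counting constraints could be added in the same way.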
[0112] According to embodiments of this application, the evaluation method further includes: determining, by using a significance test operator based on Mahalanobis distance or dynamic time warping distance, the degree of deviation of the infectious disease transmission projection data sequence from the historical epidemic curve manifold in the probability space, to obtain a deviation statistical score; and, if the deviation statistical score is greater than an anomaly judgment threshold set based on a confidence level, determining that the infectious disease transmission prediction data corresponding to the deviation statistical score is abnormal data, and determining that the physical compliance assessment result characterizes the infectious disease transmission projection data sequence as non-compliant.
[0113] To further characterize the statistical deviation of the projection data from historical evolution patterns, embodiments of this application introduce a significance test operator based on Mahalanobis distance or dynamic time warping (DTW) distance. The operator measures the degree to which the infectious disease transmission projection data sequence $X$ deviates from the historical epidemic curve manifold in the probability space, yielding a deviation statistical score. Let the mean of the distribution of the historical epidemic curve manifold be $\mu$ and its covariance matrix be $\Sigma$. The anomaly detection function can then be expressed as shown in the following formula (9):
[0114] $D(X) = (X-\mu)^{T}\,\Sigma^{-1}\,(X-\mu) > \chi^{2}_{\alpha}(d)$ (9);
[0115] where $T$ denotes the transpose operation; $d$ is the number of degrees of freedom, usually equal to the dimension of $X$; $\alpha$ is the significance level; and $\chi^{2}_{\alpha}(d)$ is the upper quantile of the chi-square distribution with $d$ degrees of freedom. When the deviation statistical score exceeds the anomaly judgment threshold set based on the confidence level, the infectious disease transmission prediction data is marked as abnormal data.
[0116] According to embodiments of this application, by introducing statistical significance testing, abnormal data in the infectious disease transmission projection data sequence can be identified, making the assessment process more comprehensive and reliable, thereby improving the accuracy and reliability of the assessment results.
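A Mahalanobis-distance significance test of this kind might be sketched as follows. To stay dependency-free, the sketch is restricted to two-dimensional feature vectors, for which the upper quantile of the chi-square distribution with two degrees of freedom has the closed form -2 ln(alpha). The choice of features (peak value and peak day), the function names, and the sample data are assumptions for illustration only.

```python
import math

def mean_and_covariance(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance using the explicit 2x2 matrix inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def is_anomalous(x, mean, cov, alpha=0.05):
    """Compare the score with the upper-alpha chi-square quantile for 2 degrees of freedom."""
    threshold = -2.0 * math.log(alpha)
    return mahalanobis_sq(x, mean, cov) > threshold

# Hypothetical features of historical epidemic curves: (peak value, peak day)
historical = [(1000, 30), (1200, 28), (900, 33), (1100, 31), (1050, 29)]
mean, cov = mean_and_covariance(historical)

typical = is_anomalous((1080, 30), mean, cov)
outlier = is_anomalous((3000, 10), mean, cov)
```

For higher-dimensional features the same test applies with a general matrix inverse and a chi-square quantile for the matching number of degrees of freedom.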
[0117] Figure 5 shows a block diagram of an evaluation device for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application.
[0118] As shown in Figure 5, the evaluation device 500 for infectious disease transmission trend projection data based on large model assistance includes a data acquisition module 510, a parsing and extraction module 520, an impact determination module 530, an impact assessment module 540, and a result generation module 550.
[0119] The data acquisition module 510 is used to acquire the infectious disease transmission projection data sequence to be evaluated. The infectious disease transmission projection data sequence includes multiple infectious disease transmission prediction data arranged in chronological order for a target area within a predetermined time window. In some embodiments, the data acquisition module 510 can be used to perform the operation S210 described above, which will not be repeated here.
[0120] The parsing and extraction module 520 is used to extract event parameter tuples for multiple events from multiple multi-source event texts based on the text semantic feature vectors obtained by semantic parsing and encoding the multi-source event texts of the target region using a pre-trained large language model. These tuples include the event occurrence time, event type, and event intensity representing the intensity of the event's impact on the spread of infectious diseases. In some embodiments, the parsing and extraction module 520 can be used to perform the operation S220 described above, which will not be repeated here.
[0121] The impact determination module 530 is used to generate, for each event, an event impact coefficient sequence characterizing the impact of the event on the spread of infectious diseases, using a time decay kernel function matched to the event type and based on the event intensity, the event onset time, and the value range of the event impact coefficient corresponding to the event type. The event onset time is determined based on the event occurrence time and the lag parameter corresponding to the event type. In some embodiments, the impact determination module 530 can be used to perform the operation S230 described above, which will not be repeated here.
[0122] The impact assessment module 540 is used to take the comprehensive event impact coefficient sequence, obtained by linearly superimposing and fusing multiple event impact coefficient sequences and then nonlinearly truncating the result, as a causal prior for the fluctuation trend of the effective reproduction number, and to generate a causal logic assessment result by comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points. In some embodiments, the impact assessment module 540 can be used to perform the operation S240 described above, which will not be repeated here.
[0123] The result generation module 550 is used to generate a reasonableness assessment result based on the causal logic assessment result, the smoothness assessment result characterizing the smoothness of the infectious disease transmission projection data sequence, and the physical compliance assessment result characterizing whether the infectious disease transmission projection data sequence meets physical boundary constraints. In some embodiments, the result generation module 550 can be used to perform the operation S250 described above, which will not be repeated here.
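The fusion that produces the comprehensive event impact coefficient sequence used by the impact assessment module 540, linear superposition followed by nonlinear truncation, can be sketched as below. Clamping to the interval [-1, 1] is an assumed form of truncation; the function name, the bounds, and the sample sequences are hypothetical.

```python
def fuse_event_sequences(sequences, lower=-1.0, upper=1.0):
    """Sum per-event coefficient sequences pointwise, then clamp each value."""
    if not sequences:
        raise ValueError("at least one event sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all event sequences must cover the same time window")
    summed = [sum(values) for values in zip(*sequences)]   # linear superposition
    return [max(lower, min(upper, v)) for v in summed]     # nonlinear truncation

# Hypothetical per-event impact coefficient sequences over four time steps
lockdown = [0.0, -0.6, -0.5, -0.4]
school_closure = [0.0, -0.7, -0.6, -0.2]
holiday_travel = [0.3, 0.2, 0.0, 0.0]

comprehensive = fuse_event_sequences([lockdown, school_closure, holiday_travel])
```

The truncation keeps overlapping events from producing a combined coefficient outside the range that a single event could plausibly reach.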
[0124] According to an embodiment of this application, the evaluation device 500 based on large model-assisted infectious disease transmission trend projection data further includes a first determining module and a second determining module.
[0125] The first determining module is used to determine, when the event type characterizes the event as either a suppression event that hinders transmission or a release event that accelerates transmission, a time decay kernel function that simulates the gradual weakening of the event's impact on the spread of infectious diseases over time.
[0126] The second determining module is used to take a Gaussian distribution function as the time decay kernel function when the event type characterizes the event as a pulse-type event, so as to simulate the impact of the event on the spread of infectious diseases as a fluctuation that first rises and then falls. A pulse-type event is an event whose duration of impact on the spread of infectious diseases is less than a predetermined impact duration threshold, and the predetermined impact duration threshold is less than the duration of the predetermined time window.
[0127] According to an embodiment of this application, the impact determination module 530 includes a first determination submodule, a second determination submodule, and an impact determination submodule.
[0128] The first determination submodule is used to determine that the value range of the event impact coefficient is negative when the event type characterizes the event as a suppression event.
[0129] The second determination submodule is used to determine that the value range of the event impact coefficient is positive when the event type characterizes the event as a release event.
[0130] The impact determination submodule is used to take the event onset time as the point on the time axis at which the time decay kernel function takes effect, and to calculate the event impact coefficient at each time point based on the event intensity and the value range of the event impact coefficient, so as to obtain the event impact coefficient sequence.
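The coefficient generation described by these submodules can be sketched as follows. The Gaussian kernel for pulse-type events follows the text; the exponential kernel for suppression and release events, the kernel parameters, the positive sign assigned to pulse-type events, and all names are assumptions made for illustration.

```python
import math

def exponential_kernel(dt, rate=0.2):
    """Assumed kernel for suppression/release events: impact fades monotonically."""
    return math.exp(-rate * dt)

def gaussian_kernel(dt, peak_offset=3.0, width=1.5):
    """Gaussian kernel for pulse-type events: impact rises, peaks, then falls."""
    return math.exp(-((dt - peak_offset) ** 2) / (2.0 * width ** 2))

# Sign of the coefficient value range per event type (the pulse sign is an assumption).
SIGN = {"suppression": -1.0, "release": 1.0, "pulse": 1.0}

def event_impact_sequence(event_type, intensity, occurrence_time, lag, horizon):
    """Impact coefficient at each time step 0..horizon-1 for a single event."""
    onset = occurrence_time + lag  # onset time = occurrence time + type-specific lag
    kernel = gaussian_kernel if event_type == "pulse" else exponential_kernel
    sequence = []
    for t in range(horizon):
        if t < onset:
            sequence.append(0.0)  # no impact before the event takes effect
        else:
            sequence.append(SIGN[event_type] * intensity * kernel(t - onset))
    return sequence

lockdown = event_impact_sequence("suppression", 0.8, occurrence_time=2, lag=3, horizon=10)
festival = event_impact_sequence("pulse", 0.5, occurrence_time=0, lag=0, horizon=10)
```

Here `lockdown` stays at zero until step 5 and then decays from -0.8 toward zero, while `festival` peaks three steps after its onset.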
[0131] According to an embodiment of this application, the impact assessment module 540 includes an impact window determination submodule, a first logical assessment submodule, an impact peak determination submodule, and a second logical assessment submodule.
[0132] The impact window determination submodule is used to determine the impact time window of a target suppression event based on the comprehensive event impact coefficient sequence, when the comprehensive event impact coefficient sequence indicates the presence of a target suppression event. A target suppression event is a suppression event whose event intensity is greater than a predetermined event intensity threshold.
[0133] The first logic evaluation submodule is used to determine, when the mean of the first derivative of the infectious disease transmission projection data sequence within the impact time window is less than zero or the sequence shows a monotonically decreasing trend within that window, that the causal logic assessment result characterizes the infectious disease transmission projection data sequence as satisfying causal logic.
[0134] The impact peak determination submodule is used to determine the theoretical peak time based on the comprehensive event impact coefficient sequence, when the comprehensive event impact coefficient sequence characterizes the presence of a target pulse-type event. The target pulse-type event is a pulse-type event with an event intensity greater than a predetermined event intensity threshold.
[0135] The second logic evaluation submodule is used to determine, when the infectious disease transmission projection data sequence has a local maximum within the neighborhood time window of the theoretical peak time, that the causal logic assessment result characterizes the infectious disease transmission projection data sequence as satisfying causal logic. At the local maximum, the first derivative is zero and the second derivative is less than zero.
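The two consistency checks can be sketched on a discrete sequence as follows. Derivatives are approximated by first-order differences, and a local maximum is detected as a point higher than both neighbors, which is the discrete counterpart of a zero first derivative with a negative second derivative. The function names, the neighborhood radius, and the sample data are assumptions.

```python
def first_differences(sequence):
    """Discrete approximation of the first derivative."""
    return [b - a for a, b in zip(sequence, sequence[1:])]

def suppression_consistent(projection, window_start, window_end):
    """True if the mean first difference inside the impact time window is negative.

    A monotonically decreasing window also satisfies this condition.
    """
    diffs = first_differences(projection[window_start:window_end + 1])
    return bool(diffs) and sum(diffs) / len(diffs) < 0

def pulse_consistent(projection, theoretical_peak, radius=2):
    """True if a local maximum lies within `radius` steps of the theoretical peak time."""
    lo = max(1, theoretical_peak - radius)
    hi = min(len(projection) - 2, theoretical_peak + radius)
    return any(projection[t - 1] < projection[t] > projection[t + 1]
               for t in range(lo, hi + 1))

projection = [100, 120, 150, 140, 120, 110, 105]  # hypothetical projection data sequence

falls_after_lockdown = suppression_consistent(projection, window_start=2, window_end=6)
rises_before_lockdown = suppression_consistent(projection, window_start=0, window_end=2)
peak_near_expected = pulse_consistent(projection, theoretical_peak=3, radius=1)
peak_far_from_expected = pulse_consistent(projection, theoretical_peak=5, radius=0)
```

A sequence that keeps rising inside a strong suppression window, or that shows no peak near a strong pulse-type event, would fail the corresponding check.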
[0136] According to an embodiment of this application, the result generation module 550 includes a historical transmission determination submodule, a smoothness evaluation submodule, a physical compliance assessment submodule, and a result generation submodule.
[0137] The historical transmission determination submodule is used to determine the threshold for the rate of change of infectious disease transmission and the threshold for infectious disease transmission data within a predetermined time window based on the statistical distribution characteristics of historical infectious disease transmission data in the target area.
[0138] The smoothness evaluation submodule is used to evaluate the rate of change between every two adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence based on the infectious disease transmission change rate threshold, to obtain the smoothness assessment result.
[0139] The physical compliance assessment submodule is used to evaluate the infectious disease transmission prediction data in the infectious disease transmission projection data sequence based on infectious disease transmission data thresholds, and to assess the degree of deviation of the infectious disease transmission projection data sequence from the historical epidemic curve manifold through significance testing, thereby obtaining the physical compliance assessment results.
[0140] The result generation submodule is used to perform high-order nonlinear integration of the smoothness assessment result, the physical compliance assessment result, and the causal logic assessment result using the weighted entropy method, to generate the reasonableness assessment result for the infectious disease transmission projection data sequence.
[0141] According to an embodiment of this application, the historical transmission determination submodule includes a sequence generation unit, a historical statistics unit, and a rate of change determination unit.
[0142] The sequence generation unit is used to perform first-order difference calculations on multiple historical infectious disease transmission data arranged in chronological order to obtain a historical rate of change sequence.
[0143] The historical statistics unit is used to determine the mean and standard deviation of the historical rate of change sequence.
[0144] The rate of change determination unit is used to determine the infectious disease transmission change rate threshold based on the mean and a preset multiple of the standard deviation, so as to constrain the degree of abrupt change between every two adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence.
[0145] According to an embodiment of this application, the historical transmission determination submodule includes a peak determination unit and a threshold determination unit.
[0146] The peak determination unit is used to determine the peak data of historical infectious disease transmission from multiple historical infectious disease transmission data.
[0147] The threshold determination unit is used to determine the minimum value among the historical peak data of infectious disease transmission and the total population of the target area as the threshold for infectious disease transmission data.
[0148] According to embodiments of this application, the physical compliance assessment submodule includes a deviation determination unit and an anomaly determination unit.
[0149] The deviation determination unit is used to determine, by using a significance test operator based on Mahalanobis distance or dynamic time warping distance, the degree of deviation of the infectious disease transmission projection data sequence from the historical epidemic curve manifold in the probability space, to obtain the deviation statistical score.
[0150] The anomaly determination unit is used to determine that the infectious disease transmission prediction data corresponding to the deviation degree statistical score is abnormal data when the deviation degree statistical score is greater than the anomaly judgment threshold set based on confidence level, and to determine that the physical compliance assessment result characterizes the non-compliance of the infectious disease transmission projection data sequence.
[0151] Any one or more of the modules, submodules, units, and subunits according to the embodiments of this application, or at least part of the functions of any one or more of them, can be implemented in one module. Any one or more of the modules, submodules, units, and subunits according to the embodiments of this application can be implemented by dividing them into multiple modules. Any one or more of the modules, submodules, units, and subunits according to the embodiments of this application can be at least partially implemented as hardware circuits, such as field-programmable gate arrays (FPGAs), programmable logic arrays (PLAs), systems-on-a-chip, systems-on-a-substrate, systems-on-package, application-specific integrated circuits (ASICs), or implemented by hardware or firmware in any other reasonable manner by integrating or packaging circuits, or implemented in any one of software, hardware, and firmware, or in a suitable combination of any of these. Alternatively, one or more of the modules, submodules, units, and subunits according to the embodiments of this application can be at least partially implemented as computer program modules, which, when run, can perform corresponding functions.
[0152] For example, any number of the data acquisition module 510, the parsing and extraction module 520, the impact determination module 530, the impact assessment module 540, and the result generation module 550 can be combined into one module/unit/subunit, or any one of these modules/units/subunits can be split into multiple modules/units/subunits. Alternatively, at least part of the functionality of one or more of these modules/units/subunits can be combined with at least part of the functionality of other modules/units/subunits and implemented in one module/unit/subunit. According to embodiments of this application, at least one of the data acquisition module 510, the parsing and extraction module 520, the impact determination module 530, the impact assessment module 540, and the result generation module 550 can be at least partially implemented as hardware circuitry, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system-on-a-chip, a system-on-a-substrate, a system-on-package, or an application-specific integrated circuit (ASIC), or by hardware or firmware in any other reasonable manner of integrating or packaging circuitry, or in any one of software, hardware, and firmware, or in any suitable combination of these three. Alternatively, at least one of the data acquisition module 510, the parsing and extraction module 520, the impact determination module 530, the impact assessment module 540, and the result generation module 550 may be implemented at least partially as a computer program module, which performs the corresponding functions when run.
[0153] It should be noted that the evaluation device for infectious disease transmission trend projection data based on large model assistance in the embodiments of this application corresponds to the evaluation method for infectious disease transmission trend projection data based on large model assistance in the embodiments of this application. For details of the evaluation device, refer to the description of the evaluation method, which will not be repeated here.
[0154] Figure 6 shows a block diagram of an electronic device suitable for implementing the evaluation method for infectious disease transmission trend projection data based on large model assistance, according to an embodiment of this application. The electronic device shown in Figure 6 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of this application.
[0155] As shown in Figure 6, an electronic device 600 according to an embodiment of this application includes a processor 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The processor 601 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)). The processor 601 may also include onboard memory for caching purposes. The processor 601 may include a single processing unit or multiple processing units for performing different actions of the method flow according to an embodiment of this application.
[0156] RAM 603 stores various programs and data required for the operation of electronic device 600. Processor 601, ROM 602, and RAM 603 are interconnected via bus 604. Processor 601 executes various operations of the method flow according to embodiments of this application by executing programs in ROM 602 and / or RAM 603. It should be noted that programs may also be stored in one or more memories other than ROM 602 and RAM 603. Processor 601 may also execute various operations of the method flow according to embodiments of this application by executing programs stored in one or more memories.
[0157] According to embodiments of this application, the electronic device 600 may further include an input / output (I / O) interface 605, which is also connected to a bus 604. The electronic device 600 may also include one or more of the following components connected to the input / output (I / O) interface 605: an input section 606 including a keyboard, mouse, etc.; an output section 607 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN card, modem, etc. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the input / output (I / O) interface 605 as needed. A removable medium 611, such as a disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on the drive 610 as needed so that computer programs read from it can be installed into the storage section 608 as needed.
[0158] According to embodiments of this application, the method flow according to embodiments of this application can be implemented as a computer software program. For example, embodiments of this application include a computer program product comprising a computer program carried on a computer-readable storage medium, the computer program containing program code for performing the methods shown in the flowchart. In such embodiments, the computer program can be downloaded and installed from a network via communication section 609, and / or installed from removable medium 611. When the computer program is executed by processor 601, it performs the functions defined in the system of embodiments of this application. According to embodiments of this application, the systems, devices, apparatuses, modules, units, etc., described above can be implemented by computer program modules.
[0159] This application also provides a computer-readable storage medium, which may be included in the device / apparatus / system described in the above embodiments; or it may exist independently and not assembled into the device / apparatus / system. The computer-readable storage medium carries one or more programs, which, when executed, implement the method according to the embodiments of this application.
[0160] According to embodiments of this application, the computer-readable storage medium can be a non-volatile computer-readable storage medium. Examples include, but are not limited to: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In this application, the computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
[0161] For example, according to embodiments of this application, a computer-readable storage medium may include the ROM 602 and / or RAM 603 described above and / or one or more memories other than ROM 602 and RAM 603.
[0162] Embodiments of this application also include a computer program product comprising a computer program containing program code for performing the methods provided in the embodiments of this application. When the computer program product is run on an electronic device, the program code enables the electronic device to implement the evaluation method for infectious disease transmission trend projection data based on large model assistance provided in the embodiments of this application.
[0163] When the computer program is executed by the processor 601, it performs the functions defined in the system / apparatus of this application embodiment. According to the embodiments of this application, the systems, apparatuses, modules, units, etc., described above can be implemented by computer program modules.
[0164] In one embodiment, the computer program may rely on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of signals over a network medium, and downloaded and installed via the communication section 609, and / or installed from the removable medium 611. The program code contained in the computer program can be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination thereof.
[0165] According to embodiments of this application, program code for executing the computer programs provided in the embodiments of this application can be written in any combination of one or more programming languages. Specifically, these computer programs can be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code can be executed entirely on the user's computing device, partially on the user's device, partially on a remote computing device, or entirely on a remote computing device or server. In cases involving a remote computing device, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).
[0166] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than indicated in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions. Those skilled in the art will understand that the features described in the various embodiments of this application can be combined and/or integrated in various ways, even if such combinations or integrations are not explicitly described in this application. In particular, without departing from the spirit and teachings of this application, the features described in the various embodiments of this application can be combined and/or integrated in various ways. All such combinations and/or integrations fall within the scope of this application.
[0167] The embodiments of this application have been described above. However, these embodiments are merely illustrative and not intended to limit the scope of this application. Although various embodiments have been described above, this does not mean that the measures in the various embodiments cannot be used advantageously in combination. Without departing from the scope of this application, those skilled in the art can make various substitutions and modifications, all of which should fall within the scope of this application.
Claims
1. An evaluation method for infectious disease transmission trend projection data based on large model assistance, characterized in that the evaluation method comprises:
obtaining an infectious disease transmission projection data sequence to be evaluated, wherein the infectious disease transmission projection data sequence includes multiple infectious disease transmission prediction data for a target area, arranged in chronological order within a predetermined time window;
based on text semantic feature vectors obtained by semantically parsing and encoding multiple multi-source event texts of the target area using a pre-trained large language model, calculating the relative projection distance between each text semantic feature vector and a preset extreme-value semantic benchmark vector, so as to extract multiple event parameter tuples from the multiple multi-source event texts, wherein each event parameter tuple includes an event occurrence time, an event type, and an event intensity characterizing the strength of the event's impact on the spread of the infectious disease;
for each event, using a time decay kernel function matched to the event type to generate, based on the event intensity, the event onset time, and the value range of the event impact coefficient corresponding to the event type, an event impact coefficient sequence characterizing the impact of the event on the spread of the infectious disease, wherein the event onset time is determined based on the event occurrence time and a lag parameter corresponding to the event type;
taking a comprehensive event impact coefficient sequence, obtained by linearly superimposing and fusing the multiple event impact coefficient sequences and then applying nonlinear truncation, as a causal prior for the fluctuation trend of the effective reproduction number, and generating a causal logic evaluation result by comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points; and
generating a rationality evaluation result based on the causal logic evaluation result, a smoothness evaluation result characterizing the smoothness of the infectious disease transmission projection data sequence, and a physical compliance evaluation result characterizing whether the infectious disease transmission projection data sequence meets physical boundary constraints.
2. The evaluation method according to claim 1, characterized in that the evaluation method further comprises:
when the event type characterizes the event as an inhibitory event that hinders transmission or a release event that accelerates transmission, using an exponential decay function as the time decay kernel function to simulate the gradual weakening over time of the event's impact on the spread of the infectious disease; and
when the event type characterizes the event as a pulse-type event, using a Gaussian distribution function as the time decay kernel function to simulate an impact on the spread of the infectious disease that first rises and then falls, wherein the pulse-type event is an event whose impact duration on the spread of the infectious disease is less than a predetermined impact duration threshold, and the predetermined impact duration threshold is less than the duration of the predetermined time window.
3. The evaluation method according to claim 1, characterized in that the step of using a time decay kernel function matched to the event type to generate, based on the event intensity, the event onset time, and the value range of the event impact coefficient corresponding to the event type, an event impact coefficient sequence characterizing the impact of each event on the spread of the infectious disease comprises:
when the event type indicates that the event is an inhibitory event, determining that the value range of the event impact coefficient is negative;
when the event type indicates that the event is a release event, determining that the value range of the event impact coefficient is positive;
taking the event onset time as the onset point of the time decay kernel function on the time axis; and
calculating the event impact coefficient at each time point based on the event intensity and the value range of the event impact coefficient, thereby obtaining the event impact coefficient sequence.
4. The evaluation method according to claim 1, characterized in that the step of taking the comprehensive event impact coefficient sequence, obtained by linearly superimposing and fusing the multiple event impact coefficient sequences and then applying nonlinear truncation, as a causal prior for the fluctuation trend of the effective reproduction number, and generating a causal logic evaluation result by comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points comprises:
when the comprehensive event impact coefficient sequence indicates the presence of a target inhibitory event, determining, based on the comprehensive event impact coefficient sequence, the impact time window in which the target inhibitory event exerts its influence, wherein the target inhibitory event is an inhibitory event whose event intensity is greater than a predetermined event intensity threshold;
when, within the impact time window, the infectious disease transmission projection data sequence has a mean first derivative less than zero or shows a monotonically decreasing trend, determining that the causal logic evaluation result characterizes that the infectious disease transmission projection data sequence satisfies causal logic;
when the comprehensive event impact coefficient sequence indicates the presence of a target pulse-type event, determining a theoretical peak time based on the comprehensive event impact coefficient sequence, wherein the target pulse-type event is a pulse-type event whose event intensity is greater than the predetermined event intensity threshold; and
when it is determined that the infectious disease transmission projection data sequence has a local maximum within a neighborhood time window of the theoretical peak time, determining that the causal logic evaluation result characterizes that the infectious disease transmission projection data sequence satisfies causal logic, wherein at the local maximum the first derivative is zero and the second derivative is less than zero.
5. The evaluation method according to claim 1, characterized in that the step of generating the rationality evaluation result based on the causal logic evaluation result, the smoothness evaluation result characterizing the smoothness of the infectious disease transmission projection data sequence, and the physical compliance evaluation result characterizing whether the infectious disease transmission projection data sequence meets physical boundary constraints comprises:
determining, based on the statistical distribution characteristics of historical infectious disease transmission data in the target area, an infectious disease transmission change rate threshold and an infectious disease transmission data threshold within the predetermined time window;
evaluating, based on the infectious disease transmission change rate threshold, the rate of change between every two adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence to obtain the smoothness evaluation result;
evaluating, based on the infectious disease transmission data threshold, each infectious disease transmission prediction data in the infectious disease transmission projection data sequence, and evaluating, through a significance test, the degree of deviation of the infectious disease transmission projection data sequence from the historical epidemic curve manifold, to obtain the physical compliance evaluation result; and
integrating the smoothness evaluation result, the physical compliance evaluation result, and the causal logic evaluation result using a weighted entropy method to generate the rationality evaluation result for the infectious disease transmission projection data sequence.
6. The evaluation method according to claim 5, characterized in that the step of determining, based on the statistical distribution characteristics of historical infectious disease transmission data in the target area, the infectious disease transmission change rate threshold within the predetermined time window comprises:
performing first-order difference calculations on multiple historical infectious disease transmission data arranged in chronological order to obtain a historical rate of change sequence;
determining the mean and the standard deviation of the historical rate of change sequence; and
determining, based on the mean and a preset multiple of the standard deviation, the infectious disease transmission change rate threshold, which constrains the degree of abrupt change between every two adjacent infectious disease transmission prediction data in the infectious disease transmission projection data sequence.
7. The evaluation method according to claim 5, characterized in that the step of determining, based on the statistical distribution characteristics of historical infectious disease transmission data in the target area, the infectious disease transmission data threshold within the predetermined time window comprises:
determining historical infectious disease transmission peak data from the multiple historical infectious disease transmission data; and
determining the smaller of the historical infectious disease transmission peak data and the total population of the target area as the infectious disease transmission data threshold.
8. The evaluation method according to claim 5, characterized in that the step of evaluating, through a significance test, the degree of deviation of the infectious disease transmission projection data sequence from the historical epidemic curve manifold to obtain the physical compliance evaluation result comprises:
determining, by a significance test operator based on Mahalanobis distance or dynamic time warping distance, the degree of deviation in probability space of the infectious disease transmission projection data sequence from the historical epidemic curve manifold, to obtain a deviation statistic score; and
if the deviation statistic score is greater than an anomaly determination threshold set based on a confidence level, determining that the infectious disease transmission prediction data corresponding to the deviation statistic score is abnormal data, and determining that the physical compliance evaluation result indicates that the infectious disease transmission projection data sequence is non-compliant.
9. An evaluation device for infectious disease transmission trend projection data based on large model assistance, characterized in that the evaluation device comprises:
a data acquisition module, configured to acquire an infectious disease transmission projection data sequence to be evaluated, wherein the infectious disease transmission projection data sequence includes multiple infectious disease transmission prediction data for a target area, arranged in chronological order within a predetermined time window;
a parsing and extraction module, configured to extract multiple event parameter tuples from multiple multi-source event texts based on text semantic feature vectors obtained by semantically parsing and encoding the multi-source event texts of the target area using a pre-trained large language model, wherein each event parameter tuple includes an event occurrence time, an event type, and an event intensity characterizing the strength of the event's impact on the spread of the infectious disease;
an impact determination module, configured to use a time decay kernel function matched to the event type to generate, based on the event intensity, the event onset time, and the value range of the event impact coefficient corresponding to the event type, an event impact coefficient sequence characterizing the impact of each event on the spread of the infectious disease, wherein the event onset time is determined based on the event occurrence time and a lag parameter corresponding to the event type;
an impact assessment module, configured to take a comprehensive event impact coefficient sequence, obtained by linear superposition and nonlinear truncation of the multiple event impact coefficient sequences, as a causal prior for the fluctuation trend of the effective reproduction number, and to generate a causal logic evaluation result by comparing the dynamic consistency between the infectious disease transmission projection data sequence and the comprehensive event impact coefficient sequence at key feature points; and
a result generation module, configured to generate a rationality evaluation result based on the causal logic evaluation result, a smoothness evaluation result characterizing the smoothness of the infectious disease transmission projection data sequence, and a physical compliance evaluation result characterizing whether the infectious disease transmission projection data sequence meets physical boundary constraints.
10. An electronic device, comprising: one or more processors; and a memory for storing one or more computer programs, characterized in that the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1 to 8.
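As a non-limiting illustration of the calculations recited in claims 2, 3, 6 and 7, the following Python sketch shows one possible way to build event impact coefficient sequences, fuse them, and derive the two historical thresholds. It is a sketch under stated assumptions, not the implementation of this application: the names Event, LAG, SIGNED_BOUND, DECAY_RATE, PULSE_SIGMA and PULSE_PEAK_OFFSET, all numeric values, the use of clipping to [-1, 1] as the nonlinear truncation, the placement of the Gaussian peak a fixed offset after the event onset time, and the use of the sample standard deviation are hypothetical choices made only for this example.

```python
import math
from dataclasses import dataclass
from statistics import mean, stdev

# Hypothetical per-type parameters; the application does not give concrete values.
LAG = {"inhibitory": 2, "release": 1, "pulse": 0}               # lag in time steps
SIGNED_BOUND = {"inhibitory": -1.0, "release": 1.0, "pulse": 1.0}  # extreme of value range
DECAY_RATE = 0.2         # assumed exponential decay rate
PULSE_SIGMA = 1.5        # assumed Gaussian width
PULSE_PEAK_OFFSET = 2    # assumed delay between onset and Gaussian peak


@dataclass
class Event:
    occurrence_time: int  # index inside the time window
    event_type: str       # "inhibitory", "release" or "pulse"
    intensity: float      # assumed to be normalized to [0, 1]


def impact_sequence(event, horizon):
    """Event impact coefficient sequence over `horizon` time steps."""
    onset = event.occurrence_time + LAG[event.event_type]
    # Negative for inhibitory events, positive otherwise.
    amplitude = event.intensity * SIGNED_BOUND[event.event_type]
    sequence = []
    for t in range(horizon):
        if t < onset:
            sequence.append(0.0)  # no impact before the onset time
        elif event.event_type == "pulse":
            peak = onset + PULSE_PEAK_OFFSET
            # Gaussian kernel: rises towards the peak, then falls.
            kernel = math.exp(-((t - peak) ** 2) / (2 * PULSE_SIGMA ** 2))
            sequence.append(amplitude * kernel)
        else:
            # Exponential kernel: impact weakens gradually after onset.
            sequence.append(amplitude * math.exp(-DECAY_RATE * (t - onset)))
    return sequence


def combine(sequences, lower=-1.0, upper=1.0):
    """Linear superposition followed by clipping (assumed truncation)."""
    return [min(upper, max(lower, sum(vals))) for vals in zip(*sequences)]


def change_rate_threshold(history, multiple):
    """Mean of first-order differences plus a multiple of their std. dev.

    Needs at least three historical points so that two differences exist.
    """
    diffs = [b - a for a, b in zip(history, history[1:])]
    return mean(diffs) + multiple * stdev(diffs)


def data_threshold(history, population):
    """Smaller of the historical peak and the total population."""
    return min(max(history), population)
```

For example, calling combine on impact_sequence(Event(1, "inhibitory", 0.5), 6) and impact_sequence(Event(2, "release", 1.0), 6) superimposes an inhibitory and a release event that both take effect at time step 3 under the assumed lags. Keeping the kernel choice inside impact_sequence and the truncation inside combine means either could be swapped for another function without touching the threshold helpers.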