Processing method and system of index data of childhood autoimmune encephalitis based on mechanism information, and medium
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN CHILDRENS HOSPITAL
- Filing Date
- 2026-03-13
- Publication Date
- 2026-06-12
AI Technical Summary
Existing methods for processing clinical data on childhood autoimmune encephalitis suffer from poor data interpretability and low prediction accuracy, especially when sparse data conditions make it difficult to accurately fit the pathological process.
A physics-informed neural network (PINN) model based on mechanistic information is adopted. Individual clinical time-series data undergo window data processing and feature encoding, and the model is trained under mechanistic constraints. At prediction time it takes sparse clinical time-series data and outputs the predicted state estimation vector together with the inverted mechanism parameters.
Under sparse data conditions, the model can accurately reconstruct the continuous evolution trajectory of pathological indicators, providing highly interpretable and accurate reference data, and is suitable for predicting the course of autoimmune encephalitis in children.
Figure CN122201725A_ABST
Description
Technical Field
[0001] This invention relates to the technical field of data processing for childhood autoimmune encephalitis, and in particular to a method, system, and medium for processing childhood autoimmune encephalitis index data based on mechanistic information.
Background Technology
[0002] Autoimmune encephalitis is an inflammatory cascade response triggered by the body's immune system misrecognizing antigens in the central nervous system. Pediatric autoimmune encephalitis (AE) is a serious disease (brain disease) caused by the body's immune system mistakenly attacking the central nervous system. In the pediatric population, anti-NMDAR encephalitis is the most common type of AE, followed by acute disseminated encephalomyelitis and anti-MOG antibody-associated encephalitis.
[0003] With the development of computer technology, processing clinical data on encephalitis can provide doctors with reference data. Reference data for pediatric autoimmune encephalitis have been produced in three successive stages: traditional clinical scoring systems, statistical machine learning models, and deep learning and large language models. However, existing methods for processing encephalitis clinical data have the following shortcomings. First, they rely heavily on large amounts of labeled data and cannot reconcile small sample sizes with data sparsity. Traditional deep learning models (CNN models, RNN models) are essentially statistical correlation models that need a large number of high-quality labeled samples to converge. In pediatric AE, however, poor patient compliance and the invasiveness of lumbar puncture make key biomarkers (such as cerebrospinal fluid antibody titers and receptor density data) scarce: they are typically available only at a very few time points, such as admission and discharge, so the clinical data are highly sparse. In addition, because the sample size at each center is extremely small, existing models cannot reasonably fit the evolution curve from sparse measurement points and are prone to overfitting or underfitting. Second, current fitting methods converge only at the data level and ignore the underlying biological mechanisms; without domain constraints their predictions can be misleading and uninterpretable, and they fail to meet the needs of evidence-based clinical medicine. Third, AE is a dynamic pathological process that evolves from prodromal fever to psychiatric symptoms and then to movement disorders, and existing models cannot model this dynamic evolution.
In summary, current medical data processing methods are sensitive to missing pediatric data and infer unobserved data with low accuracy.
[0004] Therefore, it is necessary to provide a method, system, and medium for processing pediatric autoimmune encephalitis index data based on mechanistic information, aiming to solve the technical problems of poor data interpretability and low data prediction accuracy when processing existing encephalitis clinical data to provide data reference.
Summary of the Invention
[0005] The present invention provides a method for processing indicator data of childhood autoimmune encephalitis based on mechanistic information, which aims to solve the technical problems of poor data interpretability and low data prediction accuracy when processing existing clinical data of encephalitis to provide data reference.
[0006] To achieve the above objectives, the technical solution adopted by the present invention is as follows: A method for processing childhood autoimmune encephalitis index data based on mechanistic information, comprising the following steps: S10, training to obtain a physics-informed neural network model, which specifically includes the following steps: Window data processing and feature encoding are performed on individual clinical time-series data to obtain an individual global semantic feature vector, which characterizes the statistical correlations in that individual's clinical time-series data. A physics-informed neural network architecture is constructed. Its input is the combination of the time variable t and the individual global semantic feature vector. Its output is the estimated individual encephalitis state vector corresponding to the time variable t, which includes an estimated antibody titer vector and an estimated receptor density vector. The label data of the architecture are clinical observation data extracted from the individual clinical time-series data, including antibody titer data and receptor density data. A physical residual function is constructed based on preset mechanistic constraints, and a data fitting loss is calculated based on the clinical observation data. The physical residual loss and the data fitting loss constitute the total loss. The initial model is trained iteratively by backpropagation to update the neural network parameters and the personalized mechanism parameters. When the preset convergence condition is met, the trained physics-informed neural network model is obtained.
S20, acquiring the current sparse clinical time-series data, and performing window data processing and feature encoding on them to obtain the current global semantic feature vector; the current global semantic feature vector and the prediction time node are input into the physics-informed neural network model, which outputs the current predicted state estimation vector at the prediction time node together with the inverted mechanism parameters, wherein the current predicted state estimation vector includes the predicted antibody titer vector and the predicted receptor density vector at the prediction time node.
[0007] Furthermore, in step S10, the mechanistic constraints include an autoantibody titer change condition and a neuronal receptor density change condition, each given as a functional expression (the formula images are not reproduced in this text). The autoantibody titer change condition relates the time derivative of the theoretical antibody titer to the antibody production rate (a physical parameter), the theoretical antibody titer, the clinically detectable immune activation intensity, and the preset clearance coefficient. The neuronal receptor density change condition relates the time derivative of the theoretical receptor density to the theoretical receptor density, the pathological damage parameter (a physical parameter), the preset natural synthesis rate of the receptor, and the preset natural degradation rate.
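As a hedged sketch only, one pair of ordinary differential equations consistent with the quantities listed for the two conditions is given below. It is also consistent with the later statement that the synthesis rate is initially set equal to the degradation rate so that the normalized receptor density stays at steady state. The symbols $A$ (antibody titer), $R$ (receptor density), $I$ (immune activation intensity), $k_p$, $k_c$, $k_s$, $k_d$, $k_{dam}$ and the bilinear damage term $k_{dam} A R$ are assumptions introduced here, not notation recovered from the source.

```latex
\frac{dA(t)}{dt} = k_p\, I(t) - k_c\, A(t),
\qquad
\frac{dR(t)}{dt} = k_s - k_d\, R(t) - k_{dam}\, A(t)\, R(t)
```

Under this assumed form, setting $A = 0$ and $k_s = k_d$ gives the steady state $R = 1$.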
[0008] Furthermore, in step S10, the total loss for model training is defined by a set of expressions (the formula images are not reproduced in this text) over the following quantities. The total loss combines the data observation loss, scaled by the observation loss weight, with the mechanism residual loss, scaled by the mechanism loss weight; the two weights sum to 1. The data observation loss is computed over all observations of measured antibody titer in the clinical observation data, comparing the measured antibody titer at each sampling time node with the predicted antibody titer vector at that node. The mechanism residual loss is computed over all randomly sampled collocation points, using the physical residual of the antibody titer and the physical residual of the receptor density at each point. At a randomly sampled node, the antibody titer residual is formed from the time derivative of the predicted antibody titer vector, the predicted immune activation intensity vector, the antibody production rate parameter, and the predicted antibody titer vector; the receptor density residual is formed from the estimated receptor density vector.
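A minimal sketch of how the two loss terms described above could be combined is shown below: a mean-squared data term at the observation times plus a mean-squared mechanism residual term at collocation points, weighted so that the two weights sum to 1. The squared-error form, the right-hand sides of the residuals, and all names (`total_loss`, `model`, `k_p`, `k_c`, `k_s`, `k_d`, `k_dam`) are assumptions for illustration, since the source formulas are not available in this text.

```python
def total_loss(obs, model, params, colloc_t, w_data=0.5):
    """Weighted sum of a data term and a mechanism residual term.

    obs      : list of (t_i, measured antibody titer)
    model(t) : returns (A, dA_dt, I, R, dR_dt) predicted at time t
    params   : dict of assumed mechanism parameters
    colloc_t : collocation times sampled over the whole time window
    """
    w_phys = 1.0 - w_data  # the two weights sum to 1, as stated in the text

    # Data term: mean squared error on titer at the sparse observation times.
    l_data = sum((a_obs - model(t)[0]) ** 2 for t, a_obs in obs) / len(obs)

    # Mechanism term: mean squared residuals of the ASSUMED equations.
    l_phys = 0.0
    for t in colloc_t:
        A, dA, I, R, dR = model(t)
        r_A = dA - (params["k_p"] * I - params["k_c"] * A)
        r_R = dR - (params["k_s"] - params["k_d"] * R - params["k_dam"] * A * R)
        l_phys += r_A ** 2 + r_R ** 2
    l_phys /= len(colloc_t)

    return w_data * l_data + w_phys * l_phys
```

A prediction that satisfies the assumed equations exactly and matches every observation gives a loss of zero; any mismatch in either term raises it.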
[0009] Furthermore, a token feature encoding model is used to perform window data processing and feature encoding on the individual clinical time-series data to obtain the individual global semantic feature vector; the same token feature encoding model is also used to perform window data processing and feature encoding on the current sparse clinical time-series data to obtain the current global semantic feature vector.
[0010] Furthermore, obtaining the current global semantic feature vector in step S20 specifically includes: S201, obtaining the current sparse clinical time-series data and dividing them into continuous feature index data, single-choice category feature index data, and multi-choice discrete feature index data; S202, using a vector embedding method based on feature tokens, mapping the feature value of each single-index continuous feature datum of the standardized continuous feature index data to a d-dimensional single-index embedding vector; mapping each single-category index datum of the single-choice category feature index data to a d-dimensional single-category embedding vector; and combining the individual symptom labels of the multi-choice discrete feature data into a multi-choice symptom subset, which is mapped to a multi-choice feature embedding vector by a checkbox embedding mechanism; S203, concatenating the single-index embedding vectors, the single-category embedding vectors, and the multi-choice feature embedding vector into a clinical indicator vector sequence; S204, feeding the clinical indicator vector sequence into a multi-layer Transformer encoder, which performs feature interaction, missing value imputation, and feature encoding and outputs the current global semantic feature vector.
[0011] Furthermore, the continuous feature index data include age data, disease duration data, peak body temperature data, cerebrospinal fluid white blood cell count data, protein quantification data, antibody titer data, and receptor density data; each continuous feature value is normalized and mapped to a standard normal distribution to obtain the corresponding single-index continuous feature datum. The single-choice category feature index data include gender data, past medical history data, and prior infection history data; a learnable embedding table is queried with each single-category index datum to obtain its single-category embedding vector. The multi-choice symptom subset is weighted and aggregated by a formula (not reproduced in this text) to obtain the multi-choice feature embedding vector; the formula is defined over the multi-choice symptom subset, which is drawn from the complete set of symptom labels, and over the learnable vector associated with each symptom label i in that subset.
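The three embedding paths described above can be sketched as follows. This is an illustration under stated assumptions, not the source's formulas: the continuous path uses the scale-and-shift form common in feature-tokenizer models, the multi-choice path uses a plain mean as its aggregation weights, and the width `D`, the function names, and the example feature values and symptom labels are all hypothetical.

```python
import random

D = 4  # embedding width d (hypothetical value)
rng = random.Random(0)


def rand_vec():
    """Stand-in for a learnable vector; random here, trained in practice."""
    return [rng.uniform(-1.0, 1.0) for _ in range(D)]


def embed_continuous(x, mean, std, w, b):
    """Z-score the value, then scale and shift it into a d-dimensional token."""
    z = (x - mean) / std
    return [z * wi + bi for wi, bi in zip(w, b)]


def embed_category(value, table):
    """Look the category up in a learnable embedding table."""
    return table[value]


def embed_multiselect(selected, symptom_vecs):
    """Aggregate the vectors of the checked symptoms (assumed: equal weights)."""
    if not selected:
        return [0.0] * D
    return [sum(symptom_vecs[s][k] for s in selected) / len(selected)
            for k in range(D)]


# Build a three-token clinical indicator sequence for one hypothetical patient.
sex_table = {"female": rand_vec(), "male": rand_vec()}
symptom_vecs = {"fever": rand_vec(), "seizure": rand_vec()}
sequence = [
    embed_continuous(6.0, mean=7.0, std=3.0, w=rand_vec(), b=rand_vec()),  # age
    embed_category("female", sex_table),
    embed_multiselect(["fever", "seizure"], symptom_vecs),
]
```

Every token has the same width, so the sequence can be passed to an encoder regardless of which feature type produced each entry.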
[0012] Further, in step S204, the clinical indicator vector sequence is fed into the multi-layer Transformer encoder, and attention scores are calculated by a self-attention mechanism to obtain the association weights between the clinical indicator tokens in the sequence, thereby capturing the associations among clinical features in the time-series dimension. Features that are missing from the clinical indicator vector sequence are replaced with mask vectors. During self-attention, the model automatically adjusts the attention weights at the mask positions and uses the context information of the clinical indicator vector sequence to fill the mask vectors adaptively. Based on this feature encoding and attention-based completion, the current global semantic feature vector is output.
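The mask-and-attend idea can be illustrated with a single attention pass. The sketch below is deliberately simplified and rests on assumptions: it uses one head with identity query, key, and value projections, whereas a real Transformer encoder learns those projections and stacks several layers. The function names and the toy tokens are hypothetical.

```python
import math


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def fill_missing(tokens, mask_vec):
    """Replace missing tokens (None) with the shared mask vector."""
    return [mask_vec if t is None else t for t in tokens]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    out, weights = [], []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted mix of all tokens in the sequence.
        out.append([sum(wj * v[i] for wj, v in zip(w, tokens))
                    for i in range(d)])
    return out, weights


tokens = fill_missing([[1.0, 0.0], None, [0.0, 1.0]], mask_vec=[0.0, 0.0])
out, weights = self_attention(tokens)
```

With a zero mask vector, the masked position attends uniformly to the sequence, so its output is a blend of the observed tokens. This is the sense in which context fills the gap.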
[0013] Furthermore, the method also includes step S30: A logic rule knowledge base for medical attribute labels is constructed, in which the rules follow a standardized label rule structure. A neural predicate network and a differentiable logic layer are constructed, and the individual global semantic feature vector and the individual encephalitis state vector output in step S10 are input into the neural predicate network, which maps them to a predicate truth probability for each medical attribute label. By logic relaxation, Boolean logic is extended to real-valued logic matching the label rule structure, and the label rules in the knowledge base are converted into computable, differentiable mathematical functions using triangular norm (t-norm) theory. Logical satisfaction is calculated based on the logical consistency loss and the total loss. With logical satisfaction as a constraint, the neural predicate network, the token feature encoding model, and the physics-informed neural network model are trained jointly and iteratively; their weight parameters, and the personalized mechanism parameters of the physics-informed neural network model, are corrected until training converges and the output conforms to medical logic. After joint training, the corrected token feature encoding model yields the corrected individual global semantic feature vector, and the corrected physics-informed neural network model yields the corrected current predicted state estimation vector.
The corrected global semantic feature vector and the corrected current predicted state estimation vector are input into the trained neural predicate network to obtain the probability of each medical attribute label.
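To make the relaxation of Boolean rules into differentiable functions concrete, the sketch below uses the product t-norm for conjunction and the Reichenbach form for implication. The text does not say which t-norm is used, so that choice is an assumption, and the example rule ("high titer AND low receptor density IMPLIES severe") and all function names are hypothetical.

```python
def t_and(a, b):
    """Product t-norm: real-valued conjunction on [0, 1]."""
    return a * b


def t_not(a):
    """Standard negation."""
    return 1.0 - a


def t_implies(a, b):
    """Reichenbach implication: 1 - a + a*b, smooth in both arguments."""
    return 1.0 - a + a * b


def rule_satisfaction(p_high_titer, p_low_receptor, p_severe):
    """Truth degree of the hypothetical rule for one patient."""
    return t_implies(t_and(p_high_titer, p_low_receptor), p_severe)


def logic_loss(satisfaction):
    """Consistency penalty: zero when the rule is fully satisfied."""
    return 1.0 - satisfaction
```

At the Boolean corners the functions reproduce the classical truth table, and in between they are polynomials in the predicate probabilities, so gradients can flow back to the networks that produced those probabilities.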
[0014] This invention also provides a system for processing index data of childhood autoimmune encephalitis based on mechanistic information, comprising a token feature encoding module, a physics-informed neural network module, a neuro-symbolic logic reasoning module, and a fully connected processing module. The token feature encoding module performs window data processing and feature encoding on individual clinical time-series data to obtain individual global semantic feature vectors, and on the current sparse clinical time-series data to obtain the current global semantic feature vector. The physics-informed neural network module outputs, from the current global semantic feature vector and the prediction time node, the current predicted state estimation vector and the inverted mechanism parameters. The neuro-symbolic logic reasoning module predicts the predicate truth probability corresponding to each medical attribute label, thereby mapping clinical features to medical attribute labels. The fully connected processing module connects the other three modules, drives their parameter updates until convergence, and also obtains the probability of each medical attribute label.
[0015] The present invention also provides an electronic device, including a processor, a memory, and a program stored in the memory and executable on the processor, wherein when the program is executed by the processor, it implements the steps of the above-described method for processing pediatric autoimmune encephalitis indicator data based on mechanistic information.
[0016] The present invention also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of the above-described method for processing indicator data of childhood autoimmune encephalitis based on mechanistic information.
[0017] The present invention has the following beneficial effects: The present invention provides a method for processing pediatric autoimmune encephalitis index data based on mechanistic information. First, a physics-informed neural network model (PINN model) is trained. Specifically, window data processing and feature encoding are performed on individual clinical time-series data to obtain individual global semantic feature vectors. A physical residual function is constructed based on the mechanistic constraints, a data fitting loss is calculated from the predicted values at the sparse observation time points and the corresponding clinical observation data, and the physical residual loss and the data fitting loss are combined into a total loss. The combination of the time variable t and the individual global semantic feature vector is the model input, the estimated individual encephalitis state vector corresponding to t is the model output, and the sparse clinical observation data serve as supervision (label) data. The initial model of the physics-informed neural network architecture is trained iteratively, and the physics-informed neural network model is obtained after convergence. Then, the current sparse clinical time-series data are processed to obtain the current global semantic feature vector, which is input together with the prediction time node into the physics-informed neural network model to output the current predicted state estimation vector at the prediction time node and the inverted mechanism parameters. Because mechanistic constraints are embedded during training, the model is subject to physical constraints.
For pediatric clinical data, even when pathological index data are available only at sparse time nodes (such as cerebrospinal fluid antibody titer and receptor density data at admission and discharge), physical laws constrain the convergence path of the model, so that the continuous evolution over time of the indices in the current predicted state estimation vector can be accurately reconstructed from the current sparse clinical time-series data. Training and convergence depend only weakly on data volume, and the physics-informed neural network model generalizes well and is robust when only sparse data points exist. When the current predicted state estimation vector and the inverted mechanism parameters are used as reference data for brain diseases, the data are highly interpretable and accurately predicted, and the inverted mechanism parameters corresponding to the current sparse clinical time-series data can be displayed directly.
[0018] In addition to the objectives, features, and advantages described above, the present invention has other objectives, features, and advantages. The invention will now be described in further detail with reference to the figures.
Attached Figure Description
[0019] The accompanying drawings, which form part of this application, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:
Figure 1 is a flowchart of a method for processing indicator data of childhood autoimmune encephalitis based on mechanistic information in one embodiment of the present invention.
Figure 2 is a schematic diagram of the framework of a system for processing pediatric autoimmune encephalitis indicator data based on mechanistic information, according to another embodiment of the invention.
Detailed Implementation
[0020] It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
[0021] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.
[0022] It should be noted that all directional indications (such as up, down, left, right, front, back, etc.) in the embodiments of the present invention are only used to explain the relative positional relationship and movement of each component in a certain specific posture (as shown in the figure). If the specific posture changes, the directional indication will also change accordingly.
[0023] Furthermore, the use of terms such as "first" and "second" in this invention is for descriptive purposes only and should not be construed as indicating or implying their relative importance or implicitly specifying the number of technical features indicated. Therefore, a feature defined with "first" or "second" may explicitly or implicitly include at least one of that feature. Additionally, the technical solutions of the various embodiments can be combined with each other, but only on the basis of being achievable by those skilled in the art. When the combination of technical solutions is contradictory or impossible to implement, such a combination of technical solutions should be considered non-existent and not within the scope of protection claimed by this invention.
[0024] Please refer to Figure 1 and Figure 2. The present invention provides a preferred embodiment of a method for processing childhood autoimmune encephalitis indicator data based on mechanistic information, comprising the following steps: S10, training to obtain a physics-informed neural network model (PINN model), which specifically includes the following steps: Window data processing and feature encoding are performed on individual clinical time-series data to obtain an individual global semantic feature vector, which characterizes the statistical correlations in that individual's clinical time-series data. A physics-informed neural network architecture is constructed. Its input is the combination of the time variable t and the individual global semantic feature vector. Its output is the estimated individual encephalitis state vector corresponding to the time variable t, which includes an estimated antibody titer vector and an estimated receptor density vector. The label data of the architecture are clinical observation data extracted from the individual clinical time-series data, including antibody titer data and receptor density data. A physical residual function is constructed based on preset mechanistic constraints. The physical residual loss is calculated at collocation points sampled randomly over the whole time period. The data fitting loss is calculated from the estimated values at the observation time points and the corresponding clinical observation data. The physical residual loss and the data fitting loss constitute the total loss.
The initial model is trained iteratively by backpropagation to update the neural network parameters and the personalized mechanism parameters. When the preset convergence condition is met, the trained physics-informed neural network model is obtained. S20, acquiring the current sparse clinical time-series data, and performing window data processing and feature encoding on them to obtain the current global semantic feature vector; the current global semantic feature vector and the prediction time node are input into the physics-informed neural network model, which outputs the current predicted state estimation vector at the prediction time node together with the inverted mechanism parameters, wherein the current predicted state estimation vector includes the predicted antibody titer vector and the predicted receptor density vector at the prediction time node.
[0025] It should be understood that, in a preferred embodiment of the present invention, a token feature encoding model (FT-Transformer model) is used to process clinical time-series data into semantic feature vectors. Specifically, the token feature encoding model performs window data processing and feature encoding on individual clinical time-series data to obtain individual global semantic feature vectors, and performs the same processing on the current sparse clinical time-series data to obtain the current global semantic feature vector.
[0026] The present invention provides a method for processing pediatric autoimmune encephalitis index data based on mechanistic information. First, a physics-informed neural network model (PINN model) is trained. Specifically, a token feature encoding model (FT-Transformer model) performs window data processing and feature encoding on individual clinical time-series data to obtain a global semantic feature vector for each individual. A physical residual function is constructed based on the mechanistic constraints, a data fitting loss is calculated from the predicted values at the sparse observation time points and the corresponding clinical observation data, and the physical residual loss and the data fitting loss are combined into a total loss. The combination of the time variable t and the individual global semantic feature vector is the model input, the estimated individual encephalitis state vector corresponding to t is the model output, and the sparse clinical observation data serve as supervision (label) data. The initial model of the physics-informed neural network architecture is trained iteratively, and the physics-informed neural network model is obtained after convergence. Then, the token feature encoding model processes the current sparse clinical time-series data to obtain the current global semantic feature vector, which is input together with the prediction time node into the physics-informed neural network model to output the current predicted state estimation vector at the prediction time node and the inverted mechanism parameters. Because mechanistic constraints are embedded during training, the model is subject to physical constraints.
For pediatric clinical data, even when pathological index data are available only at sparse time nodes (such as cerebrospinal fluid antibody titer and receptor density data at admission and discharge), physical laws constrain the convergence path of the model, so that the continuous evolution over time of the indices in the current predicted state estimation vector can be accurately reconstructed from the current sparse clinical time-series data. Training and convergence depend only weakly on data volume, and the physics-informed neural network model generalizes well and is robust when only sparse data points exist. When the current predicted state estimation vector and the inverted mechanism parameters are used as reference data for brain diseases, the data are highly interpretable and accurately predicted, and the inverted mechanism parameters corresponding to the current sparse clinical time-series data can be displayed directly.
[0027] It should be understood that, in the scheme of the present invention, the FT-Transformer model extracts an individual global semantic feature vector from each individual's clinical time-series data, and these vectors serve as the inputs for training the PINN model; the sparse clinical observation data extracted from the same time-series data serve as the label data; and the PINN model is trained on these inputs, labels, and the mechanistic constraints (ODEs).
[0028] It should be understood that, because of poor patient compliance in pediatrics and the invasive nature of lumbar puncture, key biomarkers and pathological indicators (such as cerebrospinal fluid antibody titer data and receptor density data) are typically available only at a very few time points, such as admission and discharge, so the data volume is small. The solution of this invention is primarily used to process clinical data on pediatric autoimmune encephalitis (AE) and to provide logically consistent reference data. It addresses the existing technical problems of insufficient single-center pediatric AE data, poor interpretability of the medical logic of predictive neural networks when labeled data are highly sparse, inaccurate reference data generated by such networks, and low medical confidence. The solution constructs a physics-informed neural network model (an immunokinetic model) over the time variable t, which can simulate and predict disease progression data and the corresponding future trend within a time window, enabling continuous prediction of disease progression data at unobserved time points.
[0029] In a preferred embodiment of the present invention, to facilitate calculation of the physical loss term, the estimated individual encephalitis state vector further includes an estimated immune activation intensity vector, and the clinical observation data further include measured immune activation intensity data.
[0030] Furthermore, in step S10, the mechanistic constraints include an autoantibody titer change condition and a neuronal receptor density change condition, each given as a functional expression (the formula images are not reproduced in this text). The autoantibody titer change condition relates the time derivative of the theoretical antibody titer to the antibody production rate (a physical parameter), the theoretical antibody titer, the immune activation intensity, and the preset clearance coefficient. The theoretical antibody titer is a continuous function of the time variable t and represents the theoretical change in antibody titer under the mechanistic model. The immune activation intensity is clinically detectable, is likewise a continuous function of time t, and represents the theoretical level of immune activation under the mechanistic model. The neuronal receptor density change condition relates the time derivative of the theoretical receptor density to the theoretical receptor density, the pathological damage parameter (a physical parameter), the preset natural synthesis rate of the receptor, and the preset natural degradation rate.
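To show what a continuous trajectory under such constraints looks like, the sketch below integrates an assumed form of the two conditions with a classical fourth-order Runge-Kutta scheme. This is a plain numerical ODE solver standing in for the trained network, not the method of the invention. The equations dA/dt = k_p·I − k_c·A and dR/dt = k_s − k_d·R − k_dam·A·R, the parameter names, and the function `simulate` are assumptions for illustration.

```python
def simulate(A0, R0, params, immune, t_end, dt=0.01):
    """Integrate the ASSUMED titer/receptor equations from t=0 to t_end.

    immune(t) returns the immune activation intensity at time t.
    Returns a list of (t, A, R) samples.
    """
    def f(t, A, R):
        dA = params["k_p"] * immune(t) - params["k_c"] * A
        dR = params["k_s"] - params["k_d"] * R - params["k_dam"] * A * R
        return dA, dR

    t, A, R = 0.0, A0, R0
    traj = [(t, A, R)]
    for _ in range(int(round(t_end / dt))):
        k1 = f(t, A, R)
        k2 = f(t + dt / 2, A + dt / 2 * k1[0], R + dt / 2 * k1[1])
        k3 = f(t + dt / 2, A + dt / 2 * k2[0], R + dt / 2 * k2[1])
        k4 = f(t + dt, A + dt * k3[0], R + dt * k3[1])
        A += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        R += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
        traj.append((t, A, R))
    return traj


params = {"k_p": 1.0, "k_c": 0.2, "k_s": 0.5, "k_d": 0.5, "k_dam": 2.0}
trajectory = simulate(1.0, 1.0, params, immune=lambda t: 0.0, t_end=10.0)
```

With immune activation switched off, the titer decays exponentially while the receptor density first drops and then recovers toward its steady state as antibody is cleared. A mechanism-constrained model is expected to reproduce this qualitative shape between sparse observations.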
[0031] Understandably, in a preferred embodiment of the present invention, the inverted mechanism parameters include at least one of the following: the inverted antibody production rate, the inverted clearance coefficient, the inverted natural synthesis rate of the receptor, and the inverted natural degradation rate.
[0032] Preferably, in a typical embodiment of the present invention configured for data processing targeting NMDAR encephalitis, the preset natural degradation rate $k_{\mathrm{deg}}$ is between 0.3 and 0.8, preferably 0.5, which corresponds to a biological half-life of the receptor of approximately 24-48 hours. The preset natural synthesis rate of the receptor $k_{\mathrm{syn}}$ is initially configured to be approximately equal to $k_{\mathrm{deg}}$, so that the normalized receptor density is maintained at steady state. The physical parameter of pathological damage $k_{\mathrm{dmg}}$ is configured between 1.0 and 5.0, preferably 2.0; it characterizes the pathological process of rapid cross-linking and endocytosis of the receptor under the action of pathogenic antibodies, whose rate is significantly higher than the natural degradation rate. Notably, these parameters are not fixed constants; they serve as trainable variables, or as the central values of regularization constraints, in the PINN network. The system allows them to be fine-tuned by gradient descent against individual time-series observation data (such as actually measured antibody titer decline curves), thereby yielding individualized immunodynamic characteristic parameters.
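The link between a first-order degradation rate and a half-life can be checked with a one-line formula, $t_{1/2} = \ln 2 / k_{\mathrm{deg}}$. The sketch below assumes the rate is expressed per day, which the text does not state; the function name `half_life_hours` is hypothetical.

```python
import math


def half_life_hours(k_deg_per_day):
    """Half-life in hours of first-order decay; assumes the rate is in 1/day."""
    return math.log(2.0) / k_deg_per_day * 24.0


# Half-lives for the lower bound, preferred value and upper bound of the stated range.
half_lives = {k: half_life_hours(k) for k in (0.3, 0.5, 0.8)}
```

Under the per-day assumption, the preferred value 0.5 gives a half-life of roughly 33 hours, which lies inside the 24-48 hour range quoted above; the ends of the 0.3 to 0.8 interval fall somewhat outside it.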
[0033] Furthermore, in step S10, $L_{\mathrm{total}} = \lambda_{\mathrm{data}} L_{\mathrm{data}} + \lambda_{\mathrm{phys}} L_{\mathrm{phys}}$, $L_{\mathrm{data}} = \frac{1}{N_d}\sum_{i=1}^{N_d} \left\| A^{\mathrm{obs}}(t_i) - \hat{A}(t_i) \right\|_2^2$, $L_{\mathrm{phys}} = \frac{1}{N_c}\sum_{j=1}^{N_c} \left( r_A(t_j)^2 + r_R(t_j)^2 \right)$, with $r_A(t_j) = \frac{d\hat{A}}{dt}(t_j) - \left( k_{\mathrm{prod}}\, \hat{I}(t_j) - k_{\mathrm{clr}}\, \hat{A}(t_j) \right)$ and $r_R(t_j) = \frac{d\hat{R}}{dt}(t_j) - \left( k_{\mathrm{syn}} - k_{\mathrm{deg}}\, \hat{R}(t_j) - k_{\mathrm{dmg}}\, \hat{A}(t_j)\, \hat{R}(t_j) \right)$, where $L_{\mathrm{total}}$ is the total loss during model training; $L_{\mathrm{data}}$ is the data observation loss, which in this embodiment is the fitting loss of the antibody titer data in the clinical observation data; $\lambda_{\mathrm{data}}$ is the observation loss weight; $N_d$ is the total number of observations of detected antibody titer data in the clinical observation data during model training, $i = 1, \dots, N_d$; $A^{\mathrm{obs}}(t_i)$ is the detected antibody titer data at sampling time node $t_i$; $\hat{A}(t_i)$ is the predicted antibody titer vector at sampling time node $t_i$ during model training; $L_{\mathrm{phys}}$ is the mechanism residual loss; $\lambda_{\mathrm{phys}}$ is the mechanism loss weight, with $\lambda_{\mathrm{data}} + \lambda_{\mathrm{phys}} = 1$; $N_c$ is the total number of randomly sampled collocation points, $j = 1, \dots, N_c$; $r_A(t_j)$ is the physical residual of antibody titer at randomly sampled node $t_j$ during model training; $\frac{d\hat{A}}{dt}(t_j)$ is the time derivative of the predicted antibody titer vector at node $t_j$; $\hat{I}(t_j)$ is the predicted immune activation intensity vector at node $t_j$; $k_{\mathrm{prod}}$ is the physical parameter of antibody production rate; $\hat{A}(t_j)$ is the predicted antibody titer vector at node $t_j$; $r_R(t_j)$ is the physical residual term of receptor density at node $t_j$; and $\hat{R}(t_j)$ is the estimated receptor density vector at node $t_j$.
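The structure of the total loss (a weighted sum of a data term on sparse observations and a residual term on random collocation points) can be illustrated as follows. This is a hedged sketch only: a central finite difference stands in for the automatic differentiation a real PINN would use, the damage term is assumed to be proportional to titer times receptor density, and `composite_loss`, `p`, `consistent` and `flat` are hypothetical names with illustrative values.

```python
import math
import random


def composite_loss(pred, obs, colloc, p, w_data=0.5, h=1e-4):
    """pred(t) -> (A_hat, R_hat, I_hat); obs is a list of (t_i, A_obs) pairs."""
    w_phys = 1.0 - w_data                      # the two weights sum to 1
    loss_data = sum((pred(t)[0] - a_obs) ** 2 for t, a_obs in obs) / len(obs)
    loss_phys = 0.0
    for t in colloc:
        A, R, I = pred(t)
        # Central differences replace autodiff in this standalone sketch.
        dA = (pred(t + h)[0] - pred(t - h)[0]) / (2 * h)
        dR = (pred(t + h)[1] - pred(t - h)[1]) / (2 * h)
        r_A = dA - (p["k_prod"] * I - p["k_clr"] * A)
        r_R = dR - (p["k_syn"] - p["k_deg"] * R - p["k_dmg"] * A * R)
        loss_phys += r_A ** 2 + r_R ** 2
    loss_phys /= len(colloc)
    return w_data * loss_data + w_phys * loss_phys


p = dict(k_prod=1.0, k_clr=0.3, k_syn=0.5, k_deg=0.5, k_dmg=0.0)
obs = [(0.0, 2.0), (10.0, 2.0 * math.exp(-3.0))]       # "admission" and "discharge"
rng = random.Random(0)
colloc = [rng.uniform(0.0, 10.0) for _ in range(200)]  # unlabeled collocation points

consistent = lambda t: (2.0 * math.exp(-0.3 * t), 1.0, 0.0)  # satisfies both equations
flat = lambda t: (2.0, 1.0, 0.0)                              # ignores clearance

loss_ok = composite_loss(consistent, obs, colloc, p)
loss_bad = composite_loss(flat, obs, colloc, p)
```

A trajectory that satisfies the equations and passes through the two observations drives the loss to nearly zero, while the flat curve is penalized by both terms even though it matches the first observation exactly.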
[0034] Understandably, in another embodiment of the present invention, the data observation loss may also be the sum of the fitting losses of the antibody titer data and the receptor density data in the clinical observation data, i.e. $L_{\mathrm{data}} = \frac{1}{N_d}\sum_{i=1}^{N_d} \left[ \left\| A^{\mathrm{obs}}(t_i) - \hat{A}(t_i) \right\|_2^2 + \left\| R^{\mathrm{obs}}(t_i) - \hat{R}(t_i) \right\|_2^2 \right]$, where $R^{\mathrm{obs}}(t_i)$ is the detected receptor density data at sampling time node $t_i$ and $\hat{R}(t_i)$ is the predicted receptor density vector at sampling time node $t_i$ during model training. If the clinical observation data used in model training also includes detected immune activation intensity data, the data observation loss may instead be the sum of the fitting losses of the detected antibody titer data, the detected receptor density data, and the detected immune activation intensity data.
[0035] The present invention introduces a Physical Information Neural Network (PINN) model that is driven jointly by mechanism and data. Unlike traditional black-box models that rely solely on fitting massive data, this model embeds the immune dynamics ordinary differential equations (ODEs) into the total loss function as prior physical knowledge. Introducing these physical constraints into network training enables the model to converge even with very few training samples (small or sparse samples) or with only sparse observation points (such as admission and discharge data alone). The model can therefore accurately predict the continuous evolution over time of the antibody titer, receptor density, and immune activation intensity vector indicators, achieving continuous extrapolation of disease progression data that conforms to biological laws and providing reference data that aligns with medical logic.
[0036] One specific embodiment of the present invention includes the following steps. Step 1: Construct a set of ordinary differential equations (ODEs) for neuroimmunological dynamics. Based on receptor-antibody interaction kinetics, define the governing equations describing the pathophysiological process of autoimmune encephalitis (AE). The functional expression of the autoantibody titer change equation is: $\frac{dA(t)}{dt} = k_{\mathrm{prod}}\, I(t) - k_{\mathrm{clr}}\, A(t)$, where $A(t)$ is the titer of pathogenic autoantibodies in cerebrospinal fluid; $I(t)$ is the immune activation intensity (level of inflammatory factors); $k_{\mathrm{prod}}$ is the theoretical antibody production rate; and $k_{\mathrm{clr}}$ is the natural clearance rate of the antibody or the clearance coefficient after plasma exchange therapy (the preset clearance coefficient). The functional expression of the neuronal receptor density change equation is: $\frac{dR(t)}{dt} = k_{\mathrm{syn}} - k_{\mathrm{deg}}\, R(t) - k_{\mathrm{dmg}}\, A(t)\, R(t)$, where $R(t)$ is the density of functional receptors on the synaptic surface; $k_{\mathrm{syn}}$ is the preset natural synthesis rate of the receptor; $k_{\mathrm{deg}}$ is the preset natural degradation rate; and $k_{\mathrm{dmg}}$ is the rate of antibody-mediated receptor endocytosis or destruction (i.e., the pathological damage parameter).
[0037] Step 2: Construct the physical information neural network architecture by building a fully connected deep neural network $\mathcal{N}_\theta(t, z)$, where $\theta$ denotes the learnable parameters of the network, including weights and biases. In this architecture, the input of the initial model is the time variable $t$ together with the individual's global semantic feature vector $z$, which maps to the initial parameters of the dynamic equations (such as initial antibody titer and individualization parameters). The output of the initial model during training is the predicted individual encephalitis state vector at time node $t$, $\hat{u}(t) = [\hat{A}(t), \hat{R}(t), \hat{I}(t)]$, where $\hat{A}(t)$ is the predicted antibody titer vector at time node $t$, $\hat{R}(t)$ is the predicted receptor density vector at time node $t$, and $\hat{I}(t)$ is the predicted immune activation intensity vector at time node $t$.
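The input/output contract of the network in Step 2 (time plus a feature vector in, three state components out) can be sketched with a tiny fully connected network. This is an illustrative assumption-laden sketch: the layer sizes, tanh activation, initialization and the names `init_mlp` and `forward` are not taken from the source, and no training is shown.

```python
import math
import random


def init_mlp(sizes, seed=0):
    """Create (weights, biases) per layer; weights[o][i] connects input i to output o."""
    rng = random.Random(seed)
    return [
        ([[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
          for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(theta, t, z):
    """Map (t, z) to the predicted state (A_hat, R_hat, I_hat)."""
    h = [t] + list(z)                      # concatenate time and feature vector
    for idx, (W, b) in enumerate(theta):
        h = [sum(w * x for w, x in zip(row, h)) + bias for row, bias in zip(W, b)]
        if idx < len(theta) - 1:           # no activation on the output layer
            h = [math.tanh(v) for v in h]
    return tuple(h)


z = [0.1, -0.2, 0.3, 0.0]                  # stand-in for a 4-dim global feature vector
theta = init_mlp([1 + len(z), 16, 16, 3])  # input = t plus z, output = 3 state components
state = forward(theta, t=0.5, z=z)
```

Because the same parameters `theta` are evaluated at any `t`, the network defines a continuous trajectory for each individual, which is what allows residuals to be evaluated at unobserved time points.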
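[0038] Step 3: Construct a composite loss function based on physical residuals, $L_{\mathrm{total}} = \lambda_{\mathrm{data}} L_{\mathrm{data}} + \lambda_{\mathrm{phys}} L_{\mathrm{phys}}$. $L_{\mathrm{data}}$ is the data observation loss for fitting sparse, real-world clinical test data: $L_{\mathrm{data}} = \frac{1}{N_d}\sum_{i=1}^{N_d} \left\| A^{\mathrm{obs}}(t_i) - \hat{A}(t_i) \right\|_2^2$, where $t_i$ are the few sparse clinical sampling time points (such as the admission date, lumbar puncture date, and discharge date), $A^{\mathrm{obs}}(t_i)$ is the detected antibody titer data at sampling time node $t_i$, $\hat{A}(t_i)$ is the predicted antibody titer vector at sampling time node $t_i$ during model training, and $\|\cdot\|_2$ denotes the L2 norm. $L_{\mathrm{phys}}$ is the physical residual loss, used to constrain the predictions at unobserved time points to conform to biological laws. Using automatic differentiation, the residuals of the equations are calculated by directly differentiating the network output. As in step S10, $L_{\mathrm{phys}} = \frac{1}{N_c}\sum_{j=1}^{N_c} \left( r_A(t_j)^2 + r_R(t_j)^2 \right)$, $r_A(t_j) = \frac{d\hat{A}}{dt}(t_j) - \left( k_{\mathrm{prod}}\, \hat{I}(t_j) - k_{\mathrm{clr}}\, \hat{A}(t_j) \right)$, $r_R(t_j) = \frac{d\hat{R}}{dt}(t_j) - \left( k_{\mathrm{syn}} - k_{\mathrm{deg}}\, \hat{R}(t_j) - k_{\mathrm{dmg}}\, \hat{A}(t_j)\, \hat{R}(t_j) \right)$, where $t_j$ are collocation points randomly sampled across the entire timeline, for which no real label data is required. Because the physical constraints are embedded, even when data are available at only two time points (admission and discharge), forcing the network to obey the differential equations at hundreds or thousands of collocation points ensures that the predicted antibody curve does not oscillate randomly but is smooth and consistent with immunological principles.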
[0039] Step 4: Output the current predicted state estimation vector and the inverted mechanism parameters for disease progression. After network training converges, the physical information neural network model outputs two types of key information. First, for time nodes within a future window of days, it predicts the change curves of the predicted antibody titer vector $\hat{A}(t)$ and the predicted receptor density vector $\hat{R}(t)$; these curves provide continuous disease trajectory data and a data reference for assessing the risk of immune rebound or relapse. Second, it inverts the mechanistic parameters: unknown parameters in the differential equations (such as the antibody production rate and the receptor destruction rate) are treated as trainable variables, and after training their specific values reflect the individualized immune characteristics of each individual. For example, if the inverted receptor destruction rate (the pathological damage parameter) is extremely high, then, after reading the data and conducting appropriate examinations, the doctor can infer that the antibody in this individual is extremely pathogenic and that, even if the titer is not high, it may still lead to severe receptor loss.
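The idea of treating a mechanism parameter as a trainable variable can be shown in isolation on the simplest case. The sketch below is a deliberate simplification, not the PINN procedure: it assumes immune activation is zero so that the titer equation has the closed-form solution $A(t) = A_0 e^{-k_{\mathrm{clr}} t}$, and it recovers the clearance coefficient from a synthetic titer decline curve by plain gradient descent. The name `fit_clearance` and all numbers are illustrative.

```python
import math


def fit_clearance(times, titers, a0, k_init=1.0, lr=0.2, n_iter=500):
    """Gradient descent on the mean squared error of A(t) = a0 * exp(-k * t)."""
    k = k_init
    for _ in range(n_iter):
        grad = 0.0
        for t, a_obs in zip(times, titers):
            a_hat = a0 * math.exp(-k * t)
            grad += 2.0 * (a_hat - a_obs) * (-t * a_hat)   # d(a_hat)/dk = -t * a_hat
        k -= lr * grad / len(times)
    return k


# Synthetic, noise-free decline curve generated with a "true" coefficient of 0.3.
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
titers = [math.exp(-0.3 * t) for t in times]
k_fit = fit_clearance(times, titers, a0=1.0)
```

In the full model the same gradient signal would flow through both the network weights and the mechanism parameters at once; here only the parameter is updated, which makes the inversion step easy to inspect.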
[0040] The method described above in this invention transforms static, sparse clinical data into dynamic, continuous disease trajectory data and inversion mechanism parameters, providing a high-level semantic foundation for subsequent neural symbolic reasoning regarding medical attribute labels.
[0041] Furthermore, obtaining the current global semantic feature vector in step S20 specifically includes: S201, obtain the current sparse clinical time-series data and divide it into continuous feature index data, single-choice category feature index data, and multi-choice discrete feature index data; S202, using a vector embedding method based on feature word tuples, map each single-index continuous feature data value $x_j$ of the standardized continuous feature index data to a d-dimensional single-index embedding vector $e^{\mathrm{num}}_j \in \mathbb{R}^d$; map each single-category index data item $c_k$ of the single-choice category feature index data to a d-dimensional single-class embedding vector $e^{\mathrm{cat}}_k$; and combine the individual symptom label data of the multi-choice discrete feature index data into a multi-select discrete feature symptom subset, which is mapped to a multi-select feature embedding vector $e^{\mathrm{multi}}$ using a checkbox embedding mechanism; S203, concatenate the single-index embedding vectors $e^{\mathrm{num}}_j$, the single-class embedding vectors $e^{\mathrm{cat}}_k$, and the multi-select feature embedding vector $e^{\mathrm{multi}}$ into a clinical indicator vector sequence $X$; S204, input the clinical indicator vector sequence $X$ into a multi-layer Transformer encoder, which performs feature interaction, missing value imputation, and feature encoding and outputs the current global semantic feature vector $z$.
[0042] Furthermore, the continuous feature index data includes age data, disease duration data, peak body temperature data, cerebrospinal fluid white blood cell count data, protein quantification data, antibody titer data, receptor density data, and immune activation intensity data; the continuous independent label data of each item of continuous feature index data is normalized and mapped to a standard normal distribution to obtain the corresponding single-index continuous feature data. The single-choice category feature index data includes gender data, past medical history data, and previous infection history data; a learnable embedding table is queried with each single-category index data item $c_k$ to obtain the corresponding single-class embedding vector $e^{\mathrm{cat}}_k$. The multi-select discrete feature symptom subset is weighted and aggregated by formula to obtain the multi-select feature embedding vector $e^{\mathrm{multi}}$, where $S \subseteq U$, $S$ is the multi-select discrete feature symptom subset, $U$ is the complete set of symptom labels, $v_i$ is the learnable vector corresponding to the single symptom label data item $i$ in the subset, and $\mathrm{LN}(\cdot)$ denotes layer normalization.
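The three embedding paths described above can be sketched side by side. This is a minimal illustration under stated assumptions: the numeric path uses a per-feature linear map, the categorical path a lookup table, and the multi-select path a uniform-weight mean followed by layer normalization. The uniform weighting is an assumption (the exact aggregation weights are not recoverable from the text), and every name and dimension here is hypothetical.

```python
import math
import random

D = 4                                   # embedding dimension (illustrative)
rng = random.Random(0)


def rand_vec():
    return [rng.uniform(-0.1, 0.1) for _ in range(D)]


def embed_numeric(x, W, b):
    """Per-feature linear map: scalar x times weight vector W plus bias vector b."""
    return [x * w + bias for w, bias in zip(W, b)]


def layer_norm(v, eps=1e-5):
    mean = sum(v) / len(v)
    var = sum((a - mean) ** 2 for a in v) / len(v)
    return [(a - mean) / math.sqrt(var + eps) for a in v]


def embed_multiselect(symptom_vectors, subset):
    """Mean of the selected symptom vectors (assumed weighting), then layer norm."""
    if not subset:
        return [0.0] * D
    agg = [sum(symptom_vectors[s][k] for s in subset) / len(subset) for k in range(D)]
    return layer_norm(agg)


sex_table = {"male": rand_vec(), "female": rand_vec()}           # learnable lookup table
symptoms = {s: rand_vec() for s in ("psychiatric", "seizure", "movement", "language")}

sequence = [
    embed_numeric(1.2, rand_vec(), rand_vec()),                  # e.g. standardized temperature
    sex_table["female"],                                         # categorical lookup
    embed_multiselect(symptoms, ["psychiatric", "seizure"]),     # variable-length subset
]
```

Whatever the number of selected symptoms, the multi-select path always yields one vector of length `D`, so the sequence passed to the encoder has a fixed number of tokens per patient.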
[0043] Further, in step S204, the clinical indicator vector sequence $X$ is input into the multi-layer Transformer encoder, and attention scores are calculated by the self-attention mechanism to obtain the association weights between the clinical indicator labels in the sequence, thereby capturing the associations among clinical features in the time-series dimension. Real features missing from the clinical indicator vector sequence $X$ are replaced with mask vectors; during the self-attention calculation, the model automatically adjusts the attention weights at the mask positions and uses the context information of the sequence $X$ to adaptively fill in the mask vectors. Based on this feature encoding and attention-based completion, the current global semantic feature vector $z$ is output.
[0044] In another specific embodiment of the present invention, considering that clinical data on pediatric autoimmune encephalitis (AE) are multimodal (numerical, textual, and categorical), sparse, and strongly nonlinearly correlated, the solution abandons traditional one-hot encoding and simple mean imputation and instead constructs a deep feature extractor based on the lexical feature encoding model (FT-Transformer model) architecture, specifically as follows. Step 1: Standardized cleaning of multi-source heterogeneous data. With the informed consent of the patient's family, the child's current sparse clinical time-series data (raw clinical data) are obtained through the interface between the hospital information system and the laboratory information system, and are divided into three categories of heterogeneous feature data. The continuous feature index data (numerical features) include the child's age, duration of illness, peak body temperature, cerebrospinal fluid (CSF) white blood cell count, CSF protein quantification, antibody titer, and inflammatory markers (such as CRP and PCT); each continuous independent label data item is Z-score standardized to map it to a standard normal distribution, yielding the corresponding single-index continuous feature data. The single-choice category feature index data (categorical features) include gender (male / female), past medical history (present / absent), and previous infection history (present / absent / unknown). The multi-choice discrete feature index data (multi-choice / set features) are the complex feature data in AE diagnosis and refer to a set of non-mutually-exclusive clinical symptoms, such as {mental and behavioral abnormalities, epileptic seizures, movement disorders, language disorders, etc.}. In the solution of this invention, the traditional decomposition into a sparse 0 / 1 matrix is avoided, and the multi-select discrete feature data are treated as a variable-length multi-select discrete feature symptom subset.
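The Z-score step for one continuous indicator can be sketched as below. It is a minimal illustration that assumes missing measurements are represented as `None` and left in place for later masking; the function name `zscore` and the sample temperatures are made up for the example.

```python
import statistics


def zscore(values):
    """Standardize the observed values; keep None for missing measurements."""
    present = [v for v in values if v is not None]
    mu = statistics.fmean(present)
    sigma = statistics.pstdev(present)     # population standard deviation
    if sigma == 0.0:                       # constant column: map every value to 0
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mu) / sigma for v in values]


peak_temp_c = [37.5, 39.0, 38.2, None, 40.1]   # one cohort column, one value missing
z_temp = zscore(peak_temp_c)
```

In practice the mean and standard deviation would be estimated on the training cohort and reused unchanged for new patients, so that a given temperature always maps to the same standardized value.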
[0045] Step 2: Vector embedding based on the lexical feature encoding model (FT-Transformer model), which maps the heterogeneous data to a unified high-dimensional (d-dimensional) semantic space. Specifically, for the continuous feature index data, each numerical feature $j$ learns its own weight vector $W_j$ and bias vector $b_j$, and the feature value $x_j$ of the single-index continuous feature data is mapped to a d-dimensional embedding vector $e^{\mathrm{num}}_j$, $e^{\mathrm{num}}_j = x_j W_j + b_j$, where $W_j, b_j \in \mathbb{R}^d$. This allows the neural network to learn the specific semantics of numerical magnitudes; for example, the distance between body temperatures of 37.5℃ and 39.0℃ is stretched non-linearly in the pathological space. For the single-category index data, a learnable embedding table is used for each feature $k$, and the embedding vector $e^{\mathrm{cat}}_k$ of each category is obtained by table lookup. For the multi-choice discrete feature data, considering that an individual may exhibit multiple symptoms simultaneously, a checkbox embedding mechanism is introduced: let the complete set of all possible symptoms be $U$ and the subset of symptoms exhibited by the individual be $S \subseteq U$; each symptom $i$ in the complete set is assigned a learnable vector $v_i$, and the individual's multi-select feature embedding $e^{\mathrm{multi}}$ is obtained by weighted aggregation of all symptom vectors in the subset. In this way, variable-length descriptions of clinical manifestations are compressed into dense vectors of fixed length while the latent information on symptom co-occurrence is preserved.
[0046] Step 3: Feature interaction and missing value handling based on the multi-layer Transformer encoder. All feature embedding vectors obtained in Step 2 are concatenated into a feature sequence, and a special [CLS] token is added to aggregate global information. The sequence is then input into a multi-layer Transformer encoder, which uses a self-attention mechanism to handle feature interactions and missing values. By calculating attention scores, the model automatically learns the association weights between different clinical indicators. For example, when antibody titer features are present, the model gives higher attention to features of mental and behavioral abnormalities, reflecting the judgment logic that antibody positivity combined with mental symptoms indicates high risk, and thereby models non-linear interactions between features. For adaptive missing value handling, in cases of missing data that are common in pediatrics (such as missing CSF data because lumbar puncture was refused), the missing features in the input sequence are replaced with a specific [MASK] vector. During the self-attention calculation, the Transformer automatically adjusts the attention weights at the [MASK] positions based on the other available features (such as clinical symptoms and MRI reports), thereby using contextual information to infer or compensate for the missing information.
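The mechanics of Step 3 (prepend a [CLS] token, substitute a [MASK] vector for missing features, let every position attend to every other) can be sketched with a single attention head. This is a simplified illustration, not the encoder itself: it assumes identity query, key and value projections, a single layer, and fixed rather than learned [CLS] and [MASK] vectors. All names and numbers are hypothetical.

```python
import math


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def build_sequence(cls_vec, feature_vecs, mask_vec):
    """Prepend [CLS]; replace missing features (None) with the [MASK] vector."""
    return [cls_vec] + [mask_vec if v is None else v for v in feature_vecs]


def self_attention(seq):
    """Scaled dot-product self-attention with identity projections (assumption)."""
    d = len(seq[0])
    outputs, weights = [], []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, seq)) for i in range(d)])
    return outputs, weights


cls_vec = [0.1, 0.0, 0.0, 0.2]
mask_vec = [0.0, 0.0, 0.0, 0.0]
features = [[1.0, 0.0, 0.5, 0.0], None, [0.0, 1.0, 0.0, 0.3]]   # second feature missing

seq = build_sequence(cls_vec, features, mask_vec)
out, attn = self_attention(seq)
global_vec = out[0]            # output at the [CLS] position
masked_filled = out[2]         # [MASK] position now carries context from other tokens
```

After attention, the output at the masked position is a weighted mixture of the tokens that are present, which is the sense in which context compensates for the missing measurement.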
[0047] Step 4: After processing by the multi-layer Transformer encoder, extract from the output sequence the vector $z$ corresponding to the [CLS] token. This is a highly condensed global pathological state feature vector (multimodal feature latent vector) that contains all of the individual's clinical and laboratory examination information.
[0048] Understandably, the individual global semantic feature vector in step S10 is extracted in the same way as in step S20. Optionally, in step S20, the current global semantic feature vector can also be obtained by encoding multiple types of clinical feature data through one-hot encoding embedding, the TabNet model, or an attention-based tabular feature encoding model. Alternatively, the SAINT model or a time-series tabular attention network can be used to encode the current sparse clinical time-series data to obtain the global semantic feature vector. Specifically, for continuous feature index data, feature embedding can follow existing techniques that combine normalization with feature mapping: a standardization + MLP embedding approach, in which Z-score or Min-Max normalization is followed by mapping to a high-dimensional embedding vector with a single-layer perceptron, replaces the Linear(x)+b embedding method of this patent. For single-choice category feature index data, following the existing idea of transforming discrete values into dense vectors, Word2Vec embedding can be used: category features are treated as words and mapped to embedding vectors by a word vector model, replacing the lookup-table embedding method. For multi-choice discrete feature index data, to transform variable-length symptom subsets into fixed-length dense vectors, a bag-of-embeddings approach can be used: a learnable embedding vector is assigned to each symptom, and the vectors of an individual's symptom subset are mean-pooled or max-pooled to obtain a fixed-length vector. This is a commonly used technique in clinical NLP and feature processing.
[0049] Furthermore, the method also includes step S30: construct a logical rule knowledge base for medical attribute labels, the knowledge base having a standardized label rule structure; construct a neural predicate network and a differentiable logic layer, and input the individual global semantic feature vector and the individual encephalitis state vector output in step S10 into the neural predicate network for mapping, obtaining the predicate truth probability corresponding to each medical attribute label; using logical relaxation, extend Boolean logic to the real-valued logic corresponding to the label rule structure, and use triangular norm (t-norm) theory to transform the label rules in the knowledge base into computable and differentiable mathematical functions; calculate the logical satisfaction from the logical consistency loss and the total loss, and, with the logical satisfaction as a constraint, train the neural predicate network, the lexical feature encoding model, and the physical information neural network model jointly and iteratively, correcting the weight parameters of all three while simultaneously correcting the personalized mechanism parameters of the physical information neural network model, until training converges to the point where the output conforms to medical logic. After this collaborative training, the corrected individual global semantic feature vector is obtained from the corrected lexical feature encoding model, and the corrected current predicted state estimation vector is obtained from the corrected physical information neural network model. The corrected global semantic feature vector and the corrected current predicted state estimation vector are then input into the trained neural predicate network for mapping to obtain the probability of each medical attribute label.
[0050] In one specific embodiment of the present invention, a logical rule knowledge base for medical attribute labels is constructed. The medical attribute labels correspond to the clinical diagnostic criteria and related prerequisites for childhood autoimmune encephalitis. The logical rule knowledge base has a standardized label rule structure (i.e., a first-order logic rule structure), in which rules are written over an individual $x$ using logical AND ($\land$), logical OR ($\lor$), and logical NOT ($\neg$). This invention employs logical relaxation so that the logical rules can be integrated seamlessly with the Physical Information Neural Network model (PINN model) and the lexical feature encoding model (FT-Transformer) for gradient descent training: Boolean logic (True / False, i.e. values in $\{0, 1\}$) is extended to real-valued logic (values in $[0, 1]$). Triangular norm (t-norm) theory is then used to transform the logical operations into differentiable algebraic operations: logical AND ($\land$) and logical OR ($\lor$) are each replaced by one of two alternative real-valued operators, and logical NOT ($\neg$) and logical implication ($\rightarrow$) are each replaced by a corresponding real-valued operator. Through this transformation, complex medical rules are converted into computable and differentiable mathematical functions.
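One common way to realize such a relaxation is shown below. This is a hedged sketch using standard operator choices (product and minimum t-norms for AND, probabilistic sum and maximum for OR, complement for NOT, and implication defined as NOT a OR b); whether these are the exact operators of the invention is an assumption, and the rule and truth values are illustrative only.

```python
def and_prod(a, b):
    return a * b                 # product t-norm


def and_min(a, b):
    return min(a, b)             # minimum (Goedel) t-norm


def or_prob(a, b):
    return a + b - a * b         # probabilistic sum, dual of the product t-norm


def or_max(a, b):
    return max(a, b)             # maximum, dual of the minimum t-norm


def not_(a):
    return 1.0 - a


def implies(a, b):
    return or_prob(not_(a), b)   # a -> b read as (NOT a) OR b, equals 1 - a + a*b


# Illustrative rule: antibody_positive AND psychiatric_symptoms -> high_risk
antibody_positive, psychiatric, high_risk = 0.9, 0.8, 0.3
satisfaction = implies(and_prod(antibody_positive, psychiatric), high_risk)
logic_loss = 1.0 - satisfaction  # penalize predictions that violate the rule
```

Because every operator is a polynomial or a min/max of its inputs, the satisfaction value is differentiable almost everywhere with respect to the predicate probabilities, so a rule violation can push gradients back into the networks that produced those probabilities.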
[0051] As a further technical improvement of the present invention, the solution in step S30 can be implemented by adding a differentiable logic layer to a neural predicate network. Alternatively, it can be replaced by existing neuro-symbolic reasoning techniques such as the neuro-symbolic concept learner (NSCL), probabilistic soft logic (PSL), and TensorLog. All of these techniques can achieve collaborative training of medical logic rules and neural networks, so that the model output conforms to the medical knowledge logic.
[0052] This invention also provides a system for processing indicator data of childhood autoimmune encephalitis based on mechanistic information. It includes a lexical feature encoding module, a physical information neural network module, a neural symbolic logic reasoning module, and a fully connected processing module. The lexical feature encoding module is used to process and feature-encode individual clinical time-series data to obtain individual global semantic feature vectors, and to process and feature-encode the current sparse clinical time-series data to obtain the current global semantic feature vector. The physical information neural network module is used to output, based on the current global semantic feature vector and the prediction time node, the current predicted state estimation vector and the inverted mechanism parameters. The neural symbolic logic reasoning module is used to predict the predicate truth probability corresponding to each medical attribute label, so as to map and associate clinical features with medical attribute labels. The fully connected processing module connects the lexical feature encoding module, the physical information neural network module, and the neural symbolic logic reasoning module; it is used to update these three modules until they converge, and also to obtain the probability of each medical attribute label.
[0053] The present invention also provides an electronic device, including a processor, a memory, and a program stored in the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the above-described method for processing indicator data of childhood autoimmune encephalitis based on mechanistic information.
[0054] The present invention also provides a computer-readable storage medium storing a computer program, wherein when the computer program is executed by a processor, it implements the steps of the above-described method for processing indicator data of childhood autoimmune encephalitis based on mechanistic information.
[0055] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for processing indicator data of childhood autoimmune encephalitis based on mechanistic information, characterized in that it includes the following steps: S10, training to obtain a physical information neural network model, specifically including: performing window data processing and feature encoding on individual clinical time-series data to obtain an individual global semantic feature vector, which characterizes the statistical correlation of the individual clinical time-series data; constructing a physical information neural network architecture, wherein the input of the architecture is a combination of the time variable t and the individual global semantic feature vector, the output of the architecture is the estimated individual encephalitis state vector corresponding to the time variable t, and the estimated individual encephalitis state vector includes the estimated antibody titer vector $\hat{A}(t)$ and the predicted receptor density vector $\hat{R}(t)$; wherein the label data of the architecture is clinical observation data extracted from the individual clinical time-series data, the clinical observation data including antibody titer data and receptor density data; constructing a physical residual function based on preset mechanism constraints; calculating the data fitting loss based on the clinical observation data, the physical residual loss and the data fitting loss together constituting the total loss; and iteratively training the initial model using backpropagation optimization to update the neural network parameters and the personalized mechanism parameters, the trained physical information neural network model being obtained when the preset convergence condition is met; S20, acquiring the current sparse clinical time-series data, and performing window data processing and feature encoding on the current sparse clinical time-series data to acquire the current global semantic feature vector; inputting the current global semantic feature vector and the prediction time node $t_{\mathrm{pred}}$ into the physical information neural network model, and outputting the current predicted state estimation vector $\hat{u}(t_{\mathrm{pred}}) = [\hat{A}(t_{\mathrm{pred}}), \hat{R}(t_{\mathrm{pred}})]$ at the prediction time node together with the inverted mechanism parameters, where $\hat{A}(t_{\mathrm{pred}})$ is the predicted antibody titer vector at the prediction time node $t_{\mathrm{pred}}$ and $\hat{R}(t_{\mathrm{pred}})$ is the predicted receptor density vector at the prediction time node $t_{\mathrm{pred}}$.
2. The method for processing childhood autoimmune encephalitis index data based on mechanistic information according to claim 1, characterized in that, in step S10, the mechanistic constraints include a condition on the change in autoantibody titer and a condition on the change in neuronal receptor density. The functional expression for the autoantibody titer change condition is: $\frac{dA(t)}{dt} = k_{\mathrm{prod}}\, I(t) - k_{\mathrm{clr}}\, A(t)$; where $\frac{dA(t)}{dt}$ is the time derivative of the theoretical antibody titer data, $k_{\mathrm{prod}}$ is the physical parameter representing the antibody production rate, $A(t)$ is the theoretical antibody titer data, $I(t)$ is the immune activation intensity data, and $k_{\mathrm{clr}}$ is the preset clearance coefficient. The functional expression for the neuronal receptor density change condition is: $\frac{dR(t)}{dt} = k_{\mathrm{syn}} - k_{\mathrm{deg}}\, R(t) - k_{\mathrm{dmg}}\, A(t)\, R(t)$; where $\frac{dR(t)}{dt}$ is the time derivative of the theoretical receptor density, $R(t)$ is the theoretical receptor density, $k_{\mathrm{dmg}}$ is the physical parameter of pathological damage, $k_{\mathrm{syn}}$ is the preset natural synthesis rate of the receptor, and $k_{\mathrm{deg}}$ is the preset natural degradation rate.
3. The method for processing childhood autoimmune encephalitis index data based on mechanistic information according to claim 1, characterized in that, in step S10, $L_{\mathrm{total}} = \lambda_{\mathrm{data}} L_{\mathrm{data}} + \lambda_{\mathrm{phys}} L_{\mathrm{phys}}$, $L_{\mathrm{data}} = \frac{1}{N_d}\sum_{i=1}^{N_d} \left\| A^{\mathrm{obs}}(t_i) - \hat{A}(t_i) \right\|_2^2$, $L_{\mathrm{phys}} = \frac{1}{N_c}\sum_{j=1}^{N_c} \left( r_A(t_j)^2 + r_R(t_j)^2 \right)$, with $r_A(t_j) = \frac{d\hat{A}}{dt}(t_j) - \left( k_{\mathrm{prod}}\, \hat{I}(t_j) - k_{\mathrm{clr}}\, \hat{A}(t_j) \right)$ and $r_R(t_j) = \frac{d\hat{R}}{dt}(t_j) - \left( k_{\mathrm{syn}} - k_{\mathrm{deg}}\, \hat{R}(t_j) - k_{\mathrm{dmg}}\, \hat{A}(t_j)\, \hat{R}(t_j) \right)$, where $L_{\mathrm{total}}$ is the total loss during model training; $L_{\mathrm{data}}$ is the data observation loss; $\lambda_{\mathrm{data}}$ is the observation loss weight; $N_d$ is the total number of observations of detected antibody titer data in the clinical observation data during model training, $i = 1, \dots, N_d$; $A^{\mathrm{obs}}(t_i)$ is the detected antibody titer data at sampling time node $t_i$; $\hat{A}(t_i)$ is the predicted antibody titer vector at sampling time node $t_i$ during model training; $L_{\mathrm{phys}}$ is the mechanism residual loss; $\lambda_{\mathrm{phys}}$ is the mechanism loss weight, with $\lambda_{\mathrm{data}} + \lambda_{\mathrm{phys}} = 1$; $N_c$ is the total number of randomly sampled collocation points, $j = 1, \dots, N_c$; $r_A(t_j)$ is the physical residual of antibody titer at randomly sampled node $t_j$ during model training; $\frac{d\hat{A}}{dt}(t_j)$ is the time derivative of the predicted antibody titer vector at node $t_j$; $k_{\mathrm{prod}}$ is the physical parameter of antibody production rate; $\hat{I}(t_j)$ is the predicted immune activation intensity vector at node $t_j$; $\hat{A}(t_j)$ is the predicted antibody titer vector at node $t_j$; $r_R(t_j)$ is the physical residual term of receptor density at node $t_j$; and $\hat{R}(t_j)$ is the estimated receptor density vector at node $t_j$.
4. The method for processing childhood autoimmune encephalitis index data based on mechanistic information according to any one of claims 1 to 3, characterized in that a lexical feature encoding model is used to perform window data processing and feature encoding on the individual clinical time-series data to obtain the individual global semantic feature vector, and the same lexical feature encoding model is used to perform window data processing and feature encoding on the current sparse clinical time-series data to obtain the current global semantic feature vector.
5. The method for processing childhood autoimmune encephalitis index data based on mechanistic information according to claim 4, characterized in that, step S20, obtaining the current global semantic feature vector, specifically includes: S201, obtaining the current sparse clinical time-series data, and dividing the current sparse clinical time-series data into continuous feature index data, single-choice category feature index data, and multi-choice discrete feature index data; S202, using a vector embedding method based on feature word tuples, mapping the feature value of each item of single-index continuous feature data in the standardized continuous feature index data to a d-dimensional single-index embedding vector; mapping each item of single-category index data in the single-choice category feature index data to a d-dimensional single-class embedding vector; combining the individual symptom label data of the multi-choice discrete feature index data into a multi-select discrete feature symptom subset, and mapping the multi-select discrete feature symptom subset to a multi-select feature embedding vector using a checkbox (multi-select) embedding mechanism; S203, concatenating the single-index embedding vectors, the single-class embedding vectors, and the multi-select feature embedding vector into a clinical indicator vector sequence; S204, inputting the clinical indicator vector sequence into a multi-layer Transformer encoder, which performs feature interaction, missing value imputation, and feature encoding, and outputs the current global semantic feature vector.
6. The method for processing childhood autoimmune encephalitis index data based on mechanistic information according to claim 5, characterized in that, the continuous feature index data includes age data, disease duration data, peak body temperature data, cerebrospinal fluid white blood cell count data, protein quantification data, antibody titer data, and receptor density data; the continuous independent label data of each item of continuous feature index data is normalized and mapped to a standard normal distribution to obtain the corresponding single-index continuous feature data. The single-choice category feature index data includes gender data, past medical history data, and previous infection history data; a learnable embedding table is queried with each item of single-category index data to obtain the corresponding single-class embedding vector. The multi-select discrete feature symptom subset is weighted and aggregated to obtain the multi-select feature embedding vector, where the multi-select discrete feature symptom subset is a subset of the complete set of symptom labels, and each item of single symptom label data i in the subset has a corresponding learnable vector.
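The three embedding paths described in claims 5 and 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the aggregation formula is not reproduced in this text, so a uniform-weight mean over the selected symptom vectors is assumed; the scalar-to-vector mapping for continuous indicators is assumed to be a per-indicator weight and bias; and all names, table entries, and cohort statistics are hypothetical. Random vectors stand in for learnable parameters.

```python
import random

D = 4  # embedding dimension d (illustrative value)
_rng = random.Random(0)  # fixed seed so the sketch is reproducible


def new_vector():
    """Stand-in for a learnable d-dimensional parameter vector."""
    return [_rng.uniform(-1.0, 1.0) for _ in range(D)]


def z_score(value, mean, std):
    """Standardize a continuous indicator with cohort statistics."""
    return (value - mean) / std


def embed_continuous(z, weight, bias):
    """ASSUMED mapping: scalar z -> d-dim vector via per-indicator weight and bias."""
    return [w * z + b for w, b in zip(weight, bias)]


def embed_multi_select(selected, table):
    """ASSUMED aggregation: uniform-weight mean of the selected symptom vectors."""
    if not selected:
        return [0.0] * D
    vectors = [table[s] for s in selected]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


# Hypothetical embedding tables and indicator parameters.
gender_table = {"male": new_vector(), "female": new_vector()}
symptom_table = {s: new_vector() for s in ("seizure", "fever", "psychiatric")}
temp_weight, temp_bias = new_vector(), new_vector()

# One patient record: peak temperature, gender, and a set of symptoms.
z_temp = z_score(39.0, mean=37.0, std=1.0)
sequence = [
    embed_continuous(z_temp, temp_weight, temp_bias),
    gender_table["female"],
    embed_multi_select({"seizure", "fever"}, symptom_table),
]
```

Each indicator ends up as one d-dimensional token, so `sequence` has the shape a Transformer encoder expects regardless of how many symptoms were ticked.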
7. The method for processing childhood autoimmune encephalitis index data based on mechanistic information according to claim 5, characterized in that, in step S204, the clinical indicator vector sequence is input into the multi-layer Transformer encoder, and attention scores are calculated using a self-attention mechanism to obtain the association weight between each pair of clinical indicator labels in the clinical indicator vector sequence, thereby capturing the association of clinical features in the time-series dimension; missing real features in the clinical indicator vector sequence are replaced with mask vectors, and during the self-attention calculation the model automatically adjusts the attention weights at the mask positions, using the context information of the clinical indicator vector sequence to adaptively fill in the mask vectors; based on feature encoding and attention completion, the current global semantic feature vector is output.
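The attention-based completion in claim 7 can be pictured with a single-head sketch. It is a simplification under stated assumptions: the query, key, and value projections are taken to be the identity, there is one layer and one head, and the mask vector is a fixed zero vector, whereas in a trained Transformer encoder all of these would be learned. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections (assumed)."""
    d = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted mix of all tokens.
        outputs.append([sum(row[j] * tokens[j][c] for j in range(len(tokens)))
                        for c in range(d)])
    return outputs, weights


def complete_missing(sequence, mask_vector):
    """Replace missing indicators (None) with the mask vector, then attend."""
    tokens = [mask_vector if tok is None else tok for tok in sequence]
    return self_attention(tokens)


# The middle indicator is missing; its output is filled from the context.
outputs, weights = complete_missing([[1.0, 0.0], None, [0.0, 1.0]], [0.0, 0.0])
```

With a zero mask vector every score at the masked position is zero, so its attention row is uniform and its output is the average of the tokens in the sequence. A learned mask vector and learned projections would instead weight the most informative indicators more heavily.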
8. The method for processing childhood autoimmune encephalitis index data based on mechanistic information according to claim 4, characterized in that, it further includes step S30: constructing a logical rule knowledge base for medical attribute labels, wherein the logical rule knowledge base has a standardized label rule structure; constructing a neural predicate network and a differentiable logic layer, and inputting the individual global semantic feature vector and the individual encephalitis state vector output in step S10 into the neural predicate network for mapping, to obtain the predicate truth probability corresponding to each medical attribute label; using logic relaxation techniques, extending Boolean logic into real-valued logic corresponding to the label rule structure, and transforming the label rules in the logical rule knowledge base into computable and differentiable mathematical functions using triangular norm (t-norm) theory; calculating logical satisfaction based on the logical consistency loss and the total loss; with logical satisfaction as a constraint, training the neural predicate network, the lexical feature encoding model, and the physical information neural network model in a coordinated, iterative manner, correcting the weight parameters of the neural predicate network, the lexical feature encoding model, and the physical information neural network model, and at the same time correcting the personalized mechanism parameters of the physical information neural network model, until model training converges and the output conforms to medical logic; obtaining the modified individual global semantic feature vector from the lexical feature encoding model modified by collaborative training, and obtaining the modified current predicted state estimation vector from the physical information neural network model modified by collaborative training; and inputting the modified global semantic feature vector and the modified current predicted state estimation vector into the trained neural predicate network for mapping, to obtain the probability of each medical attribute label.
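The relaxation of Boolean rules into differentiable real-valued functions can be sketched with triangular norms (t-norms). The claim does not say which t-norm is used, so the product t-norm and the Reichenbach implication below are assumptions, as are the rule, the label names, and the definition of the logical consistency loss as one minus the mean rule satisfaction.

```python
def t_and(a, b):
    """Product t-norm: real-valued conjunction on [0, 1]."""
    return a * b


def t_not(a):
    """Standard negation on [0, 1]."""
    return 1.0 - a


def t_implies(a, b):
    """Reichenbach implication, 1 - a + a*b; equals Boolean implication on {0, 1}."""
    return 1.0 - a + a * b


# Hypothetical rule base: each pair (p, q) encodes the rule "p implies q".
RULES = [("high_antibody_titer", "active_inflammation")]


def logic_satisfaction(truth, rules):
    """Mean satisfaction of all rules given predicate truth probabilities."""
    values = [t_implies(truth[p], truth[q]) for p, q in rules]
    return sum(values) / len(values)


def logic_consistency_loss(truth, rules):
    """ASSUMED loss: 1 minus satisfaction, so fully satisfied rules cost 0."""
    return 1.0 - logic_satisfaction(truth, rules)


# Predicate truth probabilities as a neural predicate network might output them.
truth = {"high_antibody_titer": 0.9, "active_inflammation": 0.8}
loss = logic_consistency_loss(truth, RULES)
```

Because every operation is a polynomial in the truth probabilities, the loss is differentiable and can be added to the training objective so that gradients flow back into the networks that produced the probabilities.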
9. A system for processing childhood autoimmune encephalitis index data based on mechanistic information, characterized in that, it includes a lexical feature encoding module, a physical information neural network module, a neural symbolic logic reasoning module, and a fully connected processing module. The lexical feature encoding module is used to perform window data processing and feature encoding on the individual clinical time-series data to obtain the individual global semantic feature vector, and to perform window data processing and feature encoding on the current sparse clinical time-series data to obtain the current global semantic feature vector. The physical information neural network module is used to output, based on the current global semantic feature vector and the prediction time node, the current predicted state estimation vector and the inverted mechanism parameters. The neural symbolic logic reasoning module is used to obtain the predicate truth probability corresponding to each medical attribute label, so as to realize the mapping and association between clinical features and medical attribute labels. The fully connected processing module connects the lexical feature encoding module, the physical information neural network module, and the neural symbolic logic reasoning module; it is used to update these three modules until they converge, and is also used to obtain the probability of each medical attribute label.
10. A computer-readable storage medium, characterized in that, the computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the method for processing childhood autoimmune encephalitis index data based on mechanistic information according to any one of claims 1 to 8.