Multimodal glucose prediction method based on physiological decoupling and dynamic attention mechanism
By employing a multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms, this approach addresses the issues of ambiguous physiological mechanism characterization and uninterpretable models in existing technologies. It achieves high-precision, interpretable blood glucose prediction, thereby enhancing user trust and the individual adaptability of the model.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- XIDIAN UNIV
- Filing Date
- 2026-04-10
- Publication Date
- 2026-06-26
AI Technical Summary
Existing blood glucose prediction technologies have limitations in terms of physiological mechanism characterization and model interpretability. They cannot effectively learn clear physiological causal relationships, resulting in large prediction biases in complex scenarios, low user trust, and a lack of reliable decision guidance.
A multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanism is adopted. The multimodal data is transformed by physiological feature engineering and then input into parallel physiological sub-modules for feature extraction. The dynamic attention unit between modules is used for weighted fusion to generate a context vector containing key physiological information. Finally, the blood glucose prediction value and attribution explanation text are generated in parallel.
It achieves high-precision, interpretable blood glucose prediction, reduces prediction errors in key scenarios, enhances user trust, has stronger robustness and individual adaptability, and can provide reliable intervention measures.
Figure CN122291024A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of medical and health artificial intelligence and physiological signal prediction technology, and specifically relates to a multimodal blood glucose prediction method based on physiological decoupling and a dynamic attention mechanism.
Background Technology
[0002] Accurate blood glucose prediction is a core requirement for automated diabetes management, but it has long faced key challenges such as opaque model decision-making processes, low user trust, and obstacles to clinical application.
[0003] Currently, AI-based blood glucose prediction technologies fall mainly into four categories: single-modal time-series models; multimodal concatenation models, which simply stitch together multimodal data; multimodal models with intra-sequence (temporal) attention; and models with rudimentary modularization.
[0004] Single-modal time-series prediction models are the foundation of blood glucose prediction. They primarily employ recurrent neural networks (RNNs) and their variants, such as long short-term memory networks (LSTM) or gated recurrent units (GRUs), to directly model historical time series data from continuous glucose monitoring (CGM) to predict future blood glucose levels and trends. However, these models completely ignore the influence of key external factors such as diet and insulin, leading to a significant decrease in prediction accuracy in scenarios such as after meals and after insulin injections. Furthermore, the models exhibit poor generalization ability and cannot adapt to individual changes in daily life events.
[0005] Fusion models that simply concatenate multimodal data take heterogeneous data from different sources, such as CGM, carbohydrate intake, and insulin dosage, and after time alignment and normalization, simply concatenate them into a longer feature vector at each time step. This vector sequence is then input into the network for learning and prediction. However, this method treats data with completely different physiological dynamics (such as the glycemic effect curve of food and the glycemic effect curve of insulin) as homogeneous information, making it difficult for the model to learn clear causal relationships. Its essence remains a "black box" operation, unable to explain the specific physiological attribution of blood glucose changes.
[0006] Multimodal models incorporating temporal attention introduce an attention mechanism on top of multimodal fusion models. This attention mechanism primarily operates on the temporal dimension of the input sequence, aiming to allow the model to dynamically assign different weights to input features at different historical moments, thereby focusing on historical moments that are more important for future predictions. For example, the model might consider data points from one hour after a meal to be more important for current predictions than data points from four hours before a meal. However, this method's attention is intratemporal, only addressing the question of when something is important. Because all modal data are still bundled together, it cannot distinguish which physiological factor plays a dominant role at that important moment.
[0007] Preliminary modular multimodal models establish different processing pipelines for different types of data. For example, a convolutional neural network (CNN) is used to extract local features from CGM sequences, while a separate fully connected network processes static personal health records. Finally, the features extracted from these different pipelines are fused. However, the decoupling degree of this type of model is very limited, especially for the two core external factors, food (raising blood sugar) and insulin (lowering blood sugar), which have opposite and dynamically changing effects. Existing models usually still combine them into the same input stream for processing.
[0008] In summary, existing blood glucose prediction technologies have the following limitations in physiological mechanism characterization and model interpretability. First, physiological mechanisms are characterized vaguely: multi-source data with opposing physiological effects are simply fused, making it difficult for the model to learn clear physiological causal relationships. Second, "black box" models lack attribution: relying solely on intra-temporal attention, they can identify important time points but cannot attribute them to specific physiological events. Third, they lack dynamic adaptive capability: they fail to adjust the weights of the various factors according to the real-time physiological state, resulting in large prediction biases in complex scenarios. Fourth, they lack credible decision guidance: without native interpretability, users cannot formulate precise intervention measures from the prediction results, leading to low clinical trust and practicality.
Summary of the Invention
[0009] To address the aforementioned problems in existing technologies, this invention provides a multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms. The technical problem to be solved by this invention is addressed through the following technical solution, which includes the following steps: acquiring multimodal data, which includes at least continuous blood glucose monitoring data and multiple discrete physiological data; transforming the multiple discrete physiological data by physiological feature engineering to obtain multiple decoupled continuous physiological feature sequences; inputting the continuous blood glucose monitoring data and the multiple continuous physiological feature sequences respectively into multiple parallel physiological sub-modules for feature extraction to obtain multiple deep state features; using an inter-module dynamic attention unit to generate, at each prediction time step, attention weights corresponding to the multiple deep state features, and performing weighted fusion of the multiple deep state features based on the attention weights to obtain a context vector containing key physiological information; and inputting the context vector and the multiple attention weights into the prediction and explanation generation module to generate blood glucose prediction values and attribution explanation text in parallel.
[0010] In one embodiment of the present invention, each of the physiological sub-modules includes a first gated recurrent neural network and a second gated recurrent neural network stacked together. Both the first and second gated recurrent neural networks include 64 hidden units, and both use tanh as the activation function. The expression for the activation function is:
[0011] $$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
where $x$ denotes the one-dimensional value output by a neuron inside the gated recurrent network, and $e$ is the base of the natural logarithm. Alternatively, each of the physiological sub-modules may include a temporal convolutional network.
[0012] In one embodiment of the present invention, the inter-module dynamic attention unit includes a single-layer feedforward neural network and a Softmax function, the expression of which is:
[0013] $$\alpha_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$
where $z_i$ denotes the one-dimensional value output by the feedforward neural network for the i-th physiological sub-module, $\alpha_i$ is the corresponding attention weight, and $e$ is the base of the natural logarithm.
[0014] In one embodiment of the present invention, the plurality of discrete physiological data includes: discrete carbohydrate intake data and discrete insulin injection data.
[0015] In one embodiment of the present invention, transforming the plurality of discrete physiological data by physiological feature engineering to obtain a plurality of decoupled continuous physiological feature sequences includes: transforming the discrete carbohydrate intake data by physiological feature engineering to obtain the continuous carbohydrates on board (COB):
[0016] $$\mathrm{COB}(t) = \mathrm{COB}(t-1)\cdot e^{-\Delta t/\tau_{carb}} + \mathrm{Carb}(t)$$
where $\mathrm{COB}(t)$ is the carbohydrates on board at time $t$, $\mathrm{COB}(t-1)$ is the carbohydrates on board at time $t-1$, $\Delta t$ is the time step, $\tau_{carb}$ is the carbohydrate absorption time constant, and $\mathrm{Carb}(t)$ is the number of grams of carbohydrates newly ingested at the current moment. The discrete insulin injection data are transformed by physiological feature engineering to obtain the continuous insulin on board (IOB):
[0017] $$\mathrm{IOB}(t) = \mathrm{IOB}(t-1)\cdot e^{-\Delta t/\tau_{ins}} + \mathrm{Ins}(t)$$
where $\mathrm{IOB}(t)$ is the insulin on board at time $t$, $\mathrm{IOB}(t-1)$ is the insulin on board at time $t-1$, $\Delta t$ is the time step, $\tau_{ins}$ is the insulin action time constant, and $\mathrm{Ins}(t)$ is the insulin dose injected at the current moment. For different types of insulin, the total insulin on board is obtained by linearly adding the on-board amounts of the different insulin types.
[0018] In one embodiment of the present invention, the context vector is:
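[0019] $$c_t = \alpha_{carb}\,h_{carb} + \alpha_{ins}\,h_{ins} + \alpha_{bg}\,h_{bg}$$
[0020] where $c_t$ is the context vector; $\alpha_{carb}$, $\alpha_{ins}$, and $\alpha_{bg}$ are the attention weights for the carbohydrate intake data, insulin injection data, and blood glucose monitoring data, respectively; and $h_{carb}$, $h_{ins}$, and $h_{bg}$ are the deep state features of the carbohydrate intake data, insulin injection data, and blood glucose monitoring data, respectively.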
[0021] In one embodiment of the present invention, the plurality of discrete physiological data further includes discrete exercise record data.
[0022] In one embodiment of the present invention, converting the discrete exercise record data into a decoupled continuous exercise effect feature sequence includes: transforming the discrete exercise record data by physiological feature engineering to obtain the on-board exercise effect of a latent-effect compartment.
[0023] In the corresponding formula, the terms are the on-board exercise effect of the latent-effect compartment, the equivalent dose of exercise newly started at time t, the rise time constant, the time step, and the time t. The on-board exercise effect of the latent-effect compartment is then used to calculate the on-board exercise effect of the active-effect compartment, which serves as the decoupled continuous exercise effect feature sequence.
[0024] In the corresponding formula, the terms are the on-board exercise effect of the active-effect compartment, the decay time constant, and the transfer rate coefficient. The context vector is:
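[0025] $$c_t = \alpha_{carb}\,h_{carb} + \alpha_{ins}\,h_{ins} + \alpha_{bg}\,h_{bg} + \alpha_{ex}\,h_{ex}$$
[0026] where $c_t$ is the context vector; $\alpha_{carb}$, $\alpha_{ins}$, $\alpha_{bg}$, and $\alpha_{ex}$ are the attention weights for the carbohydrate intake data, insulin injection data, blood glucose monitoring data, and exercise record data, respectively; and $h_{carb}$, $h_{ins}$, $h_{bg}$, and $h_{ex}$ are the deep state features of the carbohydrate intake data, insulin injection data, blood glucose monitoring data, and exercise record data, respectively.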
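One way to realise a two-compartment exercise effect of the kind described (a latent-effect compartment feeding an active-effect compartment) is sketched below. The update rule is an assumption made for illustration: both compartments are given first-order exponential dynamics, and the function name `exercise_effect` and every default value (`tau_rise`, `tau_fall`, `k`) are hypothetical, not values taken from the embodiment.

```python
import math

def exercise_effect(doses, tau_rise=30.0, tau_fall=120.0, k=0.1, dt=5.0):
    """Illustrative two-compartment exercise effect (assumed form).

    doses    -- equivalent dose of newly started exercise at each time step
    tau_rise -- rise time constant of the latent compartment, minutes (hypothetical)
    tau_fall -- decay time constant of the active compartment, minutes (hypothetical)
    k        -- transfer rate coefficient from latent to active (hypothetical)
    dt       -- time step in minutes
    """
    rise = math.exp(-dt / tau_rise)
    fall = math.exp(-dt / tau_fall)
    latent = 0.0
    active = 0.0
    out = []
    for dose in doses:
        # Latent compartment: decays, then receives the new exercise dose.
        latent = latent * rise + dose
        # Active compartment: decays, then receives a share of the latent effect.
        active = active * fall + k * latent
        out.append(active)
    return out

# A single exercise bout at step 0 followed by rest.
doses = [1.0] + [0.0] * 59
effect = exercise_effect(doses)
peak_step = effect.index(max(effect))
```

Under these assumptions the active effect peaks some steps after the bout starts and then fades, which is the delayed response a two-compartment structure is meant to capture.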
[0027] In one embodiment of the present invention, the prediction and explanation generation module includes a blood glucose prediction value generation unit and an attribution explanation text generation unit; The blood glucose prediction generation unit includes a fully connected layer for extracting features from the context vector and outputting a blood glucose prediction value. The attribution explanation text generation unit is used to extract the maximum weight value at the current moment and its corresponding physiological submodule index from the multiple attention weights, and output a single dominant factor explanation or a summary attribution description based on the relationship between the maximum weight value and the threshold.
[0028] Another embodiment of the present invention provides a multimodal blood glucose prediction system based on physiological decoupling and dynamic attention mechanisms, for performing the method described in the above embodiments. The system comprises: a data acquisition module, used to acquire multimodal data, which includes at least continuous blood glucose monitoring data and multiple discrete physiological data; a physiological feature engineering module, used to transform the multiple discrete physiological data into multiple decoupled continuous physiological feature sequences through physiological feature engineering; a parallel decoupled modeling module, used to input the continuous blood glucose monitoring data and the multiple continuous physiological feature sequences into multiple parallel physiological sub-modules for feature extraction, thereby obtaining multiple deep state features; a dynamic attention fusion module, used to generate attention weights corresponding to the multiple deep state features at each prediction time step using the inter-module dynamic attention unit, and to perform weighted fusion of the multiple deep state features based on the attention weights to obtain a context vector containing key physiological information; and a prediction and explanation generation module, used to receive the context vector and the multiple attention weights and to generate blood glucose prediction values and attribution explanation text in parallel.
[0029] Compared with the prior art, the beneficial effects of the present invention are as follows:
1. This invention achieves physiological decoupling by transforming multiple discrete physiological data into physiological features through physiological feature engineering. Combined with the inter-module attention mechanism, it transforms the abstract prediction process into a quantitative analysis of the contribution of physiological factors such as food and insulin, and generates natural language explanations. This fundamentally solves the "black box" problem in the prior art, overcomes the trust gap caused by opaque model decision-making, and enables users to formulate precise intervention measures based on the prediction results.
2. This invention completely separates core physiological processes (such as food raising blood sugar and insulin lowering blood sugar) into independent modules for modeling, avoiding mutual interference between features, and fuses them through a dynamic attention mechanism between modules. This breaks through the accuracy bottleneck of existing technologies in multimodal data fusion, and is expected to reduce the prediction error in key scenarios (such as after meals) by 10%-20% or more.
3. The modular design of this invention learns the underlying physiological causal relationships rather than surface data correlations, and thus has stronger robustness and individual adaptability. It solves the problem that purely data-driven models generalize poorly and cannot adapt to changes in individual lifestyles, providing a reliable path to a truly individualized blood glucose model that can provide stable long-term service.
4. This invention not only solves the long-standing "black box" and accuracy bottleneck problems in the field of blood glucose prediction, but its "decoupling-fusion-prediction-interpretation" technical paradigm also has good scalability. It can flexibly add new parallel physiological modules (such as exercise effects) and can be extended to the comprehensive prediction and attribution analysis of multiple physiological factors such as blood pressure fluctuations and emotional stress, providing core algorithm support for the next generation of personalized digital diagnosis and intelligent chronic disease management platforms.
Attached Figure Description
[0030] Figure 1 is a flowchart of the multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms provided in an embodiment of the present invention.
Figure 2 is a schematic diagram of the module interaction for multimodal blood glucose prediction based on physiological decoupling and dynamic attention mechanisms provided in an embodiment of the present invention.
Figure 3 is a schematic diagram of the module interaction for another multimodal blood glucose prediction based on physiological decoupling and dynamic attention mechanisms provided in an embodiment of the present invention.
Detailed Implementation
[0031] The present invention will be further described in detail below with reference to specific embodiments, but the implementation of the present invention is not limited thereto.
[0032] This invention provides a multimodal interpretable blood glucose prediction system and method based on physiological decoupling and dynamic attention mechanisms. Through three core technologies—physiological feature engineering, parallel decoupling modeling, and dynamic attention fusion between modules—it achieves high-precision and highly reliable dynamic blood glucose prediction, and has significant advantages in "prediction accuracy, model interpretability, and clinical decision support".
[0033] Example 1. Referring to Figure 1 and Figure 2: Figure 1 is a flowchart of the multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms provided in an embodiment of the present invention, and Figure 2 is a schematic diagram of the module interaction for multimodal blood glucose prediction based on physiological decoupling and dynamic attention mechanisms provided in an embodiment of the present invention.
[0034] This method physiologically decouples the core physiological factors affecting blood glucose—carbohydrate intake and insulin injection—at the model architecture level, and performs real-time attribution and fusion through a dynamic attention mechanism between modules, ultimately outputting simultaneous blood glucose predictions and physiological attribution explanations. The specific steps include: Step 1: Acquire multimodal data, which includes at least continuous glucose monitoring (CGM) data and multiple discrete physiological data.
[0035] In this embodiment, multiple discrete physiological data include: discrete carbohydrate intake data recorded by the user (such as the number of grams of carbohydrates consumed per meal) and discrete insulin injection data (such as the dose of rapid-acting or long-acting insulin per injection). All data are aligned and normalized at a uniform time step (e.g., 5 minutes, consistent with the CGM sampling frequency).
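As a minimal sketch of the time alignment described above, the following places timestamped discrete events (meals or injections) onto a uniform 5-minute grid. The function name `align_events` and the choice to sum events that fall within the same step are assumptions for illustration; normalization is not shown.

```python
def align_events(events, n_steps, step_min=5):
    """Place discrete (minute, amount) events on a uniform time grid.

    events   -- iterable of (minutes_since_start, amount) pairs
    n_steps  -- number of grid steps to produce
    step_min -- grid spacing in minutes (5 matches the CGM example)
    Amounts falling in the same step are summed; events outside the
    grid are ignored.
    """
    grid = [0.0] * n_steps
    for minute, amount in events:
        idx = int(minute // step_min)
        if 0 <= idx < n_steps:
            grid[idx] += amount
    return grid

# Two carbohydrate entries close together and one an hour later (grams).
meals = [(12, 40.0), (14, 10.0), (63, 25.0)]
carb_grid = align_events(meals, n_steps=15)
```

The resulting list has one entry per CGM sample, so each discrete record lines up with the glucose reading taken in the same 5-minute window.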
[0036] Step 2: Perform physiological feature engineering on multiple discrete physiological data to obtain multiple decoupled continuous physiological feature sequences.
[0037] In this embodiment, the physiological feature engineering module is used to perform physiological feature engineering transformation on the acquired discrete physiological data, transforming them into a continuous physiological feature sequence that can characterize its continuous dynamic effects in vivo. This decouples the physiological driving factors and ensures that each physiological submodule receives pure, interference-free physiological signals.
[0038] Specifically, the physiological feature engineering module includes a carbohydrates-on-board (COB) calculation unit and an insulin-on-board (IOB) calculation unit.
[0039] The COB calculation unit applies a first-order exponential decay model to the discrete carbohydrate intake data $\mathrm{Carb}(t)$, performing physiological feature engineering to obtain the continuous carbohydrates on board, calculated using the following formula:
[0040] $$\mathrm{COB}(t) = \mathrm{COB}(t-1)\cdot e^{-\Delta t/\tau_{carb}} + \mathrm{Carb}(t)$$
where $\mathrm{COB}(t)$ is the carbohydrates on board at time $t$, $\mathrm{COB}(t-1)$ is the carbohydrates on board at time $t-1$, $\Delta t$ is the time step, $\tau_{carb}$ is the carbohydrate absorption time constant, and $\mathrm{Carb}(t)$ is the number of grams of carbohydrates newly ingested at the current moment. That is, the COB at time $t$ equals the COB at the previous moment after exponential decay governed by the carbohydrate absorption time constant $\tau_{carb}$, plus the carbohydrates newly ingested at the current moment. This formula simulates the process of food digestion and absorption in the body.
[0041] For example, the time step $\Delta t$ is consistent with the CGM sampling frequency and is fixed at 5 minutes. The carbohydrate absorption time constant $\tau_{carb}$ is a core adjustable parameter whose value typically ranges from 60 to 120 minutes; in this embodiment, $\tau_{carb}$ is set to 90 minutes to simulate the absorption rate of a mixed diet.
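The first-order decay described above (the previous carbohydrates on board shrink by an exponential factor each step, then the newly ingested grams are added) can be sketched as follows, using the 5-minute step and 90-minute time constant of this embodiment. The function name `cob_series` and the starting value of zero are assumptions for illustration.

```python
import math

def cob_series(carb_grams, dt_min=5.0, tau_min=90.0):
    """Carbohydrates on board at each time step.

    carb_grams -- grams newly ingested at each step (0 when nothing is eaten)
    dt_min     -- time step in minutes
    tau_min    -- carbohydrate absorption time constant in minutes
    """
    decay = math.exp(-dt_min / tau_min)
    cob = 0.0  # assumed: nothing on board before the first step
    out = []
    for carb in carb_grams:
        # Previous amount decays, then the new intake is added.
        cob = cob * decay + carb
        out.append(cob)
    return out

# A 60 g meal at step 1, then no further intake.
carbs = [0.0, 60.0] + [0.0] * 18
cob = cob_series(carbs)
```

With these settings, 18 steps (90 minutes, one time constant) after the meal the remaining amount is 60/e, roughly 22 g.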
[0042] The IOB calculation unit transforms the discrete insulin injection data by physiological feature engineering to obtain the continuous insulin on board. It employs a first-order exponential decay model similar to that of the COB unit, but with an independently configured time constant $\tau_{ins}$ for each insulin type; the on-board amounts of the different insulin types are added linearly to obtain the total insulin on board.
[0043] The formula used by the IOB calculation unit is:
[0044] $$\mathrm{IOB}(t) = \mathrm{IOB}(t-1)\cdot e^{-\Delta t/\tau_{ins}} + \mathrm{Ins}(t)$$
where $\mathrm{IOB}(t)$ is the insulin on board at time $t$, $\mathrm{IOB}(t-1)$ is the insulin on board at time $t-1$, $\Delta t$ is the time step, $\tau_{ins}$ is the insulin action time constant, and $\mathrm{Ins}(t)$ is the insulin dose newly injected at the current moment. That is, the IOB at time $t$ equals the IOB at the previous moment after exponential decay governed by the insulin action time constant $\tau_{ins}$, plus the insulin dose newly injected at the current moment. This formula simulates the pharmacokinetic process of insulin in the body.
[0045] For example, for bolus doses of rapid-acting insulin, the insulin action time constant $\tau_{ins}$ is set to 240 minutes; for basal long-acting insulin, $\tau_{ins}$ is set to 720 minutes. The system calculates the two IOB sequences separately and then adds them to obtain the total IOB sequence.
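The per-type calculation and linear summation described above can be sketched as follows, with the 240-minute and 720-minute time constants of this example and a 5-minute step. The function names `iob_series` and `total_iob`, and the example doses, are hypothetical.

```python
import math

def iob_series(doses, tau_min, dt_min=5.0):
    """Insulin on board for one insulin type (first-order exponential decay)."""
    decay = math.exp(-dt_min / tau_min)
    iob = 0.0  # assumed: no insulin on board before the first step
    out = []
    for dose in doses:
        iob = iob * decay + dose
        out.append(iob)
    return out

def total_iob(rapid_doses, long_doses, tau_rapid=240.0, tau_long=720.0):
    """Compute each insulin type separately, then add the two sequences."""
    rapid = iob_series(rapid_doses, tau_rapid)
    long_acting = iob_series(long_doses, tau_long)
    return [r + l for r, l in zip(rapid, long_acting)]

# 4 U rapid-acting and 10 U long-acting injected at step 0, followed for 4 hours.
rapid = [4.0] + [0.0] * 48
long_acting = [10.0] + [0.0] * 48
total = total_iob(rapid, long_acting)
```

Keeping one decay sequence per insulin type lets each type retain its own time constant, while the sub-module downstream still receives a single total IOB sequence.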
[0046] This embodiment transforms the original discrete event data into continuous feature sequences with clear physiological significance: COB sequences and IOB sequences, thereby achieving decoupling from the original blood glucose signal at the input level.
[0047] Step 3: Input the continuous blood glucose monitoring data and multiple continuous physiological feature sequences into multiple parallel physiological sub-modules for feature extraction to obtain multiple deep state features.
[0048] Specifically, the continuous blood glucose monitoring data, together with the continuous carbohydrates-on-board (COB) and insulin-on-board (IOB) sequences output by the physiological feature engineering module, are fed strictly one-to-one into three parallel, structurally independent physiological sub-modules for deep feature extraction, yielding multiple deep state features. This ensures that no crosstalk occurs in the early stages of modeling. The three physiological sub-modules are: the historical blood glucose sub-module, the food metabolism sub-module, and the insulin action sub-module.
[0049] In one specific embodiment, the three physiological sub-modules employ the same network structure. Each physiological sub-module includes a first gated recurrent neural network and a second gated recurrent neural network stacked together. Both the first and second gated recurrent neural networks include 64 hidden units, and the activation function for both is tanh. The expression for the activation function is:
[0050] $$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
where $x$ denotes the one-dimensional value output by a neuron inside the gated recurrent network, and $e$ is the base of the natural logarithm. For example, the gated recurrent neural networks may be GRUs or LSTMs.
[0051] In other embodiments, each physiological sub-module includes a temporal convolutional network (TCN). The TCN captures time-series dependencies through causal convolutions, dilated convolutions, and residual connections. For example, each sub-module may use a single TCN containing four residual blocks, with a kernel size of 3 and dilation factors that grow exponentially: 1, 2, 4, and 8.
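To give a feel for how much history such a TCN can see, the sketch below computes the receptive field of stacked causal dilated convolutions. The number of convolution layers inside each residual block is not stated here; `convs_per_block=2` is an assumption based on a common TCN design, and the function name is hypothetical.

```python
def tcn_receptive_field(kernel_size, dilations, convs_per_block=2):
    """Number of past-and-present time steps visible to the last output.

    Each causal convolution with dilation d extends the receptive field
    by (kernel_size - 1) * d steps. convs_per_block is an assumption.
    """
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)

rf_two_convs = tcn_receptive_field(3, [1, 2, 4, 8])                    # 61
rf_one_conv = tcn_receptive_field(3, [1, 2, 4, 8], convs_per_block=1)  # 31
```

With 5-minute samples, 61 steps correspond to about five hours of history and 31 steps to about two and a half hours, so the assumed block design noticeably changes how far back the sub-module can look.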
[0052] Compared to GRU, TCN computation can be fully parallelized, resulting in faster inference speeds and stronger memory capabilities for long sequences, making it more suitable for low-power real-time prediction tasks. The TCN model size can be further compressed through model quantization (such as INT8 quantization), facilitating deployment on mobile frameworks like Core ML. For example, in scenarios where it runs locally on resource-constrained edge devices (such as smartwatches), each physiological submodule can employ a TCN network.
[0053] Furthermore, parallel processing by the three physiological sub-modules yields three deep state features: the deep state feature $h_{carb}$ of the carbohydrate intake data, the deep state feature $h_{ins}$ of the insulin injection data, and the deep state feature $h_{bg}$ of the blood glucose monitoring data. Each deep state feature has 64 dimensions.
[0054] The parallel architecture in this embodiment ensures that each physiological submodule focuses on learning a single, pure physiological signal pattern, avoiding feature interference between different physiological effects.
[0055] Step 4: At each prediction time step, dynamic attention units between modules are used to generate attention weights corresponding to multiple deep state features. Based on the attention weights, multiple deep state features are weighted and fused to obtain a context vector containing key physiological information.
[0056] Specifically, the inter-module dynamic attention unit consists of a single-layer feedforward neural network and a Softmax function. The input dimension of the feedforward neural network is ... and its output dimension is 3; the Softmax function converts the output of the feedforward neural network into an attention weight vector whose elements sum to 1, and its expression is:
[0057] $$\alpha_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$
where $z_i$ denotes the one-dimensional value output by the feedforward neural network for the i-th physiological sub-module, $\alpha_i$ is the corresponding attention weight, and $e$ is the base of the natural logarithm.
[0058] Specifically, for the deep state features $h_{carb}$, $h_{ins}$, and $h_{bg}$ output by the three parallel physiological sub-modules, the inter-module dynamic attention unit generates, at each prediction time step, a set of three attention weights $\alpha_{carb}$, $\alpha_{ins}$, and $\alpha_{bg}$ that sum to 1. These weights directly quantify the contribution of the three major factors (food, insulin, and historical blood glucose) to the future blood glucose trend at the current moment.
[0059] Furthermore, the output of each physiological sub-module is multiplied by its corresponding attention weight and the results are summed, generating a context vector $c_t$ that contains key physiological information and is used for the final blood glucose prediction:
[0060] $$c_t = \alpha_{carb}\,h_{carb} + \alpha_{ins}\,h_{ins} + \alpha_{bg}\,h_{bg}$$
where the context vector $c_t$ represents the fused high-dimensional physiological features and has a dimension of 64; $\alpha_{carb}$, $\alpha_{ins}$, and $\alpha_{bg}$ are the attention weights for the carbohydrate intake data, insulin injection data, and blood glucose monitoring data, respectively; and $h_{carb}$, $h_{ins}$, and $h_{bg}$ are the deep state features of the carbohydrate intake data, insulin injection data, and blood glucose monitoring data, respectively.
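The weighting and fusion described in this step can be sketched as follows: a single linear layer scores the three 64-dimensional deep state features, a softmax turns the scores into weights that sum to 1, and the context vector is the weighted sum. Feeding the layer the concatenation of the three features (192 inputs) is an assumption, as are the names `softmax` and `fuse` and the random stand-in values used for features and layer weights.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(features, weight_rows, biases):
    """Return (attention weights, context vector) for equal-length features.

    weight_rows holds one row per sub-module, each spanning the
    concatenated features (assumed input layout).
    """
    concat = [v for feat in features for v in feat]
    scores = [sum(w * x for w, x in zip(row, concat)) + b
              for row, b in zip(weight_rows, biases)]
    alphas = softmax(scores)
    dim = len(features[0])
    context = [sum(a * feat[k] for a, feat in zip(alphas, features))
               for k in range(dim)]
    return alphas, context

random.seed(0)
DIM = 64
# Stand-ins for the sub-module outputs (carbohydrate, insulin, glucose).
h_carb = [random.uniform(-1.0, 1.0) for _ in range(DIM)]
h_ins = [random.uniform(-1.0, 1.0) for _ in range(DIM)]
h_bg = [random.uniform(-1.0, 1.0) for _ in range(DIM)]
# Untrained, randomly initialised layer parameters (illustrative only).
W = [[random.gauss(0.0, 0.1) for _ in range(3 * DIM)] for _ in range(3)]
b = [0.0, 0.0, 0.0]

alphas, c_t = fuse([h_carb, h_ins, h_bg], W, b)
```

Because the same three weights drive both the fusion and the later explanation text, the attribution reported to the user is by construction the one used for prediction.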
[0061] This embodiment dynamically adjusts the information flow based on real-time weight distribution, thereby achieving adaptive fusion for different physiological states (such as after a meal or at night).
[0062] Step 5: Input the context vector and multiple attention weights into the prediction and explanation generation module to generate blood glucose prediction values and attribution explanation text in parallel.
[0063] Specifically, the prediction and explanation generation module, as the system's output terminal, receives the fused context vector and the multiple attention weights, and generates the blood glucose prediction value and the attribution explanation text in parallel, so that the prediction and its physiological attribution explanation are produced synchronously.
[0064] The prediction and explanation generation module includes a blood glucose prediction value generation unit and an attribution explanation text generation unit. The blood glucose prediction value generation unit performs feature extraction on the context vector and outputs the predicted blood glucose value. The attribution explanation text generation unit extracts the maximum weight value at the current moment and its corresponding physiological sub-module index from the multiple attention weights, and outputs either a single dominant factor explanation or a summary attribution description depending on the relationship between the maximum weight value and a threshold.
[0065] In one specific embodiment, the blood glucose prediction value generation unit adopts a linear regression architecture, including a fully connected layer containing one output neuron. This layer does not use a non-linear activation function (i.e., it uses linear activation) to ensure the continuity and physical meaning of the output value. Its mathematical expression is:
[0066] $$\hat{y} = W c_t + b$$
where $c_t$ is the context vector, $W$ is the weight matrix of the fully connected layer, $b$ is the bias term, and $\hat{y}$ is the target predicted blood glucose concentration (unit: mg/dL).
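A fully connected layer with one output neuron and no non-linear activation reduces to a dot product plus a bias. The sketch below shows this with a shortened three-element stand-in for the 64-dimensional context vector; the function name `predict_glucose` and all numeric values are hypothetical.

```python
def predict_glucose(context, weights, bias):
    """Linear prediction head: weighted sum of the context vector plus bias."""
    if len(context) != len(weights):
        raise ValueError("context and weights must have the same length")
    return sum(w * c for w, c in zip(weights, context)) + bias

# Shortened stand-in for the 64-dimensional context vector.
context = [0.5, -0.25, 1.0]
weights = [20.0, 8.0, 4.0]
bias = 110.0
y_hat = predict_glucose(context, weights, bias)  # 122.0 (mg/dL)
```

Leaving out the activation keeps the output an unbounded real number, so it can be read directly as a concentration in mg/dL.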
[0067] In one specific embodiment, the attribution explanation text generation unit is not a traditional independent NLP module, but rather an attribution mapping engine directly built on top of the model's internal attention mechanism. Its core logic is to transform abstract tensor weights into natural language with clinical guidance significance. The attribution explanation text generation unit directly extracts the real-time weight output from the "inter-module dynamic attention" layer, ensuring strict consistency between the attribution results and the prediction logic.
[0068] The processing flow in the attribution explanation text generation unit includes: 1) Feature saliency extraction. The weight vector is monitored in real time to find the maximum weight value at the current moment and its corresponding physiological sub-module index, which determines whether food, insulin, or historical blood glucose is currently dominant.
[0069] 2) Determination of dominant factors. A configurable decision threshold is introduced. If the maximum weight exceeds the threshold, the system determines that the current prediction result has a single dominant factor and matches the corresponding explanatory text from a predefined physiological template library based on the index. If no weight exceeds the threshold, the result is determined to be a multi-factor coupling effect, and a summary attribution description is output.
[0070] 3) Template matching and synthesis. For example, when food is determined to be the dominant factor, the output is: "Predicted blood glucose trend is upward; the dominant factor is food intake (contribution). It is recommended to pay attention to the post-meal peak."
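The three processing steps above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function name `explain`, the factor ordering in `FACTORS`, the template wording, and the default threshold of 0.5 are all assumptions made for illustration (the example threshold value is not recoverable from the text).

```python
from typing import Sequence

# Hypothetical factor order and template wording; the text only describes the logic.
FACTORS = ("food intake", "insulin action", "historical blood glucose")

SINGLE_TEMPLATE = (
    "Predicted blood glucose trend is {trend}; the dominant factor is "
    "{factor} (contribution {share:.0%})."
)
MULTI_TEMPLATE = (
    "Predicted blood glucose trend is {trend}; no single factor dominates "
    "(multi-factor coupling): {summary}."
)


def explain(weights: Sequence[float], trend: str, threshold: float = 0.5) -> str:
    """Map inter-module attention weights to an attribution sentence.

    The default threshold is an assumed value, not one taken from the source.
    """
    if len(weights) != len(FACTORS):
        raise ValueError("one weight per physiological sub-module is required")
    # 1) Feature saliency extraction: largest weight and its sub-module index.
    top_index = max(range(len(weights)), key=lambda i: weights[i])
    top_weight = weights[top_index]
    # 2) Dominant factor determination against the configurable threshold.
    if top_weight > threshold:
        # 3) Template matching for a single dominant factor.
        return SINGLE_TEMPLATE.format(
            trend=trend, factor=FACTORS[top_index], share=top_weight
        )
    # Otherwise report a summary of all contributions.
    summary = ", ".join(f"{name} {w:.0%}" for name, w in zip(FACTORS, weights))
    return MULTI_TEMPLATE.format(trend=trend, summary=summary)
```

Because the sentence is built directly from the attention weights, the explanation cannot diverge from the weights that produced the prediction, which is the consistency property the embodiment emphasizes.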
[0071] This embodiment proposes a multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanism. By deeply integrating physiological prior knowledge and artificial intelligence technology, a high-precision and high-transparency dynamic blood glucose prediction system is constructed. The various physiological driving factors affecting blood glucose are completely decoupled at the model architecture level, and an "inter-module attention" mechanism is introduced to quantify and attribute the contribution of each module in real time and dynamically.
[0072] Physiological decoupling maps the core antagonistic effects of carbohydrate-induced glucose elevation and insulin-induced glucose reduction to independent parallel neural networks, resolving the fundamental problem of traditional models obscuring physiological mechanisms. Furthermore, the inter-module attention mechanism goes beyond existing technologies that focus on "intra-temporal attention" (attending to important time points within a single sequence), enabling higher-dimensional attribution and clearly answering the question of "which physiological factor" currently plays the dominant role. The synergy of these two approaches endows the model with "intrinsic interpretability" at its architectural level, thus overcoming the limitations of existing solutions that rely on "post-hoc explanation" tools and making simultaneous output of prediction and attribution possible.
[0073] This embodiment not only provides a novel paradigm for high-precision blood glucose prediction, but also holds promise for laying the technological foundation for a truly reliable and intervention-guided personalized decision support system. It is the first to propose combining physiological process decoupling with dynamic intermodal attribution, aiming to achieve integrated generation of prediction and explanation, demonstrating significant clinical value and industrialization potential.
[0074] Embodiment 2. Building upon Embodiment 1, this embodiment provides another multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms. This method adds a parallel exercise effect module to the system architecture of Embodiment 1. Please refer to Figure 3, which is a schematic diagram of module interaction for another multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms provided in an embodiment of the present invention.
[0075] This embodiment of the multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms includes the following steps: Step 1: Acquire multimodal data, which includes at least continuous blood glucose monitoring data and multiple discrete physiological data.
[0076] Specifically, the multiple discrete physiological data include: discrete carbohydrate intake data recorded by the user, discrete insulin injection data, and discrete exercise record data, the last of which includes an equivalent dose derived from exercise duration and intensity.
[0077] Step 2: Perform physiological feature engineering on multiple discrete physiological data to obtain multiple decoupled continuous physiological feature sequences.
[0078] Building upon the COB and IOB sequence generation in Embodiment 1, this embodiment adds an exercise-on-board (EOB) calculation unit. To accurately simulate the delayed onset and sustained effect of glucose consumption after exercise, this unit employs a two-compartment exponential model comprising a latent effect compartment and an active effect compartment.
[0079] The process of converting the discrete exercise record data into a decoupled continuous exercise effect feature sequence includes: first, transforming the discrete exercise record data by physiological feature engineering to obtain the on-board exercise effect of the latent effect compartment: $EOB_{lat}(t) = EOB_{lat}(t-1)\, e^{-\Delta t / \tau_{rise}} + D_{ex}(t)$
[0080] where $EOB_{lat}(t)$ is the on-board exercise effect of the latent effect compartment, $D_{ex}(t)$ is the equivalent dose of exercise newly started at time $t$ (which can be calculated from duration and intensity), $\tau_{rise}$ is the rise time constant, $\Delta t$ is the time step, and $t$ is the time. The formula simulates the accumulation and onset of the exercise effect: the latent effect decays according to the rise time constant $\tau_{rise}$ (which represents the time for the exercise effect to reach its peak and is set to 15-30 minutes depending on exercise intensity) and gradually transfers to the active effect compartment.
[0081] Then, the on-board exercise effect of the active effect compartment is calculated from that of the latent effect compartment and used as the final decoupled continuous exercise effect feature sequence: $EOB_{act}(t) = EOB_{act}(t-1)\, e^{-\Delta t / \tau_{fall}} + k\, EOB_{lat}(t)$
[0082] where $EOB_{act}(t)$ is the on-board exercise effect of the active effect compartment, $\tau_{fall}$ is the descent time constant, and $k$ is the transfer rate coefficient. The formula calculates the active effect that ultimately influences blood glucose levels and consists of two parts: the active effect from the previous time step, which decays according to the descent time constant $\tau_{fall}$ (which represents how long the exercise effect persists after exercise ends and can be set to 60-120 minutes to simulate post-exercise energy consumption); and the portion of the effect transferred from the latent effect compartment (the transfer rate is controlled by the coefficient $k$).
[0083] Through the two-compartment model, this embodiment smooths a "square wave" exercise input into an effect curve that conforms to physiological laws.
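The smoothing behaviour of the two-compartment model can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the claimed computation: the exact recurrence (exponential decay of each compartment plus an additive input), the function name `eob_sequence`, the 5-minute time step, and the value of the transfer coefficient `k` are all assumed; only the 15-30 minute and 60-120 minute ranges for the time constants come from the text.

```python
import math
from typing import List, Sequence


def eob_sequence(
    doses: Sequence[float],
    dt: float = 5.0,         # minutes per step (assumed sampling interval)
    tau_rise: float = 20.0,  # minutes, within the 15-30 range given in the text
    tau_fall: float = 90.0,  # minutes, within the 60-120 range given in the text
    k: float = 0.1,          # transfer rate coefficient (assumed value)
) -> List[float]:
    """Return the active-compartment exercise effect for each time step."""
    decay_lat = math.exp(-dt / tau_rise)
    decay_act = math.exp(-dt / tau_fall)
    latent, active = 0.0, 0.0
    out: List[float] = []
    for dose in doses:
        # Latent compartment: decay the previous value, then add the new dose.
        latent = latent * decay_lat + dose
        # Active compartment: decay the previous value, then receive a
        # share k of the current latent effect.
        active = active * decay_act + k * latent
        out.append(active)
    return out


# A "square wave" input: 30 minutes of exercise followed by rest.
square_wave = [0.0] * 2 + [1.0] * 6 + [0.0] * 40
effect = eob_sequence(square_wave)
peak_step = max(range(len(effect)), key=lambda i: effect[i])
```

With these assumed parameters the active effect keeps rising for several steps after the exercise input has returned to zero and then decays slowly, which is the delayed-onset, long-tail shape the paragraph describes.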
[0084] Step 3: Input the continuous blood glucose monitoring data and multiple continuous physiological feature sequences into multiple parallel physiological sub-modules for feature extraction to obtain multiple deep state features.
[0085] Specifically, the multiple parallel physiological sub-modules are: historical blood glucose sub-module, food metabolism sub-module, insulin action sub-module, and exercise effect sub-module.
[0086] The exercise effect sub-module uses the same network structure as the other three sub-modules. Its input is the EOB sequence, and its output is the deep state feature of the exercise effect, which has 64 dimensions.
[0087] Step 4: At each prediction time step, use the inter-module dynamic attention unit to generate attention weights corresponding to the multiple deep state features, and then use the attention weights to perform weighted fusion of the multiple deep state features to obtain a context vector containing key physiological information. The inter-module dynamic attention unit consists of a single-layer feedforward neural network and a softmax function. The input dimension of the feedforward neural network is expanded to accommodate the four deep state features, and its output dimension is 4, so that the output weight vector can dynamically arbitrate the contributions of the four physiological factors.
[0088] Furthermore, the formula for calculating the context vector is:
[0089] $c = \alpha_{carb} h_{carb} + \alpha_{ins} h_{ins} + \alpha_{bg} h_{bg} + \alpha_{ex} h_{ex}$
[0090] where $c$ is the context vector; $\alpha_{carb}$, $\alpha_{ins}$, $\alpha_{bg}$, and $\alpha_{ex}$ are the attention weights for the carbohydrate intake data, insulin injection data, blood glucose monitoring data, and exercise data, respectively; and $h_{carb}$, $h_{ins}$, $h_{bg}$, and $h_{ex}$ are the deep state features of the carbohydrate intake data, insulin injection data, blood glucose monitoring data, and exercise record data, respectively.
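A minimal Python sketch of the inter-module attention and weighted fusion follows. It assumes that the single-layer feedforward network scores the concatenation of the four deep state features; that reading, the names `softmax` and `fuse`, and the random placeholder parameters are assumptions for illustration, since the text does not fix how the features enter the network.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(
    features: Sequence[Vector],         # one deep state feature per sub-module
    weight: Sequence[Sequence[float]],  # (n_modules, n_modules * dim) matrix
    bias: Sequence[float],              # (n_modules,)
) -> Tuple[Vector, Vector]:
    """Return (attention weights, context vector) for one prediction step."""
    # Assumed input layout: all deep state features concatenated.
    concat = [x for feat in features for x in feat]
    # Single-layer feedforward network: one score per sub-module.
    scores = [
        sum(w * x for w, x in zip(row, concat)) + b
        for row, b in zip(weight, bias)
    ]
    alphas = softmax(scores)
    dim = len(features[0])
    # Weighted fusion: context = sum over modules of alpha_i * h_i.
    context = [
        sum(a * feat[d] for a, feat in zip(alphas, features))
        for d in range(dim)
    ]
    return alphas, context


# Usage with placeholder (untrained) parameters: 4 sub-modules, 64 dimensions.
rng = random.Random(0)
n_modules, dim = 4, 64
feats = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_modules)]
W = [[rng.gauss(0, 0.05) for _ in range(n_modules * dim)] for _ in range(n_modules)]
alphas, context = fuse(feats, W, [0.0] * n_modules)
```

The returned `alphas` always sum to 1, so they can be read directly as the relative contribution of each physiological factor at that time step.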
[0091] Step 5: Input the context vector and multiple attention weights into the prediction and explanation generation module to generate blood glucose prediction values and attribution explanation text in parallel.
[0092] This step uses the same method as in Example 1, and will not be described again in this example.
[0093] This embodiment demonstrates the flexibility and scalability of the invention's architecture through modular extension, enabling the easy integration of new physiological factors without altering the core decoupling and attention fusion framework.
[0094] Embodiment 3. Based on Embodiments 1 and 2, this embodiment provides a multimodal blood glucose prediction system based on physiological decoupling and dynamic attention mechanisms, used to perform the method of Embodiment 1 or 2. The system includes: a data acquisition module, used to acquire multimodal data, which includes at least continuous blood glucose monitoring data and multiple discrete physiological data; a physiological feature engineering module, used to transform the multiple discrete physiological data by physiological feature engineering into multiple decoupled continuous physiological feature sequences; a parallel decoupled modeling module, used to input the continuous blood glucose monitoring data and the multiple continuous physiological feature sequences into multiple parallel physiological sub-modules for feature extraction, thereby obtaining multiple deep state features; a dynamic attention fusion module, used to generate attention weights corresponding to the multiple deep state features at each prediction time step by using the inter-module dynamic attention unit, and to perform weighted fusion of the multiple deep state features based on the attention weights to obtain a context vector containing key physiological information; and a prediction and explanation generation module, used to receive the context vector and the multiple attention weights and to generate blood glucose prediction values and attribution explanation text in parallel.
[0095] Please refer to Embodiment 1 and Embodiment 2 for the specific execution steps and beneficial effects of each module in this embodiment, which will not be repeated here.
[0096] Embodiment 4. Based on Embodiments 1, 2, and 3, this embodiment provides a method for training a physiological decoupling and dynamic attention fusion model. The physiological decoupling and dynamic attention fusion model includes: a physiological feature engineering module, multiple parallel physiological sub-modules, an inter-module dynamic attention unit, and a prediction and explanation generation module. The prediction and explanation generation module includes a blood glucose prediction value generation unit and an attribution explanation text generation unit.
[0097] Training the physiological decoupling and dynamic attention fusion model involves training multiple parallel physiological sub-modules, inter-module dynamic attention units, and blood glucose prediction generation units. This training method includes: Step 1, set parameters: use Adam optimizer, set the initial learning rate to 0.001, and configure the learning rate decay strategy (if there is no improvement in the validation set loss after 5 epochs, the learning rate is multiplied by 0.5).
[0098] Step 2: Obtain the raw multimodal data, input it into the physiological decoupling and dynamic attention fusion model, and perform the following steps: perform physiological feature engineering transformation on the discrete data in the raw multimodal data to obtain multiple decoupled continuous physiological feature sequences; input the continuous blood glucose monitoring data and the multiple continuous physiological feature sequences into multiple parallel physiological sub-modules for feature extraction to obtain multiple deep state features; use the inter-module dynamic attention unit to generate attention weights corresponding to the multiple deep state features at each prediction time step, and perform weighted fusion of the multiple deep state features based on the attention weights to obtain a context vector containing key physiological information; input the context vector and the multiple attention weights into the prediction and explanation generation module to generate blood glucose prediction values and attribution explanation text in parallel. Step 3: Calculate the loss function and update the network parameters. The loss function used is the mean squared error (MSE), whose mathematical expression is: $MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
[0099] where $y_i$ is the actual blood glucose monitoring value of the i-th sample, $\hat{y}_i$ is the one-dimensional predicted value output by the model's fully connected layer for the i-th sample, and $n$ is the total number of samples in the batch. The batch size is set to 128 and the total number of training epochs to 100. An early stopping mechanism monitors the loss on the validation set: if there is no improvement within 10 epochs, training is terminated early to prevent overfitting.
[0100] Step 4: Repeat steps 2 and 3 above until the loss function value tends to stabilize.
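The loss and the two patience-based rules in the training method (learning rate halving after 5 epochs without improvement, early stopping after 10) can be sketched in plain Python as below. This is an illustrative sketch only: the function names `mse` and `run_schedule` are hypothetical, the validation losses are supplied by the caller instead of being produced by a real model, and the choice to restart the decay counter after each decay is an assumption the text does not settle.

```python
from typing import Iterable, List, Sequence, Tuple


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error over one batch."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("batches must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def run_schedule(
    val_losses: Iterable[float],
    lr: float = 0.001,        # initial learning rate from the text
    lr_patience: int = 5,     # epochs without improvement before decay
    lr_factor: float = 0.5,   # decay multiplier from the text
    stop_patience: int = 10,  # epochs without improvement before stopping
    max_epochs: int = 100,    # total training epochs from the text
) -> Tuple[int, float, List[float]]:
    """Replay per-epoch validation losses.

    Returns (epochs run, best validation loss, learning rate used per epoch).
    """
    best = float("inf")
    since_best = 0   # epochs since the best loss (drives early stopping)
    since_decay = 0  # epochs since the best loss or the last decay (assumed)
    lrs: List[float] = []
    epochs = 0
    for loss in val_losses:
        if epochs >= max_epochs:
            break
        epochs += 1
        lrs.append(lr)  # learning rate in effect during this epoch
        if loss < best:
            best, since_best, since_decay = loss, 0, 0
            continue
        since_best += 1
        since_decay += 1
        if since_best >= stop_patience:
            break  # early stopping
        if since_decay >= lr_patience:
            lr *= lr_factor
            since_decay = 0
    return epochs, best, lrs
```

In a real training loop the same counters would wrap the forward pass, the MSE computation on each batch of 128 samples, and the Adam update; deep learning frameworks typically ship equivalent plateau-based schedulers and early stopping utilities.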
[0101] This embodiment achieves physiological decoupling by transforming multiple discrete physiological data into physiological features through physiological feature engineering. Combined with the inter-module attention mechanism, the abstract prediction process is transformed into a quantitative analysis of the contribution of physiological factors such as food and insulin, and a natural language explanation is generated. This addresses the "black box" problem of the existing technology at its root, closes the trust gap caused by opaque model decisions, and enables users to formulate precise intervention measures based on the prediction results. This embodiment separates the core physiological processes (such as food raising blood glucose and insulin lowering blood glucose) into independent modules for modeling, avoiding mutual interference between features, and fuses them through the inter-module dynamic attention mechanism. It breaks through the accuracy bottleneck of existing technologies in multimodal data fusion and is expected to reduce the prediction error in key scenarios (such as post-meal) by 10%-20% or more. Because the modular design of this invention learns the underlying physiological causal relationships rather than surface data correlations, it has stronger robustness and individual adaptability. It addresses the poor generalization of purely data-driven models and their inability to adapt to changes in individual lifestyles, and provides a reliable path toward a truly personalized blood glucose model that can provide stable long-term service. In addition to addressing the long-standing "black box" and accuracy bottleneck problems in the field of blood glucose prediction, the "decoupling-fusion-prediction-explanation" technical paradigm of this embodiment has good scalability. It can flexibly add new parallel physiological modules (such as exercise effects) and can be extended to comprehensive prediction and attribution analysis of other multi-factor physiological phenomena such as blood pressure fluctuations and emotional stress, providing core algorithm support for the next generation of personalized digital diagnosis and intelligent chronic disease management platforms.
[0102] The above description, in conjunction with specific preferred embodiments, provides a further detailed explanation of the present invention. It should not be construed that the specific implementation of the present invention is limited to these descriptions. For those skilled in the art, various simple deductions or substitutions can be made without departing from the concept of the present invention, and all such modifications and substitutions should be considered within the scope of protection of the present invention.
Claims
1. A multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms, characterized in that it includes the following steps: acquiring multimodal data, which includes at least continuous blood glucose monitoring data and multiple discrete physiological data; transforming the multiple discrete physiological data by physiological feature engineering to obtain multiple decoupled continuous physiological feature sequences; inputting the continuous blood glucose monitoring data and the multiple continuous physiological feature sequences respectively into multiple parallel physiological sub-modules for feature extraction to obtain multiple deep state features; using an inter-module dynamic attention unit to generate, at each prediction time step, attention weights corresponding to the multiple deep state features, and performing weighted fusion of the multiple deep state features based on the attention weights to obtain a context vector containing key physiological information; and inputting the context vector and the multiple attention weights into a prediction and explanation generation module to generate blood glucose prediction values and attribution explanation text in parallel.
2. The multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanisms according to claim 1, characterized in that each of the physiological sub-modules includes a stacked first gated recurrent neural network and second gated recurrent neural network; both the first and second gated recurrent neural networks have 64 hidden units, and both use the tanh activation function, whose expression is: $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, where $x$ is the one-dimensional value output by a neuron within the gated recurrent neural network and $e$ is the base of the natural logarithm; alternatively, each of the physiological sub-modules includes a temporal convolutional network.
3. The method of claim 1, wherein the inter-module dynamic attention unit comprises a single-layer feedforward neural network and a Softmax function, the expression of which is: $\alpha_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$, where $z_i$ is the one-dimensional value output by the feedforward neural network for the i-th physiological sub-module, the sum runs over all physiological sub-modules, $\alpha_i$ is the resulting attention weight, and $e$ is the base of the natural logarithm.
4. The method of claim 1, wherein, The discrete physiological data include discrete carbohydrate intake data and discrete insulin injection data.
5. The method of claim 4, wherein transforming the multiple discrete physiological data by physiological feature engineering to obtain multiple decoupled continuous physiological feature sequences includes: transforming the discrete carbohydrate intake data by physiological feature engineering to obtain a continuous carbohydrates-on-board sequence: $COB(t) = COB(t-1)\, e^{-\Delta t / \tau_{carb}} + C(t)$, where $COB(t)$ is the amount of carbohydrates on board at time $t$, $COB(t-1)$ is the amount of carbohydrates on board at time $t-1$, $\Delta t$ is the time step, $\tau_{carb}$ is the time constant of carbohydrate absorption, and $C(t)$ is the number of grams of carbohydrates newly ingested at the current moment; and transforming the discrete insulin injection data by physiological feature engineering to obtain a continuous insulin-on-board sequence: $IOB(t) = IOB(t-1)\, e^{-\Delta t / \tau_{ins}} + I(t)$, where $IOB(t)$ is the amount of insulin on board at time $t$, $IOB(t-1)$ is the amount of insulin on board at time $t-1$, $\tau_{ins}$ is the time constant of insulin action, and $I(t)$ is the insulin dose injected at the current moment; for different types of insulin, the total insulin on board is obtained by linearly summing the insulin on board of each type.
6. The multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanism according to claim 4, characterized in that the context vector is: $c = \alpha_{carb} h_{carb} + \alpha_{ins} h_{ins} + \alpha_{bg} h_{bg}$, where $c$ is the context vector; $\alpha_{carb}$, $\alpha_{ins}$, and $\alpha_{bg}$ are the attention weights for the carbohydrate intake data, insulin injection data, and blood glucose monitoring data, respectively; and $h_{carb}$, $h_{ins}$, and $h_{bg}$ are the deep state features of the carbohydrate intake data, insulin injection data, and blood glucose monitoring data, respectively.
7. The multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanism according to claim 4, characterized in that the multiple discrete physiological data also include discrete exercise record data.
8. The multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanism according to claim 7, characterized in that the process of converting the discrete exercise record data into a decoupled continuous exercise effect feature sequence includes: transforming the discrete exercise record data by physiological feature engineering to obtain the on-board exercise effect of the latent effect compartment: $EOB_{lat}(t) = EOB_{lat}(t-1)\, e^{-\Delta t / \tau_{rise}} + D_{ex}(t)$, where $EOB_{lat}(t)$ is the on-board exercise effect of the latent effect compartment, $D_{ex}(t)$ is the equivalent dose of exercise newly started at time $t$, $\tau_{rise}$ is the rise time constant, $\Delta t$ is the time step, and $t$ is the time; calculating the on-board exercise effect of the active effect compartment from that of the latent effect compartment and using it as the decoupled continuous exercise effect feature sequence: $EOB_{act}(t) = EOB_{act}(t-1)\, e^{-\Delta t / \tau_{fall}} + k\, EOB_{lat}(t)$, where $EOB_{act}(t)$ is the on-board exercise effect of the active effect compartment, $\tau_{fall}$ is the descent time constant, and $k$ is the transfer rate coefficient; and the context vector is: $c = \alpha_{carb} h_{carb} + \alpha_{ins} h_{ins} + \alpha_{bg} h_{bg} + \alpha_{ex} h_{ex}$, where $c$ is the context vector; $\alpha_{carb}$, $\alpha_{ins}$, $\alpha_{bg}$, and $\alpha_{ex}$ are the attention weights for the carbohydrate intake data, insulin injection data, blood glucose monitoring data, and exercise data, respectively; and $h_{carb}$, $h_{ins}$, $h_{bg}$, and $h_{ex}$ are the deep state features of the carbohydrate intake data, insulin injection data, blood glucose monitoring data, and exercise record data, respectively.
9. The multimodal blood glucose prediction method based on physiological decoupling and dynamic attention mechanism according to claim 1, characterized in that, The prediction and explanation generation module includes a blood glucose prediction value generation unit and an attribution explanation text generation unit; The blood glucose prediction generation unit includes a fully connected layer for extracting features from the context vector and outputting a blood glucose prediction value. The attribution explanation text generation unit is used to extract the maximum weight value at the current moment and its corresponding physiological submodule index from the multiple attention weights, and output a single dominant factor explanation or a summary attribution description based on the relationship between the maximum weight value and the threshold.
10. A multimodal blood glucose prediction system based on physiological decoupling and dynamic attention mechanisms, characterized in that the system is used to perform the method as described in any one of claims 1-9 and includes: a data acquisition module, used to acquire multimodal data, which includes at least continuous blood glucose monitoring data and multiple discrete physiological data; a physiological feature engineering module, used to transform the multiple discrete physiological data by physiological feature engineering into multiple decoupled continuous physiological feature sequences; a parallel decoupled modeling module, used to input the continuous blood glucose monitoring data and the multiple continuous physiological feature sequences into multiple parallel physiological sub-modules for feature extraction, thereby obtaining multiple deep state features; a dynamic attention fusion module, used to generate attention weights corresponding to the multiple deep state features at each prediction time step using an inter-module dynamic attention unit, and to perform weighted fusion of the multiple deep state features based on the attention weights to obtain a context vector containing key physiological information; and a prediction and explanation generation module, used to receive the context vector and the multiple attention weights and to generate blood glucose prediction values and attribution explanation text in parallel.