A multi-modal data fusion intelligent analysis system and method

The multimodal data fusion intelligent analysis system solves the problems of integrating multimodal medical data and allocating resources, enabling a comprehensive assessment of patients' health status and the generation of personalized treatment plans, thereby improving the utilization rate of medical resources and the patient's medical experience.

CN122291010A · Pending · Publication Date: 2026-06-26 · JIANGSU PROVINCE HOSPITAL (THE FIRST AFFILIATED HOSPITAL OF NANJING MEDICAL UNIVERSITY)

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
JIANGSU PROVINCE HOSPITAL (THE FIRST AFFILIATED HOSPITAL OF NANJING MEDICAL UNIVERSITY)
Filing Date
2026-03-27
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing technologies struggle to effectively integrate multimodal medical data, resulting in information silos, uneven allocation of medical resources, and poor patient experience.

Method used

Design a multimodal data fusion intelligent analysis system to generate comprehensive and reliable patient health assessments through integrated access to multimodal data, deep feature extraction and weighted fusion, and intelligent resource scheduling based on the fusion results.

Benefits of technology

It enables a comprehensive and reliable assessment of patients' health status, generates personalized treatment plans, and improves the utilization rate of medical resources and the efficiency of patients' medical visits.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122291010A_ABST
Patent Text Reader

Abstract

This invention discloses a multimodal data fusion intelligent analysis system and method, including a multimodal data integrated access and acquisition module, a data fusion and intelligent analysis module, and an intelligent decision-making and resource scheduling module. This system integrates medical data from multiple heterogeneous sources and performs fusion analysis and decision support through advanced machine learning models, aiming to improve diagnostic and treatment efficiency, assist clinical decision-making, and optimize the allocation of medical resources.

Description

Technical Field

[0001] This invention relates to the field of intelligent analysis technology, and in particular to a multimodal data fusion intelligent analysis system and method.

Background Technology

[0002] Currently, medical data is experiencing explosive growth and comes in various forms, including structured text (such as electronic medical records and laboratory reports), unstructured text (such as doctor's consultation records), medical images (such as CT scans and MRI scans), and time-series data (such as vital sign monitoring). However, this multimodal data is often scattered across different systems, lacking effective integration and in-depth analysis, resulting in information silos and making it difficult to form a comprehensive and accurate assessment of patients' health status.

[0003] Meanwhile, existing technologies, when processing multimodal data, typically employ simple data stacking or rule-based fusion methods, failing to fully consider the differences in the contribution of different modalities to the final decision, the reliability of the data itself, and the semantic relationships between modalities. This results in biased or unreliable analysis results. Furthermore, in terms of medical resource allocation, existing systems are mostly based on simple queuing rules or departmental availability, failing to combine the complexity of patient conditions and the professional matching of departments for dynamic and intelligent triage and patient flow management. This leads to uneven utilization of medical resources and a poor patient experience.

[0004] Therefore, there is an urgent need to design an intelligent analysis system that can deeply integrate multimodal medical data and perform intelligent analysis based on the fusion results.

Summary of the Invention

[0005] The purpose of this invention is to provide a multimodal data fusion intelligent analysis system and method to solve the problems existing in the prior art. It can achieve standardized access and efficient collection of multi-source heterogeneous medical data; generate comprehensive and reliable patient health status assessments through deep feature extraction and weighted fusion; generate personalized diagnosis and treatment plans and risk assessments based on the fusion results; and achieve intelligent matching and guidance of patients and departmental resources by combining real-time hospital operation data.

[0006] To achieve the above objectives, the present invention provides a multimodal data fusion intelligent analysis system, which includes:

[0007] The multimodal data integrated access and acquisition module is used to acquire and preprocess multimodal case data and epidemiological data of patients from multiple heterogeneous data sources;

[0008] The data fusion and intelligent analysis module interacts with the multimodal data integrated access and acquisition module; it is used to extract and align features from the preprocessed data, and analyze it based on the training submodule to output patient physical condition assessment and target medical information; wherein, the target medical information includes the patient's department of care and medical assessment suggestions;

[0009] The intelligent decision-making and resource scheduling module interacts with the data fusion and intelligent analysis module to complete resource allocation based on the target medical information and the hospital's real-time resource status.

[0010] The multimodal data integrated access and acquisition module includes an information acquisition port and a transfer module. It is equipped with multiple standardized data interfaces for connecting different medical data sources. Its internal transfer module uses a normalization algorithm to pre-analyze and standardize the raw data.

[0011] The data acquisition module incorporates multiple medical device communication protocols and employs an incremental learning mechanism to update data acquisition and processing strategies.

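[0012] The parameter update function of the incremental learning mechanism is:

[0013] $\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta_t; B_t) + \mu (\theta_t - \theta_{t-1})$;

[0014] where $\theta_t$ denotes the model parameters, $\eta$ the learning rate, $\mu$ the momentum coefficient, and $B_t$ the update data batch.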

[0015] The data fusion and intelligent analysis module includes a multimodal feature extraction and alignment submodule, which extracts medical image features based on convolutional neural networks, extracts text features based on pre-trained language models, extracts temporal features based on temporal networks, and uses cross-modal attention mechanisms or variational autoencoders to map different modal features to a shared latent space for alignment.

[0016] The weighted fusion and model training submodule dynamically generates fusion weights for each modality based on the calculated information reliability weights and task importance weights, and performs feature fusion; it then trains a multi-task model based on the fused features.

[0017] The intelligent decision-making and resource scheduling module includes a solution and risk assessment sub-module, which generates multiple treatment plans and calculates the risk assessment score for each plan based on the analysis results and medical knowledge graph.

[0018] The decision model deployment and resource scheduling submodule includes a matching evaluation function to assess the compatibility between patients and departments and to guide and triage patients accordingly.

[0019] A multimodal data fusion intelligent analysis method, which uses the multimodal data fusion intelligent analysis system described above, includes the following steps:

[0020] Data access and acquisition: Through the multimodal data integrated access and acquisition module, the patient's time-series vital signs data, textual symptom description data and medical imaging data are acquired in real time from multiple heterogeneous data sources, and normalized preprocessing is performed; the data acquisition module dynamically updates its data acquisition and processing strategy through an incremental learning mechanism.

[0021] Data fusion and analysis: Features are extracted from the time-series data, text data, and image data through the data fusion and intelligent analysis module; the extracted features of different modalities are associated and aligned using the cross-modal alignment module; the fusion weights of each modal feature are dynamically calculated based on information reliability and task importance, and weighted fusion is performed to generate fused features; the fused features are input into the multi-task model for analysis and reasoning, and the analysis results include patient physical condition assessment, multiple treatment plans, and corresponding risk assessments.

[0022] Intelligent Decision-Making and Scheduling: Based on the analysis results obtained from the above steps and combined with the hospital's real-time resource status information, the intelligent decision-making and resource scheduling module calculates the compatibility between the current patient and each medical resource using a matching degree evaluation function. Based on the compatibility degree evaluation results, the optimal triage and resource scheduling plan is automatically generated and executed.

[0023] The dynamic calculation of fusion weights for each modal feature based on information reliability and task importance includes: calculating the information reliability weight $r_i$ of each modality's data, based on the authority of the data source, the accuracy of the data acquisition equipment, and the signal-to-noise ratio of the data itself; calculating the task importance weight $t_i$ of each modality's data, which assesses the contribution of that modality to achieving the objective of the current analysis task; and dynamically generating the final fusion weight of each modality using the formula $w_i = \alpha r_i + \beta t_i$, where $\alpha$ and $\beta$ are learnable parameters.

[0024] The training method for the multi-task model is as follows: a training dataset containing multimodal medical data is constructed and labeled, and the model is trained with the multi-task loss function:

[0025] $L_{total} = \lambda_1 L_{cls} + \lambda_2 L_{risk} + \lambda_3 L_{rec}$;

[0026] where $L_{cls}$ is the disease classification loss; $L_{risk}$ is the risk assessment loss; $L_{rec}$ is the feature reconstruction loss; and $\lambda_1$, $\lambda_2$, $\lambda_3$ are the weighting coefficients for each loss term.

[0027] The matching degree evaluation function is:

[0028] $M(p, d) = \omega_1 \frac{|T_p \cap S_d|}{|T_p|} - \omega_2 D(p, d)$;

[0029] where $T_p$ is the set of treatment types required by patient $p$; $S_d$ is the set of professional skills of department $d$; $D(p, d)$ is the physical distance or transfer distance from the patient to the department; and $\omega_1$, $\omega_2$ are adjustable weighting coefficients.
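As an illustration of a set-based matching degree, the following minimal sketch assumes that the match between a patient and a department is measured as the fraction of the patient's required treatment types that the department can provide, with distance entering as a linear penalty. The coverage ratio, the default weights, and all names are assumptions made for illustration only (a Jaccard ratio would be another reasonable choice).

```python
from typing import Set


def coverage(required: Set[str], offered: Set[str]) -> float:
    """Fraction of the required treatment types that the department offers."""
    if not required:
        # Assumption: a patient with no stated requirement scores 0.
        return 0.0
    return len(required & offered) / len(required)


def matching_degree(required: Set[str], offered: Set[str], distance: float,
                    w_cover: float = 1.0, w_dist: float = 0.1) -> float:
    """Coverage reward minus a linear distance penalty (weights are hypothetical)."""
    return w_cover * coverage(required, offered) - w_dist * distance


# Hypothetical patient needs and department capabilities.
needs = {"coronary angiography", "ecg monitoring"}
cardiology = {"coronary angiography", "ecg monitoring", "echocardiography"}
general_medicine = {"ecg monitoring"}

print(matching_degree(needs, cardiology, distance=2.0))
print(matching_degree(needs, general_medicine, distance=1.0))
```

In this toy case the cardiology department scores higher despite the longer distance, because it covers every required treatment type.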

[0030] This invention discloses the following technical effects: The multimodal data fusion intelligent analysis system of this invention overcomes the limitations of a single data source through deep fusion and dynamic weighting of multimodal data, enabling a more comprehensive and reliable assessment of patients' health status. It not only provides diagnostic suggestions but also generates multiple personalized treatment plans with accompanying risk assessments, assisting doctors in clinical decision-making. By dynamically combining patients' conditions with the real-time resource status of the hospital, it achieves precise triage, shortens patient waiting times, and improves departmental efficiency and overall utilization of medical resources. The incremental learning mechanism at the data acquisition end and the dynamic weighting mechanism at the fusion end enable the system to adapt to changes in data distribution and the access of new devices, exhibiting excellent scalability.

Attached Figure Description

[0031] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0032] Figure 1 is a schematic diagram of the complete architecture of the system of the present invention.

Detailed Implementation

[0033] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0034] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0035] In a specific embodiment of the present invention, the present invention provides a multimodal data fusion intelligent analysis system, including a multimodal data integrated access and acquisition module, a data fusion and intelligent analysis module, and an intelligent decision-making and resource scheduling module;

[0036] The multimodal data integrated access and acquisition module is responsible for acquiring data from multiple heterogeneous data sources, both internal and external, and performing preliminary processing. Specifically, it includes an information acquisition port and transfer module and a data acquisition module. The information acquisition port and transfer module has multiple standardized data interfaces for connecting to hospital information systems (HIS), laboratory information systems (LIS), picture archiving and communication systems (PACS), wearable devices, public health data platforms, and the like.

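[0037] Furthermore, the port integrates a transfer module that pre-analyzes the incoming raw data, employing a normalization algorithm to unify data of different scales and dimensions into the [0,1] interval. For numerical data, a min-max normalization function is used:

[0038] $x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$, where $x$ is the raw value and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of that feature.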

[0039] In another embodiment of the present invention, categorical data (such as symptom descriptions) is vectorized using one-hot encoding or word embedding techniques, providing a unified data foundation for subsequent analysis.
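The two preprocessing steps described above can be sketched as follows. This is a minimal illustration, assuming min-max scaling per feature column and one-hot encoding over a sorted vocabulary; the function names, the handling of constant columns, and the sample values are hypothetical.

```python
from typing import Dict, List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale numeric values into [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column maps to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels: Sequence[str]) -> List[List[int]]:
    """One-hot encode categorical labels using a sorted vocabulary."""
    vocab: Dict[str, int] = {lab: i for i, lab in enumerate(sorted(set(labels)))}
    return [[1 if vocab[lab] == j else 0 for j in range(len(vocab))]
            for lab in labels]


heart_rates = [60.0, 90.0, 120.0]        # hypothetical readings (beats per minute)
symptoms = ["cough", "fever", "cough"]   # hypothetical symptom labels

print(min_max_normalize(heart_rates))    # [0.0, 0.5, 1.0]
print(one_hot_encode(symptoms))          # [[1, 0], [0, 1], [1, 0]]
```

In a streaming setting the minimum and maximum would have to be tracked per feature over time rather than recomputed per call; the sketch ignores that.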

[0040] The data acquisition module incorporates various device-specific acquisition protocols (such as DICOM for medical imaging and HL7 / FHIR for text data exchange) to ensure seamless integration with devices from different manufacturers. To address the continuous generation of data, this module introduces an incremental learning mechanism. Its core is an online learning function that learns as new data batches arrive. When a new data batch $B_t$ arrives, the model parameters $\theta$ are updated as follows:

[0041] $\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta_t; B_t) + \mu (\theta_t - \theta_{t-1})$;

[0042] where $\eta$ is the learning rate, $\nabla_\theta L(\theta_t; B_t)$ is the loss gradient on the new data, and $\mu$ is the momentum coefficient, which smooths parameter updates and prevents drastic fluctuations in model performance (catastrophic forgetting) caused by new data. This mechanism allows the system to continuously optimize as data grows, improving the adaptability and efficiency of the acquisition and preprocessing process.
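A minimal sketch of a momentum-smoothed online update is given below, assuming the classical heavy-ball form, in which each step subtracts the scaled gradient on the newly arrived batch and adds a momentum term proportional to the previous parameter change. The toy model (one-parameter linear regression with squared loss), the hyperparameter values, and all names are illustrative assumptions rather than part of the disclosure.

```python
from typing import List, Tuple

Batch = List[Tuple[float, float]]  # (x, y) pairs; hypothetical toy data


def batch_gradient(theta: float, batch: Batch) -> float:
    """Gradient of the mean squared error for the model y_hat = theta * x."""
    return sum(2.0 * (theta * x - y) * x for x, y in batch) / len(batch)


def momentum_update(theta: float, theta_prev: float, grad: float,
                    lr: float = 0.05, momentum: float = 0.5) -> float:
    """One heavy-ball step: theta - lr * grad + momentum * (theta - theta_prev)."""
    return theta - lr * grad + momentum * (theta - theta_prev)


def fit_stream(batches: List[Batch], theta0: float = 0.0) -> float:
    """Update the parameter once per arriving batch, never revisiting old data."""
    theta_prev, theta = theta0, theta0
    for batch in batches:
        grad = batch_gradient(theta, batch)
        # The right-hand side is evaluated first, so theta_prev receives the old theta.
        theta, theta_prev = momentum_update(theta, theta_prev, grad), theta
    return theta


# Hypothetical stream in which every batch follows y = 2x.
stream = [[(1.0, 2.0), (2.0, 4.0)] for _ in range(200)]
print(round(fit_stream(stream), 3))
```

The momentum term only damps step-to-step fluctuation; guarding against forgetting of older data distributions would in practice need additional measures such as replaying stored samples, which the sketch does not cover.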

[0043] Furthermore, the data fusion and intelligent analysis module is the core of this invention, responsible for deep feature extraction, alignment, weighted fusion, and comprehensive analysis of the preprocessed multimodal data.

[0044] The multimodal feature extraction and alignment submodule employs a multi-branch neural network architecture. For medical image data, a convolutional neural network (CNN) is used to extract visual features. For text reports and in-person consultation records, pre-trained language models (such as BERT) are used to extract semantic features. For time-series data such as test indicators, temporal convolutional networks (TCN) or long short-term memory networks (LSTM) are used to extract temporal features.

[0045] Subsequently, a cross-modal alignment module is used to map features from different modalities to a shared semantic latent space using a variational autoencoder (VAE) or attention mechanism, thereby achieving feature alignment and obtaining an aligned feature set.
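The attention route to alignment can be illustrated with a minimal scaled dot-product sketch, in which a feature vector from one modality (the query) is re-expressed as a weighted mix of feature vectors from another modality. It assumes that both modalities have already been projected to a common dimension by their encoders; the learned projection matrices of a real attention layer are omitted, the VAE alternative is not shown, and all names and vectors are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cross_modal_attention(query: Vector, keys: List[Vector],
                          values: List[Vector]) -> Vector:
    """Attend from one modality's query over another modality's key/value vectors."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


# Hypothetical features: one text vector attending over two image-region vectors.
text_query = [1.0, 0.0]
image_keys = [[1.0, 0.0], [0.0, 1.0]]
image_values = [[0.9, 0.1], [0.2, 0.8]]

print(cross_modal_attention(text_query, image_keys, image_values))
```

The output lies closer to the first image region, whose key is more similar to the text query.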

[0046] The core innovation of the weighted fusion and model training submodule is a dynamic weighted fusion mechanism based on information reliability and task importance. First, an information reliability weight $r_i$ is calculated for each modality of data, comprehensively considering factors such as the accuracy of the data source equipment, data integrity, and the freshness of the data collection time. Second, for the current analysis task (such as disease diagnosis or risk assessment), a task-driven importance assessment network automatically learns and outputs the task importance weight $t_i$ of each modality's features. Finally, the fusion weight $w_i$ of each modality is determined by both factors, according to the formula:

[0047] $w_i = \alpha r_i + \beta t_i$;

[0048] where $\alpha$ and $\beta$ are learnable hyperparameters used to balance reliability and importance. The weighted fused comprehensive feature vector is:

[0049] $F = \sum_i w_i f_i$, where $f_i$ is the aligned feature of modality $i$.
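The weighted fusion step can be sketched as follows, assuming that each modality's fusion weight is a linear combination of its reliability weight and its task importance weight, and that the fused vector is the weighted sum of the aligned per-modality features. In this sketch the balancing coefficients are fixed constants rather than learned values, and all names and numbers are hypothetical.

```python
from typing import Dict, List


def fusion_weights(reliability: Dict[str, float], importance: Dict[str, float],
                   alpha: float = 0.5, beta: float = 0.5) -> Dict[str, float]:
    """Per-modality weight: alpha * reliability + beta * importance (assumed form)."""
    return {m: alpha * reliability[m] + beta * importance[m] for m in reliability}


def fuse(features: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted sum of aligned feature vectors; all vectors share one dimension."""
    dim = len(next(iter(features.values())))
    return [sum(weights[m] * features[m][j] for m in features) for j in range(dim)]


# Hypothetical scores and aligned features for three modalities.
reliability = {"ecg": 0.9, "text": 0.5, "vitals": 0.7}
importance = {"ecg": 0.8, "text": 0.4, "vitals": 0.6}
features = {"ecg": [1.0, 0.0], "text": [0.0, 1.0], "vitals": [0.5, 0.5]}

weights = fusion_weights(reliability, importance)
print(weights)
print(fuse(features, weights))
```

Normalizing the weights so that they sum to one (for example with a softmax) is a common variant but is not assumed here.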

[0050] In one specific embodiment of the present invention, the fused feature $F$ is used to train a multi-task learning model; the training tasks include disease classification, risk level prediction, and treatment plan recommendation. The training process adopts an end-to-end approach, and the loss function $L_{total}$ is the weighted sum of the losses for each task:

[0051] $L_{total} = \lambda_1 L_{cls} + \lambda_2 L_{risk} + \lambda_3 L_{plan}$;

[0052] where $L_{cls}$ is the cross-entropy disease classification loss; $L_{risk}$ is the mean squared error regression loss, used for risk assessment; $L_{plan}$ is the recommendation loss based on reinforcement learning or sequence generation; and $\lambda_1$, $\lambda_2$, $\lambda_3$ are the weighting coefficients for each loss term.
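A minimal sketch of the weighted multi-task loss follows. Cross-entropy and mean squared error are computed directly; the recommendation loss is passed in as a precomputed scalar, because the text leaves its form open (reinforcement learning or sequence generation). The default weights and all names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def cross_entropy(probs: Sequence[float], true_class: int) -> float:
    """Negative log-likelihood of the true disease class."""
    return -math.log(probs[true_class])


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error, used here for the risk score regression."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_task_loss(l_cls: float, l_risk: float, l_plan: float,
                    weights: Tuple[float, float, float] = (1.0, 0.5, 0.5)) -> float:
    """Weighted sum of the three task losses (weights are illustrative)."""
    w_cls, w_risk, w_plan = weights
    return w_cls * l_cls + w_risk * l_risk + w_plan * l_plan


# Hypothetical model outputs for one training sample.
l_cls = cross_entropy([0.7, 0.2, 0.1], true_class=0)
l_risk = mse([0.8], [1.0])
l_plan = 0.3  # placeholder for a recommendation loss computed elsewhere

print(multi_task_loss(l_cls, l_risk, l_plan))
```

In an end-to-end setup the three terms would share the fused feature as input, so the weights control how strongly each task shapes the shared representation.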

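[0053] Furthermore, optimization uses the adaptive moment estimation (Adam) optimizer, whose update rule is as follows:

[0054] $m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t$;

[0055] $v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$;

[0056] $\theta_{t+1} = \theta_t - \eta \frac{m_t}{\sqrt{v_t} + \epsilon}$;

[0057] where $g_t$ is the gradient; $m_t$ and $v_t$ are the first and second moment estimates of the gradient, respectively; $\beta_1$ and $\beta_2$ are the decay rates; $\eta$ is the learning rate; and $\epsilon$ is a small constant for numerical stability.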

[0058] This optimizer can effectively handle sparse gradients and accelerate model convergence.
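For illustration, a single-parameter Adam step can be sketched in plain Python as below. The bias-correction step comes from the standard Adam algorithm and can be switched off with the `bias_correction` flag to obtain the plain three-equation form; the quadratic objective, the hyperparameter values, and all names are assumptions for the example.

```python
import math
from typing import Tuple


def adam_step(theta: float, grad: float, m: float, v: float, t: int,
              lr: float = 0.1, beta1: float = 0.9, beta2: float = 0.999,
              eps: float = 1e-8, bias_correction: bool = True
              ) -> Tuple[float, float, float]:
    """One Adam update; returns the new (theta, m, v). The step count t starts at 1."""
    m = beta1 * m + (1.0 - beta1) * grad          # first moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second moment estimate
    m_hat, v_hat = m, v
    if bias_correction:
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(steps: int = 500, target: float = 3.0) -> float:
    """Minimize f(theta) = (theta - target)^2 with Adam, starting from 0."""
    theta, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (theta - target)
        theta, m, v = adam_step(theta, grad, m, v, t)
    return theta


print(round(minimize_quadratic(), 3))
```

Because the step is divided by the root of the second moment, its size depends little on the raw gradient scale, which is the property that helps with sparse or unevenly scaled gradients.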

[0059] In one specific embodiment of the present invention, the intelligent decision-making and resource scheduling module generates the final decision based on the output of the data fusion module;

[0060] The solution and risk assessment submodule receives disease probabilities and key indicators from the model output and combines them with a knowledge graph (which stores the relationships between diseases, symptoms, treatment plans, and drugs) to generate multiple candidate treatment plans. For each plan, the system calculates a risk assessment score, which is derived from the complexity of the plan itself, the degree of matching with the patient's individual contraindications, and the treatment effect data of similar historical cases.

[0061] The decision model deployment and resource scheduling submodule includes a real-time updated hospital resource status graph. The decision model is deployed as a microservice, receiving patient analysis results (including symptom tags, urgency, required specialty, etc.) via an API. The model calculates the patient-department matching degree through an evaluation function:

[0062] $M(p, d) = \omega_1 \, \mathrm{sim}(x_p, S_d) - \omega_2 D(p, d) - \omega_3 W_d$;

[0063] where $x_p$ is the patient's symptom feature vector; $S_d$ is the professional skill set of department $d$; $\mathrm{sim}(\cdot, \cdot)$ is the cosine similarity function; $W_d$ is the department's current workload (such as the number of patients waiting to be seen and the average waiting time); $D(p, d)$ is the physical distance or transfer distance from the patient to the department; and $\omega_1$, $\omega_2$, $\omega_3$ are adjustable weighting coefficients. The system assigns the patient to the department with the highest evaluation function score, thus optimizing resource allocation.
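A minimal sketch of this kind of scoring is shown below, assuming that the match is rewarded by the cosine similarity between the patient's symptom vector and a vector encoding of the department's specialty profile, and penalized linearly by workload and distance. The vector encoding of the skill set, the normalization of workload and distance to [0, 1], the weights, and all names are assumptions for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Department:
    name: str
    skills: List[float]  # specialty profile as a vector (assumed encoding)
    workload: float      # assumed normalized to [0, 1]
    distance: float      # assumed normalized to [0, 1]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_score(symptoms: Sequence[float], dept: Department, w_match: float = 1.0,
                w_dist: float = 0.2, w_load: float = 0.3) -> float:
    """Similarity reward minus distance and workload penalties (hypothetical weights)."""
    return (w_match * cosine_similarity(symptoms, dept.skills)
            - w_dist * dept.distance - w_load * dept.workload)


def triage(symptoms: Sequence[float], departments: List[Department]) -> Department:
    """Pick the department with the highest matching score."""
    return max(departments, key=lambda d: match_score(symptoms, d))


# Hypothetical departments and a patient whose symptoms point to cardiology.
departments = [
    Department("cardiology", skills=[1.0, 0.0], workload=0.8, distance=0.2),
    Department("general medicine", skills=[0.6, 0.8], workload=0.1, distance=0.1),
]
print(triage([1.0, 0.0], departments).name)
```

With these numbers the busy cardiology department still wins; raising `w_load` shifts patients toward less loaded departments, which is the lever such a scheduler would tune.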

[0064] In Embodiment 1 of this invention, the system is applied to an intelligent triage and assisted diagnosis scenario in a comprehensive hospital. When a patient with acute chest pain is admitted to the hospital:

[0065] Data Access and Acquisition: The system acquires patient vital signs monitoring data (time-series data), patient-reported symptoms (text), and preliminary electrocardiograms (images) in real time through the information acquisition port. The transfer module immediately normalizes the data. The data acquisition module quickly adapts to the protocols of new monitors through an incremental learning mechanism.

[0066] Data fusion and analysis: The feature extraction submodule extracts ST segment morphological features from the electrocardiogram (ECG), key semantic features such as "crushing pain" and "radiating to the left arm" from the text, and temporal features such as heart rate variability from vital signs. The cross-modal alignment module correlates these features. The weighted fusion module learns from historical data that ECG features have high reliability and importance in this type of emergency, and assigns them greater weight. The fused features are input into the trained multi-task model.

[0067] Intelligent Decision-Making and Scheduling: The model outputs a high probability of "acute myocardial infarction" and recommends options such as "emergency coronary angiography" and "intravenous thrombolysis," while assessing a high bleeding risk associated with thrombolysis. The decision-making module simultaneously checks resource status: the cardiology catheterization lab is currently in use but is expected to become available in 10 minutes; the emergency resuscitation room has vacancies. After calculating the matching degree evaluation function, it decides to place the patient in the emergency resuscitation room for initial treatment and automatically schedules a slot in the cardiology catheterization lab 10 minutes later, notifying the relevant medical team to prepare. The entire process is completed within minutes, saving valuable time for the rescue.

[0068] In Embodiment 2 of the present invention, this system is applied to a regional epidemiological surveillance and chronic disease management scenario. The application scenario is the management of respiratory disease patients during the flu season at a regional medical center.

[0069] Data access and collection: In addition to acquiring in-hospital patient data (face-to-face consultation records, lung CT scans, blood routine tests), the system also accesses information such as the recent incidence rate of influenza-like illnesses and dominant strains in the region through public health ports.

[0070] Data Fusion and Analysis: For a patient with cough and fever, the system extracts ground-glass opacity features from CT images, "sore throat" and "fatigue" features from text records, and lymphocyte count features from blood routine tests. Simultaneously, regional epidemiological data is used as contextual features. In the fusion module, due to the current influenza peak season and the high match between the patient's symptoms and the circulating strain, the importance weight of epidemiological data is increased. After comprehensive analysis, the model provides a diagnosis of "viral pneumonia (highly probable influenza virus)" and recommends multiple treatment options, including antiviral therapy, symptomatic supportive care, and home isolation monitoring, while assessing the risk of progression to severe illness.

[0071] Intelligent decision-making and scheduling: Based on the patient's risk assessment as low to medium risk and the need for treatment mainly consisting of medication and monitoring, combined with the large number of patients waiting in the respiratory medicine outpatient clinic and the relatively low number of patients in the general medicine department, the decision-making module recommends that the patient be diverted to the general medicine department or the Internet hospital for follow-up visits through matching degree assessment, thereby alleviating the pressure on the specialist department and optimizing resource allocation.

[0072] In one embodiment of the present invention, an electronic device is further included, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; the computer program is used to run the multimodal data fusion intelligent analysis system of the present invention.

[0073] In the description of this invention, it should be understood that the terms "longitudinal", "lateral", "up", "down", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings, and are only for the convenience of describing this invention, and are not intended to indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of this invention.

[0074] The embodiments described above are merely preferred embodiments of the present invention and are not intended to limit the scope of the present invention. Various modifications and improvements made by those skilled in the art to the technical solutions of the present invention without departing from the spirit of the present invention should fall within the protection scope defined by the claims of the present invention.

Claims

1. A multimodal data fusion intelligent analysis system, characterized in that it includes: a multimodal data integrated access and acquisition module, used to acquire and preprocess multimodal case data and epidemiological data of patients from multiple heterogeneous data sources; a data fusion and intelligent analysis module, which interacts with the multimodal data integrated access and acquisition module and is used to extract and align features from the preprocessed data, and to analyze the data based on the training submodule to output a patient health status assessment and target medical information, wherein the target medical information includes the patient's department of care and medical assessment suggestions; and an intelligent decision-making and resource scheduling module, which interacts with the data fusion and intelligent analysis module to complete resource allocation based on the target medical information and the hospital's real-time resource status.

2. The multimodal data fusion intelligent analysis system according to claim 1, characterized in that: The multimodal data integrated access and acquisition module includes an information acquisition port and a transfer module. It is equipped with multiple standardized data interfaces for connecting different medical data sources. Its internal transfer module uses a normalization algorithm to pre-analyze and standardize the raw data. The data acquisition module incorporates multiple medical device communication protocols and employs an incremental learning mechanism to update data acquisition and processing strategies.

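3. The multi-modal data fusion intelligent analysis system of claim 2, wherein the parameter update function of the incremental learning mechanism is: $\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta_t; B_t) + \mu (\theta_t - \theta_{t-1})$; wherein $\theta_t$ denotes the model parameters, $\eta$ the learning rate, $\mu$ the momentum coefficient, and $B_t$ the update data batch.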

4. The multimodal data fusion intelligent analysis system according to claim 1, characterized in that: The data fusion and intelligent analysis module includes a multimodal feature extraction and alignment submodule, which extracts medical image features based on convolutional neural networks, extracts text features based on pre-trained language models, extracts temporal features based on temporal networks, and uses cross-modal attention mechanisms or variational autoencoders to map different modal features to a shared latent space for alignment. The weighted fusion and model training submodule dynamically generates fusion weights for each modality based on the calculated information reliability weights and task importance weights, performs feature fusion, and trains a multi-task model based on the fused features.

5. The multi-modal data fusion intelligent analysis system of claim 1, wherein: The intelligent decision-making and resource scheduling module includes a solution and risk assessment sub-module, which generates multiple treatment plans and calculates the risk assessment score for each plan based on the analysis results and medical knowledge graph. The decision model deployment and resource scheduling submodule includes a matching evaluation function to assess the compatibility between patients and departments and to guide and triage patients accordingly.

6. A multimodal data fusion intelligent analysis method, using the multimodal data fusion intelligent analysis system as described in any one of claims 1-5, characterized in that it includes the following steps: Data access and acquisition: Through the multimodal data integrated access and acquisition module, the patient's time-series vital signs data, textual symptom description data and medical imaging data are acquired in real time from multiple heterogeneous data sources, and normalized preprocessing is performed; the data acquisition module dynamically updates its data acquisition and processing strategy through an incremental learning mechanism. Data fusion and analysis: Features are extracted from the time-series data, text data, and image data through the data fusion and intelligent analysis module; the cross-modal alignment module is used to associate and align the extracted modal features; the fusion weight of each modal feature is dynamically calculated based on information reliability and task importance, and weighted fusion is performed to generate fused features; the fused features are input into a multi-task model for analysis and reasoning, and the analysis results include patient physical condition assessment, multiple treatment options and corresponding risk assessments. Intelligent decision-making and scheduling: Based on the analysis results obtained from the above steps and combined with the hospital's real-time resource status information, the intelligent decision-making and resource scheduling module calculates the compatibility between the current patient and each medical resource using a matching degree evaluation function. Based on the compatibility evaluation results, the optimal triage and resource scheduling plan is automatically generated and executed.

7. The multi-modal data fusion intelligent analysis method of claim 6, wherein the dynamic calculation of fusion weights for each modal feature based on information reliability and task importance includes: calculating the information reliability weight $r_i$ of each modality's data, based on the authority of the data source, the accuracy of the data acquisition equipment, and the signal-to-noise ratio of the data itself; calculating the task importance weight $t_i$ of each modality's data, which evaluates the contribution of that modality to achieving the goal of the current analysis task; and dynamically generating the final fusion weight of each modality based on the following formula: $w_i = \alpha r_i + \beta t_i$, where $\alpha$ and $\beta$ are learnable parameters.

8. The multi-modal data fusion intelligent analysis method of claim 6, wherein the training method for the multi-task model is as follows: a training dataset containing multimodal medical data is constructed and labeled, and the model is trained with the multi-task loss function: $L_{total} = \lambda_1 L_{cls} + \lambda_2 L_{risk} + \lambda_3 L_{rec}$; where $L_{cls}$ is the disease classification loss; $L_{risk}$ is the risk assessment loss; $L_{rec}$ is the feature reconstruction loss; and $\lambda_1$, $\lambda_2$, $\lambda_3$ are the weighting coefficients for each loss term.

9. The multimodal data fusion intelligent analysis method according to claim 6, characterized in that the matching degree evaluation function is: $M(p, d) = \omega_1 \frac{|T_p \cap S_d|}{|T_p|} - \omega_2 D(p, d)$; where $T_p$ is the set of treatment types required by patient $p$; $S_d$ is the set of professional skills of department $d$; $D(p, d)$ is the physical distance or transfer distance from the patient to the department; and $\omega_1$, $\omega_2$ are adjustable weighting coefficients.