A preoperative planning system for thoracic surgery fusing multi-modal images
The preoperative planning system for thoracic surgery using multimodal image fusion overcomes the limitations of existing systems in multimodal image fusion and anatomical structure recognition, achieving precise three-dimensional anatomical structure visualization and personalized surgical path design, thus improving the efficiency and safety of surgery.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- THE FIRST AFFILIATED HOSPITAL OF ARMY MEDICAL UNIV
- Filing Date
- 2026-03-11
- Publication Date
- 2026-06-12
AI Technical Summary
Existing preoperative planning systems for thoracic surgery have inherent limitations in the deep fusion of multimodal images, in the accurate identification and visualization of complex anatomical structures within the thoracic cavity, and in intelligent, personalized path design and decision support tailored to the specific needs of thoracic surgery. They are unable to construct a unified, comprehensive and highly accurate three-dimensional anatomical model of the patient, which forces surgeons to rely on personal experience and intraoperative exploration in complex lesions, increasing surgical uncertainty and the risk of complications.
The preoperative planning system for thoracic surgery employs multimodal image fusion, including data acquisition and preprocessing, multimodal image registration and fusion, precise segmentation and 3D reconstruction of anatomical structures, lesion and functional information mapping, intelligent preoperative planning and path optimization, and visualization and interaction modules. Through deep learning and high-precision registration technology, it achieves automated integration and 3D reconstruction of multimodal images, supporting personalized surgical path design and risk assessment.
It achieves high-precision registration and fusion of multimodal images, provides accurate visualization of three-dimensional anatomical structures and intelligent surgical path planning, significantly improves the efficiency, accuracy and safety of surgical planning, and reduces surgical time and the risk of complications.
Figure CN122201629A_ABST
Description
Technical Field
[0001] This invention belongs to the field of digital medical technology and, more specifically, relates to a preoperative planning system for thoracic surgery that integrates multimodal imaging.
Background Technology
[0002] With the continuous development of medical imaging technology and computer-aided surgical planning, preoperative planning systems have become a key component in modern medical practice, especially in thoracic surgery, for improving surgical precision, optimizing treatment plans, and significantly reducing surgical risks. These systems aim to provide surgeons with comprehensive preoperative insights through digital modeling and visualization of specific patient anatomy, thereby assisting them in developing more precise and safer intervention strategies. Given the complexity of the intrathoracic anatomy and its close connection with important blood vessels, airways, nerves, and lymphatic systems, an efficient and reliable preoperative planning system plays a crucial role in the successful implementation of complex thoracic surgeries.
[0003] In the prior art, various systems dedicated to specific medical planning tasks have been disclosed. For example, Chinese Patent Publication No. CN107743409B discloses a dose planning system. This system receives biopsy information and creates a spatially annotated biopsy map, then combines it with a tumor probability map to generate a specific dose plan. Its core purpose is to achieve precise dose allocation to the target area during radiotherapy. Specifically, this solution focuses on the specific needs of tumor radiotherapy. Its image processing module mainly serves to identify tumor boundaries and optimize the dose field distribution, aiming to maximize the killing of tumor cells and protect surrounding normal tissues. Correspondingly, Chinese Patent Publication No. CN103717167B discloses an ablation planning system. This system allows users to select ablation probes and related parameters through an interactive interface, and combines them with the patient's internal images to generate an optimized ablation treatment plan, aiming to maximize coverage of the target lesion while effectively avoiding key anatomical structures. The design concept of this solution focuses on the precise destruction of localized lesions through heat, cold, or other energy forms, and its planning logic revolves around the path of the ablation probe and the energy range. In their respective application areas, the two systems mentioned above undoubtedly improve the accuracy and safety of treatment, representing a beneficial exploration of computer-aided planning technology in specific treatment modes.
[0004] However, with the evolution of medical imaging acquisition technology towards multimodal and high-resolution approaches, and the increasing demand for preoperative information integration and refined planning in complex thoracic surgeries, the aforementioned existing technologies have gradually revealed inherent limitations that are difficult to overcome. The reason for this is that both dose planning systems for radiotherapy and those for ablation therapy are built around a single or extremely limited imaging modality (such as relying primarily on CT or MR) and the specific treatment intervention they serve. Specifically, dose planning systems focus on the morphological characteristics and radiosensitivity of tumors, with image analysis focusing more on tumor volume, location, and its spatial relationship with the radiotherapy target area; while ablation planning systems focus on the precise localization of lesions and the geometric coverage of the ablation area, with image processing capabilities often limited to providing the two-dimensional or three-dimensional structural information required for probe guidance. Although these systems perform well within their respective professional fields, their initial design did not fully consider the comprehensive fusion and collaborative analysis capabilities of the multimodal imaging information (such as CT, MRI, and PET) necessary for thoracic surgery.
[0005] Furthermore, thoracic surgery, especially tumor resection or complex vascular / airway surgery, requires surgeons to possess an extremely detailed and multi-dimensional understanding of the patient's individual anatomy. This includes, but is not limited to, the precise location and size of the tumor, its complex spatial relationship with surrounding blood vessels, trachea, bronchi, lymph nodes, pleura, pericardium, and other vital organs and structures, as well as the functional status of these structures. Single-modal imaging, such as CT, excels at displaying the morphology of bones, airways, and lung parenchyma; MRI has advantages in soft tissue contrast and vascular visualization; while PET can reflect tissue metabolic activity, helping to differentiate tumors from inflammatory or necrotic tissue. The lack of the ability to perform high-precision, automated registration and fusion of this multimodal information means that surgeons still need to rely on manual comparison and integration of information between different images during preoperative assessment. This is not only inefficient but also highly susceptible to omissions or misjudgments of crucial information due to visual bias or excessive cognitive load. This fragmented approach to information acquisition makes it difficult for existing systems to construct a unified, comprehensive, and highly accurate three-dimensional anatomical model of the patient. Consequently, it fails to provide surgeons with a complete overview of multi-dimensional anatomical details, which is particularly detrimental when facing complex lesions or anatomical variations during surgery.
[0006] Furthermore, the "planning" function of existing systems is essentially designed for specific therapeutic purposes, rather than for the complex path selection and operative design required for open or minimally invasive thoracic surgeries. For example, dose planning systems generate radiation projection directions and dose distribution maps, while ablation planning systems focus on the linear path of the probe and coverage of the target area. These planning concepts differ significantly from core elements that need to be considered in thoracic surgery, such as "surgical approach selection," "margin definition," "lymph node dissection strategy," "vascular and airway protection," and "tissue layer separation." Existing systems cannot perform intelligent and personalized surgical path planning and risk assessment based on fused multimodal images and the characteristics of thoracic surgery. For example, they cannot simulate the impact of different surgical approaches on surrounding tissues, predict potential bleeding risk points, or provide optimal anatomical dissection planes based on individual differences. This lack of planning capability forces surgeons to rely heavily on personal experience and intraoperative exploration when facing complex lesions, increasing surgical uncertainty, prolonging operation time, and potentially leading to unnecessary complications.
[0007] In summary, while existing planning systems have made positive progress in their respective fields, they all have inherent limitations in terms of deep fusion of multimodal images, accurate identification and visualization of complex thoracic anatomical structures, and personalized path design and intelligent decision support specific to thoracic surgery. Therefore, how to construct a thoracic surgery preoperative planning system that can effectively integrate and collaboratively analyze multimodal image data, achieve high-precision visualization of anatomical structures, and support intelligent and personalized surgical path design, in order to significantly improve the efficiency, accuracy, and safety of surgical planning, has become a key challenge and an urgent technical problem for those skilled in the art.
Summary of the Invention
[0008] This invention aims to overcome the inherent limitations of existing thoracic surgery preoperative planning systems in the deep fusion of multimodal images, in the accurate identification and visualization of complex intrathoracic anatomical structures, and in intelligent, personalized path design and decision support tailored to the specific needs of thoracic surgery. To achieve the above objectives, this invention provides a thoracic surgery preoperative planning system and method that integrates multimodal images.
[0009] I. System Aspects
[0010] This invention proposes a preoperative planning system for thoracic surgery that integrates multimodal imaging. Its structure includes: a data acquisition and preprocessing module, a multimodal image registration and fusion module, a precise anatomical structure segmentation and 3D reconstruction module, a lesion and functional information mapping module, an intelligent preoperative planning and path optimization module, a visualization and interaction module, and a knowledge base and database. These modules are interconnected through standard data interfaces and exchange information according to a preset data transmission protocol to ensure the integrity of the data flow and the continuity of processing.
[0011] 1. Data Acquisition and Preprocessing Module
[0012] The data acquisition and preprocessing module is designed to receive and standardize multimodal raw data from various medical imaging devices. This module is equipped with multiple data interfaces, including but not limited to the DICOM (Digital Imaging and Communications in Medicine) standard interface, the HL7 (Health Level Seven) interface, and other common medical imaging data format interfaces. Specifically, the raw image data received by this module can cover computed tomography (CT) images, magnetic resonance imaging (MRI) images, and positron emission tomography (PET) images.
[0013] The preprocessing process includes the following sub-functional units:
[0014] Data format conversion unit: Responsible for converting the received raw image data into a standardized data structure within the system, such as converting DICOM sequences into voxel data in NIfTI format. This conversion process ensures consistency in subsequent processing.
[0015] Image Quality Assessment Unit: This unit performs quality checks on the converted image data, including but not limited to noise level assessment, artifact detection, image integrity verification, and signal-to-noise ratio (SNR) analysis. This unit quantifies image quality by calculating grayscale gradient changes in local image regions, statistical noise distribution characteristics (such as the variance of Gaussian noise), and assessing the presence of holes or anomalous stripes. If the image quality falls below a preset threshold, the system will trigger a warning and may instruct re-acquisition or targeted enhancement processing.
[0016] Image Enhancement Unit: Adaptive image enhancement algorithms are implemented based on the characteristics of different image modalities. For example, for CT images, non-local means filtering or 3D anisotropic diffusion filtering can be used to effectively remove noise and preserve edge details. For MRI images, wavelet transform-based denoising algorithms or adaptive filtering based on the Rician noise model can be applied. For PET images, iterative reconstruction algorithms (such as OSEM) or post-processing smoothing filtering can be used to improve the signal-to-noise ratio and contrast.
[0017] Intensity normalization unit: Eliminates image intensity differences caused by different scanning devices or scanning parameters. This unit can use methods such as histogram matching, white / black point normalization, or Z-score normalization to map the intensity values of all images to a uniform grayscale or density range, such as [0,1] or [-1000,3000] HU (Hounsfield Units), to ensure the stability and consistency of subsequent algorithm processing.
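As an illustration of two of the normalization options named above, the following minimal sketch shows Z-score normalization and a clip-and-rescale of CT values from the [-1000, 3000] HU window to [0, 1]. The function names and the flat-list data layout are assumptions made for this example and are not part of the disclosed system.

```python
from statistics import mean, pstdev


def zscore_normalize(values):
    """Map intensities to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant image carries no contrast; return all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def window_to_unit_range(hu_values, lo=-1000.0, hi=3000.0):
    """Clip CT values to [lo, hi] HU and rescale them linearly to [0, 1]."""
    span = hi - lo
    return [(min(max(v, lo), hi) - lo) / span for v in hu_values]


ct_samples = [-1200.0, -1000.0, 0.0, 1000.0, 3000.0, 4000.0]
print(window_to_unit_range(ct_samples))
print(zscore_normalize([1.0, 2.0, 3.0]))
```

Clipping before rescaling keeps out-of-range values (for example metal artifacts) from compressing the useful part of the intensity range.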
[0018] Spatial resampling unit: Resamples image data with different resolutions and voxel spacing to a uniform spatial resolution, such as isotropic voxels of 1.0mm × 1.0mm × 1.0mm. The resampling algorithm can use trilinear interpolation or higher-order interpolation methods based on B-splines to preserve the spatial information of the original image to the greatest extent.
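The resampling step can be sketched as follows, assuming a volume stored as nested lists indexed `volume[z][y][x]` and a voxel spacing given in millimetres. The names `trilinear` and `resample_isotropic` are hypothetical, and a production system would typically rely on an optimized imaging library rather than pure Python loops.

```python
def trilinear(volume, z, y, x):
    """Sample volume[z][y][x] at a fractional index by trilinear interpolation."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    # Clamp the sampling position to the valid index range.
    z = min(max(z, 0.0), nz - 1.0)
    y = min(max(y, 0.0), ny - 1.0)
    x = min(max(x, 0.0), nx - 1.0)
    z0, y0, x0 = int(z), int(y), int(x)
    z1, y1, x1 = min(z0 + 1, nz - 1), min(y0 + 1, ny - 1), min(x0 + 1, nx - 1)
    dz, dy, dx = z - z0, y - y0, x - x0
    # Interpolate along x, then y, then z.
    c00 = volume[z0][y0][x0] * (1 - dx) + volume[z0][y0][x1] * dx
    c01 = volume[z0][y1][x0] * (1 - dx) + volume[z0][y1][x1] * dx
    c10 = volume[z1][y0][x0] * (1 - dx) + volume[z1][y0][x1] * dx
    c11 = volume[z1][y1][x0] * (1 - dx) + volume[z1][y1][x1] * dx
    c0 = c00 * (1 - dy) + c01 * dy
    c1 = c10 * (1 - dy) + c11 * dy
    return c0 * (1 - dz) + c1 * dz


def resample_isotropic(volume, spacing, new_spacing=1.0):
    """Resample a volume with (z, y, x) spacing in mm to isotropic voxels."""
    dims = (len(volume), len(volume[0]), len(volume[0][0]))
    out_dims = [int(round((n - 1) * s / new_spacing)) + 1
                for n, s in zip(dims, spacing)]
    # Step between output samples, expressed in source index units.
    sz, sy, sx = (new_spacing / s for s in spacing)
    return [[[trilinear(volume, k * sz, j * sy, i * sx)
              for i in range(out_dims[2])]
             for j in range(out_dims[1])]
            for k in range(out_dims[0])]


# A 2x2x2 volume with 2 mm voxels whose value equals the x index.
vol = [[[0.0, 1.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
iso = resample_isotropic(vol, spacing=(2.0, 2.0, 2.0))
print(len(iso), iso[0][0])
```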
[0019] 2. Multimodal image registration and fusion module
[0020] The multimodal image registration and fusion module is designed to achieve high-precision, automated spatial alignment and information integration between images of different modalities. This module is one of the key innovations of this invention in overcoming the limitations of existing technologies, aiming to construct a unified three-dimensional anatomical reference system for patients.
[0021] The registration and fusion process includes the following key sub-functional units:
[0022] Initial Alignment Unit: This unit employs a rigid body registration method based on feature points or image moments to perform coarse spatial alignment of the input images from different modalities. This unit performs preliminary correction by identifying common anatomical landmarks (such as bony structural features) in the images or calculating the geometric center and principal axis directions of the images, and then applying three-dimensional rigid body transformations (including a three-dimensional translation matrix T and a three-dimensional rotation matrix R).
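A minimal sketch of the moment-based coarse alignment is given below. It covers only the translation component T, estimated from the intensity-weighted centroids (first-order image moments) of the two volumes. Estimating the rotation R from the principal axes would additionally require an eigen-decomposition of the second-order moments, which is omitted here. The function names and the nested-list volume layout are assumptions for illustration.

```python
def centroid(volume):
    """Intensity-weighted centre of mass (z, y, x) of a nested-list volume."""
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for z, plane in enumerate(volume):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                total += value
                acc[0] += value * z
                acc[1] += value * y
                acc[2] += value * x
    if total == 0:
        raise ValueError("volume has no intensity mass")
    return tuple(a / total for a in acc)


def coarse_translation(fixed, moving):
    """Translation (dz, dy, dx) that moves the moving centroid onto the fixed one."""
    cf, cm = centroid(fixed), centroid(moving)
    return tuple(f - m for f, m in zip(cf, cm))


def empty(n):
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]


fixed, moving = empty(3), empty(3)
fixed[1][1][1] = 1.0
moving[0][1][2] = 1.0
print(coarse_translation(fixed, moving))
```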
[0023] Non-rigid registration unit: Based on the initial alignment, a voxel-based non-rigid registration algorithm is executed to correct local deformations caused by factors such as respiratory motion and organ deformation. This invention preferably employs the Large Deformation Diffeomorphic Metric Mapping (LDDMM) algorithm. The core idea of this algorithm is to find a smooth, invertible diffeomorphic mapping φ, such that the source image I_S, after transformation by φ, is highly similar in content to the target image I_T.
[0024] Mathematical description of the LDDMM algorithm: The algorithm generates diffeomorphisms by solving an optimization problem. The evolution of the mapping $\varphi_t$ over time $t$ from 0 to 1 is driven by a smooth velocity field $v_t$, i.e., $\frac{d\varphi_t}{dt} = v_t(\varphi_t)$ and $\varphi_0 = \mathrm{Id}$. The optimization objective function $E(v)$ is defined as:
[0025] $$E(v) = D\left(I_S \circ \varphi_1^{-1},\ I_T\right) + \lambda \int_0^1 \lVert L v_t \rVert^2 \, dt$$
[0026] where $D(\cdot,\cdot)$ is the image matching term that measures the mismatch between the deformed source image and the target image $I_T$. This invention preferably uses Mutual Information (MI) as the similarity measure for multimodal images, in which case the matching term can be taken as the negative of the mutual information, since a higher MI indicates better alignment. Mutual Information I(A,B) = H(A) + H(B) - H(A,B), where H(.) represents Shannon entropy and H(.,.) represents joint entropy. Mutual Information effectively captures the nonlinear statistical dependencies between images of different modalities. λ is a positive regularization parameter used to balance image similarity with the smoothness of the velocity field. L is a differential operator, typically a Gaussian kernel or a Laplacian operator, used to penalize the roughness of the velocity field, ensuring the smoothness and invertibility of φ.
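The mutual information measure I(A,B) = H(A) + H(B) - H(A,B) can be estimated from a joint intensity histogram. The sketch below assumes two already-aligned images flattened into equal-length lists and a simple equal-width binning with 8 bins. The binning scheme and function names are illustrative assumptions, not the disclosed implementation.

```python
from collections import Counter
from math import log2


def entropy(counts, n):
    """Shannon entropy in bits of a histogram given as counts over n samples."""
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def quantize(values, bins):
    """Assign each intensity to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in values]


def mutual_information(a, b, bins=8):
    """Estimate I(A,B) = H(A) + H(B) - H(A,B) from a joint histogram."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    qa, qb = quantize(a, bins), quantize(b, bins)
    n = len(a)
    h_a = entropy(Counter(qa).values(), n)
    h_b = entropy(Counter(qb).values(), n)
    h_ab = entropy(Counter(zip(qa, qb)).values(), n)
    return h_a + h_b - h_ab


ct_like = [0, 0, 1, 1, 2, 2, 3, 3]
mr_like = [30 - 10 * v for v in ct_like]   # inverted contrast, same structure
print(mutual_information(ct_like, mr_like))
print(mutual_information(ct_like, [7] * 8))
```

The second image in the usage example has inverted contrast relative to the first, yet the estimate stays high because MI depends on statistical dependence rather than on equal intensities, which is why it suits multimodal registration.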
[0027] Optimization Solution Process: The optimization problem is solved using variational methods or adjoint state methods, employing iterative gradient descent or quasi-Newton methods (such as L-BFGS) to find the optimal velocity field v. In each iteration, the gradient of the objective function with respect to the velocity field is calculated, and the velocity field is updated until the objective function converges or the preset number of iterations is reached. This process generates a high-precision deformation field capable of accurately describing the local nonlinear correspondences between different images.
[0028] Fusion algorithm unit: After high-precision registration, this unit fuses image data from different modalities into a single, information-rich composite image dataset. This invention employs a multi-scale geometric analysis (MSGA) fusion algorithm. This algorithm first decomposes the registered multimodal images into sub-bands of multiple scales and orientations, for example, through wavelet transform or curvelet transform. Then, different fusion rules are applied based on the characteristics of different sub-bands. For example, a weighted average method can be used for low-frequency sub-bands (representing approximate image information); for high-frequency sub-bands (representing detailed image information), energy maximum selection or sparse representation fusion rules can be used to preserve the advantageous features of each modality. Finally, the fused sub-bands are reconstructed into a unified composite image through inverse transform. For example, the fused CT and MRI images can form a composite voxel dataset that simultaneously and clearly displays bone, soft tissue, and vascular structures.
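The decompose, fuse, and reconstruct sequence can be illustrated with a deliberately simplified stand-in: a one-level Haar transform on one-dimensional signals instead of a multi-level wavelet or curvelet transform on 3D volumes. The low-frequency band is fused by weighted averaging and the high-frequency band by selecting the coefficient with the larger magnitude, a per-coefficient form of maximum-energy selection. All names and the choice of the Haar basis are assumptions for this sketch.

```python
def haar_decompose(signal):
    """One-level Haar transform of an even-length signal: (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def fuse(sig_a, sig_b, weight_a=0.5):
    """Fuse two registered signals sub-band by sub-band."""
    low_a, high_a = haar_decompose(sig_a)
    low_b, high_b = haar_decompose(sig_b)
    # Low-frequency band: weighted average of the two approximations.
    low = [weight_a * x + (1 - weight_a) * y for x, y in zip(low_a, low_b)]
    # High-frequency band: keep the detail coefficient with larger magnitude.
    high = [x if abs(x) >= abs(y) else y for x, y in zip(high_a, high_b)]
    return haar_reconstruct(low, high)


smooth = [4.0, 4.0, 0.0, 0.0]     # modality with little local detail
detailed = [0.0, 8.0, 0.0, 0.0]   # modality with a sharp local feature
print(fuse(smooth, detailed))
```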
[0029] 3. Precise Anatomical Structure Segmentation and 3D Reconstruction Module
[0030] The precise segmentation and 3D reconstruction module of the anatomical structure is designed to automatically identify and accurately delineate key anatomical structures and lesions within the thoracic cavity from the fused multimodal image data, and convert them into a 3D model that can be used for visualization and planning.
[0031] The segmentation and reconstruction process includes the following key sub-functional units:
[0032] Deep learning-driven semantic segmentation unit: This invention uses the 3D U-Net deep neural network model and its variants to automatically segment structures such as lungs, trachea, bronchi, pulmonary vessels (pulmonary arteries and pulmonary veins), heart, esophagus, chest wall, lymph nodes, and lesions (such as tumors).
[0033] 3D U-Net Model Architecture: The model employs an encoder-decoder structure. The encoder path downsamples through consecutive 3D convolutional layers and 3D max-pooling layers to extract multi-scale feature representations. The decoder path upsamples through 3D transposed convolutional layers or 3D upsampling layers to progressively restore spatial resolution. Crucially, the feature maps from the encoder path are directly passed to the corresponding upsampling layers in the decoder path via skip connections, thus fusing multi-scale contextual information and fine-grained spatial information, which helps achieve accurate edge localization.
[0034] Training and Loss Function: The model's training data comes from a large number of labeled multimodal medical image datasets. During training, the model preferably uses a composite loss function, such as a weighted combination of Dice loss and cross-entropy loss. The Dice loss is $L_{Dice} = 1 - \frac{2|A \cap B|}{|A| + |B|}$, where A is the predicted segmented region and B is the ground truth labeled region; its goal is to optimize the overlap of the segmentation results. The cross-entropy loss is $L_{CE} = -\sum_i y_i \log(p_i)$, where $y_i$ is the true probability and $p_i$ is the predicted probability; it aims to optimize pixel-level classification accuracy. Through the weighted summation $L = \alpha L_{Dice} + \beta L_{CE}$, with weights $\alpha$ and $\beta$, the model can simultaneously take into account both segmentation accuracy and edge details.
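A minimal sketch of such a composite loss for the binary (foreground versus background) case is shown below, using a soft Dice term on predicted probabilities and a mean binary cross-entropy term. The weights `alpha` and `beta`, the smoothing constants, and the flat-list representation are assumptions for illustration. An actual training pipeline would compute these on tensors in a deep learning framework.

```python
from math import log


def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2*sum(p*y) / (sum(p) + sum(y))."""
    intersection = sum(p * y for p, y in zip(pred, target))
    return 1.0 - (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
        total += -(y * log(p) + (1 - y) * log(1.0 - p))
    return total / len(pred)


def composite_loss(pred, target, alpha=0.5, beta=0.5):
    """Weighted sum of the Dice and cross-entropy terms."""
    return alpha * dice_loss(pred, target) + beta * cross_entropy(pred, target)


labels = [1, 0, 1, 0]
print(composite_loss([0.9, 0.1, 0.8, 0.2], labels))   # good prediction
print(composite_loss([0.1, 0.9, 0.2, 0.8], labels))   # poor prediction
```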
[0035] Post-processing optimization: The segmentation results are smoothed by morphological operations (such as opening and closing operations) to remove small artifacts or breaks, and isolated pixel clusters that do not conform to anatomical features are identified and filtered by connected component analysis.
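The connected component filtering step can be sketched as follows, assuming a binary mask stored as nested lists, 6-connectivity, and a hypothetical size threshold `min_voxels` below which a component is treated as an isolated artifact.

```python
from collections import deque


def remove_small_components(mask, min_voxels):
    """Zero out 6-connected foreground components smaller than min_voxels."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    out = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    seen = set()
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first search collects one connected component.
                component, queue = [], deque([(z, y, x)])
                seen.add((z, y, x))
                while queue:
                    cz, cy, cx = queue.popleft()
                    component.append((cz, cy, cx))
                    for dz, dy, dx in steps:
                        n = (cz + dz, cy + dy, cx + dx)
                        if (0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                                and mask[n[0]][n[1]][n[2]] and n not in seen):
                            seen.add(n)
                            queue.append(n)
                # Keep only components that are large enough.
                if len(component) >= min_voxels:
                    for cz, cy, cx in component:
                        out[cz][cy][cx] = 1
    return out


mask = [[[1, 1, 1, 0, 1], [0, 0, 0, 0, 0]]]
print(remove_small_components(mask, min_voxels=2))
```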
[0036] 3D Surface Reconstruction Unit: This unit converts the segmented discrete voxel data (binarized mask) into a continuous 3D geometric model. The present invention preferably employs the Marching Cubes algorithm. This algorithm traverses each voxel and, based on the internal and external states of its eight vertices (determined by the segmentation results), selects the corresponding predefined triangular facet template, thereby generating a triangular mesh representation of isosurfaces within the voxel mesh.
[0037] Model optimization and topology correction unit: Optimizes the generated initial 3D model, including:
[0038] Mesh smoothing: Apply the Laplacian smoothing algorithm or other curvature-based smoothing algorithms to remove surface jaggedness, making the model surface smoother and more natural, while avoiding excessive shrinkage or loss of detail as much as possible.
[0039] Mesh simplification: The Quadric Error Metric-based Edge Collapse algorithm or vertex clustering algorithm is used to reduce the number of triangles in the model while maintaining the model's topology and visual fidelity, thereby improving rendering efficiency and subsequent calculation speed.
[0040] Topology repair: Detects and repairs topological defects such as holes, self-intersections, or non-manifold edges in the model to ensure the geometric integrity and correctness of the model.
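Of the three optimization steps above, the mesh smoothing step lends itself to a short illustration. The sketch below applies basic Laplacian smoothing, moving each vertex a fraction `lam` of the way toward the average of its neighbours. The parameter values and function name are assumptions. Plain Laplacian smoothing shrinks the mesh when iterated many times, which is why the text calls for limiting shrinkage.

```python
def laplacian_smooth(vertices, triangles, lam=0.5, iterations=1):
    """Move each vertex a fraction lam toward the mean of its mesh neighbours."""
    neighbours = [set() for _ in vertices]
    for a, b, c in triangles:
        neighbours[a].update((b, c))
        neighbours[b].update((a, c))
        neighbours[c].update((a, b))
    verts = [tuple(v) for v in vertices]
    for _ in range(iterations):
        updated = []
        for i, v in enumerate(verts):
            nb = neighbours[i]
            if not nb:
                updated.append(v)   # isolated vertex stays in place
                continue
            avg = [sum(verts[j][k] for j in nb) / len(nb) for k in range(3)]
            updated.append(tuple(v[k] + lam * (avg[k] - v[k]) for k in range(3)))
        verts = updated
    return verts


# A flat fan of four triangles with a spike at the central vertex.
verts = [(0, 0, 1), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]
tris = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
print(laplacian_smooth(verts, tris)[0])
```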
[0041] 4. Lesion and Functional Information Mapping Module
[0042] The lesion and functional information mapping module is designed to accurately overlay and map the tissue metabolic activity information reflected in PET images, as well as other potential functional or pathological information, onto a three-dimensional anatomical model, thereby providing surgeons with multi-dimensional diagnostic information that goes beyond morphology.
[0043] The mapping process includes the following key sub-functional units:
[0044] PET Standardized Uptake Value (SUV) Calculation Unit: Based on the raw radioactivity data from PET images and parameters such as patient weight, injection dose, and scan time, the standardized uptake value (SUV) is calculated. The SUV calculation formula is as follows:
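[0045] $$\mathrm{SUV} = \frac{C_{\mathrm{tissue}}}{D_{\mathrm{inj}} / W}$$ where $C_{\mathrm{tissue}}$ is the radioactivity concentration measured in the tissue at the scan time (e.g., in kBq/mL), $D_{\mathrm{inj}}$ is the injected dose decay-corrected to the scan time (e.g., in kBq), and $W$ is the patient's body weight (e.g., in g).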
[0046] This unit can calculate key parameters such as SUVmax, SUVmean, and total metabolic tumor volume (MTV) to quantify the metabolic activity of lesions.
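A minimal sketch of the body-weight SUV calculation and the derived lesion parameters is given below. It assumes an F-18 tracer (half-life of about 109.77 minutes), activity in Bq, concentration in Bq/mL, and a fixed SUV threshold of 2.5 for delineating the metabolic tumor volume. The threshold, the default half-life, and the function names are assumptions for this example and are not specified in the source.

```python
def decay_corrected_dose(injected_bq, elapsed_s, half_life_s=6586.2):
    """Activity remaining after elapsed_s seconds (default half-life: F-18)."""
    return injected_bq * 0.5 ** (elapsed_s / half_life_s)


def suv_bw(conc_bq_per_ml, injected_bq, weight_kg, elapsed_s, half_life_s=6586.2):
    """Body-weight SUV: tissue concentration / (decay-corrected dose / weight in g)."""
    dose = decay_corrected_dose(injected_bq, elapsed_s, half_life_s)
    return conc_bq_per_ml / (dose / (weight_kg * 1000.0))


def lesion_metrics(suv_values, voxel_volume_ml, threshold=2.5):
    """SUVmax, SUVmean and MTV over voxels at or above a fixed SUV threshold."""
    lesion = [s for s in suv_values if s >= threshold]
    if not lesion:
        return {"SUVmax": 0.0, "SUVmean": 0.0, "MTV_ml": 0.0}
    return {
        "SUVmax": max(lesion),
        "SUVmean": sum(lesion) / len(lesion),
        "MTV_ml": len(lesion) * voxel_volume_ml,
    }


# 350 MBq injected into a 70 kg patient, 5 kBq/mL measured one hour later.
print(suv_bw(5000.0, 350e6, 70.0, elapsed_s=3600.0))
print(lesion_metrics([1.0, 3.0, 5.0], voxel_volume_ml=0.001))
```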
[0047] Functional Information Spatial Mapping Unit: The calculated SUV value or other functional indicators are precisely superimposed onto the surface or internal voxels of the lesion 3D model generated by the precise segmentation and 3D reconstruction module of anatomical structure in the form of color coding or transparency mapping.
[0048] Specifically, the unit first aligns the PET voxel data with a high-resolution anatomical model space using interpolation (such as trilinear interpolation). Then, a continuous color map is defined to map different SUV value ranges to specific colors and transparency. For example, high SUV value areas can be displayed as red, low SUV value areas as blue, and the color depth can vary linearly or non-linearly with the SUV value.
[0049] Mapping methods include: rendering colors directly onto the tumor surface, or displaying internal metabolic activity areas in a three-dimensional voxel model using volume rendering technology.
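The color and transparency mapping can be sketched as a simple transfer function. The example below assumes a linear blue-to-red ramp over a hypothetical display range of SUV 0 to 10, with opacity increasing with uptake so that metabolically inactive regions stay transparent. The range and the function name are illustrative assumptions.

```python
def suv_to_rgba(suv, suv_min=0.0, suv_max=10.0):
    """Map an SUV value to (R, G, B, A): blue for low uptake, red for high."""
    t = (suv - suv_min) / (suv_max - suv_min)
    t = min(max(t, 0.0), 1.0)          # clamp values outside the display range
    red = round(255 * t)
    blue = round(255 * (1.0 - t))
    alpha = round(255 * t)             # higher uptake is rendered more opaque
    return (red, 0, blue, alpha)


for value in (0.0, 2.5, 10.0, 20.0):
    print(value, suv_to_rgba(value))
```

A non-linear variant can be obtained by applying a power or logarithmic curve to `t` before computing the channels.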
[0050] Dynamic functional information integration unit (optional): If dynamic contrast-enhanced CT / MRI or dynamic PET data are available, this unit can further analyze the perfusion characteristics or metabolic kinetic curves of the lesion, extract parameters such as Ktrans (volume transfer constant) and ve (extravascular extracellular volume fraction), and integrate these parameters into the three-dimensional model in the form of time series or specific parameter maps to assess the functional status of the lesion in more detail, such as blood supply and cell density.
[0051] 5. Intelligent preoperative planning and path optimization module
[0052] The intelligent preoperative planning and path optimization module is the embodiment of the core innovation and application value of this invention. Its function is to automatically generate and optimize surgical paths and operation plans based on fused multimodal images and reconstructed three-dimensional models, combined with the characteristics of thoracic surgery and the doctor's experience, so as to achieve refined and personalized preoperative planning.
[0053] The planning and optimization process includes the following key sub-functional units:
[0054] Surgical approach selection and optimization unit: This unit intelligently recommends and optimizes surgical approaches based on the patient's anatomical characteristics, lesion location and size, and the expected surgical type (such as thoracoscopic minimally invasive or open surgery).
[0055] Path generation algorithm: A sparse or dense voxel graph is constructed on a 3D anatomical model, where each voxel or node represents a potential pass-through point. The edges connecting nodes are weighted, taking into account the following factors:
[0056] Safe distance: The minimum distance from important blood vessels (such as the aorta, pulmonary artery, and pulmonary vein), trachea, bronchi, nerves (such as the phrenic nerve and recurrent laryngeal nerve), esophagus, heart, and other critical structures. The smaller this distance, the greater its weight.
[0057] Degree of tissue damage: Potential tissue trauma is assessed by the type (muscle, fat, bone, organ parenchyma) and thickness of tissues along the potential access route.
[0058] Visibility and operability: Assess the extent of exposure of the target area by the approach, as well as the accessibility of surgical instruments and the operating space.
[0059] Impact on postoperative recovery: Assess the impact of the approach on postoperative pain, respiratory function, and aesthetics.
[0060] Optimization Algorithm: The A* search algorithm or Dijkstra's algorithm is used on the voxel graph to find the optimal path from the predetermined skin incision to the target lesion or surgical area. The A* algorithm guides the search direction through a heuristic function (e.g., a weighted sum of the Euclidean distance from the current node to the target node and the risk assessment) to efficiently find the path with the minimum overall "cost" (safety, degree of damage, operability, etc.).
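The path search can be sketched with A* on a small 6-connected voxel grid. In this simplified example each step costs its length (1 voxel) plus a non-negative risk penalty of the voxel being entered, and forbidden voxels (for example inside a vessel) are marked `None`. The heuristic is the plain Euclidean distance to the goal, which never overestimates the remaining cost under this cost model. The combined cost model and all names are assumptions, and the multi-factor edge weights described above are collapsed into a single risk value.

```python
import heapq
from math import dist


def astar(risk, start, goal):
    """Minimum-cost path on a voxel grid; risk[z][y][x] is a penalty or None."""
    shape = (len(risk), len(risk[0]), len(risk[0][0]))
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    open_heap = [(dist(start, goal), 0.0, start)]
    best = {start: 0.0}
    parent = {}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], g
        if g > best[node]:
            continue   # stale heap entry
        for d in steps:
            nxt = tuple(n + s for n, s in zip(node, d))
            if any(c < 0 or c >= m for c, m in zip(nxt, shape)):
                continue
            penalty = risk[nxt[0]][nxt[1]][nxt[2]]
            if penalty is None:
                continue   # forbidden voxel
            cost = g + 1.0 + penalty
            if cost < best.get(nxt, float("inf")):
                best[nxt] = cost
                parent[nxt] = node
                heapq.heappush(open_heap, (cost + dist(nxt, goal), cost, nxt))
    return None, float("inf")


# One slice: a forbidden voxel in the centre, a high-risk voxel on the right.
risk_map = [[[0, 0, 0],
             [0, None, 5],
             [0, 0, 0]]]
path, cost = astar(risk_map, (0, 0, 0), (0, 2, 2))
print(path, cost)
```

Setting the heuristic to zero turns the same loop into Dijkstra's algorithm.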
[0061] Tumor resection boundary delineation and lymph node dissection strategy unit:
[0062] Calculation of safe resection margin: Based on the nature of the lesion (benign or malignant, tumor type) and clinical guidelines, a preset safe resection margin relative to the tumor surface is automatically generated on the 3D model. This margin can be calculated using a 3D Euclidean Distance Transform to determine the isosurface at a specific distance (e.g., 5 mm or 10 mm) from the tumor surface.
[0063] Lymph node identification and risk assessment: Standard thoracic lymph node zoning maps are precisely registered onto the patient's individual 3D model. By combining SUV values from PET images, lymph node morphological characteristics (size, density) from CT images, and clinicopathological information, suspicious lymph nodes are automatically identified and their metastasis risk is assessed.
[0064] Lymph node dissection path planning: Based on the lymph node risk assessment results and the relationship between the surrounding blood vessels and nerves, the path and scope of lymph node dissection are planned to minimize damage to surrounding structures while maximizing the thoroughness of the dissection.
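The safe resection margin calculation described above can be illustrated with a brute-force stand-in for the 3D Euclidean Distance Transform: every non-tumor voxel whose distance to the nearest tumor voxel does not exceed the margin is marked as part of the margin shell. This approximates the distance to the tumor surface by the distance between voxel centres and scales poorly with volume size, so a real system would use a dedicated distance transform implementation. The names and the nested-list layout are assumptions.

```python
from math import dist


def margin_shell(tumor, margin_mm, spacing=(1.0, 1.0, 1.0)):
    """Mark non-tumor voxels within margin_mm of the nearest tumor voxel."""
    sz, sy, sx = spacing
    nz, ny, nx = len(tumor), len(tumor[0]), len(tumor[0][0])
    tumor_pts = [(z * sz, y * sy, x * sx)
                 for z in range(nz) for y in range(ny) for x in range(nx)
                 if tumor[z][y][x]]
    shell = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    if not tumor_pts:
        return shell
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if tumor[z][y][x]:
                    continue
                p = (z * sz, y * sy, x * sx)
                # Distance in millimetres to the closest tumor voxel centre.
                if min(dist(p, q) for q in tumor_pts) <= margin_mm:
                    shell[z][y][x] = 1
    return shell


tumor_mask = [[[0, 0, 0, 1, 0, 0, 0]]]
print(margin_shell(tumor_mask, margin_mm=2.0))
```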
[0065] Vascular, airway, and nerve protection strategy unit: This unit continuously monitors the relative positions of the planned surgical path and important blood vessels, airways, and nerves.
[0066] Real-time distance calculation: During the planning process, the system calculates the minimum distance between the tip of the surgical instrument or the cutting plane and the blood vessel, airway wall or nerve bundle in real time, and alerts the surgeon through visual warnings (such as color changes or distance markings).
[0067] Key point identification: Identify high-risk areas such as vascular branching points, airway bifurcation points, and areas with dense nerve pathways, and mark them as restricted or high-risk areas to guide doctors to avoid them.
[0068] Flow / Function Simulation (Advanced Function): Predicts changes in blood / airflow in a specific blood vessel or airway after compression or partial resection using simplified computational fluid dynamics (CFD) models or circuit simulations, and assesses their impact on organ function.
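The real-time distance calculation and visual warning described above can be sketched as follows, assuming the critical structure is available as a list of surface points in millimetres and that the warning thresholds (5 mm and 10 mm here) are configurable. The thresholds, color names, and function names are hypothetical. An interactive system would typically accelerate the nearest-point query with a spatial index.

```python
from math import dist


def min_distance(tip, structure_points):
    """Smallest Euclidean distance (mm) from the instrument tip to a structure."""
    return min(dist(tip, p) for p in structure_points)


def warning_level(distance_mm, caution_mm=10.0, danger_mm=5.0):
    """Translate a distance into a display color for the visual warning."""
    if distance_mm < danger_mm:
        return "red"
    if distance_mm < caution_mm:
        return "yellow"
    return "green"


# Sampled points along a vessel wall (hypothetical coordinates in mm).
vessel = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0), (0.0, 0.0, 10.0)]
for tip in [(20.0, 0.0, 5.0), (8.0, 0.0, 5.0), (3.0, 0.0, 4.0)]:
    d = min_distance(tip, vessel)
    print(tip, round(d, 2), warning_level(d))
```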
[0069] Surgical procedure sequence simulation and risk assessment unit:
[0070] Virtual surgical simulation: Allows surgeons to perform virtual surgical operations on a 3D model, such as simulating instrument grasping, cutting, and suturing. The system simulates tissue deformation and response based on a preset physical model (such as an elasticity model).
[0071] Complication risk prediction: Based on the patient's anatomical characteristics, lesion nature, surgical type, and historical data, machine learning models (such as random forests or neural networks) are used to predict the probability of common complications such as intraoperative bleeding, pneumothorax, pleural infection, air leak, and nerve injury. These models are trained by analyzing the imaging features, clinical parameters, and complication occurrences of a large number of surgical cases.
[0072] Risk visualization: The predicted risk level is overlaid on a 3D model in the form of a heat map or color coding to clearly indicate potential high-risk areas or operations.
[0073] 6. Visualization and Interaction Module
[0074] The visualization and interaction module is designed to present complex multimodal image data, three-dimensional reconstruction models, and planning results to surgeons in an intuitive and multidimensional way, and to provide a flexible human-computer interaction interface to support doctors' personalized adjustments and decisions.
[0075] The visualization and interaction functions include the following key sub-functional units:
[0076] Multi-dimensional rendering unit: Supports volume rendering, surface rendering, and hybrid rendering modes, and can simultaneously display the internal structure and surface morphology of tissue.
[0077] Volume rendering: By mapping voxel density values to color and transparency through transfer functions, non-invasive visualization of internal structures such as lung parenchyma, airways, and blood vessels is achieved. Adjustable clipping planes and window widths / levels are supported.
[0078] Surface rendering: Independent surface mesh rendering is performed on the segmented key anatomical structures (such as tumors, bones, and large blood vessels), supporting custom colors, transparency, and materials.
[0079] Fusion display: Allows for the simultaneous display of different modalities of images (such as CT bone, MRI soft tissue, PET metabolic hotspots) in an overlay or fusion manner, such as overlaying PET color heatmaps onto the anatomical surface of CT / MRI.
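The transfer function used in volume rendering, together with the adjustable window width and level, can be sketched as follows. This is a minimal illustration only: the gray ramp, the quadratic opacity curve, and the default window values are assumptions and are not prescribed by this invention.

```python
def window_level(value_hu, level, width):
    """Map a CT value to [0, 1] using a window level (center) and window width."""
    low = level - width / 2.0
    t = (value_hu - low) / width
    return min(1.0, max(0.0, t))

def transfer_function(value_hu, level=-600.0, width=1500.0):
    """Return (r, g, b, alpha) for one voxel.

    The gray ramp and the quadratic opacity curve are illustrative choices.
    """
    t = window_level(value_hu, level, width)
    alpha = t * t  # denser voxels become more opaque
    return (t, t, t, alpha)
```

A renderer would evaluate such a function for every sample along each viewing ray and composite the resulting colors and opacities.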
[0080] Interactive operation unit: Provides an intuitive toolset that supports surgeons in real-time rotation, translation, scaling, measurement (distance, angle, volume), cutting, annotation, and drawing and modification of virtual surgical paths for models.
[0081] Virtual sectioning: Supports planar sectioning in any direction to reveal the anatomical relationships inside the model.
[0082] Path drawing and editing: Doctors can directly draw potential surgical approaches or resection areas on the 3D model, and the system provides real-time feedback on the path risk assessment results.
[0083] Multi-view synchronization: Supports synchronized linkage between 2D views (axial, coronal, sagittal, etc.) and 3D views, so that operations on any view can be reflected in other views in real time.
[0084] Report Generation Unit: Automatically generates a structured report based on preoperative planning results. Report content includes, but is not limited to: basic patient information, imaging findings, key anatomical measurements, precise tumor location and volume, distance from surrounding vital organs, recommended surgical approach, expected resection extent, lymph node dissection strategy, and potential complication risk assessment. Reports can be exported to PDF or DICOM SR (Structured Report) format.
[0085] 7. Knowledge Base and Database
[0086] The knowledge base and database are used to store patient image data, reconstruction models, planning results, surgical templates, anatomical atlases and related clinical guidelines, and serve as data support for the intelligent module.
[0087] The knowledge base and database include the following key sub-functional units:
[0088] Patient data storage unit: Stores all patients' raw multimodal image data, processed fused images, segmented and reconstructed 3D models, and detailed results of each preoperative planning session. Data is stored in encrypted form and complies with medical data security and privacy protection standards (such as HIPAA).
[0089] Anatomical Atlas Library: Contains standardized thoracic anatomical atlases, covering detailed structures such as blood vessels, airways, and lymph node divisions, used to guide automated segmentation and registration, and as a reference for preoperative planning.
[0090] Surgical guidelines and experience knowledge base: This database stores classic surgical procedures, operational guidelines, clinical pathways, and surgical experiences and successful cases contributed by expert physicians for various thoracic surgeries. This knowledge can be accessed by the intelligent planning module to guide the optimization and decision-making of surgical pathways.
[0091] Machine Learning Model Library: Stores trained deep learning segmentation models, risk prediction models, and other AI-based analytics models. The library supports model version management and dynamic updates.
[0092] Data Indexing and Retrieval Unit: Provides an efficient data indexing mechanism and flexible retrieval functions, supporting data query and management based on various conditions such as patient ID, imaging modality, disease type, and surgery date.
[0093] II. Methodological Aspects
[0094] This invention proposes a preoperative planning method for thoracic surgery that integrates multimodal imaging, the process of which includes the following steps:
[0095] Step S100: Multimodal image data acquisition and preprocessing.
[0096] This step acquires the patient's raw medical image data, including CT, MRI, and PET scans, through a data acquisition and preprocessing module. Before entering the subsequent processing flow, the data undergoes a series of preprocessing operations, such as unified data format conversion, image quality assessment, adaptive image enhancement, intensity normalization, and spatial resampling, to ensure that all image data achieves a high degree of consistency and standardization in terms of space, intensity, and format, laying the foundation for subsequent high-precision registration and analysis.
[0097] Step S200: High-precision registration and fusion of multimodal images.
[0098] This step uses the multimodal image registration and fusion module to spatially align the preprocessed images of different modalities with high precision and to integrate their information. First, an initial rigid alignment based on image features or image moments applies preliminary translation and rotation corrections to each image. Then, crucially, non-rigid registration based on Large Deformation Diffeomorphic Metric Mapping (LDDMM) is performed. This process iteratively optimizes a smooth velocity field $v_t$, generating a diffeomorphic transformation $\varphi = \varphi_1$ that nonlinearly maps the source image $I_s$ onto the target image $I_t$. The optimization uses mutual information (MI) as the image similarity measure and adds a regularization term that constrains the smoothness of the velocity field, ensuring the biological plausibility and invertibility of the deformation field. Specifically, the objective function to be minimized is defined as:
[0099] $$E(v) = -\mathrm{sim}\left(I_s \circ \varphi_1^{-1},\, I_t\right) + \lambda \int_0^1 \left\| L v_t \right\|^2 \, dt$$ where $\mathrm{sim}$ is the image similarity measure (here MI), $\lambda > 0$ is the regularization weight, and $L$ is a differential operator that penalizes roughness of the velocity field.
[0100] After high-precision registration is completed, the multi-scale geometric analysis (MSGA) fusion algorithm is used to decompose the registered images of different modalities at different scales and directions. Based on the sub-band characteristics, fusion rules such as weighted average or maximum value selection are used to integrate information, and finally a unified composite image dataset with complementary information is reconstructed.
[0101] Step S300: Precise segmentation and three-dimensional reconstruction of key anatomical structures.
[0102] This step uses the precise anatomical structure segmentation and 3D reconstruction module to automatically and accurately segment key anatomical structures within the thoracic cavity (such as lung parenchyma, trachea, bronchi, pulmonary vessels, heart, esophagus, chest wall, and lymph nodes) and target lesions (such as tumors) from the fused multimodal image data. The segmentation is primarily based on a 3D U-Net deep neural network or one of its variants. Through an encoder-decoder structure and skip connections, this model achieves multi-scale feature extraction and accurate voxel-level classification. Model training employs a weighted combination of Dice loss and cross-entropy loss as a composite loss function to optimize the overlap and edge accuracy of the segmentation results. After segmentation, morphological post-processing is performed to eliminate artifacts. Subsequently, the segmented binary voxel mask is input into the Marching Cubes algorithm to generate an initial 3D surface mesh model for each structure. Finally, the generated model undergoes mesh smoothing, mesh simplification, and topology repair to ensure geometric accuracy, visual fidelity, and computational efficiency.
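The composite loss described in this step can be sketched as below for the binary case. This is a minimal illustration over flattened voxel lists. The parameter `dice_weight`, the smoothing constants, and the function names are hypothetical, and a practical implementation would operate on tensors inside a deep learning framework.

```python
import math

def dice_loss(probs, labels, eps=1e-6):
    """Soft Dice loss for binary segmentation over flattened voxel lists."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * intersection + eps) / (sum(probs) + sum(labels) + eps)

def cross_entropy_loss(probs, labels, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(1.0 - eps, max(eps, p))
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)

def composite_loss(probs, labels, dice_weight=0.5):
    """Weighted combination of the two losses; dice_weight is a hypothetical hyperparameter."""
    return (dice_weight * dice_loss(probs, labels)
            + (1.0 - dice_weight) * cross_entropy_loss(probs, labels))
```

The Dice term rewards overlap between prediction and ground truth, which helps with small structures, while the cross-entropy term penalizes each misclassified voxel individually.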
[0103] Step S400: Mapping lesions to functional information.
[0104] This step, through a lesion-functional information mapping module, precisely integrates the metabolic activity information of lesions carried by PET images into a three-dimensional anatomical model. First, based on the raw PET data and the patient's physiological parameters, the standardized uptake value (SUV) of the lesion region is calculated, including quantitative indicators such as SUVmax and SUVmean. Then, these SUV values are precisely superimposed onto the three-dimensional surface or voxel interior of the reconstructed tumor or other related anatomical structures in the form of color coding or transparency mapping using spatial interpolation methods. For example, a gradient color spectrum from low to high metabolism can be set, displaying high SUV areas as warm colors and low SUV areas as cool colors, thus visually reflecting the biological characteristics and invasiveness of the lesion.
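The SUV calculation in this step can be illustrated with the commonly used body-weight formulation, in which tissue activity concentration is divided by the decay-corrected injected dose per gram of body weight. The sketch below is a simplified illustration: the function names, the choice of the body-weight variant, and the default half-life (an F-18 tracer is assumed) are assumptions not stated in this invention.

```python
def suv_body_weight(activity_bq_per_ml, injected_dose_bq, body_weight_kg,
                    minutes_since_injection=0.0, half_life_min=109.77):
    """Body-weight SUV for one voxel, assuming 1 g of tissue occupies about 1 mL.

    The default half-life corresponds to F-18 and is an assumption.
    """
    decayed_dose = injected_dose_bq * 0.5 ** (minutes_since_injection / half_life_min)
    return activity_bq_per_ml * (body_weight_kg * 1000.0) / decayed_dose

def lesion_suv_stats(lesion_activities, injected_dose_bq, body_weight_kg, **kwargs):
    """Return (SUVmax, SUVmean) over the voxels of a segmented lesion."""
    suvs = [suv_body_weight(a, injected_dose_bq, body_weight_kg, **kwargs)
            for a in lesion_activities]
    return max(suvs), sum(suvs) / len(suvs)

# Three lesion voxels (Bq/mL), 350 MBq injected, 70 kg patient, no decay correction.
suv_max, suv_mean = lesion_suv_stats([5000.0, 10000.0, 15000.0], 350e6, 70.0)
```

The resulting per-voxel SUV values are what would then be mapped onto the model surface through a color scale.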
[0105] Step S500: Intelligent preoperative planning and path optimization.
[0106] This step is the core innovation of this invention. Through the intelligent preoperative planning and path optimization module, personalized and refined surgical plans are generated and optimized for thoracic surgery based on a three-dimensional anatomical model that integrates multimodal images.
[0107] S510: Surgical Approach Selection and Optimization: The system constructs a weighted voxel graph in three-dimensional anatomical space, combining lesion location, patient anatomical characteristics, and surgical type. Each node in the graph represents a possible spatial location, and each edge represents a potential path between adjacent locations. The weight of the edge is determined by the following composite factors: the minimum safe distance to critical structures such as important blood vessels, airways, nerves, esophagus, and heart (the closer the distance, the higher the risk weight), the potential degree of tissue damage (the type and thickness of the tissue traversed), the accessibility of surgical instruments and the operating field of view, and the impact on the patient's postoperative recovery. The A* search algorithm is used to search for the optimal surgical path from a pre-set surface incision to the target lesion or surgical area on the voxel graph, with the goal of comprehensively minimizing the "cost".
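The path search in S510 can be sketched as an A* search over a small voxel grid. In this minimal illustration each voxel is assumed to carry a precomputed traversal cost that already combines the factors listed above. The 6-connected neighborhood, the use of `None` for forbidden voxels, and the Manhattan-distance heuristic are illustrative assumptions.

```python
import heapq

def astar(cost, start, goal):
    """A* over a 3D grid; cost[z][y][x] is the cost of entering a voxel (None = forbidden)."""
    nz, ny, nx = len(cost), len(cost[0]), len(cost[0][0])
    min_cost = min(c for plane in cost for row in plane for c in row if c is not None)

    def heuristic(p):
        # Manhattan distance times the cheapest voxel cost never overestimates
        # the remaining cost on a 6-connected grid.
        return min_cost * sum(abs(a - b) for a, b in zip(p, goal))

    open_heap = [(heuristic(start), 0.0, start)]
    best = {start: 0.0}
    parent = {}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in parent:
                node = parent[node]
                path.append(node)
            return path[::-1], g
        if g > best.get(node, float("inf")):
            continue  # stale heap entry
        z, y, x = node
        for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            nb = (z + dz, y + dy, x + dx)
            if not (0 <= nb[0] < nz and 0 <= nb[1] < ny and 0 <= nb[2] < nx):
                continue
            step = cost[nb[0]][nb[1]][nb[2]]
            if step is None:
                continue
            new_g = g + step
            if new_g < best.get(nb, float("inf")):
                best[nb] = new_g
                parent[nb] = node
                heapq.heappush(open_heap, (new_g + heuristic(nb), new_g, nb))
    return None, float("inf")

# One slice with a high-risk voxel in the center that the path should avoid.
grid = [[[1, 1, 1],
         [1, 50, 1],
         [1, 1, 1]]]
path, total = astar(grid, (0, 1, 0), (0, 1, 2))
```

In this toy grid the straight route through the center costs 51, so the search returns a detour of total cost 4 around the high-risk voxel.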
[0108] S520: Tumor Resection Boundary Delineation and Lymph Node Dissection Strategy: Based on the lesion nature and clinical guidelines, a safe resection margin isosurface is automatically generated on the 3D model at a preset safe distance (e.g., 5 mm or 10 mm) from the tumor surface using 3D Euclidean distance transformation. Simultaneously, a standardized thoracic lymph node zonation atlas is registered to the individual patient model, and the metastasis risk of each lymph node station is assessed by combining PET SUV values and CT morphological features. Based on the risk assessment, the scope and path of lymph node dissection are intelligently planned to maximize the protection of surrounding critical structures while ensuring thorough dissection.
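The margin generation in S520 can be illustrated with a brute-force Euclidean distance computation on a small grid. This is a sketch under stated assumptions: the function names and the dictionary-based representation are hypothetical, and a practical system would use an optimized distance transform (for example `scipy.ndimage.distance_transform_edt` with its `sampling` argument for voxel spacing).

```python
import math

def distance_to_mask(mask_voxels, shape, spacing_mm=(1.0, 1.0, 1.0)):
    """Brute-force Euclidean distance (mm) from every voxel to the nearest tumor voxel."""
    dist = {}
    for z in range(shape[0]):
        for y in range(shape[1]):
            for x in range(shape[2]):
                dist[(z, y, x)] = min(
                    math.sqrt(sum(((a - b) * s) ** 2
                                  for a, b, s in zip((z, y, x), t, spacing_mm)))
                    for t in mask_voxels)
    return dist

def resection_region(mask_voxels, shape, margin_mm, spacing_mm=(1.0, 1.0, 1.0)):
    """Voxels lying within margin_mm of the tumor, including the tumor itself."""
    dist = distance_to_mask(mask_voxels, shape, spacing_mm)
    return {v for v, d in dist.items() if d <= margin_mm}

# A single-voxel "tumor" in the middle of a 5 x 5 x 5 grid with 1 mm voxels.
tumor = {(2, 2, 2)}
region_1mm = resection_region(tumor, (5, 5, 5), 1.0)
```

The boundary of the returned region corresponds to the safety-margin isosurface at the chosen distance.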
[0109] S530: Vascular, Airway, and Nerve Protection Strategy: The system continuously monitors in real time the minimum distance between the planned surgical path, resection plane, or virtual instrument tip and major blood vessels (such as the aortic arch, pulmonary artery, pulmonary veins and their branches), trachea, bronchi and their branches, phrenic nerve, recurrent laryngeal nerve, and other key anatomical structures. Once the distance falls below a preset safety threshold, the system issues an alert to the surgeon through visual warnings (such as color highlighting or distance numerical display) and assists in adjusting the path to avoid potential damage.
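The distance monitoring in S530 reduces to a minimum-distance query followed by a threshold comparison. The sketch below assumes that the planned path and each protected structure are available as lists of 3D points in millimeters. The structure names and threshold values are hypothetical examples.

```python
import math

def min_distance_mm(path_points, structure_points):
    """Smallest Euclidean distance between any path point and any structure point."""
    return min(math.dist(p, q) for p in path_points for q in structure_points)

def check_clearance(path_points, structures, thresholds_mm):
    """Return (structure name, distance) pairs that violate their safety threshold."""
    alerts = []
    for name, points in structures.items():
        d = min_distance_mm(path_points, points)
        if d < thresholds_mm[name]:
            alerts.append((name, round(d, 2)))
    return alerts

planned_path = [(0.0, 0.0, 0.0), (0.0, 0.0, 10.0)]
structures = {
    "pulmonary_artery": [(3.0, 4.0, 10.0)],
    "phrenic_nerve": [(30.0, 0.0, 0.0)],
}
alerts = check_clearance(planned_path, structures,
                         {"pulmonary_artery": 6.0, "phrenic_nerve": 5.0})
```

For interactive use, a spatial index such as a k-d tree would replace the exhaustive pairwise comparison.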
[0110] S540: Surgical Procedure Sequence Simulation and Risk Assessment: This system allows surgeons to simulate virtual surgical procedures on a 3D model, including the entry and exit of instruments and the cutting and separation of tissues. Based on the patient's individual anatomical characteristics, lesion features, and planned surgical path, the system uses machine learning models (such as regression or classification models based on historical big data) to predict the probability of common complications such as intraoperative blood loss, pneumothorax, air leakage, nerve injury, and postoperative infection. The prediction results are overlaid on the 3D model as risk levels or heatmaps, providing surgeons with comprehensive risk insights.
[0111] Step S600: Visualization and Interaction.
[0112] This step utilizes a visualization and interactive module to present all the above processing and planning results in an intuitive and multi-dimensional way. This module supports volume rendering, surface rendering, and hybrid rendering modes, allowing doctors to freely switch views, perform cross-sections in any direction, and perform real-time rotation, translation, scaling, measurement, and virtual annotation of the model. Doctors can directly draw or modify the surgical path and resection range on the 3D model, and the system provides real-time feedback on corresponding risk assessments. Finally, the system can automatically generate a structured surgical report based on the planning results, covering all key information and risk assessments, and supports exporting to standard medical report formats.
[0113] Step S700: Knowledge base and database management.
[0114] This step involves the operation of the knowledge base and database modules. All patients' original images, processed composite images, 3D reconstruction models, and detailed preoperative planning results are securely stored. The knowledge base also maintains standard anatomical atlases, surgical guidelines and experiential knowledge, as well as all trained machine learning models. This module provides efficient data indexing, retrieval, and management functions, and continuously provides data support and model update services to other modules.
[0115] The beneficial effects of this invention are:
[0116] Significant improvements in image information fusion and accuracy: By introducing non-rigid registration technology based on the LDDMM algorithm and a fusion algorithm based on multi-scale geometric analysis, high-precision, automated spatial alignment and complementary information fusion of multimodal image data such as CT, MRI, and PET at the voxel level have been achieved. This completely solves the information fragmentation problem caused by single-modality or limited fusion capabilities in existing systems, constructing a unified, comprehensive, and highly accurate three-dimensional anatomical reference system for patients. It provides surgeons with an unprecedented multi-dimensional view of anatomical and functional information, greatly reducing the error and cognitive load of manual information comparison.
[0117] Refined Anatomical Structure Recognition and 3D Reconstruction: Utilizing a 3D U-Net deep neural network model for automated segmentation of key anatomical structures and lesions, combined with the Marching Cubes 3D reconstruction algorithm and model optimization technology, this invention can generate highly detailed and topologically correct intrathoracic 3D anatomical models, including complex structures such as microvessels, bronchioles, and lymph nodes. This far surpasses the ability of existing systems to coarsely identify anatomical structures, providing a solid and accurate anatomical foundation for preoperative planning.
[0118] Breakthrough in Intelligent and Personalized Preoperative Planning Capabilities: This invention introduces an intelligent planning function specifically tailored to the characteristics of thoracic surgery, fundamentally different from the planning philosophy of existing systems that serve specific treatment modalities (such as radiotherapy and ablation). By optimizing the surgical approach through the A* search algorithm, precisely defining the surgical margins using Euclidean distance transformation, and monitoring the distances to important blood vessels, airways, and nerves in real time, it can generate personalized surgical paths that are highly safe, minimally invasive, and highly operable. Simultaneously, intelligent assistance in lymph node dissection strategies and machine learning-driven complication risk prediction greatly enhance the scientific rigor and predictability of the planning, transforming the traditional experience-based planning model into a data-driven and intelligent decision-making support model.
[0119] Surgical safety and success rates are significantly improved: This invention, through monitoring safe distances to critical structures, intelligent alerts for high-risk areas, and early prediction of complications, enables surgeons to comprehensively anticipate potential risks before surgery and develop detailed avoidance strategies. This effectively reduces the probability of intraoperative accidents and significantly improves surgical safety. Refined planning also helps shorten surgical time, reduce intraoperative bleeding, and is expected to improve postoperative recovery.
[0120] Optimization of doctor-patient communication and teaching / training: The intuitive, multi-dimensional 3D visualization model and virtual surgical simulation function not only help surgeons to better understand complex conditions and accurately formulate surgical plans, but also provide an excellent platform for doctor-patient communication, enabling patients to more intuitively understand their condition and surgical plan. Furthermore, this system can also serve as a powerful tool for advanced medical teaching and training, accelerating the growth of young doctors and improving the overall level of medical care.
[0121] In summary, this invention has achieved significant breakthroughs in both principle and function in multimodal image processing, refined anatomical modeling, and intelligent surgical planning and risk assessment, providing a comprehensive, efficient, and precise preoperative planning solution for modern thoracic surgery.
Attached Figure Description
[0122] Figure 1 is a schematic diagram of the preoperative planning system for thoracic surgery that integrates multimodal imaging according to the present invention.
Detailed Implementation
[0123] The present invention will be further described below with reference to the accompanying drawings and specific embodiments. The illustrative embodiments and descriptions herein are used to explain the present invention, but are not intended to limit the present invention.
[0124] This invention provides a preoperative planning system for thoracic surgery that integrates multimodal imaging, aiming to revolutionize the traditional planning model for thoracic surgery. By deeply integrating multimodal medical imaging data, achieving precise three-dimensional reconstruction of complex anatomical structures, and introducing an intelligent decision-making support mechanism, it provides surgeons with an unprecedentedly refined and personalized preoperative planning solution. This embodiment will provide a detailed description of the specific operation procedures of each component module of the system and the corresponding methods to ensure that those skilled in the art can understand and implement the technical solution of this invention.
[0125] I. Detailed Explanation of Each Module of the System
[0126] The thoracic surgery preoperative planning system integrating multimodal imaging described in this invention has a meticulous architecture, with functional modules tightly coupled through standardized interfaces, clear data flow, and rigorous processing logic. The core of this system includes a data acquisition and preprocessing module, a multimodal image registration and fusion module, a precise anatomical structure segmentation and 3D reconstruction module, a lesion and functional information mapping module, an intelligent preoperative planning and path optimization module, a visualization and interaction module, as well as a knowledge base and database.
[0127] 1. Data Acquisition and Preprocessing Module
[0128] The data acquisition and preprocessing module, serving as the system's information input portal, is primarily responsible for receiving and standardizing multimodal raw data from various medical imaging devices. This module pre-configures multiple highly compatible data interfaces, supporting not only the industry-standard DICOM (Digital Imaging and Communications in Medicine) interface to ensure seamless integration with mainstream CT, MRI, and PET scanning equipment, but also the HL7 (Health Level Seven) interface for acquiring patient clinical information. Furthermore, it reserves customized interfaces for other common medical imaging data formats (such as NIfTI, Minc, Analyze, etc.). The module receives a wide range of raw image types, typically including high-resolution computed tomography (CT) images to provide excellent information on bone, airway, and tissue density; magnetic resonance imaging (MRI) images to reveal soft tissue contrast and lesion invasion; and positron emission tomography (PET) images to quantify lesion metabolic activity.
[0129] Specifically, the preprocessing process is divided into the following sub-functional units:
[0130] Data Format Conversion Unit: This unit automatically identifies the received raw image data format and converts it into voxel data in the system's standardized NIfTI (Neuroimaging Informatics Technology Initiative) format. This process involves parsing the DICOM header file to extract key metadata (such as patient information, scan parameters, image matrix, voxel spacing, window width and level, etc.) and reconstructing the pixel data into a three-dimensional tensor. To ensure data integrity, each DICOM sequence is validated during the conversion process, checking the number of files, sequence integrity, and header consistency. If any anomalies are found, an alarm is triggered.
[0131] Image Quality Assessment Unit: After data format conversion, this unit performs rigorous quality control on each set of image data. Its assessment metrics include, but are not limited to:
[0132] Noise level assessment: The presence of excessive noise is determined by calculating the standard deviation of gray levels or the signal-to-noise ratio (SNR) of local areas of the image (such as lung parenchyma or homogeneous water phantom areas). For example, in CT images, if the standard deviation of gray levels in the lung parenchyma area exceeds 5 HU or the SNR is less than 15 dB, it is considered to be excessively noisy.
[0133] Artifact detection: Utilizing methods based on Fourier transform or local texture analysis, the system detects the presence of motion artifacts (such as blurring or repetitive structures caused by breathing artifacts), metallic artifacts (manifesting as high-density scattering or stripes), or truncation artifacts. Detection methods may include calculating anomalies in the image's frequency domain energy distribution, or detecting extreme abrupt changes and periodic patterns in grayscale values within the spatial domain. If significant artifacts are detected, the system will mark the image and prompt for manual review or recommend a rescan.
[0134] Image integrity verification: Check whether the number of slices in the image sequence is consistent with the expectation, whether there are missing or duplicate slices, and ensure the continuity of the three-dimensional voxel data.
[0135] Signal-to-noise ratio (SNR) analysis: For different modalities of image characteristics, such as MRI images, an SNR estimation method based on the Rician distribution model can be used. The system will judge the evaluation results according to preset quality thresholds (e.g., SNR not less than 20dB for CT images, not less than 10dB for MRI images, and not less than 5dB for PET images). If the SNR falls below the threshold, a warning will be triggered.
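The quality checks listed above can be sketched as simple threshold tests. In this minimal illustration the per-modality SNR thresholds are the example values given in the text, while the SNR definition (20·log10 of mean signal over noise standard deviation), the function names, and the slice-position tolerance are assumptions.

```python
import math
import statistics

# Example thresholds (dB) taken from the text.
SNR_THRESHOLD_DB = {"CT": 20.0, "MRI": 10.0, "PET": 5.0}

def snr_db(signal_values, noise_values):
    """SNR in dB as 20*log10(|mean signal| / noise standard deviation) (assumed definition)."""
    mean_signal = abs(statistics.fmean(signal_values))
    return 20.0 * math.log10(mean_signal / statistics.pstdev(noise_values))

def passes_snr(modality, signal_values, noise_values):
    """True if the estimated SNR meets the threshold for the given modality."""
    return snr_db(signal_values, noise_values) >= SNR_THRESHOLD_DB[modality]

def check_slice_positions(positions_mm, spacing_mm, tol=1e-3):
    """Report duplicate slice positions and gaps larger than the expected spacing."""
    ordered = sorted(positions_mm)
    pairs = list(zip(ordered, ordered[1:]))
    duplicates = [b for a, b in pairs if abs(b - a) < tol]
    gaps = [(a, b) for a, b in pairs if b - a > spacing_mm + tol]
    return {"duplicates": duplicates, "gaps": gaps}
```

For example, `check_slice_positions([0.0, 1.0, 1.0, 3.0], 1.0)` flags the repeated slice at 1.0 mm and the missing slice between 1.0 mm and 3.0 mm.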
[0136] Image Enhancement Unit: This unit implements adaptive image enhancement algorithms to improve image quality and visual readability, taking into account the inherent characteristics of different modalities of images.
[0137] For CT images, to remove noise effectively while preserving fine edge details, this invention preferably employs three-dimensional anisotropic diffusion filtering (3D Anisotropic Diffusion Filtering). This algorithm iteratively calculates local voxel gradients, suppressing diffusion across edges and accelerating diffusion in uniform regions. Its core is a diffusion coefficient function that depends on the local gray-level gradient, such as the Perona-Malik function. Typical parameter settings include the number of iterations (e.g., 10-30), the diffusion constant (e.g., 0.01-0.1, related to the image gradient range), and the gradient threshold.
[0138] For MRI images, considering their common Rician noise characteristics, a thresholding denoising algorithm based on wavelet transform can be applied. By performing multi-level wavelet decomposition on the image, thresholding of wavelet coefficients at different scale subbands (such as VisuShrink or SureShrink thresholding), and then performing inverse wavelet transform to reconstruct the image, noise can be effectively removed while maintaining image sharpness.
[0139] For PET images, to improve the signal-to-noise ratio and contrast, iterative reconstruction algorithms such as OSEM (Ordered Subset Expectation Maximization) or more general Gaussian smoothing filters (e.g., using a Gaussian kernel with σ=1.5-2.5mm) can be used to reduce high-frequency noise through smoothing operations. However, care should be taken to avoid loss of detail due to over-smoothing.
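The edge-preserving behavior of the anisotropic diffusion filtering described above for CT images can be illustrated in one dimension. The sketch below is a simplification of the 3D filter to a 1D intensity profile, using the exponential Perona-Malik conduction function. The parameter names `kappa` (gradient threshold) and `step` (diffusion constant) and their default values are assumptions.

```python
import math

def perona_malik_1d(signal, iterations=20, kappa=10.0, step=0.1):
    """Perona-Malik diffusion on a 1D profile with zero-flux boundaries."""
    u = [float(v) for v in signal]
    for _ in range(iterations):
        nxt = u[:]
        for i in range(len(u)):
            left = u[i - 1] - u[i] if i > 0 else 0.0
            right = u[i + 1] - u[i] if i < len(u) - 1 else 0.0
            # Conduction g(d) = exp(-(d / kappa)^2): close to 1 in flat regions,
            # close to 0 across strong edges, so edges are not blurred.
            g_left = math.exp(-(left / kappa) ** 2)
            g_right = math.exp(-(right / kappa) ** 2)
            nxt[i] = u[i] + step * (g_left * left + g_right * right)
        u = nxt
    return u

# A noisy step edge: small fluctuations on both sides of a jump of about 100.
noisy = [0, 2, -1, 1, 0, 100, 101, 99, 100, 102]
smoothed = perona_malik_1d(noisy)
```

In this example the small fluctuations on each side of the jump are smoothed out while the jump itself is left almost untouched.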
[0140] Intensity normalization unit: To eliminate image intensity differences caused by different scanning devices, scanning protocols, or scanning parameters, and to ensure the stability and consistency of subsequent image processing algorithms, this unit employs multiple normalization methods:
[0141] Histogram matching: Matches the grayscale histogram of the source image to a preset reference histogram, which is suitable for eliminating overall intensity shift.
[0142] White / Black Normalization: By identifying the maximum (white) and minimum (black) gray values in an image, the image intensity is linearly mapped to a uniform gray range. For example, CT images can be mapped to [-1000, 3000] HU (Hounsfield Units), and MRI images can be mapped to [0, 255] or [0, 1].
[0143] Z-score normalization: By subtracting the mean and dividing by the standard deviation, the image intensities are rescaled to zero mean and unit standard deviation, which is suitable for statistical analysis. In practice, this invention typically normalizes CT images to [-1000, 3000] HU, MRI images to [0, 1000], and PET images to [0, 10] SUV units to facilitate unified processing in subsequent modules.
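Two of the normalization methods above can be sketched in a few lines. This is a minimal illustration over flattened intensity lists. The function names are hypothetical, and the [-1000, 3000] HU range used in the example is the CT range given in the text.

```python
import statistics

def clip_and_rescale(values, low, high):
    """Clip intensities to [low, high] and map them linearly to [0, 1]."""
    return [(min(high, max(low, v)) - low) / (high - low) for v in values]

def z_score(values):
    """Rescale intensities to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

ct_scaled = clip_and_rescale([-2000, -1000, 1000, 3000, 5000], -1000, 3000)
mri_z = z_score([1.0, 2.0, 3.0])
```

Values outside the clipping range saturate at 0 or 1, which prevents outliers such as metal implants from compressing the useful intensity range.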
[0144] Spatial Resampling Unit: The purpose of this unit is to resample image data with different resolutions and voxel spacing to a uniform spatial resolution, such as isotropic voxels of 1.0 mm × 1.0 mm × 1.0 mm, to eliminate anisotropy and ensure spatial comparability of all modal images. The preferred resampling algorithm is trilinear interpolation, which calculates the gray value of a new voxel as a weighted average of the eight original voxels surrounding the target position. This method is computationally efficient and preserves image details well. For scenarios requiring higher accuracy, higher-order interpolation methods based on B-splines can also be used to further reduce interpolation artifacts.
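Trilinear interpolation and its use for resampling can be sketched as follows. This is a minimal, unoptimized illustration on nested lists indexed as `volume[z][y][x]`. The function names and the clamping at the volume border are assumptions.

```python
import math

def trilinear(volume, z, y, x):
    """Sample volume[z][y][x] at a fractional coordinate by blending the 8 surrounding voxels."""
    z0, y0, x0 = int(math.floor(z)), int(math.floor(y)), int(math.floor(x))
    z1 = min(z0 + 1, len(volume) - 1)        # clamp at the border
    y1 = min(y0 + 1, len(volume[0]) - 1)
    x1 = min(x0 + 1, len(volume[0][0]) - 1)
    fz, fy, fx = z - z0, y - y0, x - x0
    value = 0.0
    for zi, wz in ((z0, 1.0 - fz), (z1, fz)):
        for yi, wy in ((y0, 1.0 - fy), (y1, fy)):
            for xi, wx in ((x0, 1.0 - fx), (x1, fx)):
                value += wz * wy * wx * volume[zi][yi][xi]
    return value

def resample(volume, spacing_mm, target_mm=1.0):
    """Resample a volume with the given voxel spacing to isotropic target_mm voxels."""
    shape = (len(volume), len(volume[0]), len(volume[0][0]))
    new_shape = [int((n - 1) * s / target_mm) + 1 for n, s in zip(shape, spacing_mm)]
    scale = [target_mm / s for s in spacing_mm]
    return [[[trilinear(volume, k * scale[0], j * scale[1], i * scale[2])
              for i in range(new_shape[2])]
             for j in range(new_shape[1])]
            for k in range(new_shape[0])]

vol = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]   # 2 x 2 x 2 volume with 2 mm voxels
iso = resample(vol, (2.0, 2.0, 2.0), 1.0)    # becomes 3 x 3 x 3 at 1 mm
```

The center of the resampled volume, `iso[1][1][1]`, is the average of all eight original voxels.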
[0145] 2. Multimodal image registration and fusion module
[0146] The multimodal image registration and fusion module is the core of this invention, overcoming the limitations of existing technologies and achieving deep integration of multimodal information. Its function is to ensure high-precision, automated spatial alignment between images of different modalities and to fuse them into a unified, complementary three-dimensional anatomical reference system for the patient.
[0147] The registration and fusion process includes the following key sub-functional units:
[0148] Initial Alignment Unit: This unit performs coarse spatial alignment on input images from different modalities such as CT, MRI, and PET, providing a good initial orientation for subsequent fine non-rigid registration. This unit typically employs rigid body registration methods based on feature points or image moments.
[0149] Feature-point-based methods identify common anatomical landmarks in the images (such as vertebral bodies, rib features, the tracheal bifurcation, and other bony structures or gross anatomical features that are easily identifiable in multimodal images). Operators or automated algorithms select at least three pairs of non-collinear corresponding feature points, and the least squares method is then used to solve for a three-dimensional rigid body transformation (consisting of a three-dimensional translation vector T and a three-dimensional rotation matrix R) that aligns the feature points of the source image as closely as possible with those of the target image.
[0150] Image-moment-based methods calculate the geometric center, principal axis, and moment of inertia of the image. Preliminary translation and rotation corrections are achieved by aligning the geometric center and principal axis of the source image with the target image. This method is highly automated but sensitive to image noise and artifacts. This invention typically combines two methods: first, coarse alignment is performed using image moments, followed by fine-tuning using a small number of feature points.
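The image-moment approach can be illustrated in two dimensions, where the principal axis angle has a closed form in terms of the second-order central moments. The sketch below is a 2D simplification of the 3D procedure described above, and the function names are hypothetical.

```python
import math

def centroid_and_axis(image):
    """Intensity-weighted centroid (row, col) and principal-axis angle (radians) of a 2D image."""
    total = sum(sum(row) for row in image)
    cy = sum(r * v for r, row in enumerate(image) for v in row) / total
    cx = sum(c * v for row in image for c, v in enumerate(row)) / total
    mu20 = mu02 = mu11 = 0.0
    for r, row in enumerate(image):
        for c, v in enumerate(row):
            mu20 += v * (c - cx) ** 2
            mu02 += v * (r - cy) ** 2
            mu11 += v * (c - cx) * (r - cy)
    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return (cy, cx), angle

def coarse_alignment(source, target):
    """Translation (rows, cols) and rotation (radians) aligning centroids and principal axes."""
    (sy, sx), source_angle = centroid_and_axis(source)
    (ty, tx), target_angle = centroid_and_axis(target)
    return (ty - sy, tx - sx), target_angle - source_angle

# A horizontal bar in the source and a shifted vertical bar in the target.
source = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
target = [[0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 1, 0],
          [0, 0, 0, 1, 0],
          [0, 0, 0, 1, 0]]
shift, rotation = coarse_alignment(source, target)
```

Principal axes are only defined up to a 180-degree flip, which is one reason a moment-based estimate is refined afterwards with corresponding feature points.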
[0151] Non-rigid registration unit: Starting from the initial alignment, a voxel-based non-rigid registration algorithm corrects local nonlinear deformations caused by factors such as respiratory motion, organ deformation, and differences in patient positioning. One of the core innovations of this invention in this step is the preferred use of the Large Deformation Diffeomorphic Metric Mapping (LDDMM) algorithm. The distinctive feature of this algorithm is that it seeks a smooth, invertible diffeomorphic mapping $\varphi$ such that the source image $I_s$, after transformation by $\varphi$, is highly similar to the target image $I_t$. This mapping ensures the biological plausibility of the geometric deformation, meaning that no tissue tearing or folding occurs.
[0152] Mathematical description of the LDDMM algorithm: The algorithm generates a diffeomorphism by solving an optimization problem. The evolution of the mapping $\varphi_t$ over time $t \in [0, 1]$ is driven by a smooth velocity field $v_t$, i.e., $\frac{d\varphi_t}{dt} = v_t(\varphi_t)$ with $\varphi_0 = \mathrm{Id}$, and the final mapping is $\varphi = \varphi_1$. The optimization objective function $E(v)$, which is minimized, is defined as:
[0153] $$E(v) = -\mathrm{sim}\left(I_s \circ \varphi_1^{-1},\, I_t\right) + \lambda \int_0^1 \left\| L v_t \right\|^2 \, dt$$
[0154] Here, $\mathrm{sim}$ is the image similarity metric function, $I_s$ is the source image, and $I_t$ is the target image. This invention preferably uses Mutual Information (MI) as the similarity metric for multimodal images. MI does not depend on the absolute intensity values of the images but instead captures their joint probability distribution, and is therefore robust to nonlinear intensity relationships between different modalities. Mutual information is defined as $\mathrm{MI}(A, B) = H(A) + H(B) - H(A, B)$, where $H(\cdot)$ is the Shannon entropy (obtained from the image's gray-level histogram) and $H(A, B)$ is the joint entropy (obtained from the joint gray-level histogram of the two images). In practical calculations, the image gray-level values are usually discretized into a number of histogram bins (e.g., 256 bins) to estimate the probability distribution. $\lambda$ is a positive regularization parameter, typically between $10^{-3}$ and $10^{1}$, that balances the image similarity term against the velocity field smoothness term. A larger $\lambda$ produces a smoother deformation but may sacrifice registration accuracy; a smaller $\lambda$ does the opposite. $L$ is a differential operator, typically a Gaussian kernel or a Laplacian operator, used to penalize the roughness of the velocity field, ensuring the smoothness and invertibility of the deformation $\varphi$. For example, when $L$ is a Gaussian kernel, its standard deviation controls the degree of local smoothness of the deformation.
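The histogram-based mutual information estimate described above can be sketched as follows. This minimal illustration works on two equally sized, flattened intensity lists. The function names, the default intensity range, and the use of base-2 logarithms (so that the result is in bits) are assumptions.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy (bits) of a histogram given as bin counts over n samples."""
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)

def mutual_information(a, b, bins=256, value_range=(0.0, 1.0)):
    """MI(A, B) = H(A) + H(B) - H(A, B), estimated from a joint gray-level histogram."""
    lo, hi = value_range

    def to_bin(v):
        return min(bins - 1, max(0, int((v - lo) / (hi - lo) * bins)))

    ia = [to_bin(v) for v in a]
    ib = [to_bin(v) for v in b]
    n = len(ia)
    h_a = entropy(Counter(ia).values(), n)
    h_b = entropy(Counter(ib).values(), n)
    h_ab = entropy(Counter(zip(ia, ib)).values(), n)
    return h_a + h_b - h_ab

img_a = [0.1, 0.1, 0.9, 0.9]
img_inverted = [0.9, 0.9, 0.1, 0.1]   # same structure, inverted contrast
img_unrelated = [0.1, 0.9, 0.1, 0.9]  # statistically independent of img_a
mi_inverted = mutual_information(img_a, img_inverted, bins=4)
mi_unrelated = mutual_information(img_a, img_unrelated, bins=4)
```

The inverted image yields 1 bit of mutual information even though its intensities are reversed, while the unrelated image yields 0. This is the property that makes MI suitable for comparing modalities with different intensity relationships.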
[0155] Optimization Solution Process: The optimization problem is typically solved using a combination of variational methods and adjoint state methods, employing iterative gradient descent or quasi-Newton methods (such as L-BFGS) to find the optimal velocity field $v$. In each iteration, the system calculates the gradient of the objective function with respect to the velocity field and updates the velocity field along the descent direction. The iteration continues until the objective function converges (e.g., the change in the objective function stays below $10^{-5}$ for 10 consecutive iterations) or a preset maximum number of iterations (e.g., 100-300) is reached. The process can typically be completed within minutes to tens of minutes. It runs on a high-performance computing platform (equipped with GPU acceleration) and generates a high-precision deformation field that describes the local nonlinear correspondences between different images, achieving sub-pixel-level registration accuracy.
[0156] The fusion algorithm unit, after high-precision registration, fuses image data from different modalities into a single, information-rich composite image dataset. This invention employs a multi-scale geometric analysis (MSGA) fusion algorithm. This algorithm first decomposes the registered multimodal images (such as CT and PET) into sub-bands of multiple scales and orientations. Common decomposition methods include Discrete Wavelet Transform (DWT) or Curvelet Transform. Wavelet Transform can capture the local frequency and spatial information of an image, while Curvelet Transform has a stronger sparse representation ability for curves and edges in an image, making it more suitable for capturing complex anatomical structures in medical images.
[0157] Specific fusion rules: Low-frequency sub-bands (representing the approximate information or background of the image) mainly contain energy and contrast information, so they can be fused with a weighted average, for example by assigning different weights based on the signal-to-noise ratio or clinical importance of each modality. For high-frequency sub-bands (representing detailed information of the image, such as edges, texture, and noise), energy maximum selection or sparse representation fusion rules can be used. Energy maximum selection takes, at each position in corresponding high-frequency sub-bands, the coefficient with the larger energy as the fusion result, preserving the advantageous features of each modality (such as edge sharpness in CT and metabolic hotspots in PET). Sparse representation fusion sparsely decomposes the high-frequency sub-bands over an overcomplete dictionary, fuses the sparse coefficients (e.g., by taking the largest absolute value), and reconstructs the sub-band through the dictionary. Finally, the fused sub-bands are reconstructed into a unified composite image through the inverse transform (such as the inverse wavelet transform or inverse curvelet transform). For example, fused CT and PET images can form a composite voxel dataset that clearly shows anatomical structures (bones, soft tissues) and intuitively presents metabolic hotspots (tumor areas). Its voxel values can be designed to carry composite information, such as CT intensity as the background and PET intensity as superimposed color or transparency information.
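The two fusion rules above (weighted average for low-frequency sub-bands, maximum selection for high-frequency sub-bands) can be sketched on a one-dimensional signal. This is an illustrative simplification only: it assumes a single-level, unnormalized (averaging) Haar transform instead of a full multi-scale 3D decomposition, uses the absolute coefficient value as the per-coefficient energy, and all function names are hypothetical.

```python
def haar_forward(signal):
    """Single-level averaging Haar transform: returns (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward: rebuilds the original samples pair by pair."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out

def fuse(sig_a, sig_b, weight_a=0.5):
    """Fuse two registered signals of equal, even length."""
    lo_a, hi_a = haar_forward(sig_a)
    lo_b, hi_b = haar_forward(sig_b)
    # Low-frequency rule: weighted average of the approximation coefficients
    lo = [weight_a * x + (1 - weight_a) * y for x, y in zip(lo_a, lo_b)]
    # High-frequency rule: keep the detail coefficient with the larger magnitude
    hi = [x if abs(x) >= abs(y) else y for x, y in zip(hi_a, hi_b)]
    return haar_inverse(lo, hi)

ct_row = [10.0, 10.0, 10.0, 90.0]   # sharp edge in the second sample pair
pet_row = [20.0, 20.0, 60.0, 60.0]  # smooth signal without local detail
fused = fuse(ct_row, pet_row)
```

The fused row keeps the edge contributed by the first signal while its overall level is the average of both inputs.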
[0158] 3. Precise Anatomical Structure Segmentation and 3D Reconstruction Module
[0159] The precise segmentation and 3D reconstruction module for anatomical structures automatically and accurately identifies and delineates key anatomical structures and lesions within the thoracic cavity from fused multimodal image data, transforming them into 3D models suitable for visualization and planning. The accuracy of this module directly impacts the reliability of subsequent planning.
[0160] The segmentation and reconstruction process includes the following key sub-functional units:
[0161] Deep learning-driven semantic segmentation unit: This invention uses a three-dimensional U-Net (3D U-Net) deep neural network model and its variants to perform automated, voxel-level, high-precision semantic segmentation of structures such as the lungs, trachea, bronchi, pulmonary vessels (including the main pulmonary artery and its tertiary branches, the main pulmonary vein and its tertiary branches), heart, esophagus, chest wall, mediastinal lymph node stations, and target lesions (such as lung cancer tumors and mediastinal tumors).
[0162] The 3D U-Net architecture inherits the classic encoder-decoder structure of U-Net and extends it to three dimensions. The encoder path extracts features through successive 3D convolutional layers (e.g., 3x3x3 kernels with a stride of 1) and downsamples through 3D max-pooling layers (e.g., 2x2x2 pooling windows with a stride of 2). After each downsampling, the number of feature maps doubles, thereby extracting multi-scale, more abstract feature representations. The decoder path upsamples through 3D transposed convolutional layers (e.g., with a 2x2x2 kernel and a stride of 2) or upsampling layers, progressively restoring spatial resolution while processing the feature maps through 3D convolutional layers. Crucially, the feature maps in the encoder path are passed directly to the corresponding levels of the decoder path via skip connections. These connections fuse the fine spatial information (such as edge details) captured in the encoder path with the rich contextual information (such as lesion location) carried in the decoder path, thereby facilitating accurate edge localization and segmentation while preserving global understanding. The model depth is typically 4-5 encoder-decoder levels, with each encoder level containing two consecutive 3D convolutional blocks (convolution, batch normalization, activation function), and the number of channels increasing level by level, e.g., 32, 64, 128, 256.
[0163] Training and Loss Function: The model's training data comes from large-scale multimodal medical image datasets precisely annotated by experienced radiologists and surgeons (e.g., containing hundreds to thousands of cases, covering multiple disease types and anatomical variations). Data preprocessing includes intensity normalization and spatial resampling, and data augmentation techniques (such as random rotation, translation, scaling, elastic deformation, and brightness / contrast adjustment) are applied to improve the model's generalization ability and robustness. During training, the model preferably employs a composite loss function, such as a weighted combination of Dice loss and cross-entropy loss.
[0164] Dice loss:
[0165] $L_{Dice} = 1 - \frac{2\,|P \cap G|}{|P| + |G|}$, where $P$ is the predicted segmentation region and $G$ is the ground-truth labeled region. The Dice loss is primarily used to optimize the overlap of segmentation results and is particularly suitable for medical image segmentation tasks with class imbalance (such as lesions that are usually much smaller than normal tissue).
[0166] Cross-entropy loss:
[0167] $L_{CE} = -\sum_{i} y_i \log(\hat{y}_i)$, where $y_i$ is the true probability and $\hat{y}_i$ is the predicted probability. The cross-entropy loss is used to optimize voxel-level classification accuracy.
[0168] Through the weighted sum $L_{total} = \alpha L_{Dice} + \beta L_{CE}$, the model can simultaneously take into account both overall segmentation accuracy and edge details, where $\alpha$ and $\beta$ are weight parameters, typically ranging from 0.5 to 1.0, which can be adjusted depending on the specific task. The optimizer is usually Adam or SGD, and the learning rate can be scheduled using cosine annealing or multi-step decay. A typical initial learning rate is $10^{-3}$, the batch size is 1-4 three-dimensional voxel blocks, and training runs for 200-500 epochs.
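A weighted combination of Dice loss and cross-entropy loss can be sketched as follows. The sketch assumes a binary (foreground / background) task with flattened per-voxel probabilities, uses the soft (probabilistic) form of the Dice overlap, and adds small `eps` terms for numerical stability; these choices and the function names are illustrative assumptions, not the patented implementation.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2*sum(p*g) / (sum(p) + sum(g))."""
    inter = sum(p * g for p, g in zip(pred, target))
    return 1.0 - (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)

def cross_entropy_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over all voxels."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= g * math.log(p) + (1 - g) * math.log(1 - p)
    return total / len(pred)

def composite_loss(pred, target, alpha=0.5, beta=0.5):
    """Weighted sum alpha * Dice + beta * cross-entropy."""
    return alpha * dice_loss(pred, target) + beta * cross_entropy_loss(pred, target)

labels = [1, 0, 1, 0]
good = composite_loss([0.9, 0.1, 0.8, 0.2], labels)  # mostly correct prediction
bad = composite_loss([0.1, 0.9, 0.2, 0.8], labels)   # mostly wrong prediction
```

A prediction that agrees with the labels receives a lower composite loss than one that contradicts them, and the two weights shift the emphasis between region overlap and per-voxel classification.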
[0169] Post-processing optimization: After an initial binary mask is generated, the segmentation results undergo a series of morphological operations to smooth them, remove small artifacts or breaks, and fill small holes. For example, the opening operation (erosion followed by dilation) can be used to remove small protrusions and burrs, and the closing operation (dilation followed by erosion) can be used to fill small holes and connect broken structures. The structuring element (kernel) used in these operations is typically 3x3x3 or 5x5x5 voxels. Furthermore, connected component analysis identifies and filters out isolated voxel clusters that do not conform to anatomical features (e.g., clusters labeled as lung nodules with a volume below 100 cubic millimeters are considered artifacts and removed), ensuring the anatomical plausibility of the segmentation results.
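The connected component filtering step can be sketched as follows. The sketch assumes the foreground mask is given as a set of integer `(x, y, z)` voxel coordinates, uses 6-connectivity, and takes a voxel volume of 1 cubic millimeter by default; the function names and these defaults are assumptions for illustration.

```python
from collections import deque

NEIGHBORS_6 = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def connected_components(voxels):
    """Group a set of (x, y, z) foreground voxels into 6-connected components."""
    remaining = set(voxels)
    components = []
    while remaining:
        seed = remaining.pop()
        queue, comp = deque([seed]), {seed}
        while queue:  # breadth-first flood fill from the seed voxel
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS_6:
                nb = (x + dx, y + dy, z + dz)
                if nb in remaining:
                    remaining.remove(nb)
                    comp.add(nb)
                    queue.append(nb)
        components.append(comp)
    return components

def remove_small_components(voxels, min_volume_mm3, voxel_volume_mm3=1.0):
    """Keep only components whose physical volume reaches the minimum."""
    kept = set()
    for comp in connected_components(voxels):
        if len(comp) * voxel_volume_mm3 >= min_volume_mm3:
            kept |= comp
    return kept

blob = {(x, y, z) for x in range(5) for y in range(5) for z in range(5)}  # 125 voxels
speck = {(20, 20, 20), (20, 20, 21)}                                      # 2 voxels
cleaned = remove_small_components(blob | speck, min_volume_mm3=100)
```

With the 100 cubic millimeter threshold from the text, the 125-voxel blob is kept and the two-voxel speck is discarded.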
[0170] The 3D surface reconstruction unit converts the discrete voxel data (a binary mask in which the target structure is 1 and the background is 0) obtained from deep learning segmentation into a continuous, renderable 3D geometric model. This invention preferably employs the classic Marching Cubes algorithm. This algorithm traverses each cube formed by eight adjacent voxels and, based on the inside / outside states of its eight vertices (determined by the segmentation result, i.e., whether the voxel to which the vertex belongs has been segmented into the target structure), selects a corresponding predefined triangular facet template (256 combinations in total, which can be reduced to 15 basic topologies through symmetry), thereby generating a triangular mesh representation of the isosurface within the voxel grid. By selecting an appropriate isosurface threshold (e.g., 0.5 for binary images), the surface of the structure can be accurately extracted.
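The case lookup at the heart of Marching Cubes can be sketched as the computation of an 8-bit index per cube. The sketch below only computes the case index (0-255) and finds the cubes crossed by the isosurface; it does not include the triangle template table, and the corner ordering in `CORNER_OFFSETS` is an arbitrary assumption that a real implementation would have to match to its table.

```python
from itertools import product

# 8 cube corners as (dx, dy, dz) offsets; this ordering is an assumption
CORNER_OFFSETS = list(product((0, 1), repeat=3))

def cube_index(corner_values, iso=0.5):
    """Pack the inside/outside state of 8 cube corners into an 8-bit case index."""
    index = 0
    for bit, value in enumerate(corner_values):
        if value >= iso:
            index |= 1 << bit
    return index

def surface_cube_indices(mask, iso=0.5):
    """Return {cube origin: case index} for cubes the isosurface passes through."""
    nx, ny, nz = len(mask), len(mask[0]), len(mask[0][0])
    result = {}
    for x, y, z in product(range(nx - 1), range(ny - 1), range(nz - 1)):
        corners = [mask[x + dx][y + dy][z + dz] for dx, dy, dz in CORNER_OFFSETS]
        idx = cube_index(corners, iso)
        if idx not in (0, 255):  # 0 = fully outside, 255 = fully inside
            result[(x, y, z)] = idx
    return result

# 3x3x3 binary mask with a single foreground voxel in the center
mask = [[[0] * 3 for _ in range(3)] for _ in range(3)]
mask[1][1][1] = 1
cases = surface_cube_indices(mask)
```

For this mask all eight cubes touch the center voxel at exactly one corner, so each receives a case index with a single bit set.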
[0171] Model Optimization and Topology Correction Unit: Performs a series of optimization processes on the initial 3D model generated by the Marching Cubes algorithm to improve the model's visual quality, computational efficiency, and geometric and topological correctness.
[0172] Mesh smoothing: Applying the Laplacian smoothing algorithm or a curvature-based smoothing algorithm, each vertex is iteratively moved towards the average position of its neighboring vertices, thereby removing surface jaggedness and making the model surface smoother and more natural. To avoid excessive shrinkage or loss of detail, the smoothing operation usually limits the number of iterations (e.g., 5-20 times) or combines it with an edge-preserving smoothing algorithm.
[0173] Mesh simplification: Employing either a quadric error metric-based edge collapse algorithm or a vertex clustering algorithm, the number of triangles in the model is reduced while maintaining its topological structure and visual fidelity. The quadric error metric algorithm calculates the geometric error introduced by collapsing each edge and prioritizes collapsing the edges with the smallest error, thereby significantly reducing the number of faces while minimizing the impact on the model's shape. The simplification rate can be flexibly set according to the application scenario, for example, reducing the number of model faces by 50% to 80%.
[0174] Topology repair: This function detects and repairs topological defects in the model, such as holes, self-intersections, or non-manifold edges (e.g., edges shared by more than two faces). Hole repair is achieved by identifying boundary loops and filling them with triangular meshes that minimize surface area. Self-intersections and non-manifold edges are addressed through local mesh reconstruction or removal of redundant vertices to ensure the geometric integrity and correctness of the model, which is crucial for subsequent virtual surgical simulations and 3D printing.
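Of the mesh optimization steps above, Laplacian smoothing is the simplest to illustrate. The following minimal sketch assumes the mesh connectivity is supplied as a dictionary from vertex index to neighbor indices, and uses a damping factor `step` (an assumption, not stated in the text) so that each vertex moves only part of the way toward the neighbor average in each iteration.

```python
def laplacian_smooth(vertices, neighbors, iterations=10, step=0.5):
    """Move each vertex a fraction `step` toward the mean of its neighbors."""
    verts = [tuple(v) for v in vertices]
    for _ in range(iterations):
        updated = []
        for i, v in enumerate(verts):
            nbrs = neighbors.get(i, [])
            if not nbrs:  # vertices without listed neighbors stay fixed
                updated.append(v)
                continue
            mean = [sum(verts[j][k] for j in nbrs) / len(nbrs) for k in range(3)]
            updated.append(tuple(v[k] + step * (mean[k] - v[k]) for k in range(3)))
        verts = updated
    return verts

# A flat strip of vertices with one spike (z = 1.0) at index 2
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 1.0),
         (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
nbrs = {1: [0, 2], 2: [1, 3], 3: [2, 4]}  # the two end points are held fixed
smoothed = laplacian_smooth(verts, nbrs, iterations=5)
```

Limiting the iteration count, as the text recommends, matters here: repeated averaging keeps flattening the spike and would eventually shrink genuine anatomical detail as well.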
[0175] 4. Lesion and Functional Information Mapping Module
[0176] The lesion and functional information mapping module is designed to accurately overlay and map the tissue metabolic activity information reflected in PET images, as well as other potential functional or pathological information (such as dynamic enhanced CT / MRI perfusion information), onto a three-dimensional anatomical model. This provides surgeons with multi-dimensional diagnostic information that goes beyond simple morphology, assisting in lesion assessment and treatment decisions.
[0177] The mapping process includes the following key sub-functional units:
[0178] PET Standardized Uptake Value (SUV) Calculation Unit: This unit calculates the standardized uptake value (SUV) based on the raw radioactivity data from PET images and key parameters such as patient weight, injection dose, and scan time. SUV is an important indicator of a tumor's ability to take up radiotracers, and its calculation formula is as follows:
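[0179] $$SUV = \frac{C_{PET}}{D_{inj} / m_{body}}$$ where $C_{PET}$ is the decay-corrected radioactivity concentration in the tissue at the scan time (e.g., in Bq/mL), $D_{inj}$ is the injected dose (Bq), and $m_{body}$ is the patient's body weight (g).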
[0180] This unit can calculate and provide several commonly used clinical SUV parameters, including:
[0181] SUVmax: The highest SUV value of voxels within the lesion area, reflecting the metabolic level of the most active area of the lesion.
[0182] SUVmean: The average SUV value of all voxels within the lesion area, providing an overview of the overall metabolic level of the lesion.
[0183] Metabolic Tumor Volume (MTV): The volume of the metabolically active tumor region, defined by setting an SUV threshold (e.g., an SUV above 2.5 or above 40% of SUVmax), reflecting the biological size of the lesion.
[0184] Total Lesion Glycolysis (TLG): TLG = MTV × SUVmean, which comprehensively reflects the metabolic volume and intensity of lesions.
[0185] The calculation of these parameters provides an objective basis for quantifying the metabolic activity and invasiveness of lesions.
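The SUV parameters listed above can be computed with a short sketch. It assumes the body-weight form of SUV with an already decay-corrected activity concentration, a relative threshold of 40% of SUVmax for the MTV, and SUVmean taken over the whole lesion as defined in the text (some implementations instead average only within the MTV). Function and parameter names are hypothetical.

```python
def suv(activity_bq_per_ml, injected_dose_bq, body_weight_g):
    """Body-weight SUV; assumes the activity is already decay-corrected."""
    return activity_bq_per_ml / (injected_dose_bq / body_weight_g)

def lesion_metrics(suv_values, voxel_volume_ml, fraction_of_max=0.4):
    """SUVmax, SUVmean, MTV and TLG for the voxels of one segmented lesion."""
    suv_max = max(suv_values)
    suv_mean = sum(suv_values) / len(suv_values)
    # Voxels counted as metabolically active: at or above 40% of SUVmax
    active = [v for v in suv_values if v >= fraction_of_max * suv_max]
    mtv = len(active) * voxel_volume_ml
    tlg = mtv * suv_mean  # TLG = MTV x SUVmean, as stated in the text
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv, "TLG": tlg}

# 10 kBq/mL in tissue, 350 MBq injected, 70 kg patient
example_suv = suv(10000.0, 3.5e8, 70000.0)
metrics = lesion_metrics([1.0, 3.0, 4.0, 5.0], voxel_volume_ml=0.5)
```

In the example, three of the four lesion voxels exceed the 40% threshold, so the MTV covers three voxel volumes.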
[0186] Functional Information Spatial Mapping Unit: The calculated SUV value or other functional indicators are precisely superimposed onto the surface or internal voxels of the lesion 3D model generated by the precise segmentation and 3D reconstruction module of anatomical structure in the form of color coding or transparency mapping.
[0187] Spatial Alignment and Interpolation: This unit first resamples the PET voxel data (typically of lower resolution than CT / MRI) onto the grid of the high-resolution anatomical model using an interpolation algorithm (such as trilinear interpolation or a more refined B-spline interpolation). Since the preceding multimodal registration has already brought PET and CT / MRI into the same coordinate system, only resolution consistency needs to be ensured here.
[0188] Color Mapping and Transparency Mapping: Define a continuous color map (such as a rainbow spectrum, heatmap, or custom pseudocolor spectrum) to map different SUV value ranges to specific colors and transparency. For example, low SUV value areas can be displayed as cool colors (such as blue or green), and high SUV value areas as warm colors (such as yellow to red), with color depth varying linearly or non-linearly with the SUV value. Simultaneously, transparency can be set according to the SUV value; high-metabolic areas can be more opaque, and low-metabolic areas more transparent, thus highlighting active areas of lesions in a 3D view.
[0189] Rendering methods: The mapped colors can either be rendered directly onto the tumor surface, so that its appearance reflects metabolic intensity, or internal metabolic activity can be displayed in a 3D voxel model using volume rendering. Volume rendering allows doctors to "see through" tissue without anatomical dissection and to visually observe metabolic heterogeneity within the tumor.
[0190] Dynamic functional information integration unit (optional): If the patient has undergone dynamic contrast-enhanced CT / MRI scan or dynamic PET scan, this unit can further analyze the perfusion characteristics or metabolic kinetic curves of the lesion.
[0191] Parameter extraction: For example, pharmacokinetic parameters such as Ktrans (volume transfer constant, reflecting vascular permeability) and ve (extravascular extracellular space volume fraction) can be extracted from dynamic contrast-enhanced MRI (DCE-MRI) data. These parameters can be used to assess the blood supply, microvascular density, and extracellular space characteristics of lesions. Kinetic parameters such as tracer uptake and clearance rates can be extracted from dynamic PET data.
[0192] Spatial integration: These parameters are overlaid onto the 3D model in the form of parametric maps using color coding, or displayed in the interactive interface as time-series curves, thus providing more detailed and dynamic functional status information of the lesions. This is of great value for assessing the malignancy of tumors, treatment response, and differential diagnosis.
[0193] 5. Intelligent preoperative planning and pathway optimization module
[0194] The intelligent preoperative planning and path optimization module is the embodiment of the core innovation and application value of this invention. Its function is to automatically generate and optimize surgical paths and operation plans based on fused multimodal images and reconstructed three-dimensional models, combined with the characteristics of thoracic surgery and the doctor's experience, so as to achieve refined and personalized preoperative planning and significantly improve the safety and efficiency of surgery.
[0195] The planning and optimization process includes the following key sub-functional units:
[0196] Surgical Approach Selection and Optimization Unit: This unit intelligently recommends and optimizes the safest, least invasive, and most operable surgical approach based on the patient's individual anatomical characteristics, the precise location and size of the lesion, and the expected type of surgery (such as thoracoscopic minimally invasive surgery, robot-assisted surgery, or traditional open surgery).
[0197] Path generation algorithm: The system constructs a dense or sparse voxel map on a 3D anatomical model, where each voxel (e.g., 1 cubic millimeter in size) or node represents a potential pass-through point. The edges connecting nodes are assigned weights, which are "cost" or "risk" values derived from a comprehensive evaluation of several key factors. These factors include:
[0198] Safe distance: This is the most crucial weighting factor. The system calculates in real time the minimum distance between each point on the potential path and critical structures within the thoracic cavity, such as important blood vessels (e.g., aorta, pulmonary artery trunk and branches, pulmonary veins and their branches, vena cava), trachea, bronchi, nerves (e.g., phrenic nerve, recurrent laryngeal nerve, brachial plexus), esophagus, heart, and spine. A smaller distance indicates a higher potential risk of injury, and therefore a greater weight is assigned. The overall weight $W_{edge}$ of an edge connecting two nodes (i.e., a path segment) is determined by the following weighted linear combination function:
[0199] $$W_{edge} = w_1 \cdot f_{dist}(d) + w_2 \cdot C_{tissue} + w_3 \cdot C_{vis}$$
[0200] where $w_1$, $w_2$, and $w_3$ are preset weighting coefficients, for example 0.6, 0.3, and 0.1. $f_{dist}(d)$ is the penalty function of the minimum safe distance $d$: when $d$ is less than the safety threshold (e.g., 5 mm), $f_{dist}(d)$ takes a positive value scaled by a penalty factor $k$; otherwise $f_{dist}(d) = 0$. $C_{tissue}$ is the cost of damage when crossing tissues, and different values are assigned according to the type of tissue; for example, it is 5 for crossing lung parenchyma, 3 for muscle, and 1 for fat. $C_{vis}$ is the visibility cost, quantified by counting the obstructed voxels between the path point and the preset observation point (the thoracoscope entry port). As an example of distance-based risk grading, the area within 3 mm of the pulmonary artery is defined as a high-risk area with a weight of 10, the area 3-5 mm away is a medium-risk area with a weight of 5, and the area more than 5 mm away is a low-risk area with a weight of 1.
[0201] Degree of tissue damage: Potential tissue trauma is assessed by the type and thickness of tissue traversed along the potential access route. For example, the weighting differs depending on whether the tissue is muscle, adipose tissue, lung parenchyma, lymph nodes, or bone, with bone and lung parenchyma typically receiving higher weightings. The system incorporates parameters such as stiffness, bleeding risk, and healing capacity for each tissue type for quantitative assessment.
[0202] Visibility and operability: This assesses how well the approach exposes the target area, as well as the accessibility and working space of surgical instruments (such as thoracoscopes and robotic arms). The system can simulate the range of motion of instruments under different approaches and assess whether there are blind spots or unreachable zones. For example, if other organs obstruct the instrument's access to the target lesion along the approach, the weighting is increased.
[0203] Impact on postoperative recovery: Assess the impact of the approach on the patient's postoperative pain, respiratory function, and aesthetics. For example, the location and size of the chest wall incision, and whether the intercostal nerves are damaged.
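The weighted linear combination of safe distance, tissue damage, and visibility costs can be sketched as a small edge-weight function. The coefficients 0.6 / 0.3 / 0.1, the 5 mm threshold, and the tissue costs follow the examples in the text. The linear shape of the distance penalty, `k * (threshold - d)`, is an assumption made only for this illustration, since the text does not fix its exact form; the postoperative recovery factor is omitted, and all names are hypothetical.

```python
# Example tissue crossing costs taken from the text
TISSUE_COST = {"lung_parenchyma": 5.0, "muscle": 3.0, "fat": 1.0}

def distance_penalty(d_mm, threshold_mm=5.0, k=1.0):
    """Assumed linear penalty: k * (threshold - d) below the threshold, else 0."""
    return k * (threshold_mm - d_mm) if d_mm < threshold_mm else 0.0

def edge_weight(d_mm, tissue, occluded_voxels, w=(0.6, 0.3, 0.1)):
    """Weighted sum of distance penalty, tissue cost, and visibility cost."""
    return (w[0] * distance_penalty(d_mm)
            + w[1] * TISSUE_COST[tissue]
            + w[2] * occluded_voxels)

far_fat = edge_weight(10.0, "fat", occluded_voxels=0)         # safe, cheap segment
near_muscle = edge_weight(2.0, "muscle", occluded_voxels=4)   # close to a vessel
```

A segment 2 mm from a critical structure through muscle with four occluded voxels costs roughly ten times more than an unobstructed segment through fat at a safe distance, which is what steers the path search away from it.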
[0204] Optimization Algorithm: The A* search algorithm or Dijkstra's algorithm is used to find the optimal path on the weighted voxel map from a predetermined external skin incision (which can be specified by the doctor on the 3D model or intelligently recommended by the system) to the target lesion or surgical area. The A* algorithm guides the search direction through a heuristic function and efficiently finds a path with the minimum overall "cost" (safety, degree of damage, operability, etc.). The heuristic function $h(n)$ of the A* algorithm is usually set to the Euclidean distance from the current node $n$ to the target node, or to a weighted distance combined with an initial risk assessment, which accelerates the search. The final output path is a sequence of three-dimensional coordinates representing the optimal surgical approach.
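A minimal A* search over a weighted voxel map can be sketched as follows. It assumes 6-connected moves, a dictionary that maps each traversable voxel to the cost of entering it, and a heuristic equal to the Euclidean distance scaled by the smallest voxel cost so that it never overestimates the remaining cost. The data layout and names are illustrative assumptions.

```python
import heapq
import math

MOVES = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))

def a_star(cost, start, goal):
    """cost: dict mapping traversable (x, y, z) voxels to a positive entry cost."""
    min_cost = min(cost.values())

    def h(n):
        # Euclidean distance scaled by the cheapest step keeps h admissible
        return min_cost * math.dist(n, goal)

    open_heap = [(h(start), 0.0, start)]
    best = {start: 0.0}
    parent = {}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in parent:  # walk back to the start
                node = parent[node]
                path.append(node)
            return path[::-1], g
        if g > best[node]:
            continue  # stale heap entry
        x, y, z = node
        for dx, dy, dz in MOVES:
            nb = (x + dx, y + dy, z + dz)
            if nb not in cost:
                continue
            ng = g + cost[nb]
            if ng < best.get(nb, math.inf):
                best[nb] = ng
                parent[nb] = node
                heapq.heappush(open_heap, (ng + h(nb), ng, nb))
    return None, math.inf

# 5 x 3 x 1 slab; two high-risk voxels (cost 10) block the direct route
grid = {(x, y, 0): 1.0 for x in range(5) for y in range(3)}
grid[(2, 0, 0)] = 10.0
grid[(2, 1, 0)] = 10.0
path, total = a_star(grid, start=(0, 0, 0), goal=(4, 0, 0))
```

The direct route through the high-risk voxel would cost 13, so the search returns a longer detour with a total cost of 8 that avoids both high-risk voxels.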
[0205] Tumor resection boundary delineation and lymph node dissection strategy unit:
[0206] Safe resection margin calculation: Based on the nature of the lesion (e.g., benign tumor, early-stage non-small cell lung cancer, invasive adenocarcinoma, etc.), the degree of tumor malignancy, and the latest clinical guidelines (e.g., for non-small cell lung cancer, a resection margin of at least 2 cm or an imaging margin of at least 5 mm is recommended), a preset safe resection margin relative to the tumor surface is automatically generated on the 3D model. This margin is calculated using the 3D Euclidean Distance Transform algorithm, which computes the minimum Euclidean distance from each voxel surrounding the tumor to the tumor boundary, thereby generating a distance field. By setting a distance threshold (e.g., 5 mm or 10 mm), the corresponding isosurface is obtained, which represents the safe resection margin. The system can display this margin with different colors or transparency for physician reference.
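The margin computation can be illustrated with a brute-force distance field. The sketch assumes the margin is measured outward from the tumor, isotropic voxel spacing, and a small grid; it evaluates the distance to the nearest tumor voxel directly, whereas a practical system would use a dedicated Euclidean distance transform algorithm for speed. The function name is hypothetical.

```python
import math
from itertools import product

def margin_shell(tumor, shape, margin_mm, spacing_mm=1.0):
    """Voxels outside the tumor whose distance to the nearest tumor voxel <= margin."""
    shell = set()
    for v in product(*(range(n) for n in shape)):
        if v in tumor:
            continue
        # Brute-force nearest-tumor-voxel distance (illustration only)
        d = min(math.dist(v, t) for t in tumor) * spacing_mm
        if d <= margin_mm:
            shell.add(v)
    return shell

tumor = {(5, 5, 5)}  # single-voxel "tumor" in an 11 x 11 x 11 grid
shell = margin_shell(tumor, shape=(11, 11, 11), margin_mm=2.0)
```

The outer boundary of the returned voxel set corresponds to the isosurface of the distance field at the chosen threshold.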
[0207] Lymph node identification and risk assessment: Standardized thoracic lymph node atlases (such as the IASLC lymph node atlas) are precisely registered onto the patient's individual 3D model. Combining the SUV value of PET images, the morphological characteristics of lymph nodes on CT images (size, density, shape, enhancement characteristics), and the patient's clinicopathological information (such as primary tumor type and stage), the system uses machine learning models (such as support vector machines or gradient boosting trees) to automatically identify suspicious lymph nodes and assess their metastatic risk. For example, lymph nodes with a diameter greater than 1 cm, an SUVmax greater than 2.5, or abnormal morphology will be marked as high-risk.
[0208] Lymph node dissection path planning: Based on lymph node risk assessment results and the relationships with surrounding vessels and nerves, the system plans the path and extent of lymph node dissection. According to the anatomical location of the lymph node station and adjacent key structures, the system recommends the optimal dissection sequence and instrument operation path to minimize damage to surrounding structures while maximizing dissection thoroughness. For example, for station 7 (subcarinal) lymph nodes, the system can plan dissection along the paratracheal space and provide real-time alerts regarding the distance to the esophagus and vagus nerve.
[0209] Vascular, airway, and nerve protection strategy unit: This unit continuously monitors the relative positions of the planned surgical path and important blood vessels, airways, and nerves throughout the preoperative planning process and provides real-time alerts.
[0210] Real-time distance calculation: When the doctor draws or the system generates the surgical path, defines the resection plane, or simulates the movement of the instrument tip, the system uses a geometric distance calculation algorithm (such as the nearest point distance) to calculate the minimum distance between the current operation point and the segmented key blood vessels, airway walls, or nerve bundles in real time.
[0211] Visual alerts and feedback: Once the minimum distance falls below a preset safety threshold (e.g., 2mm for large blood vessels, 1mm for nerves, and 3mm for airways), the system will alert the surgeon through visual warnings (such as highlighting the threatened critical structure in red, directly marking the distance value in the 3D view, or changing the path color from green to yellow and then to red). Simultaneously, an audible alarm may also be activated.
[0212] Key point identification and restricted area marking: The system presets and identifies important vascular branching points, airway bifurcation points, areas with dense nerve pathways, and high-risk areas for tumor invasion, and marks them as absolute restricted areas or high-risk areas to guide doctors to avoid them.
[0213] Flow / Function Simulation (Advanced Functionality): Predicts changes in blood flow / airflow in specific vessels (e.g., pulmonary artery branches) or airways (e.g., bronchi) after compression or partial resection using simplified computational fluid dynamics (CFD) models or circuit simulations, assessing their impact on downstream organ function. For example, based on Poiseuille's Law, it calculates changes in vessel cross-sectional area to predict increased resistance and decreased blood flow, thereby assessing perfusion risk in lung segments or lobes.
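The Poiseuille-based estimate can be sketched with relative quantities only. Under Poiseuille's Law, resistance is inversely proportional to the fourth power of the vessel radius (equivalently, to the square of the cross-sectional area for a circular lumen), and flow at a fixed pressure drop is proportional to the fourth power of the radius. The sketch assumes laminar flow in a rigid circular tube with unchanged length, viscosity, and pressure drop; function names are hypothetical.

```python
def relative_resistance(radius_ratio):
    """R_new / R_old when the radius is scaled by radius_ratio (R ~ 1 / r^4)."""
    return 1.0 / radius_ratio ** 4

def relative_flow(radius_ratio):
    """Q_new / Q_old at a fixed pressure drop (Q ~ r^4)."""
    return radius_ratio ** 4

def resistance_from_area(area_ratio):
    """R_new / R_old from a cross-sectional area ratio (R ~ 1 / A^2)."""
    return 1.0 / area_ratio ** 2

half_radius_resistance = relative_resistance(0.5)  # radius compressed to 50%
half_radius_flow = relative_flow(0.5)
half_area_resistance = resistance_from_area(0.5)   # area reduced to 50%
```

Halving the radius raises resistance sixteen-fold and cuts flow to about 6% under these assumptions, which shows why even moderate compression of a pulmonary artery branch can matter for downstream perfusion.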
[0214] Surgical procedure sequence simulation and risk assessment unit:
[0215] Virtual surgical simulation: This allows surgeons to perform virtual surgical operations on a 3D model, such as simulating instrument grasping, cutting, suturing, and clamping. Based on a preset physical model (such as an elastic model based on the finite element method or a simplified mass spring model), the system simulates tissue deformation and response, providing doctors with a near-realistic surgical experience. Doctors can drag virtual instruments and observe the morphological changes of tissue after it has been stretched and cut.
[0216] Complication risk prediction: Based on the patient's individual anatomical characteristics, lesion nature, planned surgical type (e.g., lobectomy), complexity of the planned approach, and historical surgical data, machine learning models (such as random forests, support vector machines, or deep neural networks) are used to predict the probability of common complications such as intraoperative bleeding, postoperative pneumothorax, pleural infection, air leak, nerve injury, and arrhythmia. These models are trained by analyzing a large number of historical surgical cases, including imaging features (such as tumor volume, location, and distance from blood vessels and airways), clinical parameters (such as patient age, underlying diseases, and pulmonary function indicators), and actual complication occurrences.
[0217] The machine learning model specifically employs a gradient boosting decision tree model. Its input feature vector explicitly includes: 1) tumor volume (cm³); 2) minimum 3D distance (mm) between the tumor and the nearest major blood vessel (pulmonary artery or aorta); 3) minimum 3D distance (mm) between the tumor and the nearest main bronchus; 4) SUVmax value of the PET image; 5) patient's preoperative pulmonary function index FEV1%; 6) patient age; 7) surgical type (encoded value, e.g., lobectomy = 1, segmentectomy = 2). The model is trained using a dataset containing 1000 historical surgical cases, each case labeled with whether air leakage occurred within 30 days post-surgery (binary label: 0 or 1) as the training label. The model is trained and optimized using 5-fold cross-validation to maximize the AUC value.
[0218] For example, a well-trained model might output: "Expected intraoperative blood loss of 100-150 ml, a 15% probability of postoperative air leak, and a 2% risk of phrenic nerve injury."
[0219] Risk visualization: Predicted risk levels are overlaid onto a 3D model as heatmaps or color codes. For example, high-risk areas are displayed in red, medium-risk in yellow, and low-risk in green. This allows surgeons to easily identify potentially high-risk areas or procedures and adjust their planning accordingly.
[0220] 6. Visualization and Interaction Module
[0221] The visualization and interaction module is designed to present complex multimodal image data, three-dimensional reconstruction models, and intelligent planning results to surgeons in an intuitive and multidimensional way, and to provide a flexible and efficient human-computer interaction interface to support doctors' personalized adjustments and decisions.
[0222] The visualization and interaction functions include the following key sub-functional units:
[0223] Multi-dimensional rendering unit: Supports multiple rendering modes to meet different observation needs.
[0224] Volume Rendering: This feature maps the density values of 3D voxel data (such as the HU value in CT or the SUV value in PET) to color and transparency using transfer functions, enabling non-invasive visualization of internal structures such as lung parenchyma, airways, blood vessels, and tumors. Transfer functions can be customized for different tissue types; for example, bones are displayed as opaque white, lung parenchyma as translucent, and tumors as a specific color with high opacity. Adjustable clipping planes are supported, allowing doctors to section the voxel model in any direction to reveal internal structures. Window width / level adjustments are also supported to optimize the contrast of specific tissues (such as lung windows and mediastinal windows).
[0225] Surface Rendering: Performs independent surface mesh rendering on segmented key anatomical structures (such as tumors, bones, major blood vessels, and bronchial trees), supporting custom colors, transparency, and materials (such as gloss and reflectivity) to clearly distinguish different structures.
[0226] Hybrid rendering mode: Allows the combination of volume rendering and surface rendering. For example, surface-rendered blood vessels and tumors can be overlaid on a volume-rendered lung parenchyma background to provide more comprehensive visual information.
[0227] Fusion display: Allows information from different modalities of images to be displayed synchronously by overlaying or fusing. For example, a PET metabolic heatmap can be overlaid on the anatomical surface of a CT / MRI with a semi-transparent color to visually display the metabolically active areas of a tumor.
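The window width / level adjustment mentioned for volume rendering can be sketched as a linear mapping from HU values to display gray levels. The lung window (level -600, width 1500) and mediastinal window (level 40, width 400) values used below are common illustrative settings and are assumptions, not values specified in this document; the function name is hypothetical.

```python
def window_to_gray(hu, level, width):
    """Map a HU value to an 8-bit gray level for the given window level/width."""
    lo, hi = level - width / 2, level + width / 2
    if hu <= lo:
        return 0      # everything below the window is black
    if hu >= hi:
        return 255    # everything above the window is white
    return round(255 * (hu - lo) / (hi - lo))

LUNG_WINDOW = {"level": -600, "width": 1500}        # assumed typical setting
MEDIASTINAL_WINDOW = {"level": 40, "width": 400}    # assumed typical setting

air_in_lung_window = window_to_gray(-1350, **LUNG_WINDOW)
soft_tissue_in_lung_window = window_to_gray(150, **LUNG_WINDOW)
soft_tissue_in_mediastinal_window = window_to_gray(40, **MEDIASTINAL_WINDOW)
```

The same soft-tissue HU value saturates in the lung window but lands mid-gray in the mediastinal window, which is why switching windows changes which tissues show usable contrast.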
[0228] Interactive operation unit: Provides an intuitive and powerful toolset that supports surgeons in real-time rotation, translation, scaling, measurement (distance, angle, volume), cutting, annotation, and drawing and modification of virtual surgical paths for models.
[0229] Real-time transformation: Doctors can smoothly rotate, pan, and zoom the 3D model in real time using a mouse or touchpad, and observe it from any angle.
[0230] Virtual cross-section: Supports defining one or more planes in any direction for real-time virtual cross-section to reveal the anatomical relationships within the model, such as cross-section along the bronchus to observe the adjacent relationship between the tumor and the airway.
[0231] Measurement tools: Provides precise distance measurement (point-to-point, point-to-plane, plane-to-plane), angle measurement (such as bronchial bifurcation angle), and volume measurement (such as tumor volume and lung lobe volume) functions with sub-millimeter accuracy.
[0232] Path drawing and editing: Doctors can directly draw potential surgical approaches or resection areas on the 3D model. The system provides real-time feedback on the path length, the types of tissues traversed, and risk assessment results provided by the planning module (such as minimum distance to important structures and expected bleeding). Doctors can adjust the path in real time based on the feedback until they are satisfied.
[0233] Multi-view synchronization: Supports synchronized linkage between 2D and 3D views, including axial, coronal, and sagittal views. Operations on any view (such as mouse click positioning and path drawing) are reflected in real time in all other views, ensuring seamless switching and collaborative observation between different dimensions of information for doctors.
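The measurement tools listed above (distance, angle, volume) reduce to simple geometry once points and masks are expressed in millimeters. The following is a minimal sketch under that assumption; the function names and the nested-list mask layout are hypothetical.

```python
import math


def distance_mm(p, q):
    """Point-to-point Euclidean distance between two 3D points in mm."""
    return math.dist(p, q)


def angle_deg(u, v):
    """Angle between two direction vectors, e.g. two bronchial branches."""
    dot = sum(a * b for a, b in zip(u, v))
    cos_angle = dot / (math.hypot(*u) * math.hypot(*v))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def mask_volume_mm3(mask, spacing=(1.0, 1.0, 1.0)):
    """Volume of a binary voxel mask (nested lists [z][y][x]) in mm^3."""
    voxel_volume = spacing[0] * spacing[1] * spacing[2]
    count = sum(v for plane in mask for row in plane for v in row)
    return count * voxel_volume
```

Voxel counting gives the volume of the segmented mask; the accuracy of the result depends on the segmentation and the voxel spacing.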
[0234] Report Generation Unit: Based on preoperative planning results, automatically generates a structured and standardized surgical report. Report content includes, but is not limited to: basic patient information, detailed imaging findings (such as tumor size, location, and stage), key anatomical measurements (such as the minimum distance between the tumor and major blood vessels), precise tumor volume and nature, anatomical relationship with surrounding vital organs, the surgical approach recommended by the system, expected resection extent, lymph node dissection strategy, potential complication risk assessment (such as the probability of bleeding, air leak, and nerve injury), and planned cross-sectional views and 3D renderings. The report supports export to industry-standard PDF or DICOM SR (Structured Report) format for easy storage, transmission, and printing.
[0235] 7. Knowledge Base and Database
[0236] The knowledge base and database serve as the cornerstone of the system's operation, storing patient image data, reconstruction models, planning results, surgical templates, anatomical atlases, and relevant clinical guidelines, and continuously providing data support and model update services for the intelligent modules.
[0237] The knowledge base and database include the following key sub-functional units:
[0238] The patient data storage unit stores all patients' raw multimodal image data, preprocessed and registered composite images, segmented and reconstructed 3D anatomical models (including mesh data and voxel masks), and detailed results of each preoperative planning session (including planned pathways, surgical margins, risk assessment reports, etc.). All data is stored in encrypted form, using the AES-256 encryption standard, and strictly adheres to medical data security and privacy protection regulations (such as HIPAA and GDPR) to ensure the confidentiality and integrity of patient information. The system supports data lifecycle management, including archiving and backup.
[0239] The anatomical atlas library contains high-resolution, standardized thoracic anatomical atlases, covering detailed structures such as lung lobes, lung segments, trachea, bronchial tree, fine pulmonary artery and pulmonary vein branches, major cardiac chambers and vessels, esophagus, chest wall, and internationally standardized lymph node divisions (e.g., the IASLC lymph node atlas). These atlases are precisely annotated by professional anatomists and experienced surgeons to guide the training of automated segmentation models, serve as registration reference templates, and provide standardized anatomical references for preoperative planning.
[0240] Surgical Standards and Experience Knowledge Base: This base stores classic surgical procedures, standardized operating guidelines, clinical pathways, intraoperative precautions, and successful surgical cases and special case management experiences contributed by expert physicians for various thoracic surgeries (such as lobectomy, segmentectomy, wedge resection, and mediastinal tumor resection). This knowledge is stored in both structured and unstructured data formats and can be accessed by the intelligent planning module to guide surgical pathway optimization and decision-making, providing best clinical practice data for planning.
[0241] Machine Learning Model Library: This library stores trained deep learning segmentation models (such as the 3D U-Net model), risk prediction models (such as random forest or neural network models for complication prediction), lymph node metastasis risk assessment models, and other AI-based analytical models. The library supports model version management, performance monitoring, and dynamic updates, allowing for retraining and deployment of models after acquiring new large-scale labeled data to continuously improve model accuracy and generalization ability.
[0242] Data Indexing and Retrieval Unit: Provides efficient data indexing mechanisms (such as B+ tree-based or hash table-based indexes) and flexible retrieval functions, supporting data querying and management based on various criteria such as patient ID, imaging modality, disease type, surgery date, doctor's name, and surgery type. This ensures that physicians and researchers can quickly access and utilize historical data for retrospective analysis, research statistics, or teaching demonstrations. Simultaneously, it supports data annotation and quality control workflows, providing high-quality labeled data for model retraining.
[0243] II. Detailed Implementation of the Methodology and Procedures
[0244] This invention provides a preoperative planning method for thoracic surgery that integrates multimodal imaging. Its process is tightly coupled and interconnected, ensuring continuity and high accuracy from data input to final planning output.
[0245] Step S100: Multimodal image data acquisition and preprocessing.
[0246] This step is the starting point of the entire planning methodology and is executed through the aforementioned data acquisition and preprocessing modules. First, the system automatically acquires raw medical image data from various modalities, including chest CT, MRI (such as T1WI, T2WI, and DWI sequences), and PET, through its DICOM, HL7, and other common medical imaging data interfaces. Typical acquisition parameters are as follows: for CT, slice thickness 0.625 mm, reconstruction interval 0.625 mm, tube voltage 120 kV, and automatic tube current adjustment; for the MRI T2WI sequence, slice thickness 1.5 mm and in-plane voxel size 0.7 × 0.7 mm; for PET, a three-dimensional acquisition mode with a reconstructed voxel size of 4 × 4 × 4 mm.
[0247] The acquired raw data then enters a standardized processing flow. Specifically, the data format conversion unit converts all DICOM sequences to NIfTI format and parses the metadata. The image quality assessment unit then checks the converted images for noise levels, artifacts, and integrity. For example, if the signal-to-noise ratio of the lung window in a CT image is below 20 dB, or the motion artifact index (assessed by the cross-correlation coefficient of adjacent frames) in an MRI image exceeds 0.1, the system issues a warning. Subsequently, the image enhancement unit applies modality-specific filtering: 3D anisotropic diffusion filtering (20 iterations, diffusion constant 0.05) is applied to CT images to remove noise while preserving edges; wavelet thresholding (SureShrink threshold) is applied to MRI images to improve image clarity; and Gaussian smoothing (σ=2.0 mm) is applied to PET images to reduce noise. The intensity normalization unit maps the intensity of CT images to [-1000, 3000] HU, MRI images to [0, 1000], and PET images to [0, 10] SUV units. Finally, the spatial resampling unit resamples all images to uniform 1.0mm×1.0mm×1.0mm isotropic voxels using trilinear interpolation to ensure consistent spatial resolution, laying a solid foundation for subsequent high-precision registration.
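Parts of this preprocessing flow can be sketched in a few lines: the quality gate, the intensity range mapping, and the grid size after isotropic resampling. The sketch assumes that "mapping" CT intensities to [-1000, 3000] HU means clipping to that range, which the source does not state explicitly; all function names are hypothetical, and the interpolation itself is omitted.

```python
CT_RANGE_HU = (-1000.0, 3000.0)  # range stated in the text
SNR_MIN_DB = 20.0                # CT lung-window SNR warning threshold
MOTION_INDEX_MAX = 0.1           # MRI motion artifact warning threshold


def quality_warnings(ct_snr_db=None, mri_motion_index=None):
    """Return a list of warning strings for the thresholds in the text."""
    warnings = []
    if ct_snr_db is not None and ct_snr_db < SNR_MIN_DB:
        warnings.append("CT signal-to-noise ratio below 20 dB")
    if mri_motion_index is not None and mri_motion_index > MOTION_INDEX_MAX:
        warnings.append("MRI motion artifact index above 0.1")
    return warnings


def clip_intensity(values, lo, hi):
    """Clip intensities to [lo, hi] (assumed meaning of the range mapping)."""
    return [min(max(v, lo), hi) for v in values]


def resampled_shape(shape, spacing, target=(1.0, 1.0, 1.0)):
    """Grid size after resampling to the target spacing, extent preserved."""
    return tuple(max(1, round(n * s / t))
                 for n, s, t in zip(shape, spacing, target))
```

For instance, a hypothetical 512 × 512 × 400 CT volume at 0.7 × 0.7 × 0.625 mm would become roughly 358 × 358 × 250 voxels at 1.0 mm isotropic spacing.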
[0248] Step S200: High-precision registration and fusion of multimodal images.
[0249] This step utilizes the multimodal image registration and fusion module to perform high-precision spatial alignment and information integration on the preprocessed images of different modalities. First, an initial rigid alignment based on image moments is performed: preliminary translation and rotation corrections are calculated from the centroid and principal axis directions of each image. Subsequently, as a core technical feature of this invention, non-rigid registration based on Large Deformation Diffeomorphic Metric Mapping (LDDMM) is implemented. This process iteratively optimizes a smooth velocity field v, generating a diffeomorphic transformation that nonlinearly maps the source image (e.g., PET) to the target image (e.g., CT / MRI). During optimization, Mutual Information (MI) is used as the image similarity metric; it is calculated from a joint histogram with 256 gray levels to capture the nonlinear statistical dependencies between different modalities. The regularization parameter is set to 10^-2 to balance similarity and deformation smoothness, and a Gaussian kernel smoothing operator L (standard deviation σL = 2.0 mm) is used to penalize the roughness of the velocity field. The optimization employs the L-BFGS algorithm, with a maximum of 200 iterations and a convergence threshold of 10^-6. This process ensures sub-pixel registration accuracy, with the average target registration error (TRE) typically controlled within 0.5 mm. After registration, a multi-scale geometric analysis (MSGA) fusion algorithm is employed. For example, registered CT and PET images are first decomposed into low-frequency and high-frequency subbands using a discrete wavelet transform (e.g., Daubechies wavelets with 3 decomposition levels). The low-frequency subbands are fused using a weighted average (CT weight 0.7, PET weight 0.3), while the high-frequency subbands are fused using the maximum energy selection rule. Finally, a unified composite image dataset is reconstructed through the inverse wavelet transform, which clearly displays both anatomical structures and metabolic activity hotspots.
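The subband fusion rule can be illustrated on a one-dimensional signal. This is a simplified sketch, not the described implementation: it uses a single-level Haar transform (the simplest Daubechies wavelet) instead of a 3-level 3D decomposition, and it interprets "maximum energy selection" as picking, per coefficient, the value with the larger squared magnitude. Both choices are assumptions made for brevity.

```python
import math


def haar_forward(signal):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_inverse(low, high):
    """Inverse of haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def fuse(ct, pet, w_ct=0.7, w_pet=0.3):
    """Fuse two registered signals using the rule described in the text."""
    ct_low, ct_high = haar_forward(ct)
    pet_low, pet_high = haar_forward(pet)
    # Low-frequency subband: weighted average (CT 0.7, PET 0.3).
    low = [w_ct * a + w_pet * b for a, b in zip(ct_low, pet_low)]
    # High-frequency subband: keep the coefficient with the larger energy.
    high = [a if a * a >= b * b else b for a, b in zip(ct_high, pet_high)]
    return haar_inverse(low, high)
```

With `fuse([4.0, 4.0], [0.0, 2.0])`, the flat CT signal contributes most of the mean level, while the PET edge survives in the detail coefficient.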
[0250] Step S300: Precise segmentation and three-dimensional reconstruction of key anatomical structures.
[0251] This step utilizes the precise anatomical structure segmentation and 3D reconstruction module to automatically and accurately segment key anatomical structures within the thoracic cavity (such as lung parenchyma, trachea, bronchial tree down to fourth-order branches, pulmonary artery / venous trunk down to third-order branches, heart, esophagus, chest wall, and major lymph node stations, such as groups 4R, 7, and 10R) and target lesions (such as pulmonary nodules larger than 5 mm in diameter and mediastinal masses) from the fused multimodal image data. The segmentation process is primarily based on a 3D U-Net deep neural network model. This model has a 4-layer encoder-decoder structure, with each layer containing two 3x3x3 convolutional blocks. The number of feature channels increases from 32 to 256, and multi-scale features are fused through skip connections. The training dataset contains 500 precisely labeled chest multimodal images. Training uses the Adam optimizer with an initial learning rate of 10^-4 and applies data augmentation techniques such as random rotation, scaling, and elastic deformation. The loss function is a weighted combination of Dice loss and cross-entropy loss. Training runs for 300 epochs with a batch size of 2. After segmentation, morphological post-processing (e.g., one 3x3x3 opening operation and one 3x3x3 closing operation) is performed to eliminate minor artifacts, and isolated segmented regions with a volume of less than 100 cubic millimeters are removed using connected component analysis. Subsequently, the binary voxel masks of each structure obtained from the segmentation are input into the Marching Cubes algorithm to generate an initial 3D surface mesh model for each structure with an isosurface threshold of 0.5. Finally, the generated models undergo mesh smoothing (10 iterations of Laplacian smoothing), mesh simplification (using a quadric edge collapse algorithm to reduce the number of faces by 50%), and topology repair (automatically identifying and repairing holes smaller than 20 cubic millimeters and removing self-intersecting surfaces) to ensure the geometric accuracy, visual fidelity, and computational efficiency of the model.
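The connected component filtering step can be sketched as a breadth-first search over a binary mask. This minimal version assumes 6-connectivity and a nested-list mask indexed as [z][y][x], neither of which is specified in the source; with 1 mm isotropic voxels, the 100 cubic millimeter limit corresponds to `min_voxels=100`.

```python
from collections import deque

# Face neighbours only (6-connectivity, an assumption).
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1))


def remove_small_components(mask, min_voxels):
    """Return a copy of the mask keeping components with >= min_voxels."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    out = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    seen = set()
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Flood-fill one component starting at this voxel.
                component = []
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                while queue:
                    cz, cy, cx = queue.popleft()
                    component.append((cz, cy, cx))
                    for dz, dy, dx in NEIGHBOURS:
                        n = (cz + dz, cy + dy, cx + dx)
                        if (0 <= n[0] < nz and 0 <= n[1] < ny
                                and 0 <= n[2] < nx
                                and mask[n[0]][n[1]][n[2]]
                                and n not in seen):
                            seen.add(n)
                            queue.append(n)
                if len(component) >= min_voxels:
                    for cz, cy, cx in component:
                        out[cz][cy][cx] = 1
    return out
```

A production pipeline would more likely rely on an optimized labeling routine from an image-processing library; the pure-Python loop is only meant to show the logic.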
[0252] Step S400: Mapping lesions to functional information.
[0253] This step, through a lesion-functional information mapping module, precisely integrates the metabolic activity information of lesions carried by PET images into a three-dimensional anatomical model. First, based on the raw PET data and the patient's physiological parameters (weight, height, injection dose, scan time), the standardized uptake value (SUV) of the lesion region is calculated, including SUVmax, SUVmean, and total metabolic tumor volume (MTV, with a threshold set at SUV>2.5). For example, for a lung nodule with an SUVmax of 8.2, the system accurately calculates the SUV values of all its voxels. Subsequently, these SUV values are precisely superimposed onto the reconstructed three-dimensional tumor surface using trilinear interpolation, in the form of color coding (e.g., using a standard "thermal" color spectrum, from blue to red, where blue represents low SUV and red represents high SUV) and transparency mapping (high SUV regions have higher opacity). For example, on the surface of the three-dimensional tumor model, an area with an SUVmax of 8.2 appears as dark red and opaque, while an area with an SUVmean of 4.5 appears as orange with partial transparency, thus visually reflecting the biological characteristics and invasiveness of the lesion, facilitating the physician's assessment of the tumor's boundaries and biological behavior.
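The SUV calculation mentioned above can be sketched as follows. This assumes the common body-weight SUV definition, a tissue density of 1 g/mL, an 18F tracer, and an activity concentration measured at scan time; the source also lists height, which would only be needed for lean-body-mass variants not shown here. Function names are hypothetical.

```python
import math

F18_HALF_LIFE_S = 6586.2  # assumption: 18F tracer (about 109.77 min)


def suv_bw(activity_bq_per_ml, weight_kg, injected_dose_bq, elapsed_s,
           half_life_s=F18_HALF_LIFE_S):
    """Body-weight SUV with the injected dose decayed to scan time."""
    decay = math.exp(-math.log(2.0) * elapsed_s / half_life_s)
    dose_at_scan = injected_dose_bq * decay
    return activity_bq_per_ml * (weight_kg * 1000.0) / dose_at_scan


def lesion_metrics(suv_values, voxel_volume_mm3=1.0, mtv_threshold=2.5):
    """Return (SUVmax, SUVmean, MTV in mm^3) for the voxels of one lesion."""
    suv_max = max(suv_values)
    suv_mean = sum(suv_values) / len(suv_values)
    mtv = sum(1 for v in suv_values if v > mtv_threshold) * voxel_volume_mm3
    return suv_max, suv_mean, mtv
```

The MTV here counts voxels whose SUV exceeds 2.5, matching the threshold stated in the text.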
[0254] Step S500: Intelligent preoperative planning and path optimization.
[0255] This step is the core function of the invention. Through the intelligent preoperative planning and path optimization module, personalized and refined surgical plans are generated and optimized for thoracic surgery based on a three-dimensional anatomical model that integrates multimodal images.
[0256] S510: Surgical Approach Selection and Optimization: The system combines the patient's individualized anatomical features, the precise location of the lesion, and the expected surgical type (e.g., thoracoscopic right upper lobe resection for right lung cancer) to construct a weighted voxel map in three-dimensional anatomical space. Each voxel (1mm × 1mm × 1mm) in the map represents a possible spatial location, and each edge represents a potential path between adjacent voxels. The weight of the edges is determined by the following composite factors:
[0257] Safety distance weight: The minimum Euclidean distance between the path and critical structures such as important blood vessels (e.g., pulmonary artery trunk, aortic arch, superior vena cava), airways (main bronchus, lobar bronchus), nerves (phrenic nerve, recurrent laryngeal nerve), and esophagus. The closer the distance, the greater the weight. For example, a path less than 2 mm from an important blood vessel has a weight of 100; 2-5 mm, a weight of 50; 5-10 mm, a weight of 10; and greater than 10 mm, a weight of 1.
[0258] Tissue damage weighting: The inherent risk of trauma when traversing tissues such as muscle, fat, lung parenchyma, lymph nodes, and bone. For example, traversing the ribs has a weight of 50, lung parenchyma has a weight of 5, and fat has a weight of 1.
[0259] Visibility and operability weighting: This is achieved by simulating the field of vision and operating space of surgical instruments (such as thoracoscopes) under a specific approach. The weighting increases if there are blind spots or instrument collision risks along the path.
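The tiered weights above can be written as small lookup functions. The text does not say how the three factors are combined or how the tier boundaries at exactly 2, 5, and 10 mm are handled; this sketch assumes an additive combination and half-open intervals, and the tissue keys are hypothetical labels.

```python
# Example tissue weights from the text; keys are hypothetical labels.
TISSUE_WEIGHT = {"rib": 50, "lung_parenchyma": 5, "fat": 1}


def safety_weight(distance_mm):
    """Weight from the minimum distance to a critical structure."""
    if distance_mm < 2.0:
        return 100
    if distance_mm < 5.0:   # assumed: 2 mm <= d < 5 mm
        return 50
    if distance_mm <= 10.0:  # assumed: 5 mm <= d <= 10 mm
        return 10
    return 1


def edge_weight(distance_mm, tissue, visibility_penalty=0):
    """Composite weight, assuming the factors are simply added."""
    return safety_weight(distance_mm) + TISSUE_WEIGHT[tissue] + visibility_penalty
```

For example, a path segment through fat that passes 3 mm from a major vessel would receive a weight of 51 under these assumptions.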
[0260] The A* search algorithm is employed to search for the optimal surgical path on the voxel map from a pre-defined surface incision (e.g., the anterior axillary line of the right fifth intercostal space) to the target lesion or surgical area, with the objective of minimizing the overall "cost" (i.e., the sum of all weights). The heuristic function of the A* algorithm is set as the weighted sum of the Euclidean distance from the current node to the target node and the cumulative risk weights of the current path. The final output path is a three-dimensional coordinate sequence with the lowest overall risk score.
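The path search can be sketched with a textbook A* on a small voxel grid. In this sketch the accumulated weights form the path cost g, and the heuristic h is the Euclidean distance to the target scaled by the smallest voxel weight, which keeps the heuristic admissible; this is one possible reading of the heuristic described above, not necessarily the exact formulation used. The grid layout, 6-connectivity, and the convention that a step costs the weight of the voxel entered are assumptions.

```python
import heapq
import math

STEPS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def astar(cost, start, goal):
    """cost[z][y][x] is the weight of entering a voxel; None = forbidden."""
    nz, ny, nx = len(cost), len(cost[0]), len(cost[0][0])
    min_cost = min(c for plane in cost for row in plane
                   for c in row if c is not None)

    def h(p):
        return min_cost * math.dist(p, goal)

    best = {start: 0.0}
    parent = {start: None}
    heap = [(h(start), 0.0, start)]
    while heap:
        _, g, node = heapq.heappop(heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g
        if g > best[node]:
            continue  # stale heap entry
        for dz, dy, dx in STEPS:
            n = (node[0] + dz, node[1] + dy, node[2] + dx)
            if not (0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx):
                continue
            c = cost[n[0]][n[1]][n[2]]
            if c is None:
                continue
            ng = g + c
            if ng < best.get(n, math.inf):
                best[n] = ng
                parent[n] = node
                heapq.heappush(heap, (ng + h(n), ng, n))
    return None, math.inf
```

On a grid where two high-weight voxels (weight 100) block the direct route, the search returns the longer detour through weight-1 voxels because its total cost is lower.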
[0261] S520: Tumor Resection Boundary Delineation and Lymph Node Dissection Strategy: Based on the nature of the lesion (e.g., early-stage lung adenocarcinoma with a diameter of 3cm) and clinical guidelines (e.g., NCCN guidelines), the system automatically generates a safe resection margin isosurface on the 3D model using a 3D Euclidean distance transformation. This isosurface is at a preset safe distance from the tumor surface (e.g., 5 mm for the lateral resection margin and 2 mm for the medial resection margin for non-small cell lung cancer, balancing thorough resection and preservation of lung function). This isosurface is displayed in semi-transparent green, visually indicating the resection range. Simultaneously, the standardized IASLC thoracic lymph node zonation map is precisely registered to the individual patient model. Combining the SUV value of PET images (e.g., SUVmax>2.5), the morphological characteristics of lymph nodes on CT images (e.g., short diameter>10mm, round or heterogeneous enhancement), and clinicopathological information, the system uses a trained random forest model to automatically assess the metastasis risk of each lymph node station (e.g., subcarinal lymph nodes in group 7 and paratracheal lymph nodes in group 4R), and marks them with colors (red for high risk, yellow for medium risk, and green for low risk). Based on risk assessment, the scope and path of lymph node dissection are intelligently planned. For example, for high-risk lymph node stations, systematic lymph node dissection is recommended, and a dissection path that minimizes damage to surrounding structures is planned, such as along the vascular sheath.
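The resection margin can be illustrated with a brute-force distance computation on a tiny grid. This is only a sketch: it measures distances between voxel centers rather than from the exact tumor surface, and a real system would use an optimized Euclidean distance transform from an image-processing library instead of the nested loops shown here. The function name and data layout are hypothetical.

```python
import math


def margin_region(tumor_voxels, shape, margin_mm, spacing=(1.0, 1.0, 1.0)):
    """Non-tumor voxels within margin_mm of the nearest tumor voxel."""
    tumor = set(tumor_voxels)
    tumor_mm = [tuple(i * s for i, s in zip(t, spacing)) for t in tumor]
    region = set()
    for z in range(shape[0]):
        for y in range(shape[1]):
            for x in range(shape[2]):
                p = (z, y, x)
                if p in tumor:
                    continue
                p_mm = tuple(i * s for i, s in zip(p, spacing))
                nearest = min(math.dist(p_mm, t) for t in tumor_mm)
                if nearest <= margin_mm:
                    region.add(p)
    return region
```

The outer boundary of the returned region approximates the safe resection margin isosurface for the chosen distance (e.g., 5 mm).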
[0262] S530: Vascular, Airway, and Nerve Protection Strategy: The system continuously monitors in real-time the minimum distances between the planned surgical path, resection plane, or virtual instrument tip (simulating the position and orientation of surgical instruments in 3D space) and key anatomical structures such as major blood vessels (e.g., the pulmonary artery and its branches, pulmonary veins and their branches, and the aorta), airways (trachea, left and right main bronchi, and lobar bronchi), phrenic nerve, and recurrent laryngeal nerve. Once the distance falls below a preset safety threshold (e.g., less than 2mm from major blood vessels, less than 1mm from nerves, and less than 3mm from the airway wall), the system issues a visual warning to the surgeon (e.g., highlighting threatened critical structures in flashing red, displaying real-time distance values in the 3D view accompanied by a red numerical alarm), and assists in adjusting the path to avoid potential damage. For example, if the planned path is too close to the pulmonary artery, the system will display "Warning: 1.5mm from a pulmonary artery branch, risk of damage, please adjust the path," and provide alternative safe path suggestions.
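The threshold check behind these warnings can be sketched as a simple comparison per structure. The thresholds come from the example values above; the dictionary layout, category labels, and function name are hypothetical.

```python
# Example thresholds from the text, keyed by hypothetical category labels.
SAFETY_THRESHOLD_MM = {"vessel": 2.0, "nerve": 1.0, "airway": 3.0}


def check_clearance(distances_mm, structure_types):
    """Return warning messages for structures closer than their threshold.

    distances_mm maps a structure name to its current minimum distance;
    structure_types maps the same name to a key of SAFETY_THRESHOLD_MM.
    """
    warnings = []
    for name, distance in sorted(distances_mm.items()):
        limit = SAFETY_THRESHOLD_MM[structure_types[name]]
        if distance < limit:
            warnings.append(
                f"Warning: {distance:.1f}mm from {name}, "
                "risk of damage, please adjust the path"
            )
    return warnings
```

Calling this function each time the path or the virtual instrument tip moves would reproduce the real-time behavior described, provided the minimum distances are recomputed on every update.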
[0263] S540: Surgical Procedure Sequence Simulation and Risk Assessment: Allows surgeons to simulate virtual surgical procedures on a 3D model, including instrument grasping, cutting (e.g., lung tissue, blood vessels, bronchi), suturing, clamping, etc. The system simulates tissue deformation and response based on a pre-set elastic physics model (e.g., lung tissue deforms under instrument grasping, and blood vessels show cross-sections after cutting). Based on the patient's individual anatomical characteristics, lesion characteristics, and planned surgical path, the system can also use machine learning models to predict intraoperative blood loss (e.g., predicted range 50-150ml) and the probability of common complications such as postoperative pneumothorax (5% probability), air leak (12% probability), nerve injury (1% probability), and postoperative infection (3% probability). One example is a gradient boosting tree model trained on thousands of historical surgical cases, with input features including tumor volume, location, distance from major blood vessels, patient lung function, pathological stage, and planned surgical duration. The prediction results are overlaid on the 3D model in the form of risk levels (low, medium, high) or heat maps, clearly indicating potential high-risk areas or procedures, providing surgeons with comprehensive risk insights and assisting in the development of avoidance strategies.
[0264] Step S600: Visualization and Interaction.
[0265] This step utilizes a visualization and interactive module to present all the above processing and planning results in an intuitive and multi-dimensional way. This module supports volume rendering, surface rendering, and hybrid rendering modes, allowing doctors to freely switch views (axial, coronal, and sagittal 2D views and 3D views are synchronously linked), perform cross-sections in any direction, and perform real-time rotation, translation, scaling, measurement (e.g., measuring the maximum tumor diameter of 3.2cm and the distance to the main trachea of 1.8cm), and virtual annotation (e.g., marking the placement of drainage tubes). Doctors can directly draw or modify the surgical path and resection range on the 3D model, and the system provides real-time feedback on corresponding risk assessments (e.g., when the doctor modifies the resection margin, the system immediately updates the estimated blood loss and lung function impact). Finally, the system can automatically generate a structured surgical report based on the planning results, covering all key information and risk assessments, and supports exporting to standard PDF or DICOM SR format for easy archiving and communication.
[0266] Step S700: Knowledge base and database management.
[0267] This step involves the operation of the knowledge base and database modules. All patients' original images, processed composite images, 3D reconstruction models, and detailed preoperative planning results are securely stored in encrypted form (AES-256 encryption). The knowledge base simultaneously maintains high-resolution standard anatomical atlases, thoracic surgery guidelines and expert experience (e.g., standard operating videos for segmentectomy, expert resection techniques for specific lesions), and all trained machine learning models (versions V1.0, V1.1, etc.). The knowledge base is specifically implemented as a rule engine, in which the stored "surgical experience" is transformed into a series of "IF-THEN" rules. This rule engine interacts with the intelligent preoperative planning and path optimization module, dynamically adjusting weights during the A* algorithm's voxel map construction. For example, a rule might be: "IF the tumor type is central lung cancer AND the path segment is within 3mm of the main bronchus THEN the result of the safe distance penalty function for this path segment is multiplied by a preset penalty factor (e.g., 1.5)." In this way, clinical experience in the knowledge base is transformed into specific constraints on the path search space, thereby guiding the algorithm to generate optimal paths that are more in line with clinical practice. This module provides efficient data indexing, retrieval, and management functions, supporting doctors in quickly querying data based on various conditions such as patient ID, diagnosis type, and surgery date. At the same time, the system continuously provides data support and model update services to other modules through API interfaces. For example, when new clinical data is labeled and validated, the system automatically triggers the retraining and updating of the machine learning model.
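The IF-THEN rule mechanism can be sketched as a list of condition and factor pairs applied to a path segment's base penalty. The class, field names, and segment dictionary keys are hypothetical, and "within 3mm" is assumed to mean a distance of at most 3 mm.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # IF part, evaluated on a path segment
    penalty_factor: float              # THEN part, multiplies the penalty


# Example rule from the text, with hypothetical segment keys.
CENTRAL_NEAR_MAIN_BRONCHUS = Rule(
    name="central tumor near main bronchus",
    condition=lambda seg: (seg["tumor_type"] == "central lung cancer"
                           and seg["dist_main_bronchus_mm"] <= 3.0),
    penalty_factor=1.5,
)


def apply_rules(base_penalty, segment, rules):
    """Multiply the safe-distance penalty by the factor of each matching rule."""
    penalty = base_penalty
    for rule in rules:
        if rule.condition(segment):
            penalty *= rule.penalty_factor
    return penalty
```

With a base penalty of 50, a segment 2.5 mm from the main bronchus in a central lung cancer case would be raised to 75, which makes the path search less likely to route through it.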
[0268] III. Examples and Comparative Examples
[0269] To further illustrate the advantages and non-obviousness of the thoracic surgery preoperative planning system and method integrating multimodal imaging of the present invention, a detailed description will be provided below through a specific embodiment and a comparative example.
[0270] Example: Preoperative planning for a patient with central lung cancer
[0271] Patient: A 62-year-old male diagnosed with central non-small cell lung cancer in the right upper lobe, with a tumor diameter of approximately 3.5 cm. The tumor was located near the right upper lobe bronchus and pulmonary artery branches. PET-CT showed a tumor SUVmax of 7.8 and a group 4R lymph node with an SUVmax of 3.1. Clinical evaluation indicated a planned thoracoscopic right upper lobectomy and lymph node dissection.
[0272] Data acquisition and preprocessing (step S100):
[0273] The patient underwent high-resolution CT scan (slice thickness 0.625 mm), MRI T2WI scan (voxel 0.7 x 0.7 x 1.5 mm), and PET-CT scan (voxel 4 x 4 x 4 mm, SUVmax 7.8).
[0274] The system automatically converts the data format to NIfTI and performs a quality assessment. The CT image signal-to-noise ratio is 35 dB, and the MRI shows no obvious motion artifacts.
[0275] Image enhancement: CT uses 3D anisotropic diffusion filtering (25 iterations), and MRI uses wavelet denoising.
[0276] Intensity normalization: CT normalized to [-1000, 3000] HU, MRI to [0, 1000], PET to [0, 10] SUV.
[0277] Spatial resampling: All images are resampled to 1.0mm×1.0mm×1.0mm isotropic voxels.
[0278] High-precision registration and fusion of multimodal images (step S200):
[0279] Initial alignment: Rigid body alignment based on the spine and the tracheal bifurcation point in CT images, correcting the rough positions of MRI and PET.
[0280] Non-rigid registration: The LDDMM algorithm was used for high-precision non-rigid registration of PET and CT / MRI images. The MI-based optimization converged, and the average target registration error (TRE) between PET and CT was 0.38 mm. The deformation field was smooth and anatomically plausible.
[0281] Fusion: The MSGA algorithm was used to fuse registered CT, MRI, and PET images into a unified composite dataset. The fused images clearly displayed lung anatomical details, soft tissue contrast, and metabolically active areas of the tumor.
[0282] Precise segmentation and 3D reconstruction of key anatomical structures (step S300):
[0283] Deep learning-driven semantic segmentation: The system utilizes a pre-trained 3D U-Net model to automatically segment fused image data. It accurately identifies and segments the right upper lobe tumor (18.3 cubic centimeters), right upper lobe bronchus, right upper lobe arterial and venous branches, group 4R lymph nodes (1.2 cubic centimeters), trachea, main bronchus, esophagus, and adjacent chest wall. The Dice coefficient for tumor segmentation is 0.93, and the Dice coefficient for bronchial tree segmentation is 0.91.
[0284] 3D Reconstruction: The segmentation results were converted into a high-resolution 3D model using the Marching Cubes algorithm, followed by mesh smoothing (15 Laplacian iterations), mesh simplification (60% reduction in the number of faces), and topology repair. The final 3D model is highly detailed, clearly showing the complex relationship between the tumor and surrounding blood vessels and bronchi, and even clearly showing signs of tumor invasion into the upper lobe bronchial wall.
[0285] Mapping of lesions to functional information (step S400):
[0286] PET standardized uptake value (SUV) calculation: The system calculated the SUVmax of the tumor region to be 7.8, the SUVmean to be 5.2, and the MTV to be 15.1 cubic centimeters. The SUVmax of the lymph nodes in group 4R was 3.1.
[0287] Functional Information Spatial Mapping: The SUV value of the tumor is superimposed onto the three-dimensional surface model of the tumor as a gradient spectrum from red to yellow. High SUV areas (such as the tumor core) are displayed as dark red, and low SUV areas are displayed as yellow. The transparency is adjusted according to the SUV value, so that the most metabolically active areas stand out prominently on the three-dimensional model, intuitively revealing the biological invasiveness of the tumor.
[0288] Intelligent preoperative planning and pathway optimization (step S500):
[0289] Surgical approach selection and optimization: Based on the patient's 3D anatomical model and considering the characteristics of thoracoscopic surgery, the system recommends the right fifth intercostal anterior axillary line approach as the optimal main operating port. Using the A* search algorithm, a path from the external incision to the tumor is planned. This path bypasses the pulmonary artery and the main upper lobe bronchus and minimizes trauma by reducing the volume of lung parenchyma traversed. The path length is 18.5cm, and the path has the lowest overall risk score.
[0290] Tumor resection boundary delineation and lymph node dissection strategy: The system automatically defines the safe resection margin of the tumor: an isosurface 5 mm from the tumor surface. For areas where the tumor invades the upper lobe bronchial wall, bronchial sleeve resection or extended marginal resection is recommended. For group 4R lymph nodes, due to their SUVmax > 2.5, the system assesses a high risk of metastasis, and systematic dissection of group 4R and group 7 (subcarinal) lymph nodes is recommended, with a dissection path planned along the paratracheal and paraesophageal sides for minimal damage.
[0291] Vascular, airway, and nerve protection strategies: During the planning process, the system monitors the minimum distance between the surgical path and the pulmonary artery, upper lobe bronchus, and phrenic nerve in real time. When the virtual instrument tip approaches the main pulmonary artery within 3mm, the system immediately highlights the pulmonary artery in red and issues an alarm: "Warning: 3.0mm from the main pulmonary artery, potential risk of damage," prompting the surgeon to adjust the path. When planning bronchial incisions, the system calculates and displays the distance between the cutting surface and the main bronchus in real time to ensure safety.
[0292] Surgical procedure sequence simulation and risk assessment: The surgeon simulated procedures such as right upper lobectomy, vascular and bronchial transection, and lymph node dissection on a 3D model. Based on a machine learning model, the system predicted intraoperative blood loss of 80-120ml, a postoperative air leakage probability of 8%, a phrenic nerve injury probability of 1% (extremely low), and a postoperative infection probability of 3%. The risk visualization function overlaid a heatmap showing the pulmonary artery injury risk area.
[0293] Visualization and Interaction (Step S600):
[0294] The system displays a 3D anatomical model in a volume rendering mode (the lung parenchyma is semi-transparent, while the blood vessels and bronchi are opaque), and overlays surface-rendered tumors (highlighted in red) and planned paths (blue lines) on top of it.
[0295] Doctors can rotate and zoom in real time using interactive tools. Through the virtual dissection function, doctors can clearly observe the relationship between the tumor and the segmental bronchus, as well as its proximity to the pulmonary artery branches.
[0296] Automatically generate a detailed surgical report, including the precise tumor volume, distance from important structures, recommended resection margins, extent of lymph node dissection, estimated blood loss, and probability of complications, and export it as a PDF file.
[0297] Comparative Analysis: Preoperative Planning Based on Traditional 2D Imaging and Human Experience
[0298] For the same patient, without using the system of this invention, preoperative planning is carried out using traditional two-dimensional imaging (CT / PET-CT film or PACS system two-dimensional image reading) and the doctor's experience.
[0299] Image review and preliminary assessment:
[0300] Doctors review the two-dimensional CT and PET-CT images slice by slice and attempt to reconstruct the three-dimensional anatomical relationships mentally.
[0301] Tumor size is estimated through two-dimensional measurements (such as the maximum diameter of 3.5 cm in axial CT scans), while its relationship with surrounding blood vessels and bronchi relies mainly on the doctor's experience.
[0302] Lymph node assessment: CT showed group 4R lymph nodes with a short-axis diameter of 1.1 cm, and PET showed an SUVmax of 3.1. Based on experience, these were identified as suspicious lymph nodes.
[0303] This process relies heavily on the doctor's experience and spatial imagination, and it is easy to miss small lesions or complex adjacent relationships.
[0304] Surgical approach and margin delineation:
[0305] The surgical approach is mainly selected based on the surgeon's past experience and the approximate location of the lesion, such as choosing the right fifth intercostal approach. Multi-path comparison and optimization are not possible.
[0306] Defining tumor margins: Margins are usually based on visual estimation from two-dimensional images and on experience, for example, planning to resect 2 cm beyond the tumor edge. However, the margin cannot be defined precisely in every direction in three-dimensional space, nor can the safe distance be verified in real time.
[0307] Lymph node dissection scope: Dissection usually follows the standard scope, without an accurate assessment of the metastasis risk of individual lymph nodes or optimization of the dissection path.
[0308] Risk assessment:
[0309] The risk of complications such as intraoperative bleeding and nerve damage mainly depends on the doctor's experience with similar cases and judgment of the patient's underlying diseases, and lacks quantitative prediction based on individual anatomical characteristics.
[0310] Virtual surgical simulations are not possible, and surgical procedures and potential difficulties cannot be visually rehearsed.
[0311] Performance comparison and quantitative data:
[0312] The table below compares the differences in key performance indicators between the system of the present invention (example) and the conventional method (comparative example):
[0313]
[0314] Through detailed comparisons of the above embodiments and comparative examples, the superiority of the present invention, "A Preoperative Planning System for Thoracic Surgery Integrating Multimodal Imaging," has been fully demonstrated. This invention achieves revolutionary breakthroughs in deep fusion of multimodal images, precise identification of complex anatomical structures, and three-dimensional reconstruction, overcoming the inherent limitations of traditional methods in terms of information integration and accuracy. More importantly, by introducing intelligent preoperative planning and path optimization modules, this invention fundamentally changes the pattern of thoracic surgical planning, elevating the previous reliance on experience-based qualitative judgment to data-driven quantitative decision-making. Especially in surgical approach selection, safe surgical margin definition, protection of critical structures, and prediction of complication risks, this system provides unprecedented accuracy and predictability. This refined and personalized preoperative planning can significantly improve the safety and success rate of surgery, reduce the risk of intraoperative complications, and is expected to improve postoperative recovery. Simultaneously, the intuitive visualization and interactive functions greatly optimize the efficiency of doctor-patient communication and teaching training. Therefore, this invention not only possesses significant non-obviousness in its technology but also demonstrates enormous practical value and broad prospects in clinical application.
[0315] The technical solutions of the present invention are not limited to the specific embodiments described above. Any technical modifications made in accordance with the technical solutions of the present invention fall within the protection scope of the present invention.
Claims
1. A preoperative planning system for thoracic surgery integrating multimodal imaging, characterized in that it comprises: a data acquisition and preprocessing module (1), used to receive and standardize multimodal raw image data from different medical imaging devices; a multimodal image registration and fusion module (2), connected to the data acquisition and preprocessing module (1) and used to perform high-precision spatial alignment and information integration on the standardized multimodal images; an anatomical structure precise segmentation and three-dimensional reconstruction module (3), connected to the multimodal image registration and fusion module (2) and used to automatically identify and accurately delineate the key anatomical structures and lesions in the thoracic cavity from the integrated multimodal image data, and to convert them into a three-dimensional model that can be used for visualization and planning; a lesion and functional information mapping module (4), connected to the anatomical structure precise segmentation and three-dimensional reconstruction module (3) and used to accurately superimpose and map the tissue metabolic activity information reflected by the PET image, as well as other potential functional or pathological information, onto the three-dimensional anatomical model; an intelligent preoperative planning and path optimization module (5), connected to the lesion and functional information mapping module (4) and used to automatically generate and optimize the surgical path and operation plan based on the fused multimodal images and the reconstructed three-dimensional model, combined with the characteristics of thoracic surgery and the doctor's experience; a visualization and interaction module (6), connected to the intelligent preoperative planning and path optimization module (5) and used to present the multimodal image data, the three-dimensional reconstruction model and the planning results in an intuitive and multidimensional way, and to provide a flexible human-computer interaction interface that supports doctors' personalized adjustments and decisions; and a knowledge base and database (7), used to store the patient's image data, reconstruction models, planning results, surgical templates, anatomical atlases and related clinical guidelines, and to serve as data support for the intelligent modules.
2. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 1, characterized in that the data acquisition and preprocessing module (1) is equipped with a DICOM standard interface, an HL7 interface, and other common medical image data format interfaces for receiving computed tomography (CT) images, magnetic resonance imaging (MRI) images, and positron emission tomography (PET) image data; the data acquisition and preprocessing module (1) includes: a data format conversion unit, used to convert the received raw image data into a standardized data structure within the system; an image quality assessment unit, used to perform quality checks on the converted image data, including noise level assessment, artifact detection, image integrity verification, and signal-to-noise ratio analysis; an image enhancement unit, used to apply adaptive image enhancement algorithms based on the characteristics of the different imaging modalities; an intensity normalization unit, used to eliminate image intensity differences caused by different scanning devices or scanning parameters by mapping the intensity values of all images to a uniform grayscale or density range; and a spatial resampling unit, used to resample image data with different resolutions and voxel spacings to a uniform spatial resolution.
3. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 2, characterized in that the image enhancement unit in the data acquisition and preprocessing module (1) is used to: for CT images, apply non-local means filtering or three-dimensional anisotropic diffusion filtering to remove noise while preserving edge details; for MRI images, apply a denoising algorithm based on the wavelet transform or an adaptive filter based on the Rician noise model; and for PET images, apply iterative reconstruction algorithms or post-processing smoothing filters to improve the signal-to-noise ratio and contrast; the spatial resampling unit is used to resample the image data to isotropic voxels of 1.0 mm × 1.0 mm × 1.0 mm, the resampling algorithm using trilinear interpolation or a higher-order interpolation method based on B-splines.
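As an illustration of the resampling step named in claim 3, the following sketch resamples a small volume to 1.0 mm isotropic voxels with trilinear interpolation. It is a simplified example under stated assumptions: the volume is a nested list indexed `vol[z][y][x]`, the spacing is given as (x, y, z) in millimetres, and the function names `trilinear_sample` and `resample_isotropic` are hypothetical. A production system would normally rely on an optimized imaging library instead.

```python
def trilinear_sample(vol, x, y, z):
    """Sample vol[z][y][x] at a fractional voxel coordinate, clamped to the grid."""
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    x = min(max(x, 0.0), nx - 1)
    y = min(max(y, 0.0), ny - 1)
    z = min(max(z, 0.0), nz - 1)
    x0, y0, z0 = int(x), int(y), int(z)
    x1, y1, z1 = min(x0 + 1, nx - 1), min(y0 + 1, ny - 1), min(z0 + 1, nz - 1)
    fx, fy, fz = x - x0, y - y0, z - z0

    def lerp(a, b, t):
        return a + (b - a) * t

    # Interpolate along x, then y, then z.
    c00 = lerp(vol[z0][y0][x0], vol[z0][y0][x1], fx)
    c01 = lerp(vol[z0][y1][x0], vol[z0][y1][x1], fx)
    c10 = lerp(vol[z1][y0][x0], vol[z1][y0][x1], fx)
    c11 = lerp(vol[z1][y1][x0], vol[z1][y1][x1], fx)
    return lerp(lerp(c00, c01, fy), lerp(c10, c11, fy), fz)


def resample_isotropic(vol, spacing_xyz, new_spacing=1.0):
    """Resample to cubic voxels of edge new_spacing (mm) over the same extent."""
    sx, sy, sz = spacing_xyz
    nz, ny, nx = len(vol), len(vol[0]), len(vol[0][0])
    out_nx = int(round((nx - 1) * sx / new_spacing)) + 1
    out_ny = int(round((ny - 1) * sy / new_spacing)) + 1
    out_nz = int(round((nz - 1) * sz / new_spacing)) + 1
    return [
        [
            [
                trilinear_sample(
                    vol,
                    i * new_spacing / sx,
                    j * new_spacing / sy,
                    k * new_spacing / sz,
                )
                for i in range(out_nx)
            ]
            for j in range(out_ny)
        ]
        for k in range(out_nz)
    ]


# Two 2x2 slices that are 2 mm apart along z; in-plane spacing is 1 mm.
volume = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[10.0, 10.0], [10.0, 10.0]],
]
resampled = resample_isotropic(volume, spacing_xyz=(1.0, 1.0, 2.0))
```

In this toy case the output gains an intermediate slice halfway between the two input slices, whose values are the average of the neighbouring slices.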
4. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 1, characterized in that the multimodal image registration and fusion module (2) includes: an initial alignment unit, which uses a rigid registration method based on feature points or image moments to perform coarse spatial alignment of the input images from different modalities; and a non-rigid registration unit, which, starting from the initial alignment, runs a voxel-based large deformation diffeomorphic registration algorithm to correct local deformations caused by respiratory motion or organ deformation; the voxel-based large deformation diffeomorphic registration algorithm generates a diffeomorphic mapping $\phi$ by solving an optimization problem; the evolution of the mapping $\phi_t$ over time $t$ from 0 to 1 is driven by a smooth velocity field $v_t$, i.e. $\frac{\partial \phi_t}{\partial t} = v_t(\phi_t)$ with $\phi_0 = \mathrm{Id}$, and the optimization objective function $E(v)$ is defined as: $E(v) = -\,\mathrm{sim}(I_F,\ I_M \circ \phi_1) + \lambda \int_0^1 \lVert L v_t \rVert^2 \, dt$, where $I_F$ and $I_M$ denote the fixed and moving images, sim is the image similarity measurement function, using mutual information, $\lambda$ is a positive regularization parameter, and $L$ is a differential operator.
5. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 4, characterized in that the multimodal image registration and fusion module (2) further includes a fusion algorithm unit, which is used to fuse the image data of the different modalities into a single, information-rich composite image dataset after the high-precision registration is completed; the fusion algorithm unit adopts a multi-scale geometric analysis fusion algorithm, which first decomposes the registered multimodal images into sub-bands of multiple scales and directions, then integrates information using a weighted average, maximum-energy selection, or sparse-representation fusion rule chosen according to the characteristics of each sub-band, and finally reconstructs the fused sub-bands into a unified composite image through the inverse transform.
6. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 1, characterized in that the anatomical structure precise segmentation and three-dimensional reconstruction module (3) includes: a deep learning-driven semantic segmentation unit, used to automatically segment the lungs, trachea, bronchi, pulmonary vessels, heart, esophagus, chest wall, lymph nodes, and lesions using a 3D U-Net model and its variants, wherein the 3D U-Net model adopts an encoder-decoder structure and fuses multi-scale contextual information with fine spatial information through skip connections, and, during training, the model is optimized using a weighted combination of Dice loss and cross-entropy loss as a composite loss function; and a three-dimensional surface reconstruction unit, used to convert the segmented discrete voxel data into a continuous three-dimensional geometric model, the three-dimensional surface reconstruction unit preferably adopting the marching cubes algorithm.
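To make the composite loss named in claim 6 concrete, here is a minimal sketch for the binary case, written over flat lists of per-voxel foreground probabilities and 0/1 labels. The equal weights `w_dice=0.5` and `w_ce=0.5` and all function names are assumptions for illustration; the claim only states that the two losses are combined with weights and does not give their values.

```python
import math


def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), smoothed by eps."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def cross_entropy_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clamped away from 0 and 1."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def composite_loss(probs, targets, w_dice=0.5, w_ce=0.5):
    """Weighted sum of Dice and cross-entropy losses (weights are hypothetical)."""
    return w_dice * dice_loss(probs, targets) + w_ce * cross_entropy_loss(probs, targets)


labels = [1, 0, 1, 0]
confident = composite_loss([0.9, 0.1, 0.9, 0.1], labels)
uncertain = composite_loss([0.5, 0.5, 0.5, 0.5], labels)
print(confident < uncertain)
```

The Dice term counteracts class imbalance (small lesions against a large background), while the cross-entropy term supplies smooth per-voxel gradients, which is the usual motivation for combining them.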
7. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 6, characterized in that the anatomical structure precise segmentation and three-dimensional reconstruction module (3) further includes a model optimization and topology correction unit, used to optimize the generated three-dimensional model, including: mesh smoothing, which applies the Laplacian smoothing algorithm to remove jagged (staircase) surface artifacts; mesh simplification, which uses the quadric error metric edge collapse algorithm or a vertex clustering algorithm to reduce the number of triangles in the model; and topology repair, which detects and repairs holes, self-intersections, and non-manifold edge defects in the model.
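The mesh smoothing step in claim 7 can be illustrated with a basic Laplacian smoothing pass, in which each vertex moves part of the way toward the centroid of its edge-connected neighbours. This is a sketch only: the step size `lam`, the iteration count, and the function name `laplacian_smooth` are assumptions, and the claim does not specify which Laplacian variant is used.

```python
def laplacian_smooth(vertices, triangles, lam=0.5, iterations=1):
    """Move every vertex a fraction lam toward the centroid of its neighbours.

    vertices: list of (x, y, z) tuples; triangles: list of vertex-index triples.
    All vertices are updated simultaneously from the previous iteration's positions.
    """
    neighbors = [set() for _ in vertices]
    for a, b, c in triangles:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))

    verts = [tuple(float(c) for c in v) for v in vertices]
    for _ in range(iterations):
        updated = []
        for i, v in enumerate(verts):
            if not neighbors[i]:
                updated.append(v)  # isolated vertex: leave unchanged
                continue
            n = len(neighbors[i])
            centroid = tuple(sum(verts[j][k] for j in neighbors[i]) / n for k in range(3))
            updated.append(tuple(v[k] + lam * (centroid[k] - v[k]) for k in range(3)))
        verts = updated
    return verts


# A four-triangle fan whose centre vertex sticks out by 1 unit along z.
fan_vertices = [(0, 0, 1), (1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0)]
fan_triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
smoothed = laplacian_smooth(fan_vertices, fan_triangles, lam=0.5)
```

Plain Laplacian smoothing shrinks the mesh as it removes staircase artifacts, so implementations often limit the number of iterations or alternate it with an inflating step.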
8. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 1, characterized in that the lesion and functional information mapping module (4) includes: a PET standardized uptake value (SUV) calculation unit, used to calculate the SUV based on the raw radioactivity data of the PET images together with the patient's weight, injected dose, and scan time, including the SUVmax, SUVmean, and total metabolic tumor volume parameters; and a functional information spatial mapping unit, used to accurately superimpose, through interpolation and alignment, the calculated SUV values or other functional indicators onto the surface or internal voxels of the three-dimensional anatomical model generated by the anatomical structure precise segmentation and three-dimensional reconstruction module (3), in the form of color coding or transparency mapping; the mapping method includes rendering the color directly onto the tumor surface or displaying the internal metabolic activity area in the three-dimensional voxel model through volume rendering.
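The SUV calculation referred to in claim 8 can be sketched with the common body-weight formula: tissue activity concentration divided by the decay-corrected injected dose per gram of body weight. The sketch below rests on several assumptions not stated in the claim: the tracer is an F-18 compound (half-life of about 109.77 min), 1 g of tissue is treated as 1 mL, and the metabolic tumor volume is delineated at 41% of SUVmax, which is one commonly used convention among several. The function names are hypothetical.

```python
F18_HALF_LIFE_S = 6586.2  # about 109.77 min; assumes an F-18 labelled tracer


def suv_bw(activity_bq_per_ml, weight_kg, injected_dose_bq, elapsed_s,
           half_life_s=F18_HALF_LIFE_S):
    """Body-weight SUV with the injected dose decay-corrected to scan time."""
    dose_at_scan = injected_dose_bq * 2.0 ** (-elapsed_s / half_life_s)
    return activity_bq_per_ml * (weight_kg * 1000.0) / dose_at_scan


def lesion_summary(suv_values, voxel_volume_ml, threshold_fraction=0.41):
    """SUVmax, SUVmean, and a threshold-based metabolic tumor volume (mL).

    threshold_fraction is an assumed delineation rule, not taken from the source.
    """
    suv_max = max(suv_values)
    suv_mean = sum(suv_values) / len(suv_values)
    n_active = sum(1 for s in suv_values if s >= threshold_fraction * suv_max)
    return {
        "suv_max": suv_max,
        "suv_mean": suv_mean,
        "mtv_ml": n_active * voxel_volume_ml,
    }


# 70 kg patient, 350 MBq injected, scanned exactly one half-life later.
example_suv = suv_bw(5000.0, 70.0, 350e6, elapsed_s=F18_HALF_LIFE_S)
summary = lesion_summary([2.0, 4.0, 6.0, 10.0], voxel_volume_ml=0.001)
```

In practice the weight, injected dose, injection time, and half-life are typically read from the PET DICOM header rather than entered by hand.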
9. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 1, characterized in that the intelligent preoperative planning and path optimization module (5) includes a surgical approach selection and optimization unit, used to construct a weighted voxel map on the three-dimensional anatomical model based on the patient's anatomical characteristics, the lesion location and size, and the expected surgical type; each voxel or node in the voxel map represents a potential waypoint, and the edges connecting the nodes are assigned weights that take into account the following factors: the minimum safe distance from key structures (important blood vessels, airways, nerves, the esophagus, and the heart), potential tissue damage, visibility and operability, and the impact on postoperative recovery; the unit uses the A* search algorithm or Dijkstra's algorithm to find the optimal path on the voxel map from the predetermined skin incision to the target lesion or surgical area.
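The path search over a weighted voxel map described in claim 9 can be sketched with Dijkstra's algorithm on a 6-connected grid. The cost model here is an assumption made for illustration: each voxel carries a single scalar cost (higher near critical structures, infinite where entry is forbidden), and the weight of an edge is the cost of the voxel being entered. How the real system combines safety distance, tissue damage, visibility, and recovery impact into a weight is not specified in the claim.

```python
import heapq

INF = float("inf")
STEPS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def dijkstra_path(cost, start, goal):
    """Cheapest 6-connected path through cost[z][y][x] from start to goal.

    Returns (path, total_cost); path is None when the goal is unreachable.
    """
    nz, ny, nx = len(cost), len(cost[0]), len(cost[0][0])
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > best[node]:
            continue  # stale heap entry
        for dz, dy, dx in STEPS:
            z, y, x = node[0] + dz, node[1] + dy, node[2] + dx
            if not (0 <= z < nz and 0 <= y < ny and 0 <= x < nx):
                continue
            weight = cost[z][y][x]
            if weight == INF:
                continue  # forbidden voxel, e.g. inside a vessel
            candidate = d + weight
            if candidate < best.get((z, y, x), INF):
                best[(z, y, x)] = candidate
                previous[(z, y, x)] = node
                heapq.heappush(heap, (candidate, (z, y, x)))
    if goal not in best:
        return None, INF
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[goal]


# One 3x3 slice; the middle column is expensive (close to a vessel).
cost_map = [[[1, 9, 1],
             [1, 9, 1],
             [1, 1, 1]]]
path, total = dijkstra_path(cost_map, start=(0, 0, 0), goal=(0, 0, 2))
```

A* differs only in adding an admissible heuristic (for example, straight-line distance times the minimum voxel cost) to the priority, which typically reduces the number of voxels expanded without changing the result.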
10. The thoracic surgery preoperative planning system integrating multimodal imaging according to claim 9, characterized in that the intelligent preoperative planning and path optimization module (5) further includes: a tumor resection boundary definition and lymph node dissection strategy unit, used to automatically generate, on the three-dimensional model, a preset safe resection margin relative to the tumor surface through a three-dimensional Euclidean distance transform based on the nature of the lesion and clinical guidelines, and to combine the SUV values of the PET images, the morphological characteristics of the lymph nodes in the CT images, and clinicopathological information to identify suspicious lymph nodes and assess their metastasis risk in order to plan the path and scope of lymph node dissection; a vascular, airway, and nerve protection strategy unit, used to calculate in real time, during the planning process, the minimum distance between the planned surgical path and important blood vessels, airways, and nerves, and to alert the surgeon with a visual warning when the distance falls below a preset safety threshold; and a surgical procedure sequence simulation and risk assessment unit, which allows surgeons to perform virtual surgical operations on the three-dimensional model and uses machine learning models to predict the probability of complications such as intraoperative bleeding, pneumothorax, air leakage, and nerve damage based on the patient's anatomical characteristics, the nature of the lesion, the surgical type, and historical data.
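The distance-based resection margin in claim 10 can be illustrated by marking every voxel whose Euclidean distance to the tumor is within the margin. The sketch below computes the distances by brute force rather than with a true distance transform, which is only practical for very small grids; it also measures distance to tumor voxel centres rather than to a reconstructed surface. The function name `margin_region` and its parameters are hypothetical, and the tumor voxel list is assumed to be non-empty.

```python
import math
from itertools import product


def margin_region(tumor_voxels, shape, spacing_mm, margin_mm):
    """Voxels (z, y, x) lying within margin_mm of any tumor voxel, tumor included.

    spacing_mm is (z, y, x) voxel spacing, so anisotropic grids are handled.
    """
    tumor = list(set(tumor_voxels))
    sz, sy, sx = spacing_mm
    region = set()
    for v in product(range(shape[0]), range(shape[1]), range(shape[2])):
        nearest = min(
            math.sqrt(
                ((v[0] - t[0]) * sz) ** 2
                + ((v[1] - t[1]) * sy) ** 2
                + ((v[2] - t[2]) * sx) ** 2
            )
            for t in tumor
        )
        if nearest <= margin_mm:
            region.add(v)
    return region


# Single tumor voxel at the centre of a 5x5x5 grid with 1 mm isotropic voxels.
tight = margin_region([(2, 2, 2)], (5, 5, 5), (1.0, 1.0, 1.0), margin_mm=1.0)
wider = margin_region([(2, 2, 2)], (5, 5, 5), (1.0, 1.0, 1.0), margin_mm=1.5)
print(len(tight), len(wider))
```

For clinical-size volumes, a linear-time Euclidean distance transform over the tumor mask would replace the brute-force loop, with the margin obtained by thresholding the resulting distance field.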