Non-invasive prediction method and system for ALK rearrangement status in lung adenocarcinoma based on a large medical foundation model
By employing a non-invasive detection method based on a large medical model, and utilizing CT images and clinical information, we have achieved efficient and accurate diagnosis of ALK rearrangement status in lung adenocarcinoma. This solves the problem of invasive detection in existing technologies, improves detection efficiency and accuracy, and reduces costs.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING FRIENDSHIP HOSPITAL CAPITAL MEDICAL UNIV
- Filing Date
- 2026-02-02
- Publication Date
- 2026-06-19
AI Technical Summary
Current methods for detecting ALK rearrangements in lung adenocarcinoma rely on invasive tissue biopsy, which is complex, costly, and slow. Existing AI models for this task generalize poorly, making efficient and accurate non-invasive assisted diagnosis difficult to achieve.
The method uses a large medical model together with conventional chest CT images and clinical information, combining high-precision automatic segmentation with a dedicated deep learning model to non-invasively predict the ALK rearrangement status of lung adenocarcinoma. This involves fine-tuning both the large-scale automatic segmentation model for lung adenocarcinoma and the large-scale ALK rearrangement status prediction model. Low-rank adapters and a dynamic gated cross-attention fusion mechanism are combined, and a visual heatmap is generated to explain the model's decisions.
The method achieves high-precision lesion segmentation and strong generalization, avoids the risks of invasive biopsy, improves the accuracy and reliability of ALK rearrangement status prediction in lung adenocarcinoma, significantly outperforms traditional models, shortens the detection cycle from several days to minutes, and reduces costs.
Figure CN122243866A_ABST
Description
Technical Field
[0001] This invention relates to the field of deep learning technology, specifically to a non-invasive prediction method and system for ALK rearrangement status of lung adenocarcinoma based on a large medical model.
Background Technology
[0002] Lung cancer remains the leading cause of cancer-related deaths worldwide, with non-small cell lung cancer accounting for approximately 85% of all cases and lung adenocarcinoma being the most common histological subtype. The discovery of targetable driver gene mutations has revolutionized lung adenocarcinoma treatment, with targeted therapies significantly improving patient survival and prognosis. Anaplastic lymphoma kinase (ALK) rearrangements are key driver mutations and a core predictor of the efficacy of ALK tyrosine kinase inhibitors (TKIs). Developing precise, non-invasive ALK rearrangement detection technologies therefore has significant clinical value.
[0003] ALK rearrangements occur in approximately 3%-7% of non-small cell lung cancer patients. These patients exhibit unique clinical characteristics and are highly sensitive to TKI treatment. Early and accurate detection of ALK rearrangement status in lung adenocarcinoma is therefore crucial for developing individualized treatment plans. Currently, traditional detection methods rely on invasive tissue biopsies, which are complex, time-consuming, and costly. Manual lesion delineation is subjective, inefficient, and poorly reproducible. Furthermore, existing AI models perform insufficiently and generalize weakly when predicting ALK rearrangements.
[0004] Therefore, how to achieve efficient, accurate, and interpretable non-invasive auxiliary diagnosis of ALK gene rearrangement status in lung adenocarcinoma patients without relying on invasive biopsies, using routine chest medical images and clinical information, through high-precision automatic segmentation and dedicated deep learning models, has become an urgent technical problem to be solved.
Summary of the Invention
[0005] To overcome the problems of existing technologies, such as reliance on invasive tissue biopsies, strong subjectivity of manual segmentation, and insufficient generalization ability of AI models, this invention provides a non-invasive prediction method and system for ALK rearrangement status in lung adenocarcinoma based on a large medical model. Without relying on invasive biopsies, this method utilizes conventional chest computed tomography (CT) images and patient clinical information, and through high-precision automatic segmentation and a dedicated deep learning model, achieves efficient, accurate, and clinically interpretable auxiliary diagnosis of ALK gene rearrangement status in lung adenocarcinoma patients.
[0006] In a first aspect, embodiments of this application provide a non-invasive prediction method for ALK rearrangement status in lung adenocarcinoma based on a large medical model, comprising the following steps:
S1. Acquire chest medical images and corresponding structured clinical information, wherein the chest medical images include lung adenocarcinoma lesion areas;
S2. Input the chest medical images into the automatic segmentation model for lung adenocarcinoma to generate a lesion segmentation mask image, wherein the automatic segmentation model for lung adenocarcinoma is obtained using a fine-tuning strategy based on lung adenocarcinoma-specific cue tokens and a low-rank adapter module;
S3. Based on the lesion segmentation mask image, crop the lesion region image from the chest medical image, input the lesion region image and the clinical information into the large-scale ALK rearrangement state prediction model for lung adenocarcinoma, and obtain the ALK rearrangement prediction result for the lesion region image, wherein the large-scale ALK rearrangement state prediction model for lung adenocarcinoma is obtained using a fine-tuning strategy based on lung adenocarcinoma ALK-specific cue tokens and a low-rank adapter module.
[0007] Furthermore, S1 also includes isotropic resampling of chest medical images, wherein the isotropic resampling employs a trilinear interpolation method.
[0008] Furthermore, the chest medical images are resampled to a voxel spacing of 1.0 mm × 1.0 mm × 1.0 mm.
[0009] Furthermore, S1 also includes preprocessing the chest medical image data, the preprocessing including resampling, normalization, and cropping and padding.
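The preprocessing chain described above (isotropic resampling to 1.0 mm spacing with trilinear interpolation, normalization, then cropping and padding) can be sketched roughly as follows. This is only an illustrative sketch under assumptions: the function names are hypothetical, z-score normalization is assumed because the text does not name the normalization method, trilinear interpolation is implemented as three separable linear interpolations with aligned endpoints, and the fixed output shape is an arbitrary example.

```python
import numpy as np

def _resample_axis(vol, axis, new_len):
    """Linear interpolation along one axis (endpoints aligned)."""
    old_len = vol.shape[axis]
    pos = np.linspace(0.0, old_len - 1, new_len)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, old_len - 1)
    w_shape = [1] * vol.ndim
    w_shape[axis] = new_len
    w = (pos - lo).reshape(w_shape)
    return np.take(vol, lo, axis=axis) * (1.0 - w) + np.take(vol, hi, axis=axis) * w

def resample_isotropic(vol, spacing, target=1.0):
    """Trilinear resampling: linear interpolation applied to each axis in turn."""
    out = vol.astype(np.float64)
    for axis, sp in enumerate(spacing):
        new_len = max(1, int(round(vol.shape[axis] * sp / target)))
        out = _resample_axis(out, axis, new_len)
    return out

def zscore(vol, eps=1e-8):
    """Assumed normalization: zero mean, unit variance over the volume."""
    return (vol - vol.mean()) / (vol.std() + eps)

def crop_or_pad(vol, shape, pad_value=0.0):
    """Center-crop axes that are too long and center-pad axes that are too short."""
    out = np.full(shape, pad_value, dtype=vol.dtype)
    src, dst = [], []
    for have, want in zip(vol.shape, shape):
        n = min(have, want)
        s0, d0 = (have - n) // 2, (want - n) // 2
        src.append(slice(s0, s0 + n))
        dst.append(slice(d0, d0 + n))
    out[tuple(dst)] = vol[tuple(src)]
    return out

# Toy volume with 2.5 mm slice spacing and 1.0 mm in-plane spacing.
ct = np.arange(4 * 6 * 6, dtype=np.float64).reshape(4, 6, 6)
iso = resample_isotropic(ct, spacing=(2.5, 1.0, 1.0))
patch = crop_or_pad(zscore(iso), (8, 8, 8))
```

In practice a library resampler would normally replace the hand-written interpolation; the sketch only makes the order of operations concrete.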
[0010] Furthermore, S3 includes:
S31. Based on the lesion segmentation mask image, crop out the lesion region image from the chest medical image, and preprocess the lesion region image;
S32. Input the preprocessed lesion region image and clinical information into the large model for predicting the ALK rearrangement status of lung adenocarcinoma, and output the prediction results of the ALK rearrangement status of lung adenocarcinoma.
[0011] Furthermore, the visualized heatmap is used to explain the basis of the model's decision-making.
[0012] Furthermore, while the prediction result of the ALK rearrangement status of lung adenocarcinoma is output, a hierarchy-aware gradient-weighted class activation mapping (Grad-CAM) heatmap is generated at the same time.
[0013] Furthermore, the hierarchy-aware Grad-CAM heatmap is generated as follows: during prediction, the feature maps of the last three layers of the Transformer encoder of the large-scale model for predicting the ALK rearrangement status of lung adenocarcinoma, together with their corresponding gradients, are extracted. For each layer, the weight of each channel is calculated from the gradient of the predicted ALK rearrangement status, and the feature map is weighted channel by channel to obtain the activation map of that layer. The activation maps of the last three layers are then weighted and fused into a three-dimensional heatmap, which is reduced to a two-dimensional plane for display by maximum intensity projection.
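The heatmap procedure in [0013] can be illustrated with a small NumPy sketch. It is a sketch under assumptions, not the patented implementation: channel weights are taken as the spatial mean of the gradients (the usual Grad-CAM choice, which the text does not spell out), negative activations are clipped with a ReLU, the three layers are assumed to share one spatial grid, and the layer weights `(0.2, 0.3, 0.5)` are hypothetical.

```python
import numpy as np

def layer_cam(feat, grad):
    """Activation map of one layer. feat and grad have shape (C, D, H, W)."""
    channel_w = grad.mean(axis=(1, 2, 3))              # one weight per channel
    cam = np.tensordot(channel_w, feat, axes=(0, 0))   # weighted sum -> (D, H, W)
    return np.maximum(cam, 0.0)                        # keep positive evidence only

def fuse_cams(feats, grads, layer_weights=(0.2, 0.3, 0.5)):
    """Weighted fusion of the per-layer maps, scaled to [0, 1]."""
    fused = sum(lw * layer_cam(f, g)
                for lw, f, g in zip(layer_weights, feats, grads))
    peak = fused.max()
    return fused / peak if peak > 0 else fused

def mip(heat3d, axis=0):
    """Maximum intensity projection of the 3D heatmap onto a 2D plane."""
    return heat3d.max(axis=axis)

# Stand-in tensors for the last three encoder layers (random, for shape only).
rng = np.random.default_rng(0)
feats = [rng.random((4, 3, 5, 6)) for _ in range(3)]
grads = [rng.random((4, 3, 5, 6)) for _ in range(3)]
heat3d = fuse_cams(feats, grads)
heat2d = mip(heat3d, axis=0)
```

In a real model the feature maps and gradients would come from hooks on the encoder during the backward pass of the predicted class score.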
[0014] Furthermore, the large-scale automatic segmentation model for lung adenocarcinoma is obtained through the following fine-tuning method: a lung adenocarcinoma-specific cue token fine-tuning module is inserted at the end of the encoder of a general medical segmentation model (such as SAM-Med3D, nnU-Net v2, SegVol, or MedSAM), a low-rank adapter module is introduced at the skip connections of the decoder, and the backbone network parameters of the general medical segmentation model are frozen. Furthermore, during the fine-tuning of the large-scale automatic segmentation model for lung adenocarcinoma, only the weight parameters of the lung adenocarcinoma-specific cue token module, the low-rank adapter module, and the segmentation head are updated.
[0015] Furthermore, during the fine-tuning of the large-scale automatic segmentation model for lung adenocarcinoma, a dataset of chest medical images with lesion annotations is obtained, and the chest medical images are preprocessed to obtain a training dataset.
[0016] Furthermore, the training process of the large-scale automatic segmentation model for lung adenocarcinoma includes the following steps:
The labeled chest medical images are input into the large-scale automatic segmentation model for lung adenocarcinoma, which includes the lung adenocarcinoma-specific cue token fine-tuning module and the low-rank adapter module, to generate a predicted mask image.
A first combined loss function is set, and a first loss value is calculated from the predicted mask image and the labeled mask image.
Using the first loss value as the optimization objective, backpropagation and parameter updates are performed on the model, and the process is iterated until the preset training termination condition is met, yielding the trained large-scale automatic segmentation model for lung adenocarcinoma.
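[0017] Furthermore, the first loss function is: $L_1 = \lambda_1 L_{Dice} + \lambda_2 L_{Focal} + \lambda_3 L_{Boundary}$, where $L_1$ is the first loss function, $L_{Dice}$ is the Dice loss function, $L_{Focal}$ is the Focal loss function, $L_{Boundary}$ is the boundary loss function, $\lambda_1$ is the first hyperparameter, $\lambda_2$ is the second hyperparameter, and $\lambda_3$ is the third hyperparameter, satisfying [constraint missing].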
[0018] Furthermore, the large-scale model for predicting ALK rearrangement status in lung adenocarcinoma is obtained through the following fine-tuning method: a lung adenocarcinoma ALK-specific cue token fine-tuning module is inserted at the end of the encoder of the general vision model, a low-rank adapter module is inserted after the multi-head attention module of each Transformer layer of the encoder, and the backbone network parameters of the general vision model are frozen.
[0019] Furthermore, when fine-tuning the large model for predicting the ALK rearrangement state of lung adenocarcinoma, only the weight parameters of the ALK-specific cue token fine-tuning module and the low-rank adapter module for lung adenocarcinoma are updated. Furthermore, during the fine-tuning of the large model for predicting ALK rearrangement status in lung adenocarcinoma, images of the segmented and cropped lesion regions and their corresponding clinical structured information are obtained. The lesion region images are preprocessed to construct a multimodal dataset for training.
[0020] Furthermore, the training process of the large-scale model for predicting the ALK rearrangement status of lung adenocarcinoma includes the following steps:
The labeled lesion region image is input into the encoder of the model, to which the ALK-specific cue token fine-tuning module and the low-rank adapter module have been added, to obtain image features.
The image features and the vectorized, normalized clinical information features are fused across modalities through a dynamic gated cross-attention fusion mechanism to obtain dynamically weighted fusion features.
The dynamically weighted fusion features are input into the classification head, which outputs the predicted ALK rearrangement status of lung adenocarcinoma. The classification head consists of two fully connected layers with a nonlinear activation function between them.
A second combined loss function is constructed, and a second loss value is calculated from the predicted and the labeled ALK rearrangement status of lung adenocarcinoma.
Using the second loss value as the optimization objective, backpropagation and parameter updates are performed on the model, and the process is iterated until the preset convergence condition is met or the maximum number of training epochs is reached, yielding the trained large-scale model for predicting the ALK rearrangement status of lung adenocarcinoma.
[0021] Furthermore, the general-purpose visual foundation model is the encoder part of SegVol or M3AE. Both are three-dimensional visual foundation models pre-trained on large-scale medical image data and have good representation capabilities and transfer potential.
[0022] Furthermore, the dynamic gated cross-attention fusion mechanism is implemented as follows: the structured clinical features are used as query vectors, and the image features are mapped to key vectors and value vectors. Cross-attention weights are calculated from the query vectors and key vectors, and the value vectors are then weighted and summed to obtain preliminary fusion features. A gating network concatenates the image features with the clinical features and generates a gating vector through a gating function. The gating vector is used to perform weighted fusion of the preliminary fusion features and the image features, and the dynamically weighted fusion features are output.
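A minimal NumPy sketch of the fusion steps in [0022] follows. Several details are assumptions for illustration only: a single attention head, mean pooling of the image tokens before gating, a sigmoid gating function, a convex combination `gate * prelim + (1 - gate) * img_vec` as the weighted fusion, and the hypothetical names `init_params` and `gated_cross_attention`.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(dim, clin_dim, rng):
    """Random projection weights (stand-ins for learned parameters)."""
    s = 0.1
    return {
        "Wq": rng.normal(0, s, (clin_dim, dim)),        # clinical -> query
        "Wk": rng.normal(0, s, (dim, dim)),             # image -> key
        "Wv": rng.normal(0, s, (dim, dim)),             # image -> value
        "Wg": rng.normal(0, s, (dim + clin_dim, dim)),  # gating network
        "bg": np.zeros(dim),
    }

def gated_cross_attention(img_tokens, clin, p):
    """img_tokens: (N, dim) image features; clin: (clin_dim,) clinical vector."""
    dim = img_tokens.shape[1]
    q = clin @ p["Wq"]                              # query from clinical features
    k = img_tokens @ p["Wk"]                        # keys from image features
    v = img_tokens @ p["Wv"]                        # values from image features
    attn = softmax(k @ q / np.sqrt(dim))            # cross-attention weights, (N,)
    prelim = attn @ v                               # preliminary fusion feature
    img_vec = img_tokens.mean(axis=0)               # pooled image feature (assumed)
    gate = sigmoid(np.concatenate([img_vec, clin]) @ p["Wg"] + p["bg"])
    fused = gate * prelim + (1.0 - gate) * img_vec  # gated weighted fusion
    return fused, attn, gate

rng = np.random.default_rng(0)
params = init_params(dim=16, clin_dim=7, rng=rng)
fused, attn, gate = gated_cross_attention(rng.normal(size=(10, 16)),
                                          rng.normal(size=7), params)
```

The gate lets each feature channel decide how much of the clinically guided summary to mix into the purely image-based representation.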
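[0023] Furthermore, the second loss function is: [formula missing], where $L_2$ is the second loss function, [term definition missing], $L_{con}$ is the inter-class contrastive regularization term, $\lambda_4$ is the fourth hyperparameter, and $\lambda_5$ is the fifth hyperparameter, satisfying [constraint missing].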
[0024] In a second aspect, embodiments of this application provide a non-invasive prediction system for ALK rearrangement status in lung adenocarcinoma based on a large medical model, comprising:
An acquisition module, used to acquire chest medical images and corresponding structured clinical information, wherein the chest medical images include lesion areas;
A segmentation module, used to input the chest medical images into the automatic segmentation model for lung adenocarcinoma to generate a lesion segmentation mask image, wherein the automatic segmentation model for lung adenocarcinoma is obtained using a fine-tuning strategy based on lung adenocarcinoma-specific cue tokens and a low-rank adapter module;
A rearrangement state prediction module, used to crop the lesion region image from the chest medical image based on the lesion segmentation mask image, and to input the lesion region image and clinical information into the large-scale ALK rearrangement state prediction model for lung adenocarcinoma to obtain the ALK rearrangement prediction result for the lesion region image, wherein the large-scale ALK rearrangement state prediction model for lung adenocarcinoma is obtained using a fine-tuning strategy based on lung adenocarcinoma ALK-specific cue tokens and a low-rank adapter module.
[0025] Furthermore, a visual heatmap is generated simultaneously to assist in the clinical interpretation of the model's decision-making basis.
[0026] The non-invasive prediction method and system for ALK rearrangement status in lung adenocarcinoma based on a large medical model provided by this invention have the following beneficial effects: High-precision lesion segmentation and strong generalization ability: By employing a parameter-efficient fine-tuning strategy, high-precision segmentation of lung adenocarcinoma lesions is achieved by fine-tuning a pre-trained large-scale medical vision model with only a small number of labeled samples. This method effectively avoids the catastrophic forgetting problem of full-parameter fine-tuning, significantly reduces the dependence on large-scale pixel-level labeled data, and requires training only a small number of newly added parameters. On datasets from multiple independent external testing centers, the segmentation performance achieves a Dice similarity coefficient ≥ 0.92 and a Hausdorff distance ≤ 8.5 mm, significantly outperforming traditional U-Net-like models.
[0027] Non-invasive prediction of ALK rearrangement status in lung adenocarcinoma: Predicting ALK rearrangement status in lung adenocarcinoma from non-invasive chest imaging and basic clinical information avoids the risks and discomfort associated with traditional invasive biopsies. Furthermore, because the prediction builds on high-precision lesion segmentation results, the accuracy and reliability of ALK rearrangement status prediction in lung adenocarcinoma are further improved.
Description of the Drawings
[0028] Figure 1 is a flowchart of the method for predicting ALK rearrangement status in lung adenocarcinoma provided in this application.
[0029] Figure 2 is a flowchart of the construction and fine-tuning method of the large-scale automatic segmentation model for lung adenocarcinoma provided in this application.
[0030] Figure 3 is a flowchart of the construction and fine-tuning method of the large-scale model for predicting ALK rearrangement status in lung adenocarcinoma provided in this application.
[0031] Figure 4 is a schematic diagram of the structure of the non-invasive prediction system for ALK rearrangement status in lung adenocarcinoma based on the large medical model provided in this application.
Detailed Implementation
[0032] The technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of this application. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.
[0033] This invention provides a non-invasive prediction method and system for ALK rearrangement status in lung adenocarcinoma based on a large medical model. It can achieve efficient, accurate, and interpretable non-invasive auxiliary diagnosis of ALK gene rearrangement status in lung adenocarcinoma patients using conventional chest CT images and clinical information.
[0034] Example 1: A Non-invasive Prediction Method for ALK Rearrangement Status in Lung Adenocarcinoma Based on a Large-Scale Medical Model
As shown in Figure 1, this embodiment provides a non-invasive prediction method for ALK rearrangement status in lung adenocarcinoma based on a large medical model, including the following steps:
S1. Obtain chest medical images and corresponding clinical information.
[0035] The chest medical images include the lesion area.
[0036] Specifically, the chest medical images may be any medical images that include the lesion area, such as images of the target organ region. They may be CT images acquired by a CT scanner; however, this application does not specifically limit the chest medical images used in its embodiments.
[0037] The dataset is preprocessed by resampling, normalization, cropping, and data augmentation, and is then used as the input data of the training set. For details of the preprocessing, please refer to Example 2.
[0038] S2. Input the chest medical images into the large-scale automatic segmentation model for lung adenocarcinoma to obtain lesion segmentation mask images.
[0039] The large-scale automatic segmentation model for lung adenocarcinoma is obtained based on a fine-tuning strategy using lung adenocarcinoma-specific cue tokens and a low-rank adapter module. The lesion segmentation mask image represents the segmentation result of the large-scale automatic segmentation model for lung adenocarcinoma in the lesion region.
[0040] Specifically, the lung adenocarcinoma-specific cue token fine-tuning module and the low-rank adapter module are fine-tuned based on a general medical segmentation model. The lung adenocarcinoma-specific cue token fine-tuning module is inserted at the end of the encoder of the general medical segmentation model, and these cue tokens are used to guide the model to focus on image features specific to lung adenocarcinoma. The low-rank adapter module is inserted at the skip connections of the decoder of the general medical segmentation model, and only the low-rank matrix increment is trained, which can effectively reduce the number of parameters that need to be updated.
[0041] In this embodiment, chest medical images are input into a large-scale automatic segmentation model for lung adenocarcinoma to obtain lesion segmentation mask images, where the lesion segmentation mask images can be understood as chest medical images labeled with lesion segmentation results.
[0042] Figure 2 is a schematic diagram of the construction and fine-tuning method for the automatic segmentation model of lung adenocarcinoma provided in this embodiment, which includes the following steps:
S210. Acquire chest medical images and preprocess them to obtain a training set.
[0043] The chest medical images are CT images of lung adenocarcinoma patients collected from multiple centers. Each image was independently annotated by two senior radiologists, and a gold-standard segmentation mask was established by consensus. The training set includes manually annotated data and image data obtained through data augmentation: the manually annotated data is preprocessed into standardized data, and data augmentation is then applied to the standardized data. Augmentation gives the training images varied contrast and lesion characteristics and expands the total number of training samples to 4–6 times the original size, so that the trained large-scale automatic segmentation model for lung adenocarcinoma can adapt to diverse chest medical images, with improved generalization and adaptability.
[0044] S220. Insert a lung adenocarcinoma-specific cue token fine-tuning module at the end of the encoder of the general medical segmentation model, and introduce a low-rank adapter module at the skip connections of the decoder.
[0045] In this embodiment, the SAM-Med3D general medical segmentation model is selected for subsequent fine-tuning. SAM-Med3D is a large-scale pre-trained model based on the Vision Transformer (ViT) architecture, with a parameter scale of 1.2 billion. It is pre-trained on a dataset containing more than 1 million 3D medical images (including CT, MRI, and other modalities), from which it learns rich prior knowledge of anatomical structures, the density and contrast features of multiple tissue windows, and the morphological and blood-supply patterns of typical lesions. It is therefore highly robust across tasks and devices.
[0046] A parameter-efficient fine-tuning strategy is adopted, which fine-tunes only key substructures while retaining the core parameters of the general medical segmentation model. Eight learnable lung adenocarcinoma-specific cue tokens are inserted at the end of the encoder, and low-rank adapter modules are introduced at the skip connections of the decoder.
[0047] When the cue tokens are inserted at the end of the encoder, feature fusion is performed in the Transformer block before the final output layer of the encoder. The output feature dimension of the last Transformer block of the encoder is 768, so the dimension of the learnable cue tokens is also set to 768. There are 8 cue tokens, and their initial values follow a standard normal distribution (mean 0, standard deviation 1). The cue tokens are concatenated with the original feature tensor (shape [H×W×D, 768]) along the sequence-length dimension, not the channel dimension, which adds 8 tokens and yields a new feature sequence (length H×W×D+8, dimension 768). This sequence is then input into the next multi-head attention layer, so that when the self-attention mechanism computes key-value pairs it can fuse the global image features extracted by the base model with the cue information specific to lung adenocarcinoma, enhancing sensitivity to typical lung adenocarcinoma signs such as ground-glass nodules and solid components.
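The shape bookkeeping in [0047] can be checked with a short NumPy sketch. The grid size (H, W, D) = (4, 4, 2) and the function name `append_cue_tokens` are hypothetical; the token count (8), feature dimension (768), and standard normal initialization follow the text.

```python
import numpy as np

def append_cue_tokens(features, cue_tokens):
    """Concatenate cue tokens along the sequence-length axis (axis 0)."""
    if features.shape[1] != cue_tokens.shape[1]:
        raise ValueError("cue tokens must match the feature dimension")
    return np.concatenate([features, cue_tokens], axis=0)

rng = np.random.default_rng(0)
H, W, D, dim, n_cue = 4, 4, 2, 768, 8

features = rng.normal(size=(H * W * D, dim))          # encoder output, [H*W*D, 768]
cue_tokens = rng.standard_normal(size=(n_cue, dim))   # learnable, N(0, 1) init

sequence = append_cue_tokens(features, cue_tokens)    # [H*W*D + 8, 768]
```

Because the concatenation is along the sequence axis, the feature dimension stays at 768 and the following attention layer needs no change to its projection weights.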
[0048] A low-rank adapter module is introduced at the skip connections in the decoder. The original skip connections in the general medical segmentation model concatenate the encoder's feature map with the decoder's upsampled feature map. A low-rank adapter module is added at each skip connection. If the number of input feature map channels at a skip connection is C (e.g., 256), the low-rank adapter module consists of a fully connected layer whose weight matrix is decomposed into two low-dimensional matrices: [matrix definitions missing]. The adapter's output is added to the input feature map (residual connection) and then concatenated with the upsampled features.
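As a rough illustration of [0048], the sketch below applies a low-rank adapter over the channel axis of a skip-connection feature map, adds it back residually, and concatenates the result with the upsampled decoder features. The rank of 8, the zero initialization of the up-projection (a common convention so the adapter starts as an identity mapping), and the class name `LowRankAdapter` are assumptions; the text gives only the channel count example C = 256.

```python
import numpy as np

class LowRankAdapter:
    """Channel-wise adapter: x + up @ (down @ x), with rank << channels."""

    def __init__(self, channels, rank, rng):
        self.down = rng.normal(0.0, 0.02, size=(rank, channels))  # C -> r
        self.up = np.zeros((channels, rank))                      # r -> C, zero init

    def __call__(self, feat):
        # feat: (C, D, H, W); the two small matrices act on the channel axis only.
        delta = np.einsum("cr,rk,kdhw->cdhw", self.up, self.down, feat)
        return feat + delta                                       # residual connection

    def num_params(self):
        return self.down.size + self.up.size

rng = np.random.default_rng(0)
skip = rng.normal(size=(256, 2, 4, 4))        # encoder feature map on the skip path
upsampled = rng.normal(size=(256, 2, 4, 4))   # decoder upsampled feature map

adapter = LowRankAdapter(channels=256, rank=8, rng=rng)
merged = np.concatenate([adapter(skip), upsampled], axis=0)  # channel concatenation
```

With these illustrative sizes the adapter holds 2 × 256 × 8 = 4096 trainable values, compared with 256 × 256 = 65536 for a full channel-mixing matrix.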
[0049] Meanwhile, during parameter updates, all Transformer blocks of the SAM-Med3D encoder are frozen, and only the parameters of the inserted cue tokens, the low-rank adapter modules, and the segmentation head are updated.
[0050] S230. Input the labeled chest medical images into the configured large-scale automatic segmentation model for lung adenocarcinoma to generate a predicted mask image.
[0051] The preprocessed chest medical images are input into the configured large-scale automatic segmentation model for lung adenocarcinoma. The model automatically outputs a 3D binary segmentation mask M for the lung adenocarcinoma lesion. The mask is a three-dimensional matrix of size [size missing] in which each element takes the value 0 or 1: elements with a value of 1 correspond to voxels of the lung adenocarcinoma lesion, and elements with a value of 0 correspond to voxels of the non-lesion background.
[0052] S240. Set the first combined loss function and calculate the first loss value based on the predicted mask image and the labeled mask image.
[0053] During training, a first combined loss function is used, which includes Dice loss, Focal loss, and Boundary loss: $L_1 = \lambda_1 L_{Dice} + \lambda_2 L_{Focal} + \lambda_3 L_{Boundary}$, where $L_1$ is the first loss function; $L_{Dice}$ is the Dice loss function, used to optimize the spatial overlap of segmented regions; $L_{Focal}$ is the Focal loss function, used to address the class imbalance between the foreground (tumor) and the background; $L_{Boundary}$ is the Boundary loss function, used to enhance the accuracy of tumor margins; and $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the first, second, and third hyperparameters, satisfying [constraint missing]. In this embodiment, the hyperparameters are set to [values missing].
[0054] By combining Dice loss, Focal loss, and Boundary loss, three key indicators in lung adenocarcinoma segmentation tasks—volume overlap, difficult sample identification, and boundary contour accuracy—can be optimized simultaneously. This solves the technical challenge in existing technologies where a single loss function cannot simultaneously consider both the overall coverage of the target region and the fineness of the boundary, effectively improving the edge segmentation accuracy of small lesions in lung adenocarcinoma.
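To make the combination of the three loss terms concrete, the following NumPy sketch evaluates a weighted sum of a soft Dice loss, a focal loss, and a distance-map-based boundary loss on a toy volume. Everything numeric here is an assumption: the weights `(0.5, 0.3, 0.2)` are placeholders (the embodiment's values are not given above), the focal parameters `gamma=2.0` and `alpha=0.25` are common defaults, and the boundary term uses one widely used formulation (predicted probability multiplied by a signed distance map of the label) that may differ from the one intended in this application.

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    inter = (prob * target).sum()
    return 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)

def focal_loss(prob, target, gamma=2.0, alpha=0.25, eps=1e-7):
    p = np.clip(prob, eps, 1.0 - eps)
    p_t = np.where(target == 1, p, 1.0 - p)          # probability of the true class
    a_t = np.where(target == 1, alpha, 1.0 - alpha)
    return float((-a_t * (1.0 - p_t) ** gamma * np.log(p_t)).mean())

def signed_distance(mask):
    """Brute-force signed distance to the lesion boundary (tiny volumes only).

    Negative inside the lesion, positive outside. Needs both classes present.
    """
    idx = np.argwhere(np.ones_like(mask, dtype=bool))
    fg, bg = np.argwhere(mask == 1), np.argwhere(mask == 0)

    def nearest(a, b):
        return np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)).min(axis=1)

    to_fg = nearest(idx, fg).reshape(mask.shape)
    to_bg = nearest(idx, bg).reshape(mask.shape)
    return np.where(mask == 1, -to_bg, to_fg)

def boundary_loss(prob, dist_map):
    return float((prob * dist_map).mean())

def combined_loss(prob, target, weights=(0.5, 0.3, 0.2)):
    w_dice, w_focal, w_bound = weights               # placeholder weights
    return (w_dice * dice_loss(prob, target)
            + w_focal * focal_loss(prob, target)
            + w_bound * boundary_loss(prob, signed_distance(target)))

target = np.zeros((6, 6, 6))
target[2:4, 2:4, 2:4] = 1.0                          # toy lesion
good = np.where(target == 1, 0.99, 0.01)             # near-perfect prediction
bad = 1.0 - good                                     # inverted prediction
loss_good, loss_bad = combined_loss(good, target), combined_loss(bad, target)
```

A training framework would compute the same quantities on tensors with automatic differentiation and a precomputed distance map; the brute-force distance here is only practical for toy inputs.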
[0055] S250. Using the first loss value as the optimization objective, perform backpropagation and parameter updates on the model, repeating the iteration until the preset training termination condition is met, and obtain the trained large-scale automatic segmentation model for lung adenocarcinoma.
[0056] During training, the predicted mask output by the model is supervised using the first combined loss function to optimize the boundary accuracy and region consistency of the predicted mask. Training is terminated when the Dice coefficient does not show a significant improvement (less than 0.001) within 10 consecutive epochs on the validation set, or when the cumulative number of training epochs reaches 200. This strategy effectively balances model convergence sufficiency and training efficiency, avoiding overfitting. During iterative training, all parameters of the backbone network of the general medical segmentation model are fixed, while the parameters of the lung adenocarcinoma-specific cue token module, low-rank adapter module, and segmentation head are updated to reduce computational overhead and improve fine-tuning efficiency.
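The termination rule in [0056] (stop when the validation Dice has improved by less than 0.001 over 10 consecutive epochs, or after 200 epochs) could be expressed along the following lines. The function name `should_stop` and the exact comparison, best value in the last 10 epochs against the best value before them, are one possible reading of the text.

```python
def should_stop(val_dice_history, patience=10, min_delta=0.001, max_epochs=200):
    """Decide whether to stop, given one validation Dice value per finished epoch."""
    n = len(val_dice_history)
    if n >= max_epochs:
        return True                      # epoch budget exhausted
    if n <= patience:
        return False                     # not enough history to judge a plateau
    best_before = max(val_dice_history[:-patience])
    best_recent = max(val_dice_history[-patience:])
    return best_recent - best_before < min_delta

# Example histories
improving = [0.50 + 0.01 * i for i in range(30)]
plateau = [0.80] * 5 + [0.90] * 11
print(should_stop(improving), should_stop(plateau))
```

The check is meant to be called once per epoch after validation, with the history list growing by one entry each time.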
[0057] Table 1 shows the results of the large-scale automatic segmentation model for lung adenocarcinoma obtained using the construction method provided in the embodiments of this specification, with Dice similarity coefficient and Hausdorff distance (in millimeters) as evaluation metrics. The Dice similarity coefficient is used to measure the spatial overlap between the predicted mask and the labeled mask, and the Hausdorff distance is used to evaluate the boundary error.
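For reference, the two evaluation metrics named in [0057] can be computed on binary masks as sketched below. This is a simplified brute-force version: the Hausdorff distance is taken over all foreground voxels rather than surface voxels only, which is adequate for small examples but is not how evaluation toolkits usually implement it, and the `spacing` argument (millimeters per voxel) is assumed to be 1.0 mm in line with the resampling described earlier.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0                                    # both masks empty
    return 2.0 * np.logical_and(pred, truth).sum() / denom

def hausdorff_distance(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance in millimeters (both masks must be non-empty)."""
    a = np.argwhere(pred) * np.asarray(spacing)
    b = np.argwhere(truth) * np.asarray(spacing)
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))  # pairwise distances
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

truth = np.zeros((8, 8, 8), dtype=int)
truth[2:6, 2:6, 2:6] = 1
pred = np.zeros_like(truth)
pred[3:7, 2:6, 2:6] = 1                               # same cube shifted by one voxel

print(dice_coefficient(pred, truth))                  # 0.75
print(hausdorff_distance(pred, truth))                # 1.0
```

In the example the two 4×4×4 cubes share 48 of their 64 voxels, giving a Dice of 2·48/128 = 0.75, and the one-voxel shift gives a Hausdorff distance of 1 mm.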
[0058] Table 1
S3. Based on the lesion segmentation mask image, the lesion region image is cropped from the chest medical image. The lesion region image and clinical information are input into the large model for predicting the ALK rearrangement status of lung adenocarcinoma to obtain the ALK rearrangement prediction results for the lesion region image.
[0059] In Example 1, a large-scale model for predicting ALK rearrangement status in lung adenocarcinoma is used to predict the ALK rearrangement outcome. This large-scale model is a pre-trained model designed to predict the ALK rearrangement status of lung adenocarcinoma. It should be understood that this large-scale model is used to predict ALK rearrangement positivity.
[0060] S31. Based on the lesion segmentation mask image, the lesion region image is cropped from the chest medical image and the lesion region image is preprocessed.
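Step S31 can be pictured as a bounding-box crop around the predicted mask. The sketch below is an assumption about how such a crop might be done: the function name `crop_lesion` and the 2-voxel safety margin are illustrative and are not specified in the text.

```python
import numpy as np

def crop_lesion(image, mask, margin=2):
    """Crop image and mask to the lesion bounding box plus a margin (in voxels)."""
    coords = np.argwhere(mask > 0)
    if coords.size == 0:
        raise ValueError("mask contains no lesion voxels")
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + margin, mask.shape)
    window = tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))
    return image[window], mask[window]

image = np.arange(1000, dtype=np.float64).reshape(10, 10, 10)
mask = np.zeros((10, 10, 10), dtype=int)
mask[4:6, 3:7, 0:2] = 1                      # toy lesion touching the volume edge

roi, roi_mask = crop_lesion(image, mask)
print(roi.shape)                             # (6, 8, 4)
```

The margin is clipped at the volume border, which is why the last axis in the example grows on one side only.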
[0061] S32. Input the preprocessed lesion area image and clinical information into the large-scale model for predicting the ALK rearrangement status of lung adenocarcinoma, and output the prediction results of the ALK rearrangement status of lung adenocarcinoma.
[0062] This large-scale model for predicting ALK rearrangement status in lung adenocarcinoma can automatically identify and output rearrangement prediction results for lesion segmentation images.
[0063] Figure 3 is a schematic diagram of the construction and fine-tuning method for the large-scale model for predicting ALK rearrangement status in lung adenocarcinoma provided in this embodiment, which includes the following steps:
S310. Obtain lesion region images and clinical information, and preprocess them to obtain the training dataset.
[0064] In this embodiment, the lesion region image is cropped from the original chest medical image (using a mask output by the lesion segmentation model). Preprocessing of the lesion region image is necessary to improve data quality, reduce noise and inconsistencies, thereby improving the model's accuracy and generalization ability. Clinical information includes the continuous variable age, and the categorical variables gender, smoking history, and T stage, which are aggregated into a 7-dimensional feature vector after feature processing.
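One way the four clinical variables could add up to a 7-dimensional vector is sketched below. The layout is a guess made for illustration, not something stated in the text: one standardized value for age, one binary flag each for gender and smoking history, and a four-way one-hot code for T stage (T1 to T4). The age mean and standard deviation are hypothetical placeholders that would normally be estimated from the training set.

```python
T_STAGES = ("T1", "T2", "T3", "T4")

def encode_clinical(age, gender, smoker, t_stage, age_mean=60.0, age_std=10.0):
    """Return a 7-dimensional clinical feature vector as a list of floats."""
    if t_stage not in T_STAGES:
        raise ValueError(f"unknown T stage: {t_stage!r}")
    vec = [
        (age - age_mean) / age_std,          # continuous: standardized age
        1.0 if gender == "female" else 0.0,  # categorical: gender flag
        1.0 if smoker else 0.0,              # categorical: smoking history flag
    ]
    vec += [1.0 if t_stage == s else 0.0 for s in T_STAGES]  # one-hot T stage
    return vec

print(encode_clinical(70, "female", False, "T2"))
# [1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```

Other layouts with the same total dimension are equally possible; the point is only that the continuous and categorical variables end up in one fixed-length numeric vector.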
[0065] In this embodiment, sample training data is obtained based on lesion area images and clinical information, wherein the sample training data includes a sample training dataset, a sample validation dataset, and a sample test dataset.
[0066] Specifically, lesion area images are combined with clinical information to obtain sample training data. The sample training dataset, sample validation dataset, and sample test dataset can be divided according to a certain ratio, for example, randomly divided into sample training dataset, sample test dataset, and sample validation dataset in a ratio of 8:1:1.
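A random 8:1:1 split as described in [0066] might look like the following sketch. The function name `split_811` and the fixed seed are illustrative assumptions, and the split is done over case identifiers so that a case never appears in more than one subset.

```python
import random

def split_811(case_ids, seed=42):
    """Shuffle case IDs and split them into train/validation/test at 8:1:1."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)         # reproducible shuffle
    n_train = len(ids) * 8 // 10
    n_val = len(ids) // 10
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]             # remainder goes to the test set
    return train, val, test

train, val, test = split_811(range(100))
print(len(train), len(val), len(test))       # 80 10 10
```

Integer arithmetic is used for the subset sizes so that rounding never drops or duplicates a case.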
[0067] Therefore, by combining lesion area images with clinical information to construct sample training data, this application embodiment can make full use of imaging and clinical information, improve the training effect and generalization ability of the model, and thus enhance the accuracy and reliability of diagnosis.
[0068] S320. Insert a lung adenocarcinoma-based ALK-specific cue token fine-tuning module at the end of the encoder of the general vision model, and insert a low-rank adapter module after the multi-head attention module of each Transformer layer of the encoder. Freeze the backbone network parameters of the general vision model and only update the weight parameters of the lung adenocarcinoma-based ALK-specific cue token fine-tuning module and the low-rank adapter module.
[0069] In this embodiment, the encoder part of the SegVol model is selected as the general visual feature extraction backbone for subsequent fine-tuning. SegVol is a 3D medical image segmentation foundation model based on the Vision Transformer (ViT) architecture. It has completed self-supervised pre-training on a large-scale dataset containing more than one million 3D medical images (covering multiple modalities such as CT and MRI), and has learned rich spatial topological relationships of anatomical structures, density and texture patterns of multimodal images, and morphological prior knowledge of lesions.
[0070] Although SegVol is natively used for interactive segmentation tasks, its encoder can effectively extract high-dimensional, semantically rich 3D image features. Therefore, this invention reuses its pre-trained encoder as the backbone for visual feature extraction and introduces ALK-specific cue tokens and a low-rank adapter module for lung adenocarcinoma to adapt to binary classification prediction tasks with ALK gene rearrangement states.
[0071] This embodiment employs an efficient parameter fine-tuning strategy, refining only key substructures while retaining the core parameters of a general medical segmentation model. Eight learnable ALK task-specific cue tokens are embedded at the input of the SegVol model. These cue tokens are added to the prefix (i.e., the beginning) of the input sequence, forming a new token sequence. In this new sequence, the cue tokens occupy the first eight positions, followed by image patch tokens. Simultaneously, positional encoding is expanded, with the first eight positions (cue token positions) using newly initialized learnable positional encodings, while the image patch tokens use pre-trained positional encodings (the original positional encodings are frozen but need to be re-indexed due to sequence position shifts).
[0072] In each Transformer layer of SegVol, a low-rank adapter module (LoRA) is inserted after the output of the multi-head self-attention module. This adapter consists of two low-rank matrices (rank 8) connected in a residual manner. Specifically, the output of the multi-head self-attention module (denoted as F) is transformed as follows: F_adjusted = F + LoRA(F); Where F_adjusted is the feature output after adapter adjustment, and LoRA(F) is the output of the low-rank adapter module.
[0073] The low-rank adapter module initializes the dimensionality reduction matrix from a random Gaussian distribution and initializes the dimensionality increase matrix to zero, ensuring that the original model output is not altered at the start of training. The adapter output is then added to its input through a residual connection.
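The residual adapter and its initialization can be illustrated with a small pure-Python sketch. It assumes the adapter has no bias terms and uses a Gaussian standard deviation of 0.02; neither detail is stated in the source, and the class name `LowRankAdapter` is hypothetical.

```python
import random

def matvec(matrix, vector):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

class LowRankAdapter:
    """Residual low-rank adapter: F_adjusted = F + up(down(F))."""

    def __init__(self, dim, rank=8, seed=0):
        rng = random.Random(seed)
        # Dimensionality reduction matrix (rank x dim): random Gaussian init (std is an assumption).
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rank)]
        # Dimensionality increase matrix (dim x rank): zeros, so the adapter starts as a no-op.
        self.up = [[0.0] * rank for _ in range(dim)]

    def __call__(self, features):
        delta = matvec(self.up, matvec(self.down, features))
        return [f + d for f, d in zip(features, delta)]

    def num_parameters(self):
        return sum(len(row) for row in self.down) + sum(len(row) for row in self.up)

adapter = LowRankAdapter(dim=768, rank=8)
token_rng = random.Random(1)
token = [token_rng.gauss(0.0, 1.0) for _ in range(768)]
unchanged = adapter(token) == token
print(unchanged)                 # True: zero-initialized up-projection leaves F unchanged
print(adapter.num_parameters())  # 12288, i.e. 2 * 768 * 8 per adapter
```

Under these assumptions, 12 layers of rank-8 adapters at width 768 add 147,456 trainable parameters, which is small compared with a frozen ViT backbone.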
[0074] During fine-tuning, only the following parameters are updated: the embedding vectors of the 8 ALK task-specific cue tokens (each of dimension 768), the positional encoding vectors of the 8 cue tokens (each of dimension 768), the low-rank adapter module parameters in all Transformer layers (12 layers in total), and the parameters of the classification head (two fully connected layers). All other parameters are frozen.
[0075] S330. Input the labeled lesion region image into the encoder of the large model for predicting the ALK rearrangement state of lung adenocarcinoma, to which the lung adenocarcinoma ALK-specific cue token fine-tuning module and the low-rank adapter module have been added, to obtain image features.
[0076] Specifically, the lesion area image (64×64×64 voxels) is divided into non-overlapping 3D blocks (16×16×16), and an image token sequence with a dimension of 768 is obtained through linear projection. Eight ALK-specific cue tokens for lung adenocarcinoma (each token with a dimension of 768) are concatenated before the image sequence tokens.
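The arithmetic behind this tokenization can be checked with a short sketch. The helper name `token_sequence_shape` is hypothetical; the sizes are the ones given in the paragraph above.

```python
def token_sequence_shape(volume=(64, 64, 64), patch=(16, 16, 16),
                         num_cue_tokens=8, embed_dim=768):
    """Return ((sequence_length, embed_dim), voxels_per_patch) for non-overlapping 3D patches."""
    if any(v % p for v, p in zip(volume, patch)):
        raise ValueError("volume must be divisible by the patch size on every axis")
    patches_per_axis = [v // p for v, p in zip(volume, patch)]
    num_patches = patches_per_axis[0] * patches_per_axis[1] * patches_per_axis[2]
    voxels_per_patch = patch[0] * patch[1] * patch[2]
    # Cue tokens are placed before the image patch tokens.
    return (num_cue_tokens + num_patches, embed_dim), voxels_per_patch

shape, voxels = token_sequence_shape()
print(shape, voxels)  # (72, 768) 4096
```

Each 16×16×16 block holds 4,096 voxels that the linear projection maps to 768 dimensions, and the encoder sees 8 cue tokens followed by 64 patch tokens.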
[0077] The sequence is input into an encoder containing 12 Transformer layers. Each Transformer layer includes: LayerNorm, Multi-head Attention Module (MSA), low-rank adapter applied after the MSA output, residual connection, layer normalization, and multilayer perceptron (MLP). After passing through all Transformer layers, the token output vector at position 0 in the output sequence of the last Transformer layer is extracted as an image feature. This feature comprehensively encodes the radiological patterns most relevant to the ALK rearrangement (such as cavitation sign, air bronchogram, etc.) and serves as the feature representation of the entire lesion area image, denoted as image feature (768×1 dimension).
[0078] S340. The image features and the patient's clinical information features are fused in a dynamic gated cross-attention fusion mechanism to obtain fused features.
[0079] Image features and clinical information are projected into a shared semantic space. Specifically, the 7-dimensional clinical information vector is mapped by a fully connected layer to 256-dimensional clinical features $F_{clin}$. The weight matrix $W_Q$ reduces these clinical features to 64 dimensions, and the result is used as the query that actively guides the selection of image features. The weight matrix $W_K$ reduces the 768-dimensional image features $F_{img}$ to a 64-dimensional vector, which is used as the key. The weight matrix $W_V$ maps the 768-dimensional image features to a 256-dimensional vector, which is used as the value: $Q = F_{clin} W_Q$; $K = F_{img} W_K$; $V = F_{img} W_V$; where $Q$ is the query feature, $K$ is the key feature, $V$ is the value feature, $F_{img}$ denotes the original image features, $F_{clin}$ denotes the 256-dimensional clinical features produced by the fully connected layer, $W_K$ is a weight matrix of dimension 768×64, $W_V$ is a weight matrix of dimension 768×256, and $W_Q$ is a weight matrix of dimension 256×64.
[0080] Cross-attention calculation: the dot product of the query and the key is computed and scaled, and the resulting attention weights are applied to the value: $A = \mathrm{softmax}\left( Q K^{\top} / \sqrt{d_k} \right)$; $F_{init} = A V$; where $A$ is the cross-attention weight matrix, softmax(·) is the softmax activation function, $K^{\top}$ is the transpose of the key $K$, and $\sqrt{d_k}$ is the scaling factor, with $d_k$ the dimension of the query and key (64 here). The preliminary fusion features $F_{init}$ are obtained by weighting and summing the value features, which are derived from the image features, with the attention weights.
[0081] The gating mechanism for fusing image features and clinical features works as follows. A gating vector $g$ is generated by applying the sigmoid function $\sigma$ to an affine transformation, with learnable weight matrix $W_g$ and bias vector $b_g$, of the concatenated image features and clinical features. The gating vector then weights the preliminary fusion features and the image features through element-wise multiplication $\odot$, yielding the dynamically weighted fusion features $F_{dyn}$.
[0082] The projected clinical features are fused back as a residual to supplement the clinical feature information in the dynamic gated cross-attention mechanism: the dynamically weighted fusion features $F_{dyn}$ and the projected clinical features $F'_{clin}$ are combined using learnable weight coefficients, giving the final dynamically weighted fusion feature $F_{final}$.
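The cross-attention step of the fusion (clinical features as the query, image features as key and value) can be sketched in pure Python as below. This is an illustration under assumptions: weights are random, vectors are treated as rows, and the function accepts a list of image tokens so that the softmax is visible. The source describes a single 768-dimensional image feature, which corresponds to a list of length one; the two tokens in the demo are hypothetical. The gating and residual steps are left out because their exact formulas are not given in the text.

```python
import math
import random

def linear(x, weight):
    """Row-vector product x @ weight; weight has len(x) rows and out_dim columns."""
    out_dim = len(weight[0])
    return [sum(x[i] * weight[i][j] for i in range(len(x))) for j in range(out_dim)]

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(clinical, image_tokens, w_q, w_k, w_v):
    """Clinical features form the query; image tokens supply keys and values."""
    q = linear(clinical, w_q)                         # 64-d query
    keys = [linear(t, w_k) for t in image_tokens]     # one 64-d key per image token
    values = [linear(t, w_v) for t in image_tokens]   # one 256-d value per image token
    scale = math.sqrt(len(q))                         # sqrt(d_k) with d_k = 64
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    attn = softmax(scores)
    fused = [sum(a * v[j] for a, v in zip(attn, values)) for j in range(len(values[0]))]
    return attn, fused

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
w_q = random_matrix(256, 64, rng)
w_k = random_matrix(768, 64, rng)
w_v = random_matrix(768, 256, rng)
clinical = [rng.random() for _ in range(256)]
image_tokens = [[rng.gauss(0.0, 1.0) for _ in range(768)] for _ in range(2)]  # 2 tokens: assumption
attn, fused = cross_attention(clinical, image_tokens, w_q, w_k, w_v)
print(len(attn), len(fused))  # 2 256
```

With a single image token the attention weight is exactly 1 and the preliminary fusion feature equals the value projection of that token.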
[0083] S350. Input the dynamically weighted fusion features into the classification head for classification, and output the prediction results of the ALK rearrangement status of lung adenocarcinoma.
[0084] The classification head consists of two fully connected layers. The first fully connected layer is a feature transformation layer that takes the 512-dimensional fusion vector $F_{final}$ as input: $h = \mathrm{GELU}(F_{final} W_1 + b_1)$; where $h$ is the intermediate feature after activation, GELU(·) is the GELU activation function, $W_1$ is a learnable weight matrix, and $b_1$ is the bias vector.
[0085] The Dropout layer randomly discards neuron outputs with a 20% probability to prevent overfitting.
[0086] The second fully connected layer is a binary classification output layer. It takes the 256-dimensional intermediate features as input and maps them to the output predicted probability $\hat{p}$ using a learnable weight matrix $W_2$ and a bias vector $b_2$.
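A forward pass through such a head might look like the sketch below. The 512 → 256 → 1 layout, GELU, and 20% dropout come from the text; the single output unit with a sigmoid, the inverted-dropout rescaling, the random weights, and the name `classification_head` are assumptions made for illustration.

```python
import math
import random

def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def dense(x, weight, bias):
    """Fully connected layer; weight holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def classification_head(fused, params, drop_prob=0.2, training=False, rng=None):
    hidden = [gelu(v) for v in dense(fused, params["w1"], params["b1"])]  # 512 -> 256
    if training:
        # Inverted dropout: zero each unit with probability drop_prob, rescale the rest.
        rng = rng or random.Random()
        hidden = [0.0 if rng.random() < drop_prob else h / (1.0 - drop_prob) for h in hidden]
    logit = dense(hidden, params["w2"], params["b2"])[0]                  # 256 -> 1
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid output is an assumption

rng = random.Random(0)
params = {
    "w1": [[rng.gauss(0.0, 0.02) for _ in range(512)] for _ in range(256)],
    "b1": [0.0] * 256,
    "w2": [[rng.gauss(0.0, 0.02) for _ in range(256)]],
    "b2": [0.0],
}
fused_vector = [rng.gauss(0.0, 1.0) for _ in range(512)]
probability = classification_head(fused_vector, params)
print(0.0 < probability < 1.0)  # True
```

Dropout is active only when `training=True`, so repeated inference calls on the same input return the same probability.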
[0087] S360. Set a second combined loss function and calculate a second loss value based on the predicted ALK rearrangement state of lung adenocarcinoma and the labeled ALK rearrangement state of lung adenocarcinoma.
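[0088] During training, a second combined loss function is used, which includes the Focal loss and an inter-class contrast regularization term: $L_2 = \lambda_4 L_{Focal} + \lambda_5 L_{reg}$; where $L_2$ is the second loss function, $L_{Focal}$ is the Focal loss, $L_{reg}$ is the inter-class contrast regularization term, $\lambda_4$ is the fourth hyperparameter, and $\lambda_5$ is the fifth hyperparameter.
[0089] For the $i$-th sample, the Focal loss takes its standard α-balanced binary form: $L_{Focal}^{(i)} = -\alpha\, y_i (1 - p_i)^{\gamma} \log p_i - (1 - \alpha)(1 - y_i)\, p_i^{\gamma} \log(1 - p_i)$; where $y_i$ is the $i$-th true label, $\alpha$ is the balance factor, set to 0.75, $\gamma$ is the focusing parameter, set to 2.0, and $p_i$ is the probability predicted by the model that the sample is positive. The inter-class contrast regularization term $L_{reg}$ is scaled by a regularization strength coefficient and is computed over the positive sample index set $P$ and the negative sample index set $N$, using the Euclidean distance $d(f_n^{+}, f_m^{-})$ between the feature vector $f_n^{+}$ of the $n$-th sample in the positive sample set and the feature vector $f_m^{-}$ of the $m$-th sample in the negative sample set.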
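As a rough illustration of the Focal loss component with the stated settings (balance factor 0.75, focusing parameter 2.0), the sketch below uses the standard α-balanced binary form. Averaging over the batch and clamping the probabilities are assumptions, and the inter-class contrast regularization term is not implemented because its formula is not recoverable from the text.

```python
import math

def focal_loss(probs, labels, alpha=0.75, gamma=2.0, eps=1e-7):
    """Mean alpha-balanced binary focal loss over a batch (batch mean is an assumption)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)              # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p               # probability assigned to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha   # positives weighted by alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)

confident = focal_loss([0.9, 0.1], [1, 0])
mistaken = focal_loss([0.1, 0.9], [1, 0])
print(confident < mistaken)  # True
```

The factor (1 - p_t)^γ shrinks the loss of well-classified samples, so training is driven by the hard and rare ALK-positive cases rather than by the easy negatives.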
[0090] The second combined loss function addresses sample imbalance through Focal loss and enhances feature separability through inter-class contrast regularization. Under acceptable computational cost, it significantly improves the model's ability to determine the ALK rearrangement status of lung adenocarcinoma.
[0091] S370. Using the second combined loss value as the optimization objective, perform backpropagation and parameter updates on the model, repeating the iteration until the preset termination condition is met, to obtain the trained large model for predicting the ALK rearrangement state of lung adenocarcinoma.
[0092] During training, a second combined loss function is used to supervise the ALK rearrangement state of lung adenocarcinoma output by the model, optimizing the model's prediction accuracy. Training terminates when the Dice coefficient no longer increases for 10 consecutive epochs on the validation set or when the total number of training epochs reaches 200. During iterative training, all parameters of the feature extraction backbone network are fixed; only the lung adenocarcinoma ALK-specific cue token fine-tuning module, low-rank adapter module, feature fusion module, and classification head module are updated to reduce computational overhead and improve fine-tuning efficiency.
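The termination rule (no improvement for 10 consecutive epochs, or 200 epochs in total) can be sketched as a small loop. `run_epoch` is a hypothetical callback that trains for one epoch and returns the validation metric, with higher values being better; the toy metric in the demo is invented for illustration.

```python
def train_with_early_stopping(run_epoch, max_epochs=200, patience=10):
    """Train until the validation metric stalls for `patience` epochs or max_epochs is reached."""
    best_metric, best_epoch, stale = float("-inf"), 0, 0
    for epoch in range(1, max_epochs + 1):
        metric = run_epoch(epoch)  # hypothetical: one training epoch + validation
        if metric > best_metric:
            best_metric, best_epoch, stale = metric, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, best_metric, epoch

# Toy metric that improves for 5 epochs and then plateaus.
best_epoch, best_metric, last_epoch = train_with_early_stopping(lambda e: min(e, 5) / 10)
print(best_epoch, best_metric, last_epoch)  # 5 0.5 15
```

In practice the weights from `best_epoch` would be kept as the trained model, which requires saving a checkpoint whenever the metric improves.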
[0093] This embodiment introduces a dynamic gating cross-attention mechanism during training, effectively integrating image features with clinical metadata (such as age, gender, smoking history, and T stage), achieving clinical information-driven image feature selection and enhancing sensitivity for low-incidence ALK-positive samples. In external multicenter testing, the prediction model achieved an AUC of 0.93, with sensitivity and specificity of 0.89 and 0.87, respectively. The end-to-end inference time of the entire system is less than 8 seconds per case (based on a single NVIDIA RTX 3090 GPU), and it can seamlessly interface with hospital PACS systems (via HL7 or DICOM SR interface), making it particularly suitable for deployment in primary healthcare institutions.
[0094] This system significantly reduces the pain and complication risks associated with traditional invasive biopsies, while shortening the testing cycle from several days to minutes, thus drastically reducing costs. It helps accelerate targeted therapy decisions, improves the efficiency of medical resource utilization, and has significant clinical application value and health economic benefits.
[0095] This application embodiment acquires chest medical images, inputs them into a large-scale automatic segmentation model for lung adenocarcinoma based on a specific cue token for lung adenocarcinoma and a low-rank adapter for fine-tuning, obtains a lesion segmentation mask image, then crops the lesion region image from the original chest medical images based on the lesion segmentation mask image, and finally inputs the lesion region image and its corresponding clinical information into a large-scale prediction model for ALK rearrangement status of lung adenocarcinoma, and outputs the prediction result of ALK rearrangement of lung adenocarcinoma.
[0096] This integrated intelligent diagnostic system, which runs end to end from original medical images to prediction results for the ALK rearrangement status of lung adenocarcinoma, can infer the ALK gene status of lung adenocarcinoma non-invasively, quickly, and at low cost without biopsy, thus improving the automation level of screening, prediction accuracy, inference efficiency, and clinical applicability.
[0097] Example 2: Standardized Processing Method for Chest Medical Imaging and Clinical Information. This embodiment details the preprocessing of the medical images and clinical information involved in Embodiment 1.
[0098] 1. Chest medical image preprocessing workflow. During the training and prediction phases of the large-scale automatic segmentation model for lung adenocarcinoma, the input chest medical images undergo the following preprocessing steps: (1) Isotropic resampling: the spatial resolution of the chest medical images is adjusted so that all images have the same voxel spacing. The median voxel spacing of the dataset, i.e., 1.0 × 1.0 × 1.0 mm, is used as the standard spacing for resampling. For example, the chest medical images and their metadata, including the voxel spacing, are read with a medical image processing library (e.g., SimpleITK or the PyTorch-based TorchIO), and the voxel spacing is then adjusted to the target spacing using trilinear interpolation.
[0099] (2) Standardization: to reduce the intensity differences in chest medical images caused by different devices, Z-score standardization is applied to the images, which reduces the impact of the numerical range on model training. Specifically, the mean and standard deviation of the voxel values are calculated, either over the entire dataset or for each image, and the mean is subtracted from each voxel value, which is then divided by the standard deviation.
[0100] (3) Cropping and padding: the chest medical image is cropped or padded to a fixed size to meet the model input size requirement (128×128×128 voxels). If the chest medical image is larger than the target size, a region of the target size is cropped from the center of the image. If the chest medical image is smaller than the target size, zeros are padded around the image until the target size is reached.
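The index arithmetic behind steps (1) and (3) can be sketched without any imaging library. The function names and the example scan geometry (512×512×120 voxels at 0.7×0.7×2.5 mm) are hypothetical; the actual interpolation would be done by a library such as SimpleITK, which this sketch does not call.

```python
def resampled_size(size, spacing, target_spacing=(1.0, 1.0, 1.0)):
    """Voxel counts after resampling so that the physical extent is preserved."""
    return tuple(max(1, round(n * s / t)) for n, s, t in zip(size, spacing, target_spacing))

def crop_or_pad_plan(size, target=(128, 128, 128)):
    """Per axis: (crop_start, crop_stop, pad_before, pad_after) for a centered crop or zero pad."""
    plan = []
    for n, t in zip(size, target):
        if n >= t:
            start = (n - t) // 2                 # crop the target size from the center
            plan.append((start, start + t, 0, 0))
        else:
            before = (t - n) // 2                # split the zero padding between both sides
            plan.append((0, n, before, t - n - before))
    return plan

new_size = resampled_size(size=(512, 512, 120), spacing=(0.7, 0.7, 2.5))
plan = crop_or_pad_plan((358, 358, 100))
print(new_size)  # (358, 358, 300)
print(plan)      # [(115, 243, 0, 0), (115, 243, 0, 0), (0, 100, 14, 14)]
```

Each tuple in the plan says which slice to keep along that axis and how many zero voxels to add before and after it.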
[0101] 2. Data augmentation for the large-scale automatic segmentation model for lung adenocarcinoma (training phase only). To increase data diversity and reduce overfitting, data augmentation is applied to the training dataset during segmentation model training. The pre-processed labeled image data is augmented with any one or more of the following: random rotation (-10° to 10°), vertical flipping, horizontal flipping, scaling (0.9× to 1.1×), and grayscale adjustment (including contrast and brightness adjustment).
[0102] 3. Preprocessing of lesion area images. In the training and prediction phases of the large-scale model for predicting ALK rearrangement status in lung adenocarcinoma, the input is an image of the lesion region cropped from the original chest medical image using the mask provided by the segmentation model. Preprocessing steps include: (1) Cropping: based on the lesion segmentation mask image output by the segmentation model, the tight bounding box of the lesion region is obtained, and the image within the bounding box is cropped from the original chest medical image.
[0103] (2) Resampling: since lesions vary in size, the cropped lesion images differ in size. To keep the input of the large model for predicting the ALK rearrangement state of lung adenocarcinoma consistent, trilinear interpolation is used to resample each lesion image to a preset spatial size of 64×64×64 voxels.
[0104] (3) Standardization: Z-score standardization is performed using the same mean and standard deviation as for the entire chest medical image.
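The cropping step can be illustrated on a toy volume stored as nested lists. The helper names are hypothetical, and a real pipeline would operate on array types from an imaging library; the subsequent resampling to 64×64×64 voxels is not shown.

```python
def lesion_bounding_box(mask):
    """Tight bounding box of the non-zero voxels in a nested-list 3D mask.

    Returns ((d0, d1), (h0, h1), (w0, w1)) with exclusive upper bounds, or None for an empty mask.
    """
    coords = [(d, h, w)
              for d, plane in enumerate(mask)
              for h, row in enumerate(plane)
              for w, value in enumerate(row) if value]
    if not coords:
        return None
    return tuple((min(axis), max(axis) + 1) for axis in zip(*coords))

def crop(volume, box):
    """Cut the bounding box out of a nested-list 3D volume."""
    (d0, d1), (h0, h1), (w0, w1) = box
    return [[row[w0:w1] for row in plane[h0:h1]] for plane in volume[d0:d1]]

# 4x4x4 toy mask containing a 3-voxel lesion.
mask = [[[0] * 4 for _ in range(4)] for _ in range(4)]
mask[1][2][3] = mask[2][2][3] = mask[1][1][3] = 1
# Toy image whose voxel value encodes its own (d, h, w) position.
volume = [[[d * 16 + h * 4 + w for w in range(4)] for h in range(4)] for d in range(4)]
box = lesion_bounding_box(mask)
lesion = crop(volume, box)
print(box)     # ((1, 3), (1, 3), (3, 4))
print(lesion)  # [[[23], [27]], [[39], [43]]]
```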
[0105] 4. Preprocessing of clinical information. Clinical information includes a continuous variable (age) and categorical variables (gender, smoking history, and T stage). This information needs to be vectorized before being input into the large-scale model for predicting ALK rearrangement status in lung adenocarcinoma. The continuous variable (age) is mapped to the [0,1] interval through Min-Max normalization.
[0106] Categorical variables (gender, smoking history, and T stage) are transformed using one-hot encoding to form multidimensional binary feature vectors.
[0107] The processed structured clinical information is aggregated into a 7-dimensional feature vector.
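One way to arrive at a 7-dimensional vector is sketched below. The source does not state how many categories each variable has, so the layout used here (one age value plus three two-way one-hot blocks), the category labels, and the age range of 18 to 90 years are all hypothetical choices made only to illustrate Min-Max normalization and one-hot encoding.

```python
def encode_clinical(age, gender, smoker, t_stage, age_range=(18.0, 90.0)):
    """Build a 7-dimensional clinical vector: 1 scaled age value + three 2-way one-hot blocks."""
    lo, hi = age_range
    age_scaled = min(max((age - lo) / (hi - lo), 0.0), 1.0)  # Min-Max normalization to [0, 1]

    def one_hot(value, categories):
        if value not in categories:
            raise ValueError(f"unknown category: {value!r}")
        return [1.0 if value == c else 0.0 for c in categories]

    # Category sets below are assumptions, not taken from the source.
    return ([age_scaled]
            + one_hot(gender, ("female", "male"))
            + one_hot(smoker, ("never", "ever"))
            + one_hot(t_stage, ("T1-T2", "T3-T4")))

vector = encode_clinical(age=54, gender="female", smoker="never", t_stage="T1-T2")
print(vector)  # [0.5, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```

The minimum and maximum age should be taken from the training set only and reused at inference, so that new patients are scaled consistently.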
[0108] The above preprocessing steps ensure the consistency of the input data and the stability of model training and inference, and adapt the processed data to the input requirements of the deep learning model.
[0109] Example 3: Method for Generating Visual Heatmaps. In Example 1, while the predicted ALK rearrangement state of lung adenocarcinoma is being output, a hierarchical-aware gradient class activation mapping visualization heatmap is generated at the same time.
[0110] This embodiment details how the visualization heatmap is generated in the lung adenocarcinoma ALK rearrangement state prediction and assessment module. The heatmap is generated through a hierarchical-aware gradient class activation mapping (Hi-CAM) mechanism. It explains the basis of the model's decision when predicting the ALK rearrangement state of lung adenocarcinoma and helps physicians understand which imaging feature regions the model focuses on.
[0111] The specific steps are as follows: (1) Feature map and gradient extraction. When a lesion region image (64×64×64 voxels) is input into the large-scale model for predicting the ALK rearrangement state of lung adenocarcinoma, the feature maps of the last three layers of the model's Transformer encoder (the outputs of the last, second-to-last, and third-to-last layers) are extracted during forward propagation. These feature maps are denoted $A^{(1)}$, $A^{(2)}$, $A^{(3)}$, where $A^{(k)}$ has dimension $C_k \times D_k \times H_k \times W_k$ ($k = 1, 2, 3$), representing the number of channels, depth, height, and width. For each feature map layer, the gradient of the model output (the predicted score $y$ of the ALK rearrangement positive class) with respect to that feature map is obtained through backpropagation, denoted $\partial y / \partial A^{(k)}$; it has the same dimension as the feature map.
[0112] (2) Calculate channel weights. For each feature map layer $A^{(k)}$, the weight $\alpha_c^{(k)}$ of each channel is calculated: $\alpha_c^{(k)} = \frac{1}{|S|} \sum_{(d,h,w) \in S} \frac{\partial y}{\partial A^{(k)}_{c,d,h,w}}$; where $S$ is the specified computation region on the feature map, $y$ is the predicted score of the positive category of ALK rearrangement status in lung adenocarcinoma output by the model, and $A^{(k)}_{c,d,h,w}$ is the value of feature map $A^{(k)}$ at channel $c$, depth $d$, height $h$, and width $w$. This weight represents the degree to which each channel contributes to the prediction result.
[0113] (3) Generate layer activation maps. For each layer $k$, the activation map $L^{(k)}$ of that layer is calculated: $L^{(k)} = \max\left(0, \sum_{c} \alpha_c^{(k)} A_c^{(k)}\right)$; where $A_c^{(k)}$ is the $c$-th channel of the feature map of layer $k$ and $\alpha_c^{(k)}$ is the weight of the $c$-th channel of layer $k$. The max operation ensures that only regions that contribute positively to the prediction are retained.
[0114] (4) Multi-level integration. The final 3D heat map $M$ is obtained by weighted summation of the activation maps of each layer: $M = w_1\,\mathrm{Interp}(L^{(1)}) + w_2\,\mathrm{Interp}(L^{(2)}) + w_3\,\mathrm{Interp}(L^{(3)})$; where $L^{(1)}$ is the activation map of the last layer, $L^{(2)}$ is the activation map of the second-to-last layer, $L^{(3)}$ is the activation map of the third-to-last layer, Interp(·) is the interpolation operation that resamples a map to the same size as the input lesion image (64×64×64), and $w_1$, $w_2$, and $w_3$ are the weight coefficients corresponding to the last, second-to-last, and third-to-last layers, respectively.
[0115] (5) Dimensionality-reduction projection and visualization. The 3D heat map $M$ is reduced to a two-dimensional image. Maximum intensity projection (MIP) is applied at the midpoint of the heat map to output a two-dimensional visualization in the axial / coronal plane that matches the reading habits of radiologists, yielding the two-dimensional heat map.
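Steps (2) to (5) can be traced on toy data with the sketch below. It is an illustration only: each channel is stored as a flat list of voxels, the computation region S is assumed to be the whole feature map, the three layer maps are assumed to have already been interpolated to a common size, and the layer weights (0.5, 0.25, 0.25) are hypothetical values that the source does not give.

```python
def layer_activation(feature_map, gradient):
    """Grad-CAM style activation map for one layer (lists of channels, each a flat voxel list)."""
    num_voxels = len(feature_map[0])
    # Channel weight: gradient averaged over all voxel positions (S = whole map, an assumption).
    weights = [sum(channel_grad) / num_voxels for channel_grad in gradient]
    combined = [sum(w * channel[i] for w, channel in zip(weights, feature_map))
                for i in range(num_voxels)]
    return [max(0.0, v) for v in combined]  # keep positive contributions only

def fuse_layers(activations, layer_weights):
    """Weighted sum of per-layer maps that already share one resolution."""
    return [sum(w * act[i] for w, act in zip(layer_weights, activations))
            for i in range(len(activations[0]))]

def max_intensity_projection(flat, depth, height, width):
    """Collapse a flat D*H*W heat map along the depth axis into an H x W image."""
    return [[max(flat[d * height * width + h * width + w] for d in range(depth))
             for w in range(width)] for h in range(height)]

# Two channels over a 2x1x2 volume (4 voxels); the same toy data stands in for all three layers.
features = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 1.0]]
grads = [[0.5, 0.5, 0.5, 0.5], [-1.0, -1.0, -1.0, -1.0]]
act = layer_activation(features, grads)
heat = fuse_layers([act, act, act], layer_weights=(0.5, 0.25, 0.25))
projection = max_intensity_projection(heat, depth=2, height=1, width=2)
print(act)         # [0.5, 0.0, 0.0, 0.0]
print(projection)  # [[0.5, 0.0]]
```

The channel with negative gradient suppresses voxels where it is active, so only the first voxel survives the max operation and appears in the projected image.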
[0116] Optionally, the system can overlay contour hints of known ALK-positive imaging features of lung adenocarcinoma (such as "pleural traction" and "vascular convergence") onto the heatmap to help doctors verify whether the model's decisions are in line with medical evidence.
[0117] This embodiment helps clinicians understand the relationship between the model's focus area and pathobiological features, and builds a reliable human-machine collaborative diagnostic closed loop of "prediction-interpretation-verification", thereby improving doctors' acceptance and trust in the model.
[0118] Example 4: Non-invasive prediction system for ALK rearrangement status in lung adenocarcinoma based on a large-scale medical model. Figure 4 is a schematic diagram of the structure of the non-invasive prediction system for ALK rearrangement status of lung adenocarcinoma based on a large medical model provided in this application embodiment. As shown in Figure 4, the ALK rearrangement state prediction system 400 for lung adenocarcinoma based on a large medical model includes: an acquisition module 410, a segmentation module 420, and a rearrangement state prediction module 430.
[0119] The acquisition module 410 is used to acquire chest medical images and corresponding clinical information, wherein the chest medical images include the lung adenocarcinoma lesion area.
[0120] The segmentation module 420 is used to input chest medical images into the automatic segmentation model for lung adenocarcinoma to obtain lesion segmentation mask images. The automatic segmentation model for lung adenocarcinoma is obtained based on a lung adenocarcinoma-specific cue token and a low-rank adapter module fine-tuning strategy.
[0121] The rearrangement state prediction module 430 is used to crop out the lesion region image from the chest medical image based on the lesion segmentation mask image, and input the lesion region image and clinical information into the large-scale ALK rearrangement state prediction model for lung adenocarcinoma to obtain the ALK rearrangement prediction result for the lesion region image. The lesion region image is preprocessed, including resampling the lesion region image to a preset spatial size (e.g., 64×64×64 voxels) and using Z-score normalization to maintain the physical meaning of tissue density.
[0122] This application provides a rearrangement state prediction module 430, which acquires the lesion segmentation mask image output by the segmentation module 420, crops the lesion region image from chest medical images based on the lesion segmentation mask image, and then inputs the lesion region image into a large-scale model for predicting the ALK rearrangement state of lung adenocarcinoma, thereby obtaining the prediction result of the ALK rearrangement state of lung adenocarcinoma for the lesion region image. This allows this application embodiment to achieve automatic segmentation and rearrangement state prediction of the entire lesion region. The possibility of ALK gene state can be inferred non-invasively, quickly, and at low cost through conventional CT images, improving the efficiency and accuracy of lesion delineation, enhancing the repeatability and reliability of diagnostic results, reducing reliance on human experience, and providing timely decision-making references for doctors.
[0123] It should be understood that, for the specific working process and functions of the acquisition module 410, segmentation module 420, and rearrangement state prediction module 430 in the above embodiments, reference can be made to the description of the lung adenocarcinoma ALK rearrangement state prediction method provided in the examples of Figures 1 to 3 above, which will not be repeated here.
[0124] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
[0125] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0126] In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0127] It should be noted that in the description of this application, the terms "first," "second," "third," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance. Furthermore, in the description of this application, unless otherwise stated, "a plurality of" means two or more.
[0128] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A non-invasive prediction method for ALK rearrangement status in lung adenocarcinoma based on a large-scale medical model, characterized in that, Includes the following steps: S1. Acquire chest medical images and corresponding clinical information, wherein the chest medical images include lesion areas; S2. Input the chest medical image into the automatic segmentation model for lung adenocarcinoma to generate a lesion segmentation mask image, wherein the automatic segmentation model for lung adenocarcinoma is obtained based on a lung adenocarcinoma-specific cue token and a low-rank adapter module fine-tuning strategy; S3. Based on the lesion segmentation mask image, crop the lesion region image from the chest medical image, input the lesion region image and clinical information into the large-scale ALK rearrangement state prediction model for lung adenocarcinoma, and obtain the ALK rearrangement prediction result for the lesion region image. The large-scale ALK rearrangement state prediction model for lung adenocarcinoma is obtained based on the ALK specific cue token and low-rank adapter module fine-tuning strategy for lung adenocarcinoma.
2. The method according to claim 1, characterized in that, S1 also includes isotropic resampling of chest medical images, wherein the isotropic resampling uses a trilinear interpolation method.
3. The method according to claim 1, characterized in that, S3 includes: S31. Based on the lesion segmentation mask image, crop out the lesion region image from the chest medical image, and preprocess the lesion region image; S32. Input the preprocessed lesion area image and clinical information into the large model for predicting the ALK rearrangement status of lung adenocarcinoma, and output the prediction results of the ALK rearrangement status of lung adenocarcinoma.
4. The method according to claim 1, characterized in that, S3 further includes, during the process of outputting the prediction results of the ALK rearrangement state of lung adenocarcinoma, simultaneously generating a hierarchical-aware gradient class activation mapping visualization heatmap, including: During the prediction process, feature maps and their corresponding gradients of the last three layers of the Transformer encoder of the large-scale model for predicting the ALK rearrangement state of lung adenocarcinoma are extracted. For each feature map layer, the weight of each channel is calculated based on the gradient of the predicted ALK rearrangement state of lung adenocarcinoma, and the feature map is then weighted by channel to obtain the activation map of that layer. The activation maps of the last three layers are weighted and fused to generate a three-dimensional heat map, which is then reduced to a two-dimensional plane for display through maximum intensity projection.
5. The method according to claim 1, characterized in that, The large-scale automatic segmentation model for lung adenocarcinoma was obtained through the following fine-tuning method: A lung adenocarcinoma-specific cue token fine-tuning module is inserted at the end of the encoder of the general medical segmentation model, and a low-rank adapter module is introduced at the decoder jump connection. Freeze the backbone network parameters of the general medical segmentation model.
6. The method according to claim 1, characterized in that, The training process of the large-scale automatic segmentation model for lung adenocarcinoma includes the following steps: The labeled chest medical images are input into a large-scale automatic segmentation model for lung adenocarcinoma, which includes a lung adenocarcinoma-specific cue token fine-tuning module and a low-rank adapter module, to generate a predicted mask image. Set a first combined loss function and calculate a first loss value based on the predicted mask image and the labeled mask image; Using the first loss value as the optimization objective, backpropagation and parameter updates are performed on the model, and the process is repeated iteratively until the preset training termination condition is met, resulting in a large-scale automatic segmentation model for lung adenocarcinoma that has been trained.
7. The method according to claim 1, characterized in that, The large-scale model for predicting ALK rearrangement status in lung adenocarcinoma was obtained through the following fine-tuning method: A lung adenocarcinoma-specific cue token fine-tuning module is inserted at the end of the encoder of the general vision model, and a low-rank adapter module is inserted after the multi-head attention module of each Transformer layer of the encoder. Freeze the backbone network parameters of the general vision model.
8. The method according to claim 1, characterized in that, The training process of the large-scale model for predicting ALK rearrangement states in lung adenocarcinoma includes the following steps: The labeled lesion region image is input into a large model for predicting the ALK rearrangement state of lung adenocarcinoma in an encoder with an added ALK-specific cue token fine-tuning module and a low-rank adapter module, and image features are extracted. The image features are fused with the patient's clinical information features through a dynamic gated cross-attention fusion mechanism to obtain dynamically weighted fusion features; The dynamically weighted fusion features are input into the classification head for classification, and the predicted result of the ALK rearrangement status of lung adenocarcinoma is output. The classification head includes two fully connected layers. A second combined loss function is set up to calculate the second loss value based on the predicted ALK rearrangement state of lung adenocarcinoma and the labeled ALK rearrangement state. Using the second combined loss value as the optimization objective, backpropagation and parameter updates are performed on the model, and the process is repeated iteratively until a preset termination condition is met, resulting in a trained large model for predicting the ALK rearrangement state of lung adenocarcinoma.
9. The method according to claim 8, characterized in that, The dynamic gating cross-attention fusion mechanism includes: The structured clinical features are used as query vectors, and the image features are used as key and value vectors to calculate cross-attention weights. The value vector is weighted and summed using the cross-attention weights to obtain preliminary fusion features; The image features are concatenated with clinical features, and a gating vector is generated using a gating function. The gating vector is then used to perform a weighted fusion of the preliminary fused features and the image features to obtain a dynamically weighted fused feature.
10. A non-invasive prediction system for ALK rearrangement status in lung adenocarcinoma based on a large-scale medical model, comprising: The acquisition module is used to acquire chest medical imaging data and corresponding clinical information, wherein the chest medical imaging includes lesion areas. The segmentation module is used to input the chest medical images into the automatic segmentation model for lung adenocarcinoma to generate a lesion segmentation mask image, wherein the automatic segmentation model for lung adenocarcinoma is obtained based on a lung adenocarcinoma-specific cue token and a low-rank adapter module fine-tuning strategy. The rearrangement state prediction module is used to crop the lesion region image from the chest medical image based on the lesion segmentation mask image, and input the lesion region image and clinical information into the large-scale ALK rearrangement state prediction model for lung adenocarcinoma to obtain the ALK rearrangement prediction result for the lesion region image. The large-scale ALK rearrangement state prediction model for lung adenocarcinoma is obtained based on the fine-tuning strategy of lung adenocarcinoma ALK specific cue token and low-rank adapter module.