Endometrial cancer prediction method based on cell-level feature fusion and application thereof

The method segments tumor regions and fuses cell-level and tissue-level features for endometrial cancer prediction. It addresses the high cost and insufficient accuracy of existing technologies, aiming at efficient, low-cost, and accurate prediction suitable for large-scale use and cross-center deployment.

CN122244059A · Pending · Publication Date: 2026-06-19 · SHENZHEN SHENGQIANG TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: SHENZHEN SHENGQIANG TECH
Filing Date: 2026-05-25
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing technologies for molecular subtyping of endometrial cancer are costly, slow, invasive, and demanding of laboratory conditions, so they cannot be applied on a large scale. They also fail to make effective use of cell-level microscopic information, which leads to insufficient prediction accuracy and models that are susceptible to background noise.

Method used

Tumor regions are segmented from pathological whole-slide images, cell-level graph features and tissue-level deep features are extracted in parallel within those regions, and a cross-attention and gating fusion mechanism adaptively fuses the multi-scale features to improve prediction accuracy.

Benefits of technology

It significantly improves the accuracy of molecular subtyping and microsatellite instability prediction for endometrial cancer, reduces detection costs, has the potential for large-scale application, and provides model interpretability and robustness, making it suitable for cross-center deployment.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention proposes a method for predicting endometrial cancer based on cell-level feature fusion and its application. Existing methods rely solely on tissue-level macroscopic features, which makes them susceptible to background interference and limits prediction accuracy. To address this, the method acquires a pathological whole-slide image and extracts the tumor region. Within the tumor region, cell-level graph features and tissue-level deep features are extracted separately. With the cell-level graph features as the query, cross-attention is computed over the tissue-level deep features to generate a cell-guided tissue representation, which is then adaptively gated and fused with the global tissue representation to form a joint representation. Based on this joint representation, molecular subtyping and microsatellite instability prediction results are output simultaneously. The invention eliminates irrelevant background noise, fuses complementary microscopic and macroscopic features, and improves the accuracy of clinical disease prediction.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence medical diagnosis, and in particular to a method for predicting endometrial cancer based on cell-level feature fusion and its application.

Background Technology

[0002] Precision treatment of endometrial cancer is highly dependent on molecular subtyping. Current standard clinical methods rely mainly on molecular pathological detection techniques such as immunohistochemistry, fluorescence in situ hybridization, or next-generation sequencing to determine the four standard molecular subtypes: POLE mutant, mismatch repair deficient, p53 abnormal, and no specific molecular profile, thereby guiding prognosis and treatment selection. However, these techniques are generally costly and time-consuming, require sophisticated laboratory conditions and skilled operators, and are invasive or destructive in that they consume valuable tissue samples, which limits their large-scale application in primary healthcare institutions.

[0003] With the development of digital pathology, computational pathology methods based on hematoxylin-eosin stained whole-slide images have become a research hotspot for non-invasive, low-cost prediction of molecular phenotypes. Existing solutions generally employ a weakly supervised multiple-instance learning framework, which treats the whole-slide image as a bag of numerous tissue blocks, aggregates the features of all tissue blocks through mechanisms such as attention pooling to form a slide-level representation, and finally performs molecular subtype classification. Some improved solutions introduce graph neural networks to model the relationships between tissue blocks, or refine a single global attention model. However, these solutions do not explicitly distinguish tumor regions from background regions such as normal glands, stroma, and inflammation during feature extraction and modeling, so the feature aggregation process is susceptible to interference from irrelevant noise. Furthermore, existing solutions often rely solely on macroscopic morphological features at the tissue block level and neglect cell-level microscopic morphological information such as nuclear morphology, nuclear atypia, cell density, and spatial arrangement. Discriminative features are lost as a result, and it becomes difficult to reflect the key biological clues at different scales within tumor tissue.

[0004] Therefore, there is an urgent need for a method for predicting endometrial cancer based on cell-level feature fusion and its application, in order to solve the problems existing in the current technology.

Summary of the Invention

[0005] This invention provides a method for predicting endometrial cancer based on cell-level feature fusion and its application. It addresses the problems of existing whole-slide-image-based molecular subtyping methods for endometrial cancer, which do not focus on the tumor region and ignore cell-level microscopic information. As a result, their models are susceptible to background noise and have insufficient representational capacity, which harms prediction accuracy and interpretability.

[0006] The core of this invention is to first segment and extract the tumor region to eliminate background interference, then extract cell-level graph features and tissue-level deep features in parallel within the tumor region, and finally apply a cross-attention and gating fusion mechanism guided by cell information to fuse the multi-scale features adaptively, thereby improving prediction accuracy.

[0007] In a first aspect, the present invention provides a method for predicting endometrial cancer based on cell-level feature fusion, the method comprising the following steps:

[0008] Obtaining a pathological whole-slide image and performing tumor region segmentation on it to obtain a tumor region;
extracting cell-level graph features and tissue-level deep features separately within the tumor region;
projecting the cell-level graph features into a query vector, and projecting each feature vector contained in the tissue-level deep features into a key vector and a value vector;
calculating cross-attention weights from the similarity between the query vector and each key vector, and computing the weighted sum of the value vectors with these weights to obtain a cell-guided tissue representation;
adaptively fusing the cell-guided tissue representation with the global tissue representation as a weighted sum under a learnable gating coefficient to obtain a fused feature;
combining the fused feature with the cell-level graph features to form a joint representation; and
outputting, based on the joint representation, the molecular subtyping prediction result and the microsatellite instability prediction result for endometrial cancer simultaneously.

[0009] Further, performing tumor region segmentation on the whole-slide image to obtain the tumor region comprises: dividing the whole-slide image into multiple non-overlapping image blocks; using a trained tumor segmentation model to perform tumor / non-tumor binary segmentation on each image block, and mapping the segmentation results back to the whole-slide image to generate a tumor region mask; and taking only the region within the tumor region mask as the tumor region. The tumor segmentation model is built on a foundation model pre-trained on histopathological images.

[0010] Further, extracting the cell-level graph features comprises: performing nucleus detection and instance segmentation within the tumor region, and extracting the morphological, texture, and spatial location features of each nucleus as initial node features; constructing an undirected cell graph with the nuclei as nodes, based on the spatial proximity between nuclei; and performing message passing and feature aggregation on the undirected cell graph with a graph neural network to obtain a global cell representation, which serves as the cell-level graph features.

[0011] Further, constructing the undirected cell graph based on the spatial proximity between nuclei comprises: constructing the initial topology with the K-nearest neighbor algorithm; and, for any two nodes, establishing a connecting edge between them only if their Euclidean distance in the image plane is less than a preset distance threshold, thereby pruning the topology.

[0012] Further, extracting the tissue-level deep features comprises: dividing the tumor region into multiple tissue blocks of interest; performing deep feature extraction on each tissue block of interest with a pre-trained foundation feature extraction model to obtain a tissue block feature set; and applying attention pooling to the tissue block feature set to obtain a global tissue representation.

[0013] Furthermore, adaptive fusion is performed within a defined shared fusion subspace, with the dimension of the shared fusion subspace being 128.

[0014] Further, simultaneously outputting the molecular subtyping prediction result and the microsatellite instability prediction result for endometrial cancer comprises: inputting the joint representation into a multi-task classifier; outputting the molecular subtyping prediction result through a first classification branch as a four-class result covering POLE mutant, mismatch repair deficient, p53 abnormal, and no specific molecular profile; and outputting the microsatellite instability prediction result through a second classification branch as a binary result covering microsatellite instability-high and microsatellite stable.

[0015] In a second aspect, the present invention provides an endometrial cancer prediction device based on cell-level feature fusion, comprising:
a tumor region extraction module, used to acquire a pathological whole-slide image and segment it to obtain the tumor region;
a dual-branch feature extraction module, used to extract cell-level graph features and tissue-level deep features separately within the tumor region;
an adaptive fusion module, used to perform cross-attention over the tissue-level deep features with the cell-level graph features as the query, generate a cell-guided tissue representation, and adaptively fuse it with the global tissue representation obtained by attention pooling of the tissue-level deep features to obtain a joint representation; and
a multi-task prediction module, used to simultaneously output the molecular subtyping prediction result and the microsatellite instability prediction result for endometrial cancer based on the joint representation.

[0016] In a third aspect, the present invention provides an electronic device including a memory and a processor, wherein the memory stores a computer program and the processor is configured to run the computer program to perform the above-described method for predicting endometrial cancer based on cell-level feature fusion.

[0017] In a fourth aspect, the present invention provides a readable storage medium storing a computer program, the computer program including program code for controlling a process to execute the process, the process including the above-described method for predicting endometrial cancer based on cell-level feature fusion.

[0018] The main contributions and innovations of this invention are as follows: 1. Significantly improved prediction accuracy for molecular subtyping and microsatellite instability: The method explicitly constructs cell-level graph features within the tumor region that capture nuclear morphology, texture, and spatial topological relationships, and fuses them across scales with tissue-level macroscopic structural features. This lets the model capture microscopic and macroscopic discriminative information simultaneously and represent tumor heterogeneity in a complementary way. It achieves better prediction performance than existing weakly supervised multiple-instance learning models on multiple public and private datasets.

[0019] 2. The model is highly interpretable and conforms to the logic of pathological diagnosis: The method of this invention adopts a cross-attention mechanism with cell-level features as the query to generate cell-guided tissue representations, which enables the neural network attention heatmap to spontaneously focus on the morphologically rich tumor areas of concern to pathologists, rather than irrelevant background or normal tissues. This provides a visual basis for molecular subtyping results that is consistent with clinical diagnostic thinking and enhances clinical credibility.

[0020] 3. Low cost, fast speed, and potential for large-scale application: The entire prediction process of this invention only requires routine clinical hematoxylin-eosin staining slides, without the need for additional immunohistochemical or sequencing detection, completely eliminating the dependence on expensive gene testing. Moreover, the entire process is automated and can generate prediction reports in a short time, providing efficient and low-cost auxiliary support for the treatment decisions of endometrial cancer patients.

[0021] 4. Strong robustness and generalization ability: The mandatory tumor region segmentation pre-step removes the confounding noise from non-tumor regions and reduces the domain shift introduced by cross-center differences in slide staining and scanning, enabling the model to generalize well and remain stable in real-world deployment across centers, devices, and datasets.

[0022] 5. Multi-task integration enhances clinical applicability: The method of this invention simultaneously outputs molecular subtyping and microsatellite instability prediction results within the same framework, realizing a one-stop clinical decision support with "one input, multiple outputs", simplifying the diagnostic workflow and improving deployment efficiency and clinical applicability.

[0023] Details of one or more embodiments of the present invention are set forth in the following drawings and description, so that other features, objects and advantages of the invention will be more readily understood.

Description of the Drawings

[0024] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this invention, illustrate exemplary embodiments of the invention and are used to explain the invention, but do not constitute an undue limitation of the invention. In the drawings:
Figure 1 is a flowchart illustrating the overall technical framework of the endometrial cancer prediction method based on cell-level feature fusion according to an embodiment of the present invention;
Figure 2 is a comparison chart of the model attention heatmap and pathology expert annotations according to an embodiment of the present invention;
Figure 3 is a flowchart of the endometrial cancer prediction method based on cell-level feature fusion according to an embodiment of the present invention;
Figure 4 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present invention.

Detailed Implementation

[0025] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with one or more embodiments of this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of one or more embodiments of this specification as detailed in the appended claims.

[0026] It is important to clarify first that the endometrial cancer prediction method based on cell-level feature fusion provided in this invention processes digitized whole-slice images of pathological tissue collected and scanned in vitro, and does not directly apply to living humans or animals. The molecular subtyping and microsatellite instability prediction results output by the model of this invention are essentially objective data features obtained through deep learning feature extraction and classification calculations on image data. These calculation results serve only as intermediate information for biological markers, providing data reference for subsequent computer-aided diagnosis by pathologists, and cannot replace the final clinical diagnostic conclusion of doctors. Therefore, this invention does not constitute a method for diagnosing or treating diseases.

[0027] Example 1

As shown in Figure 1 and Figure 3, this embodiment provides a method for predicting endometrial cancer based on cell-level feature fusion. The method takes a whole-slide image (WSI) as input and, through automatic tumor region segmentation, parallel extraction of cell-level graph features and tissue-level deep features, cell-guided cross-attention gating fusion, and simultaneous multi-task prediction, outputs molecular subtyping and microsatellite instability (MSI) prediction results for endometrial cancer.

[0028] To illustrate the overall implementation mechanism of this invention more intuitively, this embodiment provides an overall technical framework; see Figure 1 for details. The method is based on a parallel dual-branch network architecture comprising a tissue branch and a cell branch. In the tissue branch, the input whole-slide image (WSI) is divided into multiple patches. After deep features are extracted by a feature extractor, they pass in turn through an attention pooling layer and a linear projection layer and are mapped to a shared fusion subspace, yielding the tissue features. In the cell branch, the cell information within the tumor region is input into a graph neural network, which produces per-cell node embeddings. These likewise pass in turn through an attention pooling layer and a linear projection layer and are mapped to the shared fusion subspace, yielding the cell features. The global cell representation obtained from the cell feature projection is then used as the query vector, which interacts with the key vectors and value vectors obtained from the tissue feature projection in a cell-guided cross-attention module. This computation uses operators such as dot product or matrix multiplication, and the weights are dynamically adjusted with functions such as the sigmoid activation σ, to generate the cell-guided tissue representation. Next, the cell-guided tissue representation and the global tissue representation are fed into a fusion module, where a gating coefficient drives an adaptive weighted summation, and the joint representation is constructed. Finally, the joint representation is input into a multi-task classifier, which simultaneously outputs raw scores for molecular subtyping prediction and microsatellite instability prediction of endometrial cancer.

[0029] The method specifically includes the following steps:

Step S1: Obtain the pathological whole-slide image and perform tumor region segmentation on it to obtain the tumor region.

[0030] The core purpose of this step is to introduce a pre-emptive explicit tumor focusing mechanism to force the model to focus only on the lesion areas that the pathologist is truly concerned with, eliminating interference from background noise such as normal glands, stroma, and inflammation.

[0031] First, a whole-slide image stained with hematoxylin and eosin (H&E) is input and, at 10x magnification, divided into multiple non-overlapping image blocks of 512×512 pixels.

[0032] Subsequently, the trained TransUNI tumor segmentation model was used to perform binary classification of each image patch as tumor / non-tumor. The architecture of this segmentation model is based on TransUNet, with the original standard Transformer module in its encoder replaced by a UNI base model that was self-supervised pre-trained on a large-scale histopathological image dataset. The UNI encoder endows the model with powerful prior capabilities for histological morphology recognition and structural modeling. The segmentation model was then fine-tuned under supervision on an endometrial cancer dataset where tumor region boundaries were precisely annotated pixel-by-pixel by pathologists to ensure accurate localization of the fine boundaries of malignant regions.

[0033] Finally, the segmentation results of each image patch are mapped back to the full-slice image according to their spatial coordinates in the original image, generating a continuous tumor region mask. Only the image regions located within this mask are retained as the tumor regions for subsequent processing, and all subsequent feature extraction and inference are strictly performed within this region.
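
The tiling and mask-stitching logic of step S1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `tile_origins` and `stitch_mask` are hypothetical, the per-tile masks are assumed to come from some segmentation model that is not shown, and dropping partial tiles at the slide border is an assumption, since the text does not say how borders are handled.

```python
def tile_origins(width, height, tile=512):
    """Top-left (x, y) corners of the non-overlapping tile x tile blocks.

    Assumption: partial blocks at the right/bottom border are dropped.
    """
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]


def stitch_mask(width, height, tile_masks, tile=512):
    """Paste per-tile binary masks back at their slide coordinates.

    tile_masks maps (x0, y0) -> list of `tile` rows, each a list of
    `tile` values in {0, 1} (1 = tumor). Tiles without a mask stay 0.
    """
    mask = [[0] * width for _ in range(height)]
    for (x0, y0), tile_mask in tile_masks.items():
        for dy, row in enumerate(tile_mask):
            mask[y0 + dy][x0:x0 + tile] = row
    return mask
```

Downstream steps would then read only pixels where the stitched mask equals 1, which is how the restriction to the tumor region described above could be enforced.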

[0034] Step S2: Extract cell-level graph features and tissue-level deep features separately within the tumor region.

[0035] This step constructs a parallel, two-branch feature extraction architecture, designed to simultaneously capture complementary discriminative information at both the microscopic cellular and macroscopic tissue scales from the obtained tumor region.

[0036] The specific sub-steps for extracting cell-level graph features are as follows: First, at high resolution, a pre-trained HoVer-Net nucleus segmentation model is used to detect and segment the cell nuclei within the tumor region, obtaining the geometric contour of each nucleus. For the j-th segmented nucleus, an initial node feature vector $c_j$ is extracted. This vector encodes the nucleus's morphological features (such as area and nuclear atypia), texture features, and location in image space. The dimension $d_c$ of the feature vector is set to 64.
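
As an illustration of the kind of geometric entries such a node feature vector might contain, the sketch below derives area, perimeter, and centroid from a nucleus contour with the shoelace formula. It is an assumption-laden example: the function name `nucleus_geometry` is hypothetical, the contour is assumed to be a simple, non-degenerate polygon given as a list of (x, y) points, and the text does not specify how the 64 feature dimensions are actually composed.

```python
import math


def nucleus_geometry(contour):
    """Area, perimeter and centroid of a closed polygon [(x, y), ...].

    Assumes a simple polygon with non-zero area (vertex order may be
    clockwise or counter-clockwise).
    """
    signed_area2 = 0.0  # twice the signed area (shoelace sum)
    cx = cy = perimeter = 0.0
    # Pair every vertex with its successor, wrapping around at the end.
    for (x0, y0), (x1, y1) in zip(contour, contour[1:] + contour[:1]):
        cross = x0 * y1 - x1 * y0
        signed_area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(signed_area2) / 2.0
    # Centroid = sum / (6 * signed area) = sum / (3 * signed_area2).
    centroid = (cx / (3.0 * signed_area2), cy / (3.0 * signed_area2))
    return {"area": area, "perimeter": perimeter, "centroid": centroid}
```

The centroid doubles as the node position used when measuring distances between nuclei.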

[0037] Subsequently, with each detected nucleus as a graph node and the spatial proximity between nuclei defining the edges, an undirected cell graph $G_c = (V_c, E_c)$ is constructed. The construction rests on the assumption that the closer two cells are in space, the stronger their biological interaction, and is implemented as follows: the initial topology is built with the K-Nearest Neighbors (KNN) algorithm and then pruned with a preset distance threshold $d_{\min}$. That is, for any two nodes $v$ and $u$, a connecting edge $e_{vu} \in E_c$ is established only if node $u$ is a K-nearest neighbor of node $v$ and the Euclidean distance $\mathrm{dist}(v, u)$ between the two nodes in the image plane is less than $d_{\min}$. This eliminates overly long, weak connections that may not reflect a biological relationship.
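
The KNN-plus-threshold construction can be sketched in a few lines. The sketch makes several assumptions that the text leaves open: the function name `build_cell_graph` is hypothetical, the default values of `k` and `d_min` are placeholders, and an undirected edge is added whenever either endpoint lists the other among its K nearest neighbors. The brute-force neighbor search is quadratic and only meant for illustration; a spatial index would be needed for real slides.

```python
import math


def build_cell_graph(centroids, k=5, d_min=50.0):
    """Undirected edges (u, v) with u < v between nucleus centroids.

    An edge is kept only if one node is among the other's k nearest
    neighbours AND their Euclidean distance is below d_min.
    k and d_min are placeholder values, not taken from the patent.
    """
    edges = set()
    for v, pv in enumerate(centroids):
        # Distances from v to every other node, nearest first.
        neighbours = sorted(
            (math.dist(pv, pu), u)
            for u, pu in enumerate(centroids) if u != v
        )
        for dist, u in neighbours[:k]:
            if dist < d_min:  # prune long, weak connections
                edges.add((min(u, v), max(u, v)))
    return sorted(edges)
```

With four nuclei, three of them close together and one far away, the distant nucleus ends up isolated even though KNN alone would have connected it.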

[0038] After the undirected cell graph is constructed, a four-layer Graph Isomorphism Network (GIN) serves as the graph neural network for message passing and feature aggregation. Each GIN layer is implemented by a two-layer multilayer perceptron (MLP) with ReLU activation, and the hidden dimension is set to 32. After message passing, the features of each cell node are fused with information from its local neighborhood, giving a context-aware cell node embedding $h_j = \mathrm{GIN}(c_j, G_c)$. The embeddings of all nodes form the cell representation set $H = \{h_1, h_2, \ldots, h_M\}$, where $M$ is the total number of cells in the tumor region.
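
One GIN layer of the kind described above can be written out directly. The update used here is the standard GIN rule, where each node's new feature is an MLP applied to (1 + eps) times its own feature plus the sum of its neighbors' features. The helper names, the `eps` parameter, and the placement of ReLU only between the two MLP layers are assumptions for illustration; the weights are plain nested lists, so no training is shown.

```python
def linear(x, weight, bias):
    """Dense layer: weight is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def mlp2(x, params):
    """Two-layer MLP with ReLU between the layers (assumed placement)."""
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, v) for v in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


def gin_layer(h, edges, params, eps=0.0):
    """Standard GIN update on an undirected graph.

    h: list of node feature vectors; edges: list of (u, v) pairs.
    new_h[v] = MLP((1 + eps) * h[v] + sum of h[u] over neighbours u).
    """
    agg = [[(1.0 + eps) * value for value in hv] for hv in h]
    for u, v in edges:  # undirected: messages flow both ways
        for i in range(len(h[0])):
            agg[u][i] += h[v][i]
            agg[v][i] += h[u][i]
    return [mlp2(a, params) for a in agg]
```

Stacking four such layers, each with its own parameters, would correspond to the four-layer network in the text.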

[0039] The specific steps for extracting tissue-level deep features are as follows: The tumor region obtained in step S1 is divided into $N$ non-overlapping tissue blocks of interest. The i-th tissue block $x_i$ is input into a pre-trained UNI foundation feature extraction model, which outputs a deep feature vector $p_i = \mathrm{UNI}(x_i)$ of fixed dimension $d_p = 1024$ as the histological representation of that block. This yields the tissue block feature set $P = \{p_1, p_2, \ldots, p_N\}$.

[0040] Step S3: With the cell-level graph features as the query, perform cross-attention over the tissue-level deep features to generate a cell-guided tissue representation, and adaptively fuse it with the global tissue representation to obtain the joint representation.

[0041] This step is the core feature fusion stage of this invention. To facilitate the interaction and computation between different modal features, a shared attention and fusion subspace is set, with a dimension d=128.

[0042] First, the query vector for cross-attention is obtained. Attention pooling is applied to the cell node embedding set $H$ obtained in step S2: an attention network containing a learnable function $f_c(\cdot)$ computes an attention weight $a_j$ for each cell node embedding $h_j$. The calculation formula is as follows:

$$a_j = \frac{\exp\big(f_c(h_j)\big)}{\sum_{m=1}^{M} \exp\big(f_c(h_m)\big)}$$

[0043] Here $M$ is the total number of cells. The node embeddings are then summed with these weights to obtain the aggregated global cell representation $h_{\mathrm{cell}} = \sum_{j=1}^{M} a_j h_j$, which a linear projection layer $W_c$ maps to the shared subspace, giving the projected global cell representation $q = W_c h_{\mathrm{cell}}$ of dimension $d$. This representation serves as the query vector in the subsequent cross-attention mechanism.
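
A minimal sketch of attention pooling followed by a linear projection is given below. It assumes the attention weights are a softmax over per-node scores, which is the usual form of attention pooling but is not spelled out in the surviving text. The scoring function is passed in as a plain callable (`score_fn`, a hypothetical stand-in for the learnable attention network), and the projection matrix is a nested list.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings, score_fn):
    """Weighted sum of embeddings; weights = softmax of score_fn(h).

    score_fn stands in for the learnable attention network (assumed
    to return one scalar score per embedding).
    """
    weights = softmax([score_fn(h) for h in embeddings])
    dim = len(embeddings[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, embeddings))
              for i in range(dim)]
    return pooled, weights


def project(matrix, x):
    """Linear projection without bias: one output per matrix row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]
```

The same two functions would cover the tissue branch as well, with the tissue block features in place of the cell node embeddings and a separate scoring function and projection matrix.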

[0044] Second, the macroscopic tissue-level reference representation is obtained. Attention pooling is likewise applied to the tissue block feature set $P$ obtained in step S2: a learnable function $f_t(\cdot)$ is introduced to compute an attention weight $b_i$ for each tissue block feature $p_i$. The calculation formula is as follows:

$$b_i = \frac{\exp\big(f_t(p_i)\big)}{\sum_{n=1}^{N} \exp\big(f_t(p_n)\big)}$$

[0045] Here $N$ is the total number of tissue blocks. Weighted summation yields the aggregated global tissue representation $p_{\mathrm{glob}} = \sum_{i=1}^{N} b_i p_i$, which a linear projection layer $W_t$ projects onto the shared subspace, giving the projected global tissue representation $t_{\mathrm{glob}} = W_t p_{\mathrm{glob}}$, also of dimension $d$. This representation provides a global view of the macroscopic tissue structure of the tumor region.

[0046] Next, the cross-attention computation guided by cell information is performed. Each tissue block feature $p_i$ in the tissue block feature set $P$ is linearly transformed by the parameter matrices $W_K$ and $W_V$ and projected into a key vector $k_i = W_K p_i$ and a value vector $v_i = W_V p_i$. The projected global cell representation $q$ then acts as the query vector: its dot-product similarity with each key vector $k_i$ is computed, scaled by the factor $\sqrt{d}$, and normalized with the Softmax function to give a set of cross-attention weights $\alpha_i$. The calculation formula is as follows:

$$\alpha_i = \frac{\exp\big(q^{\top} k_i / \sqrt{d}\big)}{\sum_{n=1}^{N} \exp\big(q^{\top} k_n / \sqrt{d}\big)}$$

[0047] Physically, this weight represents the distribution of importance of each tissue block feature to the current task under the guidance of global cell morphology and topology. Finally, the weights are used to form a weighted sum of all value vectors $v_i$, giving the cell-guided tissue representation $t_{\mathrm{cell}} = \sum_{i=1}^{N} \alpha_i v_i$.
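
The single-query cross-attention described above is scaled dot-product attention with one query (the projected global cell representation) and one key/value pair per tissue block. The sketch below assumes the scaling factor is the square root of the shared dimension, as in standard scaled dot-product attention. The function and argument names are hypothetical, and all matrices are nested lists.

```python
import math


def matvec(matrix, x):
    """Matrix-vector product; one output per matrix row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cell_guided_attention(query, block_feats, w_key, w_value):
    """Cross-attention with a single query over N tissue block features.

    Returns (cell-guided tissue representation, attention weights).
    Assumes the scaling factor is sqrt(d), d = len(query).
    """
    d = len(query)
    keys = [matvec(w_key, p) for p in block_feats]
    values = [matvec(w_value, p) for p in block_feats]
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    guided = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return guided, weights
```

The returned weights are the per-block importances that an attention heatmap such as the one in Figure 2 would presumably visualize.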

[0048] Finally, adaptive gated fusion is performed. To preserve discriminative patterns inherent in macroscopic tissue features that cell-level information may not fully capture (such as the tumor stromal response), the cell-guided tissue representation and the projected global tissue representation are adaptively fused. Specifically, the two are concatenated and input into a multilayer perceptron (MLP) followed by a sigmoid activation, whose output is a scalar between 0 and 1 that serves as the learnable gating coefficient. The two representations are then summed with weights determined by this coefficient to obtain the fused feature. This mechanism lets the model dynamically balance the contributions of the cell-guided representation and the original tissue representation for each input sample. Finally, the fused feature is concatenated with the projected global cell representation to form the joint representation $z$ of dimension $2d = 256$.
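
The gating step can be sketched as a convex combination controlled by a scalar gate. Two points are assumptions rather than facts from the text: that the gate g weights the cell-guided representation and (1 - g) weights the global tissue representation (the text only says the two are weighted and summed), and that the gate network can be modelled as a callable returning one pre-sigmoid score. The names `gated_fusion` and `gate_mlp` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(t_cell, t_glob, q, gate_mlp):
    """Fuse two d-dim tissue representations, then append the query.

    t_cell: cell-guided tissue representation
    t_glob: projected global tissue representation
    q:      projected global cell representation
    gate_mlp: callable mapping the 2d-dim concatenation to one score
              (stand-in for the learnable MLP; assumed interface).
    Assumed convention: fused = g * t_cell + (1 - g) * t_glob.
    """
    g = sigmoid(gate_mlp(t_cell + t_glob))  # list concatenation
    fused = [g * a + (1.0 - g) * b for a, b in zip(t_cell, t_glob)]
    joint = fused + q  # concatenation, length 2d
    return joint, g
```

With d = 128 as in the text, the returned joint vector has length 256. Because the gate is recomputed from each sample's own representations, the balance between the two sources can differ from slide to slide.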

[0049] Step S4: Based on joint characterization, simultaneously output the molecular subtyping prediction results and microsatellite instability prediction results for endometrial cancer.

[0050] The joint representation z obtained in step S3 is input into a multi-task classifier. This classifier contains two parallel fully connected classification branches. The output layer of the first classification branch contains 4 neurons, is activated by Softmax, and outputs the probabilities of the four standard molecular subtypes, namely POLE mutant (POLEmut), mismatch repair deficient (MMRd), p53 abnormal (p53abn), and no specific molecular profile (NSMP). The output layer of the second classification branch contains 2 neurons, is also activated by Softmax, and outputs the predicted probabilities of microsatellite instability-high (MSI-H) and microsatellite stable (MSS).
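
The two-branch classifier can be illustrated with one shared input and two independent softmax heads. The sketch assumes each branch is a single fully connected layer (the text specifies only the output layers) and assumes the class ordering shown in the constants. The names `SUBTYPES`, `MSI_STATES`, and `predict` are hypothetical.

```python
import math

# Assumed class order; the patent lists the classes but not an index order.
SUBTYPES = ("POLEmut", "MMRd", "p53abn", "NSMP")
MSI_STATES = ("MSI-H", "MSS")


def linear(x, weight, bias):
    """Dense layer: weight is a list of rows, one per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(z, subtype_head, msi_head):
    """Run both heads on the same joint representation z.

    Each head is a (weight, bias) pair; returns two dicts mapping
    class name -> probability.
    """
    p_subtype = softmax(linear(z, *subtype_head))
    p_msi = softmax(linear(z, *msi_head))
    return dict(zip(SUBTYPES, p_subtype)), dict(zip(MSI_STATES, p_msi))
```

Both heads read the same vector, so a single forward pass through the shared layers yields both predictions.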

[0051] Example 2

This embodiment provides the detailed model training configuration, parameter settings, and multi-dimensional comparative verification experiments to demonstrate the inventiveness and technical effect of each core module of the scheme proposed in Example 1.

[0052] Model Training and Dataset Configuration: Three independent datasets were used for model training and evaluation: the TCGA-UCEC public dataset containing 527 cases with complete molecular typing annotations, the Jiangmen Central Hospital private dataset containing 374 cases, and the Ningbo Pathology Center private dataset containing 149 cases. All whole-slice images from all datasets were preprocessed according to step S1 of Example 1. Five-fold cross-validation was used for evaluation, with the data proportionally divided into training, validation, and test sets.
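
One way to organise a five-fold evaluation with separate training, validation, and test sets is sketched below. The scheme is an assumption: the text says only that the data were divided proportionally, so the 3:1:1 rotation used here (one fold for testing, the next for validation, the remaining three for training) is a common choice rather than the documented one. Splitting is done at the case level so that all slides from one case stay in the same set; the function name `five_fold_splits` is hypothetical.

```python
import random


def five_fold_splits(case_ids, seed=0):
    """Return five (train, val, test) triples of case IDs.

    Assumed scheme: shuffle once, cut into 5 folds; in round i,
    fold i is the test set, fold i+1 the validation set, and the
    other three folds form the training set.
    """
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::5] for i in range(5)]
    splits = []
    for i in range(5):
        val_index = (i + 1) % 5
        train = [case
                 for j, fold in enumerate(folds)
                 if j not in (i, val_index)
                 for case in fold]
        splits.append((train, folds[val_index], folds[i]))
    return splits
```

Every case appears in exactly one test set across the five rounds, so the test metrics can be averaged over folds without double counting.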

[0053] During the training of the segmentation model, a stochastic gradient descent (SGD) optimizer was used with an initial learning rate of 0.01, weight decay of 0.0001, and momentum of 0.9. The model was trained for 40 epochs with a batch size of 1. For the main network training of the classification task, an Adam optimizer was used with an initial learning rate of 0.001, weight decay of 0.0005, and a batch size of 16. The model was trained for 50 epochs, and an early stopping strategy with a patience value of 20 was adopted based on the validation set accuracy.
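
The early stopping rule for the classification network (at most 50 epochs, patience of 20 on validation accuracy) can be sketched as a small training-loop wrapper. The wrapper name and the `run_epoch` callable, which is assumed to train for one epoch and return the validation accuracy, are hypothetical. Interpreting patience as the number of consecutive epochs without improvement is the common convention and is assumed here.

```python
def train_with_early_stopping(run_epoch, max_epochs=50, patience=20):
    """Run up to max_epochs, stopping once validation accuracy has
    not improved for `patience` consecutive epochs.

    run_epoch(epoch) is assumed to train one epoch and return the
    validation accuracy. Returns (best_epoch, best_accuracy).
    """
    best_acc = float("-inf")
    best_epoch = -1
    for epoch in range(max_epochs):
        acc = run_epoch(epoch)
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best_epoch, best_acc
```

In practice the model weights from the best epoch would be saved and restored, which is omitted here.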

[0054] Experiment 1: Verification of the necessity of pre-segmentation of the tumor region.

[0055] This experiment compared the performance differences of two approaches on downstream tasks: "using all tiles in the whole slice" and "using only tumor region tiles". The experimental results are shown in Table 1.

[0056] Table 1. Impact of cancer region extraction on downstream task performance

[0057] As shown in Table 1, for both molecular subtyping and MSI prediction tasks, using only tumor region patches for training and inference achieves comprehensive and significant improvements in accuracy, AUC, and F1 score. This strongly demonstrates the necessity of the preliminary step of "segmenting the tumor region in the whole slice image" as described in the independent claim: it effectively eliminates clutter noise from non-tumor regions and improves the model's ability to focus on key discriminative cues.

[0058] Experiment 2: Validation of the effectiveness of cell-tissue multiscale fusion features.

[0059] This experiment compared the performance of three schemes: "using only cell-level features", "using only tissue block-level features", and "combining both". The experimental results are shown in Table 2.

[0060] Table 2. Impact of cell-tissue feature fusion on performance (molecular typing, JMZX dataset)

[0061] The results show that, after fusing features from both scales, the model outperforms either single-scale scheme on several key metrics. This demonstrates a complementary and synergistic effect between tissue-level macroscopic structural features and cell-level microscopic morphological features, which together constitute a more complete tumor characterization. This is direct evidence of the technical advance brought about by the core concept of multi-scale fusion.

[0062] Experiment 3: Validation of the effectiveness of the feature fusion strategy.

[0063] This experiment compared the performance of various fusion strategies on the molecular typing task, focusing on the fusion mechanism in step S3. The experimental results are shown in Table 3.

[0064] Table 3. Performance comparison of different feature fusion strategies (molecular subtyping, JMZX dataset)

[0065] Table 3 shows that the learnable gating fusion strategy achieved the best performance, while fixed gating and direct concatenation performed significantly worse. This demonstrates that different samples depend to different degrees on macroscopic and microscopic features, and that the sample-level adaptive adjustment provided by learnable gating is the key to this technical solution. Meanwhile, the significant performance drop caused by "removing cross-attention" confirms that the core mechanism of "cellular information guiding tissue feature reweighting" plays an indispensable role in improving downstream discrimination.

[0066] Experiment 4: Sensitivity analysis of shared subspace dimensions.

[0067] This experiment investigated the impact of different values of the shared subspace dimension d on model performance. The experimental results are shown in Table 4.

[0068] Table 4. Ablation experiments on the shared subspace dimension d (molecular typing, JMZX dataset)

[0069] Experimental data show that the model performance is optimal when d=128, achieving the best balance between representation ability and generalization ability. Too small a dimension (64) limits the capacity of information interaction, while too large a dimension (256, 512) may introduce redundant degrees of freedom and increase the risk of overfitting.

[0070] Finally, in the overall performance comparison, the method of this invention consistently outperforms existing advanced weakly supervised multi-instance learning models such as ABMIL, CLAM, and TransMIL on all evaluation metrics for molecular subtyping and MSI prediction on the TCGA-UCEC and Jiangmen Central Hospital datasets.

[0071] Experiment 5: Overall performance comparison between the method of this invention and existing technologies. To comprehensively verify the superiority of the method of this invention over existing methods, this experiment compared the complete technical solution proposed in Example 1 with several representative weakly supervised multi-instance learning models, including ABMIL, AMD_MIL, AEM_MIL, CLAM, TransMIL, DyHG, WIKG, and DG, on two tasks: molecular typing and MSI prediction. Each comparison model was trained and evaluated using its originally recommended default configuration and hyperparameters. The experimental results are shown in Tables 5 and 6.

[0072] Table 5. Comparison of molecular typing (four-class classification) performance

[0073] As shown in Table 5, in the molecular subtyping task, the method of this invention matched or surpassed all comparison models on all three core metrics on both the TCGA-UCEC and Jiangmen Central Hospital (JMZX) datasets. In particular, the AUC of the method of this invention reached 0.8431 and 0.8310 on the two datasets, respectively, exceeding the best-performing baseline model and demonstrating stable subtype discrimination ability.

[0074] Table 6. Comparison of microsatellite instability (MSI) prediction performance

[0075] As shown in Table 6, the method of this invention also demonstrates significant advantages in the MSI prediction task, achieving the best performance in accuracy, AUC, and F1 score. In particular, the AUC reached 0.7037, a substantial improvement compared to other methods. This fully demonstrates that the tumor region segmentation and cell-tissue multi-scale feature fusion scheme proposed in this invention can extract more discriminative tissue morphological features, contributing to more accurate molecular state inference.

[0076] Example 3. Figure 2 is a schematic diagram comparing the attention heatmaps generated by the method of this invention with manual annotations by pathology experts. The left panel of Figure 2 is the original H&E-stained whole slide image, in which the area outlined in red is the actual tumor parenchyma manually marked by the pathologist according to clinical diagnostic criteria. The middle and right panels of Figure 2 show the attention heatmaps generated by the model of this invention during the prediction task. The heatmaps are displayed in pseudo-color: warm-toned areas such as red and yellow represent the regions assigned the highest weight by the model during decision-making, while cool-toned areas such as blue represent regions that the model considers to contribute less to the diagnostic task. The comparison shows that the high-attention areas of the model overlap to a very high degree with the tumor areas marked by pathologists. Outside the red outlines marked by the experts, i.e., in the normal glands, stroma, and blank background, the heatmap is predominantly cool blue, indicating that the tumor-region pre-focusing mechanism of this scheme effectively eliminates interference from irrelevant background noise.

[0077] Further analysis revealed that the warm-toned centers the model focuses on are primarily concentrated in the tumor core region, characterized by abnormal nuclear morphology, dense arrangement, and significant tissue heterogeneity. This aligns closely with the medical logic that molecular subtyping of endometrial cancer relies on subtle regional heterogeneity within tumor tissue. This visualization intuitively demonstrates that the proposed "cell-guided cross-attention" and "adaptive gating fusion" mechanisms enable the model to capture histological cues closely related to subtype prediction, thus providing strong clinical interpretability support for the final classification results.

[0078] Example 4. Based on the same inventive concept, this embodiment provides an endometrial cancer prediction device based on cell-level feature fusion. This device corresponds strictly to the aforementioned method embodiment and includes the following modules. The tumor region extraction module is used to acquire a pathological whole-slide image and to segment the tumor region from the whole-slide image.

[0079] The dual-branch feature extraction module is used to extract cell-level graph features and tissue-level deep features separately within the tumor region.

[0080] The adaptive fusion module is used to perform cross-attention processing on tissue-level deep features using cell-level graph features as queries, generate cell-guided tissue representations, and adaptively fuse the cell-guided tissue representations with the global tissue representations obtained by attention pooling of tissue-level deep features to obtain a joint representation.

[0081] The multi-task prediction module is used to simultaneously output the molecular subtyping prediction result and the microsatellite instability prediction result for endometrial cancer based on the joint representation.

[0082] The specific functions performed by each module strictly correspond to steps S1 to S4 in the method embodiment. The implementation details are detailed in the method embodiment and will not be repeated here.
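The computation performed by the adaptive fusion module (step S3) can be illustrated with a minimal sketch. Several details are assumptions, because the text specifies only a similarity-based cross-attention and a learnable gating coefficient: scaled dot-product similarity with a softmax, a sigmoid gate computed from both representations, concatenation as the way the fused features are "combined" with the cell-level graph features, and a plain mean as a stand-in for the attention-pooled global tissue representation. All function names and the toy dimensions are hypothetical (the experiments above report d = 128 as the best shared dimension).

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(cell_feat, tissue_feats, Wq, Wk, Wv):
    """Use the cell-level feature as the query over tissue keys and values."""
    q = matvec(Wq, cell_feat)
    keys = [matvec(Wk, t) for t in tissue_feats]
    values = [matvec(Wv, t) for t in tissue_feats]
    d = len(q)
    # Assumption: scaled dot-product similarity followed by a softmax.
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    weights = softmax(scores)
    guided = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]
    return guided, weights


def gated_fusion(guided, global_repr, gate_w, gate_b):
    """Weighted sum of the two representations with a scalar gate in (0, 1)."""
    # Assumption: the gate is a sigmoid of a linear function of both inputs.
    z = sum(w * x for w, x in zip(gate_w, guided + global_repr)) + gate_b
    g = 1.0 / (1.0 + math.exp(-z))
    fused = [g * a + (1.0 - g) * b for a, b in zip(guided, global_repr)]
    return fused, g


def rand_mat(rng, rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_in, d = 6, 4  # toy sizes chosen for readability
Wq, Wk, Wv = rand_mat(rng, d, d_in), rand_mat(rng, d, d_in), rand_mat(rng, d, d_in)
cell_feat = [rng.uniform(-1, 1) for _ in range(d_in)]
tissue_feats = rand_mat(rng, 5, d_in)  # five tissue blocks

guided, weights = cross_attention(cell_feat, tissue_feats, Wq, Wk, Wv)
# Stand-in for attention pooling: the mean of the projected tissue features.
values = [matvec(Wv, t) for t in tissue_feats]
global_repr = [sum(v[j] for v in values) / len(values) for j in range(d)]
gate_w = [rng.uniform(-1, 1) for _ in range(2 * d)]
fused, g = gated_fusion(guided, global_repr, gate_w, 0.0)
joint = fused + cell_feat  # assumption: "combined" means concatenation
```

Because the gate is computed from the sample's own representations, each sample receives its own mixing weight, which is the sample-level adaptivity that the fusion strategy experiment attributes to learnable gating.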

[0083] Example 5. This embodiment also provides an electronic device. Referring to Figure 4, the electronic device includes a memory 404 and a processor 402, wherein the memory 404 stores a computer program and the processor 402 is configured to run the computer program to perform the steps in any of the above method embodiments.

[0084] Specifically, the processor 402 may include a central processing unit (CPU), or an application-specific integrated circuit (ASIC), or one or more integrated circuits that can be configured to implement embodiments of the present invention.

[0085] Memory 404 may include a mass storage device for data or instructions. For example, and not limitingly, memory 404 may include a hard disk drive (HDD), a floppy disk drive, a solid-state drive (SSD), flash memory, an optical disk drive, a magneto-optical disk drive, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, memory 404 may include removable or non-removable (or fixed) media. Where appropriate, memory 404 may be internal or external to a data processing device. In a particular embodiment, memory 404 is non-volatile memory. In a particular embodiment, memory 404 includes read-only memory (ROM) and random access memory (RAM). Where appropriate, the ROM may be a mask-programmed ROM, a programmable read-only memory (PROM), an erasable read-only memory (EPROM), an electrically erasable read-only memory (EEPROM), an electrically alterable read-only memory (EAROM), or flash memory, or a combination of two or more of these. Where appropriate, the RAM can be Static Random-Access Memory (SRAM) or Dynamic Random-Access Memory (DRAM). DRAM can be Fast Page Mode Dynamic Random-Access Memory (FPMDRAM), Extended Data Out Dynamic Random-Access Memory (EDODRAM), Synchronous Dynamic Random-Access Memory (SDRAM), etc.

[0086] The memory 404 can be used to store or cache various data files that need to be processed and / or communicated, as well as possible computer program instructions executed by the processor 402.

[0087] The processor 402 reads and executes computer program instructions stored in the memory 404 to implement any of the cell-level feature fusion-based endometrial cancer prediction methods in the above embodiments.

[0088] Optionally, the electronic device may further include a transmission device 406 and an input / output device 408, wherein the transmission device 406 is connected to the processor 402, and the input / output device 408 is connected to the processor 402.

[0089] The transmission device 406 can be used to receive or send data via a network. Specific examples of the network described above may include wired or wireless networks provided by the communication provider of the electronic device. In one example, the transmission device includes a Network Interface Controller (NIC), which can connect to other network devices via a base station to communicate with the Internet. In another example, the transmission device 406 may be a Radio Frequency (RF) module used for wireless communication with the Internet.

[0090] Input / output device 408 is used to input or output information.

[0091] Example 6. This embodiment also provides a readable storage medium storing a computer program, the computer program including program code for controlling a processor to execute a process, the process including the endometrial cancer prediction method based on cell-level feature fusion according to Example 1.

[0092] It should be noted that the specific examples in this embodiment can refer to the examples described in the above embodiments and optional implementations, and will not be repeated here.

[0093] Generally, various embodiments can be implemented in hardware or dedicated circuitry, software, logic, or any combination thereof. Some aspects of the invention can be implemented in hardware, while others can be implemented by firmware or software executed by a controller, microprocessor, or other computing device, but the invention is not limited thereto. Although various aspects of the invention may be shown and described as block diagrams, flowcharts, or using some other graphical representation, it should be understood that, by way of non-limiting example, these blocks, apparatuses, systems, techniques, or methods described herein can be implemented in hardware, software, firmware, dedicated circuitry or logic, general-purpose hardware or controllers or other computing devices, or some combination thereof.

[0094] Embodiments of the present invention can be implemented by computer software, which may be executable by a data processor of a mobile device, such as a processor entity, or by hardware, or by a combination of software and hardware. Computer software or programs (also referred to as program products), including software routines, applets, and/or macros, can be stored in any device-readable data storage medium, and they include program instructions for performing specific tasks. A computer program product may include one or more computer-executable components configured to perform embodiments when the program is run. One or more computer-executable components may be at least one piece of software code or a portion thereof. Additionally, it should be noted that any block in the logical flow of the figures may represent a program step, or interconnected logical circuitry, blocks and functions, or a combination of program steps and logical circuitry, blocks and functions. The software may be stored on physical media such as memory chips or blocks of storage implemented within a processor, magnetic media such as hard disks or floppy disks, and optical media such as DVDs and their data variants, CDs, etc. The physical medium is a non-transitory medium.

[0095] Those skilled in the art should understand that the technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments have been described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0096] The above embodiments are merely illustrative of several implementations of the present invention, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims

1. A method for predicting endometrial cancer based on cell-level feature fusion, characterized in that it comprises the following steps:
obtaining a pathological whole-slide image, and segmenting the whole-slide image to obtain a tumor region;
extracting cell-level graph features and tissue-level deep features, respectively, from the tumor region;
projecting the cell-level graph features into a query vector, and projecting each feature vector contained in the tissue-level deep features into a key vector and a value vector, respectively;
calculating cross-attention weights based on the similarity between the query vector and each of the key vectors, and computing a weighted sum of the value vectors using the cross-attention weights to obtain a cell-guided tissue representation;
adaptively fusing the cell-guided tissue representation and a global tissue representation by a weighted sum using a learnable gating coefficient to obtain fused features;
combining the fused features with the cell-level graph features to form a joint representation; and
simultaneously outputting, based on the joint representation, a molecular subtyping prediction result and a microsatellite instability prediction result for endometrial cancer.

2. The method according to claim 1, characterized in that segmenting the whole-slide image to obtain the tumor region includes:
dividing the whole-slide image into multiple non-overlapping image blocks;
performing tumor / non-tumor binary segmentation on each image block using a trained tumor segmentation model, and mapping the segmentation results back to the whole-slide image to generate a tumor region mask; and
taking only the region within the tumor region mask as the tumor region;
wherein the tumor segmentation model is built on a base model pre-trained on histopathological images.

3. The method according to claim 1, characterized in that extracting the cell-level graph features includes:
performing cell nucleus detection and instance segmentation within the tumor region, and extracting the morphological, texture, and spatial location features of each cell nucleus as initial node features;
constructing an undirected cell graph, with the cell nuclei as nodes, based on the spatial proximity between cell nuclei; and
performing message passing and feature aggregation on the undirected cell graph using a graph neural network to obtain a global cell representation, which serves as the cell-level graph features.

4. The method according to claim 3, characterized in that constructing the undirected cell graph based on the spatial proximity between cell nuclei includes:
constructing an initial topology using the K-nearest neighbor algorithm; and
for any two nodes, establishing a connecting edge between them only if the Euclidean distance between them in the image plane is less than a preset distance threshold, thereby completing the topology pruning.

5. The method according to claim 1, characterized in that extracting the tissue-level deep features includes:
dividing the tumor region into multiple tissue blocks of interest;
performing deep feature extraction on each of the tissue blocks of interest using a pre-trained base feature extraction model to obtain a tissue block feature set; and
performing attention pooling on the tissue block feature set to obtain the global tissue representation.

6. The method according to claim 1, characterized in that the adaptive fusion is performed within a defined shared fusion subspace, the dimension of which is 128.

7. The method according to claim 1, characterized in that simultaneously outputting the molecular subtyping prediction result and the microsatellite instability prediction result for endometrial cancer includes:
inputting the joint representation into a multi-task classifier;
outputting the molecular subtyping prediction result for endometrial cancer through a first classification branch, the result being a four-class result covering POLE mutant, mismatch repair deficient, p53 abnormal, and no specific molecular profile; and
outputting the microsatellite instability prediction result through a second classification branch, the result being a binary classification result covering microsatellite instability-high and microsatellite stable.

8. An apparatus for implementing the endometrial cancer prediction method based on cell-level feature fusion as described in any one of claims 1 to 7, characterized in that it includes:
a tumor region extraction module, used to acquire a pathological whole-slide image and segment the whole-slide image to obtain the tumor region;
a dual-branch feature extraction module, used to extract cell-level graph features and tissue-level deep features, respectively, within the tumor region;
an adaptive fusion module, used to perform cross-attention processing on the tissue-level deep features using the cell-level graph features as a query, generate a cell-guided tissue representation, and adaptively fuse the cell-guided tissue representation with the global tissue representation obtained by attention pooling of the tissue-level deep features to obtain a joint representation; and
a multi-task prediction module, used to simultaneously output the molecular subtyping prediction result and the microsatellite instability prediction result for endometrial cancer based on the joint representation.

9. An electronic device comprising a memory and a processor, characterized in that the memory stores a computer program, and the processor is configured to run the computer program to perform the endometrial cancer prediction method based on cell-level feature fusion as described in any one of claims 1 to 7.

10. A readable storage medium, characterized in that the readable storage medium stores a computer program, the computer program including program code for controlling a processor to execute a process, the process including the endometrial cancer prediction method based on cell-level feature fusion according to any one of claims 1 to 7.
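The cell graph construction recited in claim 4 can be illustrated with a minimal sketch: K-nearest-neighbor candidates are generated first, and an edge is kept only if its Euclidean length is below a distance threshold. The function name and the values of `k` and `max_dist` are hypothetical, since the claims do not state them, and the brute-force neighbor search is chosen for clarity rather than speed.

```python
import math


def build_cell_graph(centroids, k=5, max_dist=50.0):
    """Return undirected edges (i, j) with i < j between nucleus centroids.

    Step 1: each nucleus proposes edges to its k nearest neighbors.
    Step 2: proposals whose length is not below `max_dist` are pruned.
    Both parameter values are illustrative assumptions.
    """
    edges = set()
    for i, (xi, yi) in enumerate(centroids):
        neighbors = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        )
        for dist, j in neighbors[:k]:
            if dist < max_dist:
                # Store each undirected edge once, in canonical order.
                edges.add((min(i, j), max(i, j)))
    return edges


# Three nearby nuclei and one isolated nucleus far away.
toy_centroids = [(0, 0), (10, 0), (0, 10), (500, 500)]
toy_edges = build_cell_graph(toy_centroids, k=2, max_dist=50.0)
```

In this toy example the isolated nucleus still has K nearest neighbors, but all of its candidate edges exceed the threshold and are pruned, so it stays disconnected; this is the effect the distance-based pruning step is meant to achieve.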