A medical image analysis system based on a multi-granularity fuzzy classification network

By using a multi-granularity fuzzy classification network, the problems of noise interference and few-sample learning in pathological slide images are solved, achieving high-precision medical image classification in complex noise environments and improving the robustness and reliability of the model.

CN122244615A · Pending · Publication Date: 2026-06-19 · HARBIN INST OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HARBIN INST OF TECH
Filing Date
2026-03-23
Publication Date
2026-06-19


Abstract

This invention relates to a medical image analysis system based on a multi-granularity fuzzy classification network, belonging to the field of medical image processing technology. To address the error accumulation that noise interference in pathological slide images causes in the feature extraction stage of traditional classification models, the invention classifies medical images with a multi-granularity fuzzy classification network. The input features are enhanced and divided into K subsets, each corresponding to a feature of a specific granularity. Features at different abstraction levels are captured through three parallel sub-networks and, after weighted fusion, a multilayer perceptron produces T_output. Fuzzy logic classification is performed on these features, outputting the class probabilities and an estimate of the prediction uncertainty, and a fusion decision is made from the results of the multiple fuzzy classifiers. The fusion result is then concatenated with the global classification logits corresponding to the original input, and the final prediction is generated through the fusion layer.

Description

Technical Field

[0001] This invention belongs to the field of medical image processing technology and relates to a method for classifying pathological slide images.

Background Technology

[0002] Current medical image classification faces the following challenges:

[0003] 1. Image blurring and noise interference: Pathological slide images have problems such as staining differences, blurred tissue boundaries, and noise interference, which cause traditional classification models to accumulate errors in the feature extraction stage, significantly affecting the final diagnostic accuracy and increasing the risk of clinical misdiagnosis.

[0004] 2. Difficulty in learning with small samples: Medical image annotation data is scarce, and traditional deep learning models are prone to overfitting with limited samples, resulting in insufficient generalization ability of the model in actual clinical applications and difficulty in adapting to the differences in data distribution generated by different medical institutions and different acquisition devices.

[0005] 3. Lack of reliability assessment in classification decisions hinders clinical credibility: Existing classification models do not assess the confidence of their predictions, so clinicians cannot judge the reliability of the model output. This severely restricts the practical value of AI-assisted diagnostic systems in critical medical decisions. Traditional methods such as threshold segmentation and CNNs have limitations and cannot effectively handle fuzzy boundaries and uncertainties in medical images.

Summary of the Invention

[0006] This invention aims to address the error accumulation in the feature extraction stage of traditional classification models caused by noise interference in pathological slide images.

[0007] A medical image analysis system based on a multi-granularity fuzzy classification network includes:

[0008] Data acquisition and preprocessing unit: used to acquire pathological slide images and process them into data suitable for processing by multi-granularity fuzzy classification networks;

[0009] Fuzzy classification unit: Medical images are classified using a multi-granularity fuzzy classification network; the multi-granularity fuzzy classification network model includes: a feature enhancement module, a multi-scale feature fusion module, a fuzzy classifier, and a multi-granularity decision fusion module;

[0010] Feature enhancement module: enhances the input features; including: first applying a nonlinear transformation to the input feature T and obtaining an enhanced feature through a residual connection; then standardizing that feature with layer normalization to obtain a standardized feature; then, for the standardized feature, obtaining an attention weight A with an attention mechanism and multiplying A element-wise with the standardized feature to obtain the final enhanced feature;
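The enhancement steps in this paragraph (residual nonlinear transformation, layer normalization, attention gating) can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the weight matrices `w_t` and `w_a`, the tanh nonlinearity, and the sigmoid form of the attention are hypothetical choices, since the source does not specify them.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Standardize each row to zero mean and (approximately) unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def enhance(t, w_t, w_a):
    """Residual transform -> layer norm -> element-wise attention gating."""
    residual = t + np.tanh(t @ w_t)            # assumed nonlinear transform + residual connection
    normalized = layer_norm(residual)          # standardized feature
    attention = 1.0 / (1.0 + np.exp(-(normalized @ w_a)))  # assumed sigmoid attention weight A
    return attention * normalized              # element-wise multiplication

rng = np.random.default_rng(0)
t = rng.normal(size=(4, 8))                    # 4 feature vectors of dimension 8
w_t = rng.normal(scale=0.1, size=(8, 8))       # hypothetical transform weights
w_a = rng.normal(scale=0.1, size=(8, 8))       # hypothetical attention weights
enhanced = enhance(t, w_t, w_a)
```

Because the assumed attention weights lie strictly between 0 and 1, the gating can only attenuate components of the standardized feature, never amplify them.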

[0011] Multi-scale feature fusion module: extracts and fuses feature information at different scales from the enhanced features; including: uniformly dividing the feature vector into K subsets, where each subset corresponds to a feature of a specific granularity and to a feature-dimension subspace, the i-th subset is the feature of the i-th granularity, and K is the number of granularities; for each granularity feature, capturing features at different levels of abstraction through three parallel sub-networks; applying softmax normalization to the learnable fusion weight parameters w_raw to obtain the fusion weights; performing weighted fusion of the three sub-network outputs to obtain the fused feature, where ⊕ denotes the concatenation operation; and applying a nonlinear transformation to the fused feature with a multilayer perceptron to obtain T_output;
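A rough sketch of the splitting and weighted-fusion steps described here, in NumPy. The three branch outputs are simple stand-ins for the three parallel sub-networks, and fusing by concatenating the weighted branches is an assumption; the source only states that softmax-normalized weights from w_raw are used and that ⊕ denotes concatenation. The function names are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def split_granularities(features, k):
    """Uniformly split the feature dimension into k equal subsets."""
    return np.split(features, k, axis=-1)

def fuse_branches(branch_outputs, w_raw):
    """Weight each branch by softmax(w_raw), then concatenate (assumed fusion form)."""
    weights = softmax(w_raw)
    return np.concatenate([w * b for w, b in zip(weights, branch_outputs)], axis=-1)

features = np.arange(24, dtype=np.float64).reshape(2, 12)
subsets = split_granularities(features, k=3)                 # three blocks of shape (2, 4)
# Stand-ins for the three parallel sub-networks applied to the first granularity.
branches = [subsets[0], 2.0 * subsets[0], subsets[0] ** 2]
fused = fuse_branches(branches, w_raw=np.zeros(3))           # equal weights of 1/3
```

Note that `np.split` with an integer count raises an error when the feature dimension is not divisible by K, which matches the requirement of a uniform division.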

[0012] Fuzzy classifier: performs fuzzy logic classification on features, outputs class probabilities, and estimates prediction uncertainty; including: for the feature corresponding to each granularity, obtaining the classification result through a classification head built from multi-level nonlinear transformations; and, for the same feature, regressing the uncertainty u through an independent uncertainty estimation sub-network.

[0013] Multi-granularity decision fusion module: performs fusion decision based on multiple results obtained through fuzzy classifiers; includes:

[0014] The feature corresponding to each granularity is denoted T_m. For each granularity T_m, the corresponding independent fuzzy classifier outputs the class logits and the uncertainty u, denoted logits_m and u_m. Attention weights are calculated from the uncertainties of all granularities, where α_m is the attention weight of the m-th granularity, u_m is the uncertainty score of the m-th granularity, u_j is the uncertainty score of the j-th granularity, and m and j are both granularity indices; the output of each granularity is then obtained;

[0015] For the input feature tensor, the global classification logits, logits_global, are obtained through a global classification network; logits_global is concatenated with the outputs of all granularities, and the final prediction, i.e., the classification result of the medical image, is obtained from the concatenated result.

[0016] Furthermore, the data acquisition and preprocessing unit includes an adaptive downsampling module, which is used to adaptively downsample the pathological slide image. During the downsampling process, the downsampling rate is calculated based on the ratio of the maximum magnification of the data corresponding to the pathological slide image to the preset target magnification, so as to complete the downsampling.

[0017] Furthermore, the data acquisition and preprocessing unit also includes an adaptive block segmentation module, used to divide the downsampled image into a series of image blocks, including:

[0018] A201. For downsampled data, the entire slice image is meshed by sliding a window with a fixed step size according to the set tile size to generate image blocks, also known as tiles.

[0019] A202. Perform grayscale conversion and gradient field calculation on each tile to obtain the gradient magnitude. From the gradient magnitude, obtain the flat area ratio; from the flat area ratio, determine the effective tissue regions and retain the corresponding tiles.

[0020] Further, step A202 obtains the proportion of flat regions based on the gradient magnitude, and the process of determining the effective tissue region based on the proportion of flat regions includes:

[0021] Count the number of pixels in tile Q_i whose gradient magnitude is below the gradient magnitude threshold τ: P_flat = Σ_{x,y} I[M_i(x, y) < τ], where I[·] is an indicator function that has a value of 1 when the condition is true and 0 otherwise, and M_i(x, y) is the gradient magnitude at pixel (x, y);

[0022] Then calculate the flat area ratio R_flat = P_flat / (S × S), where S × S is the size of each tile; a tile Q_i whose R_flat is less than or equal to the flat area ratio threshold θ is considered an effective tissue region.

[0023] Furthermore, the data acquisition and preprocessing unit also includes a color normalization module for color normalizing image patches, including:

[0024] A301. Calculate the optical density OD based on the transmitted light intensity I and the maximum light intensity I_max, and remove background pixels based on the pixel OD.

[0025] A302. Perform principal component analysis on the OD values of the remaining pixels to extract the staining basis vectors H and E of the image;

[0026] A303. Project the OD values of tile Q_k onto the staining basis vectors to obtain the staining concentration matrix C_source;

[0027] A304. Determine the target standard staining basis vector H_target;

[0028] A305. Take the 99th percentile of the staining concentration matrix C_source, denoted max(C_source), and normalize as follows:

[0029] C_normalized = C_source × diag(max(C_target) / max(C_source))

[0030] Where diag(·) denotes the construction of a diagonal matrix; the division is element-wise over the two stain channels; C_normalized is the normalized concentration matrix; and max(C_target) is the 99th percentile of the target standard concentration distribution;

[0031] A306. Using the normalized concentration matrix C_normalized and the target standard staining basis vector H_target, reconstruct the normalized optical density values OD_normalized = C_normalized × H_target. The optical density values are then converted back to RGB space to obtain the reconstructed RGB image.

[0032] Furthermore, the data acquisition and preprocessing unit also includes a deep feature extraction module, which performs deep feature extraction on the color-normalized image tiles Q'_k to obtain data suitable for processing by the multi-granularity fuzzy classification network.

[0033] Furthermore, the data acquisition and preprocessing unit also includes a composite data augmentation module for generating multiple augmentation features. The generated multiple augmentation features are used for training the multi-granularity fuzzy classification network. The process of generating multiple augmentation features adopts a parallel augmentation method, which includes at least one of Gaussian noise injection, random feature masking, feature space interpolation, and adaptive scaling.

[0034] Furthermore, the structure of the global classification network is: linear layer + layer normalization + GELU activation + Dropout.

[0035] Furthermore, the process of fusion decision-making in the multi-granularity decision fusion module is replaced by the following:

[0036] The feature corresponding to each granularity is denoted T_m. For each granularity T_m, the corresponding independent fuzzy classifier outputs the class logits and the uncertainty u, denoted logits_m and u_m. Attention weights are calculated from the uncertainties of all granularities, where α_m is the attention weight of the m-th granularity, u_m is the uncertainty score of the m-th granularity, u_j is the uncertainty score of the j-th granularity, and m and j are both granularity indices; the outputs of all granularities are then fused by weighting, where logits_m is the classification logits of the m-th granularity and the result is the weighted fused logits;

[0037] For the input feature tensor, the global classification logits, logits_global, are obtained through a global classification network; logits_global and the weighted fused logits are concatenated, and the final prediction, i.e., the classification result of the medical image, is obtained from the concatenated result.
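The uncertainty-guided fusion can be illustrated with a short NumPy sketch. The exact attention formula is not legible in the source, so the softmax over negated uncertainties used below is an assumption chosen only because it gives lower weights to more uncertain granularities; the weighted sum, the function names, and the example numbers are likewise hypothetical.

```python
import numpy as np

def uncertainty_attention(u):
    """Assumed form: softmax over -u, so lower uncertainty gives a larger weight."""
    scores = -np.asarray(u, dtype=np.float64)
    e = np.exp(scores - scores.max())
    return e / e.sum()

def fuse_decisions(granular_logits, u, logits_global):
    """Weight per-granularity logits by attention, then concatenate with global logits."""
    alpha = uncertainty_attention(u)                                 # shape (M,)
    logits_fused = (alpha[:, None] * granular_logits).sum(axis=0)    # shape (C,)
    return np.concatenate([logits_global, logits_fused])             # input to the fusion layer

# Three granularities, four classes (e.g. AJCC stages I to IV); values are made up.
granular_logits = np.array([[2.0, 0.1, 0.0, -1.0],
                            [0.5, 0.4, 0.3, 0.2],
                            [1.5, 0.0, 0.2, -0.5]])
u = np.array([0.1, 2.0, 0.3])          # the second granularity is the most uncertain
logits_global = np.array([1.0, 0.0, 0.0, -0.2])

alpha = uncertainty_attention(u)
stitched = fuse_decisions(granular_logits, u, logits_global)
```

In this example the second granularity, with the highest uncertainty, receives the smallest attention weight, which is the behavior the description attributes to the mechanism under noise.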

[0038] Furthermore, the multi-granularity fuzzy classification network model also includes an ensemble learning module for integrating P independent multi-granularity decision fusion modules to perform ensemble classification; including:

[0039] The outputs of the P multi-granularity decision fusion modules are averaged, where the p-th output is the final prediction of the p-th independent multi-granularity decision fusion module;

[0040] The averaged output is taken as the final prediction result, i.e., the classification result of the medical image.

[0041] Beneficial effects:

[0042] This invention divides a high-dimensional feature space into multiple complementary granular subspaces through multi-granularity dynamic weighted fusion, and introduces an attention mechanism guided by "uncertainty". The network can automatically assess the impact of noise on each granular feature and dynamically reduce the weight of granularities heavily contaminated by noise, while increasing the contribution of clear and reliable granularities. This structured adaptive fusion mechanism allows the system to remain robust and accurate in its decisions even under complex, unevenly distributed noise.

Description of the Drawings

[0043] Figure 1 is the original pathological slide image.

[0044] Figure 2 is an image of the effective region of the original pathological slide image.

[0045] Figures 3 to 5 show the training results of three independent models.

[0046] Figure 6 shows the overall training results of the model.

[0047] Figure 7 compares the robustness of different models under three noise scenarios.

[0048] Figure 8 compares the uncertainty of different models when noise is added to an image.

[0049] Figure 9 shows the ROC curves under Gaussian noise.

[0050] Figure 10 shows the ROC curves under burst noise.

[0051] Figure 11 shows the ROC curves under a systematic offset.

Detailed Implementation

[0052] This invention proposes a fuzzy classification network technology based on uncertainty-weighted multi-granularity decision fusion, which solves key problems in medical image classification through the following core innovations:

[0053] (1) To address the issue of feature reliability differences under noise interference, an innovative inter-granular attention mechanism based on uncertainty is proposed: the uncertainty score is independently predicted for each feature granularity, which serves as the basis for attention weights, thereby automatically reducing the decision weights of high uncertainty granularities under noise interference and improving the robustness of the model in noisy environments.

[0054] (2) In response to the quantitative needs of fuzzy boundaries and diagnostic uncertainty in medical images, the membership function of the model is deeply coupled with the feature granularity, rather than a simple post-processing module. A membership function system oriented towards multi-granularity features is designed, and each feature granularity is equipped with an independent membership function to form a multi-expert decision system. This extends the single fuzzy classification to multi-granularity fuzzy reasoning, which better expresses the uncertainty of different regions and features in medical images.

[0055] (3) To address the overfitting problem under small sample conditions, a multi-scale feature fusion and fuzzy regularization collaborative optimization strategy is proposed: multi-scale perception is introduced in the feature enhancement stage, fuzzy entropy constraint is added to the loss function, feature utilization under small sample conditions is improved through multi-scale feature mining, and overfitting is prevented through fuzzy regularization.

[0056] The following detailed description is provided in conjunction with specific embodiments.

Embodiment 1:

[0058] This embodiment is a medical image analysis system based on a multi-granularity fuzzy classification network, including:

[0059] I. Data Preprocessing Unit:

[0060] Download whole-slide images (WSI) of a single cancer type (such as cervical cancer or lung cancer) from open-source datasets such as TCGA. The images are produced by professional pathology slide scanners such as the Aperio AT2 / CS2 and Leica scanners, stored in .svs format, with a resolution of up to 40x magnification and a single image size of up to 100,000 x 100,000 pixels.

[0061] Data preprocessing unit: converts the input whole-slide image (WSI) files in .svs format into normalized feature vectors (.npy format). Each .npy file is a floating-point matrix of shape (n, 2048), representing the features of all effective regions of a slide. The output is a feature-enhanced numerical dataset that can be used directly to train a classification network.

[0062] n = number of effective tiles = number of tissue regions with diagnostic value, which is also the number of rows in the NumPy file matrix obtained from the data preprocessing unit.

[0063] A pathological slide image has a tissue-rich "foreground" area and a blank or irrelevant "background" area; Figure 1 is the original pathological slide image and Figure 2 is the effective region. How the algorithm selects the effective region is detailed in the steps below.

[0064] A label file records the AJCC stage label (I, II, III, or IV) corresponding to each .npy file.

[0065] The data preprocessing workflow consists of five main steps: adaptive downsampling → adaptive block division → color normalization → ResNet-50 feature extraction → composite data augmentation (used only during training).

[0066] A1. Adaptive Downsampling:

[0067] WSI files are extremely large (typically several gigabytes), making it impossible to send the entire file over the network for processing. It is necessary to significantly reduce the data volume while preserving as much tissue morphology information as possible. The specific processing steps in this process include:

[0068] A101. Use the tifffile library to read .svs files and obtain their multi-resolution pyramid structure.

[0069] A102. Calculate the downsampling rate (downsampling=mag_max / mag_selected) based on the preset target magnification factor (e.g., mag_selected=20) and the maximum magnification factor of WSI itself (e.g., mag_max=40).

[0070] By selecting the level in the pyramid that best matches the target magnification, downsampling is naturally achieved, avoiding computationally intensive manual scaling operations.
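A small sketch of the rate calculation in A102 and the level selection described above. It assumes the downsample factor of each pyramid level is already known as a plain list (how it is read from the .svs file is left out), and the helper name `select_pyramid_level` and the log-distance tie-break are illustrative choices, not part of the source.

```python
import math

def select_pyramid_level(level_downsamples, mag_max=40.0, mag_selected=20.0):
    """Return (downsampling rate, index of the pyramid level closest to that rate)."""
    rate = mag_max / mag_selected
    # Compare on a log scale so that 2x too coarse and 2x too fine count equally;
    # on a tie, min() keeps the earlier (higher-resolution) level.
    level = min(range(len(level_downsamples)),
                key=lambda i: abs(math.log2(level_downsamples[i] / rate)))
    return rate, level

# Hypothetical pyramid with 1x, 2x and 4x levels, scanned at 40x, target 20x.
rate, level = select_pyramid_level([1.0, 2.0, 4.0])
```

When no level matches the target exactly, the chosen level would still need an explicit resize to reach the target magnification.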

[0071] Output: Downsampled .svs file, i.e., .svs file with reduced pixel size.

[0072] A2. Adaptive Tiling:

[0073] The downsampled WSI image is divided into a series of smaller, fixed-size image tiles for subsequent tile-by-tile processing and analysis. The specific processing steps in this step include:

[0074] A201. For the downsampled .svs file, the whole-slide image is divided into a grid using a sliding window with a fixed step size, according to the set tile size (e.g., tile_size=512 pixels), generating thousands of image blocks, also known as tiles. The main purpose of tiling is to enable large WSIs to be processed by deep learning models or image processing algorithms.

[0075] More specifically, the downsampled whole-slide pathological image is divided into rectangular image blocks of fixed size according to a set spatial grid, denoted as tiles Q_i, where i∈{1,2,...,N} and N is the total number of tiles. The size of each tile is S×S pixels, and in this embodiment, S=512.
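The sliding-window grid can be sketched as a list of tile origins. The function name is hypothetical, the step is assumed equal to the tile size (non-overlapping tiles), and partial tiles at the right and bottom edges are simply dropped, which the source does not specify.

```python
def tile_origins(width, height, tile_size=512, step=512):
    """Top-left (x, y) coordinates of every full tile in a width x height image."""
    xs = range(0, width - tile_size + 1, step)
    ys = range(0, height - tile_size + 1, step)
    return [(x, y) for y in ys for x in xs]

# A 2048 x 1536 image yields a 4 x 3 grid of 512 x 512 tiles.
origins = tile_origins(width=2048, height=1536)
```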

[0076] A202. Distinguish between the tissue "foreground" area and the blank or irrelevant "background" area in each tile.

[0077] (1) Grayscale conversion:

[0078] Convert each tile Q_i from RGB color space to grayscale to obtain the grayscale image G_i. The conversion uses the standard luma weighting of ITU-R BT.601: G_i = 0.299·R + 0.587·G + 0.114·B, where R, G, and B are the pixel values of tile Q_i in the red, green, and blue channels.

[0079] (2) Gradient field calculation:

[0080] Apply the Sobel edge detection operator to the grayscale image G_i to compute its approximate first-order partial derivatives in the horizontal (x-axis) and vertical (y-axis) directions:

[0081] ∇_x G_i = Sobel_x * G_i

[0082] ∇_y G_i = Sobel_y * G_i

[0083] where * denotes a two-dimensional convolution, and Sobel_x and Sobel_y are the Sobel convolution kernels for the horizontal and vertical directions, respectively: Sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], Sobel_y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]].

[0084] (3) Gradient magnitude calculation:

[0085] From the gradient components, the gradient magnitude M_i(x, y) at each pixel (x, y) is calculated:

[0086] M_i(x, y) = |∇_x G_i(x, y)| + |∇_y G_i(x, y)|

[0087] Here, |·| represents the absolute value operation. This formula strikes a balance between computational efficiency and accuracy, avoiding the computational overhead of square and square root operations.

[0088] (4) Texture richness quantification:

[0089] Define a gradient magnitude threshold τ to distinguish regions with significant texture from flat regions. Count the number of pixels in tile Q_i, P_flat, whose gradient magnitude is below the threshold τ:

[0090] P_flat = Σ_{x,y} I[M_i(x, y) < τ]

[0091] Here, I[·] is an indicator function, which has a value of 1 when the condition is true and 0 otherwise. It is used to traverse every pixel position in the image to mark which pixels belong to the "flat region".

[0092] In this embodiment, the gradient amplitude threshold τ is set to 15 (to suit most H&E stained pathological images).

[0093] Then calculate the flat area ratio R_flat:

[0094] R_flat = P_flat / (S × S)

[0095] A tissue region evaluation function computes the Sobel edge gradients in the x and y directions of each tile's grayscale image and the resulting gradient magnitudes, and then the proportion of pixels whose gradient magnitude is below the threshold (i.e., flat regions). If the proportion of flat regions is too high, the tile is judged to be tissue-free background and discarded; otherwise the tile is retained. The specific implementation is as follows:

[0096] Set a flat region ratio threshold θ. In this embodiment, θ is set to 0.5. Compare R_flat with θ:

[0097] If R_flat > θ, tile Q_i is judged to be an invalid background area and is excluded.

[0098] If R_flat ≤ θ, tile Q_i is judged to be an effective tissue region and is retained.

[0099] Ultimately, only tiles judged to contain effective tissue are retained, which significantly reduces the computational burden of subsequent processing. The output is the set of tiles {Q_k} determined to contain effective tissue regions, where k∈{1,2,...,M}, M is the total number of effective tissue tiles, and M≤N.
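The tissue test of A202 can be sketched in NumPy as below, using the grayscale weights, Sobel kernels, τ = 15 and θ = 0.5 given in this embodiment. Edge-replicating padding at the tile border and the helper names are assumptions. The filter is applied as a cross-correlation; for Sobel kernels this differs from true convolution only in sign, which the absolute value removes.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)

def filter3x3(image, kernel):
    """Apply a 3x3 kernel with edge-replicating padding (output has the input's shape)."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def flat_ratio(tile_rgb, tau=15.0):
    """R_flat: fraction of pixels whose gradient magnitude is below tau."""
    rgb = tile_rgb.astype(np.float64)
    gray = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    magnitude = np.abs(filter3x3(gray, SOBEL_X)) + np.abs(filter3x3(gray, SOBEL_Y))
    p_flat = np.count_nonzero(magnitude < tau)
    return p_flat / magnitude.size

def is_tissue(tile_rgb, tau=15.0, theta=0.5):
    """Keep the tile when the flat area ratio does not exceed theta."""
    return flat_ratio(tile_rgb, tau) <= theta
```

A uniformly white tile has zero gradient everywhere, so its flat ratio is 1 and it is discarded as background, while a strongly textured tile passes the test.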

[0100] A3. Color Normalization:

[0101] Eliminating color differences between slides caused by factors such as staining reagents, scanning equipment, and operating procedures allows the model to focus on tissue structure features rather than color variations, significantly improving the model's generalization ability.

[0102] For each rectangular image patch Q_k of fixed size (e.g., 512×512 pixels) divided from a whole-slide image, where k∈{1,2,...,M}, the following processing is performed:

[0103] A301. Calculate optical density (OD):

[0104] Optical density is a physical quantity describing the light absorption characteristics of a dye. It is defined as the logarithm of the ratio of incident light intensity I_0 to transmitted light intensity I, and is approximately calculated in digital images as follows:

[0105] OD = -log10(I / I_max)

[0106] Where I_max is the maximum light intensity (corresponding to a pixel value of 255).

[0107] Remove pixels with low OD values (OD < 0.15), which are most likely background.

[0108] A302. Determine the coloring basis vectors:

[0109] A staining basis vector is a unit vector representing the color direction of a staining agent in optical density space. Principal component analysis (PCA) is performed on the OD values ​​of the remaining pixels to extract the first two principal components with clear physical meaning as the staining basis vectors of the source image. For H&E staining, there are two main staining basis vectors: the hematoxylin vector H and the eosin vector E.

[0110] A303. Determine the concentration matrix C_source:

[0111] For the pixels retained after A301, the matrix of projection coordinates of each pixel of the tile onto the staining basis vectors represents the relative concentration of each stain at that pixel. Projecting the OD values of tile Q_k onto the staining basis vectors yields the staining concentration matrix C_source.

[0112] For H&E staining, the OD value of each pixel is decomposed along the two basis vectors H and E to obtain the concentration matrix C_source. It reflects the relative amount of each stain at each pixel of the image and separates color information from structural information.

[0113] In this embodiment, the tile size is 512×512, so the dimension of the concentration matrix C_source is (512×512)×2.

[0114] A304. Determine the target standard staining basis vector H_target:

[0115] A predefined standardized staining color direction is used, and the target standardized staining basis vector H_target is obtained through statistical analysis of a large number of standard staining samples. In this embodiment, the target standardized staining basis vector H_target is fixed as follows:

[0116] H_target =[[0.5626, 0.7201, 0.4062],

[0117] [0.2159, 0.8012, 0.5581]]

[0118] This matrix (of shape (2,3), with the two rows corresponding to hematoxylin and eosin respectively) corresponds to the basis vectors of standard H&E staining in optical density space.

[0119] A305. Target standard concentration distribution:

[0120] Take the 99th percentile of the source image staining concentration matrix C_source, denoted max(C_source), and normalize:

[0121] C_normalized = C_source × diag(max(C_target) / max(C_source))

[0122] Where diag(·) denotes the construction of a diagonal matrix, the division is element-wise over the two stain channels, and C_normalized is the normalized concentration matrix;

[0123] max(C_target) is the 99th percentile of the target standard concentration distribution; the preferred values are [1.9705, 1.0308].

[0124] A306, Reconstructed Image:

[0125] Using the normalized concentration matrix C_normalized and the target standard staining basis vector H_target, reconstruct the RGB image. First reconstruct the normalized optical density values:

[0126] OD_normalized = C_normalized × H_target

[0127] Then convert the optical density values back to RGB space:

[0128] I_normalized = I_max × 10^(-OD_normalized)

[0129] Each tile is normalized with the Macenko normalization method described above to obtain the color-normalized image patch Q'_k.
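Steps A301 and A303 to A306 can be sketched in NumPy as follows, using the H_target and max(C_target) values given in this embodiment. This is a simplified illustration rather than a full Macenko implementation: the source staining basis is passed in as an argument instead of being estimated by PCA (A302), background pixels are not removed, the projection is done by least squares, and intensities are clipped to at least 1 to avoid log(0). The function name is hypothetical.

```python
import numpy as np

H_TARGET = np.array([[0.5626, 0.7201, 0.4062],    # hematoxylin row
                     [0.2159, 0.8012, 0.5581]])   # eosin row
MAX_C_TARGET = np.array([1.9705, 1.0308])          # 99th percentiles of the target

def normalize_tile(tile_rgb, stain_basis, i_max=255.0):
    """Map a tile's stain concentrations onto the target stain appearance."""
    h, w, _ = tile_rgb.shape
    intensity = np.clip(tile_rgb.reshape(-1, 3).astype(np.float64), 1.0, i_max)
    od = -np.log10(intensity / i_max)                        # (h*w, 3) optical density
    # Least-squares projection: find C_source with C_source @ stain_basis ~= OD.
    c_source = np.linalg.lstsq(stain_basis.T, od.T, rcond=None)[0].T   # (h*w, 2)
    max_c_source = np.percentile(c_source, 99, axis=0)
    c_normalized = c_source @ np.diag(MAX_C_TARGET / max_c_source)
    od_normalized = c_normalized @ H_TARGET                  # (h*w, 3)
    rgb = i_max * np.power(10.0, -od_normalized)             # back to intensities
    return np.clip(rgb, 0, 255).astype(np.uint8).reshape(h, w, 3)
```

A tile with no stain at all would make max(C_source) zero, so in practice this step is only meaningful for tiles that already passed the tissue filter.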

[0130] A4. ResNet-50 Deep Feature Extraction:

[0131] The pixel-level information of each color-normalized image patch Q'_k is transformed into a higher-level, more abstract semantic feature vector, which serves as input to the subsequent classification network. The specific process includes:

[0132] A401. Use a ResNet-50 model pre-trained on ImageNet as the feature extractor.

[0133] A402. Remove its original classification head (fully connected layer), retain only the convolutional backbone network, and thereby convert the model into a "feature encoder".

[0134] A403. Apply the standard preprocessing (scaling to 224x224 and normalizing with the ImageNet mean and standard deviation) to each color-normalized image tile Q'_k before feeding it into the encoder.

[0135] A404. Extract the encoder output (a 2048-dimensional vector) as the final feature representation of the tile; the vectors of all tiles of a WSI together form matrix F.

[0136] A5. Feature-Level Data Augmentation Module:

[0137] For each matrix F of shape (n, 2048) representing a single WSI, data augmentation is performed in the feature space rather than the pixel space to avoid the computational overhead of image-level augmentation. This also generates diverse and semantically consistent new samples, expanding the training dataset. The specific process includes:

[0138] A501. For each feature F, generate multiple enhanced features independently and in parallel:

[0139] a. Gaussian Noise Injection:

[0140] F_noise = F + ϵ, ϵ ~ N(0, σ²I)

[0141] Where F_noise is the noise-augmented feature matrix, F is the feature matrix extracted by ResNet-50, ϵ is a Gaussian noise matrix with a mean of 0 and a standard deviation of σ, with σ = 0.2, and I is the identity matrix.

[0142] Gaussian noise injection is used to simulate sensor noise during image acquisition, enhancing the model's robustness to small perturbations.

[0143] b. Random Feature Masking:

[0144] F_mask = F ⊙ M

[0145] Where F_mask is the enhanced feature matrix; M∈{0,1}^(n×2048) is a mask matrix whose elements independently follow a Bernoulli distribution: P(M_ij = 0) = 0.4, P(M_ij = 1) = 0.6; and ⊙ denotes the Hadamard product (element-wise multiplication).

[0146] After this enhancement, each element becomes 0 with probability 0.4; this simulates missing or occluded features, forcing the model not to rely on a few key features and improving the coverage of the feature distribution.

[0147] c. Feature Space Interpolation: Randomly select two samples F_i and F_j from the same batch of features extracted by ResNet-50, and generate a new sample F_interp by linear interpolation with weight λ:

[0148] F_interp = λ·F_i + (1 - λ)·F_j, λ ~ U(0.3, 0.7)

[0149] Where U denotes the uniform distribution and λ is a random interpolation coefficient, uniformly distributed between 0.3 and 0.7.

[0150] By randomly selecting two feature vectors in the matrix and replacing the original vector with their linear interpolation, an "intermediate state" between the original samples is generated in the feature space to simulate the continuous morphological changes of pathological tissues.

[0151] d. Adaptive Scaling:

[0152] F_scale = F ⊙ s, s ~ U(0.8, 1.2)

[0153] Where ⊙ denotes element-wise multiplication and s is a scaling vector; each feature dimension is independently multiplied by a uniformly distributed scaling factor.

[0154] Multiplying the original matrix F by a scaling vector sampled from U(0.8, 1.2) simulates changes in staining concentration or light intensity, enhancing the model's adaptability to intensity changes.

[0155] For each feature F from ResNet-50, four enhanced versions are generated, each with the same shape (n, 2048). Together they form the output of the composite data augmentation module and expand the dataset.

[0156] A502, for F and its four corresponding augmented features, the final data volume becomes 5 times that of the original feature F (1 original file + 4 augmented files). The composite data augmentation module is needed in the laboratory to increase the dataset before network training. In practical applications, if the dataset is sufficient, this module is not necessary.

[0157] For each input file F, four different enhanced versions of features are generated simultaneously. Including the F itself directly output by ResNet50, there are a total of five feature files. For example, there were originally 200 matrices (npy files) from WSI, which become 1000 matrices after the feature enhancement module.
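The four feature-space augmentations can be sketched in NumPy with the parameters given above (σ = 0.2, masking probability 0.4, λ in [0.3, 0.7], scaling in [0.8, 1.2]). The function name is hypothetical, and interpolating each row with a randomly permuted partner row, with one λ per row, is one assumed reading of the interpolation step.

```python
import numpy as np

def augment(f, rng, sigma=0.2, drop_prob=0.4,
            lam_range=(0.3, 0.7), scale_range=(0.8, 1.2)):
    """Return four augmented copies of the (n, d) feature matrix f."""
    n, d = f.shape
    # a. Gaussian noise injection with mean 0 and standard deviation sigma.
    noisy = f + rng.normal(0.0, sigma, size=f.shape)
    # b. Random feature masking: each element is zeroed with probability drop_prob.
    mask = (rng.random(f.shape) >= drop_prob).astype(f.dtype)
    masked = f * mask
    # c. Feature space interpolation between each row and a random partner row.
    lam = rng.uniform(*lam_range, size=(n, 1))
    partner = f[rng.permutation(n)]
    interpolated = lam * f + (1.0 - lam) * partner
    # d. Adaptive scaling: one uniform factor per feature dimension.
    scale = rng.uniform(*scale_range, size=(1, d))
    scaled = f * scale
    return noisy, masked, interpolated, scaled

rng = np.random.default_rng(0)
f = rng.random((50, 2048)) + 0.5           # stand-in for one WSI feature matrix
noisy, masked, interpolated, scaled = augment(f, rng)
```

Together with the original F, the four outputs give the five feature files per slide described above.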

[0158] Based on the clinical files available on TCGA, the UUIDs of the pathological slide images are matched with their AJCC pathological stage, dividing the images into four stage categories (Roman numerals I to IV), which are placed into four folders. The feature files are then renamed with random UUIDs. This is how labels are assigned to the previously extracted features.

[0159] The AJCC stage is obtained from the clinical files of these WSIs, downloaded from TCGA (an open-source biological database in the United States). Each file gives the pathological stage of the patient's tumor, I, II, III, or IV, representing progressively more advanced disease.

[0160] It should be noted that during training, a composite data augmentation module is needed to obtain multiple files to increase the number of samples and improve model training performance. In actual use, however, the composite data augmentation module is not required; the obtained features F can be directly fed into the subsequent multi-granularity fuzzy classification network for processing.

[0161] II. Fuzzy Classification Networks:

[0162] The multi-granularity fuzzy classification network model of this invention comprises five core modules: a feature enhancement module, a multi-scale feature fusion module, a fuzzy classifier, a multi-granularity decision fusion module, and an ensemble learning module. It achieves high-precision classification and uncertainty quantification of medical images through a hierarchical processing mechanism. The overall architecture adopts an end-to-end design, with each module undertaking a specific function from input features to the final classification decision, working collaboratively to address the fuzziness, noise interference, and small sample size issues in medical images.

[0163] The feature enhancement module is used to suppress noise and enhance key features of the input medical image features.

[0164] The multi-scale feature fusion module is used to extract and fuse feature information at different scales.

[0165] The fuzzy classifier is used to perform fuzzy logic classification based on the enhanced features and to quantify the prediction uncertainty.

[0166] The multi-granularity decision fusion module is used to divide features into multiple granularities and perform weighted decision fusion.

[0167] The ensemble learning module is used to improve the robustness of the final prediction by averaging the output of multiple independent models.

[0168] The input to the multi-granularity fuzzy classification network model is a floating-point tensor F of shape (n, 2048) output by the data preprocessing unit. The network is configured with two outputs, thus outputting two tensors:

[0169] A tensor of shape (n, 4) holding the raw logits of each sample for the four AJCC stages (I, II, III, IV).

[0170] A tensor of shape (n, 1) holding the confidence score predicted for each sample, in the range [0, 1].

[0171] During training, a batch of WSI features is input, for example a batch of 200 WSIs. After the data preprocessing unit, the training data size becomes 1000 (the original feature F of each WSI plus its augmented versions), so the batch contains 1000 features.

[0172] The specific structure and processing procedure of the multi-granularity fuzzy classification network model are as follows:

[0173] The input is a tensor F of shape (n, 2048). Since the original and augmented features are all treated as inputs and processed separately, each of them is denoted T below.

[0174] B1. Feature Enhancement Module: Enhances the input features, improves feature representation capabilities, suppresses noise interference, and provides robust feature input for subsequent classification.

[0175] B101. Residual Feature Transformation:

[0176] The input is processed using a residual connection. The input passes through a sub-network of Linear(2048->256)->LayerNorm->GELU->Linear(256->2048). The output is added to the original input T, as shown below:

[0177] $T_{\text{res}} = T + f_{\text{trans}}(T)$

[0178] Where T is the input feature, $f_{\text{trans}}(\cdot)$ is a nonlinear transformation layer (including linear layers, LayerNorm, and GELU activation), and $T_{\text{res}}$ is the enhanced feature.

[0179] This process is used to learn feature residuals through the transformation layer, thereby enhancing the feature representation capability.

[0180] B102, Feature Standardization:

[0181] Layer normalization (LayerNorm) is used to standardize the features and stabilize the training process. The formula is as follows:

[0182] $T_{\text{norm}} = \gamma \odot \dfrac{T_{\text{res}} - \mu}{\sigma} + \beta$

[0183] Where μ and σ are the mean and standard deviation of the feature $T_{\text{res}}$, and γ and β are learnable parameters.

[0184] This process is used to reduce internal covariate shift and accelerate model convergence.

[0185] B103, Feature Attention Mechanism:

[0186] First, the features are compressed to a low-dimensional space (reduced to 1 / 4 of the original dimension), then the original dimension is restored, and a sigmoid-activated attention map is output. This attention map is then multiplied element-wise with the features to obtain the enhanced features.

[0187] In specific processing, the normalized features are passed through a sub-network of Linear(2048->512)->Tanh->Linear(512->2048)->Sigmoid to obtain the attention weights (between 0 and 1) for each dimension, represented as:

[0188] $A = \sigma(W_2 \tanh(W_1 T_{\text{norm}})), \quad T_{\text{enhanced}} = T_{\text{norm}} \odot A$

[0189] Where $W_1$ and $W_2$ are the weights of the linear layers, σ is the Sigmoid function, which outputs weights ranging from 0 to 1, and ⊙ represents element-wise multiplication.

[0190] This yields the enhanced feature $T_{\text{enhanced}}$, a tensor of shape (n, 2048).

[0191] This processing can adaptively focus on key feature dimensions and suppress irrelevant or noisy features.
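The three steps of the feature enhancement module (B101 to B103) can be sketched as a single forward pass. This is an illustrative NumPy sketch with randomly initialized weights, not the trained network; the helper names (`gelu`, `layer_norm`, `init`, `feature_enhancement`), the tanh approximation of GELU, the initialization scale, and the omission of the learnable γ and β are all assumptions made for brevity.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-5):
    # Normalizes each row; learnable gamma/beta are left at 1 and 0 here.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init(rng, d_in, d_out):
    # Hypothetical initialization: small random weights, zero bias.
    return rng.standard_normal((d_in, d_out)) * 0.02, np.zeros(d_out)

def feature_enhancement(T, p):
    # B101: Linear(2048->256) -> LayerNorm -> GELU -> Linear(256->2048), plus residual
    w1, b1 = p["res1"]
    w2, b2 = p["res2"]
    T_res = T + gelu(layer_norm(T @ w1 + b1)) @ w2 + b2
    # B102: layer normalization
    T_norm = layer_norm(T_res)
    # B103: Linear(2048->512) -> Tanh -> Linear(512->2048) -> Sigmoid
    w3, b3 = p["att1"]
    w4, b4 = p["att2"]
    A = sigmoid(np.tanh(T_norm @ w3 + b3) @ w4 + b4)
    return T_norm * A, A

rng = np.random.default_rng(0)
params = {
    "res1": init(rng, 2048, 256), "res2": init(rng, 256, 2048),
    "att1": init(rng, 2048, 512), "att2": init(rng, 512, 2048),
}
T = rng.standard_normal((4, 2048))
T_enhanced, A = feature_enhancement(T, params)
print(T_enhanced.shape, A.shape)
```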

[0192] B2. Multi-Scale Feature Fusion Module: For the enhanced feature tensor $T_{\text{enhanced}}$, extract and fuse feature information at different scales to capture a complete feature representation from local details to global context.

[0193] B201, Feature Granularity Division:

[0194] The feature vector $T_{\text{enhanced}}$ is uniformly divided into K subsets, each corresponding to a feature at a specific granularity and to a feature-dimension subspace: $T_{\text{enhanced}} = [T_1, T_2, \ldots, T_K]$, where $T_i$ is the feature of the i-th granularity and K is the number of granularities (a hyperparameter); in this embodiment, K = 4.

[0195] B202, Multi-scale Feature Extraction:

[0196] The design employs three independent paths: a shallow path (1 linear layer), a middle path (2 linear layers), and a deep path (3 linear layers). Each granularity feature $T_i$ is fed simultaneously into these three parallel sub-networks, each capturing features at a different level of abstraction. This is represented as:

[0197] $S_1 = W_{11} T_i + b_{11}$

[0198] $S_2 = W_{22}\,\text{GELU}(W_{21} T_i + b_{21}) + b_{22}$

[0199] $S_3 = W_{33}\,\text{GELU}(W_{32}\,\text{GELU}(W_{31} T_i + b_{31}) + b_{32}) + b_{33}$

[0200] Where $W_{jl}$ denotes the weight and $b_{jl}$ the bias of the l-th linear layer in the j-th path.

[0201] In the multi-scale feature extraction process, the shallow path preserves local texture, the middle path captures regional structure, and the deep path extracts semantic context.

[0202] B203, Adaptive Weight Fusion:

[0203] Learnable fusion weight parameters $w^{\text{raw}} = [w_1^{\text{raw}}, w_2^{\text{raw}}, w_3^{\text{raw}}]$ are introduced and normalized by softmax:

[0204] $w_k = \dfrac{\exp(w_k^{\text{raw}})}{\sum_{j=1}^{3} \exp(w_j^{\text{raw}})}, \quad k = 1, 2, 3$

[0205] The condition $w_1 + w_2 + w_3 = 1$ is therefore satisfied.

[0206] Then, weighted fusion is performed:

[0207] $T_{\text{fused}} = (w_1 S_1) \oplus (w_2 S_2) \oplus (w_3 S_3)$

[0208] Where ⊕ denotes the concatenation operation.

[0209] B204, Nonlinear Feature Fusion:

[0210] The concatenated features are then passed through a multilayer perceptron for a nonlinear transformation that enhances feature interaction, yielding $T_{\text{output}}$:

[0211]

[0212] Where $W_f$ is a learnable parameter that is updated automatically during model training to minimize the loss function, and $b_f$ is also a learnable parameter of the linear layer, serving as the bias of the fusion layer.

[0213] This allows for dynamic adjustment of contributions at each scale, avoiding the need for manual weighting.

[0214] The above process can be briefly represented as: simultaneously feeding the input tensor into three parallel subnetworks:

[0215] Scale 1: Linear(2048 / K->256).

[0216] Scale 2: Linear(2048 / K->256)->GELU->Linear(256->256).

[0217] Scale 3: Linear(2048 / K->256)->GELU->Linear(256->256)->GELU->Linear(256->256).

[0218] The three outputs (each of shape (n, 256)) are weighted by the learnable weights and concatenated to obtain a tensor of shape (n, 256 × 3 = 768). Finally, a Linear(768->256) layer is used for fusion. The output is the fused (n, 256) tensor $T_{\text{output}}$.
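The granularity division and the three-path fusion summarized above can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions: weights are random rather than trained, the same fusion parameters are reused for all K granularities for brevity (the text does not say whether they are shared), and the helper names (`mlp`, `multi_scale_fusion`, `init`) are hypothetical.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def init(rng, d_in, d_out):
    # Hypothetical initialization: small random weights, zero bias.
    return rng.standard_normal((d_in, d_out)) * 0.02, np.zeros(d_out)

def mlp(x, layers):
    """Apply linear layers with GELU between them (none after the last layer)."""
    for k, (w, b) in enumerate(layers):
        x = x @ w + b
        if k < len(layers) - 1:
            x = gelu(x)
    return x

def multi_scale_fusion(T_i, paths, w_raw, fusion):
    w = softmax(w_raw)                                   # w1 + w2 + w3 = 1
    outs = [w_k * mlp(T_i, layers) for w_k, layers in zip(w, paths)]
    concat = np.concatenate(outs, axis=1)                # (n, 768)
    w_f, b_f = fusion
    return concat @ w_f + b_f                            # (n, 256)

rng = np.random.default_rng(0)
K, d = 4, 2048 // 4                                      # 4 granularities of 512 dims
paths = [
    [init(rng, d, 256)],                                             # scale 1
    [init(rng, d, 256), init(rng, 256, 256)],                        # scale 2
    [init(rng, d, 256), init(rng, 256, 256), init(rng, 256, 256)],   # scale 3
]
w_raw = np.zeros(3)                                      # learnable in the real model
fusion = init(rng, 768, 256)

T_enhanced = rng.standard_normal((3, 2048))
granules = np.split(T_enhanced, K, axis=1)               # K tensors of shape (n, 512)
outputs = [multi_scale_fusion(T_i, paths, w_raw, fusion) for T_i in granules]
print(len(outputs), outputs[0].shape)
```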

[0219] B3. Construct an Enhanced Fuzzy Classifier. The fuzzy classifier performs fuzzy logic classification based on the enhanced features, outputs the class probability, and estimates the prediction uncertainty.

[0220] B301, Multi-stage Feature Acquisition:

[0221] Obtain the multi-scale fusion module output $T_{\text{output}}$ corresponding to each $T_i$.

[0222] B302. Construct the fuzzy classifier: for the $T_{\text{output}}$ corresponding to each $T_i$, classification prediction is performed, and confidence is estimated using an independent uncertainty-estimation subnetwork; the fuzzy classifier is as follows:

[0223] (a) Deep classification network:

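[0224] For the tensor $T_{\text{output}}$, the classification head is constructed using multi-layer nonlinear transformations (fully connected layers, LayerNorm, GELU activation, and Dropout):

[0225]

[0226] Where $W_{c1} \in \mathbb{R}^{256 \times 128}$, $W_{c2} \in \mathbb{R}^{128 \times 64}$, and $W_{c3} \in \mathbb{R}^{64 \times 4}$ are the weights, and $b_{c1} \in \mathbb{R}^{128}$, $b_{c2} \in \mathbb{R}^{64}$, and $b_{c3} \in \mathbb{R}^{4}$ are the biases.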

[0227] By mapping features to the category space through deep networks, the ability to learn classification boundaries is enhanced.

[0228] (b) Uncertainty estimation:

[0229] An independent uncertainty-estimation subnetwork regresses an uncertainty score from the input feature $T_i$, with output range [0, 1]:

[0230] $u = \sigma(W_{u2}\,\text{GELU}(W_{u1} T_i + b_{u1}) + b_{u2})$

[0231] Where u is the uncertainty estimate, and a higher value indicates lower confidence in the prediction; $W_{u1} \in \mathbb{R}^{64 \times 512}$ and $W_{u2} \in \mathbb{R}^{1 \times 64}$ are the weights, and $b_{u1} \in \mathbb{R}^{64}$ and $b_{u2} \in \mathbb{R}$ are the biases; σ is the Sigmoid function, with output in the range [0, 1]; $T_i$ is the enhanced feature at the i-th granularity, with a dimension of 512.

[0232] An independent uncertainty-estimation subnetwork is a small, specially designed neural network module that operates independently of the main classification network and quantifies the confidence level of the prediction. Here, $T_i$ is used instead of $T_{\text{output}}$ because it is a feature representation closer to the original input and retains more information from the original data, including possible noise and anomalies. This independent subnetwork is responsible for the "metacognitive" task of evaluating the reliability of the model's own predictions, rather than making classification decisions directly.

[0233] Provide confidence metrics for each prediction to aid clinical decision-making.

[0234] The above process can be briefly summarized as follows:

[0235] Classification head: the tensor $T_{\text{output}}$ is passed through the network Linear(256->128)->LayerNorm->GELU->Dropout->Linear(128->64)->...->Linear(64->4) to obtain the classification logits of shape (n, 4).

[0236] Uncertainty estimation head: the enhanced feature $T_i$ of shape (n, 512) is passed through a small network of Linear(512->64)->GELU->Linear(64->1)->Sigmoid to estimate the uncertainty u of shape (n, 1).

[0237] The final output is the classification logits and the uncertainty score u.
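The two heads of the fuzzy classifier can be sketched as below. This is an illustrative NumPy sketch in inference mode, so Dropout is treated as the identity. The stage elided as "..." in the classification head is assumed here to be LayerNorm followed by GELU; that choice, the random weights, and the helper names are assumptions rather than details confirmed by the text.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init(rng, d_in, d_out):
    # Hypothetical initialization: small random weights, zero bias.
    return rng.standard_normal((d_in, d_out)) * 0.02, np.zeros(d_out)

def classification_head(T_output, p):
    (w1, b1), (w2, b2), (w3, b3) = p
    h = gelu(layer_norm(T_output @ w1 + b1))   # Linear(256->128) -> LayerNorm -> GELU
    h = gelu(layer_norm(h @ w2 + b2))          # Linear(128->64) + assumed elided stage
    return h @ w3 + b3                         # Linear(64->4): logits

def uncertainty_head(T_i, p):
    (w1, b1), (w2, b2) = p
    h = gelu(T_i @ w1 + b1)                    # Linear(512->64) -> GELU
    return sigmoid(h @ w2 + b2)                # Linear(64->1) -> Sigmoid

rng = np.random.default_rng(0)
cls_params = [init(rng, 256, 128), init(rng, 128, 64), init(rng, 64, 4)]
unc_params = [init(rng, 512, 64), init(rng, 64, 1)]

T_output = rng.standard_normal((5, 256))       # fused feature of one granularity
T_i = rng.standard_normal((5, 512))            # enhanced feature of the same granularity
logits = classification_head(T_output, cls_params)
u = uncertainty_head(T_i, unc_params)
print(logits.shape, u.shape)
```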

[0238] B4. Building a multi-granularity decision fusion module based on the fuzzy classifier: this module processes the $T_{\text{output}}$ of the multiple granularities in parallel using fuzzy classifiers and then fuses the decisions at different granularities to improve classification robustness.

[0239] B401. Obtain the output of each fuzzy classifier:

[0240] Denote the $T_{\text{output}}$ corresponding to each granularity feature $T_i$ as $T_m$. The $T_m$ of the multiple granularities are processed in parallel by fuzzy classifiers; feature granularity partitioning creates complementary feature views, allowing the data to be analyzed from different perspectives.

[0241] For the $T_m$ of each granularity, the corresponding independent fuzzy classifier outputs the class logits and the uncertainty u, denoted $\text{logits}_m$ and $u_m$.

[0242] Attention weights are calculated based on uncertainties at each granularity.

[0243]

[0244] Where $\alpha_m$ is the attention weight at the m-th granularity, m = 1, ..., K, and in this embodiment the number of granularities is K = 4; granularities with lower uncertainty (higher confidence) thus receive higher weights; $u_m$ is the uncertainty score for the m-th granularity, and $u_j$ is the uncertainty score for the j-th granularity. Both m and j are granularity indices: m specifies the current granularity, and j is the summation index.
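One way to obtain weights with the stated property (lower uncertainty gives a higher weight, and the weights are normalized by a sum over all granularities) is a softmax over the negated uncertainty scores. The sketch below assumes that form; it is one plausible instance, not necessarily the exact formula of this embodiment, and the function name `uncertainty_attention` is hypothetical.

```python
import numpy as np

def uncertainty_attention(u):
    """u: (n, K) uncertainty scores. Returns (n, K) weights that sum to 1 per row."""
    z = -u                                    # lower uncertainty -> larger score
    z = z - z.max(axis=1, keepdims=True)      # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# One sample, K = 4 granularities with increasing uncertainty.
u = np.array([[0.1, 0.4, 0.7, 0.9]])
alpha = uncertainty_attention(u)
print(alpha)
```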

[0245] This invention proposes two solutions:

[0246] Option 1: B402, weighting the prediction results for each granularity using attention weights:

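[0247] $\widetilde{\text{logits}}_m = \alpha_m \cdot \text{logits}_m$

[0248] In this embodiment, m = 1, 2, 3, 4, which yields four weighted outputs $\widetilde{\text{logits}}_1, \ldots, \widetilde{\text{logits}}_4$;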

[0249] B403, Global Feature Aggregation: for the feature tensor, a global classifier is used to capture cross-granularity relationships: $\text{logits}_{\text{global}} = \text{Classifier}_{\text{global}}(\cdot)$,

[0250]

[0251] Where $\text{Classifier}_{\text{global}}$ is the global classification network and $\text{logits}_{\text{global}}$ is the global classification logits, with dimension $\mathbb{R}^{n \times 4}$. In this embodiment, the global classification network structure is: linear layer + layer normalization + GELU activation + Dropout.

[0252] B404, Decision Fusion: $\text{logits}_{\text{global}}$ and the weighted outputs of all granularities are concatenated to generate the final prediction.

[0253]

[0254] Where the concatenation joins $\text{logits}_{\text{global}}$ with the weighted outputs of all granularities. The final classification output of the MGC module is obtained after a two-layer transformation:

[0255]

[0256]

[0257]

[0258] It should also be noted that the final number of categories in this implementation is 4, so the final output is 4-dimensional. This can be adjusted according to the actual number of categories.

[0259] The final output is the classification logits of shape (n, 4) and the fused uncertainty score.

[0260] Alternatively, a simplified Option 2 can be adopted:

[0261] B402, Weighted fusion of outputs at various granularities:

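[0262] $\text{logits}_{\text{weighted}} = \sum_{m=1}^{K} \alpha_m \cdot \text{logits}_m$

[0263] Where $\text{logits}_m$ is the classification logits for the m-th granularity (from parallel classification), $\alpha_m$ is its attention weight, and $\text{logits}_{\text{weighted}}$ is the weighted and merged logits.

[0264] B403, Global Feature Aggregation: for the feature tensor, a global classifier is used to capture cross-granularity relationships: $\text{logits}_{\text{global}} = \text{Classifier}_{\text{global}}(\cdot)$,

[0265]

[0266] Where $\text{Classifier}_{\text{global}}$ is the global classification network and $\text{logits}_{\text{global}}$ is the global classification logits, with dimension $\mathbb{R}^{n \times 4}$. In this embodiment, the global classification network structure is: linear layer + layer normalization + GELU activation + Dropout.

[0267] B404, Decision Fusion: The global output is concatenated with the weighted fusion logits, and the final prediction is generated through the fusion layer.

[0268] $\text{logits}_{\text{final}} = W_{\text{fusion}} \cdot (\text{logits}_{\text{global}} \oplus \text{logits}_{\text{weighted}})$

[0269] Where ⊕ denotes the concatenation operation, whose result has dimension $\mathbb{R}^{n \times 8}$; $W_{\text{fusion}}$ is the weight matrix of the fusion layer; and $\text{logits}_{\text{final}}$ is the final classification logits integrating global and local granular information, with dimension $\mathbb{R}^{n \times 4}$.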

[0270] Final output classification (n, 4) and the uncertainty score after fusion (n, 1).
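The simplified Option 2 (weighted sum over granularities, concatenation with the global logits, then a fusion layer) can be sketched as below. This is a NumPy illustration with hand-picked toy values; the function name `fuse_option_two`, the bias-free fusion layer, and the particular `w_fusion` matrix are assumptions chosen so the result is easy to check by hand.

```python
import numpy as np

def fuse_option_two(logits_m, alpha, logits_global, w_fusion):
    """
    logits_m:      (n, K, 4) per-granularity logits
    alpha:         (n, K)    attention weights (each row sums to 1)
    logits_global: (n, 4)    output of the global classifier
    w_fusion:      (8, 4)    fusion-layer weight (bias omitted in this sketch)
    """
    # Weighted sum over the K granularities.
    logits_weighted = (alpha[:, :, None] * logits_m).sum(axis=1)       # (n, 4)
    concat = np.concatenate([logits_global, logits_weighted], axis=1)  # (n, 8)
    return concat @ w_fusion, logits_weighted                          # (n, 4)

n, K = 2, 4
logits_m = np.arange(n * K * 4, dtype=float).reshape(n, K, 4)
alpha = np.full((n, K), 1.0 / K)              # equal weights for this toy example
logits_global = np.ones((n, 4))
# Toy fusion weight that simply adds the global and weighted logits.
w_fusion = np.vstack([np.eye(4), np.eye(4)])

logits_final, logits_weighted = fuse_option_two(logits_m, alpha, logits_global, w_fusion)
print(logits_final.shape)
```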

[0271] The fuzzy classification network of this invention draws on some ideas of fuzzy control, such as fuzzy decision-making and dynamic range. It quantifies uncertainty, extending the traditional binary decision (yes / no) to a continuous uncertainty score, thus achieving soft rather than hard decision-making. Regarding the fuzzy attention mechanism between granularities, it adopts an uncertainty-oriented fuzzy fusion strategy, where granularities with lower uncertainty (higher confidence) receive higher weights, reflecting fuzzy logic.

[0272] The dataset obtained from the data preprocessing unit is divided into training and test sets in a 4:1 ratio.

[0273] The data preprocessing unit encapsulates 2048-dimensional NumPy format feature vectors containing key histological information such as cell nuclear morphology, tissue texture, and chromatin distribution. For each original feature file (e.g., case_001.npy), an enhancement operation is randomly selected to generate four enhanced versions, resulting in NumPy files with five times the data volume. Furthermore, NumPy vectors with random UUIDs are generated after classifying TCGA clinical files into four categories according to AJCC classification.

[0274] The network model is trained using a training set. The training process employs an advanced fuzzy loss function, integrating FocalLoss, entropy regularization, and uncertainty weighting mechanisms to achieve accurate optimization under multiple constraints. The loss function construction logic is as follows:

[0275] Basic loss: Cross-entropy loss characterizes the difference between the model's predictions and the true labels and serves as the basic optimization objective. Each element $p_c$ of the probability distribution P is the probability that the model predicts the sample belongs to class c. Indexing P by the true label gives the probability $p_t$ of the true class. The basic cross-entropy loss is:

[0276] $L_{CE} = -\log(p_t)$

[0277] Focal Loss focusing:

[0278] $L_{\text{focal}} = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$

[0279] The modulation factor $(1 - p_t)^{\gamma}$ (γ = 2.0) reduces the loss weight of easily classified samples and guides the model to focus on difficult-to-classify samples. Here, $p_t$ is the model's predicted probability of the true class; γ is an adjustable focusing parameter: the larger its value, the stronger the modulation effect and the greater the attention paid to difficult samples; $\alpha_t$ is an optional weighting factor used to handle class imbalance.

[0280] Entropy regularization constraint: To encourage the model to make more certain predictions, the information entropy of the predicted probability distribution P is calculated as a regularization term:

[0281] $H(P) = -\sum_{c} p_c \log(p_c)$

[0282] Total Loss Integration: The final loss is the weighted sum of the weighted Focal Loss mean and the weighted entropy loss mean; the entropy loss balance coefficient λ defaults to 0.5, which balances the multi-objective loss.

[0283]

[0284] in, This represents the number of samples in a batch.

[0285] By applying greater attention to fuzzy samples near the classification boundary through the FocalLoss loss function, the model's ability to discriminate in regions with low certainty and overlapping features is specifically optimized, demonstrating the special handling of boundary situations by fuzzy systems.
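The Focal Loss and entropy terms described above can be sketched as follows. This NumPy sketch combines the batch-mean Focal Loss with the batch-mean prediction entropy scaled by a balance coefficient of 0.5; the uncertainty weighting mentioned in the text is not modeled, and the function name `fuzzy_loss`, the parameter names, and the exact way the two terms are combined are assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fuzzy_loss(logits, labels, gamma=2.0, alpha_t=1.0, lam=0.5, eps=1e-12):
    """Batch-mean Focal Loss plus lam times batch-mean prediction entropy."""
    P = softmax(logits)                              # (N, C) class probabilities
    p_t = P[np.arange(len(labels)), labels]          # probability of the true class
    ce = -np.log(p_t + eps)                          # cross-entropy per sample
    focal = alpha_t * (1.0 - p_t) ** gamma * ce      # down-weights easy samples
    entropy = -(P * np.log(P + eps)).sum(axis=1)     # prediction entropy per sample
    return focal.mean() + lam * entropy.mean()

# A maximally uncertain prediction versus a confident, correct one (4 classes).
uniform_loss = fuzzy_loss(np.zeros((1, 4)), np.array([0]))
confident_loss = fuzzy_loss(np.array([[10.0, 0.0, 0.0, 0.0]]), np.array([0]))
print(uniform_loss, confident_loss)
```

In this sketch a confident, correct prediction is penalized far less than a uniform one, which matches the stated intent of focusing optimization on samples near the classification boundary.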

[0286] The system performs classification based on the multi-granularity decision fusion module, outputting the cervical cancer AJCC classification (Stage I / II / III / IV); confidence quantification: outputting a continuous 0-1 confidence score based on prediction entropy and fuzzy membership degree; uncertainty source tracing: identifying potential noise interference sources in low-confidence predictions; the multi-granularity decision fusion module is trained based on the labels.

[0287] Specific Implementation Method Two:

[0288] This embodiment is a medical image analysis system based on a multi-granularity fuzzy classification network. It adds an ensemble learning module to the first embodiment to improve prediction stability and reliability through model averaging.

[0289] Initialize P = 3 independent multi-granularity decision fusion modules, and enhance ensemble diversity through differentiated training strategies:

[0290] Differentiated parameter initialization: Each model uses a different random seed to initialize the network parameters, ensuring diversity of the starting point for optimization.

[0291] Dynamic augmentation strategy: During training, a random combination of the following processing methods is applied to each batch for data augmentation;

[0292] Random feature dropout (randomly masking 40% of the feature dimensions with a 50% probability);

[0293] Gaussian noise injection (adding Gaussian noise with a standard deviation of 0.01 with a 50% probability);

[0294] Feature rearrangement (randomly shuffling the order of feature dimensions with a 20% probability);

[0295] Parallel independent training: Each model is optimized independently on the same training set, but due to the enhanced randomness, the actual training samples seen are different.
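The per-batch dynamic augmentation can be sketched as below, using the probabilities and magnitudes listed above. This is an illustrative NumPy sketch; masking by setting dimensions to zero, applying one mask and one permutation to the whole batch, and the function name `dynamic_augment` are assumptions not stated in the text.

```python
import numpy as np

def dynamic_augment(batch, rng):
    """Randomly combine feature dropout, Gaussian noise, and feature rearrangement."""
    x = batch.copy()
    n_dim = x.shape[1]
    if rng.random() < 0.5:
        # Feature dropout: mask 40% of the feature dimensions (zeroing is assumed).
        masked = rng.choice(n_dim, size=int(0.4 * n_dim), replace=False)
        x[:, masked] = 0.0
    if rng.random() < 0.5:
        # Gaussian noise injection with standard deviation 0.01.
        x = x + rng.normal(0.0, 0.01, size=x.shape)
    if rng.random() < 0.2:
        # Feature rearrangement: shuffle the order of feature dimensions.
        x = x[:, rng.permutation(n_dim)]
    return x

# Each of the P = 3 models would use its own seed, so it sees different samples.
rng = np.random.default_rng(seed=1)
batch = np.random.default_rng(0).standard_normal((8, 2048))
original = batch.copy()
augmented = dynamic_augment(batch, rng)
print(augmented.shape)
```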

[0296] Then, ensemble learning is performed based on P multi-granularity decision fusion modules:

[0297] (1) Model average prediction: During the inference phase, the outputs of the P models are averaged:

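[0298] $\overline{\text{logits}} = \dfrac{1}{P} \sum_{p=1}^{P} \text{logits}^{(p)}$

[0299] Where $\text{logits}^{(p)}$ is the output of the p-th independent multi-granularity decision fusion module.

[0300] This can effectively improve the robustness of prediction.

[0301] (2) Integrated evaluation: Provides uncertainty estimation based on multiple multi-granularity decision fusion modules, such as prediction variance:

[0302] $\text{Var} = \dfrac{1}{P} \sum_{p=1}^{P} \left( \text{logits}^{(p)} - \overline{\text{logits}} \right)^2$, where $\overline{\text{logits}}$ is the average of the P model outputs.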

[0303] By comprehensively assessing the confidence level of the model, we can provide more reliable decision support for clinical practice.
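Model averaging and the variance-based ensemble evaluation can be sketched as below. This NumPy sketch uses toy constant outputs for the P = 3 models; the function name `ensemble_predict` and the use of the population variance (dividing by P) are assumptions.

```python
import numpy as np

def ensemble_predict(logits_per_model, uncertainty_per_model):
    """Average the outputs of P models and report the variance across models."""
    logits = np.stack(logits_per_model)            # (P, n, 4)
    unc = np.stack(uncertainty_per_model)          # (P, n, 1)
    mean_logits = logits.mean(axis=0)              # ensemble logits
    mean_unc = unc.mean(axis=0)                    # ensemble uncertainty score
    variance = logits.var(axis=0)                  # disagreement between models
    return mean_logits, mean_unc, variance

# Toy outputs from P = 3 models for n = 2 samples.
logits_per_model = [np.full((2, 4), v) for v in (1.0, 2.0, 3.0)]
uncertainty_per_model = [np.full((2, 1), v) for v in (0.2, 0.3, 0.4)]
mean_logits, mean_unc, variance = ensemble_predict(logits_per_model, uncertainty_per_model)
print(mean_logits.shape, mean_unc.shape, variance.shape)
```

A large variance indicates that the independently trained models disagree on a sample, which can be reported alongside the averaged uncertainty score.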

[0304] It should be noted that this implementation initializes three of the aforementioned multi-granularity decision fusion modules. During training, each model is trained independently; the losses of the three independently trained models are shown in Figures 3 to 5. To illustrate the effect of model ensembling, the loss corresponding to ensemble learning based on the three models is shown in Figure 6.

[0305] During prediction, the same input feature is fed into the three models simultaneously, giving three outputs. The averages of the logits and of the uncertainty scores output by the three models are used as the final prediction of the ensemble model. The following experiments use Option 1 of the present invention.

[0306] Figure 7 compares the robustness of existing classification models and the present invention (the Advanced model in the figure) in three noise scenarios. It can be seen that the model of the present invention has good classification performance and high accuracy. Figure 8 compares the uncertainties of different classification models and the present invention (the Advanced model in the figure) when noise is added to an image. It can be seen that the present invention has relatively small uncertainty when noise is added.

[0307] Figure 9 shows the ROC curves under Gaussian noise, where (a), (b), and (c) correspond to noise levels of 0.1, 0.3, and 0.5, respectively. Figure 10 shows the ROC curves under burst noise, where (a), (b), and (c) correspond to noise levels of 0.1, 0.3, and 0.5, respectively. Figure 11 shows the ROC curves under systematic shift, where (a), (b), and (c) correspond to noise levels of 0.1, 0.3, and 0.5, respectively. From Figures 9-11, it can be seen that the Advanced model of this invention performs better and is more stable than the other models.

[0308] The features of this invention are as follows:

[0309] 1. End-to-End Noise Reduction Paradigm Reconstruction: This invention breaks through the traditional sequential process of "denoising first, then classifying" in existing technologies, and creatively constructs an integrated processing paradigm of "denoising and classifying simultaneously." This paradigm eliminates the need for any independent preprocessing denoising operations on the input raw scan image, directly inputting the noisy image into the classification network. By deeply integrating noise resistance capabilities into the network architecture, it achieves dynamic suppression of noise interference during inference while simultaneously completing high-precision classification. This fundamental scenario-level change greatly improves processing efficiency, shortens the diagnostic process, and better meets the needs of real-time clinical applications.

[0310] 2. Innovative Structural Approach of Multi-Granularity Dynamic Weighted Fusion: To fundamentally ensure the network's inherent noise resistance, this invention designs a multi-granularity dynamic weighted fusion module. This technology intelligently divides the high-dimensional feature space into multiple complementary granularity subspaces and innovatively introduces an attention mechanism guided by "uncertainty." The network can automatically assess the impact of noise on each granularity feature and dynamically reduce the weights of granularities severely contaminated by noise, while increasing the contribution ratio of clear and reliable granularities. This structured adaptive fusion mechanism ensures that the system maintains robustness and high accuracy in decision-making even when facing complex noise with uneven distribution.

[0311] 3. Innovation in Confidence Measurement Based on Fuzzy Membership Degrees: To address the reliability and interpretability issues of prediction results in noisy environments, this invention integrates fuzzy mathematics theory and designs a confidence measurement method based on the entropy of the output probability distribution. This technology not only outputs "what it is" (classification result) but also accurately tells "how certain it is" (confidence score). It makes the inherent uncertainty of the network in the face of noise transparent and quantifies it, providing doctors with a direct basis for judging the reliability of AI suggestions, effectively supporting clinical decision-making risk management, and is a key link in realizing human-machine collaborative diagnosis.

[0312] The above examples of the present invention are merely illustrative of the computational model and process of the present invention, and are not intended to limit the implementation of the present invention. Those skilled in the art will recognize that other variations or modifications can be made based on the above description. It is impossible to exhaustively list all possible implementations here. Any obvious variations or modifications derived from the technical solutions of the present invention are still within the scope of protection of the present invention.

Claims

1. A medical image analysis system based on a multi-granularity fuzzy classification network, characterized in that it includes:

Data acquisition and preprocessing unit: used to acquire pathological slide images and process them into data suitable for processing by the multi-granularity fuzzy classification network;

Fuzzy classification unit: classifies medical images using the multi-granularity fuzzy classification network;

The multi-granularity fuzzy classification network model includes: a feature enhancement module, a multi-scale feature fusion module, a fuzzy classifier, and a multi-granularity decision fusion module;

Feature enhancement module: enhances the input features; including: first performing a nonlinear transformation on the input feature T and obtaining the enhanced feature $T_{\text{res}}$ through a residual connection; then standardizing the feature by layer normalization to obtain the standardized feature $T_{\text{norm}}$; for $T_{\text{norm}}$, obtaining the attention weight A using the attention mechanism and multiplying it element-wise with $T_{\text{norm}}$ to obtain the enhanced feature $T_{\text{enhanced}}$;

Multi-scale feature fusion module: for $T_{\text{enhanced}}$, extracts and fuses feature information at different scales; including: uniformly dividing the feature vector $T_{\text{enhanced}}$ into K subsets, each corresponding to a feature at a specific granularity and to a feature-dimension subspace, $T_{\text{enhanced}} = [T_1, T_2, \ldots, T_K]$, where $T_i$ is the feature of the i-th granularity and K is the number of granularities; for $T_i$, capturing features at different levels of abstraction $S_1$, $S_2$, $S_3$ through three parallel sub-networks; obtaining the fusion weights $w_1$, $w_2$, $w_3$ from the learnable fusion weight parameters $w^{\text{raw}}$ through softmax normalization; performing weighted fusion to obtain $T_{\text{fused}} = (w_1 S_1) \oplus (w_2 S_2) \oplus (w_3 S_3)$, where ⊕ indicates a concatenation operation; and applying a nonlinear transformation to $T_{\text{fused}}$ using a multilayer perceptron to obtain $T_{\text{output}}$;

Fuzzy classifier: performs fuzzy logic classification on the features, outputs class probabilities, and estimates the prediction uncertainty; including: for the $T_{\text{output}}$ corresponding to each $T_i$, obtaining the classification result through a classification head constructed with multi-level nonlinear transformations; for $T_i$, obtaining the regressed uncertainty u using an independent uncertainty-estimation subnetwork;

Multi-granularity decision fusion module: makes a fusion decision based on the multiple results obtained through the fuzzy classifiers; including: denoting the $T_{\text{output}}$ corresponding to each granularity feature $T_i$ as $T_m$; for the $T_m$ of each granularity, the corresponding independent fuzzy classifier outputs the class logits and the uncertainty u, denoted $\text{logits}_m$ and $u_m$; calculating the attention weights based on the uncertainties at each granularity, where $\alpha_m$ is the attention weight at the m-th granularity, $u_m$ is the uncertainty score for the m-th granularity, $u_j$ is the uncertainty score for the j-th granularity, and m and j both represent granularity indices; weighting the output of each granularity by its attention weight; for the feature tensor, obtaining the global classification logits $\text{logits}_{\text{global}}$ through a global classification network; and concatenating $\text{logits}_{\text{global}}$ with the weighted outputs of all granularities and obtaining the final prediction, namely the classification result of the medical image, based on the concatenation result.

2. The medical image analysis system based on a multi-granularity fuzzy classification network according to claim 1, characterized in that, The data acquisition and preprocessing unit includes an adaptive downsampling module, which is used to adaptively downsample the pathological slide image. During the downsampling process, the downsampling rate is calculated based on the ratio of the maximum magnification of the corresponding data of the pathological slide image to the preset target magnification to complete the downsampling.

3. The medical image analysis system based on a multi-granularity fuzzy classification network according to claim 2, characterized in that, The data acquisition and preprocessing unit further includes an adaptive block segmentation module, used to divide the downsampled image into a series of image blocks, including: A201. For the downsampled data, the entire slide image is gridded by sliding a window with a fixed step size according to the set tile size to generate image blocks, also known as tiles; A202. Perform grayscale conversion and gradient field measurement on each tile to obtain the gradient magnitude; based on the gradient magnitude, obtain the flat area ratio; based on the flat area ratio, determine the effective tissue region and retain the corresponding tile.

4. The medical image analysis system based on a multi-granularity fuzzy classification network according to claim 3, characterized in that, in step A202, obtaining the flat area ratio based on the gradient magnitude and determining the effective tissue region based on the flat area ratio includes: counting, for tile $Q_i$, the number of pixels whose gradient magnitude is below the gradient magnitude threshold τ, P_flat = Σ_{x,y} I[M_i(x, y) < τ], where I[·] is an indicator function that equals 1 when the condition is true and 0 otherwise, and M_i(x, y) is the gradient magnitude at pixel (x, y); then calculating the flat area ratio R_flat = P_flat / (S × S), where S × S is the size of each tile; a tile $Q_i$ whose R_flat is less than or equal to the flat area ratio threshold θ is considered an effective tissue region.

5. A medical image analysis system based on a multi-granularity fuzzy classification network according to claim 3, characterized in that, The data acquisition and preprocessing unit further includes a color normalization module for color normalizing image patches, including: A301. Calculate the optical density OD based on the transmitted light intensity I and the maximum light intensity I_max, and remove background pixels based on the pixel OD; A302. Perform principal component analysis on the OD values of the remaining pixels to extract the staining basis vectors H and E of the image; A303. Project the OD values of tile $Q_k$ onto the staining basis vectors to obtain the staining concentration matrix $C_{\text{source}}$; A304. Determine the target standard staining basis vector $H_{\text{target}}$; A305. Obtain the first percentile of the staining concentration matrix $C_{\text{source}}$, denoted $\max(C_{\text{source}})$, and perform normalization as follows: $C_{\text{normalized}} = C_{\text{source}} \times \text{diag}\left(\max(C_{\text{target}}) / \max(C_{\text{source}})\right)$, where diag(·) denotes the construction of a diagonal matrix, $C_{\text{normalized}}$ is the normalized concentration matrix, and $\max(C_{\text{target}})$ represents the first percentile of the target standard concentration distribution; A306. Using the normalized concentration matrix $C_{\text{normalized}}$ and the target standard staining basis vector $H_{\text{target}}$, reconstruct the normalized optical density value $OD_{\text{normalized}} = C_{\text{normalized}} \times H_{\text{target}}$; the optical density values are converted back to RGB space to obtain the reconstructed RGB image.

6. A medical image analysis system based on a multi-granularity fuzzy classification network according to claim 5, characterized in that the data acquisition and preprocessing unit also includes a deep feature extraction module for performing deep feature extraction on the color-normalized image tiles Q'_k to obtain data suitable for processing by the multi-granularity fuzzy classification network.

7. A medical image analysis system based on a multi-granularity fuzzy classification network according to claim 6, characterized in that the data acquisition and preprocessing unit also includes a composite data augmentation module for generating multiple augmented features; the generated augmented features are used for training the multi-granularity fuzzy classification network; the augmented features are generated in parallel, using at least one of Gaussian noise injection, random feature masking, feature-space interpolation, and adaptive scaling.
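
The four augmentation types named in this claim can be sketched on a feature matrix as follows. This is an illustrative sketch only: the parameter values are hypothetical, feature-space interpolation is assumed to mix each sample with a randomly chosen partner, and adaptive scaling is approximated by a random per-sample scale factor because the claim does not define how the scale adapts.

```python
import numpy as np

def augment_features(x, rng, noise_std=0.05, mask_prob=0.1, scale_range=0.1):
    """x: (N, D) feature matrix. Returns four independently augmented copies."""
    n = x.shape[0]
    # Gaussian noise injection: add zero-mean noise to every feature.
    noisy = x + rng.normal(0.0, noise_std, size=x.shape)
    # Random feature masking: zero each feature with probability mask_prob.
    masked = x * (rng.random(x.shape) >= mask_prob)
    # Feature-space interpolation (assumed: convex mix with a random partner).
    lam = rng.uniform(0.0, 1.0, size=(n, 1))
    interpolated = lam * x + (1.0 - lam) * x[rng.permutation(n)]
    # Scaling (assumed: random per-sample factor in 1 +/- scale_range).
    scaled = x * (1.0 + rng.uniform(-scale_range, scale_range, size=(n, 1)))
    return {"noise": noisy, "mask": masked, "interp": interpolated, "scale": scaled}
```

Each branch reads only the original matrix `x`, so the four augmentations are independent of one another and could be computed in parallel.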

8. A medical image analysis system based on a multi-granularity fuzzy classification network according to claim 1, characterized in that the structure of the global classification network is: linear layer + layer normalization + GELU activation + Dropout.
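
The layer sequence of this claim can be sketched as a forward pass in NumPy. This is a minimal sketch under assumptions: layer normalization is shown without learnable scale and shift, GELU uses the common tanh approximation, dropout is the inverted variant applied only in training mode, and the function name `global_block` and all dimensions are hypothetical.

```python
import numpy as np

def gelu(x):
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no affine parameters)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def global_block(x, weight, bias, drop_p=0.1, training=False, rng=None):
    """Linear -> LayerNorm -> GELU -> Dropout on a batch x of shape (N, D_in)."""
    h = x @ weight + bias
    h = gelu(layer_norm(h))
    if training:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(h.shape) >= drop_p
        h = h * keep / (1.0 - drop_p)   # inverted dropout keeps the expected value
    return h
```

In evaluation mode the block is deterministic, since dropout is skipped.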

9. A medical image analysis system based on a multi-granularity fuzzy classification network according to any one of claims 1 to 8, characterized in that the fusion decision process in the multi-granularity decision fusion module is replaced by the following: the output corresponding to each granularity's features is denoted T_m; for each granularity, the independent fuzzy classifier corresponding to T_m outputs the class logits and the uncertainty u, denoted logits_m and u_m; attention weights α_m are calculated from the uncertainties at each granularity, where α_m is the attention weight of the m-th granularity, u_m is the uncertainty score of the m-th granularity, u_j is the uncertainty score of the j-th granularity, and m and j are both granularity indices; the outputs of the granularities are fused by weighting, logits_fused = Σ_m α_m × logits_m, where logits_m is the classification logits of the m-th granularity and logits_fused is the weighted fused logits; the feature tensor is passed through the global classification network to obtain the global classification logits logits_global; logits_global and logits_fused are concatenated, and the final prediction, namely the classification result of the medical image, is obtained from the concatenated result.
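
The uncertainty-weighted fusion of this claim can be sketched as follows. The exact attention formula is not recoverable from the text of this claim, so the sketch ASSUMES a softmax over negated uncertainties, which gives lower-uncertainty granularities larger weights. The final fusion layer is likewise assumed to be a single linear layer, and all names (`attention_from_uncertainty`, `fuse`, `w_fusion`, `b_fusion`) are hypothetical.

```python
import numpy as np

def attention_from_uncertainty(u):
    """u: (M,) uncertainty scores. ASSUMPTION: alpha = softmax(-u)."""
    z = -np.asarray(u, dtype=np.float64)
    z = z - z.max()                     # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def fuse(logits_per_granularity, u, logits_global, w_fusion, b_fusion):
    """Weighted fusion of M granularity logits, then concatenation with global logits.

    logits_per_granularity: (M, C); logits_global: (C,); w_fusion: (2C, C).
    """
    alpha = attention_from_uncertainty(u)                    # (M,)
    logits_fused = alpha @ logits_per_granularity            # sum_m alpha_m * logits_m
    stacked = np.concatenate([logits_fused, logits_global])  # (2C,)
    return stacked @ w_fusion + b_fusion                     # assumed linear fusion layer
```

With equal uncertainties the weights are uniform and the fused logits reduce to the plain mean over granularities.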

10. A medical image analysis system based on a multi-granularity fuzzy classification network according to any one of claims 1 to 8, characterized in that the multi-granularity fuzzy classification network model also includes an ensemble learning module, which integrates P independent multi-granularity decision fusion modules to perform ensemble classification, including: averaging the outputs of the P multi-granularity decision fusion modules, ŷ_ensemble = (1/P) × Σ_{p=1}^{P} ŷ_p, where ŷ_p is the output of the p-th independent multi-granularity decision fusion module; ŷ_ensemble is taken as the final prediction result for classifying the medical image.
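
The ensemble averaging of this claim can be sketched as follows. This is a minimal sketch that assumes each of the P modules outputs one prediction vector over the same set of classes; the function name `ensemble_average` is hypothetical.

```python
import numpy as np

def ensemble_average(outputs):
    """outputs: (P, C) array, one prediction vector per fusion module.

    Returns the element-wise mean over the P modules, shape (C,).
    """
    outputs = np.asarray(outputs, dtype=np.float64)
    return outputs.mean(axis=0)

# Three modules voting on two classes.
averaged = ensemble_average([[0.7, 0.3], [0.5, 0.5], [0.9, 0.1]])
predicted_class = int(np.argmax(averaged))
```

Averaging before taking the argmax lets confident modules outweigh uncertain ones, in contrast to a majority vote over hard labels.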