A deep learning-based breast radiomics classification method and system

CN122243956A · Pending · Publication Date: 2026-06-19 · CHONGQING JIULONGPO DISTRICT HOSPITAL OF TRADITIONAL CHINESE MEDICINE

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: CHONGQING JIULONGPO DISTRICT HOSPITAL OF TRADITIONAL CHINESE MEDICINE
Filing Date: 2026-03-20
Publication Date: 2026-06-19


Abstract

This application provides a deep learning-based breast radiomics classification method and system. The method involves preprocessing breast image data to obtain preprocessed breast image data, extracting regions of interest (ROIs) for breast lesions from the preprocessed breast image data based on lesion annotation information, performing dual-path feature extraction on the ROIs to obtain radiomics features of morphology and texture, and semantic deep features with lesion semantic representation information, and constructing a deep learning feature fusion model for breast images. Based on this model, the radiomics features and semantic deep features are fused to obtain a fused feature vector for the breast images. Finally, lesion category analysis is performed on the breast images using lesion samples and the fused feature vector to obtain the classification results for the breast lesions. Using this method, breast images can be classified based on their physical interpretability and deep semantic representation capabilities.

Description

Technical Field

[0001] This application relates to the field of radiomics classification technology, and more specifically, to a deep learning-based method and system for breast radiomics classification.

Background Technology

[0002] Radiomics classification is a quantitative analysis method based on medical images. It extracts a large number of quantitative features (such as texture, morphology, intensity, etc.) from images such as CT, MRI or PET in high throughput, and combines them with machine learning or statistical models to classify lesions or tissues into different categories or subtypes.

[0003] Breast imaging is a core tool for the clinical screening and diagnosis of breast lesions. However, diagnostic results depend heavily on the clinical experience of radiologists, which leads to strong subjectivity, frequent missed diagnoses of early and small lesions, and low diagnostic consistency among physicians. Among existing breast lesion classification technologies, traditional radiomics methods extract surface physical features such as lesion morphology and texture through statistical models. Although these methods offer a degree of interpretability, their features are limited to a single dimension, they struggle to capture deep pathological association information of lesions, and their generalization ability is limited. Pure deep learning methods can extract semantic deep features of lesions, but they lack intuitive physical interpretability and have insufficient clinical acceptance. Because the two types of features have not been effectively integrated, the accuracy and stability of classification models fail to meet actual clinical needs. Therefore, how to classify breast images based on both the physical interpretability and the deep semantic representation capabilities of breast imaging has become a problem faced by the industry.

Summary of the Invention

[0004] This application provides a deep learning-based breast radiomics classification method and system, which can classify breast images based on their physical interpretability and deep semantic representation capabilities.

[0005] In a first aspect, this application provides a deep learning-based breast radiomics classification method, comprising the following steps: obtaining breast imaging data and corresponding lesion annotation information; preprocessing the breast imaging data to obtain preprocessed breast imaging data, and extracting the region of interest of breast lesions in the preprocessed breast imaging data based on the lesion annotation information; performing dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information; constructing a deep learning feature fusion model for breast images, and fusing the radiomics features and the semantic deep features based on the deep learning feature fusion model to obtain the fused feature vector of the breast images; and obtaining lesion samples from breast images, and performing lesion category analysis on the breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions.

[0006] In some embodiments, preprocessing the breast imaging data to obtain preprocessed breast imaging data specifically includes: The breast imaging data is denoised to obtain denoised breast imaging data. The denoised breast image data is normalized to obtain normalized breast image data. The normalized breast image data is interpolated to obtain preprocessed breast image data.

[0007] In some embodiments, extracting the region of interest (ROI) of breast lesions in the preprocessed breast imaging data based on the lesion annotation information specifically includes: The lesion annotation information is mapped onto the preprocessed breast image data to obtain the lesion annotation region in the breast image; The gray-level similarity threshold range for lesion region expansion is determined based on the gray-level distribution characteristics in the preprocessed breast imaging data; The lesion annotation region is expanded using the gray-level similarity threshold range to obtain the region of interest of breast lesions in the preprocessed breast imaging data.

[0008] In some embodiments, performing dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information specifically includes: Morphological and texture features are extracted from the region of interest to obtain morphological and texture radiomics features; Semantic deep features are extracted from the region of interest to obtain semantic deep features with lesion semantic representation information.

[0009] In some embodiments, constructing a deep learning feature fusion model for breast images specifically includes: A feature adaptation preprocessing branch is established for the radiomics features and the semantic deep features; The feature adaptation preprocessing branch is weighted based on an attention mechanism to obtain a deep learning feature fusion model for breast images.

[0010] In some embodiments, fusing the radiomics features and the semantic deep features based on the deep learning feature fusion model to obtain the fused feature vector of the breast image specifically includes: The radiomics features and the semantic deep features are input into the deep learning feature fusion model, and the feature fusion step in the deep learning feature fusion model is executed; The deep learning feature fusion model outputs a fused feature vector of the breast image.

[0011] In some embodiments, performing lesion category analysis on the breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions specifically includes: The lesion samples and the fused feature vector are correlated and fused to obtain correlation fusion features; Based on the correlation fusion features, the lesions in the breast images are classified, thereby obtaining the classification results of breast lesions.

[0012] In a second aspect, this application provides a deep learning-based breast radiomics classification system, comprising: an acquisition module, used to acquire breast imaging data and corresponding lesion annotation information; a processing module, used to preprocess the breast imaging data to obtain preprocessed breast imaging data, and to extract the region of interest of breast lesions in the preprocessed breast imaging data based on the lesion annotation information; the processing module is also used to perform dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information; the processing module is also used to construct a deep learning feature fusion model for breast images, and to fuse the radiomics features and the semantic deep features based on the deep learning feature fusion model to obtain a fused feature vector of breast images; and an execution module, used to acquire lesion samples from breast images, and to perform lesion category analysis on the breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions.

[0013] In a third aspect, this application provides a computer device including a memory and a processor, the memory storing code, and the processor being configured to acquire the code and execute the aforementioned deep learning-based breast radiomics classification method.

[0014] In a fourth aspect, this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the aforementioned deep learning-based breast radiomics classification method.

[0015] The technical solutions provided by the embodiments disclosed in this application have the following beneficial effects: The breast radiomics classification method and system based on deep learning provided in this application first acquires breast image data and corresponding lesion annotation information; preprocesses the breast image data to obtain preprocessed breast image data, and extracts regions of interest (ROIs) of breast lesions in the preprocessed breast image data based on the lesion annotation information; performs dual-path feature extraction on the ROIs to obtain radiomics features of morphology and texture and semantic deep features with lesion semantic representation information; constructs a deep learning feature fusion model for breast images, and fuses the radiomics features and semantic deep features based on the deep learning feature fusion model to obtain a fused feature vector of breast images; acquires lesion samples from breast images, and performs lesion category analysis on breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions.

[0016] Therefore, in the breast radiomics classification process, this application first acquires and preprocesses images with lesion annotations to ensure data quality and consistency. At the same time, it extracts regions of interest using the annotation information, which eliminates interference from irrelevant tissue and makes subsequent feature extraction more targeted and efficient. Next, it employs a dual-path feature extraction strategy. On the one hand, it extracts radiomics features such as morphology and texture, which have clear physical meaning and clinical interpretability and help doctors understand the basis of the model's judgment. On the other hand, it extracts semantic deep features through deep learning, capturing complex lesion representations that are difficult for the human eye to discern and enhancing the model's ability to recognize subtle and deep patterns. Then, it constructs a feature fusion model that combines the two types of features, preserving the physical interpretability of radiomics features while incorporating the high representational power of deep learning, so that the two complement each other and the feature expression becomes more comprehensive and robust. Finally, it performs classification analysis based on lesion samples and fused features, enabling the model to make decisions based simultaneously on the explicit patterns of morphology and texture and on deep semantic information. This not only improves classification accuracy and generalization ability but also enhances doctors' trust in the model through interpretable feature components. Using the above scheme, breast images can be classified based on their physical interpretability and deep semantic representation capabilities.

Description of the Drawings

[0017] Figure 1 is an exemplary flowchart of a deep learning-based breast radiomics classification method according to some embodiments of this application; Figure 2 is an exemplary flowchart for determining preprocessed breast imaging data according to some embodiments of this application; Figure 3 is an exemplary flowchart for determining the fused feature vector according to some embodiments of this application; Figure 4 is a schematic diagram of the structure of a deep learning-based breast radiomics classification system according to some embodiments of this application; Figure 5 is a schematic diagram of the structure of a computer device implementing a deep learning-based breast radiomics classification method according to some embodiments of this application.

Detailed Implementation

[0018] To better understand the technical solution of this application, the technical solution of this application will be described in detail below with reference to the accompanying drawings and specific embodiments.

[0019] Refer to Figure 1, which is an exemplary flowchart of a deep learning-based breast radiomics classification method according to some embodiments of this application. The method mainly includes the following steps: In step 101, breast imaging data and corresponding lesion annotation information are obtained.

[0020] It should be noted that the breast imaging data in this application reflects the anatomical structure, physiological metabolism, and pathological morphology of breast tissue. The breast imaging data includes multiple breast images, which include the visualization features of grayscale distribution, spatial location, density / signal intensity, and boundary morphology of breast tissue lesions and surrounding normal tissue. The lesion annotation information reflects the clinical and pathological attributes and image localization information of breast tissue lesions. The lesion annotation information includes the clearly defined lesion boundary range, lesion type label, pathological grade, and specific location of the lesion on the breast image. It is the key supervisory information for establishing the association between image features and lesion nature and realizing supervised learning of the model. It corresponds one-to-one with the breast imaging data to form complete sample data.

[0021] In step 102, the breast imaging data is preprocessed to obtain preprocessed breast imaging data, and the region of interest of breast lesions in the preprocessed breast imaging data is extracted based on the lesion annotation information.

[0022] In some embodiments, refer to Figure 2, which is an exemplary flowchart for determining preprocessed breast imaging data in some embodiments of this application. In this embodiment, preprocessing the breast imaging data to obtain preprocessed breast imaging data can be achieved by the following steps: In step 1021, the breast imaging data is denoised to obtain denoised breast imaging data. In step 1022, the denoised breast image data is normalized to obtain normalized breast image data. In step 1023, the normalized breast image data is interpolated to obtain preprocessed breast image data.

[0023] In specific implementation, the breast imaging data can be denoised to obtain the denoised breast imaging data in the following way: An algorithm is selected according to the noise characteristics of each of the three common breast imaging modalities (X-ray, ultrasound, and MRI). X-ray images mainly contain electronic noise from scanning, so a median filtering algorithm with a 3×3 pixel window is used: each pixel in the image is traversed, the nine gray values consisting of the pixel itself and its eight neighbors are sorted, and the median is taken as the new gray value of the pixel, which eliminates isolated noise points. Ultrasound images contain speckle noise, so an adaptive Wiener filtering algorithm is used: the image is first divided into sub-blocks using a 5×5 pixel window, the local mean and variance of each sub-block are calculated, and the filtering coefficients are then adjusted dynamically according to the noise intensity of each sub-block, suppressing speckle noise while preserving the details of lesion edges. MRI images are susceptible to artifacts caused by magnetic field inhomogeneity, so the image is divided into multiple overlapping regions, a polynomial function is fitted to each region to correct the gray-level shift, and the corrected regions are combined by weighted fusion to obtain the globally denoised image data. Other methods can also be used in other embodiments, which are not limited here.
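The 3×3 median filtering step described for X-ray images can be sketched as follows. This is a minimal illustration in plain Python rather than the applicant's implementation; the function name `median_filter_3x3` and the border handling (clamping coordinates to the image edge) are assumptions, since the text does not say how border pixels are treated.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2D list of gray values.

    Border handling is an assumption: coordinates outside the image
    are clamped to the nearest edge pixel.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    window.append(image[rr][cc])
            # Median of the pixel and its eight neighbors.
            out[r][c] = median(window)
    return out


# A flat patch with one isolated bright noise point.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
]
denoised = median_filter_3x3(noisy)
```

In this toy patch the isolated value 255 never forms the majority of any nine-value window, so the filter replaces it with the surrounding gray value.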

[0024] In addition, in specific implementation, the denoised breast image data can be normalized to obtain normalized breast image data by the following method: using the Z-score normalization algorithm, the global mean (μ) and standard deviation (σ) of the gray values of all breast images in the denoised breast image data are first calculated; each pixel of each denoised image is then traversed, and the normalized gray value is calculated according to the formula "new gray value = (original gray value - μ) / σ". This gives the gray values of all images a common distribution with a mean of 0 and a standard deviation of 1, eliminating gray-level shifts caused by differences in scanning devices and in parameters such as tube voltage and tube current, thus obtaining normalized breast image data. Other methods can also be used in other embodiments, which are not limited here.
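A minimal sketch of the global Z-score normalization described above, using only the Python standard library. The function name `zscore_normalize` is hypothetical, and the use of the population standard deviation (dividing by N) is an assumption; the text does not state which estimator is intended.

```python
from statistics import fmean, pstdev


def zscore_normalize(images):
    """Normalize a list of 2D images with one global mean and std.

    Assumes the population standard deviation; raises if all pixels
    share the same gray value, because sigma would then be zero.
    """
    pixels = [v for img in images for row in img for v in row]
    mu = fmean(pixels)
    sigma = pstdev(pixels, mu)
    if sigma == 0:
        raise ValueError("constant images cannot be Z-score normalized")
    normalized = [
        [[(v - mu) / sigma for v in row] for row in img] for img in images
    ]
    return normalized, mu, sigma


images = [
    [[0, 2], [4, 6]],
    [[8, 10], [12, 14]],
]
normalized, mu, sigma = zscore_normalize(images)
flat = [v for img in normalized for row in img for v in row]
```

After normalization the pooled gray values in `flat` have mean 0 and standard deviation 1 up to floating-point error.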

[0025] In addition, in specific implementation, the normalized breast image data can be interpolated to obtain preprocessed breast image data in the following way: For mammograms and ultrasound images, a bilinear interpolation algorithm is used. First, the target resolution is determined, for example, 0.1 mm × 0.1 mm per pixel. Then, for each target pixel in the image, the gray value of the target pixel is calculated as a distance-weighted combination of the gray values of its four surrounding original pixels. The formula is: "target gray value = (1-u) × (1-v) × f(x,y) + u × (1-v) × f(x+1,y) + (1-u) × v × f(x,y+1) + u × v × f(x+1,y+1)", where u and v are the fractional offsets of the target pixel relative to the original pixel, and f(x,y) is the gray value of the original pixel. For magnetic resonance images, a trilinear interpolation algorithm is used, and the gray value of the target voxel is calculated by weighting the gray values of the eight adjacent original voxels around the target voxel according to the three-dimensional spatial distance. Finally, all images are unified to the preset resolution to obtain preprocessed breast image data. Other methods can be used in other embodiments, which are not limited here.
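The bilinear formula above can be illustrated with a short sketch. The function names `bilinear_sample` and `resample` are hypothetical, and two choices here are assumptions rather than part of the text: neighbors beyond the image edge are clamped to the edge, and the resampling grid maps the corner pixels of the output onto the corner pixels of the input.

```python
def bilinear_sample(image, x, y):
    """Gray value at fractional position (x = column, y = row), x, y >= 0."""
    rows, cols = len(image), len(image[0])
    x0 = min(int(x), cols - 1)
    y0 = min(int(y), rows - 1)
    x1 = min(x0 + 1, cols - 1)  # clamp at the right edge (assumption)
    y1 = min(y0 + 1, rows - 1)  # clamp at the bottom edge (assumption)
    u, v = x - x0, y - y0       # fractional offsets from the text
    return ((1 - u) * (1 - v) * image[y0][x0]
            + u * (1 - v) * image[y0][x1]
            + (1 - u) * v * image[y1][x0]
            + u * v * image[y1][x1])


def resample(image, out_rows, out_cols):
    """Resample a 2D image to out_rows x out_cols with bilinear weights."""
    rows, cols = len(image), len(image[0])
    sy = (rows - 1) / (out_rows - 1) if out_rows > 1 else 0.0
    sx = (cols - 1) / (out_cols - 1) if out_cols > 1 else 0.0
    return [
        [bilinear_sample(image, c * sx, r * sy) for c in range(out_cols)]
        for r in range(out_rows)
    ]


small = [[0, 10], [20, 30]]
center = bilinear_sample(small, 0.5, 0.5)
upsampled = resample(small, 3, 3)
```

With u = v = 0.5 each of the four neighbors receives weight 0.25, so `center` is the plain average of the four gray values.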

[0026] It should be noted that the preprocessed breast imaging data in this application refers to breast imaging data that has been standardized by denoising, normalization, interpolation and other processes, resulting in breast tissue imaging data that is free from noise and artifact interference, has a uniform grayscale distribution, consistent spatial resolution and accurate geometric location.

[0027] In some embodiments, extracting the region of interest (ROI) of breast lesions in the preprocessed breast imaging data based on the lesion annotation information can be achieved using the following steps: The lesion annotation information is mapped onto the preprocessed breast image data to obtain the lesion annotation region in the breast image; The gray-level similarity threshold range for lesion region expansion is determined based on the gray-level distribution characteristics in the preprocessed breast imaging data; The lesion annotation region is expanded using the gray-level similarity threshold range to obtain the region of interest of breast lesions in the preprocessed breast imaging data.

[0028] In specific implementation, mapping the lesion annotation information to the preprocessed breast image data to obtain the lesion annotation region in the breast image can be achieved in the following way: a coordinate linear transformation algorithm is used to establish the spatial correspondence between the lesion annotation information and the preprocessed breast image data. The pixel coordinates of the lesion boundary in the lesion annotation information are first extracted from the breast image data. Combined with the interpolation scaling ratio and the offset of the cropped area during the preprocessing process, the corresponding coordinates in the preprocessed image are calculated by the formula "mapping coordinates = (original coordinates - cropped offset) × scaling ratio". Then, the lesion boundary is redrawn according to the coordinates to form a lesion annotation region that is precisely aligned with the space of the preprocessed image. Other methods can also be used in other embodiments, which are not limited here.
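As a worked instance of the formula "mapping coordinates = (original coordinates - cropped offset) × scaling ratio", the following sketch maps annotated boundary points into the preprocessed image. The function name `map_annotation` and the per-axis offset and scale tuples are illustrative assumptions.

```python
def map_annotation(points, crop_offset, scale):
    """Map (x, y) boundary points from the original image into the
    preprocessed image: (original - crop offset) * scaling ratio."""
    ox, oy = crop_offset
    sx, sy = scale
    return [((x - ox) * sx, (y - oy) * sy) for x, y in points]


# Hypothetical values: the crop started at (100, 50) and interpolation
# doubled the resolution along both axes.
boundary = [(120, 80), (140, 80)]
mapped = map_annotation(boundary, crop_offset=(100, 50), scale=(2.0, 2.0))
```

The mapped points can then be joined to redraw the lesion boundary on the preprocessed image.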

[0029] In addition, in specific implementation, the gray-level similarity threshold range for lesion region expansion can be determined from the gray-level distribution characteristics of the preprocessed breast image data in the following way: First, traverse all pixels or voxels within the lesion annotation region, extract the gray value of each, and calculate the gray-level mean (μ_roi) of the annotation region by summing the values and dividing by the total number of pixels or voxels. Then, calculate the gray-level standard deviation (σ_roi) of the annotation region by summing the squared differences between each gray value and the mean, dividing by the total number of pixels or voxels, and taking the square root. At the same time, select a ring-shaped area with a width of 5-10 pixels or voxels around the lesion annotation region as a background reference region, and use the same statistical method to calculate its gray-level mean (μ_bg) and gray-level standard deviation (σ_bg), thereby quantifying the gray-level distribution characteristics. Based on these statistics, the statistical thresholding method commonly used in clinical image segmentation is adopted, with a threshold range of [μ_roi - 1.5 × σ_roi, μ_roi + 1.5 × σ_roi]. It is also verified that this range does not overlap with the gray-level distribution of the background region, [μ_bg - σ_bg, μ_bg + σ_bg], which ensures that the threshold can distinguish lesions from the background. This threshold range is used as the gray-level similarity threshold range for lesion region expansion. Other methods can also be used in other embodiments, which are not limited here.
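The threshold computation and the overlap check can be sketched as below. The function name `gray_threshold_range` is hypothetical; the standard deviation divides by the number of pixels, as the paragraph describes, and returning a boolean overlap flag (rather than rejecting the range) is an assumption about how the verification result would be used.

```python
from statistics import fmean, pstdev


def gray_threshold_range(roi_values, bg_values, k=1.5):
    """Return ([mu_roi - k*sigma_roi, mu_roi + k*sigma_roi], overlaps_bg)."""
    mu_roi = fmean(roi_values)
    sigma_roi = pstdev(roi_values, mu_roi)   # divides by N, as in the text
    mu_bg = fmean(bg_values)
    sigma_bg = pstdev(bg_values, mu_bg)

    low, high = mu_roi - k * sigma_roi, mu_roi + k * sigma_roi
    bg_low, bg_high = mu_bg - sigma_bg, mu_bg + sigma_bg
    # Two closed intervals overlap when each starts before the other ends.
    overlaps = low <= bg_high and bg_low <= high
    return (low, high), overlaps


roi = [100, 102, 98, 100]          # toy lesion gray values
background = [50, 52, 48, 50]      # toy ring-shaped reference region
(low, high), overlaps = gray_threshold_range(roi, background)
_, overlaps_close = gray_threshold_range(roi, [99, 101, 100, 100])
```

In the first call the lesion and background intervals are far apart, so the range passes the check; in the second the background sits inside the lesion range and the check flags an overlap.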

[0030] In addition, in specific implementation, expanding the lesion annotation region using the gray-level similarity threshold range to obtain the region of interest of breast lesions in the preprocessed breast image data can be achieved in the following way: a region growing algorithm is used, taking all pixels or voxels in the lesion annotation region as seed points and applying an 8-connectivity (2D image) or 26-connectivity (3D image) neighborhood rule. The pixels or voxels adjacent to each seed point are traversed, and each is checked to see whether its gray value falls within the preset gray-level similarity threshold range. If it does, the pixel or voxel is included in the lesion region and used as a new seed point for further growth, until no adjacent pixels or voxels that meet the condition remain. Finally, a region of interest of breast lesions is obtained that has complete boundaries, covers all lesion tissue, and contains no redundant background. Other methods can also be used in other embodiments, which are not limited here.
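A minimal 2D sketch of the region growing step with 8-connectivity follows. The function name `region_grow` and the breadth-first queue are implementation choices made for illustration; the 3D case with 26-connectivity would add a third offset axis in the same way.

```python
from collections import deque


def region_grow(image, seeds, low, high):
    """Grow a region from seed pixels over 8-connected neighbors whose
    gray value lies in [low, high]. Returns a set of (row, col) pixels."""
    rows, cols = len(image), len(image[0])
    neighbors = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    region = set(seeds)
    queue = deque(seeds)
    while queue:
        r, c = queue.popleft()
        for dr, dc in neighbors:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and low <= image[nr][nc] <= high:
                region.add((nr, nc))
                queue.append((nr, nc))  # accepted pixels become new seeds
    return region


image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 8, 0],
    [0, 0, 0, 8],
]
roi = region_grow(image, seeds=[(1, 1)], low=7, high=10)
```

The pixel at (3, 3) touches the block only diagonally, so it is reached under 8-connectivity but would be missed under 4-connectivity.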

[0031] It should be noted that, in this application, the lesion annotation region refers to the binary marker region formed in the preprocessed breast image that is precisely aligned with the lesion location; the gray-level similarity threshold range refers to the range of gray values used to distinguish the lesion from the background, which matches the gray-level distribution pattern of the lesion tissue and avoids misjudging background tissue as lesion; the region of interest refers to the finally determined area in the breast image that completely covers the lesion tissue, is strictly distinguished from the background, and can be used to analyze the lesion area.

[0032] In step 103, dual-path feature extraction is performed on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information.

[0033] In some embodiments, performing dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information can be achieved through the following steps: Morphological and texture features are extracted from the region of interest to obtain morphological and texture radiomics features; Semantic deep features are extracted from the region of interest to obtain semantic deep features with lesion semantic representation information.

[0034] In specific implementation, morphological and texture features can be extracted from the region of interest to obtain the morphological and texture radiomics features in the following way: First, the region of interest is masked according to the image dimension (2D / 3D). For morphological features, geometric parameters are calculated by traversing all pixels or voxels within the mask: area (2D) or volume (3D); surface area; sphericity = surface area² / (36π × volume)²; major axis / minor axis ratio = the ratio of the major axis to the minor axis of the circumscribed cuboid; and compactness = volume / volume of the circumscribed sphere. For texture features, parameters such as contrast, correlation, energy, and entropy are calculated from the gray-level co-occurrence matrix, with pixel or voxel distance d = 1-3 and angles of 0° / 45° / 90° / 135°. Long-run emphasis and short-run emphasis are extracted from the gray-level run-length matrix, and size-zone non-uniformity is extracted from the gray-level size-zone matrix. All extracted morphological and texture parameters are Z-score normalized to eliminate the influence of differing units and scales, ultimately forming the morphological and texture radiomics features. Other methods can be used in other embodiments, which are not limited here.
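To make the gray-level co-occurrence matrix (GLCM) step concrete, here is a small pure-Python sketch that builds a symmetric, normalized GLCM for one offset and derives contrast, energy, and entropy from it. Several details are assumptions not fixed by the text: the matrix is made symmetric, energy is taken as the sum of squared probabilities, entropy uses base-2 logarithms, and the names `OFFSETS`, `glcm`, and `texture_features` are illustrative. Only distance d = 1 is shown.

```python
from collections import Counter
from math import log2

# (row step, column step) for the four angles at distance d = 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, offset):
    """Symmetric, normalized co-occurrence probabilities as a dict
    mapping (gray_i, gray_j) -> probability."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                i, j = image[r][c], image[nr][nc]
                counts[(i, j)] += 1
                counts[(j, i)] += 1  # count both directions (symmetric)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """Contrast, energy, and entropy of a normalized GLCM."""
    contrast = sum((i - j) ** 2 * v for (i, j), v in p.items())
    energy = sum(v * v for v in p.values())
    entropy = -sum(v * log2(v) for v in p.values() if v > 0)
    return {"contrast": contrast, "energy": energy, "entropy": entropy}


patch = [[0, 0], [1, 1]]  # two horizontal stripes
horizontal = texture_features(glcm(patch, OFFSETS[0]))
vertical = texture_features(glcm(patch, OFFSETS[90]))
```

Along the stripes neighboring pixels are identical, so the horizontal contrast is zero, while across the stripes every pair differs by one gray level.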

[0035] In addition, in specific implementation, semantic deep feature extraction on the region of interest to obtain semantic deep features with lesion semantic representation information can be achieved in the following way: For 2D regions of interest from mammograms and ultrasound images, the image is first scaled to 224×224 pixels, and the single-channel image is replicated into a 3-channel tensor, which is then input into a pre-trained ResNet50 network. The last fully connected layer of the network is removed, and the 2048-dimensional vector output by the second-to-last layer is taken. For 3D regions of interest from magnetic resonance imaging, the image is scaled to a 128×128×128 voxel tensor, which is then input into a pre-trained 3D ResNet18 network, and the 1024-dimensional vector output by the global average pooling layer is taken. These two vectors are the semantic deep features, which can represent high-level information such as lesion edge morphology, internal density heterogeneity, and histopathological correlation features. Other methods can also be used in other embodiments, which are not limited here.

[0036] It should be noted that the radiomics features in this application reflect the intuitive and quantifiable spatial structure and internal gray-level distribution of lesions in breast images, and are a direct representation of the physical characteristics of the lesions; the semantic deep features reflect the deeper abstract semantic information of lesions in breast images, including the fine structure of the lesion edge, the heterogeneity of internal tissues, the association characteristics with surrounding normal tissues, and the potential pathological attribute association information, which can be used to analyze the lesion features in breast images.

[0037] In step 104, a deep learning feature fusion model for breast images is constructed. Based on the deep learning feature fusion model, the radiomics features and the semantic deep features are fused to obtain the fused feature vector of the breast images.

[0038] In some embodiments, constructing a deep learning feature fusion model for breast images can be achieved through the following steps: A feature adaptation preprocessing branch is established for the radiomics features and the semantic deep features; The feature adaptation preprocessing branch is weighted based on an attention mechanism to obtain a deep learning feature fusion model for breast images.

[0039] In specific implementation, the feature adaptation preprocessing branch for radiomics features and semantic deep features can be established as follows: For radiomics features, mean imputation is first used to fill in missing values, and outliers are removed using the 3σ principle, i.e., values deviating from the mean by more than three times the standard deviation are removed. Then, a fully connected layer performs dimensionality standardization, mapping the original morphological and texture feature vectors of more than 100 dimensions to 256 dimensions. The fully connected layer is followed by a batch normalization layer to stabilize the data distribution and accelerate training convergence, and by a ReLU activation function that introduces a nonlinear transformation to adapt to feature complexity. A dropout rate of 0.3 is also set to avoid overfitting. For semantic deep features, considering the dimensionality difference between 2D images (corresponding to 2048-dimensional vectors) and 3D images (corresponding to 1024-dimensional vectors), a fully connected layer is likewise used to reduce the dimensionality uniformly to 256. This fully connected layer uses the LeakyReLU activation function with a negative slope of 0.2 to address the vanishing gradient of the ReLU function in the negative interval. Together, these steps ensure that the two types of features have consistent dimensions and that their data distributions are suited to the subsequent weighting. This completes the construction of the feature adaptation preprocessing branch for radiomics features and semantic deep features. Other methods can be used in other embodiments, which are not limited here.
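The mean imputation and 3σ outlier removal applied to the radiomics features can be sketched for a single feature column across samples. This is one possible reading of the paragraph, not a confirmed implementation: the function name `clean_feature_column` is hypothetical, missing entries are represented by `None`, and "removed" is interpreted as dropping the sample whose value lies outside mean ± 3σ.

```python
from statistics import fmean, pstdev


def clean_feature_column(values, k=3.0):
    """Mean-impute missing entries (None), then drop values that deviate
    from the mean by more than k standard deviations.

    Returns (cleaned values, indices of the samples that were kept).
    """
    observed = [v for v in values if v is not None]
    fill = fmean(observed)
    filled = [fill if v is None else v for v in values]

    mu = fmean(filled)
    sigma = pstdev(filled, mu)
    kept = [i for i, v in enumerate(filled) if abs(v - mu) <= k * sigma]
    return [filled[i] for i in kept], kept


# Toy column: 18 typical samples, one missing value, one extreme value.
column = [10.0] * 18 + [None, 1000.0]
cleaned, kept = clean_feature_column(column)
```

Note that with very few samples no value can exceed three standard deviations, so the rule only becomes effective on reasonably sized cohorts.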

[0040] Furthermore, in specific implementation, the deep learning feature fusion model for breast images can be obtained by weighting the feature adaptation preprocessing branch based on the attention mechanism, as follows. Weighting is based on the channel attention mechanism, using the SENet attention structure from the field of medical image feature fusion. The two sets of preprocessed 256-dimensional features are input into independent attention submodules. First, global average pooling is performed on the feature vectors: the mean of each feature channel is calculated to obtain a 1×256 global feature statistic. The channel weights are then learned through two fully connected layers ("256-dimensional to 64-dimensional to 256-dimensional"), where the first layer compresses the dimension to reduce computation and the second restores it. The weights are mapped to the 0-1 interval by the Sigmoid activation function. Finally, the weights are multiplied element-wise with the corresponding preprocessed feature vectors to enhance the feature channels that are key to lesion classification and suppress redundant ones. Then, a hybrid fusion layer is built. The two sets of attention-weighted 256-dimensional features are first added element-wise to obtain a 256-dimensional intermediate fused feature. This intermediate fused feature is then concatenated with the two sets of weighted features to form a 768-dimensional feature vector. A fully connected layer, configured with a batch normalization layer and a ReLU activation function, then reduces the 768-dimensional vector to 128 dimensions, achieving deep feature fusion and dimensionality optimization. Finally, the feature adaptation preprocessing, attention weighting, and hybrid fusion layer described above are connected layer by layer to form a deep learning feature fusion model for breast images that can be used directly for subsequent lesion classification. Other methods can be used in other embodiments, which are not limited here.
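The attention weighting and hybrid fusion of [0040] can be traced dimension by dimension in a short NumPy sketch. It is illustrative only and rests on several assumptions. The weights are random rather than trained. Biases and learned batch normalization parameters are omitted. The ReLU between the two attention layers follows the usual SENet design and is not stated in the source. For an already flattened 256-dimensional vector, the global average pooling step is taken to be the vector itself. The class name `SEGate` and function name `hybrid_fusion` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

class SEGate:
    """Squeeze-and-excitation style gate for flat feature vectors (256 -> 64 -> 256)."""

    def __init__(self, rng, dim=256, hidden=64):
        self.w1 = rng.normal(scale=0.05, size=(dim, hidden))   # compression layer
        self.w2 = rng.normal(scale=0.05, size=(hidden, dim))   # restoration layer

    def weights(self, x):
        squeezed = x  # assumption: pooling a flat vector leaves one value per channel
        return sigmoid(relu(squeezed @ self.w1) @ self.w2)     # values in (0, 1)

    def __call__(self, x):
        return x * self.weights(x)                             # element-wise reweighting

def hybrid_fusion(a, b, w_out, eps=1e-5):
    """Add, concatenate to 768 dimensions, then FC + batch norm + ReLU down to 128."""
    summed = a + b                                       # 256-dim intermediate feature
    stacked = np.concatenate([summed, a, b], axis=1)     # 256 * 3 = 768 dims
    z = stacked @ w_out
    z = (z - z.mean(axis=0)) / np.sqrt(z.var(axis=0) + eps)
    return relu(z)

rng = np.random.default_rng(0)
radiomics_256 = rng.normal(size=(8, 256))   # stand-ins for the adapted features
semantic_256 = rng.normal(size=(8, 256))

gate_r, gate_s = SEGate(rng), SEGate(rng)   # independent submodules per feature type
weighted_r = gate_r(radiomics_256)
weighted_s = gate_s(semantic_256)

w_out = rng.normal(scale=0.05, size=(768, 128))
fused = hybrid_fusion(weighted_r, weighted_s, w_out)
print(fused.shape)
```

Keeping the element-wise sum alongside both weighted inputs means the final layer sees the shared signal and each modality's own contribution, which is why the concatenated width is 768 rather than 512.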

[0041] It should be noted that the feature adaptation preprocessing branch in this application represents the pre-feature regularization module in the deep learning feature fusion model. Specifically, it refers to the standardized processing unit designed for radiomics features and semantic deep features that differ in source, dimension, and distribution. This eliminates problems such as dimensional mismatch and inconsistent data distribution between the two types of features, reflecting the regularization and adaptation capabilities of the original features. The deep learning feature fusion model represents a complete feature processing model that integrates the feature adaptation preprocessing branch, the attention weighting module, and the hybrid fusion layer. It can organically combine the intuitive physical quantitative information of radiomics features with the abstract pathological correlation information of semantic deep features, reflecting the comprehensive extraction and integration capabilities of multi-dimensional features of breast lesions, and can output more comprehensive and discriminative fused features.

[0042] In some embodiments, referring to Figure 3, which is an exemplary flowchart of determining the fused feature vector in some embodiments of this application, the fused feature vector of breast images is obtained by fusing the radiomics features and the semantic deep features based on the deep learning feature fusion model using the following steps: In step 1041, the radiomics features and the semantic deep features are input into the deep learning feature fusion model, and the feature fusion step in the deep learning feature fusion model is executed; In step 1042, the fused feature vector of the breast image is output based on the deep learning feature fusion model.

[0043] In practice, radiomics features and semantic deep features are simultaneously input into the constructed deep learning feature fusion model, triggering the model's built-in feature fusion process. After the process starts, the model's feature adaptation preprocessing branch first completes data normalization. For radiomics features, mean imputation is used to fill in missing values, and outliers deviating from the mean by more than three standard deviations are removed according to the 3σ principle. The data is then input into a fully connected layer configured with a batch normalization layer and a ReLU activation function, which maps it from the original 100+ dimensions to 256 dimensions, with a dropout rate of 0.3 set to avoid interference from redundant information. For semantic deep features, a fully connected layer configured with a LeakyReLU activation function with a negative slope of 0.2 is used to reduce the dimensionality uniformly to 256, ensuring that the two types of features have consistent dimensions and compatible data distributions. Subsequently, the model enters the attention weighting stage. Using the SENet channel attention structure, the two sets of 256-dimensional normalized features are input into independent submodules. Global average pooling first calculates the mean of each feature channel to obtain the global statistic. Two fully connected layers ("256-dimensional to 64-dimensional to 256-dimensional") then compress and restore the dimension to learn the channel weights. After Sigmoid activation, the weights are mapped to the 0-1 interval and multiplied element-wise with the corresponding normalized features to strengthen the features that are key to lesion classification and suppress redundant ones. Next, the model's hybrid fusion layer is started. The two sets of weighted 256-dimensional features are first added element-wise to obtain a 256-dimensional intermediate feature. This intermediate feature is then concatenated with the two sets of weighted features to form a 768-dimensional feature vector, which is input into a fully connected layer with a batch normalization layer and a ReLU activation function to reduce the dimension to 128, completing the deep integration of features. Finally, the model's output layer directly outputs the fused feature vector of breast images, which integrates the physical quantitative information of radiomics with the abstract pathological correlation information of the semantic deep features. Other methods can be used in other embodiments, which are not limited here.

[0044] It should be noted that the fusion feature vector in this application represents a high-dimensional quantized vector that combines physical interpretability with deep semantic information. Its essence is a comprehensive representation of the multi-dimensional features of breast lesions, reflecting the all-round lesion characteristics from "appearance physical features" to "essential pathological correlation features". It can provide more comprehensive and discriminative feature support for tasks such as classifying benign and malignant breast lesions and determining pathological types.

[0045] In step 105, lesion samples of breast images are obtained, and lesion category analysis is performed on the breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions.

[0046] It should be noted that the breast lesion sample in this application represents a standardized dataset containing complete information related to breast lesions and undergoing standardized processing. It is a systematic encapsulation of clinical images and related information of breast lesions; it reflects the correspondence between the imaging manifestations, pathological attributes, and clinical characteristics of breast lesions, including the visualization features of the morphology, density / signal, and texture of breast lesions under different imaging modalities, as well as the core clinical diagnostic information of the benign or malignant status, pathological classification, and grading of the lesions; the lesion sample includes breast images, lesion location and boundary information annotated by radiologists, radiomics features and semantic depth feature data of the corresponding lesions, and also covers the patient's basic clinical information, pathological examination report results, and follow-up data correlation information, providing real, comprehensive, and clinically valuable data source support for the training and performance verification of breast lesion-related models.

[0047] In some embodiments, the classification result of breast lesions can be obtained by performing lesion category analysis on breast images using the lesion sample and the fused feature vector, which can be achieved by the following steps: The lesion sample and the fused feature vector are associated and fused to obtain the association fusion feature; Based on the association fusion feature, lesion classification is performed on the breast images, thereby obtaining the classification results of breast lesions.

[0048] In specific implementation, the lesion sample and the fusion feature vector are associated and fused to obtain the associated fusion feature. This can be achieved in the following way: Perform the association and fusion operation between the lesion sample and the fusion feature vector. First, extract key auxiliary information from the lesion sample, including the clinical category label of the lesion and the patient's basic clinical data. Then, map and bind the 128-dimensional fusion feature vector corresponding to each lesion sample one by one with the above-processed clinical auxiliary information and category label. The associated fusion feature is formed by vector concatenation. For example, 128-dimensional fusion feature + 2-dimensional clinical auxiliary information = 130-dimensional associated fusion feature. At the same time, the 3σ principle is used to remove feature samples with outliers after association to ensure data integrity and consistency. Other methods can also be used in other embodiments, which are not limited here.
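The 128 + 2 = 130 concatenation in [0048] can be sketched as follows. The two clinical fields (age and a binary menopausal flag), the z-score standardization of those fields, and the function name `associate` are hypothetical choices made for illustration. The source only states that two-dimensional clinical auxiliary information is concatenated. The category label is kept in a parallel array so that it stays bound to each sample without becoming an input feature.

```python
import numpy as np

def associate(fused, clinical):
    """Concatenate per-lesion fused features with standardized clinical fields."""
    fused = np.asarray(fused, dtype=float)
    clinical = np.asarray(clinical, dtype=float)
    if fused.shape[0] != clinical.shape[0]:
        raise ValueError("each lesion needs exactly one clinical record")
    std = clinical.std(axis=0)
    std[std == 0] = 1.0                        # guard against constant columns
    clinical_z = (clinical - clinical.mean(axis=0)) / std
    return np.concatenate([fused, clinical_z], axis=1)

rng = np.random.default_rng(0)
fused = rng.normal(size=(5, 128))              # stand-in 128-dim fused vectors

# Hypothetical clinical fields: age in years, menopausal status (0/1).
clinical = np.array([[45, 0], [62, 1], [38, 0], [57, 1], [70, 1]])
labels = np.array([0, 1, 0, 1, 1])             # category labels, bound by row index

assoc = associate(fused, clinical)
print(assoc.shape, labels.shape)
```

Row order is the binding: row i of `assoc` and entry i of `labels` describe the same lesion sample.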

[0049] Furthermore, in specific implementation, classifying breast lesions based on the association fusion features to obtain the classification results of breast lesions can be achieved in the following way. A support vector machine (SVM) from the field of medical image classification is selected as the classifier, and a radial basis function (RBF) is configured as the kernel function. The optimal parameter combination is determined by traversing the regularization parameter C (range 0.1-10.0) and the kernel function parameter gamma (range 0.001-0.1) using a grid search method. Simultaneously, recursive feature elimination is used to retain the top 80% of the association fusion features, removing redundant information and reducing computational complexity. The association fusion features are then divided by stratified sampling into training, validation, and test sets at a 7:1:2 ratio, ensuring a consistent lesion category distribution across the sets. A 5-fold cross-validation method is employed to train the classifier. After each training round, model performance is evaluated on the validation set using classification accuracy, sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve. When validation set performance shows no improvement for three consecutive rounds, the optimal parameters are fixed. Finally, the association fusion features corresponding to the breast images to be analyzed are input into the trained classifier. By calculating the similarity between these features and the feature spaces of each category of samples, the class label with the highest probability value and its corresponding confidence score are output, which is the final classification result of the breast lesion. Other methods can be used in other embodiments, which are not limited here.
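The data handling around the classifier in [0049] can be sketched with the Python standard library alone. The example shows a stratified 7:1:2 split, an enumeration of a C/gamma grid, and the accuracy, sensitivity, and specificity computation for a binary case. The specific grid values are an assumption: the source gives only the ranges 0.1-10.0 and 0.001-0.1, not the step. The function names are hypothetical, and the SVM training itself is deliberately left out.

```python
import itertools
import random
from collections import defaultdict

def stratified_split(labels, ratios=(0.7, 0.1, 0.2), seed=0):
    """Split sample indices into train/val/test while preserving class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_train = round(len(idxs) * ratios[0])
        n_val = round(len(idxs) * ratios[1])
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]       # remainder goes to the test set
    return train, val, test

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity with 1 as the positive (malignant) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Hypothetical label set: 60 benign (0) and 40 malignant (1) lesions.
labels = [0] * 60 + [1] * 40
train, val, test = stratified_split(labels)

# Assumed log-spaced grid inside the stated ranges for C and gamma.
c_values = [0.1, 1.0, 10.0]
gamma_values = [0.001, 0.01, 0.1]
grid = list(itertools.product(c_values, gamma_values))

metrics = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
print(len(train), len(val), len(test), len(grid), metrics)
```

Each `(C, gamma)` pair in `grid` would be passed to an RBF-kernel SVM and scored on the validation indices. The three-round patience rule from [0049] would then decide when to stop updating the best pair.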

[0050] It should be noted that the association fusion feature in this application represents a high-dimensional quantitative feature set formed by concatenating the fusion feature vector that integrates radiomics physical quantitative information and semantic deep abstract information with the standardized clinical auxiliary information and lesion category labels in the lesion sample. In essence, it is a comprehensive encapsulation of the imaging features and clinical association information of breast lesions, reflecting the multi-dimensional and all-round characteristics of lesions from imaging manifestations to clinical background, and providing a more comprehensive basis for lesion classification. The classification result of breast lesions represents the result of classifying lesions in breast imaging, which can intuitively present the potential pathological attributes of lesions and provide objective and quantitative reference conclusions for clinical radiological diagnosis and treatment plan formulation.

[0051] Furthermore, in another aspect of this application, in some embodiments, this application provides a deep learning-based breast radiomics classification system. Referring to Figure 4, which is a schematic diagram of the structure of a deep learning-based breast radiomics classification system according to some embodiments of this application, the deep learning-based breast radiomics classification system 400 includes an acquisition module 401, a processing module 402, and an execution module 403, which are described below. The acquisition module 401 is mainly used to acquire breast imaging data and corresponding lesion annotation information. The processing module 402 is used to preprocess the breast imaging data to obtain preprocessed breast imaging data, and to extract the region of interest of breast lesions in the preprocessed breast imaging data based on the lesion annotation information. The processing module 402 is also used to perform dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information. The processing module 402 is further used to construct a deep learning feature fusion model for breast images, and to perform feature fusion on the radiomics features and the semantic deep features based on the deep learning feature fusion model to obtain a fused feature vector of breast images. The execution module 403 is mainly used to acquire lesion samples of breast images, and to perform lesion category analysis on the breast images through the lesion samples and the fused feature vector to obtain the classification result of breast lesions.

[0052] In addition, this application also provides a computer device, the computer device including a memory and a processor, the memory storing code, and the processor being configured to acquire the code and execute the above-described deep learning-based breast radiomics classification method.

[0053] In some embodiments, referring to Figure 5, which is a schematic diagram of a computer device implementing a deep learning-based breast radiomics classification method according to some embodiments of this application, the deep learning-based breast radiomics classification method in the above embodiments can be implemented by the computer device shown in Figure 5. The computer device 500 includes at least one processor 501, a communication bus 502, a memory 503, and at least one communication interface 504.

[0054] Processor 501 can be a general-purpose central processing unit (CPU) or an application-specific integrated circuit (ASIC).

[0055] The communication bus 502 can be used to transmit information between the aforementioned components.

[0056] Memory 503 may be a read-only memory (ROM) or other type of static storage device capable of storing static information and instructions, a random access memory (RAM) or other type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk or other magnetic storage device, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto. Memory 503 may exist independently and be connected to processor 501 via communication bus 502. Memory 503 may also be integrated with processor 501.

[0057] The memory 503 stores program code for executing the scheme of this application, and its execution is controlled by the processor 501. The processor 501 executes the program code stored in the memory 503. The program code may include one or more software modules. The method used in the above embodiments can be implemented by the processor 501 and one or more software modules in the program code in the memory 503.

[0058] Communication interface 504 uses any transceiver-like device to communicate with other devices or communication networks, such as Ethernet, radio access network (RAN), wireless local area networks (WLAN), etc.

[0059] In a specific implementation, as one example, a computer device may include multiple processors, each of which may be a single-core (single CPU) processor or a multi-core (multi CPU) processor. Here, a processor may refer to one or more devices, circuits, and / or processing cores used to process data (e.g., computer program instructions).

[0060] The aforementioned computer device can be a general-purpose computer device or a special-purpose computer device. In specific implementations, the computer device can be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, a communication device, or an embedded device. This application does not limit the type of computer device.

[0061] In addition, this application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described deep learning-based breast radiomics classification method.

[0062] Although preferred embodiments of this application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of this application.

[0063] Obviously, those skilled in the art can make various modifications and variations to this application without departing from the spirit and scope of this application. Therefore, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.

Claims

1. A deep learning-based breast radiomics classification method, characterized in that it includes the following steps: obtaining breast imaging data and corresponding lesion annotation information; preprocessing the breast imaging data to obtain preprocessed breast imaging data, and extracting the region of interest of breast lesions in the preprocessed breast imaging data based on the lesion annotation information; performing dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information; constructing a deep learning feature fusion model for breast images, and fusing the radiomics features and the semantic deep features based on the deep learning feature fusion model to obtain a fused feature vector of breast images; and obtaining lesion samples of breast images, and performing lesion category analysis on the breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions.

2. The method as described in claim 1, characterized in that preprocessing the breast imaging data to obtain preprocessed breast imaging data specifically includes: denoising the breast imaging data to obtain denoised breast imaging data; normalizing the denoised breast imaging data to obtain normalized breast imaging data; and interpolating the normalized breast imaging data to obtain the preprocessed breast imaging data.

3. The method as described in claim 1, characterized in that extracting the region of interest of breast lesions in the preprocessed breast imaging data based on the lesion annotation information specifically includes: mapping the lesion annotation information onto the preprocessed breast imaging data to obtain the lesion annotation region in the breast image; determining the gray-level similarity threshold range for lesion region expansion identification based on the gray-level distribution characteristics in the preprocessed breast imaging data; and expanding the lesion annotation region using the gray-level similarity threshold range to obtain the region of interest of breast lesions in the preprocessed breast imaging data.

4. The method as described in claim 1, characterized in that performing dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information specifically includes: extracting morphological and texture features from the region of interest to obtain the morphological and texture radiomics features; and extracting semantic deep features from the region of interest to obtain the semantic deep features with lesion semantic representation information.

5. The method as described in claim 1, characterized in that constructing the deep learning feature fusion model for breast images specifically includes: establishing a feature adaptation preprocessing branch for the radiomics features and the semantic deep features; and weighting the feature adaptation preprocessing branch based on an attention mechanism to obtain the deep learning feature fusion model for breast images.

6. The method as described in claim 1, characterized in that fusing the radiomics features and the semantic deep features based on the deep learning feature fusion model to obtain the fused feature vector of breast images specifically includes: inputting the radiomics features and the semantic deep features into the deep learning feature fusion model, and executing the feature fusion step in the deep learning feature fusion model; and outputting the fused feature vector of breast images based on the deep learning feature fusion model.

7. The method as described in claim 1, characterized in that performing lesion category analysis on the breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions specifically includes: associating and fusing the lesion sample and the fused feature vector to obtain the association fusion feature; and performing lesion classification on the breast images based on the association fusion feature to obtain the classification results of breast lesions.

8. A deep learning-based breast radiomics classification system, characterized in that it includes: an acquisition module, used to acquire breast imaging data and corresponding lesion annotation information; a processing module, used to preprocess the breast imaging data to obtain preprocessed breast imaging data, and to extract the region of interest of breast lesions in the preprocessed breast imaging data based on the lesion annotation information, the processing module also being used to perform dual-path feature extraction on the region of interest to obtain morphological and texture radiomics features and semantic deep features with lesion semantic representation information, and to construct a deep learning feature fusion model for breast images and perform feature fusion on the radiomics features and the semantic deep features based on the deep learning feature fusion model to obtain a fused feature vector of breast images; and an execution module, used to acquire lesion samples of breast images, and to perform lesion category analysis on the breast images using the lesion samples and the fused feature vector to obtain the classification results of breast lesions.

9. A computer device, characterized in that the computer device includes a memory and a processor, the memory storing code, and the processor being configured to retrieve the code and execute the deep learning-based breast radiomics classification method as described in any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the deep learning-based breast radiomics classification method as described in any one of claims 1 to 7.