A deep learning method for thyroid nodule rapid screening

The method combines adaptive local contrast enhancement, a segmentation network with shape prior constraints, and a multi-branch convolutional neural network to address uneven image contrast, inaccurate segmentation, and incomplete risk assessment in thyroid nodule screening. It aims to improve nodule segmentation and benign/malignant differentiation and to provide screening suggestions with multi-level risk stratification.

CN122222945A · Pending · Publication Date: 2026-06-16 · 自贡市第一人民医院

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
自贡市第一人民医院
Filing Date
2026-03-11
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies for thyroid nodule screening suffer from problems such as uneven image preprocessing, inaccurate segmentation, incomplete classification, and unreasonable risk assessment, which affect the accuracy and stability of diagnosis.

Method used

A segmentation network with adaptive local contrast enhancement and shape prior constraints is combined with a multi-branch convolutional neural network and a meta-learning classifier. Image contrast is adjusted adaptively, shape and edge constraint loss functions are introduced, multiple structural features are extracted explicitly, and multi-level risk assessment is performed.

Benefits of technology

It improves the accuracy of automatic segmentation of thyroid nodules and the stability of benign/malignant differentiation, generates segmentation results that are more consistent with anatomical features, and provides screening suggestions with multi-level risk stratification.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a deep learning method for rapid screening of thyroid nodules. An original thyroid ultrasound image is obtained and subjected to adaptive local contrast enhancement to obtain a contrast-enhanced image. The contrast-enhanced image is input into a segmentation network based on an encoder-decoder structure. The segmentation probability map is thresholded to obtain a nodule binary mask, and a structural feature vector is constructed. The region of interest is input into multiple convolutional neural network branches and the structural feature vector is input into a structural feature branch, yielding multiple intermediate classification results or deep feature representations. A meta-learning classifier integrates these and outputs the predicted probability that the nodule is benign or malignant, together with a multi-level risk stratification suggestion. By combining adaptive image enhancement, segmentation with shape prior constraints, and fusion of structural and deep features, the method can improve the accuracy and stability of automatic thyroid nodule segmentation and benign/malignant discrimination.

Description

Technical Field

[0001] This invention belongs to the field of medical image processing technology and relates to a deep learning method for rapid screening of thyroid nodules.

Background Technology

[0002] Thyroid cancer is one of the most common malignant tumors in clinical practice, and its incidence has been rising globally in recent years. Thyroid nodules have a high detection rate in the general population, especially among those undergoing physical examinations and the middle-aged and elderly. Ultrasound imaging, due to its non-invasive, radiation-free, low-cost, and real-time characteristics, has become the preferred imaging method for screening and evaluating thyroid nodules.

[0003] However, thyroid ultrasound images often suffer from low contrast, high noise levels, and numerous artifacts, making it difficult to identify small nodules and their internal micro-hypoechoic points in the original images. Current clinical diagnosis relies heavily on the experience and subjective judgment of ultrasound physicians, which is significantly influenced by the physician's skill and experience, especially in primary healthcare institutions.

[0004] With the development of deep learning technology, numerous studies in recent years have utilized convolutional neural networks for automatic segmentation and benign / malignant classification of thyroid nodules. U-Net and its improved structures are widely used for medical image segmentation, while convolutional neural networks such as ResNet are used for lesion benign / malignant differentiation. Some studies have even linked segmentation and classification to form an end-to-end automated diagnostic process. However, existing technologies still have the following shortcomings: In the image preprocessing stage, fixed-window contrast enhancement or simple filtering is often used without fully considering the differences in thyroid tissue depth and local texture. This may lead to over-enhancement of superficial areas and insufficient enhancement of deep areas, affecting the performance of subsequent segmentation and classification.

[0005] During the segmentation stage, most methods directly use a common segmentation loss function, which does not adequately consider the overall shape of the thyroid nodule, the continuity of its edges, and other anatomical priors. This can easily lead to problems such as broken segmentation contours, jagged edges, or unreasonable outward expansion.

[0006] During the classification stage, some methods extract only deep features from the region of interest of the nodule, making it difficult to explicitly use the morphological features that clinical scoring systems such as TI-RADS rely on, for example aspect ratio, boundary irregularity, density of small hyperechoic spots, and halo features.

[0007] Most existing systems only output binary classification or simple probability values, without combining population screening scenarios to perform multi-level risk stratification, which is not conducive to the formulation of follow-up and further examination strategies.

[0008] Therefore, it is necessary to provide an integrated method that combines adaptive image enhancement, a segmentation network with shape prior constraints, a multi-branch classification network that fuses structural and deep features, and a multi-level risk stratification mechanism, in order to improve the accuracy and stability of automatic segmentation and benign/malignant differentiation of thyroid nodules and to better serve rapid screening of thyroid nodules.

Summary of the Invention

[0009] The purpose of this invention is to overcome the shortcomings of the prior art and provide a deep learning method for rapid screening of thyroid nodules.

[0010] To achieve the above objectives, the present invention employs the following technical solution: a deep learning method for rapid screening of thyroid nodules, including the following steps:

S1. Obtain the original ultrasound image of the thyroid region, and perform adaptive local contrast enhancement on the original ultrasound image to obtain a contrast-enhanced image.

S2. Input the contrast-enhanced image into a segmentation network based on an encoder-decoder structure to obtain a segmentation probability map representing the thyroid nodule region; the segmentation network is optimized during training with a total loss function that includes a segmentation loss, a shape regularization loss, and an edge constraint loss, so as to constrain the predicted mask to be close to the real contour in terms of overall shape and edge continuity.

S3. Threshold the segmentation probability map to obtain a nodule binary mask, and determine the region of interest of the nodule on the original ultrasound image. Based on the region of interest of the nodule and the nodule binary mask, calculate a variety of structural parameters including aspect ratio, nodule area, boundary irregularity, internal high echo density, and halo thickness ratio, and construct a structural feature vector.

S4. After normalizing the region of interest of the nodule and resizing it to a uniform size, input it into at least two convolutional neural network branches. Input the structural feature vector into the structural feature branch. Each branch outputs an intermediate classification result or a deep feature representation.

S5. Input the multiple intermediate classification results or deep feature representations into the meta-learning classifier for ensemble learning, and output the benign or malignant prediction probability and the corresponding risk level of the thyroid nodule.

[0011] As a preferred embodiment, the adaptive local contrast enhancement in step S1 includes: dividing the original ultrasound image into multiple depth band regions along the row direction; setting multiple local windows for each depth band region, calculating the gray-level variance for each local window, and using the gray-level variance to characterize the local texture complexity; selecting the local window size and contrast enhancement parameters according to the texture complexity of each depth band region and its corresponding depth, and performing local histogram equalization or contrast-limited adaptive histogram equalization processing on each depth band region to obtain the contrast-enhanced image.

[0012] As a preferred embodiment, the encoding end of the segmentation network includes multi-scale input branches, with different branches receiving contrast-enhanced images at different scaling ratios; the decoding end of the segmentation network sets side output layers at different scales, with each side output layer outputting a segmentation probability map at the corresponding scale, and participating in the calculation of the total loss function respectively.

[0013] As a preferred embodiment, the shape regularization loss is implemented based on a signed distance transformation, including: calculating a signed distance transformation map based on the labeled nodule contours, where the distance value of the foreground region is positive and the distance value of the background region is negative, and the absolute value of the distance represents the distance to the true contour; mapping the predicted mask to a preset interval and multiplying it pixel by pixel with the signed distance transformation map and summing the squares, which is used as the shape regularization loss.

[0014] As a preferred embodiment, the edge constraint loss is implemented based on the curvature penalty of the predicted mask contour, including: sampling the contour of the predicted mask, calculating the local curvature or curvature change of each sampling point; applying a penalty term to the pixels corresponding to the boundary segments where the local curvature change exceeds a preset threshold, as the edge constraint loss, to suppress the jaggedness or unreasonable protrusion of the contour.

[0015] As a preferred embodiment, the structural feature vector includes one or more of the following structural parameters: the ratio of the major axis to the minor axis of the minimum bounding rectangle of the nodule; the boundary irregularity, calculated from the difference between the area of the nodule binary mask and the area of the fitted ellipse; the internal high echo density, calculated from the number and area ratio of small high-gray-level regions obtained by gray-level threshold segmentation and connected component analysis within the nodule mask; and the ratio of halo thickness to nodule diameter, where the halo width is estimated by gray-level statistics over a preset range outside the nodule boundary taken as the halo region.

[0016] As a preferred embodiment, the multi-branch network in step S4 includes: a first convolutional neural network branch, which takes the grayscale image of the region of interest of the nodule as input and extracts deep features of the overall shape and internal texture of the nodule; a second convolutional neural network branch, which takes the combined image of the nodule outline superimposed on the grayscale image as input and extracts deep features related to the boundary shape; and a structural feature branch, which takes the structural feature vector as input and outputs a structural feature classification vector through a multi-layer fully connected network.

[0017] As a preferred approach, the meta-learning classifier is a multilayer perceptron model or a gradient boosting tree model, and the multiple intermediate classification results or deep feature representations are concatenated or fused by weighting before being input into the meta-learning classifier.

[0018] As a preferred approach, the thyroid nodules are classified into at least three risk levels, including low risk, medium risk, and high risk, based on the predicted probability of benignity or malignancy, and a corresponding follow-up period and further examination recommendations are output for each level.

[0019] The present invention has the following advantages: First, by dividing the original ultrasound image into multiple strip regions along the depth direction and combining the texture complexity represented by the local gray-level variance, the contrast enhancement parameters of each region are adaptively adjusted to make the image quality of the superficial and deep regions more balanced, significantly improve the visibility of nodule boundaries and small hyperechoic points, and provide better input for subsequent segmentation and classification.

[0020] Second, by introducing shape regularization loss based on signed distance transformation and edge constraint loss based on contour curvature into the segmentation network training, the overall shape and edge continuity of the prediction mask are constrained, which can effectively reduce problems such as contour breakage, jagged edges and unreasonable outward expansion, and generate nodule segmentation results that are more consistent with anatomical priors.

[0021] Third, the structural features such as the aspect ratio, boundary irregularity, internal high echo density, and halo thickness ratio are explicitly calculated using the segmentation results. These features are then fused with the deep features extracted by the multi-branch convolutional neural network through a meta-learning classifier. This approach retains the advantages of deep learning in feature representation while enhancing the interpretability and stability of benign and malignant discrimination.

Attached Figure Description

[0022] Figure 1 is a schematic diagram illustrating the principle of the present invention.

Detailed Implementation

[0023] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.

[0024] Example: A deep learning method for rapid screening of thyroid nodules comprises an offline training phase and an online inference phase. During the offline training phase, thyroid ultrasound images are acquired from clinical ultrasound equipment or portable ultrasound equipment. The images are typically in two-dimensional grayscale format with resolutions such as 512×512 pixels or 640×480 pixels. The segmentation network, multi-branch classification network, and meta-learning classifier are trained using labeled nodule outlines and benign/malignant labels.

[0025] During the online inference phase, the system acquires raw thyroid ultrasound images from ultrasound equipment or image storage system, executes steps S1 to S5 in sequence, and finally outputs the benign or malignant prediction probability and risk level of the nodules, and provides screening suggestions.

[0026] The following provides a detailed explanation of steps S1 to S5: The input raw thyroid ultrasound image is divided into multiple band-shaped regions along the row direction, for example, uniformly divided into superficial, mid-superficial, mid-deep, and deep layers based on image height. Multiple sliding local windows are set for each band-shaped region, and the gray-level variance of each window is calculated. The larger the gray-level variance, the more complex the texture at that location.

[0027] Based on the average or quantile gray-level variance of each strip region and its corresponding depth, determine the local window size and contrast enhancement parameters for that region: for shallow and textured regions, select a larger window and a smaller contrast enhancement coefficient to avoid over-enhancement; for deep and textured regions, select a smaller window and a larger contrast enhancement coefficient to improve the visibility of deep nodules and structures behind them.
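The band-wise parameter selection described in [0026] and [0027] can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patented implementation: the function names, the non-overlapping tiling (the text speaks of sliding windows), the reference variance `texture_ref`, and the specific window sizes and clip limits are all assumptions chosen for the example. The only behaviour taken from the text is the direction of the rule: superficial, strongly textured bands get a larger window and weaker enhancement, deeper bands a smaller window and stronger enhancement.

```python
from statistics import mean, pvariance

def band_slices(height, n_bands=4):
    """Split image rows into n_bands contiguous depth bands (top = superficial)."""
    edges = [round(i * height / n_bands) for i in range(n_bands + 1)]
    return [(edges[i], edges[i + 1]) for i in range(n_bands)]

def band_texture(image, rows, win=4):
    """Mean gray-level variance over non-overlapping win x win tiles of one band.

    Assumes the band is at least win pixels high and wide.
    """
    r0, r1 = rows
    width = len(image[0])
    variances = []
    for top in range(r0, r1 - win + 1, win):
        for left in range(0, width - win + 1, win):
            patch = [image[r][c]
                     for r in range(top, top + win)
                     for c in range(left, left + win)]
            variances.append(pvariance(patch))
    return mean(variances)

def choose_params(band_index, n_bands, texture, texture_ref=400.0):
    """Hypothetical rule: deeper or flatter bands get a smaller window
    and a larger contrast enhancement coefficient."""
    depth = band_index / max(n_bands - 1, 1)          # 0 = superficial, 1 = deepest
    flatness = 1.0 / (1.0 + texture / texture_ref)    # 1 = flat, near 0 = very textured
    strength = 0.5 * (depth + flatness)               # combined score in [0, 1]
    window = 32 if strength < 0.5 else 16             # local window size in pixels
    clip_limit = 1.0 + 3.0 * strength                 # CLAHE-style clip limit
    return window, clip_limit
```

In practice the returned window size and clip limit would be passed to a CLAHE routine applied to the corresponding band; that step is left out here because it depends on the imaging library in use.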

[0028] Local histogram equalization or contrast-limited adaptive histogram equalization (CLAHE) is performed on each strip region to obtain enhanced sub-images. An overlap area is set at the boundary of the strip regions, and the sub-images are stitched together into a complete contrast-enhanced image by weighted averaging and / or smoothing filtering.
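The stitching step in [0028] can be illustrated with a simple cross-fade. The sketch below assumes each enhanced band is given as a starting row plus a list of pixel rows, that adjacent bands overlap by a few rows, and that every image row is covered by at least one band. The triangular weighting is one possible choice of weighted average; the source does not specify the weights.

```python
def stitch_bands(bands, height):
    """Blend overlapping enhanced bands into one image.

    bands: list of (row_start, rows), where rows is a list of pixel rows.
    Rows covered by two bands are combined by a weighted average whose
    weights fall off towards each band's top and bottom edge.
    """
    width = len(bands[0][1][0])
    acc = [[0.0] * width for _ in range(height)]
    wsum = [0.0] * height
    for start, rows in bands:
        n = len(rows)
        for i, row in enumerate(rows):
            w = min(i + 1, n - i)          # triangular weight, largest mid-band
            y = start + i
            wsum[y] += w
            for x, v in enumerate(row):
                acc[y][x] += w * v
    return [[acc[y][x] / wsum[y] for x in range(width)] for y in range(height)]
```

With this weighting, a row near the bottom edge of the upper band is dominated by the lower band and vice versa, so the seam between two differently enhanced bands becomes a gradual transition rather than a visible step.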

[0029] The segmentation network employs an encoder-decoder structure. The encoder end includes multiple levels of convolutional and downsampling layers, while the decoder end includes multiple levels of upsampling layers connected to the encoder end via skip connections. To enhance multi-scale feature representation capabilities, the encoder end features multi-scale input branches, with different branches receiving contrast-enhanced images at different scaling ratios. The decoder end features side-output layers at different scales, upsampling the feature maps at each scale and outputting segmentation probability maps at the corresponding scales.

[0030] During network training, a signed distance transformation map is calculated for the labeled mask of each sample. The distance values for the foreground region are positive, and the distance values for the background region are negative. The absolute value of the distance represents the distance to the true contour. After the network prediction mask is mapped to a preset interval, it is multiplied pixel by pixel with the signed distance transformation map, and the squares of the products are summed to give the shape regularization loss, which penalizes predictions that deviate from the true shape.
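A small sketch of the signed distance map and a shape loss built on it is given below. Several details are assumptions, because the text fixes neither the preset interval nor the exact formula: the prediction is mapped from [0, 1] to [-1, 1], the distance to the contour is approximated by the distance to the nearest pixel of the opposite class, and a hinge is added so that only pixels whose mapped prediction disagrees in sign with the signed distance contribute. The brute-force distance computation is only practical for tiny demonstration masks.

```python
import math

def signed_distance_map(mask):
    """Signed distance: positive inside the nodule, negative outside.

    Assumes the mask contains both foreground and background pixels.
    """
    h, w = len(mask), len(mask[0])
    fg = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    bg = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]
    sdm = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            others = bg if mask[r][c] else fg
            d = min(math.hypot(r - rr, c - cc) for rr, cc in others)
            sdm[r][c] = d if mask[r][c] else -d
    return sdm

def shape_loss(prob, sdm):
    """Sum of squared (mapped prediction x signed distance) products,
    restricted to pixels where the prediction is on the wrong side."""
    total = 0.0
    for prow, drow in zip(prob, sdm):
        for p, d in zip(prow, drow):
            m = 2.0 * p - 1.0                   # assumed interval: [0, 1] -> [-1, 1]
            total += max(0.0, -m * d) ** 2      # assumed hinge on sign disagreement
    return total
```

Under these assumptions a prediction that matches the label exactly has zero loss, and an error costs more the farther it lies from the true contour, which matches the stated purpose of penalizing predictions that deviate from the true shape.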

[0031] Simultaneously, the contour points of the prediction mask are extracted and uniformly sampled, and the local curvature or curvature change of each sampling point is calculated. For contour segments whose curvature changes exceed the threshold, the loss weight of their corresponding pixels is increased to form an edge constraint loss, thereby suppressing jagged or unreasonably protruding edges.
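The curvature-based edge constraint in [0031] can be approximated on a sampled contour as follows. This sketch uses the absolute turning angle at each sample point as a discrete stand-in for local curvature change, and it sums the excess over a threshold as the penalty. Both choices, the threshold value, and the two demonstration contours are assumptions for illustration; a training-time version would also need to be differentiable, which this plain-Python sketch does not attempt.

```python
import math

def turning_angles(contour):
    """Absolute turning angle (radians) at each vertex of a closed contour
    given as a list of (x, y) sample points."""
    n = len(contour)
    angles = []
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = contour[i - 1], contour[i], contour[(i + 1) % n]
        a_in = math.atan2(y1 - y0, x1 - x0)
        a_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the direction change into [-pi, pi) before taking its magnitude.
        diff = (a_out - a_in + math.pi) % (2 * math.pi) - math.pi
        angles.append(abs(diff))
    return angles

def edge_penalty(contour, threshold):
    """Sum of the amounts by which turning angles exceed the threshold."""
    return sum(max(0.0, a - threshold) for a in turning_angles(contour))

# Demonstration contours: a smooth circle and a saw-toothed version of it.
circle = [(10 * math.cos(2 * math.pi * i / 32), 10 * math.sin(2 * math.pi * i / 32))
          for i in range(32)]
jagged = [((10 + 2 * (i % 2)) * math.cos(2 * math.pi * i / 32),
           (10 + 2 * (i % 2)) * math.sin(2 * math.pi * i / 32))
          for i in range(32)]
```

With a threshold of 0.5 rad, the smooth circle incurs no penalty because each of its turning angles is about 0.196 rad, while the saw-toothed contour is penalized at its vertices.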

[0032] The total loss function is a weighted combination of segmentation loss, shape regularization loss, and edge constraint loss. The segmentation loss can be a weighted sum of Dice loss and cross-entropy loss. By minimizing the total loss function, the segmentation network is trained to generate nodule masks with reasonable overall shape and smooth, continuous edges while ensuring segmentation accuracy.
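As a rough illustration of the weighted combination in [0032], the sketch below computes a Dice loss and a binary cross-entropy loss on flattened probability and label lists, then adds externally computed shape and edge terms. The weights `alpha`, `lambda_shape`, and `lambda_edge` are hypothetical defaults; the source states only that the terms are weighted, not what the weights are.

```python
import math

def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss on flattened probabilities and 0/1 labels."""
    inter = sum(p * t for p, t in zip(prob, target))
    return 1.0 - (2.0 * inter + eps) / (sum(prob) + sum(target) + eps)

def bce_loss(prob, target, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clamped away from 0 and 1."""
    total = 0.0
    for p, t in zip(prob, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(prob)

def total_loss(prob, target, shape_term, edge_term,
               alpha=0.5, lambda_shape=0.1, lambda_edge=0.05):
    """Weighted sum: segmentation loss + shape regularization + edge constraint."""
    seg = alpha * dice_loss(prob, target) + (1.0 - alpha) * bce_loss(prob, target)
    return seg + lambda_shape * shape_term + lambda_edge * edge_term
```

When side outputs at several scales take part in training, as in [0029], one option would be to evaluate this function per scale and sum the results, though the text does not say how the per-scale terms are weighted.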

[0033] Thresholding is applied to the segmentation probability map to obtain a binary mask for the nodule, and connected component analysis is used to obtain the target nodule region. The minimum bounding rectangle of the mask is calculated on the original ultrasound image, and extended outward by a predetermined number of pixels, with this rectangular region serving as the region of interest for the nodule. Based on the mask, shape features such as aspect ratio, nodule area, and boundary irregularity are calculated. The number and area ratio of small high-gray-level regions inside the nodule are statistically analyzed using gray-level thresholding and connected component analysis to estimate the density of internal high-echo points. A halo region is constructed within a predetermined range outside the nodule edge, and the halo width is measured by gray-level statistics, calculating the ratio of halo thickness to nodule diameter. The above parameters are concatenated into a structural feature vector in a predetermined order.
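Part of the feature extraction in [0033] can be sketched as below. The example is deliberately simplified and its simplifications are assumptions: it uses an axis-aligned bounding box rather than a rotated minimum bounding rectangle, it takes the ellipse inscribed in that box in place of a fitted ellipse, it measures hyperechoic density as a plain area ratio without connected component analysis, and it omits the halo measurement. The threshold `bright_thresh` and the function names are hypothetical.

```python
import math

def bounding_box(mask):
    """Inclusive (row_min, row_max, col_min, col_max) of a non-empty binary mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1], cols[0], cols[-1]

def roi_box(mask, margin):
    """Bounding box extended outward by `margin` pixels, clipped to the image."""
    r0, r1, c0, c1 = bounding_box(mask)
    h, w = len(mask), len(mask[0])
    return (max(r0 - margin, 0), min(r1 + margin, h - 1),
            max(c0 - margin, 0), min(c1 + margin, w - 1))

def structural_features(image, mask, bright_thresh=200):
    """Return [aspect ratio, area, boundary irregularity, hyperechoic density]."""
    r0, r1, c0, c1 = bounding_box(mask)
    height, width = r1 - r0 + 1, c1 - c0 + 1
    area = sum(1 for row in mask for v in row if v)
    aspect_ratio = max(height, width) / min(height, width)
    # Ellipse inscribed in the bounding box stands in for the fitted ellipse.
    ellipse_area = math.pi * (height / 2) * (width / 2)
    irregularity = abs(area - ellipse_area) / ellipse_area
    bright = sum(1 for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)
                 if mask[r][c] and image[r][c] >= bright_thresh)
    return [aspect_ratio, float(area), irregularity, bright / area]
```

The fixed output order mirrors the requirement that the parameters be concatenated into a structural feature vector in a predetermined order.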

[0034] A multi-branch classification network is constructed: the first convolutional neural network branch takes the normalized nodule grayscale image as input to extract overall morphology and internal texture features; the second convolutional neural network branch takes a combined image of nodule contours superimposed on the grayscale image as input to emphasize boundary shape features; the structural feature branch takes the structural feature vector as input and outputs a structural feature classification vector through a multi-layer fully connected network. Each branch outputs intermediate classification results or high-dimensional feature representations to provide input for the subsequent meta-learning classifier.
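The input to the second branch, a grayscale image with the nodule contour superimposed, could be prepared along the following lines. This is only a sketch of one way to build such a combined image: it marks mask pixels that touch the background (4-connectivity) with a fixed gray value. The overlay value and the single-channel representation are assumptions; the source does not say how the contour and image are combined.

```python
def overlay_contour(image, mask, value=255):
    """Return a copy of `image` with the boundary pixels of `mask` set to `value`."""
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in image]
    for r in range(h):
        for c in range(w):
            if not mask[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            # A foreground pixel is on the boundary if any 4-neighbour is
            # background or lies outside the image.
            if any(not (0 <= rr < h and 0 <= cc < w) or not mask[rr][cc]
                   for rr, cc in neighbours):
                out[r][c] = value
    return out
```

An alternative with the same intent would be to stack the contour as a second input channel instead of overwriting pixel values, which keeps the original gray levels along the boundary.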

[0035] The intermediate classification results or deep feature representations output from the multiple branches are concatenated and input into a meta-learning classifier for ensemble learning. The meta-learning classifier can be a multilayer perceptron (MLP) or a gradient boosting tree model, etc. By training the meta-learning classifier, it learns how to integrate the information from each branch and outputs the predicted probability that the nodule is malignant.
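The fusion step in [0035] is a form of stacking: branch outputs are concatenated and a second-level model learns how to weigh them. The sketch below substitutes a single-layer logistic regression for the multilayer perceptron or gradient boosting tree named in the text, purely to keep the example short and dependency-free. The learning rate, epoch count, and function names are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fuse(branch_outputs):
    """Concatenate per-branch output vectors into one meta-feature vector."""
    return [v for branch in branch_outputs for v in branch]

def train_meta(features, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression meta-learner by batch gradient descent."""
    weights, bias = [0.0] * len(features[0]), 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_malignancy(weights, bias, x):
    """Predicted probability that the nodule is malignant."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
```

In a stacking setup the meta-learner is usually trained on branch outputs for samples the branches did not see during their own training, so that it does not simply learn to trust overfitted branches; the source does not describe how its training data are split.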

[0036] Based on the malignancy prediction probability, two or more thresholds are set to classify nodules into at least three risk levels, such as low risk, intermediate risk, and high risk. For different risk levels, the system provides corresponding follow-up and further examination suggestions. For example, low-risk nodules are advised to be re-examined during routine physical examinations, intermediate-risk nodules are advised to be followed up or undergo fine-needle aspiration biopsy within a certain period of time, and high-risk nodules are advised to be referred to a surgical specialist for further evaluation as soon as possible.
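The threshold-based stratification in [0036] reduces to a few comparisons. In the sketch below the cut-off values 0.3 and 0.7 are placeholders chosen for illustration, since the source gives no numeric thresholds, and the suggestion strings paraphrase the examples in the text.

```python
def stratify(prob, low_cut=0.3, high_cut=0.7):
    """Map a malignancy probability to (risk level, screening suggestion).

    The cut-offs are hypothetical; the source only requires two or more
    thresholds that yield at least three levels.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if prob < low_cut:
        return "low", "re-examine at routine physical examination"
    if prob < high_cut:
        return ("intermediate",
                "follow up or consider fine-needle aspiration biopsy within a set period")
    return "high", "refer to a surgical specialist for prompt evaluation"
```

In a deployed screening system the thresholds would need to be calibrated on validation data and reviewed clinically, because they determine how many nodules are sent for biopsy or referral.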

[0037] This invention is not limited to the specific embodiments described above. The invention extends to any new feature or combination disclosed in this specification, as well as any new method or process step or combination disclosed herein.

Claims

1. A deep learning method for rapid screening of thyroid nodules, characterized in that it includes the following steps:

S1. Obtain the original ultrasound image of the thyroid region, and perform adaptive local contrast enhancement on the original ultrasound image to obtain a contrast-enhanced image.

S2. Input the contrast-enhanced image into a segmentation network based on an encoder-decoder structure to obtain a segmentation probability map representing the thyroid nodule region; the segmentation network is optimized during training with a total loss function that includes a segmentation loss, a shape regularization loss, and an edge constraint loss, so as to constrain the predicted mask to be close to the real contour in terms of overall shape and edge continuity.

S3. Threshold the segmentation probability map to obtain a nodule binary mask, and determine the region of interest of the nodule on the original ultrasound image. Based on the region of interest of the nodule and the nodule binary mask, calculate a variety of structural parameters including aspect ratio, nodule area, boundary irregularity, internal high echo density, and halo thickness ratio, and construct a structural feature vector.

S4. After normalizing the region of interest of the nodule and resizing it to a uniform size, input it into at least two convolutional neural network branches. Input the structural feature vector into the structural feature branch. Each branch outputs an intermediate classification result or a deep feature representation.

S5. Input the multiple intermediate classification results or deep feature representations into the meta-learning classifier for ensemble learning, and output the benign or malignant prediction probability and the corresponding risk level of the thyroid nodule.

2. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that the adaptive local contrast enhancement in step S1 includes: dividing the original ultrasound image into multiple depth band regions along the row direction; setting multiple local windows for each depth band region, calculating the gray-level variance for each local window, and using the gray-level variance to characterize the local texture complexity; selecting the local window size and contrast enhancement parameters according to the texture complexity of each depth band region and its corresponding depth, and performing local histogram equalization or contrast-limited adaptive histogram equalization on each depth band region to obtain the contrast-enhanced image.

3. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that the encoding end of the segmentation network includes multi-scale input branches, with different branches receiving contrast-enhanced images at different scaling ratios; the decoding end of the segmentation network has side output layers at different scales, with each side output layer outputting a segmentation probability map at the corresponding scale and each taking part in the calculation of the total loss function.

4. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that the shape regularization loss is based on a signed distance transformation and includes: calculating a signed distance transformation map based on the labeled nodule contours, where the distance value of the foreground region is positive and the distance value of the background region is negative, and the absolute value of the distance represents the distance to the true contour; mapping the predicted mask to a preset interval, multiplying it pixel by pixel with the signed distance transformation map, and summing the squares of the products as the shape regularization loss.

5. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that the edge constraint loss is implemented based on a curvature penalty on the predicted mask contour, including: sampling the contour of the predicted mask and calculating the local curvature or curvature change at each sampling point; applying a penalty term to the pixels corresponding to the boundary segments where the local curvature change exceeds a preset threshold, as the edge constraint loss, to suppress jaggedness or unreasonable protrusion of the contour.

6. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that the calculation methods for the structural parameters in the structural feature vector include: the ratio of the major axis to the minor axis of the minimum bounding rectangle of the nodule; the boundary irregularity, calculated from the difference between the area of the nodule binary mask and the area of the ellipse fitted to the contour; the internal high echo density, calculated from the number and area ratio of small high-gray-level regions obtained by gray-level threshold segmentation and connected component analysis within the nodule mask; and the ratio of halo thickness to nodule diameter, where the halo width is estimated by gray-level statistics over a preset range outside the nodule boundary taken as the halo region.

7. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that the multi-branch network in step S4 includes: a first convolutional neural network branch, which takes the grayscale image of the region of interest of the nodule as input and extracts deep features of the overall shape and internal texture of the nodule; a second convolutional neural network branch, which takes the combined image of the nodule outline superimposed on the grayscale image as input and extracts deep features related to the boundary shape; and a structural feature branch, which takes the structural feature vector as input and outputs the structural feature classification vector through a multi-layer fully connected network.

8. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that the meta-learning classifier is a multilayer perceptron model or a gradient boosting tree model, and the multiple intermediate classification results or deep feature representations are concatenated or fused by weighting before being input into the meta-learning classifier.

9. The deep learning method for rapid screening of thyroid nodules according to claim 1, characterized in that, based on the predicted probability of benignity or malignancy, the thyroid nodules are classified into at least three risk levels: low risk, medium risk, and high risk, and a corresponding follow-up period and further examination recommendations are output for each level.