A dual-model-driven automatic nanotube segmentation and parameter measurement method and system
By combining a dual-model architecture and an anti-adhesion strategy, the problems of manual dependence and multi-tube adhesion in carbon nanotube image segmentation are solved, enabling efficient and accurate automatic nanotube segmentation and physical parameter measurement, thus improving analysis efficiency and accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- UNIV OF SCI & TECH BEIJING
- Filing Date
- 2026-04-14
- Publication Date
- 2026-06-19
AI Technical Summary
Existing carbon nanotube image segmentation methods suffer from problems such as strong reliance on human subjectivity, multi-tube adhesion, difficulty in handling complex background noise and dense areas, and lack of automatic scale recognition and physical parameter measurement, resulting in low analysis efficiency and inaccurate results.
A dual-model architecture is adopted, combining a scale recognition model and a deep neural network segmentation model. The dataset is expanded through synchronous geometric transformation, and anti-adhesion strategies and cross-shaped structural element corrosion are applied to optimize the supervision signal, thereby realizing automatic segmentation and physical parameter measurement of nanotubes.
It improves the objectivity and efficiency of image segmentation, significantly reduces labor and time costs, and enhances segmentation accuracy and physical parameter measurement accuracy under complex structures.
Figure CN122244449A_ABST
Description
Technical Field
[0001] This application relates to the field of micro-nano scale image segmentation technology, and in particular to a dual-model driven automatic nanotube segmentation and parameter measurement method and system.
Background Technology
[0002] Carbon nanotubes (CNTs) have been extensively studied in several cutting-edge fields due to their superior physicochemical properties. However, in existing simulations, the calculated thermal conductivity of carbon nanotube networks often falls short of theoretical expectations due to insufficient model accuracy. To construct more accurate simulation models, it is necessary to extract their true morphology and structural features from microscopic images.
[0003] Carbon nanotubes exhibit complex entanglement, stacking, and adhesion morphologies at the microscopic level. Traditional manual segmentation processes are cumbersome and time-consuming, severely limiting analytical efficiency and the objectivity of results. Furthermore, existing methods for selecting analytical regions and determining boundaries largely rely on researchers' subjective visual judgment, which is prone to introducing human error.
[0004] To overcome these limitations, researchers have been exploring methods to intelligently identify and segment carbon nanotube networks using computers. These methods aim to automatically identify and analyze the morphological features of images through software algorithms, thereby replacing tedious manual annotation processes and improving analysis efficiency. Existing image analysis methods, including traditional threshold segmentation, feature enhancement methods based on tubular filters, and conventional deep learning methods, have made some progress in initial image recognition. However, they still face challenges such as difficulty in handling structural breaks caused by complex background noise, inability to handle tube adhesion in dense areas, and a lack of automatic scale recognition and direct measurement and derivation of physical parameters.
Summary of the Invention
[0005] This application provides a dual-model driven automatic nanotube segmentation and parameter measurement method and system, which can eliminate the dependence on human subjective perspective, solve the problem of multi-tube adhesion, and directly output physical parameters to support the construction of model networks. By adopting a dual-model architecture and anti-adhesion strategy, it aims to improve the objectivity and analysis efficiency of microscale image segmentation, while significantly reducing the manual and time costs of data processing.
[0006] In a first aspect, this application provides a dual-model-driven automatic nanotube segmentation and parameter measurement method, the method comprising:
S1. Obtain nanotube TEM images and the corresponding manually labeled masks, and perform synchronous geometric transformations on the nanotube TEM images and manually labeled masks to expand the dataset;
S2. Construct a dual-model architecture that includes a scale recognition model and a deep neural network segmentation model;
S3. Train the deep neural network segmentation model using the expanded dataset, and optimize the supervision signal by applying an anti-adhesion processing strategy to the manually labeled masks;
S4. Input the TEM image of the nanotube to be tested into the dual-model architecture for prediction, obtain the spatial resolution coefficient, binary mask and skeleton features, and fuse the spatial resolution coefficient, binary mask and skeleton features to calculate the geometric morphology parameters of the nanotube.
[0007] In conjunction with the first aspect, in the first implementation of the first aspect of this application, the synchronous geometric transformation includes: rotating each pair of nanotube TEM image and corresponding manually labeled mask by 90 degrees, 180 degrees, and 270 degrees, respectively, and flipping the pair horizontally and vertically, while keeping the pixel correspondence between the nanotube TEM image and the manually labeled mask unchanged during the transformation, so as to generate new sample pairs that expand the training dataset.
[0008] In conjunction with the first aspect, in the second implementation of the first aspect of this application, constructing the dual-model architecture includes: setting a grayscale threshold based on the grayscale features of the nanotube TEM images to binarize the images and filter out candidate connected regions; performing geometric filtering on the candidate connected regions to locate the scale line segment and calculate its pixel length, while using optical character recognition to identify the digital text near the scale line segment and obtain the physical label value; calculating the physical size of a single pixel from the physical label value and the pixel length to obtain the spatial resolution coefficient; and constructing a deep neural network segmentation model based on multi-scale feature extraction and reconstruction. The deep neural network segmentation model includes a feature encoding module and a feature decoding module. Skip connections are set between the encoder and decoder at the same level of the model to concatenate shallow texture features with deep features, thereby enhancing the ability to capture the edge morphology of nanotubes.
[0009] In conjunction with the first aspect, in the third implementation of the first aspect of this application, training the deep neural network segmentation model using the expanded dataset includes: reading pairs of nanotube TEM images and corresponding manually labeled masks in batches from the expanded dataset; processing the nanotube TEM images with CLAHE and then resizing them to a uniform image size for use as model input data; applying a cross-shaped structural element erosion strategy to the manually labeled masks to generate a binary training target with tube wall separation characteristics; feeding the model input data into the deep neural network segmentation model, which outputs a pixel-level prediction probability map after encoder feature extraction and decoder feature fusion; calculating the mixed loss value from the predicted probability map and the binary training target; and calculating the gradient from the mixed loss value and updating the network parameters of the deep neural network segmentation model with the backpropagation algorithm until the model converges.
[0010] In conjunction with the first aspect, in the fourth implementation of the first aspect of this application, the cross-shaped structural element erosion strategy is as follows: peel off the top, bottom, left and right neighboring edge pixels of the target pixel, and retain the connectivity of the diagonal neighboring pixels, so as to maintain the topological integrity of the single nanotube while separating the adhesion region.
[0011] In conjunction with the first aspect, in the fifth implementation of the first aspect of this application, the hybrid loss function for calculating the hybrid loss value is composed of a weighted sum of binary cross-entropy loss and Dice coefficient loss, and is used to jointly constrain pixel classification accuracy and overall morphological overlap.
[0012] In conjunction with the first aspect, in the sixth implementation of the first aspect of this application, in step S4, inputting the TEM image of the nanotube to be tested into the dual-model architecture for prediction to obtain the spatial resolution coefficient, binary mask, and skeleton features includes: The TEM image of the nanotube to be tested is simultaneously input into the scale recognition model and the deep neural network segmentation model. The scale recognition model outputs the spatial resolution coefficient of the image, and the deep neural network segmentation model outputs the predicted probability map of the image to be tested. The predicted probability map of the image to be tested is thresholded to generate an initial binary mask image. Connectivity analysis is performed on the initial binary mask image to identify and fill closed background holes with an area smaller than a preset area threshold, thereby generating a corrected binary mask image. The corrected binary mask image is subjected to morphological thinning, and the central skeleton line with a single-pixel width is extracted and defined as the skeleton feature.
[0013] In conjunction with the first aspect, in the seventh implementation of the first aspect of this application, fusing the spatial resolution coefficient, binary mask, and skeleton features to calculate the geometric morphology parameters of the nanotube includes: Perform Euclidean distance transformation on the corrected binary mask image to generate a distance map. Extract the distance-map value at each coordinate of the central skeleton line as the radius, and combine it with the spatial resolution coefficient to calculate the physical diameter distribution of the nanotube. Traverse the extracted central skeleton line and count the pixels constituting the skeleton; denote the distance between two horizontally or vertically adjacent pixels as 1, and the distance between two diagonally adjacent pixels as √2. The total pixel distance is then multiplied by the spatial resolution coefficient to obtain the actual physical length of the nanotube. The discrete pixels corresponding to the central skeleton line are smoothed and fitted to generate a continuous smooth curve. Based on the continuous smooth curve, the curvature of the nanotube is calculated using differential geometry.
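As a non-limiting illustration, the length and diameter calculations of this implementation can be sketched in Python using only the standard library. The sketch assumes the skeleton is already available as an ordered list of 8-adjacent pixel coordinates, uses the conventional √2 weight for diagonal steps, and replaces a full distance transform with a brute-force search that is only practical for small masks. The names `skeleton_length_px`, `radius_px`, the 7×10 test mask, and the value K = 0.5 are hypothetical.

```python
import math

def skeleton_length_px(path):
    """Sum step lengths along an ordered, 8-connected pixel path."""
    total = 0.0
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        dr, dc = abs(r1 - r0), abs(c1 - c0)
        if max(dr, dc) != 1:
            raise ValueError("consecutive skeleton pixels must be 8-adjacent")
        # Diagonal steps count as sqrt(2), axis-aligned steps as 1.
        total += math.sqrt(2.0) if (dr == 1 and dc == 1) else 1.0
    return total

def radius_px(mask, r, c):
    """Euclidean distance from (r, c) to the nearest background pixel (brute force)."""
    best = math.inf
    for rr, row in enumerate(mask):
        for cc, value in enumerate(row):
            if value == 0:
                best = min(best, math.hypot(rr - r, cc - c))
    return best

# Hypothetical mask: a horizontal tube three pixels thick occupying rows 2..4.
mask = [[1 if 2 <= r <= 4 else 0 for _ in range(10)] for r in range(7)]
skeleton = [(3, c) for c in range(1, 9)]  # ordered centerline pixels
K = 0.5                                   # assumed physical size per pixel (e.g. nm)

length_nm = skeleton_length_px(skeleton) * K
diameters_nm = [2.0 * radius_px(mask, r, c) * K for r, c in skeleton]
```

With this distance convention, the centerline of a three-pixel-thick tube has a radius of 2 pixels, so whether a one-pixel correction should be applied to the diameter is a calibration choice that the text does not specify.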
[0014] In conjunction with the first aspect, in the eighth implementation of the first aspect of this application, the smooth fitting process is as follows: the extracted single-pixel skeleton is parameterized, and the skeleton coordinates are smoothly fitted using local quadratic polynomial regression; during the fitting process, the endpoint neighborhood compensation is performed using mirror extension technology to avoid fitting divergence and calculation distortion caused by the lack of pixels outside the endpoints.
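As a non-limiting illustration, the smoothing and curvature calculation can be sketched as follows. The sketch makes several assumptions that the text does not fix: the skeleton is parameterized by point index, the local window is centered with half-width `h`, "mirror extension" is implemented as point reflection about each endpoint, and curvature is evaluated with the standard planar formula κ = |x′y″ − y′x″| / (x′² + y′²)^(3/2). All function names are hypothetical, and the result is in reciprocal pixels (divide by the spatial resolution coefficient to obtain a physical curvature).

```python
import math

def mirror_extend(points, h):
    """Extend both ends by h points using point reflection about each endpoint."""
    if len(points) <= h:
        raise ValueError("need more than h points")
    (r0, c0), (rn, cn) = points[0], points[-1]
    head = [(2 * r0 - r, 2 * c0 - c) for r, c in points[h:0:-1]]
    tail = [(2 * rn - r, 2 * cn - c) for r, c in points[-2:-h - 2:-1]]
    return head + list(points) + tail

def local_quadratic(values, h):
    """Least-squares quadratic fit on each centered window t = -h..h.
    Returns (first derivative, second derivative) at the window center."""
    ts = range(-h, h + 1)
    n = 2 * h + 1
    s2 = sum(t * t for t in ts)
    s4 = sum(t ** 4 for t in ts)
    out = []
    for i in range(h, len(values) - h):
        win = values[i - h:i + h + 1]
        sx = sum(win)
        stx = sum(t * v for t, v in zip(ts, win))
        sttx = sum(t * t * v for t, v in zip(ts, win))
        a1 = stx / s2                                   # linear coefficient
        a2 = (sttx - s2 * sx / n) / (s4 - s2 * s2 / n)  # quadratic coefficient
        out.append((a1, 2.0 * a2))
    return out

def curvature(points, h=5):
    """Curvature at every input point, in 1/pixel."""
    ext = mirror_extend(points, h)
    d_r = local_quadratic([p[0] for p in ext], h)
    d_c = local_quadratic([p[1] for p in ext], h)
    kappa = []
    for (r1, r2), (c1, c2) in zip(d_r, d_c):
        speed = math.hypot(r1, c1)
        kappa.append(abs(c1 * r2 - r1 * c2) / speed ** 3 if speed > 0 else 0.0)
    return kappa

# Ideal test curve: an arc of radius 50 sampled about one pixel apart.
arc = [(50 * math.sin(0.02 * i), 50 * math.cos(0.02 * i)) for i in range(100)]
kappa = curvature(arc)
```

Away from the endpoints, the values for this arc should be close to 1/50. Point reflection keeps a straight segment straight, but it introduces an inflection at the ends of a curved segment, so endpoint values are less reliable than interior ones.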
[0015] In a second aspect, this application provides a dual-model driven automatic nanotube segmentation and parameter measurement system, the system comprising: a data acquisition and augmentation module, used to acquire nanotube TEM images and corresponding manually labeled masks, and to perform synchronous geometric transformations on the nanotube TEM images and manually labeled masks to augment the dataset; a model building module, used to construct a dual-model architecture that includes a scale recognition model and a deep neural network segmentation model; a model training module, used to train the deep neural network segmentation model using the expanded dataset and to optimize the supervision signal by applying an anti-adhesion processing strategy to the manually labeled masks; and a model testing module, used to input the TEM image of the nanotube under test into the dual-model architecture for prediction, obtain the spatial resolution coefficient, binary mask and skeleton features, and fuse the spatial resolution coefficient, binary mask and skeleton features to calculate the geometric morphology parameters of the nanotube.
[0016] The beneficial effects of this invention are as follows: This invention proposes a dual-model driven method, system, and electronic device for automatic nanotube segmentation and parameter measurement. By employing a dual-model architecture that combines a scale recognition model with a deep neural network segmentation model, it achieves efficient segmentation and measurement of physical geometric parameters from raw TEM images, improving image segmentation efficiency and accuracy. A morphological anti-adhesion strategy is introduced: during model training, cross-shaped structural element erosion is used to construct a supervisory signal with separation characteristics, which forces the model to learn the minute boundaries between tubes. This significantly improves segmentation accuracy in dense and intersecting regions and overcomes the shortcomings of conventional deep learning models in handling complex adhered structures. Furthermore, this invention addresses the current scarcity of high-quality samples through a data augmentation method based on synchronous geometric transformation.
Attached Figure Description
[0017] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0018] Figure 1 is a flowchart of a dual-model driven automatic nanotube segmentation and parameter measurement method according to this application; Figure 2 is a schematic diagram of the framework structure of this application; Figure 3 is a schematic diagram of the structure of a dual-model driven automatic nanotube segmentation and parameter measurement system according to this application.
Detailed Implementation
[0019] The terms “first,” “second,” “third,” “fourth,” etc. (if present) in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments described herein can be implemented in a sequence other than that illustrated or described herein. Furthermore, the terms “comprising” or “having,” and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.
[0020] For ease of understanding, the specific process of the embodiments of this application is described below. Figure 1 shows a flowchart of the dual-model-driven automatic nanotube segmentation and parameter measurement method provided by the present invention. The method specifically includes the following steps:
S1. Obtain nanotube TEM images and corresponding manually labeled masks, and perform synchronous geometric transformations on the nanotube TEM images and manually labeled masks to expand the dataset.
[0021] In one specific embodiment, the synchronous geometric transformation includes: each pair of nanotube TEM image and corresponding manually labeled mask is simultaneously rotated by 90 degrees, 180 degrees, and 270 degrees, and flipped horizontally and vertically, while the pixel correspondence between the nanotube TEM image and the manually labeled mask is maintained during the transformation, in order to generate new sample pairs that expand the training dataset.
[0022] Specifically, raw images of nanotube samples are acquired using a transmission electron microscope (TEM), along with manually labeled masks that mark the pixel positions of the nanotubes within the images. Each raw TEM image and its corresponding manually labeled mask are loaded into the data processing environment as a pair, ensuring a one-to-one correspondence between the pixel positions of the image and the mask and consistent spatial coordinates. Each pair is then rotated by 90 degrees, 180 degrees, and 270 degrees in turn. During rotation, the image matrix and the mask matrix are transformed simultaneously so that the rotated mask still accurately identifies the positions of the nanotubes. Furthermore, bilinear interpolation is used during rotation to keep the pixel grayscale information undistorted, while nearest-neighbor interpolation is used at the mask edges, which preserves the binary nature of the mask pixels and avoids the grayscale blurring caused by interpolation. Next, a horizontal flip is performed on each image-mask pair by reversing the pixel order of each row; the same operation is applied to the mask so that the nanotube pixels in the mask still correctly cover the nanotube region. A vertical flip is then performed on each image-mask pair by reversing the pixel order of each column, with the mask processed simultaneously so that the nanotube positions corresponding to the mask pixels are not offset. After each rotation or flip, a new image-mask sample pair is generated and stored in the expanded training dataset. The dataset structure records the original image number, transformation type, and spatial coordinate information for each sample pair, so that the corresponding mask can be read accurately and the training labels remain consistent during subsequent training.
Simultaneously, the mask edge pixels are checked during the transformation process. If pixel breaks or missing edges are found, morphological dilation or erosion operations are used to repair them, ensuring the skeleton and edge features are preserved intact. Furthermore, the image grayscale is normalized or enhanced to maintain consistency in brightness and contrast between the rotated and flipped images and the original images, thus ensuring that the deep neural network can correctly learn the nanotube morphological features during training.
[0023] The synchronous geometric transformation method can significantly expand the amount of training data, improve the model's ability to recognize nanotubes in different directions and poses, solve the problem of poor generalization ability caused by insufficient original data, and ensure that the mask label is precisely aligned with the nanotube pixels after all geometric transformations. This prevents errors caused by mask offset during training, enabling deep neural networks to accurately segment nanotube edges and shapes and achieve accurate geometric parameter measurement.
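As a non-limiting illustration, the synchronous transformation of step S1 can be sketched with plain nested lists. Right-angle rotations and flips are pure index permutations, so this sketch involves no interpolation; the function names and the tiny 2×3 sample are hypothetical.

```python
def rot90(grid):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def flip_h(grid):
    """Horizontal flip: reverse the pixel order of each row."""
    return [row[::-1] for row in grid]

def flip_v(grid):
    """Vertical flip: reverse the pixel order of each column."""
    return [list(row) for row in grid[::-1]]

def augment_pair(image, mask):
    """Return (name, image, mask) triples: the original plus five transforms."""
    transforms = {
        "rot90": rot90,
        "rot180": lambda g: rot90(rot90(g)),
        "rot270": lambda g: rot90(rot90(rot90(g))),
        "flip_h": flip_h,
        "flip_v": flip_v,
    }
    samples = [("original", image, mask)]
    for name, fn in transforms.items():
        # The same function is applied to both arrays, so pixel (r, c) of the
        # transformed mask still labels pixel (r, c) of the transformed image.
        samples.append((name, fn(image), fn(mask)))
    return samples

# Hypothetical sample: bright pixels (>= 200) are the labeled tube.
image = [[10, 200, 30],
         [40, 220, 60]]
mask = [[0, 1, 0],
        [0, 1, 0]]
samples = augment_pair(image, mask)
```

Each original pair yields six samples here. Recording the transform name alongside each sample corresponds to the "transformation type" field that the dataset structure is described as storing.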
[0024] S2. Construct a dual-model architecture that includes a scale recognition model and a deep neural network segmentation model.
[0025] In one specific embodiment, constructing a dual-model architecture includes: Based on the grayscale features of nanotube TEM images, a grayscale threshold is set to perform binarization segmentation of the images and filter out candidate connected regions. Geometric filtering is performed on candidate connected regions to locate the position of scale line segments and calculate their pixel length. At the same time, optical character recognition technology is used to identify digital text near scale line segments to obtain physical label values. The spatial resolution coefficient is obtained by calculating the physical size of a single pixel based on the physical label value and the pixel length. A deep neural network segmentation model based on multi-scale feature extraction and reconstruction is constructed. The deep neural network segmentation model includes a feature encoding module and a feature decoding module. Skip connections are set between encoders and decoders at the same level in a deep neural network segmentation model to stitch shallow texture features into deep features, thereby enhancing the ability to capture the edge morphology of nanotubes.
[0026] Specifically, each image is read from the expanded nanotube TEM image dataset. Gray-level features are used to analyze the images. A preset gray-level threshold T (preferably less than 15) is set. Pixels with gray-level values below the threshold T are marked as foreground, and pixels with gray-level values above the threshold T are marked as background, thus forming a binary image. The foreground consists of candidate nanotubes or scale line segments, and the background consists of non-target region pixels. Connected component analysis is performed using this binary image to identify the pixel set of each connected region. The geometric properties of each region are calculated, including the aspect ratio R, area, and orientation angle of the bounding rectangle, as the basis for geometric shape filtering. Candidate scale line segment regions are filtered according to the preset aspect ratio condition R > 5.0 to exclude non-elongated regions, thereby obtaining candidate connected regions that may be scale line segments. In the selected candidate scale area, its pixel length L is measured, and the digital text near the scale line segment is identified by combining optical character recognition (OCR) technology to obtain the actual physical label value V. Then, the physical size corresponding to a single pixel is calculated according to the formula K=V / L, where K is the physical size corresponding to a single pixel, which is also the spatial resolution coefficient of the image. It is used to convert the pixel unit into the actual physical size, thereby completing the marking of spatial information in the image and ensuring the accuracy of subsequent nanotube geometric parameter measurements.
[0027] Preferably, the ROI of the digital region near the scale line segment is cropped, and the cropped region is subjected to grayscale inversion, binarization (threshold T=120) and dilation operation (2×2 square structuring element, 1 iteration) to enhance the edge features of the digital text and reduce OCR recognition error.
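As a non-limiting illustration, the geometric filtering and the calculation K = V / L can be sketched as follows. The sketch assumes that connected-component analysis has already produced bounding boxes in (x, y, width, height) form and that OCR has already returned the label value; choosing the longest elongated region as the scale bar is an additional assumption, and all names and numbers are hypothetical.

```python
def aspect_ratio(box):
    """Long side divided by short side of a bounding box (x, y, width, height)."""
    _, _, w, h = box
    return max(w, h) / max(1, min(w, h))

def find_scale_bar(boxes, min_ratio=5.0):
    """Keep elongated regions (R > min_ratio) and return the longest, or None."""
    candidates = [b for b in boxes if aspect_ratio(b) > min_ratio]
    if not candidates:
        return None
    return max(candidates, key=lambda b: max(b[2], b[3]))

def resolution_coefficient(label_value, pixel_length):
    """K = V / L: physical size represented by a single pixel."""
    if pixel_length <= 0:
        raise ValueError("scale bar pixel length must be positive")
    return label_value / pixel_length

# Hypothetical bounding boxes of dark connected regions.
boxes = [(12, 480, 200, 6), (300, 40, 25, 22), (50, 470, 18, 9)]
bar = find_scale_bar(boxes)
V = 100.0                    # hypothetical OCR result, e.g. "100 nm"
L = max(bar[2], bar[3])      # pixel length of the scale line segment
K = resolution_coefficient(V, L)
```

In this example only the 200×6 region passes the R > 5.0 condition, giving K = 0.5 physical units per pixel.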
[0028] A deep neural network segmentation model is constructed, employing a multi-scale feature extraction and reconstruction method based on the U-Net structure. The model includes an encoder and a decoder. The encoder progressively extracts feature representations from the image, while the decoder restores feature resolution through upsampling. Skip connections are established between the encoder and decoder at the same level to concatenate texture features from the shallow encoder with those from the deeper layers, enhancing the ability to capture the edges and subtle morphologies of nanotubes, thereby improving the model's segmentation accuracy for complex nanotube shapes. The output of the dual-model architecture after construction includes spatial resolution coefficients from the scale recognition model and pixel-level predicted probability maps from the deep neural network segmentation model.
[0029] The implementation process requires ensuring that the scale recognition and segmentation steps do not interfere with each other, while the output data can be combined. The spatial resolution coefficient provides a physical unit reference, and the mask provides the nanotube morphological outline. Simultaneously, the skip connection technique addresses the problem of low-level texture information being lost in high-level features in deep networks, enabling the model to consider both global structure and local details. This dual-model architecture not only solves the separation problem of scale recognition and pixel-level segmentation in nanotube TEM images but also overcomes the technical problems of low segmentation accuracy and inaccurate handling of blurred nanotube edges and adhered regions in traditional single-model approaches. This architecture enables high-precision automatic segmentation and accurate spatial scale measurement of nanotubes.
[0030] S3. Train a deep neural network segmentation model using the expanded dataset, and optimize the supervision signal by applying an anti-adhesion processing strategy to the manually labeled mask.
[0031] In one specific embodiment, training the deep neural network segmentation model using the expanded dataset includes: reading pairs of nanotube TEM images and corresponding manually labeled masks in batches from the expanded dataset; processing the nanotube TEM images with CLAHE and then resizing them to a uniform image size for use as model input data; applying a cross-shaped structural element erosion strategy to the manually labeled masks to generate a binary training target with tube wall separation characteristics; feeding the model input data into the deep neural network segmentation model, which outputs a pixel-level prediction probability map after encoder feature extraction and decoder feature fusion; calculating the mixed loss value from the predicted probability map and the binary training target; and calculating the gradient from the mixed loss value and updating the network parameters of the deep neural network segmentation model with the backpropagation algorithm until the model converges.
[0032] In a preferred embodiment, the cross-shaped structural element erosion strategy is as follows: peel off the top, bottom, left and right neighboring edge pixels of the target pixel, while retaining the connectivity of the diagonal neighboring pixels, so as to maintain the topological integrity of the single nanotube while separating the adhesion area.
[0033] In a preferred embodiment, the hybrid loss function for calculating the hybrid loss value is composed of a weighted sum of binary cross-entropy loss and Dice coefficient loss, and is used to jointly constrain pixel classification accuracy and overall morphological overlap.
[0034] Specifically, pairs of nanotube TEM images and corresponding manually labeled masks are read in batches from the expanded dataset, which forms the basis of the data input for model training.
[0035] The TEM image that has been read is processed using CLAHE (Contrast-Limited Adaptive Histogram Equalization). This process divides the image into several small blocks, performs histogram equalization on each block, and limits the contrast amplification, enhancing the grayscale difference between the tube wall edge and the background. This addresses the blurring of the tube wall edge caused by uneven illumination or insufficient contrast in the original image and yields a contrast-enhanced image. The enhanced image is then resized to 256×256 pixels to standardize the input size and meet the fixed input dimension required by the deep neural network segmentation model. The output is the model input data X of uniform size.
[0036] In the supervisory signal construction stage, a cross-shaped structural element erosion strategy is applied to the read manually labeled mask. Due to the extremely high aspect ratio and random network interweaving of nanotubes, to avoid skeleton breakage caused by erosion, a cross-shaped structural element method is used to peel off the top, bottom, left, and right edge pixels of the target pixel while retaining the connectivity of diagonal neighbor pixels. This achieves the separation of adhered regions while maintaining the topological integrity of individual nanotubes. Specifically, a 3×3 cross-shaped structural element is constructed, with its center and four adjacent pixels set to 1, and its four diagonal pixels set to 0. This structural element is used to traverse the mask image; only when the center pixel (representing a nanotube) in the area covered by the structural element is foreground and its four adjacent pixels are also foreground, is the center pixel retained as foreground; otherwise, it is set as background. This processing artificially breaks the connections between tube walls in densely adhered regions while maintaining the topological skeleton continuity of individual nanotubes. This solves the problem of the lack of separation features in adhered regions in manually labeled masks, which makes it difficult for the model to learn boundaries. The processed output is a binary training target Y with tube wall separation features.
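As a non-limiting illustration, erosion with the 3×3 cross-shaped structural element can be sketched as follows. The sketch treats pixels outside the image as background, which is an assumption; the function name and the test mask (two bars three pixels wide joined by a one-pixel bridge) are hypothetical.

```python
def cross_erode(mask):
    """Erode a binary mask with a 3x3 cross: a pixel stays foreground only if it
    and its four axis-aligned neighbors are all foreground."""
    h, w = len(mask), len(mask[0])
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # diagonals are ignored
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            keep = 1
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                # Assumption: anything outside the image counts as background.
                if not (0 <= rr < h and 0 <= cc < w) or mask[rr][cc] == 0:
                    keep = 0
                    break
            out[r][c] = keep
    return out

# Two vertical bars (columns 1..3 and 5..7) adhered through a bridge at (3, 4).
mask = [[1 if c in (1, 2, 3, 5, 6, 7) else 0 for c in range(9)] for _ in range(7)]
mask[3][4] = 1
target = cross_erode(mask)
```

After erosion the bridge column is empty, so the two bars are no longer connected, while the core of each bar survives. A single pass removes any structure narrower than three pixels along either axis, so the sketch presumes labeled tube walls wider than that.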
[0037] The model input data X is fed into a deep neural network segmentation model. This model is based on the U-Net architecture. It extracts deep semantic features by progressively downsampling through the encoder module and then upsampling through the decoder module to restore spatial resolution. Skip connections are set between the encoder and decoder at the same level to stitch shallow edge texture features into deep features. Finally, a pixel-level predicted probability map P is output, where each pixel value represents the probability that the position belongs to a nanotube.
[0038] The mixed loss value is calculated based on the predicted probability map P and the binarized training target Y. The mixed loss function is composed of a weighted sum of binary cross-entropy loss (BCE Loss) and Dice coefficient loss (Dice Loss). BCE Loss calculates the cross-entropy between the predicted probability and the ground truth label for each pixel and is used to optimize pixel-level classification accuracy; Dice Loss calculates the similarity between the predicted region and the ground truth region and is used to optimize the overall morphological overlap. The weighted sum of the two loss values jointly constrains the model training, addressing the problems of imbalanced positive and negative samples and inaccurate boundary segmentation that arise when only a single loss is used, and outputs the mixed loss value $L_{total}$.
[0039] Preferably, the mathematical expression for the hybrid loss function is: $L_{total} = \lambda \times L_{Dice} + (1 - \lambda) \times L_{BCE}$, where $L_{total}$ represents the total loss value; $\lambda$ represents the balancing weight coefficient, whose value lies in the range [0, 1] and is preferably 0.5; $L_{BCE}$ represents the binary cross-entropy loss, used to optimize pixel-level classification accuracy; and $L_{Dice}$ represents the Dice coefficient loss, used to optimize regional overlap.
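[0040] $L_{BCE}$ takes the standard form $L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right]$, and $L_{Dice}$ takes the standard form $L_{Dice} = 1 - \frac{2\sum_{i=1}^{N} y_i p_i + \epsilon}{\sum_{i=1}^{N} y_i + \sum_{i=1}^{N} p_i + \epsilon}$, where $N$ represents the total number of pixels in the image, $y_i$ represents the true label of the i-th pixel, $p_i$ represents the probability with which the model predicts the i-th pixel to be a carbon nanotube, and $\epsilon$ is a smoothing term that prevents a zero denominator, taken as $1 \times 10^{-5}$.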
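As a non-limiting illustration, the hybrid loss can be sketched on flattened pixel lists. The sketch assumes the standard forms of the binary cross-entropy and Dice losses with a smoothing term of 1e-5, uses the preferred weight λ = 0.5, and additionally clamps probabilities before taking logarithms, which is an implementation detail not stated in the text. All names and sample values are hypothetical.

```python
import math

def bce_loss(y, p, clamp=1e-12):
    """Mean binary cross-entropy over all pixels."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, clamp), 1.0 - clamp)  # assumption: avoid log(0)
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -total / len(y)

def dice_loss(y, p, smooth=1e-5):
    """One minus the smoothed Dice coefficient."""
    inter = sum(yi * pi for yi, pi in zip(y, p))
    return 1.0 - (2.0 * inter + smooth) / (sum(y) + sum(p) + smooth)

def hybrid_loss(y, p, lam=0.5):
    """L_total = lam * L_Dice + (1 - lam) * L_BCE."""
    return lam * dice_loss(y, p) + (1.0 - lam) * bce_loss(y, p)

# Hypothetical flattened target and two candidate predictions.
y = [1, 1, 0, 0]
good = [0.9, 0.8, 0.1, 0.2]
bad = [0.2, 0.1, 0.9, 0.8]
loss_good = hybrid_loss(y, good)
loss_bad = hybrid_loss(y, bad)
```

The prediction that agrees with the target yields the smaller loss. Setting `lam` to 1 or 0 reduces the function to the Dice or BCE term alone, which is a convenient way to inspect each contribution.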
[0041] Finally, the gradient is calculated based on the mixed loss value $L_{total}$, and the network parameters of the deep neural network segmentation model are updated using the backpropagation algorithm. Through multiple iterations, the model gradually learns the mapping relationship between the input image and the training target with separation features, until the model converges on the validation set. The convergence criterion is: the mixed loss value on the validation set decreases by less than $1 \times 10^{-4}$ for 10 consecutive iterations, or the pixel classification accuracy is ≥95%. Training stops once either condition is met, to avoid overfitting.
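The stopping rule of paragraph [0041] might be checked as follows. This is only a sketch under one reading of the criterion (stop when either the validation loss has plateaued or the accuracy target is reached). The function name and its parameters are hypothetical, and the loss-decrease threshold is passed in as `tol` rather than fixed in the code.

```python
def should_stop(val_losses, pixel_accuracy, tol, patience=10, acc_target=0.95):
    """Return True when either assumed stopping condition holds.

    val_losses     -- validation loss recorded after each iteration
    pixel_accuracy -- current pixel classification accuracy in [0, 1]
    tol            -- minimum loss decrease that still counts as progress
    """
    if pixel_accuracy >= acc_target:
        return True
    if len(val_losses) <= patience:
        return False  # not enough history to judge a plateau
    recent = val_losses[-(patience + 1):]
    decreases = [prev - cur for prev, cur in zip(recent, recent[1:])]
    # Plateau: every one of the last `patience` decreases is below tol.
    return all(d < tol for d in decreases)
```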
[0042] S4. Input the TEM image of the nanotube to be tested into the dual-model architecture for prediction, obtain the spatial resolution coefficient, binary mask and skeleton features, and fuse the spatial resolution coefficient, binary mask and skeleton features to calculate the geometric morphology parameters of the nanotube.
[0043] In one specific embodiment, the TEM image of the nanotube under test is input into a dual-model architecture for prediction to obtain spatial resolution coefficients, a binary mask, and skeleton features, including: The TEM image of the nanotube to be tested is simultaneously input into the scale recognition model and the deep neural network segmentation model. The scale recognition model outputs the spatial resolution coefficient of the image, and the deep neural network segmentation model outputs the predicted probability map of the image to be tested. The predicted probability map of the image to be tested is thresholded to generate an initial binary mask map. Connectivity analysis is performed on the initial binary mask image to identify and fill closed background holes with an area smaller than a preset area threshold, thereby generating a corrected binary mask image. The modified binary mask image is subjected to morphological thinning, and the center skeleton line with a single pixel width is extracted and defined as the skeleton feature.
[0044] Specifically, the scale recognition model first uses a grayscale threshold to binarize the image, filtering out candidate regions with grayscale values close to pure black. Then, it locates the scale line segments by calculating the aspect ratio of the circumscribed rectangle and measures their pixel length L. At the same time, it reads the digital text near the scale using optical character recognition (OCR) technology to obtain the physical label value V. Finally, it calculates the spatial resolution coefficient K using the formula K=V / L. This coefficient is used to convert the pixel units into physical dimensions in the subsequent process.
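The geometric filtering and the K = V / L conversion in paragraph [0044] can be sketched as below. The sketch assumes that candidate dark regions have already been reduced to bounding rectangles and that OCR has already returned the label value; the function names and the `min_aspect` threshold are illustrative assumptions, not values from the application.

```python
def pick_scale_bar(boxes, min_aspect=10.0):
    """Pick the most bar-like candidate from (x, y, width, height) rectangles.

    A scale bar is long and thin, so candidates are filtered by aspect
    ratio (width / height) and the widest survivor is returned.
    """
    bars = [b for b in boxes if b[3] > 0 and b[2] / b[3] >= min_aspect]
    return max(bars, key=lambda b: b[2]) if bars else None

def spatial_resolution(label_value, bar_length_px):
    """K = V / L: physical size of one pixel (e.g. nm per pixel)."""
    if bar_length_px <= 0:
        raise ValueError("scale bar length must be positive")
    return label_value / bar_length_px

# Example: a 200-pixel bar labelled "100 nm" gives K = 0.5 nm per pixel.
candidates = [(10, 10, 20, 20), (300, 480, 200, 4)]
bar = pick_scale_bar(candidates)
k = spatial_resolution(100.0, bar[2])
```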
[0045] The deep neural network segmentation model outputs a predicted probability map of the image to be tested, where the value of each pixel represents the probability that the location belongs to a nanotube. A threshold (usually 0.5) is set for this predicted probability map, and pixels with probabilities greater than the threshold are set as foreground and pixels with probabilities less than the threshold are set as background, generating an initial binary mask map.
[0046] Small-area closed background holes are easily generated in the overlapping areas of nanotubes in the initial binary mask image. These holes are segmentation artifacts rather than real gaps in the nanotube network; if left untreated, they cause breakage or branching errors in subsequent skeleton extraction. Therefore, a targeted filling threshold is set based on the nanotube skeleton features and the wall thickness characteristics of the neighborhood of each closed background hole. This targeted filling threshold is determined from the typical area range of intersection artifact holes and the average wall thickness of the neighborhood: for each closed background hole, the average wall thickness of its neighborhood is extracted. If the hole area is less than the upper limit of the preset area range (e.g., 10 pixels) and the average wall thickness of the neighborhood is greater than the preset thickness threshold, the hole is determined to be an intersection artifact. Only the closed background holes determined to be intersection artifacts are filled, generating a corrected binary mask image. This processing accurately repairs intersection artifacts while preserving the real holes of the nanotubes themselves, avoiding the topological distortion caused by blind filling.
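The area part of the hole-filling rule in paragraph [0046] can be illustrated with a connected-component pass over the background. This is a simplified sketch: it applies only the area criterion and omits the neighborhood wall-thickness check, and it assumes 4-connectivity for background regions. The function name and the list-of-lists mask representation are hypothetical.

```python
from collections import deque

def fill_small_holes(mask, max_area):
    """Fill closed background holes whose area is below max_area.

    mask is a list of rows of 0/1 values. A hole is a 4-connected
    background component that does not touch the image border.
    """
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for r0 in range(h):
        for c0 in range(w):
            if mask[r0][c0] != 0 or seen[r0][c0]:
                continue
            component, touches_border = [], False
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:  # breadth-first search over one background region
                r, c = queue.popleft()
                component.append((r, c))
                if r in (0, h - 1) or c in (0, w - 1):
                    touches_border = True
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < h and 0 <= nc < w
                            and mask[nr][nc] == 0 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if not touches_border and len(component) < max_area:
                for r, c in component:
                    out[r][c] = 1
    return out
```

A fuller version would additionally measure the average wall thickness around each hole, for example from a distance map of the foreground, before deciding to fill it.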
[0047] A morphological thinning algorithm is applied to the corrected binary mask image. The thinning algorithm iteratively peels away pixels at the mask edges while preserving the connectivity and topological invariance of the tubular structure, ultimately shrinking the tube walls, which are wider than one pixel, to a center line with a width of only one pixel. This center line completely preserves the path of each nanotube and maintains the topological connectivity at intersections; this extracted feature is defined as the skeleton feature.
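The application does not name a specific thinning algorithm. As one possible instance of the iterative edge-peeling described in paragraph [0047], the sketch below uses the classical Zhang-Suen two-pass scheme; this choice, the function name, and the assumption that the mask has a one-pixel background border are illustrative only.

```python
def zhang_suen_thin(mask):
    """Thin a 0/1 mask (list of rows) toward a one-pixel-wide skeleton.

    Assumes the outermost rows and columns are background.
    """
    img = [row[:] for row in mask]
    h, w = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, h - 1):
                for c in range(1, w - 1):
                    if img[r][c] == 0:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)  # number of foreground neighbours
                    # a: number of 0 -> 1 transitions around the pixel
                    a = sum(1 for i in range(8) if p[i] == 0 and p[(i + 1) % 8] == 1)
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            for r, c in to_clear:  # delete simultaneously after each pass
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img
```

The condition a == 1 is what keeps the structure connected: a pixel is removed only if doing so cannot split its neighborhood into separate pieces.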
[0048] In one specific embodiment, the geometric morphology parameters of the nanotube are calculated by fusing the spatial resolution coefficient, the binary mask, and the skeleton features, including: Perform Euclidean distance transformation on the corrected binary mask image to generate a distance map. Extract the pixel value of the coordinates corresponding to the central skeleton line on the distance map as the radius, and combine it with the spatial resolution coefficient to calculate the physical diameter distribution of the nanotube. Traverse the extracted central skeleton lines and count the number of pixels constituting the skeleton; denote the distance between two horizontally or vertically adjacent pixels as 1, and the distance between two diagonally adjacent pixels as √2. The total pixel distance is then multiplied by the spatial resolution coefficient to obtain the actual physical length of the nanotube. The discrete pixels corresponding to the central skeleton line are smoothed and fitted to generate a continuous smooth curve. Based on the continuous smooth curve, the curvature of the nanotube is calculated according to geometric differential calculus.
[0049] Specifically, first, the physical diameter distribution of the nanotubes is calculated. A Euclidean distance transformation is performed on the corrected binary mask image, traversing each pixel in the mask image. For any pixel, the Euclidean distance between that pixel and the nearest background pixel is calculated, and the result is output as the pixel value. After traversing all pixels, a distance map is generated. In the distance map, the value corresponding to each pixel within the nanotube region represents the shortest distance from that point to the tube wall, i.e., the radius value of that point (in pixels). The pixel value corresponding to the coordinates of each pixel on the central skeleton line in the distance map is extracted, and this value is used as the radius at that skeleton point, resulting in a radius sequence in pixels. Each value in the radius sequence is multiplied by 2 to obtain the diameter (in pixels), and then multiplied by the spatial resolution coefficient K to convert the diameter unit from pixels to nanometers, outputting the physical diameter distribution of the nanotubes.
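The diameter calculation of paragraph [0049] can be sketched with a brute-force Euclidean distance transform. This is a minimal illustration intended for small masks; it assumes the mask contains at least one background pixel, and the function names and the nm-per-pixel unit of K are assumptions for the example.

```python
import math

def distance_map(mask):
    """For each foreground pixel, distance to the nearest background pixel."""
    h, w = len(mask), len(mask[0])
    background = [(r, c) for r in range(h) for c in range(w) if mask[r][c] == 0]
    dist = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                dist[r][c] = min(math.hypot(r - br, c - bc) for br, bc in background)
    return dist

def diameters(mask, skeleton_points, k):
    """Diameter at each skeleton point: 2 * radius (pixels) * K."""
    dist = distance_map(mask)
    return [2.0 * dist[r][c] * k for r, c in skeleton_points]

# A tube three pixels thick running across a 5 x 7 image.
tube = [[0] * 7] + [[1] * 7 for _ in range(3)] + [[0] * 7]
values = diameters(tube, [(2, 1), (2, 3)], k=0.5)
```

The brute-force search costs time proportional to (foreground pixels × background pixels); a production implementation would normally use a linear-time distance transform instead.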
[0050] Next, the actual physical length of the nanotube is calculated. The extracted central skeleton line is traversed, treating it as a path composed of discrete pixels. When calculating the total path length, the Euclidean distance between adjacent pixels is calculated sequentially: if two pixels are adjacent horizontally (x-coordinate difference 1, y-coordinate difference 0) or vertically (x-coordinate difference 0, y-coordinate difference 1), the distance between them is recorded as 1; if two pixels are adjacent diagonally (x-coordinate difference 1, y-coordinate difference 1), the distance between them is recorded as √2. The distances between every two adjacent points are summed to obtain the total skeleton length in pixels. This total length is then multiplied by the spatial resolution coefficient K to convert the length unit from pixels to nanometers, outputting the actual physical length of the nanotube.
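The length calculation of paragraph [0050] reduces to summing step lengths along an ordered skeleton path, taking a diagonal step as √2 (the Euclidean distance between diagonal neighbors). The sketch below assumes the skeleton pixels have already been ordered along the path; the function name is hypothetical.

```python
import math

def skeleton_length(path, k):
    """Physical length of an ordered 8-connected pixel path.

    path -- list of (row, col) skeleton pixels in traversal order
    k    -- spatial resolution coefficient (physical units per pixel)
    """
    total_px = 0.0
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        dr, dc = abs(r1 - r0), abs(c1 - c0)
        if max(dr, dc) != 1:
            raise ValueError("consecutive skeleton pixels must be adjacent")
        # Diagonal steps count sqrt(2); horizontal and vertical steps count 1.
        total_px += math.sqrt(2.0) if (dr == 1 and dc == 1) else 1.0
    return total_px * k

# One horizontal, one diagonal, and one vertical step at K = 0.5 nm per pixel.
length_nm = skeleton_length([(0, 0), (0, 1), (1, 2), (2, 2)], k=0.5)
```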
[0051] Finally, the curvature of the nanotube is calculated. The central skeleton line is composed of discrete pixels, and the lines connecting adjacent points are jagged, which would result in a large error if the curvature were calculated directly. Therefore, the discrete pixels corresponding to the central skeleton line are smoothed and fitted to generate continuous smooth curve functions x(t) and y(t) with respect to t. Based on this continuous smooth curve, the curvature is calculated according to the geometric differential calculus formula: the first derivatives ($x'(t)$, $y'(t)$) represent the tangent direction of the curve, and the second derivatives ($x''(t)$, $y''(t)$) indicate the degree of bending of the curve. The formula for calculating the local curvature is: $\kappa(t) = \frac{\left|x'(t)\,y''(t) - y'(t)\,x''(t)\right|}{\left(x'(t)^2 + y'(t)^2\right)^{3/2}}$. The formula for calculating the average curvature of a single nanotube is: $\bar{\kappa} = \frac{1}{M}\sum_{j=1}^{M}\kappa_j$, where $\kappa_j$ is the local curvature at the j-th skeleton point, j is a positive integer from 1 to M, and M is the skeleton length, i.e. the total number of skeleton points. The curvature value is calculated point by point along the skeleton path, and the curvature distribution curve and average curvature of each nanotube are output.
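For the curvature step in paragraph [0051], the sketch below uses the standard curvature formula for a planar parametric curve, which is assumed here to be the formula intended by the text. It takes the first and second derivatives as inputs; the function names are hypothetical.

```python
import math

def local_curvature(dx, dy, ddx, ddy):
    """Unsigned curvature |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2)."""
    speed_cubed = (dx * dx + dy * dy) ** 1.5
    if speed_cubed == 0.0:
        return 0.0  # degenerate point with no defined tangent
    return abs(dx * ddy - dy * ddx) / speed_cubed

def mean_curvature(curvatures):
    """Average of the per-point curvature values along one skeleton."""
    return sum(curvatures) / len(curvatures)

# Check on a circle of radius 5, x = 5 cos t, y = 5 sin t, at t = 0.3:
t, radius = 0.3, 5.0
kappa = local_curvature(-radius * math.sin(t), radius * math.cos(t),
                        -radius * math.cos(t), -radius * math.sin(t))
```

For the circle the result equals 1 / radius, and for a straight line it is zero. When the coordinates are in pixels, the curvature is in inverse pixels.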
[0052] In a preferred embodiment, the smoothing fitting process is as follows: the extracted single-pixel skeleton is parameterized, and the skeleton coordinates are smoothly fitted using local quadratic polynomial regression; during the fitting process, the endpoint neighborhood compensation is performed using mirror extension technology to avoid fitting divergence and calculation distortion caused by the lack of pixels outside the endpoints.
[0053] Specifically, the smoothing fitting process first parameterizes the extracted single-pixel skeleton. The pixels along the skeleton line are arranged in path order, and each point is assigned its coordinates $(x_j, y_j)$, forming a sequence of discrete points, where j = 1, 2, ..., M, and M is the total number of skeleton points. Using the point index j as the independent variable t, the discrete point sequence is transformed into the parameterized form $(t, x_j)$ and $(t, y_j)$; that is, the skeleton coordinates are expressed as functions of the path position t.
[0054] Based on this parameterization, local quadratic polynomial regression is used to smoothly fit the skeleton coordinates. For any point j on the skeleton, the 2w+1 points within its neighborhood window [j−w, j+w] are selected as local fitting samples, where w typically takes a value of 3 to 5. For the points within the window, a quadratic polynomial regression model is established with the x-coordinates and y-coordinates as dependent variables and t as the independent variable: $x(t) = a_0 + a_1 t + a_2 t^2$, $y(t) = b_0 + b_1 t + b_2 t^2$, where t is the global path parameter (values 1 to M), restricted to the window [j−w, j+w] during local fitting, and $a_0$, $a_1$, $a_2$, $b_0$, $b_1$, $b_2$ are the fitting coefficients obtained by the least squares method.
[0055] Substituting t=j into the fitted quadratic polynomial gives the smoothed coordinates $(x_s, y_s)$ at point j. The above local fitting is performed sequentially on each point of the skeleton to obtain a continuous curve composed of smoothed coordinates.
[0056] Regarding endpoint neighborhood processing, when the fitted point j is close to a skeleton endpoint (j≤w or j≥M−w+1), its neighborhood window extends beyond the skeleton range, resulting in insufficient fitting samples and making it impossible to solve the regression coefficients stably, which leads to fitting divergence or calculation distortion. To solve this problem, a mirror extension technique is used for endpoint neighborhood compensation. Specifically, for the starting point (j≤w), with the endpoint as the center of symmetry, the points located inside the endpoint within the window are mirrored outwards to construct the virtual points missing from the window. For example, when w=3 and j=1, the window range should be [−2, 4] (t values of −2, −1, 0, 1, 2, 3, 4), but the actual t values are only 1, 2, 3, … In this case, the points at t=2, 3, 4 are symmetrically reflected about t=1 to generate virtual point coordinates at t=0, −1, −2, with the reflection rule being $x_{1-k} = 2x_1 - x_{1+k}$; the y-coordinate follows the same rule. For the end point (j≥M−w+1), with the endpoint as the center of symmetry, the points inside the endpoint within the window are mirrored outwards to generate virtual points outside the endpoint. After mirror extension, the fitting window near each endpoint contains a complete set of 2w+1 points, ensuring that the fitting coefficients can be solved stably.
[0057] After smooth fitting, a continuous and smooth curve function x(t) and y(t) are obtained. Based on this, the curvature is calculated according to the geometric differential calculus formula, avoiding the error caused by directly differentiating the discrete sawtooth skeleton. The mirror extension technique solves the problem of incomplete fitting windows at the endpoints, ensuring the continuity and accuracy of curvature calculation for the entire skeleton from the start to the end.
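The smoothing procedure of paragraphs [0053] to [0057] can be sketched for one coordinate sequence as follows. The sketch fits the quadratic in the local offset u = t − j, which yields the same fitted curve as fitting in the global parameter t; with a symmetric window the least squares solution then has a closed form. The function names and returned derivative lists are illustrative assumptions, and the sequence is assumed to contain more than w points.

```python
def mirror_extend(values, w):
    """Add w virtual samples per end by reflection: x[1-k] = 2*x[1] - x[1+k]."""
    head = [2 * values[0] - values[k] for k in range(w, 0, -1)]
    tail = [2 * values[-1] - values[-1 - k] for k in range(1, w + 1)]
    return head + list(values) + tail

def local_quadratic_fit(values, w=3):
    """Local quadratic regression over windows of 2w+1 samples.

    Returns (smoothed values, first derivatives, second derivatives),
    each evaluated at the window centre.
    """
    ext = mirror_extend(values, w)
    offsets = list(range(-w, w + 1))
    s0 = 2 * w + 1
    s2 = sum(u * u for u in offsets)
    s4 = sum(u ** 4 for u in offsets)
    det = s0 * s4 - s2 * s2
    smooth, d1, d2 = [], [], []
    for j in range(len(values)):
        window = ext[j:j + 2 * w + 1]  # samples at offsets -w..w around j
        sx = sum(window)
        sux = sum(u * x for u, x in zip(offsets, window))
        suux = sum(u * u * x for u, x in zip(offsets, window))
        c0 = (s4 * sx - s2 * suux) / det   # fitted value at the centre
        c1 = sux / s2                      # first derivative at the centre
        c2 = (s0 * suux - s2 * sx) / det   # half the second derivative
        smooth.append(c0)
        d1.append(c1)
        d2.append(2 * c2)
    return smooth, d1, d2
```

Applying the function once to the x sequence and once to the y sequence gives the derivatives needed for a point-by-point curvature calculation. Because the reflection is antisymmetric about the endpoint, the fitted second derivative at the exact first and last points is zero.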
[0058] The following, in conjunction with the accompanying drawings, provides a detailed explanation of a dual-model-driven automatic nanotube segmentation and parameter measurement method. Figure 2 shows a schematic diagram of the overall framework structure of the present invention. As shown, the input module acquires TEM images of nanotubes; the preprocessing module performs grayscale enhancement and size normalization on the images; the U-Net segmentation model module, which includes an encoder, a decoder, and skip connections, extracts nanotube features and outputs a binary mask; the postprocessing module extracts the skeleton from the binary mask; the training target module generates training labels using the cross-shaped erosion strategy and calculates the loss function; the scale bar recognition module calculates the spatial resolution coefficient K through ROI cropping, thresholding, and OCR recognition; and the final output module fuses the skeleton, the mask, and the spatial resolution coefficient to calculate the diameter, length, and curvature parameters of the nanotubes.
[0059] The above describes a dual-model-driven automatic nanotube segmentation and parameter measurement method according to embodiments of this application. The following describes a dual-model-driven automatic nanotube segmentation and parameter measurement system according to embodiments of this application. Referring to Figure 3, this application provides a schematic diagram of a dual-model-driven automatic nanotube segmentation and parameter measurement system, which includes: The data acquisition and augmentation module is used to acquire nanotube TEM images and corresponding manually labeled masks, and to perform synchronous geometric transformations on the nanotube TEM images and the manually labeled masks to augment the dataset.
[0060] The model building module is used to construct a dual-model architecture that includes a scale recognition model and a deep neural network segmentation model.
[0061] The model training module is used to train a deep neural network segmentation model using the expanded dataset and optimize the supervision signal by applying an anti-adhesion processing strategy to the manually labeled mask.
[0062] The model testing module is used to input the TEM image of the nanotube under test into the dual model architecture for prediction, obtain spatial resolution coefficients, binary masks and skeleton features, and fuse the spatial resolution coefficients, binary masks and skeleton features to calculate the geometric morphology parameters of the nanotube.
[0063] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application.
Claims
1. A dual-model driven automatic nanotube segmentation and parameter measurement method, characterized in that, The method includes: S1. Obtain the nanotube TEM image and the corresponding artificially labeled mask, and perform synchronous geometric transformation on the nanotube TEM image and the artificially labeled mask to expand the dataset; S2. Construct a dual-model architecture that includes a scale recognition model and a deep neural network segmentation model; S3. Train the deep neural network segmentation model using the expanded dataset, and optimize the supervision signal by applying an anti-adhesion processing strategy to the manually labeled mask; S4. Input the TEM image of the nanotube to be tested into the dual-model architecture for prediction, obtain the spatial resolution coefficient, binary mask and skeleton features, and fuse the spatial resolution coefficient, the binary mask and the skeleton features to calculate the geometric morphology parameters of the nanotube.
2. The method according to claim 1, characterized in that, The synchronous geometric transformation includes: Each pair of nanotube TEM images and their corresponding artificially labeled masks are simultaneously rotated by 90 degrees, 180 degrees, and 270 degrees, and horizontally and vertically flipped, while maintaining the pixel correspondence between the nanotube TEM images and the artificially labeled masks during the transformation process, in order to generate new sample pairs to expand the training dataset.
3. The method according to claim 1, characterized in that, Building a dual-model architecture includes: Based on the grayscale features of nanotube TEM images, a grayscale threshold is set to perform binarization segmentation of the images and filter out candidate connected regions. Geometric morphology filtering is performed on the candidate connected regions to locate the position of the scale line segment and calculate its pixel length. At the same time, optical character recognition technology is used to identify the digital text near the scale line segment to obtain the physical label value. The physical size of a single pixel is calculated based on the physical label value and the pixel length to obtain the spatial resolution coefficient. A deep neural network segmentation model based on multi-scale feature extraction and reconstruction is constructed, wherein the deep neural network segmentation model includes a feature encoding module and a feature decoding module; Skip connections are set between the encoder and decoder at the same level in the deep neural network segmentation model to stitch shallow texture features into deep features, thereby enhancing the ability to capture the edge morphology of nanotubes.
4. The method according to claim 1, characterized in that, Training the deep neural network segmentation model using the expanded dataset includes: Pairs of nanotube TEM images and corresponding manually labeled masks are read in batches from the expanded dataset; The nanotube TEM images are processed using CLAHE and then normalized in image size before being used as model input data. A cross-shaped structural element erosion strategy is applied to the manually labeled mask to generate a binarized training target with tube wall separation characteristics; The model input data is fed into the deep neural network segmentation model, and after encoder feature extraction and decoder feature fusion, a pixel-level predicted probability map is output. The hybrid loss value is calculated based on the predicted probability map and the binarized training target; The gradient is calculated based on the hybrid loss value, and the network parameters of the deep neural network segmentation model are updated using the backpropagation algorithm until the model converges.
5. The method according to claim 4, characterized in that, The cross-shaped structural element erosion strategy is as follows: peel off the edge pixels in the top, bottom, left and right neighborhoods of the target pixel, and retain the connectivity of the diagonal neighboring pixels, so as to maintain the topological integrity of each single nanotube while separating the adhesion area.
6. The method according to claim 4, characterized in that, The hybrid loss function, which calculates the hybrid loss value, is composed of a weighted sum of binary cross-entropy loss and Dice coefficient loss. It is used to jointly constrain pixel classification accuracy and overall morphological overlap.
7. The method according to claim 1, characterized in that, In S4, the TEM image of the nanotube under test is input into the dual-model architecture for prediction, obtaining spatial resolution coefficients, binary masks, and skeleton features, including: The TEM image of the nanotube to be tested is simultaneously input into the scale recognition model and the deep neural network segmentation model. The scale recognition model outputs the spatial resolution coefficient of the image, and the deep neural network segmentation model outputs the predicted probability map of the image to be tested. The predicted probability map of the image to be tested is thresholded to generate an initial binary mask map. Perform connected component analysis on the initial binary mask image to identify and fill closed background holes with an area smaller than a preset area threshold, and generate a corrected binary mask image. The modified binary mask image is subjected to morphological thinning processing to extract the center skeleton line with a single pixel width, and defined as the skeleton feature.
8. The method according to claim 7, characterized in that, By fusing the spatial resolution coefficient, the binary mask, and the skeleton features, the geometric morphology parameters of the nanotube are calculated, including: Perform Euclidean distance transformation on the corrected binary mask to generate a distance map. Extract the pixel value of the coordinates corresponding to the central skeleton line on the distance map as the radius, and combine it with the spatial resolution coefficient to calculate the physical diameter distribution of the nanotube. Traverse the extracted central skeleton lines and count the number of pixels constituting the skeleton; denote the distance between two horizontally or vertically adjacent pixels as 1, and the distance between two diagonally adjacent pixels as √2. The total pixel distance is then multiplied by the spatial resolution coefficient to obtain the actual physical length of the nanotube. The discrete pixels corresponding to the central skeleton line are subjected to smooth fitting to generate a continuous smooth curve. Based on the continuous smooth curve, the curvature of the nanotube is calculated according to geometric differential calculus.
9. The method according to claim 8, characterized in that, The smoothing fitting process involves: parameterizing the extracted single-pixel skeleton and using local quadratic polynomial regression to smoothly fit the skeleton coordinates; during the fitting process, mirror extension technology is used to compensate for the endpoint neighborhood to avoid fitting divergence and calculation distortion caused by the lack of pixels outside the endpoints.
10. A dual-model driven automatic nanotube segmentation and parameter measurement system, used to implement the method as described in any one of claims 1 to 9, characterized in that, The system includes: The data acquisition and augmentation module is used to acquire nanotube TEM images and corresponding artificially labeled masks, and to perform synchronous geometric transformations on the nanotube TEM images and the artificially labeled masks to augment the dataset. The model building module is used to construct a dual-model architecture that includes a scale recognition model and a deep neural network segmentation model. The model training module is used to train the deep neural network segmentation model using the expanded dataset and to optimize the supervision signal by applying an anti-adhesion processing strategy to the manually labeled mask. The model testing module is used to input the TEM image of the nanotube under test into the dual model architecture for prediction, obtain the spatial resolution coefficient, binary mask and skeleton features, and fuse the spatial resolution coefficient, the binary mask and the skeleton features to calculate the geometric morphology parameters of the nanotube.