A deep learning-based superalloy microstructure segmentation method and device

By improving the Mask R-CNN model and combining adaptive histogram equalization and Sobel operator edge information, the problem of automating the segmentation of microstructures in high-temperature alloys was solved, achieving efficient and accurate segmentation under a scanning electron microscope, which is suitable for high-throughput materials research.

CN118736211B · Active · Publication Date: 2026-06-19 · ZHEJIANG UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: ZHEJIANG UNIV
Filing Date: 2024-05-31
Publication Date: 2026-06-19

IPC: G06V10/26; G06V10/25; G06V20/70; G06V10/764; G06V10/766; G06T5/94; G06T5/40; G06N3/045; G06N3/0464; G06N3/084; G06N3/096

AI Tagging

Application Domain

Image enhancement; Character and pattern recognition

Technology Topics

Data set; Algorithm

AI Technical Summary

Technical Problem

Existing technologies make it difficult to perform precise and efficient automated segmentation of the microstructure of high-temperature alloy materials under scanning electron microscopy, especially in cases of high background noise and complex structures. Traditional methods rely on manual intervention and are inefficient, failing to meet the needs of high-throughput materials research.

Method used

An improved Mask R-CNN model is adopted, which combines adaptive histogram equalization to enhance image contrast, integrates Sobel operator edge information, and introduces Huber loss function. Through transfer learning training, automatic segmentation of microstructures of high-temperature alloys is achieved.

Benefits of technology

It enables accurate identification and quantitative analysis of the microstructure of high-temperature alloy materials under high noise and complex structures, and is suitable for high-throughput material research, improving segmentation efficiency and accuracy.

Abstract

This invention discloses a deep learning-based method and apparatus for segmenting the microstructures of high-temperature alloys, comprising: collecting images of the microstructures of high-temperature alloy materials from scanning electron microscopes; labeling and annotating the microstructures in the images; constructing a dataset through image preprocessing and image enhancement; constructing an improved Mask R-CNN model, employing adaptive histogram equalization to expand the dynamic range of image gray levels and enhance local contrast; fusing image edge information into the network model using the Sobel operator and introducing the Huber loss function to improve model performance; training the model through transfer learning; and using the trained instance segmentation model to identify and segment the SEM images to be analyzed, obtaining the segmentation mask and labeling rectangle for each microstructure in the image. This invention achieves accurate identification of microstructure objects in high-temperature alloy materials based on the improved Mask R-CNN model.

Description

Technical Field

[0001] This invention relates to the field of microstructure characterization of high-temperature alloy materials under scanning electron microscopy, and particularly to a method and apparatus for microstructure segmentation of high-temperature alloys based on deep learning.

Background Technology

[0002] The microstructure of materials is one of the four fundamental elements of materials science. The properties and performance of materials depend on their microstructure, which can be controlled through the manufacturing process. The microstructure of a material is more indicative of its properties than its chemical composition. In the field of materials science, observing and analyzing the microstructure of materials to determine the relationship between its microstructure and performance is a crucial research topic.

[0003] The microstructure of materials is imaged as a grayscale image under a scanning electron microscope (SEM). The imaging quality is affected by factors such as the quality of sample pretreatment and manual operation during the scanning process, often resulting in difficulty in accurately reflecting microstructural features. Microstructural features extracted using traditional methods rely on subjective manual labeling, which is inefficient. Therefore, automated instance segmentation methods can provide accurate, high-throughput, and high-efficiency localization of microstructural features in in-situ experiments, which can greatly supplement material analysis. Microstructures in SEM images often exhibit blurred morphological features and poor imaging quality, while traditional segmentation methods rely on expert manual annotation, making it difficult to collaboratively extract and observe a large number of features. Furthermore, the microstructure of materials differs from ordinary objects in the macroscopic world; its structure is usually very complex, such as irregular grains, bent grain boundaries, and interlacing. Microstructural imaging is affected by instrument and sample preparation, resulting in high image noise and low signal-to-noise ratio. At the same time, the difference between microstructural features and the background environment in SEM images is not obvious, and the image contrast is uneven. Traditional instance segmentation techniques are difficult to adapt to the special properties of material microstructural images. Therefore, there is an urgent need to propose instance segmentation techniques that can meet the requirements of microstructure characterization.

[0004] Traditional automatic image segmentation algorithms, such as those incorporated in ImageJ, including threshold-based and watershed algorithms, typically segment by finding spots with constant contrast or edges with varying contrast. These methods can handle suitable microstructures well, but complex or non-ideal images often require a lot of human intervention, or even manual segmentation, resulting in a subjective, slow workflow that is limited to specific materials.

[0005] With the in-depth development of machine vision, image segmentation based on machine vision has important and wide applications in the fields of autonomous driving and medical imaging. Due to its excellent efficiency, it has attracted great attention in the segmentation of material microstructures.

[0006] However, existing machine vision models can only segment specific types of microstructures in relatively simple SEM/TEM images, and perform poorly when dealing with high background noise, small microstructures, and dense microstructures. Furthermore, existing networks generally perform only semantic segmentation of microstructures, i.e., separating the target object from the background, which does not meet the requirement of high-throughput materials research to quantitatively study multiple microstructure features simultaneously.

Summary of the Invention

[0007] The purpose of this invention is to address the shortcomings of existing technologies by proposing a method and apparatus for segmenting the microstructure of high-temperature alloys based on deep learning.

[0008] The objective of this invention is achieved through the following technical solution: a method for segmenting the microstructure of high-temperature alloys based on deep learning, comprising the following steps:

[0009] S1: Collect images of the microstructure of high-temperature alloy materials from scanning electron microscopes, annotate and label the microstructure in the SEM images, and construct a dataset after image preprocessing and image enhancement.

[0010] S2: Construct an improved Mask R-CNN model. The improved Mask R-CNN model uses adaptive histogram equalization to expand the dynamic range of image gray levels and enhance local contrast. It also uses the Sobel operator to fuse image edge information into the network model and introduces the Huber loss function to improve model performance.

[0011] S3: Train the improved Mask R-CNN model on the constructed dataset through transfer learning to obtain a material microstructure instance segmentation model.

[0012] S4: Use the trained material microstructure instance segmentation model to identify and segment the SEM image to be analyzed, obtaining the segmentation mask and calibration rectangle for each microstructure in the image.
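As an illustration of the output described in step S4, the following minimal sketch derives a calibration rectangle and a pixel area from each binary instance mask. The function names `mask_to_box` and `summarize_instances`, the `(y1, x1, y2, x2)` box convention, and the toy masks are assumptions made for this example and do not come from the patent.

```python
import numpy as np

def mask_to_box(mask):
    """Return (y1, x1, y2, x2) of the tightest rectangle around a binary mask.

    y2 and x2 are exclusive. Returns None for an empty mask.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1

def summarize_instances(masks):
    """masks: boolean array of shape (num_instances, H, W)."""
    return [{"box": mask_to_box(m), "area_px": int(m.sum())} for m in masks]

# Two toy instances on an 8x8 image (hypothetical data).
masks = np.zeros((2, 8, 8), dtype=bool)
masks[0, 2:5, 1:4] = True
masks[1, 6:8, 5:8] = True
stats = summarize_instances(masks)
```

Per-instance areas of this kind are one way the segmentation result could feed quantitative analysis; converting pixels to physical units would additionally require the SEM scale, which is not given here.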

[0013] Further, step S1 involves collecting microstructure images of high-temperature alloy materials from scanning electron microscopes, labeling and annotating the microstructures in the SEM images, and constructing a dataset through image preprocessing and image enhancement, specifically including:

[0014] S1.1 Collect microscopic images of the microstructure of high-temperature alloys from scanning electron microscopes;

[0015] S1.2 Use the VGG Image Annotator to define regions in the image, add text descriptions to the regions, convert the data format to JSON format, and divide the dataset;

[0016] S1.3 Augment the image data using methods such as contrast enhancement, image sharpening, rotation, and deformation.
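The annotation and dataset-division step can be sketched as follows, assuming polygon regions exported from the VGG Image Annotator as JSON. The attribute key `label`, the class name `grain`, the file names, and the 50/50 split in the usage lines are hypothetical choices for illustration; the patent does not state them.

```python
import json
import random

def load_via_regions(via_json_text, label_key="label"):
    """Parse a VIA JSON export into a list of per-image polygon records."""
    data = json.loads(via_json_text)
    records = []
    for entry in data.values():
        regions = entry.get("regions", [])
        if isinstance(regions, dict):      # some older exports use a dict
            regions = list(regions.values())
        polygons = []
        for region in regions:
            shape = region["shape_attributes"]
            if shape.get("name") != "polygon":
                continue                   # this sketch handles polygons only
            points = list(zip(shape["all_points_x"], shape["all_points_y"]))
            label = region.get("region_attributes", {}).get(label_key)
            polygons.append({"label": label, "points": points})
        records.append({"filename": entry["filename"], "polygons": polygons})
    return records

def split_dataset(records, val_fraction=0.2, seed=0):
    """Shuffle reproducibly and split into (train, validation) lists."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction)) if len(shuffled) > 1 else 0
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical two-image export.
example = json.dumps({
    "img_001.png12345": {
        "filename": "img_001.png", "size": 12345,
        "regions": [{
            "shape_attributes": {"name": "polygon",
                                 "all_points_x": [10, 40, 40],
                                 "all_points_y": [10, 10, 30]},
            "region_attributes": {"label": "grain"}}],
        "file_attributes": {}},
    "img_002.png23456": {"filename": "img_002.png", "size": 23456,
                         "regions": [], "file_attributes": {}},
})
records = load_via_regions(example)
train, val = split_dataset(records, val_fraction=0.5)
```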

[0017] Further, step S2, constructing the improved Mask R-CNN model, specifically includes:

[0018] S2.1 Apply adaptive histogram equalization to the scanning electron microscope images of the high-temperature alloy microstructure to expand the dynamic range of gray levels and enhance local contrast.

[0019] S2.2 Input the scanning electron microscope image after adaptive histogram equalization into the shared convolutional layer of the 101-layer residual neural network ResNet101 and the feature pyramid network to extract the feature map of the material microstructure in the image.

[0020] S2.3. Use the Region Proposal Network (RPN) to generate target proposal candidate boxes of varying locations and sizes on the feature map;

[0021] S2.4. Use the ROI Align method to process the target proposal candidate boxes, associating the feature map of each proposal candidate box with a specific ROI and cropping it into a feature vector of fixed size;

[0022] S2.5 The feature vectors output by ROI Align are fed into two heads: a classification and bounding box regression head and a mask prediction head.

[0023] The classification branch in the classification and bounding box regression head predicts the object category of each region proposed by the RPN through a fully connected layer to distinguish different types of objects in the image;

[0024] The bounding box regression branch in the classification and bounding box regression head is used to adjust the coordinates of the region proposed by each RPN, refining its size and position to more accurately enclose the object;

[0025] The mask prediction head uses a small fully convolutional network for each ROI region, which generates a binary mask that outlines the precise shape of the object;

[0026] S2.6 The Sobel operator is used to fuse image edge information into the network model. This is manifested by introducing a small edge detection network after the output mask branch. The network input is the predicted mask and the ground truth mask. The two are convolved with the Sobel operator to determine the edge difference between the predicted mask and the ground truth mask. Then, the error of edge consistency is calculated by the Huber loss function.

[0027] Furthermore, the construction of the improved Mask R-CNN model, using adaptive histogram equalization to expand the dynamic range of image gray levels and enhance local contrast in scanning electron microscope images of high-temperature alloy microstructures, specifically includes:

[0028] S2.1.1 Input scanning electron microscope image;

[0029] S2.1.2 Divide the image into several sub-regions;

[0030] S2.1.3 Calculate the histogram for each region and preset the maximum pixel frequency threshold;

[0031] S2.1.4. Allocate excess pixels from the histogram that exceed the frequency threshold to the remaining bins;

[0032] S2.1.5. Use the cumulative distribution function to scale and map the redistributed histogram;

[0033] S2.1.6. Use bilinear interpolation to stitch the generated sub-regions together;

[0034] S2.1.7 Output enhanced scanning electron microscope images.

[0035] Furthermore, the Region Proposal Network (RPN) generates target proposal candidate boxes of varying positions and sizes on the feature map, specifically including:

[0036] S2.2.1. Input the feature map of the material microstructure obtained from the shared convolutional layers of the 101-layer ResNet101 residual neural network and the feature pyramid network.

[0037] S2.2.2. Generate anchor points for the element points on the feature map. With each anchor point as the center, generate anchor boxes of different sizes and shapes in a sliding-window manner. The anchor boxes are controlled by two parameters: the pixel size of the longest side of the box and the aspect ratio.

[0038] S2.2.3. Generate 9 anchor boxes of different sizes and proportions at each anchor point of the feature map;

[0039] S2.2.4 Use the RPN network to predict the probability that each anchor box is background or foreground, and refine the anchor box.

[0040] Furthermore, the target proposal candidate boxes are processed using the ROI Align method, which associates the feature maps of the proposal candidate boxes with specific ROIs and clips them into feature vectors of a fixed size. Specifically, this includes:

[0041] S2.3.1. Traverse each candidate region, keeping the floating-point boundaries unquantized;

[0042] S2.3.2 Divide the candidate region into k*k units, and do not quantize the boundary of each unit;

[0043] S2.3.3 Calculate four fixed coordinate positions in each cell, use bilinear interpolation to calculate the interpolation value of these four positions, and then perform max pooling operation;

[0044] S2.3.4 Output a fixed-size feature map that preserves the edge feature information of the image.

[0045] Furthermore, the specific method of fusing image edge information into the network model using the Sobel operator includes: the Sobel operator for edge detection is a network with 3*3*2 convolutional kernels, containing a horizontal filter describing the horizontal gradient and a vertical filter describing the vertical gradient.

[0046] Furthermore, the error in calculating edge consistency using the Huber loss function specifically includes:

[0047] Edge detection is integrated into the network as a branch, and an edge loss function $L_{edge}$ is added to the loss function. The improved Mask R-CNN loss function is $L = L_{cls} + L_{box} + L_{mask} + L_{edge}$, where $L_{cls}$ is the classification loss, which uses the cross-entropy loss function. The calculation formula is as follows:

[0048] $$L_{cls} = -\frac{1}{N}\sum_{i=1}^{N} y_i \log(p_i)$$

[0049] where $N$ is the total number of samples, $y_i$ is the true class label of the i-th sample, and $p_i$ is the predicted probability of the i-th sample;

[0050] $L_{box}$ is the bounding box regression loss, which uses the smooth L1 loss. The calculation formula is as follows:

[0051] $$L_{box} = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L1}(t_i - t_i^*)$$

[0052] where $t_i$ denotes the parameters of the predicted bounding box (center coordinates $x$ and $y$, width $w$, and height $h$) and $t_i^*$ denotes the parameters of the true bounding box. The smooth L1 loss is defined as follows:

[0053] $$\mathrm{smooth}_{L1}(d) = \begin{cases} 0.5\,d^2, & |d| < 1 \\ |d| - 0.5, & \text{otherwise} \end{cases}$$

[0054] $L_{mask}$ is the mask loss, which uses a per-pixel binary cross-entropy loss to calculate the difference between the ground truth mask and the predicted mask at each pixel. The calculation formula is as follows:

[0055] $$L_{mask} = -\frac{1}{m^2}\sum_{i=1}^{m^2}\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right]$$

[0056] where $m$ is the resolution of the mask (the mask has $m \times m$ pixels), $y_i$ is the real mask label of the i-th pixel, and $p_i$ is the predicted probability of the i-th pixel;

[0057] The edge loss function $L_{edge}$ evaluates the difference between the edge information of the instance segmentation mask and the ground truth edge information, using the Huber loss as the edge detection loss function, namely:

[0058] $$L_{\delta}(a) = \begin{cases} \frac{1}{2}a^2, & |a| \le \delta \\ \delta\left(|a| - \frac{1}{2}\delta\right), & \text{otherwise} \end{cases}$$ where $a$ is the difference between the predicted and ground truth edge responses and $\delta$ is the Huber threshold.
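The classification, bounding box, and mask loss terms described above can be sketched in NumPy as follows. This is a minimal illustration using the standard definitions of cross-entropy, smooth L1, and per-pixel binary cross-entropy; the function names and the `eps` clipping constant are assumptions for this example, and the exact reductions used in the patented model are not given in the text.

```python
import numpy as np

def cross_entropy(y, p, eps=1e-12):
    """Classification loss: -(1/N) * sum_i y_i * log(p_i)."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    return float(-np.mean(y * np.log(p)))

def smooth_l1(d):
    """Smooth L1: quadratic for |d| < 1, linear otherwise."""
    d = np.abs(np.asarray(d, dtype=float))
    return np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)

def box_loss(t, t_star):
    """Sum of smooth L1 over the four box parameters (x, y, w, h)."""
    diff = np.asarray(t, dtype=float) - np.asarray(t_star, dtype=float)
    return float(np.sum(smooth_l1(diff)))

def mask_bce(y, p, eps=1e-12):
    """Per-pixel binary cross-entropy averaged over all mask pixels."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Small worked values (hypothetical inputs).
example_box = box_loss([0.5, 0.0, 0.0, 2.0], [0.0, 0.0, 0.0, 0.0])
example_mask = mask_bce([1, 0], [0.5, 0.5])
```

In the box example, the first parameter falls in the quadratic branch (0.5 × 0.5² = 0.125) and the last in the linear branch (2 − 0.5 = 1.5), which shows why smooth L1 is less sensitive to large regression errors than a squared loss.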

[0059] Further, step S3 includes using MS COCO weights and SEM image dataset to perform transfer learning training on the improved Mask R-CNN model to obtain a material microstructure instance segmentation model. During the training process, the training optimizer of the model is set to Adam, and the learning rate adjustment strategy is Step.

[0060] According to another aspect of the specification, a high-temperature alloy material microstructure segmentation device based on an improved Mask R-CNN model is also provided, including a memory and one or more processors. The memory stores executable code, and when the processor executes the executable code, it implements the deep learning-based high-temperature alloy microstructure segmentation method.

[0061] The beneficial effects of this invention are:

[0062] This invention provides a method for segmenting the microstructure of high-temperature alloy materials based on an improved Mask R-CNN model. The improved Mask R-CNN model replaces the traditional image segmentation method, which is subjective, relies on human intervention, is slow, and limited to specific materials, with an automated deep learning segmentation approach. Contrast-limited adaptive histogram equalization expands the dynamic range of image gray levels, enhancing local contrast, making it suitable for imaging scenarios with high noise, low signal-to-noise ratio, and uneven image contrast in scanning electron microscope images. The Sobel operator is used to fuse image edge information into the network model, and the Huber loss function is introduced to improve model performance. This solves the problem of segmenting high background noise, small microstructures, and dense microstructures, enabling accurate identification of the microstructure of high-temperature alloy materials under scanning electron microscope images. It is suitable for the requirement of simultaneously conducting quantitative research on multiple microstructure features in the context of high-throughput materials research, providing theoretical and methodological support for the characterization and analysis of the microstructure of high-temperature alloy materials, and has promising application prospects.

Attached Figure Description

[0063] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0064] Figure 1 is a flowchart of the microstructure segmentation method for high-temperature alloy materials based on the improved Mask R-CNN model of the present invention;

[0065] Figure 2 is a schematic diagram of the improved Mask R-CNN model of the present invention;

[0066] Figure 3 is a flowchart of the contrast-limited adaptive histogram equalization algorithm;

[0067] Figure 4 is a flowchart of the ROI Align algorithm;

[0068] Figure 5 is a schematic diagram of the small edge detection network of the present invention;

[0069] Figure 6 is a schematic diagram of the segmentation mask and calibration rectangle for each microstructure of a high-temperature alloy material in an SEM image;

[0070] Figure 7 is a schematic diagram of a high-temperature alloy material microstructure segmentation device based on an improved Mask R-CNN model provided in an embodiment of the present invention.

Detailed Implementation

[0071] The specific embodiments of the present invention will be further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0072] The purpose of this invention is to provide a deep learning-based method for segmenting the microstructure of high-temperature alloys. This method employs adaptive histogram equalization and an improved Mask R-CNN model that integrates image edge information into the network model using the Sobel operator and introduces the Huber loss function to achieve accurate segmentation of the microstructure of high-temperature alloys in scanning electron microscope images.

[0073] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0074] As shown in Figure 1, the microstructure segmentation method for high-temperature alloy materials based on an improved Mask R-CNN model provided by this invention includes the following steps:

[0075] S1: Collect images of the microstructure of high-temperature alloy materials from scanning electron microscopes, annotate and label the microstructure in the SEM images, and construct a dataset after image preprocessing and image enhancement.

[0076] S2: Construct an improved Mask R-CNN model. The improved Mask R-CNN model uses adaptive histogram equalization to expand the dynamic range of image gray levels and enhance local contrast. It also uses the Sobel operator to fuse image edge information into the network model and introduces the Huber loss function to improve model performance.

[0077] S3: Use MS COCO weights and the SEM image dataset to perform transfer learning training on the improved Mask R-CNN model to obtain a material microstructure instance segmentation model.

[0078] S4: Use the trained instance segmentation model to identify and segment the SEM image to be analyzed, obtaining the segmentation mask and calibration rectangle for each microstructure in the image.

[0079] Specifically, step S1 involves collecting images of the microstructure of high-temperature alloy materials from scanning electron microscopes, labeling and annotating the microstructures in the SEM images, and constructing a dataset through image preprocessing and image enhancement, including:

[0080] S1.1 Prepare an IN718 specimen suitable for high-temperature in-situ tensile testing. The specimen is dog-bone shaped, 50 mm long, and contains a gauge length of 1.5 mm and 1.35 mm. The original tensile specimen is mechanically polished with 400-grit, 600-grit, and 1200-grit sandpaper in sequence, then coarsely and finely polished with a vibratory polisher. The polished specimen is then etched with copper chloride solution so that grains and grain boundaries are visible in the in-situ test.

[0081] S1.2 Collect microscopic images of the high-temperature alloy microstructure from scanning electron microscopes during the in-situ experiment;

[0082] S1.3 Use the VGG Image Annotator to define regions in the image, add text descriptions of the regions, convert the data format to JSON format, and divide the dataset;

[0083] S1.4 Augment the image data using methods such as contrast enhancement, image sharpening, rotation, and deformation in order to improve the robustness of the model.
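Three of the augmentation methods named in S1.4 can be sketched with NumPy alone, as below. The percentile limits, the 3×3 box blur used for unsharp masking, and the restriction to 90-degree rotations are simplifying assumptions for this example; deformation is omitted. Note that any geometric transform has to be applied to the annotation mask as well as to the image.

```python
import numpy as np

def stretch_contrast(img, low_pct=2, high_pct=98):
    """Linear contrast stretch between two gray-level percentiles."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:
        return img.copy()
    out = (img.astype(np.float32) - lo) / (hi - lo)
    return (np.clip(out, 0.0, 1.0) * 255).astype(np.uint8)

def sharpen(img, amount=1.0):
    """Unsharp masking: add back the difference from a 3x3 box blur."""
    f = img.astype(np.float32)
    h, w = f.shape
    p = np.pad(f, 1, mode="edge")
    blur = sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)) / 9.0
    return np.clip(f + amount * (f - blur), 0, 255).astype(np.uint8)

def rotate90(img, mask, k):
    """Rotate image and mask together by k * 90 degrees."""
    return np.rot90(img, k).copy(), np.rot90(mask, k).copy()

# Hypothetical low-contrast 8x8 image with gray levels 100..163.
img = (np.arange(64, dtype=np.uint8).reshape(8, 8) + 100).astype(np.uint8)
mask = np.zeros((8, 8), dtype=bool)
mask[1:3, 2:6] = True
stretched = stretch_contrast(img)
rot_img, rot_mask = rotate90(img, mask, k=1)
```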

[0084] Specifically, in step S2, as shown in Figure 2, an improved Mask R-CNN model is constructed. This improved Mask R-CNN model employs adaptive histogram equalization to expand the dynamic range of image gray levels and enhance local contrast. It also uses the Sobel operator to fuse image edge information into the network model and introduces the Huber loss function to improve model performance, including:

[0085] S2.1 Adaptive histogram equalization is applied to the scanning electron microscope images of the high-temperature alloy microstructure to expand the dynamic range of image gray levels and enhance local contrast. Specifically, as shown in Figure 3, this includes the following steps:

[0086] (1) Input scanning electron microscope image;

[0087] (2) Divide the image into several sub-regions;

[0088] (3) Calculate the histogram of each sub-region and preset the maximum frequency threshold of pixels;

[0089] (4) Redistribute the excess pixels in the histogram that exceed the pixel frequency threshold to the remaining bins;

[0090] (5) Use the cumulative distribution function (CDF) to scale and map the redistributed histogram;

[0091] (6) Use bilinear interpolation to stitch the generated sub-regions together;

[0092] (7) Output enhanced scanning electron microscope images.
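Steps (1) to (7) above can be sketched in NumPy as follows. This is a simplified illustration, not the patented implementation: it assumes an 8-bit grayscale image whose height and width are divisible by the tile grid, spreads the clipped excess uniformly over all bins in a single pass, and uses hypothetical defaults for `grid` and `clip_limit`.

```python
import numpy as np

def tile_lut(tile, clip_limit=2.0, bins=256):
    """Steps 3-5: clipped histogram of one tile -> gray-level mapping."""
    hist = np.bincount(tile.ravel(), minlength=bins).astype(np.float64)
    limit = clip_limit * tile.size / bins            # max count per bin
    excess = np.maximum(hist - limit, 0.0).sum()     # pixels above the limit
    hist = np.minimum(hist, limit) + excess / bins   # redistribute uniformly
    cdf = np.cumsum(hist) / hist.sum()
    return np.round(cdf * (bins - 1))

def clahe(img, grid=(4, 4), clip_limit=2.0):
    h, w = img.shape
    gy, gx = grid
    th, tw = h // gy, w // gx                        # step 2: tile size
    luts = np.array([[tile_lut(img[r * th:(r + 1) * th, c * tw:(c + 1) * tw],
                               clip_limit)
                      for c in range(gx)] for r in range(gy)])
    # Step 6: position of every pixel relative to the tile centers.
    fy = np.clip((np.arange(h) + 0.5) / th - 0.5, 0, gy - 1)
    fx = np.clip((np.arange(w) + 0.5) / tw - 0.5, 0, gx - 1)
    y0 = np.floor(fy).astype(int)
    x0 = np.floor(fx).astype(int)
    y1 = np.minimum(y0 + 1, gy - 1)
    x1 = np.minimum(x0 + 1, gx - 1)
    wy = (fy - y0)[:, None]
    wx = (fx - x0)[None, :]
    Y0, Y1 = y0[:, None], y1[:, None]
    X0, X1 = x0[None, :], x1[None, :]
    # Bilinear blend of the four neighbouring tile mappings.
    out = ((1 - wy) * (1 - wx) * luts[Y0, X0, img]
           + (1 - wy) * wx * luts[Y0, X1, img]
           + wy * (1 - wx) * luts[Y1, X0, img]
           + wy * wx * luts[Y1, X1, img])
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# Hypothetical low-contrast test image with gray levels 100..120.
rng = np.random.default_rng(0)
img = rng.integers(100, 121, size=(64, 64), dtype=np.uint8)
enhanced = clahe(img)
```

The clip limit is what distinguishes this from plain adaptive equalization: it caps how steep the mapping can become in nearly uniform regions, which limits noise amplification in low signal-to-noise SEM images.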

[0093] S2.2 Input the adaptive-histogram-equalized scanning electron microscope image into the shared convolutional layers of the 101-layer residual neural network ResNet101 and the feature pyramid network to extract feature maps of the material microstructure in the image.

[0094] S2.3 Use the Region Proposal Network (RPN) to generate target proposal candidate boxes of varying locations and sizes on the feature map. The specific steps are as follows:

[0095] (1) Input the feature map of the material microstructure obtained by the shared convolutional layer in the ResNet101 residual neural network with 101 layers and the feature pyramid network;

[0096] (2) Generate anchor points for all element points on the feature map, and generate anchor boxes of different sizes and shapes using a sliding window with the anchor points as the center. The anchor boxes are controlled by two parameters: the pixel size of the longest side of the box (Scales) and the aspect ratio (Aspectboxes).

[0097] (3) The pixel sizes (Scales) are 128×128, 256×256, and 512×512, and the aspect ratios (Aspectboxes) are 1:1, 1:2, and 2:1. Combining the three scales with the three aspect ratios therefore yields 9 anchor boxes of different sizes and proportions at each anchor point of the feature map.
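The anchor generation described above can be sketched as follows. The sketch assumes that each scale s defines an anchor of area s × s (as the 128×128, 256×256, 512×512 listing suggests), that a ratio r means width divided by height, and that the feature map has a stride of 16 pixels relative to the input image; the stride and the function name `make_anchors` are hypothetical.

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Return (feat_h * feat_w * 9, 4) anchors as (y1, x1, y2, x2)."""
    widths, heights = [], []
    for s in scales:
        for r in ratios:
            widths.append(s * np.sqrt(r))    # area stays s * s
            heights.append(s / np.sqrt(r))
    widths = np.array(widths)
    heights = np.array(heights)
    # Anchor points: centers of the feature-map cells in image coordinates.
    cy = (np.arange(feat_h) + 0.5) * stride
    cx = (np.arange(feat_w) + 0.5) * stride
    cy, cx = np.meshgrid(cy, cx, indexing="ij")
    cy = cy.reshape(-1, 1)
    cx = cx.reshape(-1, 1)
    boxes = np.stack([cy - heights / 2, cx - widths / 2,
                      cy + heights / 2, cx + widths / 2], axis=-1)
    return boxes.reshape(-1, 4)

anchors = make_anchors(feat_h=2, feat_w=3)
areas = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
```

With a 2×3 feature map this produces 2 × 3 × 9 = 54 anchors; the first nine share the same center and differ only in scale and aspect ratio.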

[0098] For all pixels in the feature map, the anchor box classification and location are determined by the anchor box classifier layer and the anchor box regressor layer, with the loss function being:

[0099] $$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$$

[0100] where $L_{cls}$ is the classification loss function for the foreground judgment, $p_i$ is the foreground probability score, $p_i^*$ is the ground truth value, and $N_{cls}$ is used for normalization, with its value determined by the size of the mini-batch.

[0101] $L_{reg}$ is the coordinate regression loss function, $t_i$ is a vector of the four parameterized coordinates of the predicted bounding box, $t_i^*$ is the corresponding ground truth vector, and $N_{reg}$ is used for normalization, determined by the number of anchor points. $\lambda$ is a balancing parameter for weighting; by default, $\lambda$ is 10.

[0102] The RPN is trained by backpropagating this loss function. Its outputs are then used to filter the target proposal candidate boxes, and the candidate boxes that are classified as foreground and accurately positioned are input into ROI Align in the next step.

[0103] S2.4 The ROI Align method is used to process the target proposal candidate boxes, associating the feature maps of the proposal candidate boxes with specific ROIs and cropping them into feature vectors of a fixed size. As shown in Figure 4, the specific steps are as follows:

[0104] (1) Traverse each candidate region, keeping the floating-point boundaries unquantized;

[0105] (2) Divide the candidate region into k*k units, and do not quantize the boundary of each unit;

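[0106] (3) Calculate the four fixed coordinate positions in each cell, and use bilinear interpolation to calculate the values at these four positions. For a sampling point $(x, y)$ lying between the four nearest grid points $(x_0, y_0)$, $(x_1, y_0)$, $(x_0, y_1)$, and $(x_1, y_1)$ with unit spacing, the bilinear interpolation equation is: $$f(x, y) = (1-\Delta x)(1-\Delta y)\,f(x_0, y_0) + \Delta x\,(1-\Delta y)\,f(x_1, y_0) + (1-\Delta x)\,\Delta y\,f(x_0, y_1) + \Delta x\,\Delta y\,f(x_1, y_1)$$ with $\Delta x = x - x_0$ and $\Delta y = y - y_0$.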

[0107] (4) Perform max pooling operation;

[0108] (5) Output a feature vector of fixed size without losing important boundary features.
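The five ROI Align steps above can be sketched for a single-channel feature map as follows. The sketch assumes that the four sampling positions sit at 1/4 and 3/4 of each cell in both directions and that coordinates are expressed in feature-map pixels with integer values at pixel centers; both are common conventions but are not stated in the patent. The function names are hypothetical.

```python
import numpy as np

def bilinear(feat, y, x):
    """Bilinearly interpolate a 2-D feature map at a floating-point location."""
    h, w = feat.shape
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * feat[y0, x0] + (1 - dy) * dx * feat[y0, x1]
            + dy * (1 - dx) * feat[y1, x0] + dy * dx * feat[y1, x1])

def roi_align(feat, roi, k=2):
    """roi = (y1, x1, y2, x2) in unquantized feature-map coordinates."""
    y1, x1, y2, x2 = roi
    cell_h, cell_w = (y2 - y1) / k, (x2 - x1) / k    # no rounding of cells
    out = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            samples = [bilinear(feat, y1 + (i + a) * cell_h, x1 + (j + b) * cell_w)
                       for a in (0.25, 0.75) for b in (0.25, 0.75)]
            out[i, j] = max(samples)                 # max pooling per cell
    return out

# Linear test map: feat[y, x] = 4 * y + x, so interpolation is exact.
feat = np.arange(16, dtype=float).reshape(4, 4)
pooled = roi_align(feat, roi=(0.5, 0.5, 2.5, 2.5), k=2)
```

Because neither the ROI boundary nor the cell boundaries are rounded to integer pixels, the pooled values stay aligned with the original object location, which matters for small and densely packed microstructures.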

[0109] S2.5 The feature vectors output by ROI Align are fed into two heads: the classification and bounding box regression head and the mask prediction head. In the classification and bounding box regression head:

[0110] The classification branch predicts the object category for each region proposed by the RPN through a fully connected layer, in order to distinguish different types of objects in the image;

[0111] The bounding box regression branch is used to adjust the coordinates of the region proposed by each RPN, refining its size and position to more accurately enclose the object;

[0112] S2.6 The mask prediction head uses a small fully convolutional network (FCN) for each ROI region to generate a binary mask that outlines the precise shape of the object.

[0113] As shown in Figure 5, the Sobel operator is used to fuse image edge information into the network model. A small edge detection network is introduced after the output of the mask branch. Its inputs are the predicted mask and the ground truth mask. Both are convolved with the Sobel operator to determine the edge difference between the predicted mask and the ground truth mask, and the Huber loss function is then used to calculate the edge consistency error. The specific steps are as follows:

[0114] Edge detection is integrated as a branch into the network, and an edge loss function L is added to the loss function. edge The improved Mask R-CNN loss function is: L = L cls +L box +L mask +L edge ;

[0115] L cls The classification loss, or classification loss, is used to evaluate the accuracy of class prediction in object detection. It uses the cross-entropy loss function to calculate the difference between the actual class label and the predicted class. The formula is as follows:

[0116]

[0117] Where N is the total number of samples, y i p is the true class label of the i-th sample. i It is the predicted probability of the i-th sample.

[0118] L boxThe bounding box regression loss is used to evaluate the difference between the predicted and ground truth bounding boxes. It uses a smoothed L1 loss to calculate this difference, and its formula is as follows:

[0119]

[0120] Among them, t i The parameters representing the predicted bounding box are (center coordinates x, y, and width, height w, h). The parameters represent the true bounding box. The smooth L1 loss is defined as:

[0121]

[0122] L mask Mask loss, or masking loss, is used to evaluate the accuracy of mask prediction in instance segmentation. It uses the binary cross-entropy loss for each pixel to calculate the difference between the ground truth mask and the predicted mask for each pixel. The calculation formula is as follows:

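[0123] $$L_{mask} = -\frac{1}{m^{2}}\sum_{i=1}^{m^{2}}\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right]$$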

[0124] where $m$ is the resolution of the mask, $y_i$ is the ground-truth mask label of the $i$-th pixel, and $p_i$ is the predicted probability of the $i$-th pixel.
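The per-pixel binary cross-entropy of the mask loss can be illustrated as below. This sketch assumes the loss is averaged over all pixels of an m×m mask; the function name `mask_bce_loss`, the clamping constant `eps`, and the 2×2 example masks are hypothetical.

```python
import math

def mask_bce_loss(true_mask, pred_mask, eps=1e-7):
    """Average binary cross-entropy over all mask pixels."""
    total, count = 0.0, 0
    for row_y, row_p in zip(true_mask, pred_mask):
        for y, p in zip(row_y, row_p):
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count

# Hypothetical 2x2 ground-truth mask and predicted probabilities.
true_mask = [[1, 0], [1, 1]]
pred_mask = [[0.9, 0.1], [0.8, 0.6]]
loss = mask_bce_loss(true_mask, pred_mask)
```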

[0125] The edge loss function $L_{edge}$ is integrated into Mask R-CNN. It evaluates the difference between the edge information of the instance segmentation mask and the ground-truth edge information, using the Huber loss as the calculation formula for the edge detection loss function, namely:

[0126]
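For illustration, the Huber loss in its standard form is quadratic for residuals up to a threshold $\delta$ and linear beyond it. The sketch below assumes that standard form with $\delta = 1$ and an average over all edge responses; the threshold value, the averaging, and the names `huber` and `mean_huber` are assumptions rather than details stated in the text.

```python
def huber(residual, delta=1.0):
    """Huber loss for one residual: quadratic when small, linear when large."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)

def mean_huber(pred_edges, true_edges, delta=1.0):
    """Average Huber loss over paired edge responses (flat sequences)."""
    pairs = list(zip(pred_edges, true_edges))
    return sum(huber(p - t, delta) for p, t in pairs) / len(pairs)

# Hypothetical edge responses of a predicted and a ground-truth mask.
pred_edges = [4.0, 4.0, 0.0, 4.0]
true_edges = [0.0, 4.0, 4.0, 4.0]
edge_loss = mean_huber(pred_edges, true_edges)
```

In this example each residual of magnitude 4 contributes 3.5, whereas a halved squared-error term would contribute 8, which is why the Huber loss is less sensitive to a few strongly misplaced edge pixels.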

[0127] The Sobel operator used for edge detection is a network with $3 \times 3 \times 2$ convolutional kernels, which contains two filters:

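[0128] $$S_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad S_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$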

[0129] where $S_x$ is the horizontal filter describing the horizontal gradient, and $S_y$ is the vertical filter describing the vertical gradient. In general, edges in an image produce a stronger response along the filtering direction, which enhances the edges of the target object in the output image.
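The effect of the two Sobel filters on a predicted and a ground-truth mask can be sketched as follows. The kernel values are the conventional Sobel coefficients; combining the two responses into a gradient magnitude, the zero padding, and the helper names `filter3x3` and `edge_map` are assumptions made for this illustration, and plain Python lists are used only to keep the example dependency-free.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient

def filter3x3(image, kernel):
    """Apply a 3x3 filter (cross-correlation, as in CNN layers), zero padded."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out

def edge_map(mask):
    """Gradient magnitude of a mask from its two Sobel responses."""
    gx = filter3x3(mask, SOBEL_X)
    gy = filter3x3(mask, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]

# Hypothetical masks: the predicted boundary sits one pixel left of the true one.
true_mask = [[0, 0, 1, 1] for _ in range(4)]
pred_mask = [[0, 1, 1, 1] for _ in range(4)]
true_edges = edge_map(true_mask)
pred_edges = edge_map(pred_mask)
edge_diff = [[p - t for p, t in zip(rp, rt)] for rp, rt in zip(pred_edges, true_edges)]
```

The nonzero entries of `edge_diff` mark where the predicted boundary departs from the true boundary; these per-pixel differences are the quantities an edge-consistency loss would penalize. The zero padding also produces responses at the image border, which cancel in the difference when both masks agree there.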

[0130] Specifically, step S3 involves using MS COCO weights and the SEM image dataset to train the improved Mask R-CNN model through transfer learning, resulting in a material microstructure instance segmentation model, including:

[0131] For example, the hardware environment for experimental training was an NVIDIA GeForce RTX 4060 GPU, an Intel i5-13400F processor, and 16 GB of memory; the software environment for model training was Python 3.18.8 with TensorFlow 2.4.0 as the backend. The training optimizer was set to Adam, the learning rate strategy was Step, the initial learning rate was 0.001, Steps_Per_Epoch was 5, the batch size was 2, and the number of training epochs was 300. The specific training procedure on the training set follows the general model training process.
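A Step learning-rate strategy lowers the rate by a fixed factor at regular epoch intervals. The sketch below uses the initial rate, epoch count, and steps per epoch stated above; the decay interval (`step_size=100`) and decay factor (`gamma=0.1`) are not given in the text and are purely hypothetical, as is the function name `step_lr`.

```python
def step_lr(epoch, initial_lr=0.001, step_size=100, gamma=0.1):
    """Step schedule: multiply the rate by gamma every step_size epochs.

    step_size and gamma are illustrative assumptions, not source values.
    """
    return initial_lr * gamma ** (epoch // step_size)

EPOCHS = 300           # from the training configuration above
STEPS_PER_EPOCH = 5    # from the training configuration above

schedule = [step_lr(epoch) for epoch in range(EPOCHS)]
total_updates = EPOCHS * STEPS_PER_EPOCH  # number of optimizer updates
```

With the stated settings the optimizer performs 300 × 5 = 1500 weight updates, each on a batch of 2 images.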

[0132] Specifically, in step S4, a trained instance segmentation model is used to identify and segment the SEM image to be analyzed, obtaining a segmentation mask and calibration rectangle for each microstructure in the image. The specific steps are as follows:

[0133] The weights from the 300th training epoch are saved. In Jupyter Notebook, the model is set to inference mode, and the addresses of the images to be detected are written into the program. The improved Mask R-CNN model performs microstructure segmentation on all SEM images at the specified addresses. As shown in Figure 6, the segmentation mask and calibration rectangle of each microstructure of the high-temperature alloy material in the SEM image are obtained, and a unique ID is assigned to each object.

[0134] Corresponding to the aforementioned embodiments of the deep learning-based microstructure segmentation method for high-temperature alloys, this invention also provides an embodiment of a microstructure segmentation device for high-temperature alloy materials based on an improved Mask R-CNN model. Referring to Figure 7, the present invention provides a high-temperature alloy material microstructure segmentation device based on an improved Mask R-CNN model, comprising a memory and one or more processors. The memory stores executable code, and when the processor executes the executable code, it implements the high-temperature alloy microstructure segmentation method based on deep learning in the above embodiments.

[0135] The embodiment of the high-temperature alloy material microstructure segmentation device based on the improved Mask R-CNN model provided by this invention can be applied to any device with data processing capabilities, such as a computer. The device embodiment can be implemented through software, hardware, or a combination of both. Taking software implementation as an example, as a logical device, it is formed by the processor of the data processing device loading the corresponding computer program instructions from non-volatile memory into memory for execution. From a hardware perspective, Figure 7 shows a hardware structure diagram of a device with data processing capabilities on which the high-temperature alloy material microstructure segmentation device based on an improved Mask R-CNN model provided by the present invention resides. In addition to the processor, memory, network interface, and non-volatile memory shown in Figure 7, the data processing device in the embodiment may also include other hardware depending on its actual function, which will not be described in detail here.

[0136] The specific implementation process of the functions and roles of each unit in the above device can be found in the implementation process of the corresponding steps in the above method, and will not be repeated here.

[0137] For the device embodiments, since they basically correspond to the method embodiments, the relevant parts can be referred to in the description of the method embodiments. The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of the present invention according to actual needs. Those skilled in the art can understand and implement this without creative effort.

[0138] This invention also provides a computer-readable storage medium storing a program thereon, which, when executed by a processor, implements the deep learning-based high-temperature alloy microstructure segmentation method described in the above embodiments.

[0139] The computer-readable storage medium can be an internal storage unit of any data processing device as described in any of the foregoing embodiments, such as a hard disk or memory. The computer-readable storage medium can also be an external storage device of any data processing device, such as a plug-in hard disk, smart media card (SMC), SD card, flash card, etc., equipped on the device. Furthermore, the computer-readable storage medium can include both internal storage units and external storage devices of any data processing device. The computer-readable storage medium is used to store the computer program and other programs and data required by the data processing device, and can also be used to temporarily store data that has been output or will be output.

[0140] The present invention also provides a computer program product, including a computer program that, when executed by a processor, implements the aforementioned method for segmenting the microstructure of high-temperature alloy materials based on an improved Mask R-CNN model.

[0141] It should be noted that in the claims and specification of this patent, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one" does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0142] Although the invention has been illustrated and described with reference to certain preferred embodiments thereof, those skilled in the art will understand that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims

1. A method for segmenting the microstructure of high-temperature alloys based on deep learning, characterized in that it includes the following steps:
S1: Collect scanning electron microscope (SEM) images of the microstructure of high-temperature alloy materials, annotate and label the microstructures in the SEM images, and construct a dataset after image preprocessing and image enhancement;
S2: Construct an improved Mask R-CNN model. The improved Mask R-CNN model uses adaptive histogram equalization to expand the dynamic range of image gray levels and enhance local contrast. It also uses the Sobel operator to fuse image edge information into the network model and introduces the Huber loss function to improve model performance. Step S2, constructing the improved Mask R-CNN model, specifically includes:
S2.1 Apply adaptive histogram equalization to the SEM images of the high-temperature alloy microstructure to expand the dynamic range of gray levels and enhance local contrast;
S2.2 Input the SEM image after adaptive histogram equalization into the shared convolutional layers of the 101-layer residual neural network ResNet101 and the feature pyramid network to extract the feature map of the material microstructure in the image;
S2.3 Use the Region Proposal Network (RPN) to generate target proposal candidate boxes of varying locations and sizes on the feature map;
S2.4 Use the ROI Align method to process the target proposal candidate boxes, associating the feature map of each candidate box with a specific ROI and cropping it into a feature vector of fixed size;
S2.5 Feed the feature vectors output by ROI Align into two heads: a classification and bounding box regression head, and a mask prediction head. The classification branch in the classification and bounding box regression head predicts, through a fully connected layer, the object category of each region proposed by the RPN in order to distinguish different types of objects in the image; the bounding box regression branch in the same head adjusts the coordinates of each region proposed by the RPN, refining its size and position to enclose the object more accurately; the mask prediction head applies a small fully convolutional network to each ROI region and generates a binary mask that outlines the precise shape of the object;
S2.6 Use the Sobel operator to fuse image edge information into the network model. Specifically, a small edge detection network is introduced after the output of the mask branch. Its inputs are the predicted mask and the ground-truth mask; both are convolved with the Sobel operator to determine the edge difference between the predicted mask and the ground-truth mask, and the edge-consistency error is then calculated through the Huber loss function;
S3: Use the constructed dataset to perform transfer learning training on the improved Mask R-CNN model to obtain a material microstructure instance segmentation model. Step S3 includes using MS COCO weights and the SEM image dataset to perform transfer learning training on the improved Mask R-CNN model to obtain the material microstructure instance segmentation model; during training, the optimizer is set to Adam and the learning rate adjustment strategy is Step;
S4: Use the trained material microstructure instance segmentation model to identify and segment the SEM image to be analyzed, obtaining the segmentation mask and calibration rectangle for each microstructure in the image.

2. The method for segmenting the microstructure of high-temperature alloys based on deep learning according to claim 1, characterized in that step S1, collecting SEM images of the microstructure of high-temperature alloy materials, annotating and labeling the microstructures in the SEM images, and constructing a dataset after image preprocessing and image enhancement, specifically includes:
S1.1 Collect microscopic images of the microstructure of high-temperature alloys from scanning electron microscopes;
S1.2 Use the VGG image annotator to define regions in the image, add text descriptions to the regions, convert the data format to JSON format, and divide the dataset;
S1.3 Enhance the image data by using methods such as contrast enhancement, image sharpening, rotation, and deformation.

3. The method for segmenting the microstructure of high-temperature alloys based on deep learning according to claim 1, characterized in that, in constructing the improved Mask R-CNN model, expanding the dynamic range of gray levels in SEM images of high-temperature alloy microstructures through adaptive histogram equalization and enhancing local contrast specifically includes:
S2.1.1 Input the scanning electron microscope image;
S2.1.2 Divide the image into several sub-regions;
S2.1.3 Calculate the histogram for each region and preset the maximum pixel frequency threshold;
S2.1.4 Allocate excess pixels from the histogram that exceed the frequency threshold to the remaining bins;
S2.1.5 Use the cumulative distribution function to scale and map the redistributed histogram;
S2.1.6 Use bilinear interpolation to stitch the generated sub-regions together;
S2.1.7 Output the enhanced scanning electron microscope image.
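As an illustration of steps S2.1.3 to S2.1.5 for a single sub-region, the sketch below clips a histogram at a frequency threshold, redistributes the excess, and builds a gray-level mapping from the cumulative distribution. It assumes the excess is spread evenly over all bins, which is one common variant; the function names, the 4-level histogram, and the threshold are hypothetical, and the bilinear stitching of S2.1.6 is omitted.

```python
def clip_histogram(hist, clip_limit):
    """Clip bins at clip_limit and spread the excess evenly over all bins."""
    excess = sum(max(h - clip_limit, 0) for h in hist)
    clipped = [min(h, clip_limit) for h in hist]
    share, remainder = divmod(excess, len(hist))
    # Every bin gets an equal share; the first `remainder` bins get one extra.
    return [h + share + (1 if i < remainder else 0) for i, h in enumerate(clipped)]

def cdf_mapping(hist, levels):
    """Map each gray level to a new level via the scaled cumulative histogram."""
    total = sum(hist)
    mapping, running = [], 0
    for count in hist:
        running += count
        mapping.append(round((levels - 1) * running / total))
    return mapping

# Hypothetical sub-region with 4 gray levels and 16 pixels.
hist = [10, 2, 1, 3]
redistributed = clip_histogram(hist, clip_limit=6)
mapping = cdf_mapping(redistributed, levels=4)
```

Clipping limits how strongly a dominant gray level can stretch the mapping, which is what keeps noise in nearly uniform regions from being over-amplified. After even redistribution a bin may end up slightly above the threshold; implementations differ in whether they iterate to remove that overshoot.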

4. The method for segmenting the microstructure of high-temperature alloys based on deep learning according to claim 1, characterized in that using the Region Proposal Network (RPN) to generate target proposal candidate boxes of varying locations and sizes on the feature map specifically includes:
S2.2.1 Input the feature map of the material microstructure obtained from the shared convolutional layers of the 101-layer residual neural network ResNet101 and the feature pyramid network;
S2.2.2 Generate an anchor point for each element of the feature map. Using the anchor point as the center, generate anchor boxes of different sizes and shapes in a sliding-window manner. The anchor boxes are controlled by two parameters: the pixel size of the longest side of the box and the aspect ratio;
S2.2.3 Generate 9 anchor boxes of different sizes and aspect ratios at each anchor point of the feature map;
S2.2.4 Use the RPN to obtain the probability that each predicted anchor is background or foreground, and refine the anchor.
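The anchor generation in S2.2.2 and S2.2.3 can be sketched as below, with each box controlled by the length of its longest side and its aspect ratio. Obtaining 9 boxes from 3 sizes × 3 ratios is an assumption, as are the specific values (32, 64, 128 pixels; ratios 0.5, 1, 2) and the function name `make_anchors`.

```python
def make_anchors(cx, cy, sizes=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return anchor boxes (x1, y1, x2, y2) centered on one anchor point.

    size  = length of the longest side in pixels (hypothetical values)
    ratio = width / height (hypothetical values)
    """
    anchors = []
    for size in sizes:
        for ratio in ratios:
            if ratio >= 1.0:
                w, h = size, size / ratio   # wide box: width is the longest side
            else:
                w, h = size * ratio, size   # tall box: height is the longest side
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

# Anchors for a hypothetical anchor point at pixel (100, 100).
anchors = make_anchors(100, 100)
```

Sliding this over every element of the feature map yields the full anchor set that the RPN then scores as foreground or background and refines.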

5. The method for segmenting the microstructure of high-temperature alloys based on deep learning according to claim 1, characterized in that processing the target proposal candidate boxes using the ROI Align method, associating the feature map of each candidate box with a specific ROI and cropping it into a feature vector of fixed size, specifically includes:
S2.3.1 Traverse each candidate region, keeping the floating-point boundaries unquantized;
S2.3.2 Divide the candidate region into k*k units, and do not quantize the boundary of each unit;
S2.3.3 Calculate four fixed coordinate positions in each cell, use bilinear interpolation to calculate the values at these four positions, and then perform a max pooling operation;
S2.3.4 Output a fixed-size feature map that preserves the edge feature information of the image.
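Steps S2.3.1 to S2.3.4 can be illustrated on a single-channel feature map as follows. This is a sketch under stated assumptions: the four sample points are placed at the 1/4 and 3/4 positions of each cell, coordinates index feature-map elements directly, and the names `bilinear` and `roi_align` as well as the example values are hypothetical.

```python
import math

def bilinear(feature, y, x):
    """Bilinearly interpolate a 2D feature map at a floating-point position."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, roi, k=2):
    """Pool roi = (y_start, x_start, y_end, x_end) into a k x k output.

    Boundaries stay floating point; nothing is rounded to the grid.
    """
    y_start, x_start, y_end, x_end = roi
    cell_h = (y_end - y_start) / k
    cell_w = (x_end - x_start) / k
    out = []
    for i in range(k):
        row = []
        for j in range(k):
            samples = []
            for fy in (0.25, 0.75):          # four fixed sample points per cell
                for fx in (0.25, 0.75):
                    y = y_start + (i + fy) * cell_h
                    x = x_start + (j + fx) * cell_w
                    samples.append(bilinear(feature, y, x))
            row.append(max(samples))         # max pooling over the samples
        out.append(row)
    return out

# Hypothetical 4x4 feature map whose value is 10*row + column.
feature = [[10 * r + c for c in range(4)] for r in range(4)]
pooled = roi_align(feature, roi=(0.5, 0.5, 2.5, 2.5), k=2)
```

Because neither the region boundary nor the cell boundaries are rounded, the pooled values follow sub-pixel shifts of the region, which is the property that preserves edge detail for the mask branch.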

6. The method for segmenting the microstructure of high-temperature alloys based on deep learning according to claim 1, characterized in that, The specific method of fusing image edge information into the network model using the Sobel operator includes: the Sobel operator for edge detection is a network with 3*3*2 convolutional kernels, containing horizontal filters describing horizontal gradients and vertical filters describing vertical gradients.

7. The method for segmenting the microstructure of high-temperature alloys based on deep learning according to claim 1, characterized in that calculating the edge-consistency error using the Huber loss function specifically includes: edge detection is integrated into the network as a branch, and an edge loss function $L_{edge}$ is added to the loss function. The improved Mask R-CNN loss function is then: $L = L_{cls} + L_{box} + L_{mask} + L_{edge}$;
where $L_{cls}$ is the classification loss, which uses the cross-entropy loss function, calculated as follows: $$L_{cls} = -\frac{1}{N}\sum_{i=1}^{N} y_i \log(p_i)$$ where $N$ is the total number of samples, $y_i$ is the true class label of the $i$-th sample, and $p_i$ is the predicted probability of the $i$-th sample;
$L_{box}$ is the bounding box regression loss, which uses the smooth L1 loss, calculated as follows: $$L_{box} = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L1}(t_i - t_i^{*})$$ where $t_i$ denotes the parameters of the predicted bounding box, including the center coordinates $x$, $y$, the width $w$, and the height $h$, and $t_i^{*}$ denotes the parameters of the ground-truth bounding box; the smooth L1 loss is defined as: $$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^{2}, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$
$L_{mask}$ is the mask loss, which applies the binary cross-entropy loss to each pixel to measure the difference between the ground-truth mask and the predicted mask, calculated as follows: $$L_{mask} = -\frac{1}{m^{2}}\sum_{i=1}^{m^{2}}\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right]$$ where $m$ is the resolution of the mask, $y_i$ is the ground-truth mask label of the $i$-th pixel, and $p_i$ is the predicted probability of the $i$-th pixel;
$L_{edge}$ is the edge loss function, which evaluates the difference between the edge information of the instance segmentation mask and the ground-truth edge information, using the Huber loss as the calculation formula for the edge detection loss function.

8. A high-temperature alloy microstructure segmentation device based on deep learning, comprising a memory and one or more processors, wherein the memory stores executable code, characterized in that, when the processor executes the executable code, it implements the high-temperature alloy microstructure segmentation method based on deep learning as described in any one of claims 1-7.