A device for classifying histopathological images
By constructing an end-to-end pathological image classification network that combines feature extraction, graph neural networks, and a simplified Transformer, the problem of insufficient capture of tumor structural heterogeneity information is solved, and the accuracy and robustness of histopathological image classification are improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANHUA UNIV
- Filing Date
- 2022-12-02
- Publication Date
- 2026-06-26
Figure CN115953616B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to a histopathological image classification device.
Background Technology
[0002] Traditional pathological image classification methods rely on classic early computer vision techniques, such as handcrafted feature extraction, gray-level co-occurrence matrices, and local binary pattern operators. These methods depend heavily on the quality of pathologist annotations, and overworked pathologists are prone to misdiagnosis, so classification accuracy falls below ideal levels, which limits the widespread adoption of these classic methods. Breakthroughs in the digitization of pathological images and in deep learning, however, have made computer-aided cancer diagnosis and prognosis prediction possible. Research has also confirmed that applying deep learning methods to pathological image classification offers significant performance improvements over traditional methods, attracting increasing attention to this medical task. These deep learning methods are generally divided into two categories: the first is pathological image classification networks based on convolutional neural networks (CNNs); the second is pathological image classification networks based on Transformers. CNN-based networks typically employ an end-to-end learning approach, using CNNs to learn feature information from the underlying data of pathological images, such as the texture, geometric, and morphological features that distinguish benign from malignant lesions. Transformer-based networks use a self-attention mechanism to acquire global feature information of the entire image, thereby improving image classification performance.
[0003] Although these methods have achieved good performance in histopathological image classification, their ability to capture unstructured spatial information about tumor structural heterogeneity is not flexible enough, leaving room for further improvement in classification results.
[0004] Therefore, improving the classification effect of histopathological images is a problem that urgently needs to be solved by those skilled in the art.
Summary of the Invention
[0005] The purpose of this application is to provide a histopathological image classification device to improve the classification effect of histopathological images.
[0006] To address the aforementioned technical problems, this application provides a histopathological image classification device, comprising:
[0007] The acquisition module is used to acquire the pathological images to be tested.
[0008] The calling module is used to call a pre-trained pathology image classification network;
[0009] An input module is used to input the pathological image to be tested into the pathological image classification network for classification; wherein, the construction and training steps of the pathological image classification network include:
[0010] Obtain a dataset sample of pathological images, the dataset sample including a set of pathological images and a set of classification labels corresponding to the set of pathological images;
[0011] The pathological image set in the dataset sample is processed using data augmentation and data normalization methods.
[0012] An end-to-end pathological image classification network is constructed, comprising three parts: a first part is a feature extraction network for extracting pathological features of the image, including frequency domain information and spatial information; a second part is a graph neural network for constructing a graph representation of the pathological features and performing information interaction; and a third part is a simplified Transformer network for learning important nodes in the graph representation and performing final prediction.
[0013] The pathological image classification network is trained using the processed dataset samples.
[0014] Preferably, the feature extraction network consists of three layers of convolutional block operations connected in parallel with a discrete wavelet decomposition operation, and the feature extraction network is also used to merge the extracted pathological features as the input of the graph neural network.
[0015] Preferably, the graph neural network constructs a graph representation of the pathological features and performs information interaction through a two-layer graph neural network.
[0016] Preferably, constructing a graph representation of the pathological features using a graph neural network includes:
[0017] The feature dimension of the output of the feature extraction network is set to H×W×C, where H represents the height of the pathological feature, W represents the width of the pathological feature, and C represents the number of channels of the pathological feature. The pathological feature is divided into N feature blocks, and each feature block is transformed into a feature vector to obtain the feature V of a node. For the node, the K nearest neighbors of the node are calculated, and an edge E from the node to the nearest neighbor is added. Finally, an undirected graph representation G=(V,E) is obtained.
[0018] Preferably, the simplified Transformer network uses two linear layers as an attention mechanism to learn important nodes in the graph representation, and performs final prediction through sequence pooling and linear layers.
[0019] Preferably, the feature matrix is obtained by learning important nodes in the graph representation using a simplified Transformer network, including:
[0020] The feature dimension of the output of the graph neural network is set to h×w×c, where h represents the height of the pathological feature, w represents the width of the pathological feature, and c represents the number of channels of the pathological feature; the pathological feature is divided into N feature blocks, and the dimension of each feature block is M=p×p×c, where p represents the side length of each feature block, and the number of feature blocks is N=(h×w) / (p×p);
[0021] The feature block sequence is rearranged into an N×M matrix, and each row of the matrix is linearly mapped to a specified dimension D; a position information symbol is added to the matrix, and the matrix dimension is (N+1)×M to obtain the input item;
[0022] The input terms are fed into a simplified self-attention mechanism to obtain the feature matrix.
[0023] Preferably, classifying the feature matrix using a sequence pooling layer and a fully connected layer to obtain the classification result includes:
[0024] The extracted pathological features are assigned specified weights using a sequence pooling layer, and the feature matrix is linearly classified using a fully connected layer to obtain the classification result.
[0025] Preferably, training the pathological image classification network using the processed dataset samples includes:
[0026] The loss value is calculated based on the classification results and the classification label set, and the pathological image classification network is adjusted until the loss value is minimized.
[0027] Preferably, calculating the loss value based on the classification result and the classification label set includes:
[0028] The classification result and the classification label set are substituted into the cross-entropy function to calculate the loss value.
[0029] Preferably, the data augmentation method involves randomly expanding, cropping, flipping, distorting contrast, and distorting brightness of the pathological image set in the dataset sample;
[0030] The data standardization method employs linear normalization to ensure that the pathological image set sample data in the dataset falls within the [0,1] interval.
[0031] The histopathological image classification device provided in this application first acquires the pathological image to be tested by the acquisition module, then calls the pre-trained pathological image classification network by the calling module, and the input module inputs the pathological image to be tested into the pathological image classification network for classification. The construction and training steps of the pathological image classification network include: acquiring a dataset of pathological images, which includes a set of pathological images and a set of classification labels corresponding to the pathological images; using data augmentation and data standardization methods to process and optimize the set of pathological images in the dataset; and constructing an end-to-end pathological image classification network. The pathological image classification network consists of three parts: the first part is a feature extraction network used to extract pathological features from the image, including frequency domain information and spatial information; the second part is a graph neural network used to construct a graph representation of the pathological features and perform information interaction; and the third part is a simplified Transformer network used to learn important nodes in the graph representation and perform final prediction. The pathological image classification network is trained using the optimized dataset. This application proposes a novel histopathological image classification network that uses a combination of discrete wavelet decomposition and convolutional neural networks to extract frequency domain and spatial features of the image, which can effectively compensate for some missing information in the spatial domain. 
Furthermore, by constructing a graph representation of features using graph neural networks and a simplified self-attention mechanism, and then performing information interaction and learning, unstructured spatial information about tumor structural heterogeneity can be effectively extracted. This method can effectively address some challenges in histopathological images and improve the accuracy and robustness of histopathological image classification. It can stably output classification results and improve the classification performance of histopathological images.
Attached Figure Description
[0032] To more clearly illustrate the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0033] Figure 1 is a structural diagram of a histopathological image classification device provided in an embodiment of this application;
[0034] Figure 2 is a flowchart illustrating the construction and training of a pathological image classification network provided in an embodiment of this application;
[0035] Figure 3 is a schematic diagram of the structure of the histopathological image classification network provided in an embodiment of this application;
[0036] Figure 4 is a schematic diagram illustrating the generation of feature map representations provided in an embodiment of this application.
Detailed Implementation
[0037] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the protection scope of this application.
[0038] The core of this application is to provide a histopathological image classification device to improve the classification effect of histopathological images.
[0039] To enable those skilled in the art to better understand the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0040] Cancer pathology image classification is a fundamental task in the pathology workflow. In clinical practice, it is typically performed manually by pathologists on H&E-stained tissue sections, and its accurate classification directly impacts patient treatment choices and prognosis. Pathology image classification relies heavily on cell shape and arrangement. The number of abnormal cell nuclei is a crucial criterion for pathologists to determine whether a patient has cancer. Clinically, pathologists utilize a wealth of prior information to classify pathology images. Pathology images contain not only extensive morphological cellular information but also global contextual information about the tissue structure. These factors are subjective and time-consuming to examine manually by pathologists. With the rapid development of high-throughput technology and digital pathology, manual evaluation has become a bottleneck. Since the goal of computer-based classification diagnostic systems is to achieve automated interpretation of pathology, an accurate and efficient pathology image classification method can alleviate the workload of physicians and help improve diagnostic efficiency and accuracy.
[0041] This application provides a histopathological image classification device. Figure 1 is a structural diagram of the histopathological image classification device provided in an embodiment of this application. As shown in Figure 1, the device includes the following modules:
[0042] The acquisition module 10 is used to acquire the pathological images to be tested.
[0043] The calling module 11 is used to call the pre-trained pathological image classification network.
[0044] Input module 12 is used to input the pathological image to be tested into the pathological image classification network for classification.
[0045] Figure 2 is a flowchart illustrating the construction and training of the pathological image classification network provided in an embodiment of this application. As shown in Figure 2, the construction and training steps of the pathological image classification network include:
[0046] S20: Obtain a dataset sample of pathological images.
[0047] The dataset samples include a set of pathological images and the corresponding classification labels. In practical applications, the histopathological image samples can be sourced from the LDCH dataset collected by the Landing Artificial Intelligence Center, which contains 286 cervical histopathological images at a magnification of 40x, with local annotations marking several lesion areas. These areas can be divided into 9 categories; the numbers of local regions in the categories are 1048, 1669, 302, 4494, 125, 8248, 322, 69, and 5913, respectively. This application can divide the histopathological image samples, using 60% for training, 30% for validation, and 10% for testing.
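The 60/30/10 division described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say whether the split is made per image or per local region, so the sketch splits local regions, and the function name `split_indices` and the fixed shuffle seed are hypothetical.

```python
import random

def split_indices(n_samples, percents=(60, 30, 10), seed=0):
    """Shuffle sample indices and cut them into train/val/test lists.

    Integer arithmetic is used so the split sizes are exact.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice
    n_train = n_samples * percents[0] // 100
    n_val = n_samples * percents[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test split
    return train, val, test

# Per-category local region counts quoted in the text.
region_counts = [1048, 1669, 302, 4494, 125, 8248, 322, 69, 5913]
total_regions = sum(region_counts)
train_idx, val_idx, test_idx = split_indices(total_regions)
```

The category counts are strongly imbalanced (69 regions in the smallest category against 8248 in the largest), so a stratified split may be preferable in practice; the source does not state which was used.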
[0048] S21: Use data augmentation and data normalization methods to process the pathological image set in the dataset sample.
[0049] Specifically, data augmentation methods involve randomly expanding, cropping, flipping, and distorting the contrast and brightness of the histopathological images in the dataset samples; data normalization methods employ linear normalization to bring the histopathological image sample data in the dataset samples to the range [0,1], which can accelerate the training process of the network.
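As a minimal sketch of this preprocessing step, the code below applies a linear rescaling into [0,1] and one of the listed augmentations (random flipping). It assumes min-max scaling as the form of linear normalization and a flip probability of 0.5; neither detail is given in the text, and the function names are hypothetical.

```python
import random

def min_max_normalize(image):
    """Linearly rescale a 2-D list of pixel values into [0, 1]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]

def random_horizontal_flip(image, rng, prob=0.5):
    """Mirror every row with probability `prob` (assumed value)."""
    if rng.random() < prob:
        return [row[::-1] for row in image]
    return [row[:] for row in image]

image = [[0, 64, 128], [192, 255, 32]]
normalized = min_max_normalize(image)
augmented = random_horizontal_flip(normalized, random.Random(0))
```

Flipping only reorders pixels, so the augmented image stays inside the [0,1] interval regardless of whether the flip is applied.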
[0050] S22: Construct an end-to-end pathological image classification network.
[0051] The pathological image classification network consists of three parts: the first part is a feature extraction network, used to extract pathological features from the image, including frequency domain information and spatial information; the second part is a graph neural network, used to construct a graph representation of the pathological features and perform information exchange; and the third part is a simplified Transformer network, used to learn important nodes in the graph representation and perform the final prediction. Specifically, the three networks are connected in series, with the output of the previous network serving as the input of the next network.
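The serial connection of the three parts can be pictured as a simple function chain in which each stage consumes what the stage before it produced. The sketch below is illustrative only: the three stage functions are hypothetical stand-ins that merely record the order of execution, not the networks themselves.

```python
def compose(*stages):
    """Chain stages so each one receives the previous stage's output."""
    def pipeline(x):
        for stage in stages:
            x = stage(x)
        return x
    return pipeline

# Hypothetical stand-ins that only record the order in which they run.
def feature_extraction(x):
    return x + ["features"]

def graph_network(x):
    return x + ["graph"]

def simplified_transformer(x):
    return x + ["prediction"]

network = compose(feature_extraction, graph_network, simplified_transformer)
trace = network(["image"])
```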
[0052] S23: Train a pathological image classification network using the processed dataset samples.
[0053] Finally, the pathological image classification network is trained using the processed dataset samples. After training is complete, the pathological images to be tested are input into the pathological image classification network for classification.
[0054] In the above-mentioned histopathological image classification device provided in the embodiments of this application, an end-to-end pathological image classification network is constructed. The frequency domain and spatial information of the pathological image are extracted through the feature extraction module, and the graph neural network and simplified self-attention mechanism are used to construct the graph representation of the features and perform information interaction and learning. This effectively supplements the missing information in the spatial domain. The flexible graph structure helps to extract the unstructured spatial information of tumor structural heterogeneity. This method can better cope with some challenges in histopathological images and effectively improve the accuracy and robustness of histopathological image classification, and can stably output classification results.
[0055] Figure 3 is a schematic diagram of the structure of the histopathological image classification network provided in an embodiment of this application. As shown in Figure 3, the histopathological image classification network is mainly divided into the following three parts:
[0056] The first part is the feature extraction network: the feature extraction module consists of a frequency domain feature extraction module and a spatial feature extraction module connected in parallel; the frequency domain feature extraction module is composed of two-stage wavelet decomposition connected in series; the spatial feature extraction module is composed of three traditional convolutional blocks connected in series; it is used to extract and continuously learn the frequency domain and spatial features of the target in the input image.
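To make the two-stage wavelet decomposition concrete, the sketch below applies a 2-D Haar transform twice, feeding the low-frequency band of the first stage into the second. The choice of the Haar wavelet is an assumption (the text does not name the wavelet kernel), and the band names `low`, `row_diff`, `col_diff`, and `diag` are hypothetical labels.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2-D Haar transform.

    `image` is a 2-D list with even height and width. Returns four
    sub-bands (low, row_diff, col_diff, diag), each half the input size.
    """
    low, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, len(image), 2):
        bands = ([], [], [], [])
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            bands[0].append((a + b + d + e) / 2)  # local average
            bands[1].append((a + b - d - e) / 2)  # top minus bottom
            bands[2].append((a - b + d - e) / 2)  # left minus right
            bands[3].append((a - b - d + e) / 2)  # diagonal difference
        low.append(bands[0])
        row_diff.append(bands[1])
        col_diff.append(bands[2])
        diag.append(bands[3])
    return low, row_diff, col_diff, diag

def energy(band):
    """Sum of squared coefficients in a 2-D list."""
    return sum(v * v for row in band for v in row)

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
level1 = haar_dwt2(image)      # first stage: four 2x2 bands
level2 = haar_dwt2(level1[0])  # second stage on the low band: four 1x1 bands
```

Because the transform is orthonormal, the total energy of the four sub-bands equals the energy of the input, which is a convenient sanity check on any implementation.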
[0057] The second part is the graph neural network: it consists of a two-layer graph neural network, used to construct the graph representation of the features and perform information interaction. It is assumed that the feature dimension output by the feature extraction module is H×W×C, where H represents the height of the feature, W represents the width of the feature, and C represents the number of channels of the feature; the features are divided into N feature blocks, and a node feature V is obtained by converting each feature block into a feature vector; for a node, its K nearest neighboring nodes are calculated, and an edge E from this node to each neighboring node is added; finally, an undirected graph representation G=(V,E) is obtained.
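The graph construction can be sketched as a K-nearest-neighbor search over node feature vectors. The sketch assumes Euclidean distance and treats each spatial position of the H×W×C feature map as one feature block; the text specifies neither, and the function names are hypothetical.

```python
import math

def feature_map_to_nodes(feature_map):
    """Flatten an H x W x C nested list into H*W node vectors of length C."""
    return [pixel for row in feature_map for pixel in row]

def build_knn_graph(node_features, k):
    """Connect every node to its k nearest neighbours.

    Edges are stored as (i, j) pairs with i < j, so the graph is undirected.
    """
    edges = set()
    for i, feat_i in enumerate(node_features):
        dists = [(math.dist(feat_i, feat_j), j)
                 for j, feat_j in enumerate(node_features) if j != i]
        for _, j in sorted(dists)[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

# A 2 x 2 feature map with C = 2 channels: two tight pairs of nodes.
feature_map = [[[0.0, 0.0], [0.0, 1.0]],
               [[5.0, 5.0], [5.0, 6.0]]]
V = feature_map_to_nodes(feature_map)
E = build_knn_graph(V, k=1)
```

With K = 1, each node links only to its closest partner, so the example yields two disconnected pairs; larger K values connect the graph more densely.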
[0058] The third part is a simplified Transformer network. A simplified self-attention mechanism is used to learn important nodes in the features, and final classification is performed through sequence pooling and linear layers. The simplified self-attention mechanism consists of two linear layers. Assume the feature dimension of the graph neural network output is h×w×c, where h represents the feature height, w represents the feature width, and c represents the number of feature channels. The features are divided into N feature blocks, each with a dimension of M = p×p×c, where p represents the side length of each feature block, and N is the number of blocks: N = (h×w) / (p×p). The feature block sequence is rearranged into an N×M matrix, and each row of the matrix is linearly mapped to a specified dimension D. A positional information symbol is added to the matrix, resulting in a matrix dimension of (N+1)×M, which is the input term. The input term is then fed into the simplified self-attention mechanism to obtain the feature matrix.
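The block arithmetic N = (h×w) / (p×p) and M = p×p×c can be checked with a short sketch. The sizes h, w, c, and p below are arbitrary example values, the function name is hypothetical, and the extra row standing in for the positional information symbol is a zero-filled placeholder chosen only to show the N+1 row count; the linear mapping to dimension D is omitted.

```python
def split_into_blocks(feature_map, p):
    """Cut an h x w x c nested list into non-overlapping p x p blocks and
    flatten each block into a vector of length M = p * p * c."""
    h, w = len(feature_map), len(feature_map[0])
    if h % p or w % p:
        raise ValueError("h and w must be divisible by p")
    blocks = []
    for top in range(0, h, p):
        for left in range(0, w, p):
            vec = []
            for r in range(top, top + p):
                for col in range(left, left + p):
                    vec.extend(feature_map[r][col])
            blocks.append(vec)
    return blocks

h, w, c, p = 4, 6, 3, 2  # example sizes, not taken from the source
feature_map = [[[r * 100 + col * 10 + ch for ch in range(c)]
                for col in range(w)] for r in range(h)]
blocks = split_into_blocks(feature_map, p)
N, M = len(blocks), len(blocks[0])

# Placeholder row for the positional information symbol (assumption).
tokens = [[0.0] * M] + blocks
```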
[0059] The steps for classifying the feature matrix using sequence pooling and fully connected layers to obtain classification results include: calculating a specific score for each class corresponding to each classification result based on the feature matrix; calculating a specific weight for each class corresponding to each classification result based on the specific score; and calculating the logical output corresponding to each classification result based on the specific weight, which serves as the predicted probability for each classification result in the pathological image set.
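The score, weight, and logit steps described above can be sketched as follows. The sketch assumes that the score of each row of the feature matrix comes from a single linear projection and that the weights are a softmax over those scores; the parameter values are made-up examples rather than trained weights.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def sequence_pool(tokens, score_weights):
    """Score each token, turn scores into weights, return the weighted sum."""
    scores = [sum(t * w for t, w in zip(tok, score_weights)) for tok in tokens]
    weights = softmax(scores)
    dim = len(tokens[0])
    pooled = [sum(weights[n] * tokens[n][d] for n in range(len(tokens)))
              for d in range(dim)]
    return pooled, weights

def linear(vec, weight_rows, bias):
    """Fully connected layer: one output (logit) per weight row."""
    return [sum(v * w for v, w in zip(vec, row)) + b
            for row, b in zip(weight_rows, bias)]

feature_matrix = [[1.0, 0.0], [0.0, 1.0]]  # two tokens, two features
pooled, weights = sequence_pool(feature_matrix, score_weights=[0.0, 0.0])
logits = linear(pooled, weight_rows=[[1.0, 1.0], [1.0, -1.0]], bias=[0.0, 0.0])
```

With zero score weights every token receives the same weight, so the pooled vector reduces to the plain average of the tokens; trained score weights would emphasize the more informative tokens instead.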
[0060] In a specific implementation, training the histopathological image classification network with the optimized histopathological images may include: feeding the optimized histopathological images into the network in batches; calculating the loss by comparing the classification prediction of the last layer of the network with the true labels of the pathological images; backpropagating the calculated loss through the network to obtain the gradients of the network parameters; and adjusting the network parameters with a gradient descent optimizer to minimize the loss and optimize the network. The label smoothing cross-entropy loss is calculated as follows. First, calculate the softmax of the output of the fully connected layer, which is regarded as the confidence probability of each category:
[0061] $q_i = \frac{\exp(z_i)}{\sum_j \exp(z_j)}$
[0062] Then use cross-entropy to calculate the loss:
[0063] $L = -\sum_i p_i \log q_i$
[0064] Here, $z_i$ is the output of the fully connected layer for class $i$, and $p_i$ and $q_i$ represent the true class and the predicted class, respectively.
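A minimal sketch of the loss computation follows: softmax over the fully connected layer output, then cross-entropy against a smoothed target. The smoothing rule used here (a share of `smoothing / num_classes` for every class, with the rest placed on the true class) is one common formulation and is an assumption, as is the default value 0.1; the text does not give either.

```python
import math

def softmax(logits):
    """Confidence probability of each class from the raw outputs z."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_smoothing_cross_entropy(logits, true_class, smoothing=0.1):
    """Cross-entropy between a smoothed target p and the softmax output q."""
    num_classes = len(logits)
    q = softmax(logits)
    # Smoothed target: uniform share for every class, remainder on the truth.
    p = [smoothing / num_classes] * num_classes
    p[true_class] += 1.0 - smoothing
    return -sum(p_i * math.log(q_i) for p_i, q_i in zip(p, q))

uniform_loss = label_smoothing_cross_entropy([0.0, 0.0, 0.0], true_class=1)
confident_loss = label_smoothing_cross_entropy([5.0, 0.0, 0.0], true_class=0)
wrong_loss = label_smoothing_cross_entropy([5.0, 0.0, 0.0], true_class=1)
```

For uniform outputs over K classes the loss equals log K whatever the smoothing value, which gives a quick check; a confident correct prediction scores lower than that, and a confident wrong one scores higher.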
[0065] It should be noted that this application can also set a dynamic learning rate over a specified number of training steps to adjust the model: when the network's evaluation metrics no longer improve, the learning rate of the network is reduced to improve network performance. In addition, over 100 iterations, the model parameters are saved at the point where the validation loss reaches its minimum.
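The learning rate reduction and best-parameter saving can be sketched by replaying a sequence of validation losses. The initial learning rate, reduction factor, and patience below are assumed values, the loss sequence is made up for illustration, and the point where parameters would be written to disk is marked only by a comment.

```python
def run_schedule(val_losses, lr=0.01, factor=0.5, patience=2):
    """Replay validation losses, reducing lr on a plateau and tracking
    the iteration at which the parameters would be saved."""
    best_loss = float("inf")
    best_epoch = None
    stale = 0  # consecutive iterations without improvement
    lr_history = []
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New minimum: the model parameters would be saved here.
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale > patience:
                lr *= factor  # the metric stopped improving: reduce lr
                stale = 0
        lr_history.append(lr)
    return best_epoch, best_loss, lr_history

# Made-up validation losses for six iterations.
val_losses = [1.0, 0.8, 0.9, 0.85, 0.81, 0.7]
best_epoch, best_loss, lr_history = run_schedule(val_losses)
```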
[0066] In a specific implementation, in the histopathological image classification device provided in the embodiments of this application, after training is completed, the input module 12 inputs the histopathological image to be tested into the trained histopathological image classification network for classification.
[0067] In practical applications, the histopathological image classification device provided in this application embodiment can be implemented under the PyTorch deep learning framework, with the following computer configuration: Intel Core i5 6600K processor, 16 GB of memory, NVIDIA V100 graphics card, and Linux operating system.
[0068] The performance of the histopathological image classification network in this application is evaluated below. In recent years, convolutional neural networks and Transformers have been widely used in the field of histopathological image classification, and these architectures have achieved very good performance on different categories of histopathological images. Alexnet, Resnet, Mobilenet, and Efficientnet are classic convolutional neural networks; Vit, Localvit, Cmt, and Swin are classic Transformer networks; GFnet, DCET, and CCT are networks that combine convolutional neural networks with Transformers to target the features of histopathological images; Mlp and Resmlp are classic MLP networks; Ours is the method described in this application.
[0069] Table 1: Comparison of classification performance between the proposed method and existing methods
[0070]
[0071] As shown in Table 1, the performance of the method in this application is compared with the aforementioned algorithms, with accuracy being the evaluation metric. Table 1 clearly shows that the method in this application achieves the best performance on this metric compared to the previous methods.
[0072] Figure 4 is a schematic diagram illustrating the generation of feature map representations provided in this application embodiment. The histopathological image classification device provided in this application embodiment is a Graph-Transformer network based on discrete wavelet transform. It extracts the frequency domain and spatial information of the pathological image through a feature extraction module, and uses a graph neural network and a simplified self-attention mechanism to construct a graph representation of the features and perform information interaction and learning. This effectively supplements the missing information in the spatial domain. The flexible graph structure helps to extract unstructured spatial information of tumor structural heterogeneity. This method can better cope with some challenges in histopathological images and effectively improve the accuracy and robustness of histopathological image classification, and can stably output classification results.
[0073] The various embodiments in this specification are described in a progressive manner. Each embodiment focuses on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.
[0074] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0075] The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein can be implemented directly by hardware, a software module executed by a processor, or a combination of both. The software module can be located in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.
[0076] The histopathological image classification device provided in this application is a Graph-Transformer network based on discrete wavelet transform. This network integrates graph neural networks and Transformers to generate an efficient classifier. It utilizes the graphical representation of pathological images and the computational efficiency of the Transformer structure to perform patch-level analysis. This mechanism helps to obtain unstructured spatial information about tumor structural heterogeneity. An enhanced frequency module is proposed, which uses a pre-determined wavelet kernel from discrete wavelet transform to extract low-dimensional data representations. Therefore, the method in this application can better address some challenges in histopathological images and effectively improve the accuracy and robustness of histopathological image classification, and can stably output classification results.
[0077] The foregoing has provided a detailed description of the histopathological image classification device provided in the embodiments of this application. Specific examples have been used to illustrate the principles and implementation methods of this application. The description of the above embodiments is only for the purpose of helping to understand the method and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.
[0078] The histopathological image classification device provided in this application embodiment first acquires the pathological image to be tested by the acquisition module, then calls the pre-trained pathological image classification network by the calling module, and finally the input module inputs the pathological image to be tested into the pathological image classification network for classification. The construction and training steps of the pathological image classification network include: acquiring a dataset of pathological images, which includes a set of pathological images and a set of classification labels corresponding to the pathological images; optimizing the set of pathological images in the dataset using data augmentation and data standardization methods; and constructing an end-to-end pathological image classification network. The pathological image classification network includes three parts: the first part is a feature extraction network used to extract pathological features of the image, which include frequency domain information and spatial information; the second part is a graph neural network used to construct a graph representation of the pathological features and perform information interaction; and the third part is a simplified Transformer network used to learn important nodes in the graph representation and perform final prediction. The pathological image classification network is trained using the optimized dataset. This application embodiment proposes a new histopathological image classification network that uses a combination of discrete wavelet decomposition and convolutional neural networks to extract frequency domain and spatial features of the image, which can effectively compensate for some missing information in the spatial domain. 
Furthermore, by constructing a graph representation of features using graph neural networks and a simplified self-attention mechanism, and then performing information interaction and learning, unstructured spatial information about tumor structural heterogeneity can be effectively extracted. This method can effectively address some challenges in histopathological images and improve the accuracy and robustness of histopathological image classification. It can stably output classification results and improve the classification performance of histopathological images.
[0079] Specifically, the feature extraction network can consist of three layers of convolutional blocks connected in parallel with discrete wavelet decomposition operations. The feature extraction network is also used to merge the extracted pathological features as input to the graph neural network. The graph neural network constructs a graph representation of the pathological features and performs information exchange through a two-layer graph neural network. Further, information exchange is achieved by aggregating and updating nodes.
[0080] The graph update function is:
[0081] $G' = F(G, W) = \mathrm{Update}(\mathrm{Aggregate}(G, W_{agg}), W_{update})$
[0082] Here G represents the input graph data and W represents the weights for updating the graph data. `Update()` represents the graph update function and `Aggregate()` represents the graph aggregation function, where $W_{agg}$ and $W_{update}$ are the weights of the aggregation and update operations, respectively.
[0083] The node update function is:
[0084] $x_i' = h(x_i, g(x_i, N(x_i), W_{agg}), W_{update})$
[0085] Here $x_i$ is a graph node, $x_i'$ is the updated node, $N(x_i)$ is the set of k nearest neighbor nodes of $x_i$, h() is the node update function, g() is the node aggregation function, and $W_{agg}$ and $W_{update}$ are the weights of the aggregation and update operations, respectively.
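For illustration, one aggregation and update step over all nodes may be sketched as below. The concrete choices (averaging the neighbor features, concatenating the neighbor offset with the node, a ReLU, and a residual connection) and the names `aggregate` and `update_nodes` are assumptions of this sketch; this application only specifies the general form of g() and h().

```python
import numpy as np

def aggregate(x, neighbors, w_agg):
    """g(): combine each node with the mean offset of its neighbors, then project.

    x: (N, C) node features; neighbors: (N, k) neighbor indices; w_agg: (2C, C).
    """
    neighbor_mean = x[neighbors].mean(axis=1)                  # (N, C)
    combined = np.concatenate([x, neighbor_mean - x], axis=1)  # (N, 2C)
    return combined @ w_agg                                    # (N, C)

def update_nodes(x, neighbors, w_agg, w_update):
    """h(): residual update of every node from its aggregated message."""
    message = aggregate(x, neighbors, w_agg)
    return x + np.maximum(message @ w_update, 0.0)             # ReLU, then residual
```

Calling `update_nodes` twice with separate weight pairs would correspond to a two-layer graph neural network in this sketch.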
[0086] As a preferred approach, a simplified self-attention method is used to learn key features through information interaction. The simplified self-attention method consists of two linear layers. That is, the simplified Transformer network learns important nodes in the graph representation through two linear layers as the attention mechanism, and makes the final prediction through sequence pooling and linear layers.
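One possible reading of an attention mechanism built from two linear layers is sketched below. The softmax normalization between the two layers, the intermediate width S, and the names `simplified_attention`, `w_first` and `w_second` are assumptions; this application does not state how the two linear layers are normalized or sized.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def simplified_attention(nodes, w_first, w_second):
    """Two linear layers used in place of query-key-value self-attention.

    nodes: (N, D) node features; w_first: (D, S); w_second: (S, D).
    """
    attention = softmax(nodes @ w_first, axis=-1)  # (N, S), each row sums to 1
    return attention @ w_second, attention         # (N, D) output and (N, S) map
```

Under this reading the cost grows linearly with the number of nodes N, because no N×N attention map is formed.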
[0087] In addition, the graph representation of pathological features constructed using graph neural networks includes: setting the feature dimension of the feature extraction network output to H×W×C, where H represents the height of the pathological feature, W represents the width of the pathological feature, and C represents the number of channels of the pathological feature; dividing the pathological feature into N feature blocks, and obtaining the feature V of a node by converting each feature block into a feature vector; for a node, calculating the K nearest neighbors of the node and adding an edge E from the node to the nearest neighbor; finally obtaining an undirected graph representation G=(V,E).
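The graph construction described above can be illustrated with the following sketch. The square grid of blocks, the Euclidean distance used to find the K nearest neighbors, and the name `build_knn_graph` are assumptions; this application does not fix the distance measure.

```python
import numpy as np

def build_knn_graph(feature_map, n_side, k):
    """Split an (H, W, C) map into n_side * n_side blocks and link each to its k nearest.

    Returns the node feature matrix V and a sorted list of undirected edges E.
    """
    h, w, c = feature_map.shape
    bh, bw = h // n_side, w // n_side
    blocks = feature_map[:bh * n_side, :bw * n_side].reshape(n_side, bh, n_side, bw, c)
    # One row per feature block: each block is flattened into a feature vector.
    nodes = blocks.transpose(0, 2, 1, 3, 4).reshape(n_side * n_side, bh * bw * c)
    dist = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)               # a node is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]    # (N, k) neighbor indices
    # Store each edge as an ordered pair so that (i, j) and (j, i) coincide.
    edges = {(min(i, int(j)), max(i, int(j)))
             for i in range(len(nodes)) for j in nearest[i]}
    return nodes, sorted(edges)
```

For example, a 4×4×2 feature map with `n_side=2` yields N = 4 nodes, each an 8-dimensional feature vector.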
[0088] The method of obtaining the feature matrix by learning important nodes in the graph representation using a simplified Transformer network includes: setting the feature dimension of the graph neural network output to h×w×c, where h represents the height of the pathological feature, w represents the width of the pathological feature, and c represents the number of channels of the pathological feature; dividing the pathological feature into N feature blocks, each with a dimension of M = p×p×c, where p represents the size of the pathological feature, and N is the number of blocks N = (h×w) / (p×p). The feature block sequence is rearranged into an N×M matrix, and each matrix is linearly mapped to a specified dimension D; a positional information symbol is added to the matrix, and the matrix dimension becomes (N+1)×M, obtaining the input term. Finally, the input term is fed into a simplified self-attention mechanism to obtain the feature matrix.
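The block division and rearrangement described above can be sketched as follows. The names `to_patch_sequence` and `build_input`, the projection matrix `w_proj`, and the placement of the extra row at the start of the sequence are assumptions. In this sketch the extra row is added after the linear mapping, so each row has width D; when D is chosen equal to M the result has the (N+1)×M shape given in the text.

```python
import numpy as np

def to_patch_sequence(features, p):
    """Rearrange an (h, w, c) map into N = (h*w)/(p*p) rows of length M = p*p*c."""
    h, w, c = features.shape
    assert h % p == 0 and w % p == 0, "p must divide both h and w"
    patches = features.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    return patches.reshape((h // p) * (w // p), p * p * c)  # (N, M)

def build_input(features, p, w_proj, extra_token):
    """Project every block to dimension D and prepend one extra row."""
    sequence = to_patch_sequence(features, p) @ w_proj      # (N, D)
    return np.vstack([extra_token[None, :], sequence])      # (N + 1, D)
```

For an 8×8×3 feature map and p = 4, this gives N = 4 blocks of dimension M = 48, and an input term with 5 rows.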
[0089] The classification of the feature matrix using sequence pooling layers and fully connected layers to obtain classification results includes: assigning specified weights to the extracted pathological features using sequence pooling layers, and performing linear classification processing on the feature matrix using fully connected layers to obtain classification results. Specifically, based on the feature matrix, a specific score for each classification result corresponding to the class is calculated; based on the specific score, a specific weight for each classification result corresponding to the class is calculated; and based on the specific weight, the logical output corresponding to each classification result is calculated as the predicted probability for each classification result in the pathological image set.
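A minimal sketch of the score, weight and output computation is given below, assuming a common formulation of sequence pooling in which one score per row of the feature matrix is normalized by softmax and used to form a weighted sum. The names `sequence_pool_classify`, `w_score`, `w_cls` and `b_cls` are hypothetical, and the final softmax that turns the outputs into probabilities is likewise an assumption.

```python
import numpy as np

def sequence_pool_classify(x, w_score, w_cls, b_cls):
    """Pool an (N, D) feature matrix into one vector and classify it.

    w_score: (D, 1) scoring layer; w_cls: (D, num_classes); b_cls: (num_classes,).
    """
    scores = x @ w_score                          # (N, 1), one score per row
    scores = scores - scores.max()                # stabilize the exponentials
    weights = np.exp(scores) / np.exp(scores).sum()  # (N, 1), sums to 1
    pooled = (weights * x).sum(axis=0)            # (D,) weighted sum of rows
    logits = pooled @ w_cls + b_cls               # (num_classes,) linear classifier
    shifted = np.exp(logits - logits.max())
    probs = shifted / shifted.sum()               # predicted class probabilities
    return weights.ravel(), logits, probs
```

With an all-zero scoring layer every row receives the same weight, and the pooling reduces to a plain average of the rows.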
[0090] Training a pathological image classification network using the optimized dataset samples includes: calculating the loss value based on the classification results and the classification label set, and adjusting the pathological image classification network until the loss value is minimized. Specifically, calculating the loss value based on the classification results and the classification label set includes: substituting the classification results and the classification label set into the cross-entropy function to calculate the loss value. The optimized histopathological images are then fed into the histopathological image classification network in batches. The loss is calculated by comparing the classification prediction results of the last layer of the histopathological image classification network with the true labels of the pathological images. The calculated loss is backpropagated in the network to obtain the gradient of the network parameters. The network parameters are then adjusted using a gradient descent optimizer to minimize the loss.
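The training procedure can be illustrated on a deliberately small model. In the sketch below a single linear layer stands in for the whole pathological image classification network (an assumption made to keep the gradient explicit), and the names `cross_entropy` and `train_step` and the learning rate are hypothetical.

```python
import numpy as np

def softmax(z):
    """Row-wise numerically stable softmax."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean negative log-probability of the true class."""
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

def train_step(w, x, labels, lr):
    """One batch: forward pass, loss, backpropagated gradient, descent update."""
    probs = softmax(x @ w)
    loss = cross_entropy(probs, labels)
    grad_logits = probs.copy()
    grad_logits[np.arange(len(labels)), labels] -= 1.0  # d(loss)/d(logits) * batch
    grad_w = x.T @ grad_logits / len(labels)            # gradient of the parameters
    return w - lr * grad_w, loss
```

Repeating `train_step` over the batches of the optimized dataset drives the loss value down until a stopping condition is met.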
[0091] Data augmentation methods involve randomly expanding, cropping, flipping, and distorting the contrast and brightness of the pathological images in the dataset. Data standardization methods employ linear normalization to ensure that the pathological image samples in the dataset fall within the [0,1] interval.
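The standardization and part of the augmentation can be sketched as follows. The flip probability, the crop fraction of 9/10, and the contrast and brightness ranges are assumed values chosen for illustration, the random expansion step is omitted, and the names `min_max_normalize` and `augment` are hypothetical.

```python
import numpy as np

def min_max_normalize(image):
    """Linear normalization of the pixel values into the [0, 1] interval."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image)  # constant image: avoid division by zero
    return (image - lo) / (hi - lo)

def augment(image, rng):
    """Random flip, crop, contrast and brightness change on an (H, W, C) image in [0, 1]."""
    if rng.random() < 0.5:
        image = image[:, ::-1]                       # horizontal flip
    h, w = image.shape[:2]
    ch, cw = (h * 9) // 10, (w * 9) // 10            # crop to 9/10 of each side
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    image = image[top:top + ch, left:left + cw]
    contrast = rng.uniform(0.8, 1.2)
    brightness = rng.uniform(-0.1, 0.1)
    mean = image.mean()
    image = (image - mean) * contrast + mean + brightness
    return np.clip(image, 0.0, 1.0)                  # stay inside [0, 1]
```

A random generator such as `np.random.default_rng(0)` can be passed as `rng` to make the augmentation reproducible.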
[0092] The foregoing has provided a detailed description of a histopathological image classification device provided in this application. The various embodiments in the specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. It should be noted that those skilled in the art can make various improvements and modifications to this application without departing from the principles thereof, and these improvements and modifications also fall within the protection scope of the claims of this application.
[0093] It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the aforementioned element.
Claims
1. A histopathological image classification device, characterized in that it includes: an acquisition module used to acquire the pathological image to be tested; a calling module used to call a pre-trained pathological image classification network; and an input module used to input the pathological image to be tested into the pathological image classification network for classification; wherein the construction and training steps of the pathological image classification network include: obtaining a dataset sample of pathological images, the dataset sample including a set of pathological images and a set of classification labels corresponding to the set of pathological images; processing the pathological image set in the dataset sample using data augmentation and data standardization methods; constructing an end-to-end pathological image classification network comprising three parts: a first part is a feature extraction network for extracting pathological features of the image, including frequency domain information and spatial information; a second part is a graph neural network for constructing a graph representation of the pathological features and performing information interaction; and a third part is a simplified Transformer network for learning important nodes in the graph representation and performing final prediction; and training the pathological image classification network using the processed dataset samples. The feature extraction network runs three layers of convolutional block operations in parallel with the discrete wavelet decomposition operation. The feature extraction network is also used to merge the extracted pathological features as the input of the graph neural network.
The use of graph neural networks to construct graph representations of the pathological features includes: The feature dimension of the output of the feature extraction network is set to H×W×C, where H represents the height of the pathological feature, W represents the width of the pathological feature, and C represents the number of channels of the pathological feature. The pathological feature is divided into N feature blocks, and each feature block is transformed into a feature vector to obtain the feature V of a node. For the node, the K nearest neighbors of the node are calculated, and an edge E from the node to the nearest neighbor is added. Finally, an undirected graph representation G=(V,E) is obtained. The simplified Transformer network uses two linear layers as an attention mechanism to learn important nodes in the graph representation, and performs final prediction through sequence pooling and linear layers. The classification results obtained by classifying the feature matrix using sequence pooling layers and fully connected layers include: Sequence pooling layers are used to assign specified weights to the extracted pathological features, and fully connected layers are used to perform linear classification on the feature matrix to obtain the classification result. The pathological image classification network is a dedicated Graph-Transformer network based on discrete wavelet transform, which targets and extracts unstructured spatial information about tumor structural heterogeneity. 
The histopathological image classification device is further configured to: determine, based on the feature matrix, the specific score of the class corresponding to each classification result; determine, based on the specific score, the specific weight of the class corresponding to each classification result; and determine, based on the specific weights, the logical output corresponding to each classification result, which serves as the predicted probability for each classification result in the pathological image set. The histopathological image classification device is further configured to: calculate the loss value by comparing the classification result of the pathological image classification network with the true label of the pathological image; backpropagate the loss value in the pathological image classification network to obtain the gradient of the network parameters; and adjust the network parameters with a gradient descent optimizer so that the loss value meets a preset minimum condition.
2. The histopathological image classification device according to claim 1, characterized in that the graph neural network constructs a graph representation of the pathological features and performs information interaction through a two-layer graph neural network.
3. The histopathological image classification device according to claim 1, characterized in that obtaining the feature matrix by learning important nodes in the graph representation using the simplified Transformer network includes: setting the feature dimension of the output of the graph neural network to h×w×c, where h represents the height of the pathological feature, w represents the width of the pathological feature, and c represents the number of channels of the pathological feature; dividing the pathological feature into N feature blocks, the dimension of each feature block being M=p×p×c, where p represents the size of the pathological feature, and the number of feature blocks being N=(h×w) / (p×p); rearranging the feature block sequence into an N×M matrix, and linearly mapping each matrix to a specified dimension D; adding a position information symbol to the matrix, the matrix dimension being (N+1)×M, to obtain the input term; and feeding the input term into a simplified self-attention mechanism to obtain the feature matrix.
4. The histopathological image classification device according to claim 1, characterized in that the step of training the pathological image classification network using the processed dataset samples includes: calculating the loss value based on the classification results and the classification label set, and adjusting the pathological image classification network until the loss value is minimized.
5. The histopathological image classification device according to claim 4, characterized in that the step of calculating the loss value based on the classification results and the classification label set includes: substituting the classification results and the classification label set into the cross-entropy function to calculate the loss value.
6. The histopathological image classification device according to claim 1, characterized in that the data augmentation method involves randomly expanding, cropping, and flipping the pathological images in the dataset sample and randomly distorting their contrast and brightness; and the data standardization method employs linear normalization to ensure that the pathological image samples in the dataset sample fall within the [0,1] interval.