Hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning

By employing an unsupervised multi-view graph contrastive learning method, multiple views are constructed and data is adaptively augmented. This addresses the dependence of hyperspectral remote sensing image classification on training samples and achieves high-precision, efficient image classification.

CN118918459B · Active · Publication Date: 2026-06-26 · WUHAN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
WUHAN UNIV
Filing Date
2024-07-08
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing hyperspectral remote sensing image classification methods require a large number of training samples, and the initial graph they construct is incomplete, so only local information is extracted while long-range information is ignored.

Method used

We employ an unsupervised multi-view graph contrastive learning approach. Through an adaptively enhanced contrastive learning model, we construct multiple views from both spatial and spectral dimensions. We use an adaptive data augmentation module to discard unimportant edge and node features and extract deep features for classification.

Benefits of technology

It improves classification accuracy and efficiency without requiring a large number of labeled samples, comprehensively extracts local and long-range information from images, and enhances the model's feature extraction capabilities.


Abstract

The application discloses a hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning. First, data preprocessing is performed on the obtained hyperspectral remote sensing image X to obtain superpixels, the corresponding labels, and a segmentation matrix. Then, according to the segmentation matrix, a spatial adjacency matrix and a spectral adjacency matrix are established from multiple perspectives to obtain multiple views. Finally, the multiple views are input into an adaptive data augmentation module to obtain deep features for classification. The application makes full use of unsupervised deep contrastive learning for image classification, combines it with multi-view graph construction, takes the various representations of samples and features at all levels into account, and improves classification precision.

Description

Technical Field

[0001] This invention belongs to the field of remote sensing image processing technology, and relates to a hyperspectral remote sensing image classification method, and more particularly to a hyperspectral remote sensing image classification method, device and software product based on unsupervised multi-view graph contrastive learning.

Background Technology

[0002] Hyperspectral remote sensing image processing plays a crucial role in material information detection and is an important topic in the field of remote sensing. Hyperspectral remote sensing images provide continuous radiometric spectral bands, carrying rich information about ground features, and can be used in various application areas, such as building change detection, crop assessment, and geological and mineral resource surveys (References 1, 2, 3). In the field of hyperspectral remote sensing image processing, classification is a major task; hyperspectral image classification involves determining the category of each pixel in a hyperspectral image. Existing classification methods are mainly based on convolutional neural networks (CNNs), using CNN models for end-to-end image classification.

[0003] Graph convolutional neural networks (GCNNs), as a deep learning model (references 4 and 5), have been successfully applied to hyperspectral remote sensing image classification. GCNN-based methods can process high-dimensional data and quickly aggregate information from surrounding nodes, enabling rapid extraction of deep abstract information. Existing hyperspectral image classification methods require a large number of prior training samples to achieve good results. However, this consumes significant human resources in hyperspectral image classification applications. Furthermore, most existing graph classification methods suffer from incomplete initial graph construction, resulting in the extraction of only local image information during training, neglecting the mining of long-range information.

[0004] References:

[0005] [1] He L, Li J, Liu C, Li S. Recent Advances on Spectral-Spatial Hyperspectral Image Classification: An Overview and New Guidelines. IEEE Trans Geosci Remote Sens. 2018 Mar;56(3):1579-97.

[0006] [2] Ahmad M, Shabbir S, Roy SK, Hong D, Wu X, Yao J, et al. Hyperspectral Image Classification-Traditional to Deep Models: A Survey for Future Prospects. IEEE J Sel Top Appl Earth Observations Remote Sens. 2022 Oct 15;15:968-99.

[0007] [3] Ye J, He J, Peng X, Wu W, Qiao Y. Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition. In: Vedaldi A, Bischof H, Brox T, Frahm J-M, editors. Computer Vision-ECCV 2020 [Internet]. Cham: Springer International Publishing; 2020 [cited 2021 Apr 26]. p. 649-65. (Lecture Notes in Computer Science; vol. 12366). Available from: https://link.springer.com/10.1007/978-3-030-58589-1_39

[0008] [4] Liang H, Li Q. Hyperspectral Imagery Classification Using Sparse Representations of Convolutional Neural Network Features. Remote Sens. 2016 Jan 27;8(2):99.

[0009] [5] Zhang H, Zou J, Zhang L. EMS-GCN: An End-to-end Mixhop Superpixel-based Graph Convolutional Network for Hyperspectral Image Classification. IEEE Trans Geosci Remote Sens. 2022 Mar 30;60:1-16.

Summary of the Invention

[0010] To address the issues of existing technologies requiring a large number of training samples and constructing incomplete initial graphs, this invention provides a high-precision and high-efficiency hyperspectral image classification method based on unsupervised multi-view graph contrastive learning.

[0011] In a first aspect, the present invention provides a hyperspectral remote sensing image classification method based on unsupervised multi-view image contrast learning, comprising the following steps:

[0012] Step 1: Preprocess the acquired hyperspectral remote sensing image X to obtain superpixels, corresponding labels, and segmentation matrix;

[0013] Step 2: Based on the segmentation matrix, establish the spatial adjacency matrix and spectral adjacency matrix from multiple perspectives to obtain multiple views;

[0014] Step 3: Input the multi-view data into the adaptive augmented contrastive learning model to obtain deep features for classification;

[0015] Step 3: Input the multi-view data into the adaptive data augmentation module to obtain deep features for classification;

[0016] The adaptive data augmentation module includes a spectral graph adaptive data augmentation branch and a spatial graph adaptive data augmentation branch set in parallel. Both the spectral graph adaptive data augmentation branch and the spatial graph adaptive data augmentation branch consist of a cascaded edge selection layer and a feature masking layer. The edge selection layer first uses a reference node centrality metric to measure the importance of the current node, and then uses the average of the parameters of the two endpoints of the edge to represent its importance. Next, the edges are sorted by importance, and the top K nodes are selected, their edge relationships are retained, and the remaining edges are deleted. The feature masking layer first sorts the importance of the nodes, then sets a discard probability, and sets the feature mask of unimportant nodes to 0 to achieve data augmentation.

[0017] Preferably, in step 1, principal component analysis (PCA) is used to perform dimensionality reduction and noise reduction on the hyperspectral remote sensing image.

[0018] Preferably, in step 2, the spectral adjacency matrix $A_{spe}$ is obtained using Euclidean distances with K-order nearest neighbors:

[0019] $N(v_i) = \mathrm{argsort}(\lVert v_i - v_j \rVert_2)$;

[0020]

[0021] where $N(v_i)$ is obtained from the Euclidean distances between node i and all other nodes j; the subscript K in $N_K(v_i)$ denotes filtering over all nodes, so that $N_K(v_i)$ is the set of the first K nodes closest to node i in Euclidean distance; $v_i$ and $v_j$ are the features of the nodes in the graph; the argsort() function sorts the elements of a matrix in ascending order and returns the array indices of the sorted elements;

[0022] The spatial adjacency matrix $A_{spa}$ is constructed using a self-connected spatial second-order nearest neighbor model:

[0023] $A_{spa} = A \times A + I$;

[0024] $a = \max(w),\ b = \min(w)$; if $a \neq b$, then $A_{ab} = 1$;

[0025] where a 3×3 template w is used to take the maximum value a and the minimum value b of the segmentation matrix, giving the adjacency matrix A based on spatial first-order nearest neighbors, and I is the identity matrix.

[0026] Preferably, in step 3, the obtained multi-view A is first subjected to adaptive edge discarding to obtain the edge matrix.

[0027] E = vstack(coo_matrix(A)),

[0028] Where J represents the number of edges, and coo_matrix and vstack are two matrix transformation functions in Python;

[0029] Then calculate the importance metric parameter of the edge between nodes a and b:

[0030]

[0031] where $\theta_a(E)$ computes the degree matrix for node a, $\theta_b(E)$ computes the degree matrix for node b, and the subscript e abbreviates "edge";

[0032] Calculate the normalized edge sorting index $idx_e$:

[0033]

[0034] where $W_e^{max}$ and $W_e^{mean}$ are the maximum and mean values of $W_e$; $w_e$ is the metric parameter of one specific edge, namely the edge between nodes a and b; the argsort() function sorts the elements of a matrix in ascending order and returns the array indices of the sorted elements; $idx_e$ is the array of indices output for all edge weights $W_e$;

[0035] The edges are sorted by importance, and the edges with high weights are kept while the unimportant edges are removed, in order to reduce computation memory and improve efficiency.

[0036]

[0037] Where T = J × p is used to calculate the number of edges dropped, p is a hyperparameter representing the probability of dropping, and J represents the number of edges;

[0038] The multiple views are then passed through an adaptive node feature mask in a cascaded manner:

[0039] $W_f = x^T \theta(E)$,

[0040] where $x^T$ is the transposed feature vector of a node, $\theta(E)$ is the degree matrix of the node, and $W_f$ is the importance metric for node features;

[0041] The node features in the graph are ranked by importance to obtain the normalized feature ranking index $idx_f$:

[0042]

[0043] where $W_f^{max}$ and $W_f^{mean}$ are the maximum and mean values of $W_f$;

[0044] Remove unimportant node features based on the discard probability $p_2$:

[0045]

[0046] where $T_2 = N \times p_2$; $p_2 \in [0, 1]$ is the probability of discarding node features; N is the number of bands after PCA dimensionality reduction; and $F_i$ denotes the features of node i.

[0047] Preferably, the adaptively enhanced contrastive learning model is a pre-trained network model; during training, the NT-Xent loss function is used for backpropagation.

[0048]

[0049] where sim(·,·) is the cosine similarity function; $z_{spa,i}$ and $z_{spe,i}$ are the spatial and spectral features after the MLP; τ is the temperature parameter; n is the number of nodes; and $\zeta(z_{spa,i}, z_{spe,i})$ is the similarity between the spatial branch and the spectral branch for node i.

[0050] Secondly, this invention provides a hyperspectral remote sensing image classification system based on unsupervised multi-view image contrast learning, comprising the following modules:

[0051] The hyperspectral remote sensing image preprocessing module is used to preprocess the acquired hyperspectral remote sensing image X to obtain superpixels and corresponding labels, and segmentation matrix;

[0052] The multi-view construction module is used to build spatial adjacency matrices and spectral adjacency matrices from multiple perspectives based on the segmentation matrix, thereby obtaining multiple views;

[0053] The hyperspectral remote sensing image classification module is used to input multiple views into an adaptively enhanced contrastive learning model to obtain deep features for classification.

[0054] Thirdly, the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the hyperspectral remote sensing image classification method based on unsupervised multi-view image contrast learning.

[0055] Fourthly, the present invention provides a non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the hyperspectral remote sensing image classification method based on unsupervised multi-view image contrast learning.

[0056] Fifthly, the present invention provides a computer program product, including a computer program that, when executed by a processor, implements the hyperspectral remote sensing image classification method based on unsupervised multi-view image contrast learning.

[0057] This invention has the following advantages and positive effects:

[0058] 1. This invention innovatively proposes a deep learning network based on unsupervised graph contrastive learning, which for the first time achieves unsupervised hyperspectral image classification and can alleviate the model's dependence on labeled samples.

[0059] 2. This invention proposes a novel multi-view adjacency matrix construction method to improve the initial graph structure. This method can establish adjacency matrices from both spatial and spectral perspectives, acquiring long-range information while focusing on local node information, resulting in more comprehensive feature extraction.

[0060] 3. This invention designs an adaptive data augmentation module that discards unimportant data from both edge and node feature perspectives. This can reduce the loss of important information caused by random augmentation and improve network performance.

Attached Figure Description

[0061] The technical solutions of the present invention will be further illustrated below using embodiments and specific implementation methods. In addition, some accompanying drawings are used in the description of the technical solutions. Those skilled in the art can obtain other drawings and the intent of the present invention from these drawings without any creative effort.

[0062] Figure 1 is a schematic diagram of the specific process of the method of the present invention;

[0063] Figure 2 is a schematic diagram of the structure of the adaptive data augmentation module of the method of the present invention.

Detailed Implementation

[0064] To facilitate understanding and implementation of the present invention by those skilled in the art, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described herein are for illustration and explanation only and are not intended to limit the present invention.

[0065] This invention leverages multi-view graph contrastive learning to maximize consistency within the same category and diversity between different categories. It is particularly effective in situations with limited sample sizes and an imbalanced number of positive and negative samples, compensating for the limitations of convolutional neural networks. Data augmentation is a crucial component of contrastive learning. To enable adaptive data augmentation when using graph structures and minimize the destruction of important information by random augmentation, this invention introduces an effective unsupervised multi-view graph contrastive learning method. Addressing the shortcomings of existing methods, it extracts local and long-range information from hyperspectral images from both spatial and spectral dimensions using a multi-view graph construction approach. Simultaneously, it selectively preserves important structures for data augmentation, thereby better accomplishing hyperspectral image classification tasks.

[0066] Referring to Figure 1, this embodiment provides a hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning, which includes the following steps:

[0067] Step 1: Preprocess the acquired hyperspectral remote sensing image X to obtain superpixels, corresponding labels, and segmentation matrix;

[0068] The hyperspectral image $X \in \mathbb{R}^{H \times W \times B}$ is input into the model, where H, W, and B are the length, width, and height (number of bands) of the hyperspectral image, respectively. The hyperspectral image is a data cube that can be represented by a tensor, and the vector at each point of the tensor holds the pixel radiance values for each band. Because it consists of hundreds of very narrow contiguous spectral bands, it contains rich information but also much redundant information. Furthermore, because of atmospheric, sensor, and environmental interference as well as human annotation, the spectral information may not be accurate. Therefore, this embodiment uses Principal Component Analysis (PCA) to perform dimensionality reduction and noise reduction on the image.
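The PCA step can be illustrated with a minimal NumPy sketch. The function name `pca_reduce`, the number of retained components, and the random demo cube are assumptions made for illustration only; the embodiment does not specify how many components are kept.

```python
import numpy as np

def pca_reduce(cube: np.ndarray, n_components: int) -> np.ndarray:
    """Project an (H, W, B) cube onto its first n_components principal axes."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)
    flat = flat - flat.mean(axis=0)            # centre every band
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    reduced = flat @ vt[:n_components].T       # (H*W, n_components)
    return reduced.reshape(h, w, n_components)

# Hypothetical demo data: an 8 x 6 image with 20 bands.
rng = np.random.default_rng(0)
demo_cube = rng.normal(size=(8, 6, 20))
demo_reduced = pca_reduce(demo_cube, n_components=5)
```

Keeping only the leading components discards the low-variance directions, which is where much of the band redundancy and noise is expected to sit.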

[0069] Step 2: Based on the segmentation matrix, establish the spatial adjacency matrix and spectral adjacency matrix from multiple perspectives to obtain multiple views;

[0070] In one implementation, graph convolutional neural networks compute adjacency matrices based on all nodes in the entire graph. However, directly converting image pixels into nodes consumes significant memory resources, making it unsuitable for large images such as remote sensing images. Therefore, a clustering function is needed to group similar pixels into superpixels to reduce memory usage. This requires an encoder and decoder that construct pixel-level and superpixel features, based on…

[0071]

[0072] where $Q_{ij}$ is the value of the correlation matrix Q at position (i, j), S denotes the superpixels, and Flatten(X) denotes flattening the node features X. After the correlation matrix is constructed, the encoder and decoder between superpixels and pixels are built from it, i.e.

[0073]

[0074] Where Decoder(·;Q) is the superpixel feature decoder, Rshape(·) performs dimensionality transformation, and V is the superpixel feature. This transforms node data into raster data.

[0075] A hyperspectral spatial-spectral multi-view adjacency matrix is constructed using the correlation matrix, thereby establishing a complete initial graph structure. First, during the construction of the spectral adjacency matrix, pixel features are transformed into superpixel features using the correlation matrix Q:

[0076] $F_S = Q^T F_P$,

[0077] where $F_S$ denotes the superpixel features, $Q^T$ is the transpose of the correlation matrix, and $F_P$ denotes the pixel features. Next, the Euclidean distance between any two nodes is calculated. This allows spatially distant nodes to be connected on the basis of spectral similarity, enabling the extraction of long-range information. The Euclidean distances of each node are then sorted, and the spectral adjacency matrix $A_{spe}$ is constructed by using the K-order nearest neighbors to decide whether nodes are connected:

[0078] $N(v_i) = \mathrm{argsort}(\lVert v_i - v_j \rVert_2)$;

[0079]

[0080] where $N(v_i)$ is obtained from the Euclidean distances between node i and all other nodes j; the subscript K in $N_K(v_i)$ denotes filtering over all nodes, so that $N_K(v_i)$ is the set of the first K nodes closest to node i in Euclidean distance; $v_i$ and $v_j$ are the features of the nodes in the graph; the argsort() function sorts the elements of a matrix in ascending order and returns the array indices of the sorted elements; and $A_{spe}$ is the resulting spectral adjacency matrix.
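As a rough illustration of the spectral view, the sketch below aggregates pixel features into superpixel features with $Q^T F_P$ and then connects each node to its K nearest neighbors in Euclidean distance. It assumes that an entry of $A_{spe}$ is set to 1 when node j belongs to $N_K(v_i)$, that a node is not its own neighbor, and that the matrix is left unsymmetrised; the function names and demo values are hypothetical.

```python
import numpy as np

def superpixel_features(Q: np.ndarray, F_P: np.ndarray) -> np.ndarray:
    """F_S = Q^T F_P, with Q of shape (pixels, superpixels) and F_P of shape (pixels, bands)."""
    return Q.T @ F_P

def spectral_adjacency(F_S: np.ndarray, k: int) -> np.ndarray:
    """Connect every node to its k nearest neighbours in feature space."""
    n = F_S.shape[0]
    dist = np.linalg.norm(F_S[:, None, :] - F_S[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)            # assumption: exclude the node itself
    order = np.argsort(dist, axis=1)          # N(v_i): ascending distance
    A = np.zeros((n, n), dtype=np.int8)
    rows = np.repeat(np.arange(n), k)
    A[rows, order[:, :k].ravel()] = 1         # N_K(v_i): first k entries
    return A

# Hypothetical demo: 3 pixels assigned to 2 superpixels.
Q_demo = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
F_P_demo = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
F_S_demo = superpixel_features(Q_demo, F_P_demo)

# Four nodes on a line; with k=1 the two close pairs link to each other.
A_spe_demo = spectral_adjacency(np.array([[0.0], [1.0], [10.0], [11.0]]), k=1)
```

Because neighbors are chosen by spectral distance alone, two superpixels far apart in the image can still be linked, which is how this view carries long-range information.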

[0081] Next, the spatial adjacency matrix is constructed. Previous methods slide a template over the segmentation matrix M and take its maximum and minimum values to decide whether nodes are first-order nearest neighbors:

[0082] $a = \max(w),\ b = \min(w)$; if $a \neq b$, then $A_{ab} = 1$.

[0083] Here a 3×3 template w is used to obtain the maximum and minimum values of the segmentation matrix, resulting in an adjacency matrix A based on spatial first-order nearest neighbors. This implementation uses self-connected spatial second-order nearest neighbors to extract spatial information, so that information from surrounding nodes is aggregated while attention is kept on the pixels within a node:

[0084] $A_{spa} = A \times A + I$,

[0085] where I is the identity matrix and $A_{spa}$ is the resulting spatial adjacency matrix.
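A small sketch of the spatial view follows. It assumes that the segmentation matrix holds integer superpixel labels starting at 0, that the first-order matrix A is stored symmetrically, and that "×" denotes the ordinary matrix product; the function names and the demo segmentation are hypothetical.

```python
import numpy as np

def first_order_adjacency(M: np.ndarray) -> np.ndarray:
    """Slide a 3x3 template over M; link the max and min labels when they differ."""
    n = int(M.max()) + 1
    A = np.zeros((n, n), dtype=np.int64)
    H, W = M.shape
    for r in range(H - 2):
        for c in range(W - 2):
            w = M[r:r + 3, c:c + 3]
            a, b = int(w.max()), int(w.min())
            if a != b:
                A[a, b] = A[b, a] = 1      # assumption: symmetric storage
    return A

def spatial_adjacency(A: np.ndarray) -> np.ndarray:
    """A_spa = A x A + I, read here as a matrix product plus self-connections."""
    return A @ A + np.eye(A.shape[0], dtype=A.dtype)

# Hypothetical segmentation: three vertical stripes labelled 0, 1, 2.
M_demo = np.array([[0, 0, 1, 1, 2, 2]] * 3)
A_demo = first_order_adjacency(M_demo)
A_spa_demo = spatial_adjacency(A_demo)
```

In the stripe example, superpixels 0 and 2 never share a template, yet the product A x A gives them a nonzero entry because both touch superpixel 1.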

[0086] Step 3: Input the multi-view data into the adaptive data augmentation module to obtain deep features for classification;

[0087] Referring to Figure 2, the adaptive data augmentation module in this embodiment includes a spectral graph adaptive data augmentation branch and a spatial graph adaptive data augmentation branch set in parallel. Both the spectral graph adaptive data augmentation branch and the spatial graph adaptive data augmentation branch consist of a cascaded edge selection layer and a feature masking layer. The edge selection layer first uses a reference node centrality metric to measure the importance of the current node, and then uses the average of the parameters of the two endpoints of the edge to represent its importance. Next, the edges are sorted by importance, and the top K nodes are selected, their edge relationships are retained, and the remaining edges are deleted. The feature masking layer first sorts the importance of the nodes, then sets a discard probability, and sets the feature mask of unimportant nodes to 0 to achieve data augmentation.

[0088] To enhance the contrastive learning framework's ability to extract key features, this embodiment employs an adaptive approach to selectively enhance graph edge and node features while discarding less important nodes and edges, thereby improving the model's feature extraction capabilities. The initial graph obtained from the multi-view construction module is first processed through adaptive edge discarding to obtain the edge matrix.

[0089] E = vstack(coo_matrix(A)),

[0090] Where J represents the number of edges, and coo_matrix and vstack are two matrix transformation functions in Python;
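One way to read the edge-matrix step is sketched below. It assumes that E stacks the row and column indices of the nonzero entries of A into a 2×J array; `coo_matrix` is taken from SciPy and `vstack` from NumPy, and the small adjacency matrix is a hypothetical example.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical 3-node path graph: 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

coo = coo_matrix(A)                    # sparse COO view of the adjacency matrix
E = np.vstack((coo.row, coo.col))      # assumption: row 0 = sources, row 1 = targets
J = E.shape[1]                         # number of (directed) edges
```

Storing edges as index pairs rather than as a dense matrix makes it straightforward to score and drop individual edges in the following steps.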

[0091] Then calculate the importance metric parameter of the edge between nodes a and b:

[0092]

[0093] where $\theta_a(E)$ computes the degree matrix for node a, $\theta_b(E)$ computes the degree matrix for node b, and the subscript e abbreviates "edge";

[0094] Next, calculate the normalized edge sorting index $idx_e$:

[0095]

[0096] where $W_e^{max}$ and $W_e^{mean}$ are the maximum and mean values of $W_e$; $w_e$ is the metric parameter of one specific edge, namely the edge between nodes a and b; the argsort() function sorts the elements of a matrix in ascending order and returns the array indices of the sorted elements; $idx_e$ is the array of indices output for all edge weights $W_e$;

[0097] The edges are sorted by importance, and the edges with high weights are kept while the unimportant edges are removed, in order to reduce computation memory and improve efficiency.

[0098]

[0099] Where T = J × p is used to calculate the number of edges dropped, p is a hyperparameter representing the probability of dropping, and J represents the number of edges;
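The adaptive edge discarding can be sketched as follows, under stated assumptions: node importance is taken to be the node degree, edge importance is the mean of the two endpoint degrees, and the normalisation involving the maximum and mean weights is omitted because its exact form is not assumed here. The function name and the demo graph are hypothetical.

```python
import numpy as np

def drop_unimportant_edges(E: np.ndarray, num_nodes: int, p: float) -> np.ndarray:
    """Remove the T = J * p least important edges from a (2, J) edge matrix."""
    J = E.shape[1]
    degree = np.bincount(E[0], minlength=num_nodes)   # degree of every node
    w_e = (degree[E[0]] + degree[E[1]]) / 2.0         # mean of endpoint degrees
    idx_e = np.argsort(w_e, kind="stable")            # ascending importance
    T = int(J * p)                                    # number of edges to drop
    keep = np.sort(idx_e[T:])                         # preserve original order
    return E[:, keep]

# Hypothetical graph with undirected edges 0-1, 0-2, 0-3, 3-4 (both directions stored).
E_demo = np.array([[0, 1, 0, 2, 0, 3, 3, 4],
                   [1, 0, 2, 0, 3, 0, 4, 3]])
E_kept = drop_unimportant_edges(E_demo, num_nodes=5, p=0.25)
```

With p = 0.25 the two directed copies of edge 3-4 have the lowest score (1.5) and are removed, while every edge touching the hub node 0 is kept.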

[0100] The multiple views are then passed through an adaptive node feature mask in a cascaded manner:

[0101] $W_f = x^T \theta(E)$,

[0102] where $x^T$ is the transposed feature vector of a node, $\theta(E)$ is the degree matrix of the node, and $W_f$ is the importance metric for node features;

[0103] The node features in the graph are ranked by importance to obtain the normalized feature ranking index $idx_f$:

[0104]

[0105] where $W_f^{max}$ and $W_f^{mean}$ are the maximum and mean values of $W_f$;

[0106] Remove unimportant node features based on the discard probability $p_2$:

[0107]

[0108] where $T_2 = N \times p_2$; $p_2 \in [0, 1]$ is the probability of discarding node features; N is the number of bands after PCA dimensionality reduction; and $F_i$ denotes the features of node i.
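A minimal sketch of the feature masking step is given below. It assumes that $\theta(E)$ can be treated as a vector of node degrees, that the importance of each feature dimension is $X^T \theta$, and that the $T_2$ least important dimensions are zeroed for all nodes; the normalisation step is again left out. All names and demo values are hypothetical.

```python
import numpy as np

def mask_unimportant_features(X: np.ndarray, theta: np.ndarray, p2: float):
    """Zero the T2 = N * p2 feature dimensions with the lowest importance W_f."""
    N = X.shape[1]                            # number of feature dimensions
    W_f = X.T @ theta                         # importance per feature dimension
    idx_f = np.argsort(W_f, kind="stable")    # ascending importance
    T2 = int(N * p2)
    mask = np.ones(N, dtype=X.dtype)
    mask[idx_f[:T2]] = 0                      # discard the least important ones
    return X * mask, mask

# Hypothetical data: 3 nodes with 4 features each, and node degrees 1, 2, 1.
X_demo = np.array([[1.0, 0.1, 2.0, 0.2]] * 3)
theta_demo = np.array([1.0, 2.0, 1.0])
X_masked, mask_demo = mask_unimportant_features(X_demo, theta_demo, p2=0.5)
```

Compared with masking dimensions uniformly at random, ranking them first keeps the dimensions that contribute most on well-connected nodes.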

[0109] In this implementation, an adaptive contrastive learning framework is used for unsupervised feature extraction; during training, the MoCo structure is adopted, and the NT-Xent loss function is used for backpropagation.

[0110]

[0111] where sim(·,·) is the cosine similarity function; $z_{spa,i}$ and $z_{spe,i}$ are the spatial and spectral features after the MLP; τ is the temperature parameter; n is the number of nodes; and $\zeta(z_{spa,i}, z_{spe,i})$ is the similarity between the spatial branch and the spectral branch for node i.
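For orientation, the sketch below implements the commonly used form of the NT-Xent loss, with the spatial and spectral embeddings of the same node as the positive pair and all other embeddings as negatives. It is an assumption that this standard form matches the loss of this embodiment; the NumPy implementation, the function name, and the default temperature are illustrative only.

```python
import numpy as np

def nt_xent(z_spa: np.ndarray, z_spe: np.ndarray, tau: float = 0.5) -> float:
    """Standard NT-Xent loss over n node pairs from two views."""
    n = z_spa.shape[0]
    z = np.concatenate([z_spa, z_spe], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity via dot product
    sim = (z @ z.T) / tau
    np.fill_diagonal(sim, -np.inf)                     # a sample is never its own negative
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())

# Hypothetical embeddings: four orthogonal unit vectors.
z_demo = np.eye(4)
loss_aligned = nt_xent(z_demo, z_demo)            # views agree node by node
loss_shuffled = nt_xent(z_demo, z_demo[::-1])     # views paired with the wrong node
```

The loss is lower when the two views of each node agree, which is the behaviour that contrastive pre-training rewards.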

[0112] This embodiment also provides a hyperspectral remote sensing image classification system based on unsupervised multi-view image contrast learning, including the following modules:

[0113] The hyperspectral remote sensing image preprocessing module is used to preprocess the acquired hyperspectral remote sensing image X to obtain superpixels and corresponding labels, and segmentation matrix;

[0114] The multi-view construction module is used to build spatial adjacency matrices and spectral adjacency matrices from multiple perspectives based on the segmentation matrix, thereby obtaining multiple views;

[0115] The hyperspectral remote sensing image classification module is used to input multiple views into an adaptively enhanced contrastive learning model to obtain deep features for classification.

[0116] This embodiment also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the hyperspectral remote sensing image classification method based on unsupervised multi-view image contrast learning.

[0117] This embodiment also provides a non-transitory computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the hyperspectral remote sensing image classification method based on unsupervised multi-view image contrast learning.

[0118] This embodiment also provides a computer program product, including a computer program that, when executed by a processor, implements the hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrast learning.

[0119] The key inventive point of this invention is the proposal of a hyperspectral image classification method based on unsupervised multi-view graph contrastive learning. This method combines multi-view adjacency matrix construction with adaptive contrastive learning to address the problem of insufficient training samples. Based on a graph convolutional neural network framework, this method constructs multiple views, enabling the extraction of not only local spatial information from the image but also long-range information through distant node connections, thus facilitating the comprehensive acquisition of deep abstract information from the image. Simultaneously, adaptive data augmentation enhances the model's information extraction capabilities. Essentially, this method utilizes multi-view computation to refine the initial graph structure by calculating the connection features between nodes, and leverages an adaptive contrastive learning framework to improve classification efficiency.

[0120] The invention will be further illustrated by the following experiments.

[0121] The experiment was written in Python and implemented with the deep learning framework PyTorch, building on Python remote sensing image read/write functions. The SPECTRAL remote sensing image processing library is called with the filename of the remote sensing image to be read, and the hyperspectral remote sensing image to be analyzed is read into a tensor X of size H×W×B. Each element of the tensor is the pixel radiance value for the corresponding band, where H is the length of the remote sensing image, W is the width of the remote sensing image, and B is the number of bands. Python remote sensing image read/write functions are well known in this field and are not described in detail here.

[0122] The Salinas dataset and the Wuhan UAV hyperspectral imagery dataset (WHU-Hi) were used in this experiment to validate the model's effectiveness. The Salinas dataset, captured by the AVIRIS sensor over Salinas Valley, California, is 512×217 pixels with a spatial resolution of 3.7 meters. It contains 224 contiguous bands; after removing 20 water absorption bands (bands 108-112, 154-167, and 224), 204 bands were actually used for training. The study area includes 16 land cover types, including vegetables, bare soils, and vineyard fields. The WHU-Hi dataset, collected and shared by the RSIDEA research group at Wuhan University, can serve as a benchmark dataset for accurate crop classification and hyperspectral image classification research. It comprises three independent UAV hyperspectral datasets: WHU-Hi-LongKou, WHU-Hi-HanChuan, and WHU-Hi-HongHu. All three were acquired with a Headwall Nano-Hyperspec sensor mounted on a drone platform over agricultural areas with different crop types in Hubei Province, China. Compared with spaceborne and airborne hyperspectral platforms, UAV-borne hyperspectral systems can acquire hyperspectral images with high spatial resolution. The LongKou dataset was collected in Longkou Town, Hubei Province, China, from 13:49 to 14:37 on July 17, 2018, using a Headwall Nano-Hyperspec imaging sensor with an 8 mm focal length mounted on a DJI Matrice 600 Pro drone platform. During data acquisition, the weather was clear and cloudless, with a temperature of approximately 36°C and a relative humidity of approximately 65%. The study area is a simple agricultural scene containing six crops: maize, cotton, sesame, broadleaf soybean, narrowleaf soybean, and rice. The drone flew at an altitude of 500 m; the image size is 550×400 pixels, with 270 bands in the 400-1000 nm range, and the spatial resolution of the UAV-borne hyperspectral images is approximately 0.463 m.
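The Salinas band count can be checked with a short sketch. It assumes the bands are numbered from 1 to 224 and that the listed ranges are inclusive; the variable names are hypothetical.

```python
# Water absorption bands listed for Salinas (1-based, inclusive ranges).
removed_bands = set(range(108, 113)) | set(range(154, 168)) | {224}

# Bands that remain available for training.
kept_bands = [b for b in range(1, 225) if b not in removed_bands]

# Zero-based indices, e.g. for slicing the last axis of an (H, W, B) array.
kept_indices = [b - 1 for b in kept_bands]
```

The three ranges contain 5, 14, and 1 bands, which accounts for the 20 removed bands and the 204 bands that remain.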

[0123] Classification evaluation metrics: A quantitative evaluation method was adopted. For dataset 1, 1% of the pixels were selected as training samples, and the rest as test samples. For dataset 2, 0.5% of the pixels were selected as training samples, and the rest as test samples. The following two metrics were used for evaluation:

[0124] 1) Kappa coefficient:

[0125] The kappa coefficient is an authoritative evaluation metric for classification problems. A higher kappa coefficient generally indicates higher accuracy. The kappa coefficient is calculated as follows:

[0126] The confusion matrix was obtained from the samples, as shown in Table 1.

[0127] Table 1 Confusion Matrix

[0128]

[0129] In Table 1, TTO represents the number of samples labeled as category 1 and predicted as category 1; TF represents the number of samples actually in category 2 but predicted as category 1; FT represents the number of samples actually in category 1 but predicted as category 2; TTT represents the number of samples labeled as category 2 and predicted as category 2; NCO is the sum of TTO and FT; NCT is the sum of TF and TTT; NRO is the sum of TTO and TF; NRT is the sum of FT and TTT; and N is the total number of samples.

[0130] The formula for calculating the kappa coefficient is:

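[0131] $$\mathrm{Kappa} = \frac{N\,(TTO + TTT) - (NCO \cdot NRO + NCT \cdot NRT)}{N^{2} - (NCO \cdot NRO + NCT \cdot NRT)}$$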

[0132] 2) Overall accuracy:

[0133] Overall accuracy (OA) is the proportion of samples that are classified correctly. Higher overall accuracy indicates higher classification precision. OA is calculated from the confusion matrix shown in Table 1, and the formula for OA is:

[0134] $$\mathrm{OA} = \frac{TTO + TTT}{N}$$
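Both metrics can be computed directly from the four cell counts of Table 1. The following is a minimal sketch that assumes the standard definition of Cohen's kappa; the function names and the sample counts are illustrative and are not results from the experiments.

```python
def overall_accuracy(tto, tf, ft, ttt):
    """Fraction of samples on the diagonal of the 2x2 confusion matrix."""
    n = tto + tf + ft + ttt
    return (tto + ttt) / n


def kappa_coefficient(tto, tf, ft, ttt):
    """Cohen's kappa from the cell counts named in Table 1."""
    n = tto + tf + ft + ttt
    nco, nct = tto + ft, tf + ttt   # totals of actual category 1 / category 2
    nro, nrt = tto + tf, ft + ttt   # totals of predicted category 1 / category 2
    observed = (tto + ttt) / n                     # observed agreement
    expected = (nco * nro + nct * nrt) / (n * n)   # agreement expected by chance
    return (observed - expected) / (1 - expected)


# Made-up counts, for illustration only.
print(f"OA = {overall_accuracy(40, 10, 5, 45):.2f}")        # OA = 0.85
print(f"kappa = {kappa_coefficient(40, 10, 5, 45):.2f}")    # kappa = 0.70
```

Kappa is lower than OA here because it discounts the agreement that would be expected by chance given the row and column totals.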

[0135] The kappa coefficient and overall accuracy were used to evaluate the classification capability of comparative methods 1-5 and of the method of this invention. The evaluation results are shown in Table 2.

[0136] Table 2 Comparative test results 1

[0137]

[0138] As shown in Table 2, the method of the present invention achieves the best results among the compared methods in both overall classification accuracy and kappa value, a significant improvement that indicates stronger feature extraction and classification capabilities.

[0139] Therefore, it can be concluded that the method of this invention has higher classification accuracy compared with traditional and recent unsupervised deep learning classification methods. This invention fully utilizes unsupervised deep contrastive learning for image classification, combines multi-view forms of graph construction, fully considers various representations of samples, and comprehensively considers all levels of features, thereby improving classification accuracy.

[0140] It should be understood that the embodiments described above are only some, not all, of the embodiments of the present invention. Furthermore, the technical features of the various embodiments or individual embodiments provided by the present invention can be arbitrarily combined to form feasible technical solutions. Such combinations are not constrained by the order of steps and / or structural composition patterns, but must be based on the ability of those skilled in the art to implement them. When the combination of technical solutions is contradictory or cannot be implemented, it should be considered that such a combination of technical solutions does not exist and is not within the scope of protection claimed by the present invention.

[0141] It should be understood that the above description of the preferred embodiments is quite detailed, but it should not be considered as a limitation on the scope of protection of this invention. Those skilled in the art, under the guidance of this invention, can make substitutions or modifications without departing from the scope of protection of the claims of this invention, and all such substitutions or modifications fall within the scope of protection of this invention. The scope of protection of this invention should be determined by the appended claims.

Claims

1. A hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning, characterized in that it includes the following steps:
Step 1: preprocess the acquired hyperspectral remote sensing image X to obtain superpixels, the corresponding labels, and a segmentation matrix;
Step 2: based on the segmentation matrix, establish a spatial adjacency matrix and a spectral adjacency matrix from multiple perspectives to obtain multiple views; the spectral adjacency matrix is obtained using Euclidean distance with K nearest neighbors, wherein the distance term is the Euclidean distance between node i and every other node j, K denotes the neighbor filtering applied to every node, the neighbor set is the set of the first K nodes closest to node i in Euclidean distance, the two feature terms are the features of the nodes in the graph, and the sorting function sorts the elements of a matrix in ascending order and returns the array indices of the elements in the sorted sequence; the spatial adjacency matrix is constructed using a self-connected spatial second-order nearest neighbor model, wherein a 3×3 template is used to take the maximum value a and the minimum value b of the segmentation matrix, which yields the adjacency matrix based on spatial first-order nearest neighbors, and the self-connection term is the identity matrix;
Step 3: input the multiple views into the adaptive data augmentation module to obtain deep features for classification; the adaptive data augmentation module includes a spectral graph adaptive data augmentation branch and a spatial graph adaptive data augmentation branch arranged in parallel, and each of the two branches consists of a cascaded edge selection layer and feature masking layer; the edge selection layer first uses a node centrality metric to measure the importance of each node and then uses the average of the metrics of the two endpoints of an edge to represent the importance of that edge; the edges are then sorted by importance, the top K nodes are selected and their edge relationships are retained, and the remaining edges are deleted; the feature masking layer first sorts the nodes by importance, then sets a discard probability, and sets the feature mask of unimportant nodes to 0 to achieve data augmentation.

2. The hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning according to claim 1, characterized in that: in step 1, principal component analysis (PCA) is used to perform dimensionality reduction and noise reduction on the hyperspectral remote sensing image.

3. The hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning according to claim 1, characterized in that: in step 3, the obtained multi-view A first undergoes adaptive edge discarding to obtain the edge matrix, wherein the edge count represents the number of edges, and coo_matrix and vstack are two matrix transformation functions available in Python; the importance metric of the edge between nodes a and b is then calculated, wherein the two degree terms are obtained from the degree matrix for node a and for node b respectively, and the subscript e is an abbreviation of "edge"; the normalized edge ranking index is then calculated, wherein the two statistics are the maximum and the average of the edge importance metric, the pairwise term is the metric between nodes a and b, the single-node term is the metric of a specific node, the sorting function sorts the elements of a matrix in ascending order and returns the array indices of the elements in the sorted sequence, and its output is the array index of all edge weights; the edges are sorted by importance, the edges with high weights are kept and the unimportant edges are removed, in order to reduce memory use and improve efficiency, wherein the number of discarded edges is calculated from a hyperparameter representing the drop probability and from the number of edges; the multiple views are then passed through an adaptive node feature mask in a cascaded manner, wherein the node importance metric is calculated from the feature vector of each node and the degree matrix of the nodes; the node features in the graph are ranked by importance to obtain the normalized ranking indices, wherein the two statistics are the maximum and the average of the node importance metric; unimportant node features are then removed according to the discard probability, wherein the probability term is the probability of discarding node features, the dimension term is the number of bands after PCA dimensionality reduction, and the feature term is the feature of node i.

4. The hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning according to any one of claims 1-3, characterized in that: the adaptively enhanced contrastive learning model is used for feature extraction; during training, the NT-Xent loss function is used for backpropagation, wherein the similarity function is the cosine similarity, the two embeddings are the spatial and spectral features after the MLP, the temperature term is the temperature parameter, the count term is the number of nodes, and the pairwise term is the similarity between the spatial branch and the spectral branch corresponding to node i.

5. A hyperspectral remote sensing image classification system based on unsupervised multi-view graph contrastive learning, characterized in that it includes the following modules:
a hyperspectral remote sensing image preprocessing module, used to preprocess the acquired hyperspectral remote sensing image X to obtain superpixels, the corresponding labels, and a segmentation matrix;
a multi-view construction module, used to build a spatial adjacency matrix and a spectral adjacency matrix from multiple perspectives based on the segmentation matrix, thereby obtaining multiple views; the spectral adjacency matrix is obtained using Euclidean distance with K nearest neighbors, wherein the distance term is the Euclidean distance between node i and every other node j, K denotes the neighbor filtering applied to every node, the neighbor set is the set of the first K nodes closest to node i in Euclidean distance, the two feature terms are the features of the nodes in the graph, and the sorting function sorts the elements of a matrix in ascending order and returns the array indices of the elements in the sorted sequence; the spatial adjacency matrix is constructed using a self-connected spatial second-order nearest neighbor model, wherein a 3×3 template is used to take the maximum value a and the minimum value b of the segmentation matrix, which yields the adjacency matrix based on spatial first-order nearest neighbors, and the self-connection term is the identity matrix;
a hyperspectral remote sensing image classification module, used to input the multiple views into an adaptively enhanced contrastive learning model to obtain deep features for classification; the adaptive data augmentation module includes a spectral graph adaptive data augmentation branch and a spatial graph adaptive data augmentation branch arranged in parallel, and each of the two branches consists of a cascaded edge selection layer and feature masking layer; the edge selection layer first uses a node centrality metric to measure the importance of each node and then uses the average of the metrics of the two endpoints of an edge to represent the importance of that edge; the edges are then sorted by importance, the top K nodes are selected and their edge relationships are retained, and the remaining edges are deleted; the feature masking layer first sorts the nodes by importance, then sets a discard probability, and sets the feature mask of unimportant nodes to 0 to achieve data augmentation.

6. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning as described in any one of claims 1 to 4.

7. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning as described in any one of claims 1 to 4.

8. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the hyperspectral remote sensing image classification method based on unsupervised multi-view graph contrastive learning as described in any one of claims 1 to 4.
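The multi-view construction recited in claim 1 can be illustrated with a small NumPy sketch. It is an illustration under stated assumptions, not the patented implementation: all function names are hypothetical, the K-nearest-neighbor graph is left unsymmetrized, superpixel labels are assumed to run from 0 to n-1, and "self-connected second-order" is read here as first-order links plus two-hop links plus the identity matrix.

```python
import numpy as np


def spectral_knn_adjacency(features, k):
    """Link each node to its k nearest nodes in Euclidean feature distance."""
    x = np.asarray(features, dtype=float)
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)         # a node is not its own neighbor
    order = np.argsort(dist, axis=1)       # ascending distance, returns indices
    neighbors = order[:, :k]
    adj = np.zeros(dist.shape, dtype=np.int8)
    rows = np.repeat(np.arange(x.shape[0]), k)
    adj[rows, neighbors.ravel()] = 1
    return adj


def spatial_first_order_adjacency(segments):
    """Link the max and min superpixel labels seen in each 3x3 window."""
    seg = np.asarray(segments)
    n = int(seg.max()) + 1                 # ASSUMPTION: labels are 0..n-1
    adj = np.zeros((n, n), dtype=np.int8)
    height, width = seg.shape
    for r in range(height - 2):
        for c in range(width - 2):
            window = seg[r:r + 3, c:c + 3]
            a, b = int(window.max()), int(window.min())
            if a != b:                     # window straddles a superpixel border
                adj[a, b] = adj[b, a] = 1
    return adj


def spatial_second_order_adjacency(first_order):
    """ASSUMPTION: one-hop links + two-hop links + identity (self-connections)."""
    a1 = np.asarray(first_order, dtype=np.int64)
    reach = a1 + a1 @ a1 + np.eye(a1.shape[0], dtype=np.int64)
    return (reach > 0).astype(np.int8)
```

Given superpixel features `feats` of shape (n, d) and a label map `segments` of shape (H, W), the two views would be `spectral_knn_adjacency(feats, k)` and `spatial_second_order_adjacency(spatial_first_order_adjacency(segments))`.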
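The edge selection and feature masking layers of claims 1 and 3 can be sketched in the same way. This sketch assumes node degree as the centrality metric, a dense symmetric 0/1 adjacency matrix instead of the coo_matrix edge list, and a deterministic rule that discards the lowest-ranked fraction. The function names and both drop-probability parameters are hypothetical.

```python
import numpy as np


def drop_unimportant_edges(adj, edge_drop_prob):
    """Remove the least important fraction of undirected edges.

    Edge importance is the mean degree of the two endpoints
    (ASSUMPTION: degree stands in for the node centrality metric).
    """
    adj = np.asarray(adj)
    degree = adj.sum(axis=1)
    src, dst = np.nonzero(np.triu(adj, k=1))        # each undirected edge once
    importance = (degree[src] + degree[dst]) / 2.0
    n_drop = int(edge_drop_prob * len(src))         # number of edges to discard
    order = np.argsort(importance, kind="stable")   # ascending: weakest first
    keep = order[n_drop:]
    out = np.zeros_like(adj)
    out[src[keep], dst[keep]] = 1
    out[dst[keep], src[keep]] = 1
    return out


def mask_unimportant_node_features(features, adj, feature_drop_prob):
    """Zero the feature vectors of the lowest-degree fraction of nodes."""
    features = np.asarray(features, dtype=float)
    degree = np.asarray(adj).sum(axis=1)
    n_mask = int(feature_drop_prob * features.shape[0])
    order = np.argsort(degree, kind="stable")       # least important nodes first
    masked = features.copy()
    masked[order[:n_mask]] = 0.0                    # feature mask set to 0
    return masked
```

Applying both functions to the spectral view and to the spatial view separately would give the two parallel augmentation branches described in claim 1. A stochastic variant that samples which edges and features to drop is equally compatible with the wording of the claims.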
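The NT-Xent loss named in claim 4 can be sketched as follows. This is a forward-pass illustration in NumPy, assuming a cross-view form in which the spatial and spectral embeddings of the same node form the positive pair and the spectral embeddings of all other nodes act as negatives. The exact form used in the patent is not recoverable from the claim text; the function name and the default temperature are hypothetical.

```python
import numpy as np


def nt_xent_cross_view(z_spatial, z_spectral, temperature=0.5):
    """NT-Xent loss between two views of the same N nodes.

    ASSUMPTION: row i of z_spatial and row i of z_spectral describe the
    same node, and only cross-view pairs are contrasted.
    """
    a = np.asarray(z_spatial, dtype=float)
    b = np.asarray(z_spectral, dtype=float)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)   # unit vectors, so the dot
    b = b / np.linalg.norm(b, axis=1, keepdims=True)   # product is cosine similarity
    logits = (a @ b.T) / temperature                   # (N, N) scaled similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))          # positives lie on the diagonal
```

The loss is small when each node's two embeddings agree with each other more than with the embeddings of other nodes. In a training setup the same computation would be written in an automatic-differentiation framework so that it can be backpropagated.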