A structural-topology-based missing-completion deep multi-view subspace clustering method

The method combines inter-view consistency with cross-view complementarity and uses a self-expressive contrastive alignment module to suppress the noise introduced by missing data in multi-view clustering, enabling efficient clustering of incomplete multi-view data.

CN122244480A · Pending · Publication Date: 2026-06-19 · SOUTHWEST UNIV OF SCI & TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SOUTHWEST UNIV OF SCI & TECH
Filing Date
2026-04-07
Publication Date
2026-06-19

Figure: CN122244480A_ABST (abstract drawing)

Abstract

This invention relates to a structural-topology-based missing-completion deep multi-view subspace clustering method, comprising: inputting multiple incomplete views into a multi-view subspace clustering network model to obtain clustering results, the model being obtained through pre-training and fine-tuning on a training set; extracting the latent features of each incomplete view with a view-specific encoder-decoder of the model and reconstructing the features of the incomplete view; using a similarity-based missing feature completion module to complete the latent features in the embedding space from view-specific complete features and cross-view common complete features; learning, through a self-expression layer, a view-specific self-expression coefficient matrix on the completed latent features; performing consistency alignment on the self-expression layers of all views with a self-expressive contrastive alignment module, and fusing the aligned self-expression coefficient matrices of the views to obtain a multi-view consensus self-expression relation matrix; and finally applying spectral clustering to obtain the clustering results.

Description

Technical Field

[0001] This invention relates to the field of subspace clustering technology, and in particular to a structural-topology-based missing-completion deep multi-view subspace clustering method.

Background Technology

[0002] Three logically progressive methods for incomplete multi-view subspace clustering with noise contamination have been proposed. However, in practical applications, multi-view data is often incomplete due to sensor instability or acquisition costs, limiting the applicability of existing multi-view clustering methods. In practice, samples with missing items cannot be arbitrarily removed, nor can they be simply filled with specific values (such as 0 or 1), as these methods may affect clustering results by discarding key information or introducing statistical noise. To address this, various Incomplete Multi-View Clustering (IMVC) methods have been proposed, broadly categorized as multi-kernel learning, matrix factorization, graph learning, subspace learning, and deep learning-based methods. Among these, Deep Incomplete Multi-View Clustering (DIMVC) has attracted significant attention due to its ability to effectively model and process complex high-dimensional nonlinear data. DIMVC methods are mainly divided into two categories: imputation-based and imputation-free.

[0003] Imputation-based methods complete missing data before clustering using data recovery techniques. Common techniques employ Generative Adversarial Networks (GANs) to reconstruct missing data or to generate shared latent representations through adversarial learning. Other methods fill in missing data based on data structure, for example by using soft cluster assignments and a cross-view completion framework with global cluster centers to reconstruct the latent features of missing views. To improve imputation quality, existing solutions propose a dual-mechanism approach that combines K-Nearest Neighbors (KNN) view completion with a view-class interactive Transformer, optimizing imputation by leveraging complementary information between multi-view representations and classification semantics. Furthermore, diffusion models have been introduced for missing data prediction and combined with data augmentation to maintain generation quality under high missing rates. While imputation-based methods provide a practical solution for incomplete multi-view data, their reconstruction fidelity depends heavily on the completeness of the observable views. These methods are prone to introducing approximation noise and reducing efficiency during completion, especially under high view missing rates. This sensitivity can propagate completion bias and ultimately degrade clustering performance.

[0004] Imputation-free methods mitigate the impact of missing data on clustering performance by unifying the latent representations of incomplete data across different views. These methods primarily employ contrastive learning frameworks. One approach achieves cross-view consistency through a contrastive prediction mechanism. Another develops a noise-resistant contrastive loss function to reduce the impact of false negatives. Another combines Graph Convolutional Networks (GCNs) with instance-level contrastive learning to enhance feature representations when handling missing data. Another proposes a triple cross-view alignment method based on prototype, feature, and cluster-assignment contrastive learning, effectively preserving multi-view consistency. Other methods use the inherent information of incomplete multi-view data for supervision, projecting view embedding features into a high-dimensional weighted space and leveraging inter-view complementarity as high-confidence supervision to mine cross-view consistency. Still others employ adaptive feature projection to unify multi-view data into a common space while achieving view alignment by minimizing mean differences; high-quality pseudo-labels are generated from the complete subset of view data, and local and global propagation networks align feature representations with cluster assignments to provide guidance. While these imputation-free methods reduce the noise or bias of the data completion process, they often overlook the crucial impact of view-specific discrimination on learning consistent representations. This leads to inefficient use of cross-view consensus information, degrades clustering performance, and limits inherent interpretability and theoretical rigor.

[0005] Although deep subspace clustering methods for complete multi-view data have been extensively studied, existing methods cannot directly handle incomplete data because they cannot construct self-representation matrices among multi-view samples with missing values. Currently, only one self-supervised framework based on subspace learning has been proposed for incomplete multi-view clustering. However, this method fails to adequately consider the discriminative contributions of different views and the impact of missing view data on the learning of common subspace representations.

Summary of the Invention

[0006] To address the problems in the prior art, the present invention aims to provide a structural-topology-based missing-completion deep multi-view subspace clustering method, which constructs a joint framework and combines fast missing-value completion to integrate inter-view consistency with cross-view complementarity.

[0007] To achieve the above objectives, the present invention provides the following solution: a structural-topology-based missing-completion deep multi-view subspace clustering method, including: obtaining multiple incomplete views and inputting them into a multi-view subspace clustering network model to obtain clustering results, the model being obtained by pre-training and fine-tuning on a training set; extracting the latent features of each incomplete view with a view-specific encoder-decoder of the model and reconstructing the features of the incomplete view; completing the latent features in the embedding space with a similarity-based missing feature completion module, based on view-specific complete features and cross-view common complete features; learning, through a self-expression layer, a view-specific self-expression coefficient matrix on the completed latent features; performing consistency alignment on the self-expression layers of all views with the self-expressive contrastive alignment module, and fusing the aligned self-expression coefficient matrices of the views to obtain a multi-view consensus self-expression relation matrix; and finally applying spectral clustering to obtain the clustering results.

[0008] Optionally, the reconstruction loss of the encoder-decoder is computed, over the number of views, as the F-norm (Frobenius norm) of the difference between the original view and the encoder-decoder output.

[0009] Optionally, completing the latent features in the embedding space includes: dividing the latent features of any view into three feature categories, namely specific complete features that exist only in the target view, missing features of the view, and common complete features of the view; constructing a common complete feature space for the multiple incomplete views and, for a view with missing features, searching the common complete feature space of another view for the index set of its K most similar nearest neighbors, thereby obtaining the topological relationship between the views; and, based on the topological relationship, extracting the neighbor features indexed by that K-nearest-neighbor index set from the common complete features and filling them into the view with missing features. In the corresponding formulas, the terms denote the K-nearest-neighbor index set, the sorting function, the distance relationship, the view-specific complete features, the number of nearest neighbors K, the common complete feature space of view i, the complete features of view i, and the missing features.

[0010] Optionally, the self-expression loss function for all views in the self-expression layer is defined in terms of the latent features and the self-expression coefficient matrix of each view.

[0011] Optionally, performing consistency alignment on the self-expression layers of all views and fusing the aligned self-expression coefficient matrices includes: identifying positive and negative sample pairs in the multi-view self-expression coefficient matrices, where the self-expression coefficients corresponding to the same sample under different views are regarded as positive sample pairs, and the coefficients corresponding to different samples within the same view or between different views are regarded as negative sample pairs; with the goal of maximizing the similarity of positive sample pairs while minimizing the similarity of negative sample pairs, increasing the coefficients between samples of the same class and suppressing the coefficients between samples of different classes, resulting in multiple aligned view-specific self-expression coefficient matrices; and adaptively fusing the aligned matrices. Specifically, all aligned view-specific self-expression coefficient matrices are stacked along the channel dimension into a third-order tensor. Global average pooling compresses the third-order tensor along the channel dimension to obtain a global descriptor for each view's self-expression coefficient matrix. A two-layer bias-free fully connected network captures the relationships among the views' self-expression coefficient matrices to obtain a self-weighted self-expression tensor. The self-weighted self-expression tensor is then fused using convolution kernels to obtain the multi-view consensus self-expression relation matrix.

[0012] Optionally, the self-expressive contrastive loss between two views in the self-expressive contrastive alignment module is defined in terms of: the total number of samples; an arbitrary sample; the similarity between the self-expression coefficients of positive and negative sample pairs; the self-expression coefficients of the individual samples in each of the two views being compared; a temperature parameter; and the exponential function.

[0013] Optionally, the self-expressive contrastive loss accumulated across views for all samples in the self-expressive contrastive alignment module is obtained by accumulating the self-expressive contrastive loss between views.

[0014] Optionally, pre-training and fine-tuning using the training set includes: setting a first objective function, defined in terms of the latent features, the original view, the reconstruction of the original incomplete view, and the encoder network parameters; during the pre-training phase, using the first objective function as the optimization objective, training the multi-encoder-decoder, without missing value imputation and without self-expression layers, on the training set; setting a second objective function that combines the reconstruction loss, the self-expression loss function for all views, and the self-expressive contrastive loss accumulated across all views for all samples, and that is defined in terms of the original view, the encoder output, the self-expression coefficient matrix of each view, the consensus self-expression coefficient matrix of all views, the balancing parameters, the completed latent features of each view, the number of samples, the self-expression coefficients of the individual samples in each view, and the temperature coefficient; and, during the fine-tuning phase, using the second objective function as the optimization target, integrating the reconstruction loss, the multi-view self-expression loss, and the self-expressive contrastive loss to optimize the complete multi-encoder-decoder.

[0015] Optionally, obtaining the clustering results includes: setting a third objective function, defined in terms of the trace function, the clustering indicator matrix and its transpose, the Laplacian matrix, and the identity matrix; computing the affinity matrix from the multi-view consensus self-expression relation matrix; and optimizing the third objective function to obtain the clustering result.

[0016] The beneficial effects of this invention are as follows: this invention designs an end-to-end incomplete multi-view deep subspace clustering method that integrates K-nearest-neighbor-based missing feature completion, cross-view contrastive consistency alignment, and attention-driven complementary fusion. For the first time, it uses cross-view nearest neighbors to efficiently complete missing view features in a self-supervised manner. Through contrastive alignment consistency constraints, it effectively suppresses the interference caused by completion noise. Based on a self-expressive attention fusion mechanism, it fully mines cross-view complementary information, thereby achieving efficient clustering of incomplete multi-view data.

Description of the Drawings

[0017] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0018] Figure 1 is a network structure diagram of the structural-topology-based missing-completion deep multi-view subspace clustering method according to an embodiment of the present invention.

Detailed Implementation

[0019] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0020] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0021] As shown in Figure 1, this embodiment discloses a structural-topology-based missing-completion deep multi-view subspace clustering method, including: acquiring multiple incomplete views; inputting the multiple incomplete views into a multi-view subspace clustering network model to obtain clustering results, the model being obtained through pre-training and fine-tuning on a training set; extracting the latent features of each incomplete view with a view-specific encoder-decoder of the model and reconstructing the features of the incomplete view; using a missing feature completion module to complete the latent features in the embedding space based on view-specific complete features and cross-view common complete features; learning, through the self-expression layer, a view-specific self-expression coefficient matrix on the completed latent features; using a self-expressive contrastive alignment module to perform consistency alignment on the self-expression layers of all views, and fusing the aligned self-expression coefficient matrices of the views to obtain a multi-view consensus self-expression relation matrix; and finally applying spectral clustering to obtain clustering results. More specifically: Autoencoder reconstruction technique: An encoder-decoder pair is designed for each view. For simplicity, the encoder and the decoder are implemented as fully connected neural networks. The encoder extracts view-specific latent features from the original incomplete view, preserving the missing items, and the decoder reconstructs the original incomplete view. Specifically, because the original multi-view data is incomplete, this invention first obtains a missing-item indicator matrix that records the state of each sample. For the v-th view, the original view is fed to the view-specific encoder, and the latent features are the output of the encoder under its encoder parameters. Incomplete features are then filled in the embedding space, and the completed latent features of the view are used as the input of the decoder, which has its own decoder parameters. Next, the decoder output is combined with the missing indicator matrix to reconstruct the original incomplete view, where the column vector of the indicator matrix for the view reflects the state of the view's samples (whether they are missing). Finally, the reconstruction loss between the original view and the autoencoder output is minimized to obtain the embedding features of each view. To simplify parameter configuration and achieve a unified network, the encoders and decoders designed in this invention use the same number of neurons in the hidden and output layers, so the deepest latent features of all views have the same dimension. The incomplete multi-view reconstruction loss is accordingly accumulated over all views.
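The reconstruction step described above can be illustrated with a minimal NumPy sketch. It is not the patent's formula, which is missing from this text. It assumes samples are stored as rows, that each view has a 0/1 indicator vector (one column of the missing indicator matrix), and that the loss is the squared Frobenius norm summed over views; the function name `masked_reconstruction_loss` is hypothetical.

```python
import numpy as np

def masked_reconstruction_loss(views, recons, masks):
    """Sum over views of the squared Frobenius norm between each original
    view and its reconstruction, restricted to observed samples.

    views[v], recons[v]: arrays of shape (N, d_v), one row per sample.
    masks[v]: array of shape (N,), 1 if the sample is observed in view v, else 0.
    The squared norm and the row layout are assumptions of this sketch.
    """
    total = 0.0
    for x, x_hat, m in zip(views, recons, masks):
        keep = m[:, np.newaxis]           # broadcast the indicator over feature columns
        diff = x * keep - x_hat * keep    # missing samples contribute zero error
        total += float(np.sum(diff ** 2))
    return total
```

Masking both terms means that whatever placeholder value is stored for a missing sample has no influence on the loss.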

[0022] KNN-based Missing Feature Completion Technique: Completing the missing latent features is crucial for capturing the similarity relationships between samples from different views. Inspired by neighborhood-based completion methods, this invention designs a similarity-based missing feature completion module built on the K-Nearest Neighbors (KNN) method in the latent space, aiming to achieve efficient feature completion. In this module, the latent features of any view v are divided into three categories: specific complete features that exist only in the v-th view, missing features of the v-th view, and common complete features shared across views. The sample counts satisfy N = N_O + N_M + N_C, where N is the total number of data samples, N_O is the number of complete samples specific to the view, N_M is the number of samples with missing data, and N_C is the number of common complete samples across views. This invention uses the cross-view correlation between view-specific complete features and common complete features to fill in missing features.

[0023] This invention first uses the missing indicator matrix and the latent features to identify the three types of latent features across the views. Taking any two views i and j as an example, a common complete feature space is constructed from the outputs of their view-specific encoders, and the view-specific complete features of view i and view j are denoted separately. Next, for a view with missing features, this invention finds the index set of its K most similar nearest neighbors within the common complete feature space of another view. These index sets preserve the topological relationships between the views. For any missing feature, the element with the same topological relationship in the other view is used to determine its KNN index set, which is obtained by applying the sorting function to the distance relationship between that element and the common complete features; K is the number of neighbors.

[0024] Then, based on the consistent topological relationships between different views, this invention uses these K-nearest-neighbor index sets to extract the corresponding neighbor features from the common complete features and uses them to fill in the missing features; in the corresponding formula, K is the number of nearest neighbors.

[0025] After missing feature imputation, the complete latent features are obtained by concatenating the view-specific complete features, the filled missing features, and the common complete features. The missing feature imputation module relies on the topological relationships in the view sample space and has no trainable parameters, which keeps training stable and efficient. Furthermore, it imputes missing features dynamically: the filled features are updated automatically during forward propagation as the latent feature representations extracted by the encoder-decoder are optimized.
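The cross-view KNN completion described in the preceding paragraphs can be sketched as follows. This is a hedged illustration: the source formulas are missing, so the Euclidean distance, the averaging of the K neighbor features, the row-per-sample layout, and the function name `knn_complete` are all assumptions.

```python
import numpy as np

def knn_complete(z_i, z_j, observed_i, observed_j, k=3):
    """Fill latent features missing in view i using neighbors found in view j.

    z_i, z_j: (N, d) latent features of views i and j, one row per sample.
    observed_i, observed_j: boolean (N,) arrays, True where the sample exists.
    For each sample missing in view i but present in view j, the K nearest
    common-complete samples are searched in view j's latent space, and the
    same indices are read from view i (shared topology). Averaging the
    neighbor features is an assumption of this sketch.
    """
    common = np.flatnonzero(observed_i & observed_j)     # samples complete in both views
    missing = np.flatnonzero(~observed_i & observed_j)   # samples to fill in view i
    z_filled = z_i.copy()
    for m in missing:
        dists = np.linalg.norm(z_j[common] - z_j[m], axis=1)
        neighbors = common[np.argsort(dists)[:k]]        # K-nearest-neighbor index set
        z_filled[m] = z_i[neighbors].mean(axis=0)
    return z_filled
```

Because the routine has no trainable parameters, it can be re-run on every forward pass so that the filled rows track the current encoder outputs.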

[0026] Self-expressive contrastive alignment and discriminative fusion technique: After the missing features are completed, a self-expression layer is embedded between the completed latent features and the decoder to learn the representational relationships between samples in each view. This layer is implemented as a fully connected neural network without non-linear activation functions or bias terms. The network parameters of this layer are represented as a self-expression coefficient matrix, which makes this a subspace-oriented representation method. By minimizing the representation error between the input and output of the self-expression layer, the view-specific coefficient matrix is learned dynamically through backpropagation. To capture the diversity and discriminative power of multiple views, self-expression layers are employed across all views. The self-expression loss function for all views sums the representation error and a regularization term over the views. The last term is the regularization loss term, calculated using the Frobenius norm, which applies a subspace-oriented block-diagonal structure constraint; the accompanying constraint on the coefficient matrix aims to prevent trivial solutions.
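A minimal sketch of a self-expression loss of this kind is given below. It assumes the common form in which each sample is reconstructed as a linear combination of the other samples, with a Frobenius-norm regularizer and a zero-diagonal constraint to exclude the trivial identity solution. The exact formula is missing from this text, so the weight `lam`, the row-per-sample layout, and the function name are assumptions.

```python
import numpy as np

def self_expression_loss(latents, coeffs, lam=1.0):
    """Sum over views of ||Z - C Z||_F^2 + lam * ||C||_F^2 with diag(C) = 0.

    latents[v]: (N, d) completed latent features of view v, one row per sample.
    coeffs[v]:  (N, N) self-expression coefficient matrix of view v.
    The zero diagonal (assumed here) stops a sample from representing itself.
    """
    total = 0.0
    for z, c in zip(latents, coeffs):
        c0 = c - np.diag(np.diag(c))                  # enforce diag(C) = 0
        residual = z - c0 @ z                         # representation error
        total += float(np.sum(residual ** 2) + lam * np.sum(c0 ** 2))
    return total
```

In a trained network the coefficient matrix would be the weight of a bias-free linear layer; here it is passed in as a plain array.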

[0027] Noise is inevitably introduced when missing features are completed, and views also carry meaningless private information; both affect the consistency of the cross-view self-expression relationships. To solve this problem, this invention designs a contrastive alignment module that enforces consistency across the self-expression layers of all views. Aligning the self-expression relationships (rather than the completed latent features) is based on two considerations: (1) self-expression coefficient matrices capture similarity relationships between samples, which is more directly applicable to clustering tasks; (2) the consistency objective on self-expression coefficient matrices helps to explore complementary information across views.

[0028] First, positive and negative sample pairs are identified in the multi-view self-expression coefficient matrices. Self-expression coefficients corresponding to the same sample in different views are considered positive sample pairs, while coefficients corresponding to different samples within the same view or between different views are considered negative sample pairs. Taking the self-expression coefficients of a single sample as an example, the numbers of positive and negative sample pairs are counted separately. The self-expressive contrastive alignment module aims to maximize the similarity of positive sample pairs while minimizing the similarity of negative sample pairs. To achieve this, it enhances the coefficients between samples of the same class and suppresses the coefficients between samples of different classes. By mining cross-view consistency and filtering out view-specific information and imputation noise, the effectiveness of clustering and segmentation is improved.
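The positive and negative pairing above resembles an InfoNCE-style contrastive loss. The sketch below is one plausible instantiation, not the patent's formula (which is missing from this text): it treats each row of a coefficient matrix as one sample's coefficient vector, uses cosine similarity with a temperature `tau`, and averages over samples. All names are hypothetical.

```python
import numpy as np

def _normalize_rows(c, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return c / np.maximum(np.linalg.norm(c, axis=1, keepdims=True), eps)

def pairwise_contrastive_loss(c_i, c_j, tau=0.5):
    """Contrastive loss between the coefficient matrices of views i and j.

    Positive pair: row k of view i with row k of view j (same sample).
    Negative pairs: row k of view i with every other row of view i and view j.
    """
    a, b = _normalize_rows(c_i), _normalize_rows(c_j)
    n = a.shape[0]
    cross = np.exp(a @ b.T / tau)         # similarities across the two views
    within = np.exp(a @ a.T / tau)        # similarities inside view i
    off_diag = ~np.eye(n, dtype=bool)
    pos = np.diag(cross)
    neg = (cross * off_diag).sum(axis=1) + (within * off_diag).sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg))))

def total_contrastive_loss(coeffs, tau=0.5):
    """Accumulate the pairwise loss over all ordered pairs of distinct views."""
    total = 0.0
    for i in range(len(coeffs)):
        for j in range(len(coeffs)):
            if i != j:
                total += pairwise_contrastive_loss(coeffs[i], coeffs[j], tau)
    return total
```

With this form, coefficient matrices that agree across views give a lower loss than matrices whose rows are permuted relative to each other.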

[0029] To achieve this goal, the present invention introduces a contrastive loss to optimize the module. Given the i-th and j-th views, the self-expressive contrastive loss between them is defined over the samples n and m of the views, with a temperature parameter and with similarity terms for the positive and negative pairs of coefficients, respectively. This module uses cosine distance to measure similarity. For a dataset with V views and N samples, the cumulative self-expressive contrastive loss is accumulated over all samples across all views. After consistency alignment of the cross-view self-expression relation matrices, a discriminative complementarity technique based on channel attention adaptively fuses these matrices, further exploring the diversity and complementarity across views. The specific implementation is as follows. First, the view-specific self-expression coefficient matrices of all views are stacked along the channel dimension into a third-order tensor. Next, Global Average Pooling (GAP) compresses the self-expression tensor along the channel dimension, yielding a global descriptor for each channel, i.e., for each view's coefficient matrix. To make full use of each channel's global descriptor, a two-layer bias-free fully connected network captures the relationships among the views' self-expression coefficient matrices and produces activation weights between 0 and 1. The first fully connected (FC) layer reduces the dimensionality by a dimensionality reduction factor and uses the ReLU activation function; the second FC layer restores the dimensionality and uses the Sigmoid activation function. The resulting channel weights rescale the self-expression tensor to give the self-weighted self-expression tensor. Finally, the self-weighted self-expression tensor is fused with convolution kernels of a specified size to obtain a high-quality multi-view consensus self-expression relation matrix with maximum complementarity.
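The channel-attention fusion described above follows the general pattern of a squeeze-and-excitation block. A minimal NumPy sketch is given below under stated assumptions: the FC weights `w1` and `w2` are supplied by the caller rather than learned, and the final convolution is simplified to a 1x1 kernel (one scalar per view), since the kernel size in the source is not recoverable. The function name is hypothetical.

```python
import numpy as np

def channel_attention_fuse(coeffs, w1, w2, kernel):
    """Fuse V aligned (N, N) coefficient matrices into one consensus matrix.

    coeffs: list of V arrays of shape (N, N).
    w1: (V // r, V) weights of the first bias-free FC layer (reduction, ReLU).
    w2: (V, V // r) weights of the second bias-free FC layer (restore, Sigmoid).
    kernel: (V,) weights of a 1x1 convolution across channels (assumed size).
    Returns the fused (N, N) matrix and the per-view attention weights.
    """
    tensor = np.stack(coeffs, axis=0)                    # (V, N, N), one channel per view
    descriptor = tensor.mean(axis=(1, 2))                # global average pooling -> (V,)
    hidden = np.maximum(w1 @ descriptor, 0.0)            # FC + ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))       # FC + Sigmoid, values in (0, 1)
    reweighted = tensor * weights[:, None, None]         # self-weighted tensor
    fused = np.tensordot(kernel, reweighted, axes=(0, 0))  # 1x1 conv across channels
    return fused, weights
```

The attention weights let views with more informative coefficient matrices contribute more to the consensus matrix before the channels are merged.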

[0030] Model Construction and Optimization: The proposed deep incomplete multi-view subspace clustering (DIMVSC) method is optimized in two stages: a pre-training stage and a fine-tuning stage. In the pre-training stage, a multi-encoder-decoder pipeline without missing value imputation and without self-expression layers extracts multi-view latent features and is optimized with the reconstruction loss. This stage initializes the network parameters of the view-specific encoders and decoders while obtaining initial latent features; its objective function is the reconstruction objective. In the fine-tuning stage, this invention optimizes the overall network by integrating the reconstruction loss with missing feature imputation, the multi-view self-expression loss, the regularization term, and the self-expressive contrastive loss to obtain a cross-view consensus self-expression coefficient matrix; the objective function of DIMVSC becomes the weighted combination of these terms. After the consensus self-expression coefficient matrix of all views is obtained, spectral clustering is used to generate the clustering results. First, the affinity matrix is computed from the consensus self-expression coefficient matrix. The Laplacian matrix is then defined from the affinity matrix and a diagonal matrix whose diagonal elements are derived from the affinity matrix. Subsequently, the clustering results are obtained by solving a relaxed optimization problem over the clustering indicator matrix, which indicates how the N samples are divided into G clusters. A constraint is applied to ensure that each sample is assigned to only one cluster.
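The final spectral clustering step can be sketched as below. The formulas are missing from this text, so the sketch assumes the common choices: a symmetrized absolute-value affinity, the unnormalized Laplacian (degree matrix minus affinity), the relaxed trace-minimization solution given by the eigenvectors of the G smallest eigenvalues, and a small k-means with deterministic farthest-point initialization for discretization. Function names are hypothetical.

```python
import numpy as np

def spectral_embedding(consensus, n_clusters):
    """Relaxed solution of min Tr(F^T L F) s.t. F^T F = I (assumed form)."""
    affinity = 0.5 * (np.abs(consensus) + np.abs(consensus).T)  # symmetric, non-negative
    degree = np.diag(affinity.sum(axis=1))                       # diagonal degree matrix
    laplacian = degree - affinity                                # unnormalized Laplacian
    _, eigvecs = np.linalg.eigh(laplacian)                       # eigenvalues ascending
    return eigvecs[:, :n_clusters]

def kmeans(x, n_clusters, n_iter=50):
    """Tiny k-means with farthest-point initialization (deterministic)."""
    centers = [x[0]]
    for _ in range(1, n_clusters):
        d2 = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            x[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
            for g in range(n_clusters)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

def spectral_cluster(consensus, n_clusters):
    """Cluster samples from a consensus self-expression relation matrix."""
    return kmeans(spectral_embedding(consensus, n_clusters), n_clusters)
```

A block-diagonal consensus matrix, which is what the self-expression constraints aim for, yields an embedding in which samples of the same block coincide, so the k-means step separates them cleanly.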

[0031] To address incomplete multi-view data with missing information, this invention proposes a deep incomplete multi-view subspace clustering method based on K-nearest-neighbor missing feature completion. It uses an autoencoder to extract latent features from each view. Starting from the premise that different views of the same sample should have similar latent features, it uses the K-nearest-neighbor method in the latent space to complete missing features by leveraging cross-view correlations between view-specific complete features and common complete features. It then constructs a self-expression network layer for each view to learn view-specific self-expression relation matrices and uses contrastive learning for cross-view consistency alignment. The self-expression relation matrices of the views are fused with a channel attention mechanism to obtain a high-quality incomplete multi-view consensus self-expression relation matrix, and clustering segmentation is then completed with spectral clustering. Experimental results show that the DIMVSC method outperforms state-of-the-art incomplete multi-view clustering methods under most missing rates on six datasets. In particular, the average improvement in clustering metrics on the LandUse-21, Synthetic3d, CUB, and HandWritten datasets is approximately 7.88%, 20.47%, 14.79%, and 20.22%, respectively, demonstrating the clustering performance of the proposed method. The universality and effectiveness of the proposed method are verified through parameter sensitivity analysis and convergence analysis experiments.

[0032] The embodiments described above are merely preferred embodiments of the present invention and are not intended to limit the scope of the present invention. Various modifications and improvements made to the technical solutions of the present invention by those skilled in the art without departing from the spirit of the present invention should fall within the protection scope defined by the claims of the present invention.

Claims

1. A structural topology-based missing-completion deep multi-view subspace clustering method, characterized by comprising: obtaining multiple incomplete views, inputting the multiple incomplete views into a multi-view subspace clustering network model, and obtaining clustering results; wherein the multi-view subspace clustering network model is obtained through pre-training and fine-tuning using a training set; in the multi-view subspace clustering network model, a single encoder-decoder extracts the latent features of any incomplete view and reconstructs the features of that incomplete view; a missing feature completion module completes the latent features in the embedding space based on view-specific complete features and cross-view common complete features; a self-expression layer learns a view-specific self-expression coefficient matrix from the completed latent features; a self-expression contrastive alignment module performs consistency alignment on the self-expression layers of all views, and the aligned self-expression coefficient matrices of the views are fused to obtain a multi-view consensus self-expression relation matrix; and finally spectral clustering is used for clustering segmentation to obtain the clustering results.

2. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that the reconstruction loss of the encoder is: $L_{rec} = \sum_{v=1}^{V} \| X^{(v)} - \hat{X}^{(v)} \|_F^2$; where $X^{(v)}$ is the original view, $\hat{X}^{(v)}$ is the encoder output, $\|\cdot\|_F$ is the F-norm, and $V$ is the number of views.

3. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that completing the latent features in the embedding space comprises: dividing the latent features of any view into feature categories, the feature categories comprising: specific complete features that exist only in the target view, missing features of the view, and common complete features of the view; constructing a common complete feature space for the multiple incomplete views, and searching the common complete feature space of another view for the index set of the K nearest neighbors of the view with missing features, to obtain the topological relationship between the views; based on the topological relationship, extracting the neighbor features indexed by the K-nearest-neighbor index set from the corresponding common complete features and splicing them into the view with missing features: [two formulas not reproduced in the source text]; where the symbols denote, in order: the K-nearest-neighbor index set; the sorting function; the distance relation between two features; the complete features of a specific view; the number of nearest neighbors K; the common complete feature space of view i; the complete features of view i; and the missing features.
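
The completion step of claim 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed formulas, which are not reproduced in the source: Euclidean distance is assumed for the neighbor search, the missing latent feature is filled with the mean of the selected neighbor features, and the dictionary layout and function names are hypothetical.

```python
import math


def knn_indices(query, candidates, k):
    """Return the ids of the k candidates closest to `query` (Euclidean distance, assumed)."""
    ranked = sorted(candidates, key=lambda sid: math.dist(query, candidates[sid]))
    return ranked[:k]


def complete_missing(z_target, z_other, missing_ids, k):
    """Fill latent features that the target view lacks.

    z_target, z_other: dict mapping sample id -> latent vector, observed samples only.
    missing_ids: samples absent from the target view but observed in the other view.
    """
    # Common complete samples: observed in both views.
    common = [sid for sid in z_target if sid in z_other]
    other_common = {sid: z_other[sid] for sid in common}
    completed = dict(z_target)
    for m in missing_ids:
        # Search neighbors in the other view, where sample m is observed.
        neighbours = knn_indices(z_other[m], other_common, k)
        dim = len(z_target[neighbours[0]])
        # Transfer the neighbor structure: average the same samples' target-view features.
        completed[m] = [
            sum(z_target[sid][d] for sid in neighbours) / len(neighbours)
            for d in range(dim)
        ]
    return completed


# Sample 3 is missing in view A but present in view B.
view_a = {0: [0.0, 0.0], 1: [1.0, 1.0], 2: [10.0, 10.0]}
view_b = {0: [0.0], 1: [0.2], 2: [5.0], 3: [0.1]}
filled = complete_missing(view_a, view_b, missing_ids=[3], k=2)
```

The neighbor indices are found in the view where the sample is observed and then reused in the view where it is missing, which is the cross-view topological relationship the claim relies on.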

4. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that the self-expression loss function for all views in the self-expression layer is: $L_{se} = \sum_{v=1}^{V} \| Z^{(v)} - Z^{(v)} C^{(v)} \|_F^2$; where $Z^{(v)}$ is the latent feature and $C^{(v)}$ is the self-expression coefficient matrix.
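
As a small numerical illustration of the self-expression idea in claim 4, the sketch below evaluates the squared Frobenius residual of `Z - Z C` for one view. The squared-norm form and the convention that column `j` of `C` holds the coefficients used to rebuild sample `j` are assumptions, and `self_expression_loss` is a hypothetical name.

```python
def self_expression_loss(Z, C):
    """Squared Frobenius norm of Z - Z C for one view.

    Z: list of N latent vectors (one per sample).
    C: N x N coefficient matrix; column j rebuilds sample j from the other samples.
    """
    n, d = len(Z), len(Z[0])
    loss = 0.0
    for j in range(n):
        for t in range(d):
            recon = sum(C[i][j] * Z[i][t] for i in range(n))
            loss += (Z[j][t] - recon) ** 2
    return loss


# Samples 0 and 1 lie on the same line, so each can be expressed by the other.
Z = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
C = [
    [0.0, 2.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
loss_view = self_expression_loss(Z, C)
```

Summing this quantity over all views gives a loss for the whole self-expression layer. Samples that share a subspace can rebuild each other with small residual, so the learned coefficients tend to be large within a cluster and small across clusters.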

5. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that performing consistency alignment on the self-expression layers of all views and fusing the aligned self-expression coefficient matrices of the views comprises: identifying positive and negative sample pairs in the multi-view self-expression coefficient matrices, wherein the self-expression coefficients corresponding to the same sample under different views are regarded as a positive sample pair, and the coefficients corresponding to different samples within the same view or between different views are regarded as negative sample pairs; with the goal of maximizing the similarity of positive sample pairs while minimizing the similarity of negative sample pairs, increasing the coefficients between samples of the same class and suppressing the coefficients between samples of different classes, to obtain multiple aligned view self-expression coefficient matrices; and adaptively fusing the multiple aligned view self-expression coefficient matrices, wherein all aligned view self-expression coefficient matrices are stacked into a third-order tensor along the channel dimension; global average pooling compresses the third-order tensor along the channel dimension to obtain a global descriptor for each view self-expression coefficient matrix; a two-layer bias-free fully connected network captures the relationships between the view self-expression coefficient matrices to obtain a self-weighted self-expression tensor; and the self-weighted self-expression tensor is fused using convolution kernels to obtain the multi-view consensus self-expression relation matrix.
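
The adaptive fusion in claim 5 resembles a squeeze-and-excitation style channel attention. The sketch below is a hedged illustration only: the ReLU and sigmoid activations, the reduction of the convolution to a 1x1 kernel (one scalar per view), and the names `fuse_self_expression`, `W1`, `W2`, and `kernel` are assumptions not stated in the source.

```python
import math


def fuse_self_expression(mats, W1, W2, kernel):
    """Fuse V aligned N x N coefficient matrices into one consensus matrix.

    mats:   list of V matrices (the channels of the stacked tensor).
    W1:     hidden x V weights of the first bias-free layer.
    W2:     V x hidden weights of the second bias-free layer.
    kernel: V scalars acting as a 1x1 convolution over the channel dimension.
    """
    num_views, n = len(mats), len(mats[0])
    # Squeeze: global average pooling gives one descriptor per view.
    descriptors = [sum(sum(row) for row in M) / (n * n) for M in mats]
    # Excite: two bias-free fully connected layers (ReLU, then sigmoid; assumed).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, descriptors))) for row in W1]
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in W2
    ]
    # Reweight each channel, then mix the channels with the 1x1 kernel.
    fused = [
        [
            sum(kernel[v] * weights[v] * mats[v][i][j] for v in range(num_views))
            for j in range(n)
        ]
        for i in range(n)
    ]
    return fused, weights


C_view1 = [[0.0, 2.0], [4.0, 0.0]]
C_view2 = [[0.0, 6.0], [8.0, 0.0]]
consensus, view_weights = fuse_self_expression(
    [C_view1, C_view2],
    W1=[[0.0, 0.0]],      # untrained placeholder weights
    W2=[[0.0], [0.0]],
    kernel=[1.0, 1.0],
)
```

With the all-zero placeholder weights every view receives the same attention weight, so the fusion reduces to a plain average; trained weights would let more reliable views contribute more to the consensus matrix.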

6. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that the self-expression contrastive loss between views in the self-expression contrastive alignment module is: [formula not reproduced in the source text]; where the symbols denote, in order: the total number of samples; any sample; the similarity between the self-expression coefficients of a positive or negative sample pair; the self-expression coefficients of a sample in a view (two terms); the temperature parameter; the self-expression coefficients of a sample in a view (two further terms); and the exponential function.
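
Because the formula of claim 6 is not reproduced, the following Python sketch shows one common InfoNCE-style form that matches the listed ingredients (a similarity function, a temperature, an exponential, positives across views, negatives within and across views). The exact form, the use of cosine similarity, and the function names are assumptions and may differ from the claimed loss.

```python
import math


def cosine(a, b):
    """Cosine similarity between two coefficient vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def contrastive_loss(C_m, C_n, tau=0.5):
    """InfoNCE-style loss between the coefficient rows of view m and view n.

    Positive pair: the same sample i in both views.
    Negative pairs: sample i against every other sample j, in view m and in view n.
    """
    n = len(C_m)
    total = 0.0
    for i in range(n):
        pos = math.exp(cosine(C_m[i], C_n[i]) / tau)
        neg = sum(
            math.exp(cosine(C_m[i], C_m[j]) / tau) + math.exp(cosine(C_m[i], C_n[j]) / tau)
            for j in range(n)
            if j != i
        )
        total += -math.log(pos / (pos + neg))
    return total / n


view_m = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = contrastive_loss(view_m, [[1.0, 0.0], [0.0, 1.0]])
loss_swapped = contrastive_loss(view_m, [[0.0, 1.0], [1.0, 0.0]])
```

When the two views agree on each sample's coefficients the loss is small, and it grows when the coefficient rows of the second view are swapped between samples, which is the behavior the alignment module is meant to encourage.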

7. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that the self-expression contrastive loss accumulated across views for all samples in the self-expression contrastive alignment module is: [formula not reproduced in the source text]; where the term inside the accumulation denotes the self-expression contrastive loss between views.

8. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that pre-training and fine-tuning using the training set comprises: setting a first objective function: [formula not reproduced in the source text]; where the symbols denote, in order: the latent features; the original view; the reconstruction of the original incomplete view; and the encoder network parameters; in the pre-training stage, taking the first objective function as the optimization objective and training, on the training set, a multi-encoder-decoder without the missing feature completion module and the self-expression layers; setting a second objective function: [formula not reproduced]; where the symbols denote, in order: the reconstruction loss; the self-expression loss function for all views; the self-expression contrastive loss accumulated across views for all samples; the original view; the encoder output; the self-expression coefficient matrix of any view; the consensus self-expression coefficient matrix of all views; the balancing parameters; the completed latent features of any view; the number of samples; any sample; the self-expression coefficients of a sample in a view (four terms); and the temperature coefficient; and in the fine-tuning stage, taking the second objective function as the optimization objective and integrating the reconstruction loss, the multi-view self-expression loss, and the self-expression contrastive loss to optimize the complete multi-encoder-decoder.

9. The structural topology-based missing-completion deep multi-view subspace clustering method according to claim 1, characterized in that obtaining the clustering results comprises: setting a third objective function: $\min_{F} \operatorname{Tr}(F^{\top} L F)$ subject to $F^{\top} F = I$; where $\operatorname{Tr}$ is the trace function, $F^{\top}$ is the transpose of the clustering indicator matrix $F$, $L$ is the Laplacian matrix, and $I$ is the identity matrix; calculating the affinity matrix from the multi-view consensus self-expression relation matrix, and optimizing the third objective function to obtain the clustering results.