Multi-view clustering method based on adaptive weighted tensor

The multi-view clustering method using adaptive weighted tensors solves the problem of fusion and clustering of multi-view image data, achieving high-precision image analysis and robust clustering results under noise interference, and is suitable for automatic classification of multi-source heterogeneous image data.

CN121074451B · Active · Publication Date: 2026-06-16 · NANJING UNIV OF INFORMATION SCI & TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NANJING UNIV OF INFORMATION SCI & TECH
Filing Date
2025-08-21
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies struggle to effectively integrate multi-view image data. Traditional methods suffer from blurred boundaries, insufficient segmentation accuracy, and sensitivity to noise when processing multimodal or multi-view images. Furthermore, existing clustering methods are not robust to noise and outliers, and cannot fully utilize the feature information of multi-view data.

Method used

A multi-view clustering method based on adaptive weighted tensors is adopted. Image segmentation is used as a preprocessing step to extract the target region. Combined with low-rank constraints and reconstruction error optimization, the feature fusion weights of each view are dynamically adjusted. Redundant information is filtered through a sparse modeling mechanism to achieve adaptive weighted fusion and clustering of multi-view data.

Benefits of technology

It improves the accuracy and stability of image analysis, and can achieve high-quality clustering results under noise interference and complex sample distribution. It is suitable for automatic classification and analysis of multi-source heterogeneous image data.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a multi-view clustering method based on adaptive weighted tensors and belongs to the technical field of image clustering analysis. The method comprises the following steps: acquiring a multi-view data set and preprocessing the images in it; extracting features of different views from each initial image; standardizing the initial view features; constructing an objective function with the goal of minimizing the data reconstruction error and the low-rank constraint term, converting the objective function into augmented Lagrange multiplier form, and solving the Lagrange function by the alternating direction method of multipliers (ADMM); and performing feature transformation on the image data to be processed through the optimal projection matrix and clustering in the fused feature space to obtain clustered images. The application does not rely on label information; through multi-view feature fusion and low-dimensional projection modeling it achieves a stable clustering effect under noise interference and complex sample distributions, and it is suitable for the automatic classification and analysis of real multi-source heterogeneous image data.

Description

Technical Field

[0001] This invention relates to the field of image clustering analysis technology, and in particular to an image multi-view clustering method based on adaptive weighted tensors.

Background Technology

[0002] With the rapid development of artificial intelligence technology, the demand for image data processing is growing rapidly across various fields. The high complexity, structural diversity, and interference during the acquisition, storage, and transmission of image data pose challenges to computer vision tasks. Multi-view image data, in particular, contains information from different perspectives or modalities, with varying features and structures. Effectively fusing data from different perspectives has become a significant technical challenge.

[0003] To improve the accuracy and efficiency of image analysis, image segmentation technology is widely used in the preprocessing stage to achieve effective separation of the target region from the background. However, traditional image segmentation methods often face problems such as blurred boundaries, insufficient segmentation accuracy, and sensitivity to noise when processing multimodal or multi-view images, making it difficult to provide a stable and reliable feature base for subsequent clustering and recognition.

[0004] After image segmentation, effectively fusing image features from different viewpoints or modalities and further achieving accurate clustering is a key technical challenge in current image understanding. Existing clustering methods such as KMeans and GMM typically assume that the data comes from a single viewpoint and employ a fixed clustering strategy, failing to fully utilize the feature information of multi-view data. These methods have poor robustness to noise and outliers, resulting in insufficient clustering accuracy and stability.

[0005] Weighted clustering methods attempt to assign weights to different perspectives, but they ignore the differences between perspectives; how to adaptively adjust the weights according to the data quality of each perspective when clustering multi-perspective data remains a challenge. Graph-based clustering methods such as spectral clustering can handle relationships between perspectives, but they handle noise poorly, have high computational complexity, and scale badly to large datasets.

[0006] Feature extraction methods combining low-rank and low-dimensional approaches have been proposed; they can remove noise, reduce dimensionality, and learn projection matrices, thereby improving the accuracy of feature extraction and clustering. However, current low-rank methods cannot handle the fusion of image data from multiple perspectives and are inefficient on high-dimensional data.

Summary of the Invention

[0007] Purpose of the invention: To address the above problems, the purpose of this invention is to provide a multi-view clustering method based on adaptive weighted tensors to achieve accurate preprocessing and robust clustering analysis of complex image scenes.

[0008] Technical solution: The multi-view clustering method based on adaptive weighted tensors of the present invention includes the following steps:

[0009] Obtain a multi-view dataset, which includes images of different perspectives and different categories of scenes;

[0010] Preprocess the images in the multi-view dataset to obtain initial images;

[0011] Extract features from different perspectives from each initial image to obtain the initial perspective features of each image;

[0012] The initial viewpoint features are standardized to obtain a data matrix. All data matrices are then merged to obtain an augmented matrix.

[0013] An objective function is constructed with the goal of minimizing the data reconstruction error and the low-rank constraint term. The objective function is then transformed into augmented Lagrange multiplier form, and the Lagrange function is solved by the alternating direction method of multipliers (ADMM) to obtain the optimal projection matrix.

[0014] The optimal projection matrix is used to perform feature transformation on the image data to be processed, and a clustering operation is performed in the fused feature space to obtain the clustered images.

[0015] Furthermore, the steps of extracting features from different viewpoints from each initial image to obtain the initial viewpoint features of each image include:

[0016] The dimensionality of the initial image is reduced by principal component analysis, and the first n0 principal components are retained. The resulting image features are denoted as PCA features, where n0 is a natural number.

[0017] The initial image is processed into grayscale to obtain a grayscale image. Local binary pattern features are extracted from the grayscale image and denoted as LBP features.

[0018] Feature extraction of grayscale images is performed using histogram of oriented gradients (HOG), resulting in histogram of oriented gradient features, denoted as HOG features.

[0019] Initial viewpoint features are constructed using PCA features, LBP features, and HOG features.

[0020] Furthermore, the data matrices obtained after standardizing the initial viewpoint features are denoted as $X_{rgb}$, $X_{lbp}$, $X_{hog}$. The augmented matrix obtained by merging the data matrices is denoted as $X = [X_{rgb}, X_{lbp}, X_{hog}]$, where $X_{rgb}$ represents the standardized PCA features, $X_{lbp}$ represents the standardized LBP features, and $X_{hog}$ represents the standardized HOG features.

[0021] Furthermore, the expression for the objective function is:

[0022]

[0023] In the formula, the first term after the equals sign is the data reconstruction error term, the second term is the low-rank constraint term, and the third term is the L2 norm constraint term; $X_v$ denotes the standardized feature matrix of the v-th viewpoint, $W$ denotes the projection matrix, $W_v$ denotes the projection matrix of the v-th viewpoint, $V$ denotes the number of viewpoints, and λ and β denote regularization parameters. $\|\cdot\|_F^2$ is the squared Frobenius norm, used to represent the data reconstruction error; $\|\cdot\|_*$ is the nuclear norm, i.e. the sum of the singular values, used to implement the low-rank constraint; $\|\cdot\|_2^2$ is the squared L2 norm, used as a regularization penalty on the parameters.

[0024] Furthermore, the steps to transform the objective function into the augmented Lagrange multiplier form include:

[0025] For the low-rank constraint term of each viewpoint and the L2 norm constraint term, Lagrange multipliers $Y_v$ and $Z$ are introduced, corresponding to the low-rank constraint term and the L2 norm constraint term respectively, and the Lagrangian function is constructed as follows:

[0026]

[0027] In the formula, $W_v^{*}$ denotes the initial value of the projection matrix $W_v$ of the v-th viewpoint, $W^{*}$ denotes the initial value of the projection matrix $W$, and $\langle\cdot,\cdot\rangle$ denotes the inner product between matrices.

[0028] Furthermore, the steps to obtain the optimal projection matrix by solving the Lagrange function with the alternating direction method of multipliers (ADMM) include:

[0029] Step 51, set the maximum number of iterations;

[0030] Step 52: fix the variable $W$ and iteratively update the projection matrix $W_v$ of the v-th viewpoint. The Lagrange function at this point is expressed as:

[0031]

[0032] Expanding the first term of the Lagrange function, the reconstruction error term, the Lagrange function is expressed as:

[0033]

[0034] In the formula, T denotes the transpose, and trace denotes the trace of a matrix, i.e. the sum of its diagonal elements;

[0035] Singular value decomposition is performed on the matrix $W_v$, giving $W_v = U\Sigma V^{T}$, where $U$ is the left singular vector matrix, $\Sigma$ is the diagonal matrix formed by the singular values, and $V^{T}$ is the right singular vector matrix;

[0036] Then the diagonal matrix $\Sigma$ formed by the singular values is thresholded according to $\Sigma' = \max(\Sigma - \lambda, 0)$;

[0037] The reconstructed variables are:

[0038] Step 53: fix the variable $W_v$ and iteratively update the projection matrix $W$. The Lagrange function is then expressed as:

[0039]

[0040] Expanding the Lagrange function yields:

[0041]

[0042] In the formula, the first term after the equal sign is the reconstruction error term, which represents the reconstruction error when the augmented matrix X is mapped to the low-dimensional space through the projection matrix W; the second term is the L2 regularization term, which controls the size of the projection matrix W.

[0043] Solving the resulting linear equations by gradient descent yields:

[0044]

[0045] In the formula, I represents the identity matrix;

[0046] Step 54: fix the variables $W_v$ and $W$ and update the Lagrange multipliers according to:

[0047]

[0048] $Z^{*} = \min \langle Z, W - W^{*} \rangle$,

[0049] Step 55: set the convergence condition as $\| f(W^{(t+1)}) - f(W^{(t)}) \|_F < \epsilon$,

[0050] In the formula, f(·) is the objective function, which describes the optimization process of the model; t denotes the iteration step; $W^{(t+1)}$ is the projection matrix output at iteration step t+1; $W^{(t)}$ is the projection matrix output at iteration step t; and $\epsilon$ is the set threshold.

[0051] Step 56: starting from the set projection matrix, repeat steps 52 to 54 until the computed projection matrix satisfies the convergence condition; that matrix is the optimal projection matrix.

[0052] Furthermore, the preprocessing steps for images in the multi-view dataset include:

[0053] All images are processed to a standard size and then normalized.

[0054] Beneficial effects: Compared with the prior art, the significant advantages of this invention are:

[0055] 1. Introducing image segmentation as a preprocessing step effectively improves feature extraction quality: By introducing image segmentation into the image processing flow, this invention can accurately extract target regions in the image, reduce background redundancy and interference information, thereby providing a clearer and more stable input foundation for subsequent multi-view feature extraction and clustering, and improving overall analysis accuracy.

[0056] 2. Able to achieve adaptive adjustment of perspective differences based on weighted tensor model: This invention designs an independent mapping matrix for each perspective and dynamically adjusts the contribution weight of each perspective in feature fusion through joint optimization of low-rank constraints and reconstruction error, thereby improving the accuracy and effectiveness of multi-view data fusion.

[0057] 3. Possesses sparse structure guidance capability, enhancing the discriminative expressiveness of multi-view features: This invention introduces a sparse modeling mechanism during the optimization process to perform structural compression and selection on the feature mapping results, effectively filtering redundant information, retaining highly correlated features, and improving the stability and generalization ability of the clustering model.

[0058] 4. High-quality clustering achieved through unsupervised learning mechanism, adapting to complex image scenarios: This invention does not rely on label information. Through multi-view feature fusion and low-dimensional projection modeling, it can achieve robust clustering results under conditions of noise interference and complex sample distribution, and is suitable for the automatic classification and analysis needs of real-world multi-source heterogeneous image data.

Attached Figure Description

[0059] Figure 1 is a flowchart of the image clustering method provided in an embodiment of the present invention;

[0060] Figure 2 is a comparison chart of the clustering effects of different methods on the Scene-15 dataset.

Detailed Implementation

[0061] The embodiments of the present invention will be further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and not intended to limit the scope of the invention. Furthermore, it should be noted that, for ease of description, the accompanying drawings show only the parts relevant to the embodiments of the present invention, and not all structures.

[0062] In the following description, specific details such as target system architecture and techniques are set forth for illustrative purposes and not for limitation, in order to provide a thorough understanding of the embodiments of this application. However, those skilled in the art will understand that this application may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, apparatuses, circuits, and methods are omitted so as not to obscure the description of this application with unnecessary detail.

[0063] It should be understood that, when used in this application specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements and / or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and / or collections thereof.

[0064] It should also be understood that the term “and / or” as used in this application specification and the appended claims means any combination of one or more of the associated listed items and all possible combinations, and includes such combinations.

[0065] Furthermore, in the description of this application and the appended claims, the terms "first," "second," etc., are used only to distinguish descriptions and should not be construed as indicating or implying relative importance.

[0066] References to "one embodiment" or "some embodiments" in this specification mean that one or more embodiments of this application include the target features, structures, or characteristics described in connection with that embodiment. Therefore, the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in still other embodiments," etc., appearing in different parts of this specification do not necessarily refer to the same embodiment, but rather mean "one or more, but not all, embodiments," unless otherwise specifically emphasized.

[0067] The multi-view clustering method based on adaptive weighted tensors described in this embodiment, as shown in Figure 1, includes the following steps:

[0068] Step 1: Obtain a multi-view dataset, which includes images of different perspectives and different categories of scenes.

[0069] We acquire images from different perspectives, such as front, side, or back views, and images from different scene categories to form a multi-view dataset.

[0070] Step 2: Preprocess the images in the multi-view dataset to obtain the initial images.

[0071] Furthermore, the preprocessing steps for images in the multi-view dataset include:

[0072] All images are processed to a standard size and then normalized.

[0073] In one example, the Scene-15 dataset contains scene images of 15 different categories, with multiple samples in each category. All images are processed to a standard size, such as 224×224, to ensure data consistency before normalization.
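The preprocessing step can be sketched as follows. This is a minimal illustration, assuming 8-bit input images, nearest-neighbour resizing, and normalization by division by 255; the text does not specify the interpolation or normalization scheme, and the function name `preprocess` is hypothetical.

```python
import numpy as np

def preprocess(image, size=(224, 224)):
    """Resize an H x W (x C) 8-bit image by nearest-neighbour sampling, then scale to [0, 1]."""
    image = np.asarray(image)
    h, w = image.shape[:2]
    # Source row/column index that each target pixel samples from.
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    resized = image[rows][:, cols]
    # Assumed normalization: map the 0..255 range onto 0..1.
    return resized.astype(np.float64) / 255.0
```

Any other interpolation (bilinear, bicubic) or normalization (per-channel mean and variance) would fit the description equally well.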

[0074] Step 3: Extract features from different perspectives from each initial image to obtain the initial perspective features of each image.

[0075] Furthermore, the steps of extracting features from different viewpoints from each initial image to obtain the initial viewpoint features of each image include:

[0076] The dimensionality of the initial image is reduced by principal component analysis, and the first n0 principal components are retained. The resulting image features are denoted as PCA features, where n0 is a natural number.

[0077] The initial image is processed into grayscale to obtain a grayscale image. Local binary pattern features are extracted from the grayscale image and denoted as LBP features.

[0078] Feature extraction of grayscale images is performed using histogram of oriented gradients (HOG), resulting in histogram of oriented gradient features, denoted as HOG features.

[0079] Initial viewpoint features are constructed using PCA features, LBP features, and HOG features.
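As an illustration of the PCA view, the sketch below flattens each image and keeps its scores on the first n0 principal components, computed through an SVD of the centered data. It assumes PCA is fitted across the whole image set; the function name `pca_features` is hypothetical, and the default of 50 simply mirrors the value used in the example of this embodiment.

```python
import numpy as np

def pca_features(images, n0=50):
    """Flatten each image and keep its scores on the first n0 principal components."""
    data = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
    centered = data - data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    n0 = min(n0, vt.shape[0])  # cannot keep more components than exist
    return centered @ vt[:n0].T
```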

[0080] In one example, dimensionality reduction of the RGB image is performed using Principal Component Analysis (PCA), retaining the first 50 principal components to obtain the PCA features of the RGB image. Local Binary Pattern (LBP) features are extracted from the grayscale image using an 8-neighborhood and a radius of 1 to obtain LBP features. Histogram of Oriented Gradients (HOG) is then used to extract features from the grayscale image, with a cell size of 8×8 pixels and a block size of 1×1 to obtain HOG features. After feature extraction from all images, an initial viewpoint feature set is obtained, which includes features from three different viewpoints.
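The LBP view with an 8-neighbourhood and radius 1 can be sketched in plain NumPy. This is an assumed basic variant (no rotation invariance or uniform-pattern mapping, border pixels skipped, 256-bin normalized histogram as the feature vector); the text does not state which LBP variant or histogram layout is used, and `lbp_histogram` is a hypothetical name.

```python
import numpy as np

def lbp_histogram(gray):
    """Basic 8-neighbour, radius-1 LBP codes of a 2-D grayscale image, as a 256-bin histogram."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Offsets of the 8 neighbours, visited clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # A neighbour at least as bright as the centre sets the corresponding bit.
        codes += (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```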

[0081] Step 4: Standardize the initial viewpoint features to obtain a data matrix, and merge all the data matrices to obtain an augmented matrix.

[0082] Furthermore, the data matrices obtained after standardizing the initial viewpoint features are denoted as $X_{rgb}$, $X_{lbp}$, $X_{hog}$. The augmented matrix obtained by merging the data matrices is denoted as $X = [X_{rgb}, X_{lbp}, X_{hog}]$, where $X_{rgb}$ represents the standardized PCA features, $X_{lbp}$ represents the standardized LBP features, and $X_{hog}$ represents the standardized HOG features.
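A minimal sketch of this step, assuming that standardization means a per-column z-score and that each view is stored with one row per image so the views can be concatenated column-wise. The function names are hypothetical.

```python
import numpy as np

def standardize(features, eps=1e-12):
    """Z-score each column: zero mean, unit variance (constant columns map to 0)."""
    features = np.asarray(features, dtype=np.float64)
    std = features.std(axis=0)
    return (features - features.mean(axis=0)) / np.where(std > eps, std, 1.0)

def build_augmented_matrix(pca_feats, lbp_feats, hog_feats):
    """Standardize each view (rows = images) and concatenate the views column-wise."""
    views = [standardize(v) for v in (pca_feats, lbp_feats, hog_feats)]
    return np.hstack(views)  # X = [X_rgb, X_lbp, X_hog]
```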

[0083] Step 5: Construct an objective function with the goal of minimizing the data reconstruction error and the low-rank constraint term. Transform the objective function into augmented Lagrange multiplier form and solve the Lagrange function by the alternating direction method of multipliers (ADMM) to obtain the optimal projection matrix.

[0084] Furthermore, the expression for the objective function is:

[0085]

[0086] In the formula, the first term after the equals sign is the data reconstruction error term, the second term is the low-rank constraint term, and the third term is the L2 norm constraint term; $X_v$ denotes the standardized feature matrix of the v-th viewpoint, and these matrices make up the augmented matrix $X = [X_{rgb}, X_{lbp}, X_{hog}]$; $W$ denotes the projection matrix, $W_v$ denotes the projection matrix of the v-th viewpoint, $V$ denotes the number of viewpoints, and λ and β denote regularization parameters. $\|\cdot\|_F^2$ is the squared Frobenius norm, used to represent the data reconstruction error; $\|\cdot\|_*$ is the nuclear norm, i.e. the sum of the singular values, used to implement the low-rank constraint; $\|\cdot\|_2^2$ is the squared L2 norm, used as a regularization penalty on the parameters.

[0087] Furthermore, the steps to transform the objective function into the augmented Lagrange multiplier form include:

[0088] For the low-rank constraint term of each viewpoint and the L2 norm constraint term, Lagrange multipliers $Y_v$ and $Z$ are introduced, corresponding to the low-rank constraint term and the L2 norm constraint term respectively, and the Lagrangian function is constructed as follows:

[0089]

[0090] In the formula, $W_v^{*}$ denotes the initial value of the projection matrix $W_v$ of the v-th viewpoint, $W^{*}$ denotes the initial value of the projection matrix $W$, and $\langle\cdot,\cdot\rangle$ denotes the inner product between matrices.

[0091] Furthermore, the steps to obtain the optimal projection matrix by solving the Lagrange function with the alternating direction method of multipliers (ADMM) include:

[0092] Step 51, set the maximum number of iterations;

[0093] Step 52: fix the variable $W$ and iteratively update the projection matrix $W_v$ of the v-th viewpoint. The Lagrange function at this point is expressed as:

[0094]

[0095] Expanding the first term of the Lagrange function, the reconstruction error term, the Lagrange function is expressed as:

[0096]

[0097] In the formula, T denotes the transpose, and trace denotes the trace of a matrix, i.e. the sum of its diagonal elements;

[0098] Singular value decomposition is performed on the matrix $W_v$, giving $W_v = U\Sigma V^{T}$, where $U$ is the left singular vector matrix, $\Sigma$ is the diagonal matrix formed by the singular values, and $V^{T}$ is the right singular vector matrix;

[0099] Then the diagonal matrix $\Sigma$ formed by the singular values is thresholded according to $\Sigma' = \max(\Sigma - \lambda, 0)$. After this step, the singular values smaller than λ are set to zero and the remaining ones are reduced by λ, thereby achieving sparsity and low rank.

[0100] The reconstructed variables are:
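The formula for the reconstructed variable is not reproduced in this text. A natural reading, assumed here, is that the thresholded singular values are recombined with the singular vectors, $W_v = U\Sigma' V^{T}$, which is the standard singular value thresholding operator. A minimal NumPy sketch under that assumption (the function name is hypothetical):

```python
import numpy as np

def singular_value_threshold(w_v, lam):
    """Soft-threshold the singular values of w_v by lam and rebuild the matrix."""
    u, s, vt = np.linalg.svd(w_v, full_matrices=False)
    s_thresholded = np.maximum(s - lam, 0.0)  # Sigma' = max(Sigma - lambda, 0)
    # Scale the columns of U by the new singular values, then recombine.
    return (u * s_thresholded) @ vt
```

For example, applying it with λ = 1 to a diagonal matrix with entries 3, 1, and 0.5 leaves a single nonzero singular value of 2, so the result has rank 1.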

[0101] Step 53: fix the variable $W_v$ and iteratively update the projection matrix $W$. The Lagrange function is then expressed as:

[0102]

[0103] Expanding the Lagrange function yields:

[0104]

[0105] In the formula, the first term after the equal sign is the reconstruction error term, which represents the reconstruction error when the augmented matrix X is mapped to the low-dimensional space through the projection matrix W; the second term is the L2 regularization term, which controls the size of the projection matrix W to avoid overfitting.

[0106] Solving the resulting linear equations by gradient descent yields:

[0107]

[0108] In the formula, I represents the identity matrix;
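The closed-form expression itself is not reproduced in this text. Purely as an illustration of how an identity matrix arises in an update of this kind, consider the generic ridge-regularized problem $\min_W \|XW - T\|_F^2 + \beta \|W\|_F^2$, where $T$ is a hypothetical target matrix that the text does not define. Setting the gradient to zero gives the linear system $(X^{T}X + \beta I)W = X^{T}T$. The sketch below solves that assumed system and should not be read as the update used by the invention.

```python
import numpy as np

def ridge_projection(x, target, beta):
    """Solve (X^T X + beta I) W = X^T T, the zero-gradient condition of the ridge problem."""
    d = x.shape[1]
    # beta * I keeps the system well conditioned and limits the size of W.
    return np.linalg.solve(x.T @ x + beta * np.eye(d), x.T @ target)
```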

[0109] Step 54: fix the variables $W_v$ and $W$ and update the Lagrange multipliers according to:

[0110]

[0111] $Z^{*} = \min \langle Z, W - W^{*} \rangle$,

[0112] Step 55: set the convergence condition as $\| f(W^{(t+1)}) - f(W^{(t)}) \|_F < \epsilon$,

[0113] In the formula, f(·) is the objective function, which describes the optimization process of the model; t denotes the iteration step; $W^{(t+1)}$ is the projection matrix output at iteration step t+1; $W^{(t)}$ is the projection matrix output at iteration step t; and $\epsilon$ is the set threshold.

[0114] Step 56: starting from the set projection matrix, repeat steps 52 to 54 until the computed projection matrix satisfies the convergence condition; that matrix is the optimal projection matrix.
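The control flow of steps 51 to 56 can be sketched as a generic loop. The callable `update` stands in for one pass of steps 52 to 54 and `objective` for f(·); both are hypothetical placeholders, since the update formulas are not reproduced in this text. The sketch also assumes that iteration stops at the maximum iteration count of step 51 if the convergence condition is never met.

```python
def iterate_until_converged(update, objective, w0, eps=1e-6, max_iter=100):
    """Repeat `update` until the objective changes by less than eps or max_iter is reached."""
    w, f_prev = w0, objective(w0)
    for t in range(1, max_iter + 1):
        w = update(w)                    # stands in for steps 52 to 54
        f_curr = objective(w)
        if abs(f_curr - f_prev) < eps:   # |f(W^(t+1)) - f(W^(t))| < epsilon
            return w, t
        f_prev = f_curr
    return w, max_iter
```

The function returns the final iterate together with the number of iterations performed, which makes it easy to tell convergence apart from hitting the iteration cap.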

[0115] Step 6: Perform feature transformation on the image data to be processed using the optimal projection matrix, and perform clustering operation in the fused feature space to obtain the clustered image.

[0116] Image data is transformed using an optimal projection matrix, mapping the original image data to a low-dimensional feature space. Features from different viewpoints are then fused using a weighted tensor to form a unified feature representation. Clustering algorithms (such as KMeans) are then applied to this fused feature space to cluster the images, yielding the final clustering results.

[0117] Steps 52 to 54 are iterated, starting from the set projection matrix, until the convergence condition is met; the projection matrix obtained at that point is the optimal projection matrix. The features extracted from each viewpoint (PCA, LBP, HOG) are standardized and then concatenated column-wise to generate the augmented matrix. The augmented matrix is multiplied by the optimal projection matrix to obtain the feature matrix Y, which contains the low-dimensional features of the images. Clustering is then performed on the feature matrix Y, assigning a category label to each image and thus achieving multi-view image clustering.
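The projection and clustering described above can be sketched as follows. The sketch assumes KMeans (named in this embodiment as one possible clustering algorithm) in its plain Lloyd form with random initialization from the data; `kmeans` and `project_and_cluster` are hypothetical names, and `w_opt` is taken as given.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one integer cluster label per row of `points`."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance of every point to every centre, then nearest-centre assignment.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def project_and_cluster(x_aug, w_opt, k):
    """Y = X W maps the images to the low-dimensional space; clustering runs on Y."""
    y = x_aug @ w_opt
    return y, kmeans(y, k)
```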

[0118] In this embodiment, the multi-view clustering algorithm of the present invention was tested using the Scene-15 dataset and compared with other classic clustering methods. The Scene-15 dataset contains scene images of 15 different categories, with multiple samples in each category. All images were resized to standard size and normalized during preprocessing. Then, noise was added to the training data to simulate interference in the real environment, and the robustness of the algorithm under noise interference was tested.

[0119] During the noise addition process, salt-and-pepper noise was randomly added to each image in the dataset, with the noise proportion ranging from 2% to 20% of the image pixels and the noise being randomly distributed. 50% of the images were used as training data, and the remaining 50% were used as test data to evaluate the performance of the algorithm.
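The noise-injection step can be sketched as below. It assumes images already scaled to [0, 1], an equal split between salt (1) and pepper (0), and that a noisy position affects all channels of that pixel; none of these details is stated in the text, and the function name is hypothetical.

```python
import numpy as np

def add_salt_and_pepper(image, proportion, seed=None):
    """Set a random `proportion` of pixel positions to 0 (pepper) or 1 (salt)."""
    rng = np.random.default_rng(seed)
    noisy = np.array(image, dtype=np.float64, copy=True)
    h, w = noisy.shape[:2]
    n_noisy = int(round(proportion * h * w))
    # Pick distinct pixel positions, then flip a fair coin for salt versus pepper.
    flat_idx = rng.choice(h * w, size=n_noisy, replace=False)
    rows, cols = np.unravel_index(flat_idx, (h, w))
    salt = rng.random(n_noisy) < 0.5
    noisy[rows[salt], cols[salt]] = 1.0
    noisy[rows[~salt], cols[~salt]] = 0.0
    return noisy
```

Calling it with `proportion` drawn between 0.02 and 0.20 per image would match the 2% to 20% range reported for the experiment.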

[0120] Table 1 shows the clustering results on the Scene-15 dataset.

[0121]
Method            ACC      NMI      Purity
KMeans            0.2080   0.1582   0.7900
GMM               0.2080   0.1582   0.7900
Hierarchical      0.2380   0.2021   0.8040
Spectral          0.2660   0.1712   0.7900
DBSCAN            0.5140   0.1100   0.5680
Mean Shift        0.5180   0.0000   0.5180
BIRCH             0.2500   0.1597   0.7920
Co-Reg            0.8520   0.4012   0.8520
This invention    0.8880   0.5208   0.8880

[0122] As shown in Table 1, on the Scene-15 dataset the traditional single-view clustering methods (KMeans, GMM, Hierarchical, Spectral, BIRCH) can only use a single feature, which results in very low accuracy (ACC) and normalized mutual information (NMI) (ACC≈0.20–0.27, NMI≈0.15–0.20). The density-based clustering methods (DBSCAN, Mean Shift) improve accuracy (ACC≈0.51) but sacrifice purity and NMI. The multi-view Co-Reg method significantly improves clustering performance by fusing information from different perspectives through mutual regularization (ACC≈0.85, NMI≈0.40, Purity≈0.85). This invention, under adaptive weighting and low-rank sparsity constraints, integrates the features from the three perspectives most fully and achieves the highest ACC (0.8880), NMI (0.5208), and Purity (0.8880), verifying its advantages in multi-source feature fusion and robust clustering.
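For reference, two of the reported metrics can be computed as sketched below. The text does not say which NMI normalization was used; the arithmetic mean of the two entropies is assumed here, and the function names are hypothetical. ACC additionally requires a one-to-one matching between clusters and classes and is not shown.

```python
import numpy as np

def contingency(y_true, y_pred):
    """Counts n[i, j] of samples with true class i and predicted cluster j."""
    _, t = np.unique(y_true, return_inverse=True)
    _, p = np.unique(y_pred, return_inverse=True)
    table = np.zeros((t.max() + 1, p.max() + 1), dtype=np.int64)
    np.add.at(table, (t, p), 1)
    return table

def purity(y_true, y_pred):
    """Each cluster votes for its majority class; purity is the fraction of samples matched."""
    table = contingency(y_true, y_pred)
    return table.max(axis=0).sum() / table.sum()

def nmi(y_true, y_pred):
    """Mutual information normalized by the arithmetic mean of the two entropies."""
    joint = contingency(y_true, y_pred) / len(y_true)
    pt, pp = joint.sum(axis=1), joint.sum(axis=0)
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / np.outer(pt, pp)[nz])).sum()
    h_t = -(pt[pt > 0] * np.log(pt[pt > 0])).sum()
    h_p = -(pp[pp > 0] * np.log(pp[pp > 0])).sum()
    return 0.0 if h_t + h_p == 0 else 2.0 * mi / (h_t + h_p)
```

Both metrics are invariant to how the cluster labels are numbered, so a perfect clustering with permuted labels still scores 1.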

[0123] Figure 2 shows the clustering results of eight clustering algorithms in the same PCA dimensionality-reduction space. In Figure 2, AWTLRC (the method of this invention) separates the two classes of samples clearly and cleanly, with almost no misclassification; KMeans, GMM, and hierarchical clustering have more overlapping boundaries; Spectral can partially identify non-spherical structures but still shows confusion; DBSCAN and MeanShift either identify sparse points as noise or over-subdivide clusters; BIRCH performs reasonably well in the central area but has obvious misclassification at the edges. Overall, the clustering method described in this invention comes closest to the true class distribution.

Claims

1. A multi-view clustering method based on adaptive weighted tensors, characterized in that, Includes the following steps: Obtain a multi-view dataset, which includes images of different perspectives and different categories of scenes; Preprocess the images in the multi-view dataset to obtain initial images; Extract features from different perspectives from each initial image to obtain the initial perspective features of each image; The initial viewpoint features are standardized to obtain a data matrix. All data matrices are then merged to obtain an augmented matrix. The objective function is constructed with the goal of minimizing data reconstruction error and low-rank constraint. The objective function is then transformed into the augmented Lagrange multiplier method form. The Lagrange function is solved by the multiplier method in alternating directions to obtain the optimal projection matrix. The image data to be processed is transformed using the optimal projection matrix, and clustering is performed in the fused feature space to obtain the clustered image. The steps to extract features from different viewpoints from each initial image and obtain the initial viewpoint features for each image include: The initial image is dimensionality reduced by principal component analysis, and the first n0 principal components are retained. The resulting image features are denoted as PCA features, where n0 is a natural number. The initial image is processed into grayscale to obtain a grayscale image. Local binary pattern features are extracted from the grayscale image and denoted as LBP features. Feature extraction of grayscale images is performed using histogram of oriented gradients (HOG), resulting in histogram of oriented gradient features, denoted as HOG features. Initial viewpoint features are constructed using PCA features, LBP features, and HOG features. 
The data matrices obtained after standardizing the initial viewpoint features are denoted as follows: The augmented matrix obtained after merging the data matrices is denoted as . ,in This represents the standardized PCA features. This represents the standardized LBP features. This represents the standardized HOG features; The steps for solving the Lagrangian function using the alternating direction multiplier method to obtain the optimal projection matrix include: Step 51, set the maximum number of iterations; Step 52, fix variable W, iteratively update the... Projection matrix of each viewpoint The Lagrange function at this point is expressed as: , Expanding the first term of the Lagrange function, the reconstruction error term, the Lagrange function is expressed as: , In the formula, T represents transpose. The trace of a matrix is ​​represented by , and the sum of the diagonal elements of a matrix is ​​represented by . matrix Perform singular value decomposition, which results in: In the formula, Let represent the left singular vector matrix, and ∑ represent the diagonal matrix formed by the singular values. Represents a right singular vector matrix; Next, thresholding is performed on the diagonal matrix ∑ formed by the singular values, using the following formula: ; The reconstructed variables are: ; Step 53, fix variables Iteratively update the projection matrix Then the Lagrange function is expressed as: , Expanding the Lagrange function yields: , In the formula, the first term after the equal sign is the reconstruction error term, which represents the reconstruction error when the augmented matrix X is mapped to the low-dimensional space through the projection matrix W; the second term is the L2 regularization term, which controls the size of the projection matrix W. 
By solving the linear equations using gradient descent, the following is obtained: [formula not reproduced], where $I$ represents the identity matrix.
Step 54: fix the variables $W_v$ and $W$ and update the Lagrange multipliers using the formulas: [two formulas not reproduced].
Step 55: set the convergence condition as: [formula not reproduced], in which the objective function, which describes the optimization process of the model, is evaluated for the projection matrices output at consecutive iterations, $t$ denotes the iteration number, and the result is compared with a set threshold.
Step 56: repeat steps 52 to 54, starting from the set projection matrix, until the computed projection matrix satisfies the convergence condition; that matrix is the optimal projection matrix.
The step of performing a feature transformation on the image data to be processed using the optimal projection matrix and performing clustering in the fused feature space to obtain the clustered images comprises: standardizing the features extracted from each view and concatenating them column-wise to generate an augmented matrix; multiplying the augmented matrix by the optimal projection matrix to obtain the feature matrix $Y$; and clustering the feature matrix $Y$ so that each image is assigned a category label.
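
The final transformation and clustering step of claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the claim does not name a clustering algorithm, so plain k-means with a deterministic farthest-point initialisation is swapped in here, and the helper names `standardize`, `kmeans`, and `cluster_images` are hypothetical. The projection matrix `W` is taken as given (in the method it would be the optimal projection matrix).

```python
import numpy as np


def standardize(F, eps=1e-12):
    """Scale each feature column to zero mean and unit variance."""
    return (F - F.mean(axis=0)) / (F.std(axis=0) + eps)


def kmeans(Y, k, n_iter=50):
    """Lloyd iterations; k-means is an assumption, the claim names no algorithm."""
    # Deterministic farthest-point initialisation of the k centres.
    centers = [Y[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign every sample to its nearest centre.
        dist = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centres; keep the old centre if a cluster is empty.
        new = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels


def cluster_images(view_features, W, k):
    """Standardize each view, concatenate column-wise, project, then cluster."""
    X = np.hstack([standardize(F) for F in view_features])  # augmented matrix
    Y = X @ W                                                # fused feature matrix
    return kmeans(Y, k)
```

Each entry of `view_features` is an array with one row per image (for example the PCA, LBP, and HOG features), and `W` must have as many rows as the concatenated feature dimension.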

2. The multi-view clustering method based on adaptive weighted tensors according to claim 1, characterized in that the expression for the objective function is: [formula not reproduced], where the first term after the equals sign represents the data reconstruction error term, the second term represents the low-rank constraint term, and the third term represents the L2 norm constraint term; $X_v$ denotes the standardized feature matrix of the $v$-th view, $W$ denotes the projection matrix, $W_v$ denotes the projection matrix of the $v$-th view, $V$ denotes the number of views, and $\lambda_1$ and $\lambda_2$ denote the regularization parameters; $\|\cdot\|_F^2$ is the square of the Frobenius norm, used to represent the data reconstruction error; $\|\cdot\|_*$ is the nuclear norm, that is, the sum of the singular values, used to implement the low-rank constraint; and $\|\cdot\|_2^2$ is the square of the L2 norm, used as the parameter regularization penalty.
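
The norms named in claim 2 can be illustrated numerically. The sketch below is not the claimed objective function; it only shows how the squared Frobenius norm can be computed as a trace (the form used when the reconstruction error term is expanded in claim 1) and how the nuclear norm is obtained from the singular values. The function names are hypothetical.

```python
import numpy as np


def frobenius_sq(M):
    """Squared Frobenius norm, computed as trace(M^T M)."""
    return float(np.trace(M.T @ M))


def nuclear_norm(M):
    """Nuclear norm: the sum of the singular values of M."""
    return float(np.linalg.svd(M, compute_uv=False).sum())


# A small diagonal example: singular values are 4 and 3.
M = np.array([[3.0, 0.0], [0.0, 4.0]])
fro2 = frobenius_sq(M)
nuc = nuclear_norm(M)
```

For a matrix argument, the squared L2 penalty is assumed here to mean the sum of squared entries, which coincides with `frobenius_sq`.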

3. The multi-view clustering method based on adaptive weighted tensors according to claim 1, characterized in that the step of transforming the objective function into augmented Lagrangian form comprises: introducing Lagrange multipliers for the low-rank constraint term and the L2 norm constraint term of each view, with $Z$ corresponding to the low-rank constraint term and the L2 norm constraint term, and constructing the Lagrangian function as: [formula not reproduced], where the formula involves the initial value of the projection matrix of the $v$-th view and the initial value of the projection matrix, and $\langle \cdot, \cdot \rangle$ denotes the inner product between matrices.
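
The alternating updates of claim 1 (steps 51 to 56) applied to an augmented Lagrangian can be sketched on a simplified model. Everything specific below is an assumption made for illustration, since the claimed formulas are not reproduced in this text: the assumed Lagrangian is $\|X - XW\|_F^2 + \lambda_1 \sum_v \|J_v\|_* + \lambda_2 \|W\|_F^2 + \langle Z, W - J \rangle + \frac{\mu}{2}\|W - J\|_F^2$, where $J_v$ is the block of rows of an auxiliary matrix $J$ belonging to view $v$ (standing in for the per-view matrices of step 52), $Z$ is the multiplier, and the penalty parameter $\mu$ and the names `svt`, `objective`, and `admm` are hypothetical.

```python
import numpy as np


def svt(M, tau):
    """Singular value thresholding: shrink each singular value of M by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def objective(X, W, blocks, lam1, lam2):
    """Assumed objective: reconstruction + per-view nuclear norm + ridge term."""
    rec = np.linalg.norm(X - X @ W, "fro") ** 2
    low_rank = sum(np.linalg.norm(W[b], "nuc") for b in blocks)
    return rec + lam1 * low_rank + lam2 * np.linalg.norm(W, "fro") ** 2


def admm(X, blocks, lam1=0.1, lam2=0.1, mu=10.0, max_iter=200, eps=1e-8):
    """Alternating updates for the assumed model; returns W, J, iterations used."""
    d = X.shape[1]
    W = np.zeros((d, d))
    J = np.zeros((d, d))  # stacked per-view low-rank variables
    Z = np.zeros((d, d))  # Lagrange multipliers
    G = X.T @ X
    A = 2.0 * G + (2.0 * lam2 + mu) * np.eye(d)
    prev = objective(X, W, blocks, lam1, lam2)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):  # step 51: iteration cap
        # Step 52 analogue: W fixed, threshold the singular values per view.
        for b in blocks:
            J[b] = svt(W[b] + Z[b] / mu, lam1 / mu)
        # Step 53 analogue: J fixed, W solves a regularized linear system.
        W = np.linalg.solve(A, 2.0 * G + mu * J - Z)
        # Step 54 analogue: update the multipliers.
        Z = Z + mu * (W - J)
        # Step 55 analogue: stop when the objective change is below eps.
        cur = objective(X, W, blocks, lam1, lam2)
        if abs(prev - cur) < eps:
            break
        prev = cur
    return W, J, n_iter
```

Here `blocks` is a list of row slices, one per view, for example `[slice(0, 3), slice(3, 5)]` when the first view contributes three feature columns and the second contributes two. The W update sets the gradient of the assumed Lagrangian to zero, which gives a linear system with an identity-matrix term, in line with the description of step 53.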

4. The multi-view clustering method based on adaptive weighted tensors according to any one of claims 1 to 3, characterized in that the step of preprocessing the images in the multi-view dataset comprises: resizing all images to a standard size and then normalizing them.
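
A minimal sketch of the preprocessing in claim 4, under assumptions the claim leaves open: the standard size (64 x 64 here), nearest-neighbour resampling, and min-max scaling to the range [0, 1] are illustrative choices, and the function names are hypothetical.

```python
import numpy as np


def resize_nearest(img, size):
    """Resize an H x W (or H x W x C) array by nearest-neighbour sampling."""
    h, w = img.shape[:2]
    rows = (np.arange(size[0]) * h) // size[0]
    cols = (np.arange(size[1]) * w) // size[1]
    return img[rows][:, cols]


def preprocess(img, size=(64, 64)):
    """Resize to a standard size, then min-max normalize to [0, 1]."""
    out = resize_nearest(np.asarray(img), size).astype(np.float64)
    lo, hi = out.min(), out.max()
    # A constant image has no range to scale; return zeros in that case.
    return (out - lo) / (hi - lo) if hi > lo else np.zeros_like(out)
```

In practice an image library with interpolation would normally replace `resize_nearest`; the point of the sketch is only the order of operations, resize first and normalize second.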