Robust classification method and device for multi-view binary ordered image data
By constructing a robust loss function, optimizing the model, and leveraging the consistency and complementarity of multi-view visual features, the method addresses the overfitting problem of multi-view binary ordered image data and achieves higher classification accuracy and stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANJING UNIV OF INFORMATION SCI & TECH
- Filing Date
- 2024-12-30
- Publication Date
- 2026-06-23
Figure CN120047716B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image recognition technology, and more specifically to a robust classification method and apparatus for multi-view binary ordered image data.
Background Technology
[0002] In recent years, supervised learning methods based on precisely labeled samples have made significant progress in pattern recognition. However, in many practical scenarios, obtaining precisely labeled data requires substantial time and manpower. Furthermore, directly labeling private or confidential data may be subject to legal or ethical restrictions. Against this backdrop, learning from incomplete labels has attracted considerable attention due to its lower labeling cost and broad application potential. Binary ordered data learning is an emerging form of incomplete-label learning in which each training sample is a pair of feature vectors, the former having a higher probability of belonging to the positive class than the latter. Compared with traditional precise labeling, obtaining ordering relationships between data is generally more economical and practically feasible. For binary ordered image data, how to balance overfitting against empirical risk, and how to effectively utilize the multi-view visual features of images, remain pressing issues. Therefore, it is necessary to design a novel multi-view binary ordered data classifier that more accurately mines and utilizes the ordered relationships between multi-view features, thereby improving the robustness of image recognition and meeting practical application needs.
Summary of the Invention
[0003] Purpose of the invention: The purpose of this invention is to provide a robust classification method and apparatus for multi-view binary ordered image data. The method uses multi-view features of image samples to train a binary ordered data classifier to improve the generalization ability of the classifier and significantly improve the classification accuracy of multi-view binary ordered image data.
[0004] Technical solution: In a first aspect, the present invention provides a robust classification method for multi-view binary ordered image data, comprising the following steps:
[0005] Extracting multi-view visual features from binary ordered image data in the training set;
[0006] The extracted features are fed into a multi-view binary ordered data classification model, and the weight coefficients and mapping matrices for each view are obtained through training. The binary ordered data classification model is represented as the following optimization problem:
[0007] Problem P1:
[0008]
[0009] Constraints:
[0010] γ^(v) ≥ 0
[0011]
[0012]
[0013]
[0014] In the formula, the visual features of the i-th binary ordered image sample at the v-th view consist of the visual feature vectors extracted from the first and second images, respectively, with the first image having a higher probability of belonging to the positive class than the second; n represents the number of binary ordered image samples used for training; m represents the number of different views; φ(·) represents the high-dimensional feature mapping function for each view; π₊ and π₋ represent the probabilities of the positive and negative classes, respectively; γ^(v) represents the weight of the v-th view; w^(v) is expressed in terms of a feature matrix and a mapping matrix; the mapping matrix and γ^(v) are the variables to be optimized; the problem further involves intermediate variables; and a > 0, b > 0, c > 0, d > 0 and λ > 0 are hyperparameters;
[0015] Based on the learned weight coefficients and mapping matrix, the image to be identified is classified and recognized.
[0016] In a second aspect, the present invention also provides a robust classification device for multi-view binary ordered image data, comprising:
[0017] The data preprocessing module is used to extract multi-view visual features from the binary ordered image data in the training set;
[0018] The model training module is used to feed the extracted features into a multi-view binary ordered data classification model to train and obtain the weight coefficients and mapping matrices for each view. The binary ordered data classification model is represented as the following optimization problem:
[0019] Problem P1:
[0020]
[0021] Constraints:
[0022] γ^(v) ≥ 0
[0023]
[0024]
[0025] In the formula, the visual features of the i-th binary ordered image sample at the v-th view consist of the visual feature vectors extracted from the first and second images, respectively, with the first image having a higher probability of belonging to the positive class than the second; n represents the number of binary ordered image samples used for training; m represents the number of different views; φ(·) represents the high-dimensional feature mapping function for each view; π₊ and π₋ represent the probabilities of the positive and negative classes, respectively; γ^(v) represents the weight of the v-th view; w^(v) is expressed in terms of a feature matrix and a mapping matrix; the mapping matrix and γ^(v) are the variables to be optimized; the problem further involves intermediate variables; and a > 0, b > 0, c > 0, d > 0 and λ > 0 are hyperparameters;
[0026] The image classification module is used to classify and recognize images based on the learned weight coefficients and mapping matrix.
[0027] In a third aspect, the present invention also provides a computer device comprising: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and, when executed by the processors, implement the steps of the robust classification method for multi-view binary ordered image data as described in the first aspect of the present invention.
[0028] In a fourth aspect, the present invention also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the robust classification method for multi-view binary ordered image data as described in the first aspect of the present invention.
[0029] Beneficial Effects: This invention proposes a robust classification method and apparatus for multi-view binary ordered image data, built on a binary ordered data classification model with a robust loss function. Imposing a larger penalty on binary ordered samples with positive loss ensures the model's classification accuracy on the training set; imposing a smaller penalty on binary ordered samples with negative loss effectively avoids overfitting. Furthermore, by utilizing the consistency and complementarity information in the multi-view visual features of images, the model adaptively learns the weight coefficients of the views, achieving deep fusion of the per-view mapping models and effectively improving the model's generalization performance. At a fine-grained level, this invention examines unbiased risk estimation for binary ordered data classification and achieves a good balance between minimizing empirical risk and avoiding overfitting. It also fully utilizes the multi-view visual information of image data, significantly improving the classification accuracy of multi-view binary ordered image data. The method is simple and efficient and has broad application prospects in computer vision, machine learning, pattern recognition, and data mining.
Description of the Drawings
[0030] Figure 1 is a flowchart of a robust classification method for multi-view binary ordered image data according to an embodiment of the present invention;
[0031] Figure 2 is a flowchart of solving the optimization problem with the alternating direction method of multipliers (ADMM) according to an embodiment of the present invention.
Detailed Description
[0032] The technical solutions in the embodiments of the present invention will now be clearly and completely described in conjunction with the accompanying drawings.
[0033] This invention proposes a multi-view framework for learning from binary ordered data. Its core idea is to construct a robust loss function suitable for binary ordered data classification, integrate feature distributions from different views, and, guided by the principles of consistency and complementarity, perform the classification learning task more efficiently. As shown in Figure 1, a robust classification method for multi-view binary ordered image data according to the present invention includes the following steps:
[0034] Step S1: Extract multi-view visual features from the binary ordered image data in the training set.
[0035] A binary ordered image sample consists of two images treated as a single sample, hence the term "binary"; the first image has a higher probability of belonging to the positive class than the second image, hence the term "ordered." The features extracted from a binary ordered image sample therefore contain visual information from both images. Visual features of the image samples are extracted from multiple views. In this embodiment, for black-and-white image data, a 512-dimensional histogram of oriented gradients (HOG) and an 81-dimensional GIST descriptor are used as two different feature views; for color image data, a 144-dimensional HOG and a 100-dimensional DenseHue descriptor are used as two feature views. These descriptors give each dataset a multi-view representation, which helps to comprehensively mine the local and global feature information of the data.
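As an illustration of step S1, the following minimal sketch turns one ordered image pair into two views. The text names HOG, GIST and DenseHue descriptors; the two extractors here (`orientation_histogram` and `coarse_layout`) are simplified, hypothetical stand-ins written in plain NumPy, and their output dimensions (9 and 16) do not match the dimensions quoted in the text.

```python
import numpy as np

def orientation_histogram(img, n_bins=9):
    """Global histogram of gradient orientations (a crude stand-in for HOG)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-12)

def coarse_layout(img, grid=4):
    """Mean intensity on a grid x grid layout (a crude stand-in for GIST)."""
    h, w = img.shape
    img = img[: h - h % grid, : w - w % grid].astype(float)
    blocks = img.reshape(grid, img.shape[0] // grid, grid, img.shape[1] // grid)
    return blocks.mean(axis=(1, 3)).ravel()

def extract_views(first_img, second_img):
    """Return one (x, x') feature pair per view for a single ordered sample."""
    extractors = (orientation_histogram, coarse_layout)
    return [(f(first_img), f(second_img)) for f in extractors]

# Toy data: two random 28x28 grayscale images standing in for an ordered pair.
rng = np.random.default_rng(0)
views = extract_views(rng.random((28, 28)), rng.random((28, 28)))
```

Each entry of `views` holds the feature vectors of the first and second image for one view, which is the pair structure the classification model consumes.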
[0036] Step S2: Input the multi-view features extracted in S1 into the established multi-view binary ordered data classification model to train and obtain the weight coefficients and mapping matrix for each view.
[0037] In this invention, the binary ordered data classification model is represented as the following optimization problem:
[0038] Problem P1:
[0039]
[0040] Constraints:
[0041] (1.1) γ^(v) ≥ 0
[0042] (1.2)
[0043] (1.3)
[0044] (1.4)
[0045] In fact, a binary ordered image sample contains visual information extracted from two separate images. The visual features of the i-th binary ordered image sample at the v-th view consist of the visual feature vectors extracted from the first and second images, respectively, with the first image having a higher probability of belonging to the positive class than the second; n represents the number of binary ordered samples used for training; m represents the number of different views; φ(·) represents the high-dimensional feature map for each view; π₊ and π₋ represent the probabilities of the positive and negative classes, respectively; γ^(v) represents the weight of the v-th view; w^(v) is expressed in terms of
[0046] a feature matrix and a mapping matrix; the mapping matrix and γ^(v) are the variables to be optimized; the problem further involves intermediate variables; and a > 0, b > 0, c > 0, d > 0 and λ > 0 are hyperparameters.
[0047] Constraint (1.1) is a non-negativity constraint, ensuring that no view contributes negatively to the final result; constraint (1.2) is a normalization constraint, maintaining a reasonable balance among views during optimization so that no single view receives an excessive weight while others are ignored; constraints (1.3)-(1.4) are soft constraints, exerting indirect influence through the constraint form of the optimization problem. The inputs to problem P1 are the multi-view features of the training samples, and the variables to be solved are w^(v) and γ^(v). Based on the above constraints, this invention constructs a loss mechanism suitable for binary ordered data classification, establishes collaborative constraints on the robust loss function across the different views, and promotes consistency and complementarity among them.
[0048] After establishing the optimization problem P1, according to the representer theorem, we have:
[0049] w=φ(x)α+φ(x′)α′
[0050] α and α′ represent the mapping matrices corresponding to the binary ordered image samples;
[0051] Define a block vector:
[0052]
[0053] Define the block feature matrix:
[0054]
[0055] Mapping the expanded input matrix to the feature space yields:
[0056]
[0057] Therefore, the block form of the kernel matrix is defined as follows:
[0058]
[0059] Here, the kernel function k(·, ·) is the inner product of the high-dimensional feature mapping function φ(·), and K^(v), K′^(v) and K′′^(v) are the kernel matrices formed by evaluating the kernel function on the feature vectors of the first and second images.
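The block kernel matrix can be illustrated with a short sketch. Because the defining formula is not reproduced in the text, the layout below is an assumption: K is taken as the kernel between first images, K′ between first and second images, and K′′ between second images, arranged as [[K, K′], [K′ᵀ, K′′]]. The Gaussian kernel and the names `rbf_kernel` and `block_kernel` are illustrative choices, not taken from the source.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def block_kernel(X, Xp, sigma=1.0):
    """Assumed block layout [[K, K'], [K'^T, K'']] for one view."""
    K = rbf_kernel(X, X, sigma)      # first images vs first images
    Kp = rbf_kernel(X, Xp, sigma)    # first images vs second images
    Kpp = rbf_kernel(Xp, Xp, sigma)  # second images vs second images
    return np.block([[K, Kp], [Kp.T, Kpp]])

# Toy data: 5 ordered pairs with 3-dimensional features in a single view.
rng = np.random.default_rng(0)
X, Xp = rng.normal(size=(5, 3)), rng.normal(size=(5, 3))
K_block = block_kernel(X, Xp)
```

Under this assumption the block matrix equals the ordinary kernel matrix of the stacked inputs [X; X′], so it is symmetric and positive semidefinite.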
[0060] Therefore, it can be further derived using the representer theorem:
[0061]
[0062] In this way, the representer theorem transforms the model into a kernel model.
[0063] The kernel model derived from the binary ordered data classification problem P1 is as follows:
[0064] Problem P2:
[0065]
[0066] Constraints:
[0067] γ^(v) ≥ 0
[0068]
[0069] where the kernel matrix at the v-th view is used in its block form; γ^(v) represents the weight of the v-th view; the mapping matrices and γ^(v) are the variables to be optimized; the problem further involves intermediate variables; a > 0, b > 0, c > 0, d > 0 and λ > 0 are hyperparameters; and v and μ index different views.
[0070] To find the optimal solution of problem P2 quickly, this invention employs the alternating direction method of multipliers (ADMM) instead of a conventional quadratic programming solver, combined with gradient descent and a momentum strategy that depends on the number of iterations, to progressively optimize the objective function and achieve rapid convergence. The algorithm takes as input the number of views m, the positive-class probability π₊, the negative-class probability π₋, the training set data, the parameters a, b, c, d and λ, and the step-size parameter μ (i.e., the penalty parameter) of ADMM. As shown in Figure 2, the solution process for problem P2 specifically includes:
[0071] a) Initialize the variables, let the number of iterations l = 0, and determine the convergence threshold ε;
[0072] b) With the mapping matrices in problem P2 held at their current values, solve the optimization problem with respect to γ^(v) using the alternating direction method of multipliers to obtain γ^(l+1);
[0073] c) With γ in problem P2 held at the value obtained in step b), use a momentum-based gradient descent method to find the optimal mapping matrices;
[0074] d) If ||γ^(l+1) − γ^(l)|| > ε or the change in the mapping matrices exceeds ε, let l = l + 1 and return to step b); otherwise, output the optimal solution.
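The outer loop of steps a) to d) can be sketched as a generic alternating scheme. The function `alternate` and its callbacks `update_gamma` and `update_alpha` are hypothetical names; in the method they would be the ADMM solver of step b) and the momentum-based gradient descent of step c). Here they are replaced by closed-form minimisers of a toy two-variable quadratic so that the sketch runs on its own.

```python
import numpy as np

def alternate(update_gamma, update_alpha, gamma0, alpha0, eps=1e-8, max_iter=100):
    """Alternate two block updates until neither block changes by more than eps."""
    gamma = np.atleast_1d(np.asarray(gamma0, dtype=float))
    alpha = np.atleast_1d(np.asarray(alpha0, dtype=float))
    for l in range(max_iter):
        gamma_new = np.atleast_1d(update_gamma(alpha))      # step b)
        alpha_new = np.atleast_1d(update_alpha(gamma_new))  # step c)
        done = (np.linalg.norm(gamma_new - gamma) <= eps and
                np.linalg.norm(alpha_new - alpha) <= eps)   # step d)
        gamma, alpha = gamma_new, alpha_new
        if done:
            break
    return gamma, alpha, l + 1

# Toy stand-in objective: (g - 1)^2 + (a - 2)^2 + 0.5 * g * a
g_opt, a_opt, n_iter = alternate(
    update_gamma=lambda a: 1.0 - 0.25 * a,  # exact minimiser over g with a fixed
    update_alpha=lambda g: 2.0 - 0.25 * g,  # exact minimiser over a with g fixed
    gamma0=0.0, alpha0=0.0)
```

For this toy objective the iterates contract quickly toward the joint minimiser g = 8/15, a = 28/15; the real problem P2 would of course behave differently.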
[0075] In step b), γ^(v) is updated as follows:
[0076] Define a vector representing the complexity of the classifier model for each viewpoint:
[0077] Define the viewpoint weight vector:
[0078]
[0079] With respect to γ^(v), problem P2 can be transformed into problem P3:
[0080] Problem P3:
[0081]
[0082] Constraints:
[0083] τ≥0
[0084] where p is the Lagrange multiplier vector; F = [1^T; I_m] and H = [0^T; −I_m] are (m+1)×m matrices whose first row is 1^T and 0^T, respectively; 1 is a vector whose components are all 1; I_m is the m×m identity matrix; 0 is a vector whose components are all 0; and μ is the penalty parameter of the alternating direction method of multipliers.
[0085] Solve problem P3 using the following steps:
[0086] i) Initialize τ and p, and determine the convergence threshold;
[0087] ii) Update γ using the following formula:
[0088]
[0089] where I_m is the m×m identity matrix.
[0090] iii) Update τ using the following formula:
[0091]
[0092] τ = max(τ, 0)
[0093] iv) Update p using the following formula:
[0094] p = p + μ(Fγ + Hτ - d)
[0095] v) Dynamically adjust parameter μ:
[0096] μ = μ * β, where β is an adjustment coefficient greater than 1. In this example, β is set to 1.05.
[0097] vi) If ||Fγ + Hτ − d||₂ and the changes of γ and τ between two adjacent iterations are all less than the set threshold, set γ^(l+1) = γ; otherwise, return to step ii).
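Steps i) to vi) can be sketched as follows. Because the objective of problem P3 and the γ and τ update formulas are not reproduced in the text, this sketch makes several assumptions: the objective is a toy quadratic 0.5·γᵀQγ + cᵀγ, the right-hand side is d = (1, 0, ..., 0)ᵀ (so that Fγ + Hτ = d encodes Σγ = 1 and γ = τ), and both updates are the exact minimisers of the resulting augmented Lagrangian. The name `admm_view_weights` is hypothetical.

```python
import numpy as np

def admm_view_weights(Q, c, mu=1.0, beta=1.05, tol=1e-8, max_iter=500):
    """min 0.5*g'Qg + c'g  s.t.  F g + H tau = d_vec, tau >= 0 (toy objective)."""
    m = len(c)
    F = np.vstack([np.ones((1, m)), np.eye(m)])     # (m+1) x m
    H = np.vstack([np.zeros((1, m)), -np.eye(m)])   # (m+1) x m
    d_vec = np.concatenate([[1.0], np.zeros(m)])    # assumed right-hand side
    tau, p = np.full(m, 1.0 / m), np.zeros(m + 1)   # step i)
    gamma = tau.copy()
    for _ in range(max_iter):
        # step ii): exact minimiser of the augmented Lagrangian in gamma
        rhs = -c - F.T @ p - mu * F.T @ (H @ tau - d_vec)
        gamma_new = np.linalg.solve(Q + mu * F.T @ F, rhs)
        # step iii): minimiser in tau, then projection tau = max(tau, 0)
        tau_new = np.maximum(gamma_new + p[1:] / mu, 0.0)
        r = F @ gamma_new + H @ tau_new - d_vec
        p = p + mu * r                              # step iv)
        mu *= beta                                  # step v)
        change = max(np.linalg.norm(gamma_new - gamma),
                     np.linalg.norm(tau_new - tau))
        gamma, tau = gamma_new, tau_new
        if np.linalg.norm(r) < tol and change < tol:  # step vi)
            break
    return gamma

# Toy problem with two views; the constrained minimiser is (1/3, 2/3).
gamma = admm_view_weights(Q=np.diag([2.0, 1.0]), c=np.zeros(2))
```

The returned weights are non-negative and sum to one up to the tolerance, matching the roles of the non-negativity and normalization constraints described for the view weights.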
[0098] In step c) above, the mapping matrices are updated as follows:
[0099] With respect to the mapping matrices, problem P2 can be transformed into problem P4:
[0100] Problem P4:
[0101]
[0102] where:
[0103]
[0104] Solve problem P4 using the following steps:
[0105] i) Initialize q^(v) = 0 and the iteration number s = 0, and determine the convergence threshold;
[0106] ii) Calculate the gradient of the objective function of problem P4:
[0107]
[0108] where:
[0109]
[0110]
[0111]
[0112]
[0113] iii) Dynamically adjust the step size 1/L^(v) based on the gradient norm to adapt to the rate of change of the objective function, where:
[0114]
[0115] D = abcn
[0116] iv) Calculate the cumulative gradient:
[0117]
[0118] v) Update the mapping matrix using the following formula:
[0119]
[0120] vi) If the change in the mapping matrix between two adjacent iterations is less than a set threshold, take the current value as the optimal solution; otherwise, let s = s + 1 and return to step ii).
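A minimal sketch of steps i) to vi) is given below. The objective of problem P4, its gradient, and the formula for L^(v) are not reproduced in the text, so the sketch substitutes a toy quadratic 0.5·αᵀKα − bᵀα, uses a fixed step size 1/L with L the largest eigenvalue of K instead of a dynamically adjusted one, and assumes a momentum coefficient s/(s+3) capped at 0.9. The function name `momentum_descent` and these choices are illustrative only.

```python
import numpy as np

def momentum_descent(K, b, tol=1e-12, max_iter=5000):
    """Minimise 0.5*a'Ka - b'a (toy objective) with iteration-dependent momentum."""
    alpha = np.zeros_like(b)
    q = np.zeros_like(b)                      # step i): cumulative gradient
    L = np.linalg.eigvalsh(K).max()           # Lipschitz constant of the gradient
    for s in range(max_iter):
        grad = K @ alpha - b                  # step ii)
        theta = min(s / (s + 3.0), 0.9)       # momentum grows with s, capped at 0.9
        q = theta * q + grad                  # step iv)
        alpha_new = alpha - q / L             # steps iii) and v): step size 1/L
        if np.linalg.norm(alpha_new - alpha) < tol:  # step vi)
            return alpha_new
        alpha = alpha_new
    return alpha

# Toy symmetric positive definite system; the minimiser solves K a = b.
K = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
alpha_star = momentum_descent(K, b)
```

The cap on the momentum coefficient is a design choice for this sketch: it keeps the toy iteration converging at a geometric rate, whereas a coefficient that approaches 1 would slow convergence on this quadratic.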
[0121] Step S3: Using the learned weight coefficients γ* and mapping matrices, classify the image to be recognized:
[0122] Classification is performed with the classifier for the v-th view, which is expressed in terms of the kernel function as:
[0123]
[0124] The classification results f^(v) of the individual views are summed with the learned view weights to yield the final prediction result:
[0125]
[0126] where the argument of f^(v) is the visual feature of the image to be recognized at the v-th view.
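Step S3 can be sketched as follows. The per-view classifier is assumed to follow from w = φ(x)α + φ(x′)α′, which gives f^(v)(z) = Σᵢ αᵢ k(xᵢ, z) + Σᵢ α′ᵢ k(x′ᵢ, z); the Gaussian kernel, the sign threshold used to produce a label, and the names `rbf` and `predict` are assumptions made for illustration.

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def predict(views_train, alphas, gamma, views_test):
    """views_train[v] = (X, Xp); alphas[v] = (alpha, alpha_p); views_test[v] = Z."""
    score = 0.0
    for (X, Xp), (a, ap), g, Z in zip(views_train, alphas, gamma, views_test):
        f_v = rbf(Z, X) @ a + rbf(Z, Xp) @ ap   # per-view kernel classifier
        score = score + g * f_v                 # weighted sum over views
    return score, np.where(score >= 0, 1, -1)   # sign threshold is an assumption

# One view, one training pair (x = 0, x' = 1), one test point z = 0.
views_train = [(np.array([[0.0]]), np.array([[1.0]]))]
alphas = [(np.array([1.0]), np.array([-1.0]))]
score, label = predict(views_train, alphas, gamma=[1.0],
                       views_test=[np.array([[0.0]])])
```

In this toy call the score equals k(0, 0) − k(1, 0) = 1 − exp(−0.5), so the test point is assigned to the positive class.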
[0127] To verify the effectiveness and performance of the proposed recognition method, comparative experiments were conducted on the black-and-white image datasets MNIST, Kuzushiji, and Fashion, and on the color image dataset CIFAR-10. For the black-and-white image data, a 512-dimensional histogram of oriented gradients (HOG) and an 81-dimensional GIST descriptor were used as two different feature views; for the color image data, a 144-dimensional HOG and a 100-dimensional DenseHue descriptor were used as two feature views. These descriptors give each dataset a multi-view representation, which helps to comprehensively mine the local and global feature information of the data.
[0128] Tables 1-4 present the experimental results of five binary ordered data classification methods on the different image datasets under different values of π₊. The data in the tables are the average classification accuracy and standard deviation obtained from ten runs of each method. The experimental results show that, compared to methods 1-4, the method of this invention (last column) achieves the highest accuracy on all datasets, exceeding the second-best method (in bold) by up to 17% (MNIST dataset, π₊ = 0.2). This demonstrates the superior performance of the method of this invention in balancing overfitting and empirical risk, as well as in fusing the multi-view visual features of images. In addition, the standard deviation of the accuracy of the method of this invention is generally about one order of magnitude lower than that of the other methods, showing high training stability.
[0129] Table 1 Comparison of the recognition results of the five methods on the MNIST dataset under different π₊
[0130]
[0131]
[0132] Table 2 Comparison of the recognition results of the five methods on the Kuzushiji dataset under different π₊
[0133]
Class_prior  Pcomp-ABS      Pcomp-ReLU     Pcomp-Unbiased  Pcomp-Teacher  Method of the present invention
π₊ = 0.2     0.8101±0.0096  0.8121±0.0105  0.8103±0.0138   0.7557±0.0936  0.862±0.0055
π₊ = 0.5     0.8073±0.0185  0.806±0.0119   0.8196±0.0137   0.5972±0.2304  0.8443±0.0016
π₊ = 0.8     0.8121±0.0087  0.8074±0.0123  0.8064±0.005    0.778±0.1128   0.8855±0.0058
[0134] Table 3 Comparison of the recognition results of the five methods on the Fashion dataset under different π₊
[0135]
Class_prior  Pcomp-ABS      Pcomp-ReLU     Pcomp-Unbiased  Pcomp-Teacher  Method of the present invention
π₊ = 0.2     0.8029±0.0035  0.8039±0.0093  0.8018±0.0035   0.8046±0.2485  0.9462±0.0148
π₊ = 0.5     0.9417±0.0278  0.9376±0.0193  0.9517±0.0143   0.704±0.3953   0.9662±0.001
π₊ = 0.8     0.8078±0.0092  0.8102±0.0113  0.806±0.0105    0.7914±0.2135  0.9693±0.0016
[0136] Table 4 Comparison of the recognition results of the five methods on the CIFAR-10 dataset under different π₊
[0137]
Class_prior  Pcomp-ABS      Pcomp-ReLU     Pcomp-Unbiased  Pcomp-Teacher  Method of the present invention
π₊ = 0.2     0.8005±0.0008  0.8016±0.004   0.8008±0.0014   0.6918±0.1311  0.8803±0.0008
π₊ = 0.5     0.8159±0.0183  0.8147±0.0117  0.8192±0.0095   0.6349±0.1779  0.8375±0.0008
π₊ = 0.8     0.8017±0.0038  0.8012±0.0027  0.8024±0.0038   0.7134±0.1156  0.848±0.0121
[0138] Based on the same technical concept as the method embodiments, the present invention also provides a robust classification device for multi-view binary ordered image data, comprising:
[0139] The data preprocessing module is used to extract multi-view visual features from the binary ordered image data in the training set;
[0140] The model training module is used to feed the extracted features into a multi-view binary ordered data classification model to train and obtain the weight coefficients and mapping matrices for each view. The binary ordered data classification model is represented as the following optimization problem:
[0141] Problem P1:
[0142]
[0143] Constraints:
[0144] γ^(v) ≥ 0
[0145]
[0146]
[0147] In the formula, the visual features of the i-th binary ordered image sample at the v-th view consist of the visual feature vectors extracted from the first and second images, respectively, with the first image having a higher probability of belonging to the positive class than the second; n represents the number of binary ordered image samples used for training; m represents the number of different views; φ(·) represents the high-dimensional feature mapping function for each view; π₊ and π₋ represent the probabilities of the positive and negative classes, respectively; γ^(v) represents the weight of the v-th view; w^(v) is expressed in terms of a feature matrix and a mapping matrix; the mapping matrix and γ^(v) are the variables to be optimized; the problem further involves intermediate variables; and a > 0, b > 0, c > 0, d > 0 and λ > 0 are hyperparameters;
[0148] The image classification module is used to classify and recognize images based on the learned weight coefficients and mapping matrix.
[0149] It should be understood that the robust classification device for multi-view binary ordered image data in the embodiments of the present invention can implement all the technical solutions in the above method embodiments. The functions of each functional module can be specifically implemented according to the methods in the above method embodiments. The specific implementation process can be referred to the relevant descriptions in the above embodiments, which will not be repeated here.
[0150] The present invention also provides a computer device comprising: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, wherein when the programs are executed by the processors, they implement the steps of the robust classification method for multi-view binary ordered image data as described above.
[0151] The present invention also provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the robust classification method for multi-view binary ordered image data as described above.
[0152] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0153] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0154] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0155] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Claims
1. A robust classification method for multi-view binary ordered image data, characterized in that it comprises the following steps:
extracting multi-view visual features from the binary ordered image data in the training set;
feeding the extracted features into a multi-view binary ordered data classification model, and obtaining the weight coefficients and mapping matrices of each view through training, wherein the binary ordered data classification model is represented as the following optimization problem:
Problem P1:
Constraints:
where the visual features of the i-th binary ordered image sample at the v-th view consist of the visual feature vectors extracted from the first and second images, respectively, with the first image having a greater positive similarity than the second image; n denotes the number of binary ordered image samples used for training; m denotes the number of different views; φ(·) denotes the high-dimensional feature mapping function for each view; π₊ and π₋ denote the probabilities of the positive and negative classes, respectively; γ^(v) denotes the weight of the v-th view; w^(v) is expressed in terms of a feature matrix and a mapping matrix; the mapping matrix and γ^(v) are the variables to be optimized; the problem further involves intermediate variables; and a > 0, b > 0, c > 0, d > 0 and λ > 0 are hyperparameters;
wherein obtaining the weight coefficients and mapping matrices of each view through training includes: based on the representer theorem, transforming the binary ordered data classification problem P1 into the following kernel model:
Problem P2:
Constraints:
where the kernel matrix of the v-th view is used in block form, the kernel function k(·, ·) is the inner product of the high-dimensional feature mapping function φ(·), and the blocks K^(v), K′^(v) and K′′^(v) are the kernel matrices formed by the kernel function; the mapping matrices and γ^(v) are the variables to be optimized;
for problem P2, the alternating direction method of multipliers is adopted, combined with gradient descent and a momentum strategy related to the number of iterations, to progressively optimize the objective function and achieve fast convergence;
classifying and recognizing the image to be identified based on the learned weight coefficients and mapping matrices.
2. The method according to claim 1, characterized in that extracting multi-view visual features from the binary ordered image data in the training set includes: for black-and-white image data, using a 512-dimensional histogram of oriented gradients (HOG) and an 81-dimensional GIST descriptor as visual features of two different views; and for color image data, using a 144-dimensional HOG and a 100-dimensional DenseHue descriptor as visual features of two different views.
3. The method according to claim 1, characterized in that solving problem P2 comprises the following steps:
S201, inputting the number of views m, the positive-class probability π₊, the negative-class probability π₋, the training set data, the parameters a, b, c, d and λ, the step-size parameter μ of the alternating direction method of multipliers, and the convergence threshold ε; initializing the variables and letting the number of iterations l = 0;
S202, with the mapping matrices in problem P2 held at their current values, solving the optimization problem with respect to γ^(v) using the alternating direction method of multipliers to obtain γ^(l+1);
S203, with γ in problem P2 held at the value obtained in S202, using a gradient descent method based on a momentum strategy to find the optimal mapping matrices;
S204, if ||γ^(l+1) − γ^(l)|| > ε or the change in the mapping matrices exceeds ε, letting l = l + 1 and returning to step S202; otherwise, outputting the optimal solution.
4. The method according to claim 3, characterized in that, in step S202, γ^(v) is updated as follows:
defining a vector representing the complexity of the classifier model for each view and defining the view weight vector; with respect to γ^(v), problem P2 is transformed into problem P3:
Constraints: τ ≥ 0
where p is the Lagrange multiplier vector; F = [1^T; I_m] and H = [0^T; −I_m] are (m+1)×m matrices; 1 is a vector whose components are all 1; I_m is the m×m identity matrix; and 0 is a vector whose components are all 0;
problem P3 is solved using the following steps:
i) initialize τ and p, and determine the convergence threshold;
ii) update γ by the following formula:
iii) update τ by the following formula:
iv) update p by the following formula: p = p + μ(Fγ + Hτ − d);
v) dynamically adjust the parameter μ: μ = μ * β, where β is an adjustment factor greater than 1;
vi) if ||Fγ + Hτ − d||₂ and the changes of γ and τ between two adjacent iterations are all less than the set threshold, γ^(l+1) = γ is obtained; otherwise, return to step ii).
5. The method according to claim 3, characterized in that, in step S203, the mapping matrices are updated as follows:
with respect to the mapping matrices, problem P2 is transformed into problem P4:
where:
problem P4 is solved using the following steps:
i) initialize q^(v) = 0 and the iteration number s = 0, and determine the convergence threshold;
ii) calculate the gradient of the objective function of problem P4:
where:
iii) dynamically adjust the step size 1/L^(v) based on the gradient norm to adapt to the rate of change of the objective function, where:
iv) calculate the cumulative gradient:
v) update the mapping matrix using the following formula:
vi) if the change in the mapping matrix between two adjacent iterations is less than a set threshold, take the current value as the optimal solution; otherwise, let s = s + 1 and return to step ii).
6. The method according to claim 1, characterized in that classifying and recognizing the image to be identified based on the learned weight coefficients and mapping matrices includes:
performing classification with the classifier for the v-th view, which is expressed in terms of the kernel function as:
summing the classification results f^(v) of the individual views with the learned view weights to yield the final prediction result:
where the argument of f^(v) is the visual feature of the image to be recognized at the v-th view.
7. A robust classification device for multi-view binary ordered image data, characterized in that it comprises:
a data preprocessing module, used to extract multi-view visual features from the binary ordered image data in the training set;
a model training module, used to feed the extracted features into a multi-view binary ordered data classification model to train and obtain the weight coefficients and mapping matrices of each view, wherein the binary ordered data classification model is represented as the following optimization problem:
Problem P1:
Constraints:
where the visual features of the i-th binary ordered image sample at the v-th view consist of the visual feature vectors extracted from the first and second images, respectively, with the first image having a greater positive similarity than the second image; n denotes the number of binary ordered image samples used for training; m denotes the number of different views; φ(·) denotes the high-dimensional feature mapping function for each view; π₊ and π₋ denote the probabilities of the positive and negative classes, respectively; γ^(v) denotes the weight of the v-th view; w^(v) is expressed in terms of a feature matrix and a mapping matrix; the mapping matrix and γ^(v) are the variables to be optimized; the problem further involves intermediate variables; and a > 0, b > 0, c > 0, d > 0 and λ > 0 are hyperparameters;
wherein obtaining the weight coefficients and mapping matrices of each view through training includes: based on the representer theorem, transforming the binary ordered data classification problem P1 into the following kernel model:
Problem P2:
Constraints:
where the kernel matrix of the v-th view is used in block form, the kernel function k(·, ·) is the inner product of the high-dimensional feature mapping function φ(·), and the blocks K^(v), K′^(v) and K′′^(v) are the kernel matrices formed by the kernel function; the mapping matrices and γ^(v) are the variables to be optimized;
for problem P2, the alternating direction method of multipliers is adopted, combined with gradient descent and a momentum strategy related to the number of iterations, to progressively optimize the objective function and achieve fast convergence;
an image classification module, used to classify and recognize images based on the learned weight coefficients and mapping matrices.
8. A computer device, characterized in that it comprises: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and, when executed by the processors, implement the steps of the robust classification method for multi-view binary ordered image data as described in any one of claims 1-6.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the robust classification method for multi-view binary ordered image data as described in any one of claims 1-6.