Image processing method and apparatus therefor

By introducing Sobolev geometric space and adversarial preference optimization into image super-resolution technology, the problems of spectral mismatch and insufficient sample alignment in existing technologies are solved, achieving high-fidelity image restoration and improving image quality and the performance of downstream tasks.

CN122265037A · Pending · Publication Date: 2026-06-23 · INST OF AUTOMATION CHINESE ACAD OF SCI +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: INST OF AUTOMATION CHINESE ACAD OF SCI
Filing Date: 2026-04-10
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing image super-resolution techniques suffer from spectral mismatch, a lack of spatially aligned hard negative samples, and an inherent false-confidence blind spot when reconstructing high-quality images. As a result, the generated images fail to restore high-frequency details and cannot meet the fidelity and usability requirements of complex real-world scenes.

Method used

By constructing a Sobolev geometric space and introducing an adversarial preference optimization mechanism, the geometric structure of the optimization process is reshaped. The structured spectral operator and adversarial network are used to dynamically capture the structural artifacts of the model, ensuring strict alignment of positive and negative samples in spatial semantics. This breaks the isotropic assumption in traditional methods and achieves high-fidelity image restoration.

Benefits of technology

It significantly improves the ability to restore high-frequency details in generated images. The generated images surpass existing models in terms of visual realism and pixel-level structural fidelity, and improve the performance of downstream tasks such as OCR, object detection and instance segmentation.

✦ Generated by Eureka AI based on patent content.

Abstract

The present disclosure provides an image processing method and apparatus thereof. An image processing method can include constructing a conditional probability path regarding a noise distribution and a high resolution data distribution conditioned on an image to be processed; constructing a velocity network based on the conditional probability path and the image to be processed; constructing a Sobolev geometry space for preference optimization of the velocity network; preference optimizing the velocity network in the Sobolev geometry space; and obtaining a high resolution image of the image to be processed based on the optimized velocity network.

Description

Technical Field

[0001] This disclosure relates to the field of computer vision, and more particularly to an image processing method, an image processing apparatus, an electronic device, a computer-readable storage medium, and a computer program product.

Background Technology

[0002] Image super-resolution (SR), a fundamental and crucial inverse image problem in computer vision, aims to recover and reconstruct high-quality (HQ) sharp images containing rich high-frequency details from damaged, blurred, or degraded low-quality (LQ) image observations. In practical applications, image degradation processes in the real world are often unknown, irreversible, and highly ill-conditioned, making high-fidelity super-resolution reconstruction with real physical meaning a consistently challenging task.

Summary of the Invention

[0003] According to a first aspect of the present disclosure, an image processing method is provided, the method comprising: constructing a conditional probability path relating a noise distribution and a high-resolution data distribution conditioned on an image to be processed; constructing a velocity network based on the conditional probability path and the image to be processed; constructing a Sobolev geometric space for preference optimization of the velocity network; performing preference optimization of the velocity network in the Sobolev geometric space; and obtaining a high-resolution image of the image to be processed based on the optimized velocity network.

[0004] Optionally, constructing a Sobolev geometric space for preference optimization of the velocity network includes: obtaining the Sobolev geometric space by replacing the identity matrix in the noise distribution with a structured spectral operator, wherein the structured spectral operator is determined based on the frequency of the image to be processed.

[0005] Optionally, preference optimization of the velocity network in the Sobolev geometric space includes: determining the Sobolev energy difference between the velocity network and the reference network in the Sobolev geometric space; obtaining semantically consistent preference data pairs; and performing preference optimization of the velocity network based on the Sobolev energy difference and the preference data pairs.

[0006] Optionally, determining the Sobolev energy difference between the velocity network and the reference network in the Sobolev geometric space includes: obtaining the Sobolev energy difference based on the residual between the velocity network and the target conditional vector field in the Sobolev geometric space and the residual between the reference network and the target conditional vector field, wherein the target conditional vector field is obtained based on the conditional probability path.

[0007] Optionally, obtaining semantically consistent preference data pairs includes: determining an intermediate flow state from the conditional probability path as a positive sample; predicting a degradation trajectory based on the positive sample to determine a degradation estimate of the high-resolution image; obtaining a negative sample corresponding to the positive sample based on the degradation estimate and the noise distribution corresponding to the positive sample; and constructing the preference data pair based on the positive sample and the negative sample.

[0008] Optionally, predicting a degradation trajectory based on the positive samples to determine a degradation estimate of the high-resolution image includes: determining another velocity network using a constructed adversarial network; generating trajectory deflection values using the other velocity network and the positive samples; determining the direction of the negative samples based on the structured spectral operator, the partial derivative of the conditional probability path, and the residual energy of the adversarial network; and determining a degradation estimate of the high-resolution image based on the positive samples, the directions of the negative samples, and the trajectory deflection values.

[0009] According to a second aspect of the present disclosure, an image processing apparatus is provided, the apparatus comprising: a network construction module configured to: construct a conditional probability path with respect to a noise distribution and a high-resolution data distribution conditioned on an image to be processed; construct a velocity network based on the conditional probability path and the image to be processed; a space construction module configured to: construct a Sobolev geometric space for preference optimization of the velocity network; and an optimization module configured to perform preference optimization of the velocity network in the Sobolev geometric space; and obtain a high-resolution image of the image to be processed based on the optimized velocity network.

[0010] Optionally, the space construction module is configured to: obtain the Sobolev geometric space by replacing the identity matrix in the noise distribution with a structured spectral operator, wherein the structured spectral operator is determined based on the frequency of the image to be processed.

[0011] Optionally, the optimization module is configured to: determine the Sobolev energy difference between the velocity network and the reference network in the Sobolev geometric space; obtain semantically consistent preference data pairs; and perform preference optimization on the velocity network based on the Sobolev energy difference and the preference data pairs.

[0012] Optionally, the optimization module is configured to obtain the Sobolev energy difference based on the residual between the velocity network and the target conditional vector field in the Sobolev geometric space and the residual between the reference network and the target conditional vector field, wherein the target conditional vector field is obtained based on the conditional probability path.

[0013] Optionally, the optimization module is configured to: determine an intermediate flow state from the conditional probability path as a positive sample; predict a degradation trajectory based on the positive sample to determine a degradation estimate of the high-resolution image; obtain a negative sample corresponding to the positive sample based on the degradation estimate and the noise distribution corresponding to the positive sample; and construct the preference data pair based on the positive sample and the negative sample.

[0014] Optionally, the optimization module is configured to: determine another velocity network using the constructed adversarial network; generate trajectory deflection values using the other velocity network and the positive samples; determine the direction of the negative samples based on the structured spectral operator, the partial derivative of the conditional probability path, and the residual energy of the adversarial network; and determine the degradation estimate of the high-resolution image based on the positive samples, the direction of the negative samples, and the trajectory deflection values.

[0015] According to a third aspect of the present disclosure, an electronic device is provided, the electronic device may include: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the image processing method as described above.

[0016] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided that stores instructions which, when executed by at least one processor, cause the at least one processor to perform the image processing method as described above.

[0017] According to a fifth aspect of the present disclosure, a computer program product is provided, wherein instructions in the computer program product are executed by at least one processor in an electronic device to perform the image processing method as described above.

[0018] The technical solutions provided by the embodiments of this disclosure bring at least the following beneficial effects: by performing preference optimization on the velocity network used to obtain high-resolution images in the constructed Sobolev geometric space, the obtained high-resolution images are more accurate and natural.

[0019] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.

Brief Description of the Drawings

[0020] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure, and are not intended to unduly limit this disclosure.

[0021] Figure 1 is a flowchart of an image processing method according to an embodiment of the present disclosure.

[0022] Figure 2 is another flowchart of an image processing method according to an embodiment of the present disclosure.

[0023] Figure 3 is a block diagram of an image processing apparatus according to an embodiment of the present disclosure.

[0024] Figure 4 is a schematic diagram of the structure of an image processing device in the hardware operating environment of an embodiment of this disclosure.

[0025] Figure 5 is a block diagram of an electronic device according to an embodiment of the present disclosure.

[0026] Throughout the accompanying drawings, it should be noted that the same reference numerals are used to denote the same or similar elements, features, and structures.

Detailed Description

[0027] The following description, provided with reference to the accompanying drawings, is intended to aid in a full understanding of embodiments of the present disclosure as defined by the claims and their equivalents. Various specific details are included to aid understanding, but these details are to be considered exemplary only. Therefore, those skilled in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Furthermore, for clarity and brevity, descriptions of well-known functions and structures are omitted.

[0028] It should be noted that the terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this disclosure described herein can be implemented in orders other than those illustrated or described herein. The embodiments described in the following examples do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.

[0029] It should be noted that the phrase "at least one of several items" in this disclosure refers to three parallel cases: "any one of the several items", "a combination of any number of the several items", and "all of the several items". For example, "including at least one of A and B" includes the following three parallel cases: (1) including A; (2) including B; (3) including A and B. Another example is "performing at least one of step one and step two", which means the following three parallel cases: (1) performing step one; (2) performing step two; (3) performing both step one and step two.

[0030] With the rapid development of deep learning, especially large-scale generative models (such as diffusion models and flow matching models), image super-resolution technology has achieved significant breakthroughs in synthesizing realistic textures and improving visual perception quality by leveraging the powerful generative priors inherent in large models. However, despite these advances, current mainstream supervised training paradigms still face a fundamental theoretical ceiling in achieving truly high-fidelity image restoration. This is because existing standard methods typically rely on artificially synthesized degraded data pairs and enforce strict pixel-level alignment. This rigid dependency forces the model to overfit to artificially set degradation assumptions during optimization, causing the model to tend to "memorize" the synthesized degradation patterns rather than truly capturing and learning the underlying manifold and authentic texture features of real natural images.

[0031] To overcome this bottleneck in supervised training, researchers have begun exploring the introduction of human preference alignment mechanisms into the field of super-resolution. Among these, the Direct Preference Optimization (DPO) algorithm is considered a highly promising approach. By introducing preference optimization, random biases in the model's generation process can be explicitly penalized, and the generated images can be forced to follow the true data manifold of natural images, thereby better aligning the generation prior with human subjective perception of quality.

[0032] However, in-depth theoretical and experimental analysis reveals a fatal geometric and spectral flaw when the standard DPO method is directly applied to super-resolution tasks: existing alignment methods typically rely on the naive assumption of isotropic Gaussian parameterization. From a spectral analysis perspective, this isotropic objective function implicitly assumes a "spectral flatness," which leads to severe spectral misalignment compared to the inherent exponential decay of the spectral distribution in natural images (i.e., the energy of natural images is mainly concentrated in low frequencies, with high-frequency energy decaying rapidly).

[0033] While such a spectral flatness prior is acceptable in coarse-grained text-to-image generation tasks, it is highly destructive for super-resolution tasks requiring strict high-frequency fidelity. Lacking an inductive bias that effectively distinguishes between genuine high-frequency details and spurious noise, this isotropic optimization objective inevitably leads to numerous high-frequency structural artifacts (hallucinations) that violate the natural data manifold. The model incorrectly generates high-frequency noise as texture, severely limiting the fidelity and usability of image super-resolution in complex real-world scenes. Therefore, a novel architecture or optimization paradigm is urgently needed to break the isotropic assumption of existing models and guide the super-resolution model's generation process within a geometric space that strictly adheres to the spectral characteristics of natural images.

[0034] Existing image super-resolution techniques primarily treat this task as an inverse problem, aiming to reconstruct high-quality images from low-quality images. Currently, the powerful generative priors of large-scale visual models (such as diffusion models and flow matching models) are widely used to synthesize realistic image textures. To better align the prior distribution of the generative model with human subjective visual perception, a direct preference optimization paradigm is often introduced.

[0035] In their mathematical implementation, generative super-resolution frameworks (such as those based on flow matching or continuous-time ordinary differential equations) typically assume an isotropic Gaussian prior and use a standard scalar variance in the local transfer kernel. To make the objective function tractable, the DPO algorithm implicitly relies on a flat Euclidean space when evaluating the model-generated trajectory; that is, it constructs the log-likelihood ratio objective function by calculating the least-squares error ($L_2$ norm) between the predicted and target velocity fields. On the data side, preference learning typically relies on supervised regression training with static data pairs or directly borrows preference datasets from text-to-image generation tasks for fine-tuning.

[0036] The above-mentioned solutions still have the following theoretical and engineering shortcomings when pursuing high-fidelity image super-resolution:

1. The isotropic Gaussian assumption leads to a severe “spectral mismatch”.

[0037] The DPO framework relies heavily on a naive isotropic Gaussian parameterization. From an information theory and signal processing perspective, this parameterization essentially assumes a "spectrally flat" prior distribution, while the manifold of real natural images exhibits significant spectral attenuation. The adopted Euclidean space objective ($L_2$ norm loss) imposes a completely uniform and indiscriminate weight penalty on the entire spectrum in the frequency domain. This "frequency-agnostic" characteristic cannot counteract the inherent spectral bias of neural networks, so the model cannot distinguish between real high-frequency details and spurious generated noise during reconstruction. Ultimately, the generated distribution deviates significantly from the statistical regularities of natural images in the frequency domain, inevitably producing structural artifacts that violate the data manifold or losing fine textures.

[0038] 2. Lack of high-quality “hard negatives” with strict spatial alignment.

[0039] DPO-based methods face a severe "information scarcity" bottleneck in super-resolution tasks because standard super-resolution datasets only provide regression-guided static data pairs, lacking preference triples for contrastive learning. Text-to-image preference datasets exhibit extremely weak spatial correspondences; the differences in preferences primarily stem from variations in semantic layout rather than differences in image restoration quality. Therefore, "negative samples" selected from such datasets cannot achieve strict spatial alignment with real images, masking genuine structural degradation and preventing the model from learning the subtle structural differences unique to super-resolution tasks.

[0040] 3. The model's inherent "misaligned confidence" blind spot has not been corrected.

[0041] Generative models commonly exhibit an ill-posed characteristic: they not only assign high likelihood to real data but also assign extremely low residual energy (high confidence) to erroneous samples containing significant structural degradations (such as texture distortion, aliasing, and other spectral artifacts). This is because the Euclidean objective function ($L_2$ norm) cannot effectively penalize these coherent structural hallucinations, leading the model to perceive these artifacts as "reasonable" details. Conventional adversarial learning or simple noise injection strategies merely maximize the loss and so generate meaningless random noise; they fail to expose these hidden blind spots in the model and thus fail to provide valuable gradient guidance signals for preference learning.

[0042] To address the aforementioned problems, this disclosure proposes an image super-resolution method based on adversarial Sobolev alignment. This disclosure breaks the inherent "isotropic" flat geometry assumption in traditional generative preference optimization, constructing a Sobolev geometric space that conforms to the natural spectral decay law of images. Furthermore, this disclosure dynamically captures and corrects the structural failure modes of the model through an adversarial learning mechanism. The technical concept of this disclosure will be described in detail below.

[0043] Figure 1 is a flowchart of an image processing method according to an embodiment of the present disclosure. The method shown in Figure 1 can be performed by an electronic device with image processing capabilities.

[0044] Referring to Figure 1, in step S101, a conditional probability path is constructed with respect to the noise distribution and the high-resolution data distribution conditioned on the image to be processed.

[0045] According to an embodiment, the underlying generation process of an image can be established on a conditional flow matching framework, and the mapping from noise to a high-resolution image can be achieved through continuous-time ordinary differential equations.

[0046] Let $p_{\text{data}}(x \mid c)$ denote the high-resolution data distribution (also referred to as the high-resolution image or target image) conditioned on the low-quality image $c$ (i.e., the image to be processed), and let $\mathcal{N}(0, I)$ denote the (prior) noise distribution, where $I$ is the identity matrix. In this case, the conditional probability path can be defined as a linear interpolation between a noise sample and a high-resolution sample, as shown in equation (1), where the time $t \in [0, 1]$.

[0047] The above-described conditional probability paths are merely exemplary, and this disclosure is not limited thereto.

[0048] In step S102, a velocity network is constructed based on the conditional probability path and the image to be processed.

[0049] According to the embodiment, differentiating the above conditional probability path with respect to time $t$ yields the target conditional vector field. The velocity network can be built and trained using any neural network. By making the velocity network $v_\theta$ approximate the target conditional vector field, the gradual generation of the image is achieved. In this disclosure, the velocity network may also be referred to as the velocity field.
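
As a minimal sketch of steps S101 and S102, a linear path and its time derivative can be written in a few lines of NumPy. The convention that t = 0 corresponds to noise and t = 1 to the high-resolution image, as well as the names `linear_path` and `target_velocity`, are assumptions made for illustration; this text does not fix them.

```python
import numpy as np

def linear_path(noise, hr_image, t):
    """Point on a linear path; t = 0 is noise, t = 1 is the image (assumed convention)."""
    return (1.0 - t) * noise + t * hr_image

def target_velocity(noise, hr_image):
    """Time derivative of linear_path; constant in t because the path is linear."""
    return hr_image - noise

rng = np.random.default_rng(0)
hr_image = rng.random((8, 8))            # stand-in for a high-resolution sample
noise = rng.standard_normal((8, 8))      # sample from N(0, I)
x_half = linear_path(noise, hr_image, 0.5)
u = target_velocity(noise, hr_image)
```

A velocity network would be trained to regress `u` from `x_half`, the time, and the low-quality image; the network itself is omitted here.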

[0050] In step S103, a Sobolev geometric space is constructed for preference optimization of the velocity network.

[0051] To address the spectral mismatch problem caused by the indiscriminate penalty of the L2 norm for all frequencies in traditional methods, this disclosure performs spectral correction on the local transfer kernel.

[0052] As an example, the identity matrix in the noise distribution can be replaced with a structured spectral operator to obtain the Sobolev geometric space, where the structured spectral operator can be determined based on the frequency of the image to be processed.

[0053] According to the embodiment, a flow matching paradigm (i.e., a conditional flow matching framework) is adopted. For both the ideal posterior distribution $q$ and the policy $p_*$, the local transition is represented by a Gaussian approximation around the trajectory of the deterministic ordinary differential equation, thereby parameterizing the local transition, as shown in equation (2), where $* \in \{\theta, \text{ref}\}$, $\text{ref}$ denotes the reference network, and $\sigma$ denotes the auxiliary variance parameter in the likelihood definition. This isotropic parameterization implicitly assumes an underlying Euclidean geometry.

[0054] Let $r_\theta$ denote the residual between the velocity network and the target conditional vector field, as shown in equation (3):

$$r_\theta = v_\theta - u_t \tag{3}$$

where $v_\theta$ denotes the velocity network and $u_t$ denotes the target conditional vector field.

[0055] Therefore, the log-likelihood ratio objective function can be simplified to a difference of squared $L_2$ norms, as shown in equation (4):

$$\log \frac{p_\theta}{p_{\text{ref}}} \propto -\left( \lVert r_\theta \rVert_2^2 - \lVert r_{\text{ref}} \rVert_2^2 \right) \tag{4}$$

where $p_\theta$ and $p_{\text{ref}}$ denote the local transition likelihoods under the velocity network and the reference network, and $r_\theta$ and $r_{\text{ref}}$ denote their residuals with respect to the target conditional vector field. However, the root cause of spectral misalignment lies precisely in this dependence on the $L_2$ norm. According to Parseval's theorem, the spatial error can be expressed in the frequency domain, as shown in equation (5):

$$\lVert r \rVert_2^2 = \frac{1}{MN} \sum_{k} \lvert \hat{r}(k) \rvert^2, \qquad \hat{r} = \mathcal{F}(r) \tag{5}$$

where $\mathcal{F}$ denotes the Fourier transform, $\hat{r}(k)$ denotes the spectral error component at frequency index $k$, and $M$ and $N$ are the spatial dimensions of the image. This equation shows that the optimization process imposes a uniform weight across the entire spectrum. This "spectral indifference" is disastrous in practice. The standard $L_2$ objective function cannot offset the inherent spectral bias of neural networks, resulting in a significant deviation between the learned distribution in the low-frequency region and the statistical characteristics of natural images. This spectral deficiency not only exists at the theoretical level but also manifests directly in the loss of fine textures and the generation of artifacts.
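
The uniform spectral weighting implied by Parseval's theorem can be checked numerically. The sketch below assumes NumPy's unnormalized 2-D DFT (`np.fft.fft2`), for which the factor 1/(MN) appears on the frequency side; the toy residuals are illustrative and are not data from the disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 16, 24
residual = rng.standard_normal((M, N))       # stand-in for a velocity residual

spectrum = np.fft.fft2(residual)             # unnormalized 2-D DFT
spatial_energy = np.sum(residual ** 2)
spectral_energy = np.sum(np.abs(spectrum) ** 2) / (M * N)

# Two errors of equal amplitude at different frequencies receive the same L2 penalty.
rows = np.arange(M)[:, None] * np.ones((1, N))
low_freq_error = np.cos(2 * np.pi * 1 * rows / M)    # 1 cycle along the rows
high_freq_error = np.cos(2 * np.pi * 5 * rows / M)   # 5 cycles along the rows
low_penalty = np.sum(low_freq_error ** 2)
high_penalty = np.sum(high_freq_error ** 2)
```

Because `low_penalty` and `high_penalty` coincide, a plain squared-error objective has no way to treat fine-scale errors differently from coarse ones.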

[0056] According to an embodiment, this disclosure proposes a Sobolev spectral rectification method. This method reshapes the underlying optimization geometry by replacing the original isotropic noise assumption with colored Gaussian noise. Specifically, the transfer kernel can be generalized by replacing the identity matrix $I$ with a structured spectral operator $\Sigma_s$, that is, $I \rightarrow \Sigma_s$. The operator $\Sigma_s$ can also be called a structured covariance matrix and is defined by equation (6), where $s$ denotes a hyperparameter and $\omega$ denotes the image frequency, for example, the frequency of the image to be processed during the model's processing.

[0057] Because the inverse covariance matrix (i.e., the inverse $\Sigma_s^{-1}$ of the structured covariance matrix) imposes a higher penalty on high-frequency errors, this transformation elevates the underlying optimization geometry from a flat Euclidean space to a weighted Sobolev manifold, that is, into the Sobolev geometric space.
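
The exact form of the structured spectral operator is not legible in this text. As a hedged illustration, the sketch below assumes the textbook Sobolev symbol, in which the covariance acts as the frequency-domain multiplier (1 + |ω|²)^(-s) and its inverse as (1 + |ω|²)^s. The function names and the choice of angular frequencies from `np.fft.fftfreq` are assumptions, not the disclosure's definitions.

```python
import numpy as np

def sobolev_precision(shape, s):
    """Assumed symbol (1 + |omega|^2)^s of the inverse covariance (precision)."""
    wy = 2.0 * np.pi * np.fft.fftfreq(shape[0])
    wx = 2.0 * np.pi * np.fft.fftfreq(shape[1])
    return (1.0 + wy[:, None] ** 2 + wx[None, :] ** 2) ** s

def sobolev_sq_norm(residual, s):
    """Squared norm r^T Sigma^{-1} r, evaluated in the frequency domain."""
    spectrum = np.fft.fft2(residual)
    weights = sobolev_precision(residual.shape, s)
    return float(np.sum(weights * np.abs(spectrum) ** 2) / residual.size)

def colored_noise(shape, s, rng):
    """Gaussian noise whose covariance has the low-pass symbol (1 + |omega|^2)^(-s)."""
    white = rng.standard_normal(shape)
    cov_sqrt = sobolev_precision(shape, s) ** -0.5
    return np.real(np.fft.ifft2(cov_sqrt * np.fft.fft2(white)))

rng = np.random.default_rng(0)
idx = np.arange(8)
low = np.ones((8, 8))                                   # purely low-frequency error
high = (-1.0) ** (idx[:, None] + idx[None, :])          # checkerboard, highest frequency
sample = colored_noise((8, 8), 1.0, rng)
```

With `s = 0` the weights are all one and the norm reduces to the plain squared error; with `s > 0` the checkerboard error `high` is penalized more heavily than `low`, although both have the same Euclidean energy.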

[0058] In step S104, preference optimization is performed on the velocity network in the Sobolev geometric space.

[0059] According to the embodiment, the Sobolev energy difference between the velocity network and the reference network is determined in the Sobolev geometric space, semantically consistent preference data pairs are obtained, and preference optimization is performed on the velocity network based on the Sobolev energy difference and preference data pairs.

[0060] The structured covariance matrix $\Sigma_s$ induces a fundamental shift in the optimization metric. Since the Gaussian likelihood is governed by the Mahalanobis distance, the learning signal is shaped by the precision matrix $\Sigma_s^{-1}$. Although $\Sigma_s$ acts as a low-pass filter, its inverse $\Sigma_s^{-1}$ amplifies high-frequency components and thus imposes a more severe penalty on fine-grained differences. Therefore, $\Sigma_s^{-1}$ recovers the Sobolev inner product operator, effectively elevating the optimization process from a flat Euclidean space to a weighted Sobolev manifold. Accordingly, the log-likelihood ratio (i.e., equation (4) above) can be explicitly rewritten as a difference of squared Sobolev norms, as shown in equation (7):

$$\log \frac{p_\theta}{p_{\text{ref}}} \propto -\left( \lVert r_\theta \rVert_{\Sigma_s^{-1}}^2 - \lVert r_{\text{ref}} \rVert_{\Sigma_s^{-1}}^2 \right) \tag{7}$$

where $r_\theta$ and $r_{\text{ref}}$ denote the residuals of the velocity network and of the reference network with respect to the target conditional vector field, and $\lVert r \rVert_{\Sigma_s^{-1}}^2 = r^\top \Sigma_s^{-1} r$.

In this manifold space, the Sobolev energy difference can be obtained based on the residual between the velocity network and the target conditional vector field and the residual between the reference network and the target conditional vector field. For example, the log-likelihood ratio between the residual of the velocity network and the residual of the reference network can be transformed into the Sobolev energy gap $\Delta_s$, as shown in equation (8):

$$\Delta_s = \lVert r_\theta \rVert_{\Sigma_s^{-1}}^2 - \lVert r_{\text{ref}} \rVert_{\Sigma_s^{-1}}^2 \tag{8}$$

To ensure that the positive and negative samples in the direct preference optimization process are strictly consistent in spatial semantics, this disclosure designs a coupled sampling mechanism. This mechanism determines an intermediate flow state from the conditional probability path as a positive sample, predicts the degradation trajectory based on the positive sample to determine the degradation estimate of the high-resolution image, obtains the negative sample corresponding to the positive sample based on the degradation estimate and the noise distribution corresponding to the positive sample, and forms a preference data pair from the positive and negative samples.
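
The Sobolev energy difference between the velocity network and the reference network can be sketched as follows. The spectral weights (1 + |ω|²)^s and all function and variable names are assumptions for illustration; the arrays stand in for network outputs and are not results from the disclosure.

```python
import numpy as np

def sobolev_sq_norm(residual, s):
    """Squared Sobolev-weighted norm with the assumed weights (1 + |omega|^2)^s."""
    wy = 2.0 * np.pi * np.fft.fftfreq(residual.shape[0])
    wx = 2.0 * np.pi * np.fft.fftfreq(residual.shape[1])
    weights = (1.0 + wy[:, None] ** 2 + wx[None, :] ** 2) ** s
    spectrum = np.fft.fft2(residual)
    return float(np.sum(weights * np.abs(spectrum) ** 2) / residual.size)

def sobolev_energy_gap(v_policy, v_ref, target, s):
    """Energy of the policy residual minus energy of the reference residual."""
    return sobolev_sq_norm(v_policy - target, s) - sobolev_sq_norm(v_ref - target, s)

rng = np.random.default_rng(0)
target = rng.standard_normal((8, 8))          # stand-in for the target vector field
error = rng.standard_normal((8, 8))
v_ref = target + 0.5 * error                  # reference network output (toy)
v_policy = target + 0.1 * error               # policy closer to the target (toy)
gap = sobolev_energy_gap(v_policy, v_ref, target, s=1.0)
```

A negative `gap` means the velocity network fits the target better than the reference under the Sobolev metric, which is the quantity a preference objective can then compare between positive and negative samples.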

[0061] As an example, positive samples are obtained first. For instance, an intermediate flow state can be extracted along the conditional probability path as a positive sample, denoted as $x_t^w$. Then, the degradation trajectory is predicted. As an example, the adversarial network constructed in this disclosure can be used to determine another velocity network, which, together with the positive samples, generates trajectory deflection values. The direction of the negative samples is determined based on the structured spectral operator, the partial derivative of the conditional probability path, and the residual energy of the adversarial network. A degradation estimate of the high-resolution image is then determined based on the positive samples, the directions of the negative samples, and the trajectory deflection values.

[0062] For example, the adversarial network is used to predict a velocity field that produces the trajectory deflection, and the degradation estimate is obtained by linear extrapolation, as shown in equation (9), where the superscript $w$ marks the direction to be learned, the superscript $a$ marks the direction to be learned away from, and the hat symbol ($\hat{\ }$) marks a predicted sample.

[0063] The negative branch uses exactly the same initial noise as the positive sample branch: the degradation estimate is reprojected back to the flow state with that noise, as shown in equation (10). This mechanism isolates the interference caused by random variation, so that the synthesized preference data pair achieves precise semantic alignment.
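
The coupled sampling mechanism can be illustrated with a small sketch. It assumes a linear path with t = 0 at noise and t = 1 at the image, so that linear extrapolation to the image end reads x_t + (1 - t)·v. The function `toy_adversary` is a hypothetical stand-in for the adversarial velocity network, and the equations of the disclosure may differ in convention.

```python
import numpy as np

def flow_state(noise, image, t):
    """Linear path; t = 0 is noise and t = 1 is the image (assumed convention)."""
    return (1.0 - t) * noise + t * image

def degradation_estimate(x_t, t, velocity):
    """Extrapolate linearly from the flow state x_t to t = 1 along a velocity."""
    return x_t + (1.0 - t) * velocity

def coupled_pair(noise, hr_image, t, adversarial_velocity):
    """Build a (positive, negative) flow-state pair that shares one noise sample."""
    x_t_pos = flow_state(noise, hr_image, t)
    deflected = adversarial_velocity(x_t_pos, t)
    degraded_image = degradation_estimate(x_t_pos, t, deflected)
    x_t_neg = flow_state(noise, degraded_image, t)   # same noise as the positive branch
    return x_t_pos, x_t_neg

rng = np.random.default_rng(0)
hr_image = rng.random((8, 8))
noise = rng.standard_normal((8, 8))
artifact = 0.1 * rng.standard_normal((8, 8))         # hypothetical structural deflection

def toy_adversary(x_t, t):
    # Hypothetical adversary: the true target velocity plus a deflection.
    return (hr_image - noise) + artifact

x_t_pos, x_t_neg = coupled_pair(noise, hr_image, 0.5, toy_adversary)
```

Because both branches reuse `noise`, the pair differs only by the term induced by `artifact`, so the preference signal reflects restoration quality rather than random variation.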

[0064] According to the embodiments, in order to address the ill-conditioned limitation of models assigning high likelihood (i.e., "false confidence") to structural artifacts, this disclosure constructs a parameterized adversarial network. The adversarial network constructed in this disclosure can be used to derive the direction of negative samples, i.e., the direction that needs to be learned to move away from.

[0065] As an example, the adversarial optimization objective is defined as follows. Instead of generating random noise as conventional adversarial methods do, this disclosure searches for negative samples that simulate the model's "blind spot". The adversarial network is set to minimize the Euclidean residual energy under the Sobolev trust region constraint.

[0066] The optimization objective minimizes the residual energy of the model in image space, subject to a constraint that bounds the adversarial perturbation by a preset perturbation threshold.

[0067] Derivation of the optimal adversarial direction: based on the Riesz representation theorem and frequency-domain duality, the optimal perturbation for the above constrained optimization problem points along the worst-case Sobolev gradient direction. Since computing the closed-form solution directly is intractable, this disclosure uses the parameterized adversarial network to approximate this direction, as shown in equation (11). By introducing the structured spectral operator, the adversarial network is forced to learn coherent structural artifacts (such as texture distortion) that mimic the model's inherent failure modes, rather than generating meaningless random white noise that is easily filtered out, thereby synthesizing realistic artifacts.
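To illustrate why a Sobolev trust region favors coherent perturbations, the sketch below solves the first-order version of the constrained problem in closed form on a toy example. It assumes a precision operator with Fourier eigenvalues (1 + |k|^2)^s and a linearized objective; the function names and the random stand-in gradient `g` are hypothetical, and the disclosure itself approximates this direction with a trained adversarial network rather than computing it this way.

```python
import numpy as np


def precision_spectrum(shape, s=1.0):
    """Eigenvalues of an assumed Sobolev precision operator (1 + |k|^2)^s."""
    ky = 2.0 * np.pi * np.fft.fftfreq(shape[0])[:, None]
    kx = 2.0 * np.pi * np.fft.fftfreq(shape[1])[None, :]
    return (1.0 + kx**2 + ky**2) ** s


def sobolev_norm(u, spectrum):
    """Weighted norm sqrt(u^T P u), evaluated with the FFT (Parseval)."""
    U = np.fft.fft2(u)
    return float(np.sqrt(np.sum(spectrum * np.abs(U) ** 2) / u.size))


def sobolev_gradient(g, spectrum):
    """Riesz representative of the Euclidean gradient g (a low-pass of g)."""
    return np.real(np.fft.ifft2(np.fft.fft2(g) / spectrum))


def worst_case_perturbation(g, spectrum, radius):
    """Minimize <g, delta> subject to ||delta||_Sobolev <= radius."""
    direction = sobolev_gradient(g, spectrum)
    return -radius * direction / sobolev_norm(direction, spectrum)


rng = np.random.default_rng(0)
g = rng.standard_normal((16, 16))      # stand-in for a residual-energy gradient
spectrum = precision_spectrum(g.shape, s=2.0)
delta = worst_case_perturbation(g, spectrum, radius=0.5)

# Compare with random perturbations rescaled onto the same trust-region boundary.
best_random = min(
    float(np.sum(g * (0.5 * d / sobolev_norm(d, spectrum))))
    for d in rng.standard_normal((100, 16, 16))
)
print(float(np.sum(g * delta)) <= best_random)
```

The optimal direction is the Euclidean gradient divided by the precision spectrum, i.e. a smoothed version of it, which is consistent with the text's point that the perturbation is structured rather than white noise.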

[0068] According to the embodiment, the synthesized strictly aligned data pairs can be incorporated into the preference dataset D and combined with the derived Sobolev energy difference to construct a joint loss function, as shown in equation (12), which includes a temperature hyperparameter. Minimizing this loss function is equivalent to maximizing the energy gap, which forces the model to automatically reject the "structural failure manifold" identified by the adversarial network during inference and thus to generate high-fidelity images strictly along the Sobolev geometric trajectory that conforms to the spectral characteristics of natural images.
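A DPO-style logistic loss on the energy gap can be sketched as below. The exact form of equation (12) is not reproduced here, so this is an assumed form: `beta` stands for the temperature hyperparameter, and the four energies are the Sobolev residual energies of the velocity network (`theta`) and the reference network (`ref`) on the positive (`w`) and negative (`a`) samples. All names are hypothetical.

```python
import numpy as np


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    return -np.logaddexp(0.0, -np.asarray(z, dtype=float))


def preference_loss(e_theta_w, e_ref_w, e_theta_a, e_ref_a, beta=1.0):
    """Assumed DPO-style loss: -log sigmoid(beta * energy gap).

    The gap grows when the model lowers its energy on positive samples and
    raises it on negative samples, each measured relative to the reference.
    """
    gap = (np.asarray(e_theta_a) - e_ref_a) - (np.asarray(e_theta_w) - e_ref_w)
    return float(np.mean(-log_sigmoid(beta * gap)))


# Model identical to the reference: zero gap.
print(preference_loss(1.0, 1.0, 1.0, 1.0))
# Model prefers the positive sample and rejects the negative one: smaller loss.
print(preference_loss(0.5, 1.0, 2.0, 1.0))
```

Since the loss is monotonically decreasing in the gap, minimizing it and maximizing the energy gap are the same objective, as the paragraph above states.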

[0069] In step S105, a high-resolution image of the image to be processed is obtained based on the optimized velocity network.

[0070] After optimizing the velocity network using the aforementioned loss function, the optimized velocity network can be used to obtain a high-resolution image corresponding to the image to be processed.
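Inference with a velocity network amounts to integrating an ordinary differential equation from noise to image. The sketch below uses a fixed-step Euler integrator, which is an assumption (the solver is not specified in the text), and replaces the optimized network with a hypothetical `toy_velocity` that points at a nearest-neighbour upsampling of the input.

```python
import numpy as np


def euler_sample(velocity, x0, cond, num_steps=50):
    """Integrate dx/dt = velocity(x, t, cond) from t=0 (noise) to t=1 (image)."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity(x, i * dt, cond)
    return x


def toy_velocity(x, t, cond):
    """Hypothetical stand-in for the optimized velocity network.

    It points straight at a 2x nearest-neighbour upsampling of cond, which is
    the exact velocity of the linear path toward that target.
    """
    target = np.kron(cond, np.ones((2, 2)))
    return (target - x) / (1.0 - t)


rng = np.random.default_rng(0)
low_res = rng.random((4, 4))                  # image to be processed
noise = rng.standard_normal((8, 8))           # initial state at t = 0
high_res = euler_sample(toy_velocity, noise, low_res)
print(high_res.shape)
```

In practice the velocity function would be the trained network conditioned on the image to be processed, and a higher-order solver or more steps could be substituted without changing the interface.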

[0071] The images generated according to embodiments of this disclosure have higher resolution and perform better in high-level downstream vision tasks.

[0072] Figure 2 is another flowchart of an image processing method according to an embodiment of the present disclosure.

[0073] Referring to Figure 2, in step S201, a basic generative backbone network based on Conditional Flow Matching is constructed.

[0074] According to an embodiment, the underlying generation process of the image can be built on a conditional flow matching framework, and the mapping from noise to a high-resolution image can be achieved through continuous-time ordinary differential equations.

[0075] First, conditional probability paths can be constructed. Given a high-resolution data distribution conditioned on the low-quality image c and a noise distribution, the conditional probability path between the high-resolution data distribution and the noise distribution is defined as a linear interpolation, as shown in equation (1) above.

[0076] Then, the target conditional vector field can be calculated. The path described above is differentiated with respect to time t to obtain the target conditional vector field.

[0077] Next, a velocity network can be trained. For example, a velocity network can be built and trained to approximate the target conditional vector field, so that the image is generated gradually.
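The three steps above (path, target vector field, regression of the velocity network) can be sketched as a single-sample loss. The interpolation convention (noise at t = 0, image at t = 1) is an assumption, and `cfm_loss`, `zero_net`, and `oracle_net` are hypothetical names; the oracle exists only to check the target and is not a trainable model.

```python
import numpy as np


def cfm_loss(velocity, x1, cond, rng):
    """One-sample Monte Carlo estimate of the flow matching regression loss."""
    eps = rng.standard_normal(x1.shape)      # noise endpoint (t = 0)
    t = rng.uniform(0.0, 1.0)                # random time on the path
    x_t = (1.0 - t) * eps + t * x1           # assumed linear interpolation
    target = x1 - eps                        # time derivative of x_t
    prediction = velocity(x_t, t, cond)
    return float(np.mean((prediction - target) ** 2))


rng = np.random.default_rng(0)
x1 = rng.random((8, 8))                      # high-resolution training image
cond = x1[::2, ::2]                          # crude low-quality condition (toy)

zero_net = lambda x_t, t, c: np.zeros_like(x_t)          # untrained stand-in
oracle_net = lambda x_t, t, c: (x1 - x_t) / (1.0 - t)    # knows x1 (toy only)

loss_zero = cfm_loss(zero_net, x1, cond, np.random.default_rng(1))
loss_oracle = cfm_loss(oracle_net, x1, cond, np.random.default_rng(1))
print(loss_zero, loss_oracle)
```

The oracle reproduces the target exactly, so its loss is zero up to rounding, while the untrained stand-in incurs a positive loss; a real network would be trained by gradient descent on this quantity averaged over data, noise, and time.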

[0078] In step S202, the Sobolev spectral correction (SSR) mechanism is introduced to reshape the geometric metric space of the optimization process.

[0079] First, a structured covariance matrix can be designed. For example, the identity covariance I in the standard Gaussian parameterization can be replaced with a structured covariance matrix that encodes prior information on the spectral density of natural images. Because the inverse covariance matrix imposes a higher penalty on high-frequency errors, this transformation lifts the underlying geometry of the optimization from a flat Euclidean space to a weighted Sobolev manifold.
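The effect of the inverse covariance on high-frequency errors can be checked numerically. The sketch below assumes a covariance that is diagonal in the Fourier basis with a power-law spectrum; the exponent `alpha`, the corner frequency `k0`, and the function names are illustrative assumptions, not values from the disclosure.

```python
import numpy as np


def covariance_spectrum(shape, alpha=2.0, k0=1.0 / 64):
    """Eigenvalues of an assumed power-law covariance, largest at low frequency."""
    ky = np.fft.fftfreq(shape[0])[:, None]
    kx = np.fft.fftfreq(shape[1])[None, :]
    k2 = kx**2 + ky**2
    return (1.0 + k2 / k0**2) ** (-alpha / 2.0)


def sobolev_energy(residual, cov_spectrum):
    """Mahalanobis energy r^T Sigma^{-1} r, evaluated in the Fourier domain."""
    R = np.fft.fft2(residual)
    return float(np.sum(np.abs(R) ** 2 / cov_spectrum) / residual.size)


n = 64
x = np.arange(n)
low = np.tile(np.cos(2 * np.pi * 1 * x / n), (n, 1))    # 1 cycle per row
high = np.tile(np.cos(2 * np.pi * 16 * x / n), (n, 1))  # 16 cycles per row
spec = covariance_spectrum((n, n))

print(np.sum(low**2), np.sum(high**2))                  # equal Euclidean energy
print(sobolev_energy(low, spec), sobolev_energy(high, spec))
```

Both residuals have the same Euclidean energy, yet the high-frequency one receives a much larger Sobolev energy, which is the anisotropic penalty described above. Setting the spectrum to all ones recovers the ordinary squared L2 norm.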

[0080] Then, the Sobolev energy difference metric is established. For example, in this manifold space, the log-likelihood ratio between the velocity network prediction residual and the reference model residual is converted into the Sobolev energy difference, as shown in equation (8) above.

[0081] In step S203, an adversarial manifold guidance (AMG) module is constructed to dynamically capture the inherent structural blind spots of the model.

[0082] To address the ill-conditioned limitation of models in assigning high likelihood (i.e., "false confidence") to structural artifacts, this disclosure constructs a parameterized adversarial network.

[0083] Setting an adversarial optimization objective: the optimization objective of the adversarial network is to find an adversarial perturbation that minimizes the Euclidean residual energy while satisfying the Sobolev trust region constraint.

[0084] Derivation of the optimal adversarial direction: According to the Riesz representation theorem, the optimal perturbation solution of this constrained optimization problem is the worst-case Sobolev gradient direction, as shown in equation (11) above.

[0085] Synthesizing realistic artifacts: by introducing the structured covariance matrix, the adversarial network is forced to learn coherent structural artifacts (such as texture distortion) that mimic the model's inherent failure modes, instead of generating meaningless random white noise that is easily filtered out.

[0086] In step S204, a coupled sampling strategy is applied to synthesize semantically strictly aligned preference data pairs.

[0087] As an example, positive samples are obtained first. For instance, an intermediate flow state is extracted along the conditional probability path as a positive sample.

[0088] Then, the degradation trajectory is predicted. For example, the adversarial network is used to predict a velocity field that produces the trajectory deflection, and the degradation estimate is obtained by linear extrapolation, as shown in equation (9) above.

[0089] Next, a strict noise reprojection is performed. For example, the degradation estimate is reprojected back to the flow state using exactly the same initial noise as the positive sample branch, as shown in equation (10) above.

[0090] This mechanism isolates the interference caused by random variation, so that the synthesized data pairs achieve precise semantic alignment.

[0091] In step S205, Adversarial Sobolev Direct Preference Optimization (AS-DPO) is performed to achieve end-to-end high-fidelity training. First, integrate the preference data with the loss function. For example, incorporate the strictly aligned data pairs synthesized in step S204 into the preference dataset D, and combine them with the Sobolev energy difference derived in step S202 to construct the AS-DPO joint loss function, as shown in equation (12) above.

[0092] Then, the velocity network is optimized and updated. For example, by minimizing the loss function, which is equivalent to maximizing the energy gap, the model is forced to automatically reject the "structural failure manifold" identified by the adversarial network during inference, and thus generate high-fidelity images strictly along the Sobolev geometric trajectory that conforms to the spectral characteristics of natural images.

[0093] According to the embodiments, this disclosure breaks with the conventional path of traditional generative super-resolution models in low-level parameterization and preference learning, and obtains high-resolution images by adopting the following method.

[0094] (1) Reshaping the optimized geometric space, i.e., from "Euclidean isotropic" to "Sobolev anisotropic". Traditional methods rely heavily on flat Euclidean space metrics (such as the L2 norm) and the isotropic Gaussian assumption. This assumption treats all frequency components equally, which seriously deviates from the inherent law that the energy of natural images follows a power-law distribution with spatial frequency, thus inevitably producing high-frequency artifacts and spectral mismatch. This disclosure introduces Sobolev space guidance into the super-resolution generation framework and upgrades the underlying parameterization to the anisotropic Gaussian assumption. This injects a strong inductive bias into the model that distinguishes real textures from false noise. Frequency domain analysis shows that this disclosure significantly reduces the slope deviation of traditional methods from as high as 0.85 to 0.06, perfectly fitting the natural image manifold and fundamentally eliminating structural high-frequency artifacts.
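The text does not define how the slope deviation is measured. One plausible reading, assumed here, is the difference between the log-log slopes of the radially averaged power spectra of a generated image and a reference image. The sketch below estimates such a slope for a square image; the function names, the fitting range, and the synthetic test image are all hypothetical.

```python
import numpy as np


def radial_grid(n):
    """Integer-valued frequency radius for an n x n FFT grid."""
    f = np.fft.fftfreq(n) * n
    return np.sqrt(f[:, None] ** 2 + f[None, :] ** 2)


def radial_power_spectrum(image):
    """Average |FFT|^2 over rings of (rounded) constant frequency radius."""
    power = np.abs(np.fft.fft2(image)) ** 2
    bins = np.rint(radial_grid(image.shape[0])).astype(int).ravel()
    sums = np.bincount(bins, weights=power.ravel())
    counts = np.bincount(bins)
    return sums / counts


def spectral_slope(image, k_min=4, k_max=None):
    """Least-squares slope of log power versus log frequency."""
    n = image.shape[0]
    k_max = n // 4 if k_max is None else k_max
    k = np.arange(k_min, k_max + 1)
    spectrum = radial_power_spectrum(image)
    slope, _ = np.polyfit(np.log(k), np.log(spectrum[k]), 1)
    return float(slope)


def synth_power_law_image(n, alpha, rng):
    """Real image whose power spectrum is exactly radius^(-alpha) (toy data)."""
    W = np.fft.fft2(rng.standard_normal((n, n)))
    radius = radial_grid(n)
    radius[0, 0] = 1.0                       # avoid division by zero at DC
    F = (W / np.abs(W)) * radius ** (-alpha / 2.0)
    F[0, 0] = 0.0                            # zero-mean image
    return np.real(np.fft.ifft2(F))


rng = np.random.default_rng(0)
reference = synth_power_law_image(128, alpha=2.0, rng=rng)
generated = synth_power_law_image(128, alpha=3.0, rng=rng)
deviation = abs(spectral_slope(generated) - spectral_slope(reference))
print(spectral_slope(reference), spectral_slope(generated), deviation)
```

A generated image whose spectrum decays too quickly (over-smoothed) or too slowly (noisy high frequencies) shows up as a nonzero deviation under this assumed metric.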

[0095] (2) An "adversarial preference alignment" mechanism is created, which precisely targets the "false confidence" blind spots of the model. Existing generative models exhibit a pathological "false confidence" characteristic, that is, they assign high likelihood to erroneous samples containing significant structural degradation (such as texture distortion, aliasing, etc.). Conventional adversarial methods can only generate meaningless random noise and cannot reach and correct these hidden structural blind spots. This disclosure designs a dynamic adversarial preference optimization (Adversarial DPO) mechanism. This mechanism includes an adversarial network specifically designed to explore the model's blind spots, which can accurately synthesize coherent structural artifacts for which "the model seems confident but the image is actually visually degraded". These synthesized artifacts serve as highly valuable "hard negatives" to drive DPO training, forcing the model to overcome its original false confidence blind spots.

[0096] (3) Overcoming the alignment bottleneck in preference learning, namely, achieving "strict spatial alignment" of positive and negative samples. In existing text-to-image (T2I) preference datasets, the differences between positive and negative samples are mostly due to different semantic layouts caused by random seeds (i.e., spatially separated). This masks the true perceptual degradation, causing DPO to fail to learn effective structural recovery features. This disclosure establishes a novel manifold projection alignment framework to ensure that generated negative samples and positive samples (GT) maintain absolute strict spatial alignment (co-located) during adversarial preference exploration. This mechanism successfully removes the interference of semantic layout, allowing the optimization gradient of preference learning to focus 100% on correcting the perceptual degradation and structural distortion of the image content itself.

[0097] (4) Breaking the "perception-distortion" trade-off, i.e., fully empowering advanced downstream vision tasks. Existing super-resolution technologies generally face the dilemma of a trade-off between perceptual quality (such as NIQE) and structural fidelity (such as peak signal-to-noise ratio (PSNR) / structural similarity index (SSIM)). The generated images are often just fake textures that are "pleasing to the human eye". Once connected to machine vision tasks such as optical character recognition (OCR) or object detection, the performance drops sharply. This disclosure benefits from the high-frequency accurate restoration of the Sobolev space and the blind spot elimination of adversarial preference learning, successfully balancing extreme visual realism and pixel-level structural fidelity. It comprehensively surpasses the existing state-of-the-art models in multiple downstream task indicators such as OCR, object detection, and instance segmentation, proving that the images reconstructed by this disclosure are not only visually realistic images, but also "machine-readable" data with high-confidence structural information.

[0098] Figure 3 is a block diagram of an image processing apparatus according to an embodiment of the present disclosure.

[0099] Referring to Figure 3, the image processing device 300 may include a network construction module 301, a space construction module 302, and an optimization module 303. The number and names of the modules shown in Figure 3 are merely exemplary, and this disclosure is not limited thereto. Modules may be added, removed, merged, or split as needed.

[0100] According to an embodiment, the network construction module 301 can construct a conditional probability path with respect to the noise distribution and the high-resolution data distribution conditioned on the image to be processed; and construct a velocity network based on the conditional probability path and the image to be processed.

[0101] According to an embodiment, the space construction module 302 can construct a Sobolev geometric space for preference optimization of the velocity network.

[0102] According to an embodiment, the optimization module 303 can perform preference optimization on the velocity network in the Sobolev geometric space; and obtain a high-resolution image of the image to be processed based on the optimized velocity network.

[0103] According to an embodiment, the space construction module 302 can obtain the Sobolev geometric space by replacing the identity matrix in the noise distribution with a structured spectral operator, wherein the structured spectral operator is determined based on the frequency of the image to be processed.

[0104] According to an embodiment, the optimization module 303 can determine the Sobolev energy difference between the velocity network and the reference network in the Sobolev geometric space; obtain semantically consistent preference data pairs; and perform preference optimization on the velocity network based on the Sobolev energy difference and the preference data pairs.

[0105] According to an embodiment, the optimization module 303 can obtain the Sobolev energy difference based on the residual between the velocity network and the target conditional vector field in the Sobolev geometric space and the residual between the reference network and the target conditional vector field, wherein the target conditional vector field is obtained based on the conditional probability path.

[0106] According to an embodiment, the optimization module 303 can determine an intermediate flow state from the conditional probability path as a positive sample; predict the degradation trajectory based on the positive sample to determine the degradation estimate of the high-resolution image; obtain a negative sample corresponding to the positive sample based on the degradation estimate and the noise distribution corresponding to the positive sample; and construct the preference data pair based on the positive sample and the negative sample.

[0107] According to an embodiment, the optimization module 303 can use the constructed adversarial network to determine another velocity network; use the other velocity network and the positive sample to generate trajectory deflection values; determine the direction of the negative sample based on the structured spectral operator, the partial derivative of the conditional probability path and the residual energy of the adversarial network; and determine the degradation estimate of the high-resolution image based on the positive sample, the direction of the negative sample and the trajectory deflection value.
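The division of work among the three modules can be sketched as a small container that wires them together. This is only an illustration of the data flow described above: the class name, the callable signatures, and the stub modules are hypothetical, and the stubs do no learning.

```python
from dataclasses import dataclass
from typing import Any, Callable

import numpy as np


@dataclass
class ImageProcessingApparatus:
    """Hypothetical wiring of modules 301, 302 and 303 as plain callables."""
    network_construction: Callable[[np.ndarray], Any]            # module 301
    space_construction: Callable[[np.ndarray], Any]              # module 302
    optimization: Callable[[Any, Any, np.ndarray], np.ndarray]   # module 303

    def process(self, image: np.ndarray) -> np.ndarray:
        velocity_net = self.network_construction(image)   # path + velocity network
        sobolev_space = self.space_construction(image)    # structured spectral operator
        # Preference optimization followed by high-resolution generation.
        return self.optimization(velocity_net, sobolev_space, image)


# Stub modules (placeholders only, no learning happens here).
apparatus = ImageProcessingApparatus(
    network_construction=lambda img: "velocity-net",
    space_construction=lambda img: "sobolev-space",
    optimization=lambda net, space, img: np.kron(img, np.ones((2, 2))),
)
output = apparatus.process(np.ones((4, 4)))
print(output.shape)
```

Keeping the modules as interchangeable callables mirrors the statement that modules may be added, removed, merged, or split as needed.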

[0108] Figure 4 is a schematic diagram of the structure of an image processing device in the hardware operating environment of an embodiment of this disclosure. The image processing device shown in Figure 4 can perform image processing methods according to embodiments of the present disclosure.

[0109] As shown in Figure 4, the image processing device 1000 may include: a processing component 1001, a communication bus 1002, a network interface 1003, an input / output interface 1004, a memory 1005, and a power supply component 1006. The communication bus 1002 is used to enable communication between these components. The input / output interface 1004 may include a video display (such as a liquid crystal display), a microphone and speaker, and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). Optionally, the input / output interface 1004 may also include a standard wired interface or a wireless interface. The network interface 1003 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed random access memory or a stable non-volatile memory. The memory 1005 may also optionally be a storage device independent of the aforementioned processing component 1001.

[0110] Those skilled in the art will understand that the structure shown in Figure 4 does not constitute a limitation on the image processing device 1000, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.

[0111] As shown in Figure 4, the memory 1005, which serves as a storage medium, may include an operating system (such as a MAC operating system), a data storage module, a network communication module, a user interface module, a program implementing this disclosure, and a database.

[0112] In the image processing device 1000 shown in Figure 4, the network interface 1003 is mainly used for data communication with external electronic devices / terminals; the input / output interface 1004 is mainly used for data interaction with users; and the processing component 1001 and the memory 1005 can be disposed in the image processing device 1000. The image processing device 1000 calls the program stored in the memory 1005 and various APIs provided by the operating system through the processing component 1001 to execute the image processing method provided in the embodiments of this disclosure.

[0113] Processing component 1001 may include at least one processor, and memory 1005 stores a set of computer-executable instructions. When the set of computer-executable instructions is executed by at least one processor, an image processing method according to an embodiment of the present disclosure is performed. However, the above examples are merely exemplary, and the present disclosure is not limited thereto.

[0114] The processing component 1001 can control the components included in the image processing device 1000 by executing a program.

[0115] As an example, the image processing device 1000 may be a PC, tablet, personal digital assistant, smartphone, or other device capable of executing the aforementioned set of instructions. Here, the image processing device 1000 is not necessarily a single electronic device, but may be any collection of devices or circuits capable of executing the aforementioned instructions (or instruction sets) individually or in combination. The image processing device 1000 may also be part of an integrated control system or system manager, or may be configured to interconnect with a portable electronic device locally or remotely (e.g., via wireless transmission) through an interface.

[0116] In the image processing device 1000, the processing component 1001 may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a dedicated processor system, a microcontroller, or a microprocessor. By way of example and not limitation, the processing component 1001 may also include an analog processor, a digital processor, a microprocessor, a multi-core processor, a processor array, a network processor, etc.

[0117] The processing component 1001 can execute instructions or code stored in memory, wherein memory 1005 can also store data. Instructions and data can also be sent and received over a network via network interface 1003, wherein network interface 1003 can employ any known transport protocol.

[0118] The memory 1005 can be integrated with the processing component 1001, for example, by arranging RAM or flash memory within an integrated circuit microprocessor. Alternatively, the memory 1005 can include a separate device, such as an external disk drive, a storage array, or other storage device that can be used by any database system. The memory and the processing component 1001 can be operatively coupled, or can communicate with each other, for example, via I / O ports, network connections, etc., enabling the processing component 1001 to read data stored in the memory 1005.

[0119] According to embodiments of this disclosure, an electronic device may be provided. Figure 5 is a block diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 1100 may include at least one memory 1102 and at least one processor 1101. The at least one memory 1102 stores a set of computer-executable instructions. When the set of computer-executable instructions is executed by the at least one processor 1101, an image processing method according to an embodiment of the present disclosure is performed.

[0120] Processor 1101 may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a dedicated processor system, a microcontroller, or a microprocessor. By way of example and not limitation, processor 1101 may also include an analog processor, a digital processor, a microprocessor, a multi-core processor, a processor array, a network processor, etc.

[0121] The memory 1102, which serves as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, a program for executing the methods of this disclosure, and a database.

[0122] The memory 1102 may be integrated with the processor 1101; for example, RAM or flash memory may be arranged within an integrated circuit microprocessor. Alternatively, the memory 1102 may include a separate device, such as an external disk drive, a storage array, or other storage device that can be used by any database system. The memory and processor may be operatively coupled, or may communicate with each other, for example, via I / O ports, network connections, etc., enabling the processor to read files stored in the memory.

[0123] In addition, electronic device 1100 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of electronic device 1100 can be interconnected via a bus and / or network.

[0124] As an example, electronic device 1100 may be a PC, tablet, personal digital assistant, smartphone, or other device capable of executing the aforementioned set of instructions. Here, electronic device 1100 is not necessarily a single electronic device, but may be a collection of any devices or circuits capable of executing the aforementioned instructions (or instruction sets) individually or in combination. Electronic device 1100 may also be part of an integrated control system or system manager, or may be configured to interconnect with a portable electronic device locally or remotely (e.g., via wireless transmission) through an interface.

[0125] As will be understood by those skilled in the art, the structure shown in Figure 5 does not constitute a limitation on the electronic device 1100, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.

[0126] According to embodiments of this disclosure, a computer-readable storage medium storing instructions may also be provided, wherein the instructions, when executed by at least one processor, cause the at least one processor to perform an image processing method according to this disclosure. Examples of computer-readable storage media herein include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, hard disk drive (HDD), solid-state drive (SSD), card storage (such as multimedia cards, secure digital (SD) cards, or extreme digital (XD) cards), magnetic tape, floppy disk, magneto-optical data storage device, optical data storage device, hard disk, solid-state drive, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide the computer program and any associated data, data files, and data structures to a processor or computer so that the processor or computer can execute the computer program. The computer program in the aforementioned computer-readable storage medium can run in an environment deployed in computer devices such as clients, hosts, agent devices, servers, etc. Furthermore, in one example, the computer program and any associated data, data files, and data structures are distributed across a networked computer system, such that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed manner through one or more processors or computers.

[0127] According to embodiments of this disclosure, a computer program product may also be provided, wherein the instructions in the computer program product can be executed by a processor of a computer device to perform the above-described image processing method.

[0128] The embodiments of this disclosure mainly solve the following problems: 1. Solve the problems of spectral mismatch and high-frequency artifacts caused by the isotropic assumption.

[0129] Existing generative super-resolution and direct preference optimization paradigms rely excessively on flat Euclidean space metrics (such as the L2 norm) and the isotropic Gaussian assumption in the underlying parameterization. These geometric assumptions deviate significantly from the inherent law of energy decay with frequency in real natural images. This disclosure breaks through this geometric limitation by reshaping the geometric space on which the optimization objective depends, giving the model an inductive bias that distinguishes real high-frequency textures from spurious generated noise, thereby fundamentally eliminating structural high-frequency artifacts that violate the data manifold.

[0130] 2. Overcome the bottleneck of "difficult negative samples" in preference learning due to the lack of spatially strictly aligned samples.

[0131] Standard super-resolution datasets provide regression-oriented data pairs, while existing text-to-image preference datasets, due to the influence of random seeds, show differences between positive and negative samples primarily stemming from semantic layout, lacking spatial correspondence. This masks the true structural degradation of images, preventing preference optimization from acquiring effective structural degradation features. This disclosure constructs a dynamic generation mechanism that ensures the generated negative samples are strictly semantically aligned with the positive samples, thereby accurately stripping away and exposing the perceptual degradation of the image content itself.

[0132] 3. Correct the "false confidence" blind spot of generative models for structural failures.

[0133] Existing generative models generally exhibit an ill-posed characteristic: they assign extremely low residual energy or high likelihood to erroneous samples containing significant structural degradations (such as texture distortion, aliasing, and other spectral artifacts), demonstrating "false confidence." Conventional adversarial methods only generate meaningless random noise, failing to address these hidden blind spots in the model. This disclosure specifically locates and explores these inherent blind spots in the model, synthesizing coherent structural artifacts that appear confident but actually contain visual degradations, using these as high-value training signals to drive preference learning.

[0134] 4. Comprehensively improve the realism, structural fidelity, and downstream task performance of super-resolution images.

[0135] The alignment framework provided in this disclosure, based on rigorous theoretical support, can not only generate visually realistic natural textures in various degradation scenarios, but also maintain extremely high spectral consistency and structural fidelity. Furthermore, by accurately preserving high-resolution structural information, this disclosure significantly improves the performance of processed images in subsequent advanced visual tasks (such as optical character recognition, object detection, and semantic segmentation).

[0136] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.

[0137] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.

Claims

1. An image processing method, comprising: constructing a conditional probability path with respect to a noise distribution and a high-resolution data distribution conditioned on an image to be processed; constructing a velocity network based on the conditional probability path and the image to be processed; constructing a Sobolev geometric space for preference optimization of the velocity network; performing preference optimization on the velocity network in the Sobolev geometric space; and obtaining a high-resolution image of the image to be processed based on the optimized velocity network.

2. The method according to claim 1, wherein constructing the Sobolev geometric space for preference optimization of the velocity network comprises: obtaining the Sobolev geometric space by replacing the identity matrix in the noise distribution with a structured spectral operator, wherein the structured spectral operator is determined based on the frequency of the image to be processed.

3. The method according to claim 1, wherein performing preference optimization on the velocity network in the Sobolev geometric space comprises: determining a Sobolev energy difference between the velocity network and a reference network in the Sobolev geometric space; obtaining semantically consistent preference data pairs; and performing preference optimization on the velocity network based on the Sobolev energy difference and the preference data pairs.

4. The method according to claim 3, wherein determining the Sobolev energy difference between the velocity network and the reference network in the Sobolev geometric space comprises: obtaining the Sobolev energy difference based on the residual between the velocity network and a target conditional vector field in the Sobolev geometric space and the residual between the reference network and the target conditional vector field, wherein the target conditional vector field is obtained based on the conditional probability path.

5. The method according to claim 3, wherein obtaining the semantically consistent preference data pairs comprises: determining an intermediate flow state from the conditional probability path as a positive sample; predicting a degradation trajectory based on the positive sample to determine a degradation estimate of the high-resolution image; obtaining a negative sample corresponding to the positive sample based on the degradation estimate and the noise distribution corresponding to the positive sample; and constructing the preference data pair based on the positive sample and the negative sample.

6. The method according to claim 5, wherein predicting the degradation trajectory based on the positive sample to determine the degradation estimate of the high-resolution image comprises: determining another velocity network using a constructed adversarial network; generating a trajectory deflection value using the other velocity network and the positive sample; determining a direction of the negative sample based on the structured spectral operator, a partial derivative of the conditional probability path, and a residual energy of the adversarial network; and determining the degradation estimate of the high-resolution image based on the directions of the positive and negative samples and the trajectory deflection value.

7. An image processing apparatus, comprising: a network construction module configured to construct a conditional probability path between a noise distribution and a high-resolution data distribution, conditioned on an image to be processed, and to construct a velocity network based on the conditional probability path and the image to be processed; a space construction module configured to construct a Sobolev geometric space for preference optimization of the velocity network; and an optimization module configured to perform preference optimization on the velocity network in the Sobolev geometric space, and to obtain a high-resolution image of the image to be processed based on the optimized velocity network.

8. The apparatus according to claim 7, wherein the space construction module is configured to: obtain the Sobolev geometric space by replacing the identity matrix in the noise distribution with a structured spectral operator, wherein the structured spectral operator is determined based on the frequency of the image to be processed.

9. The apparatus according to claim 7, wherein the optimization module is configured to: determine a Sobolev energy difference between the velocity network and a reference network in the Sobolev geometric space; obtain semantically consistent preference data pairs; and perform preference optimization on the velocity network based on the Sobolev energy difference and the preference data pairs.

10. The apparatus according to claim 9, wherein the optimization module is configured to: obtain the Sobolev energy difference based on the residual between the velocity network and a target conditional vector field in the Sobolev geometric space and the residual between the reference network and the target conditional vector field, wherein the target conditional vector field is obtained based on the conditional probability path.

11. The apparatus according to claim 9, wherein the optimization module is configured to: determine an intermediate flow state from the conditional probability path as a positive sample; predict a degradation trajectory based on the positive sample to determine a degradation estimate of the high-resolution image; obtain a negative sample corresponding to the positive sample based on the degradation estimate and the noise distribution corresponding to the positive sample; and construct the preference data pair based on the positive sample and the negative sample.

12. The apparatus according to claim 11, wherein the optimization module is configured to: determine another velocity network using a constructed adversarial network; generate a trajectory deflection value using the other velocity network and the positive sample; determine a direction of the negative sample based on the structured spectral operator, a partial derivative of the conditional probability path, and a residual energy of the adversarial network; and determine the degradation estimate of the high-resolution image based on the directions of the positive and negative samples and the trajectory deflection value.

13. An electronic device, comprising: at least one processor; and at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the image processing method according to any one of claims 1 to 6.

14. A computer-readable storage medium storing instructions, wherein the instructions, when executed by at least one processor, cause the at least one processor to perform the image processing method according to any one of claims 1 to 6.

15. A computer program product, wherein instructions in the computer program product, when executed by at least one processor in an electronic device, cause the at least one processor to perform the image processing method according to any one of claims 1 to 6.
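
For illustration only, the quantities recited in claims 1 to 4 may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims do not fix the form of the conditional probability path, of the structured spectral operator, or of the preference objective. The sketch assumes a linear path x_t = (1 - t) x0 + t x1, a diagonal frequency-domain operator with weights (1 + k^2)^s, and a logistic (DPO-style) loss on the energy differences. All function names and the parameters `s` and `beta` are hypothetical, and 1-D sequences stand in for images.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def sobolev_weights(n, s=1.0):
    """Assumed structured spectral operator: diagonal weights (1 + k^2)^s."""
    # Use signed frequency indices so both halves of the spectrum are weighted alike.
    freqs = [k if k <= n // 2 else k - n for k in range(n)]
    return [(1.0 + f * f) ** s for f in freqs]


def sobolev_energy(residual, s=1.0):
    """Frequency-weighted squared norm of a residual (s = 0 gives the plain L2 norm)."""
    n = len(residual)
    spectrum = dft(residual)
    weights = sobolev_weights(n, s)
    return sum(w * abs(c) ** 2 for w, c in zip(weights, spectrum)) / n


def target_velocity(x0, x1):
    """Target vector field of the assumed linear path x_t = (1 - t) x0 + t x1."""
    return [b - a for a, b in zip(x0, x1)]


def energy_difference(v_model, v_ref, u_target, s=1.0):
    """Sobolev energy of the model residual minus that of the reference residual."""
    r_model = [a - b for a, b in zip(v_model, u_target)]
    r_ref = [a - b for a, b in zip(v_ref, u_target)]
    return sobolev_energy(r_model, s) - sobolev_energy(r_ref, s)


def preference_loss(diff_pos, diff_neg, beta=1.0):
    """Assumed logistic preference loss: -log(sigmoid(-beta * (diff_pos - diff_neg)))."""
    margin = -beta * (diff_pos - diff_neg)
    if margin < -30.0:
        # For very negative margins, log(1 + exp(-margin)) is approximately -margin.
        return -margin
    return math.log1p(math.exp(-margin))
```

Under these assumptions, a constant residual such as `[1, 1, 1, 1]` and an alternating residual such as `[1, -1, 1, -1]` have the same plain L2 energy, but the alternating one receives a larger Sobolev energy when `s > 0`, which is the sense in which the weighted space departs from the isotropic case. The loss decreases when the velocity network lowers its energy relative to the reference network on the positive sample more than on the negative sample.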