An image super-resolution reconstruction method and system based on an outlier exception table
By identifying anomalous weights in the DiT model and constructing an exception table, low-bit quantization and sparsity compensation are performed on the backbone network layer, solving the problem of detail loss and resource waste caused by outliers in low-bit image super-resolution reconstruction, and achieving efficient image reconstruction under low-bit conditions.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANGHAI JIAOTONG UNIV
- Filing Date
- 2026-03-05
- Publication Date
- 2026-06-23
Figure CN122265032A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the fields of artificial intelligence and computer vision technology, specifically to an image super-resolution reconstruction method and system based on an outlier exception table.
Background Technology
[0002] The Real-World Image Super-Resolution (RDR) task aims to restore high-resolution images from low-resolution images captured in real-world scenes. Unlike idealized downsampling, this type of input typically contains complex and unknown degradation factors, such as sensor noise, optical blur, compression artifacts, and irregular sampling, which makes the model more sensitive to details and structural consistency during inference. In recent years, the Diffusion Transformer (DiT) architecture has performed exceptionally well in this task; however, its large model size and high inference cost often necessitate low-bit quantization in practical deployments to reduce storage and computational overhead. Common techniques used in current deployments include: 1) unified low-bit uniform quantization: fixed-bit uniform quantization is used for all or most weights in a layer; 2) mixed-precision allocation: different numbers of bits are allocated to different layers or modules to reduce errors in critical layers; 3) channel-level quantization based on quantization scale: scaling factors are set per output channel or per group to adapt to the dynamic range.
[0003] However, the aforementioned existing technologies still have the following drawbacks when DiT is deployed for real-world degraded image super-resolution: Detail loss and artifacts are likely at low bit widths: in fixed low-bit uniform quantization, a small number of large-magnitude elements (outliers) in the weights significantly widen the quantization range, forcing the majority of normal-valued weights onto a coarser quantization step size. This smooths texture details, deforms edge structures, and can let errors accumulate gradually during diffusion inference, leading to unstable output.
[0004] Low resource utilization efficiency: Although channel-level or hierarchical scaling can partially alleviate the dynamic range problem, it is still often necessary to expand the quantization range when dealing with outliers, thereby sacrificing the representation accuracy of the main weights and causing a waste of resources by "paying the overall accuracy price for a small number of outliers".
[0005] Mixed-precision methods have high engineering costs: existing mixed-precision solutions usually require calibration data, sensitivity assessments, or multiple rounds of testing to determine the configuration of each layer. In real-world degraded super-resolution scenarios, degradation types are diverse and their distributions vary greatly, which makes configurations difficult to migrate and reuse and complicates the deployment process.
[0006] A search of the prior art found a Chinese patent with publication number CN121092907A, which proposes a low-bit large-model quantization method and system based on sensitivity analysis and outlier handling. That application employs a two-stage weighted quantization approach to jointly address outliers and error accumulation: it identifies key weights that are sensitive to quantization errors and applies a differentiated strategy to them during quantization to minimize model performance loss. However, that method cannot store and compensate for outlier weights with high precision, nor can it effectively handle a small number of outliers in the weights while avoiding or minimizing additional calibration costs.
[0007] Therefore, there is an urgent need for a new image super-resolution reconstruction method and system that can handle outliers in the weights as much as possible without relying on additional calibration costs, avoid amplifying quantization errors, and thus maintain the detail restoration capability and inference stability of the DiT model in real degraded image super-resolution tasks under low bit conditions.
Summary of the Invention
[0008] To address the shortcomings of existing technologies, the purpose of this application is to provide an image super-resolution reconstruction method and system based on an outlier exception table.
[0009] According to a first aspect of this application, an image super-resolution reconstruction method based on an outlier exception table is provided, comprising: obtaining the real degraded low-resolution image to be reconstructed and determining the pre-trained DiT model; obtaining the weight matrix of each backbone network layer in the DiT model and defining an exception table; identifying abnormal weights in each weight matrix and generating an abnormal weight index set; generating a low-bit weight matrix and an exception table for each backbone network layer based on the abnormal weight index set, the exception table being used to compensate the output of the backbone network layer; defining the inference configuration of the DiT model based on the low-bit weight matrix and the exception table; and, based on the low-bit weight matrix, the exception table, and the inference configuration, using the DiT model to perform super-resolution reconstruction of the real degraded low-resolution image and outputting a high-resolution reconstructed image.
[0010] Optionally, obtaining the weight matrix of the backbone network layer in the DiT model and defining an exception table includes: Obtain the weight matrix of the backbone network layer in the DiT model; Set the target number of bits for quantization in the backbone network layer; Define an exception table, which is used to store the abnormal weights in the weight matrix corresponding to each backbone network layer; The exception table defines a selection strategy for exception weights, and the selection strategy is as follows: For each backbone network layer, its weight matrix is grouped, and weight elements whose absolute weight value is greater than a preset amplitude threshold are selected from each group and used as abnormal weights of that backbone network layer. Alternatively, for each backbone network layer, its weight matrix can be grouped, and a predetermined number of weight elements with the largest absolute weight value can be selected from each group as the abnormal weights of that backbone network layer.
[0011] Optionally, for each backbone network layer, its weight matrix is grouped according to any of the following methods: Output channel grouping: Using the output channels of the backbone network layer weight matrix as the basic unit, the weight elements corresponding to a single output channel are divided into a group; Grouping by input dimension: Using the input dimension of the backbone network layer weight matrix as the basic unit, the weight elements corresponding to a single input dimension are divided into a group; Block grouping: The weight matrix of the backbone network layer is divided into several non-overlapping block regions, and the weight elements in a single block region are used as a group.
[0012] Optionally, the step of identifying abnormal weights for each weight matrix and generating an abnormal weight index set includes: For each backbone network layer, take the absolute value of each weight element in the weight matrix to obtain the weight absolute value matrix. Based on the weight absolute value matrix, abnormal weights are selected from each backbone network layer according to the selection strategy for abnormal weights. Count the number of abnormal weights in each backbone network layer; if the number exceeds the preset abnormal number threshold, the preset number of abnormal weights with the largest absolute values are selected from all abnormal weights as the final abnormal weights; if the number does not exceed the preset abnormal number threshold, the abnormal weights of that backbone network layer are kept as selected, without further screening. Generate an abnormal weight index set for each backbone network layer, in which the coordinates of each abnormal weight are stored.
[0013] Optionally, generating the low-bit weight matrix and exception table for each backbone network layer based on the abnormal weight index set includes: For each backbone network layer, the abnormal weights indicated by the abnormal weight index set are removed from the weight matrix to obtain the target weight matrix; Each target weight matrix is subjected to low-bit quantization to obtain a low-bit weight matrix together with the scaling factor corresponding to the low-bit quantization, where the quantization bit width is set to the target number of bits; An exception table is generated for each backbone network layer. The exception table stores the values of the abnormal weights in the abnormal weight index set and the coordinates of those abnormal weights.
[0014] Optionally, defining the inference configuration of the DiT model based on the low-bit weight matrix and exception table includes: The equivalent linear calculation result of the low-bit weight matrix and the input features of the backbone network layer is used as the initial output value of the corresponding backbone network layer. The initial output of the backbone network layer is then corrected based on the abnormal weights in the exception table, as follows: traverse each abnormal weight $w_{ij}$ in the exception table, where $i$ is the output channel index and $j$ is the input component index; obtain the input component $x_j$ corresponding to the j-th input dimension in the input features of the backbone network layer; accumulate the product of the input component $x_j$ and the abnormal weight $w_{ij}$ onto output channel $i$ of the backbone network layer. This completes the correction of the output values of the backbone network layer.
[0015] Optionally, the step of using a DiT model to perform super-resolution reconstruction of the real degraded low-resolution image based on a low-bit weight matrix, exception table, and inference configuration, and outputting a high-resolution reconstructed image, includes: The low-bit weight matrix, scaling factor, exception table, and inference configuration of each backbone network layer are encapsulated into a model deployment package; Load the model deployment package into the inference framework of the pre-trained DiT model; The real degraded low-resolution image is input into the DiT model loaded with the model deployment package, and the high-resolution reconstructed image is output through the diffusion inference process of the DiT model.
[0016] According to a second aspect of this application, an image super-resolution reconstruction system based on an outlier exception table is provided, comprising: The acquisition module is used to acquire the real degraded low-resolution image to be reconstructed and to determine the completed pre-trained DiT model; The weight extraction module is used to obtain the weight matrix of the backbone network layer in the DiT model and define the exception table. An anomaly construction module is used to identify anomalous weights for each weight matrix and generate an anomaly index set. The exception table generation module is used to generate low-bit weight matrices and exception tables for each backbone network layer based on the set of exception weight indexes. The inference configuration module is used to define the inference configuration of the DiT model based on the low bit weight matrix and the exception table. The reconstruction module is used to perform super-resolution reconstruction of real degraded low-resolution images based on a low bit weight matrix, exception table, and inference configuration using a DiT model, and output a high-resolution reconstructed image.
[0017] According to a third aspect of this application, a non-transitory computer-readable storage medium is provided, on which a computer program is stored, which, when executed by a processor, implements the steps of the image super-resolution reconstruction method based on an outlier exception table as provided in the first aspect of this application.
[0018] According to a fourth aspect of this application, an electronic device is provided, comprising: At least one memory for storing program instructions; At least one processor is configured to invoke program instructions stored in the memory and execute the steps of the image super-resolution reconstruction method based on an outlier exception table as provided in the first aspect of this application, according to the obtained program instructions.
[0019] This application provides an image super-resolution reconstruction method based on an outlier exception table. By identifying the weight matrix, an outlier index set is obtained, which allows outlier weights to be separated from the quantization range estimation of the backbone network layer. Most weights can be uniformly quantized with low bits to obtain a low-bit weight matrix, solving the problem in existing technologies where a small number of outliers widen the quantization range and cause the quantization step size of the main weights to become coarser. Based on the outlier weights, an exception table is established for separate storage and compensation, which can improve inference stability with less additional computational overhead. This solves the problem in existing technologies where it is necessary to increase the overall bit size or rely on complex mixed precision search to avoid the influence of outliers, resulting in high deployment costs.
[0020] Other technical effects resulting from the additional features will be further illustrated in the corresponding embodiments.
Attached Figure Description
[0021] Other features, objects, and advantages of this application will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Figure 1 is a flowchart of an image super-resolution reconstruction method based on an outlier exception table in one embodiment of this application;
Figure 2 is a schematic diagram of weight quantization based on an exception table in one embodiment of this application;
Figure 3 is a schematic diagram of inference compensation based on an exception table in one embodiment of this application;
Figure 4 is a comparison of the super-resolution reconstruction results of an embodiment of this application and a comparative scheme;
Figure 5 is a schematic diagram of an image super-resolution reconstruction system based on an outlier exception table in one embodiment of this application.
Detailed Implementation
[0022] The present application will now be described in detail with reference to specific embodiments. These embodiments will help those skilled in the art to further understand the present application, but do not limit the present application in any way. It should be noted that those skilled in the art can make several modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Parts not described in detail in the following embodiments can be implemented using existing technology.
[0023] It should be noted that all information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of related data must comply with relevant regulations.
[0024] The real-world degraded image super-resolution task aims to restore high-resolution images from low-resolution images acquired in real-world scenes. In recent years, the DiT architecture has performed exceptionally well in this task, but its large model size and high inference cost mean that practical deployment often requires low-bit quantization to reduce storage and computational overhead. Existing technologies often only set a uniform quantization bit width or perform layer-by-layer mixed-precision configuration; existing low-bit quantization typically determines the quantization interval directly from the maximum amplitude of the entire layer, so that a small number of outliers widen the interval; traditional quantization does not model the small number of outliers separately, so overall accuracy is dragged down by extreme values; and existing technologies either use a higher overall bit width to avoid outlier problems or introduce a complex layer-by-layer mixed-precision search. Existing technologies therefore still have the following drawbacks when DiT is deployed for real-world degraded super-resolution: low bit widths easily lead to detail loss and artifacts, resource utilization efficiency is low, and mixed-precision methods have high engineering costs. To solve these problems, this application provides an image super-resolution reconstruction method based on an outlier exception table.
[0025] As shown in Figure 1, this application provides an image super-resolution reconstruction method based on an outlier exception table, including: S1. Obtain the real degraded low-resolution image to be reconstructed and determine the pre-trained DiT model; S2. Obtain the weight matrix of each backbone network layer in the DiT model and define the exception table; S3. Identify abnormal weights for each weight matrix and generate an abnormal weight index set; S4. Generate a low-bit weight matrix and an exception table for each backbone network layer based on the abnormal weight index set, where the exception table is used to compensate the output of the backbone network layer; S5. Define the inference configuration of the DiT model based on the low-bit weight matrix and the exception table; S6. Based on the low-bit weight matrix, the exception table, and the inference configuration, use the DiT model to perform super-resolution reconstruction of the real degraded low-resolution image and output a high-resolution reconstructed image.
[0026] In the embodiments described above, by identifying the weight matrix to obtain an anomaly index set, abnormal weights can be separated from the quantization range estimation of the backbone network layer. Most weights can be uniformly quantized with low bits to obtain a low-bit weight matrix, which solves the problem in the prior art where a small number of outliers widen the quantization range and cause the quantization step size of the main weights to become coarser. Based on the abnormal weights, an exception table is established for separate storage and compensation, which can improve inference stability with a small additional computational overhead. This solves the problem in the prior art where it is necessary to increase the overall bit size or rely on complex mixed precision search to avoid the influence of outliers, resulting in high deployment costs.
[0027] In some specific embodiments of this application, obtaining the weight matrix of the backbone network layer in the DiT model and defining an exception table may further include: S21. Obtain the weight matrix of the backbone network layer in the DiT model; S22. Set the target number of bits for quantization of the backbone network layer; S23. Define an exception table, which is used to store the abnormal weights in the weight matrix corresponding to each backbone network layer. S24. In the exception table, define the selection strategy for exception weights. The selection strategy is as follows: For each backbone network layer, its weight matrix is grouped, and weight elements whose absolute weight value is greater than a preset amplitude threshold are selected from each group and used as abnormal weights of that backbone network layer. Alternatively, for each backbone network layer, its weight matrix can be grouped, and a predetermined number of weight elements with the largest absolute weight value can be selected from each group as the abnormal weights of that backbone network layer.
[0028] For example, the above steps are used to obtain the model to be deployed and the quantization configuration, specifically including: Obtain the parameter set of the DiT model for the real degraded image super-resolution task. The parameters should include at least the weight matrices $W_l$ of several backbone network layers, where the subscript $l$ indicates the layer index; the backbone network layers include at least one of a linear transformation layer, a projection layer, and a feedforward layer. Define the quantization deployment specification, including: the target number of bits for backbone network layer quantization (e.g., 4 bits); the exception table storage precision (e.g., FP16 or INT8); the exception ratio or exception cap for the abnormal weights of each backbone network layer (e.g., channel-balanced Top-K or a full-layer ratio); and the exception selection threshold strategy (amplitude threshold or contribution threshold).
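The deployment specification listed above can be captured in a small configuration object. The following is only an illustrative sketch: the class name `QuantDeploySpec`, its field names, and the default exception ratio of 0.1% are assumptions introduced here and are not taken from this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantDeploySpec:
    """Hypothetical container for the quantization deployment specification."""
    target_bits: int = 4                # backbone bit width, e.g. 4 bits
    exception_dtype: str = "fp16"       # exception table storage precision: "fp16" or "int8"
    max_exception_ratio: float = 0.001  # ASSUMED cap on the fraction of weights stored as exceptions
    selection: str = "amplitude"        # threshold strategy: "amplitude" or "contribution"
    grouping: str = "output_channel"    # "output_channel", "input_dim" or "block"

    def exception_cap(self, rows: int, cols: int) -> int:
        """Upper limit on exception-table entries for a rows x cols weight matrix."""
        return int(rows * cols * self.max_exception_ratio)


spec = QuantDeploySpec()
print(spec.exception_cap(1024, 1024))
```

Freezing the dataclass keeps the specification immutable once a deployment package has been built from it.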
[0029] The embodiments described above in this application introduce an exception table mechanism alongside the unified low-bit backbone, serving as a dedicated channel for handling outliers. Existing fixed low-bit uniform quantization forces the quantization range to expand when a small number of large-value weights are present, resulting in a coarser quantization step size for the main weights and a significant decrease in representation accuracy. This application suppresses the pulling effect of outliers on the quantization range under low-bit conditions, solving the problem of outliers dominating the quantization range in low-bit uniform quantization.
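The effect of a single outlier on the quantization step can be illustrated numerically. The sketch below assumes symmetric uniform quantization with step size max|w| / (2^(b-1) - 1); the weight values are made up for illustration, and the helper name `quantize_error` is hypothetical.

```python
def quantize_error(weights, step, qmax):
    """Mean absolute error after symmetric uniform quantization with a given step."""
    total = 0.0
    for w in weights:
        q = max(-qmax, min(qmax, round(w / step)))  # clamp to the signed integer grid
        total += abs(w - q * step)
    return total / len(weights)


bits = 4
qmax = 2 ** (bits - 1) - 1               # largest integer level for 4-bit symmetric quantization

body = [0.01 * k for k in range(-7, 8)]  # 15 small "normal" weights (illustrative values)
outlier = 0.7                            # one illustrative large-magnitude weight

# Range estimated WITH the outlier: the step is set by the outlier magnitude.
step_with = max(abs(w) for w in body + [outlier]) / qmax
# Range estimated WITHOUT the outlier: the step is set by the normal weights only.
step_without = max(abs(w) for w in body) / qmax

err_with = quantize_error(body, step_with, qmax)
err_without = quantize_error(body, step_without, qmax)
print(step_with, step_without, err_with, err_without)
```

In this toy case the step size grows roughly tenfold when the outlier is included in the range estimate, and the error on the normal weights grows with it.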
[0030] In some specific embodiments of this application, for each backbone network layer, its weight matrix is grouped according to any of the following methods: Output channel grouping: Using the output channels of the backbone network layer weight matrix as the basic unit, the weight elements corresponding to a single output channel are divided into a group; Grouping by input dimension: Using the input dimension of the backbone network layer weight matrix as the basic unit, the weight elements corresponding to a single input dimension are divided into a group; Block grouping: The weight matrix of the backbone network layer is divided into several non-overlapping block regions, and the weight elements in a single block region are used as a group.
[0031] For example, the output channels (output dimensions) of the weight matrix correspond to the row direction of the matrix, and a single row vector contains the weight elements corresponding to one output channel; the input dimensions of the weight matrix correspond to the column direction of the matrix, and a single column vector contains the weight elements corresponding to one input dimension.
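The three grouping modes can be sketched as follows, treating rows as output channels and columns as input dimensions as described above. The function names and the `((i, j), value)` representation are assumptions made for illustration only.

```python
def group_by_output_channel(W):
    """One group per row (output channel); each entry is ((i, j), value)."""
    return [[((i, j), v) for j, v in enumerate(row)] for i, row in enumerate(W)]


def group_by_input_dim(W):
    """One group per column (input dimension)."""
    n_cols = len(W[0])
    return [[((i, j), W[i][j]) for i in range(len(W))] for j in range(n_cols)]


def group_by_block(W, bh, bw):
    """Non-overlapping bh x bw blocks; blocks at the matrix edge may be smaller."""
    rows, cols = len(W), len(W[0])
    groups = []
    for r0 in range(0, rows, bh):
        for c0 in range(0, cols, bw):
            groups.append([((i, j), W[i][j])
                           for i in range(r0, min(r0 + bh, rows))
                           for j in range(c0, min(c0 + bw, cols))])
    return groups


W = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
print(len(group_by_output_channel(W)), len(group_by_input_dim(W)), len(group_by_block(W, 2, 2)))
```

Keeping the coordinates next to each value means a later selection step can emit index entries directly, whichever grouping is used.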
[0032] The embodiments described above in this application group the weight matrix to avoid the problem of insufficient compensation caused by outliers being concentrated in a few channels.
[0033] In some specific embodiments of this application, identifying abnormal weights and generating an abnormal weight index set for each weight matrix may further include: S31. For each backbone network layer, take the absolute value of each weight element in the weight matrix to obtain the weight absolute value matrix. S32. Based on the weight absolute value matrix, select abnormal weights from each backbone network layer according to the selection strategy for abnormal weights. S33. Count the number of abnormal weights in each backbone network layer; if the number of abnormal weights in the layer exceeds the preset abnormal number threshold, select the preset number of abnormal weights with the largest absolute values from all abnormal weights as the final abnormal weights; if the number does not exceed the preset abnormal number threshold, the abnormal weights of that backbone network layer are kept as selected, without further screening. S34. Generate an abnormal weight index set for each backbone network layer, storing the coordinates of each abnormal weight in the abnormal weight index set.
[0034] For example, the above steps identify, for the weight matrix $W_l$ of each layer, its abnormal weight index set $\Omega_l$, which records the position indices of the abnormal weight elements. The method includes the following steps: Step 3.1: Calculate the anomaly criteria: calculate anomaly criteria for the weight elements in the weight matrix or for their groupings (by output channel / by input dimension / by block). The anomaly criteria include at least one of the following: Amplitude criterion: the absolute value of a weight element $w_{ij}$ of $W_l$ satisfies $|w_{ij}| > \tau_l$, where the preset amplitude threshold $\tau_l$ can be a quantile of the absolute-value distribution of the weights in the layer (such as the 99.9th percentile), the sum of the mean and twice the standard deviation, or a preset proportion of the maximum absolute weight value; Top-$K$ criterion: select the $K$ weight elements with the largest absolute values in the layer as the anomaly set; Grouping criterion: apply the Top-$K$ criterion or the amplitude criterion within each output channel (or each group), to avoid anomalies being concentrated in a few channels and leading to insufficient compensation.
[0035] Step 3.2: Constrain the size of the anomaly set: to meet deployment budget and acceleration requirements, an upper limit is set on the number of elements in the abnormal weight index set $\Omega_l$. When the upper limit is exceeded, the most significant elements are retained in descending order of the anomaly criterion, and the remaining elements are not included in the exception table.
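The three options for the amplitude threshold named in Step 3.1 (a quantile, the mean plus twice the standard deviation, or a proportion of the maximum) might be computed as in the sketch below. The function names are hypothetical, the quantile uses a simple nearest-rank rule, the statistics are taken over the absolute values (an assumption), and the default proportion of 0.5 is chosen here only for illustration.

```python
import statistics


def threshold_quantile(abs_vals, q=0.999):
    """Nearest-rank empirical quantile of the absolute values (e.g. 99.9th percentile)."""
    s = sorted(abs_vals)
    idx = min(len(s) - 1, int(q * len(s)))
    return s[idx]


def threshold_mean_std(abs_vals, k=2.0):
    """Mean plus k standard deviations of the absolute values (k = 2 in the text)."""
    return statistics.fmean(abs_vals) + k * statistics.pstdev(abs_vals)


def threshold_max_ratio(abs_vals, ratio=0.5):
    """A preset proportion of the maximum absolute value (ratio is an assumption)."""
    return ratio * max(abs_vals)


weights = [0.1, -0.2, 0.1, -0.1, 0.2, -0.1, 0.1, 0.2, -0.1, 2.0]  # illustrative values
abs_vals = [abs(w) for w in weights]

tau = threshold_mean_std(abs_vals)
flagged = [v for v in abs_vals if v > tau]  # amplitude criterion: |w| > tau
print(tau, flagged)
```

On very small samples the nearest-rank quantile can coincide with the maximum, in which case a strict "greater than" test flags nothing; real layers have far more elements, so this matters mainly for toy inputs.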
[0036] The embodiments described above in this application explicitly identify the set of anomalous weight elements, preventing outliers from dominating the main quantization range. This application employs anomaly weight identification and index set construction technology (screening a small number of high-amplitude / high-impact elements in each layer of weights to form a limited-size anomalous set), which can separate outliers from the main quantization range estimation, solving the technical problem in existing technologies where a small number of outliers widen the quantization range, leading to a coarser quantization step size for the main weights.
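The screening described above (a per-group criterion followed by a cap on the set size) can be sketched as follows. This is a minimal illustration assuming output-channel grouping with a per-row Top-K criterion; `select_outliers`, `k_per_row`, and `cap` are hypothetical names, not terms from this application.

```python
def select_outliers(W, k_per_row, cap):
    """Per-row Top-K by |w|, then keep at most `cap` entries overall, largest |w| first."""
    candidates = []
    for i, row in enumerate(W):
        # Rank the column indices of this output channel by absolute weight value.
        ranked = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)
        candidates.extend((i, j) for j in ranked[:k_per_row])
    # Enforce the upper limit on the index set, keeping the most significant entries.
    candidates.sort(key=lambda ij: abs(W[ij[0]][ij[1]]), reverse=True)
    return set(candidates[:cap])


W = [[0.1, -3.0, 0.2, 0.1],
     [0.2, 0.1, 0.1, 5.0],
     [0.1, 0.3, -0.2, 0.1]]

omega = select_outliers(W, k_per_row=1, cap=2)
print(sorted(omega))
```

Selecting per row first gives every output channel a chance to contribute candidates, and the final cap bounds the storage and compute cost of the exception table.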
[0037] In some specific embodiments of this application, generating the low-bit weight matrix and exception table for each backbone network layer based on the abnormal weight index set may further include: S41. For each backbone network layer, remove the abnormal weights indicated by the abnormal weight index set from the weight matrix to obtain the target weight matrix. S42. Perform low-bit quantization on each target weight matrix to obtain a low-bit weight matrix together with the scaling factor corresponding to the low-bit quantization, where the quantization bit width is set to the target number of bits. S43. Generate an exception table for each backbone network layer. The exception table stores the values of the abnormal weights in the abnormal weight index set and the coordinates of those abnormal weights.
[0038] For example, the above steps are used to construct the weight representation consisting of backbone quantized weights plus exception table compensation. Each backbone network layer is represented by two parts: a low-bit weight matrix and an exception table. Specifically, the steps include: Step 4.1: Remove outliers from the original weight matrix $W_l$ to form the target weight matrix $\tilde{W}_l$ (i.e., the backbone weight matrix): $\tilde{W}_l = W_l \odot (1 - M_l)$, where $M_l$ is the binary mask corresponding to the abnormal weight index set $\Omega_l$ (1 at the positions of abnormal weights and 0 elsewhere), and $\odot$ denotes element-wise multiplication.
[0039] Step 4.2: Perform low-bit quantization on the target weight matrix: apply symmetric uniform quantization or group quantization to the target weight matrix $\tilde{W}_l$ to obtain the low-bit weight matrix $\hat{W}_l$, and record the corresponding scaling factor $s_l$, which can be recorded by layer, by output channel, or by group: Recorded by layer: the entire backbone network layer shares a single scaling factor; Recorded by output channel: each output channel has an independent scaling factor; Recorded by group: each weight group has an independent scaling factor. Step 4.3: Create an exception table and store outlier information: build an exception table $E_l$ that stores, for the abnormal weight index set $\Omega_l$: index information (e.g., the two-dimensional coordinates of each abnormal weight, a linear index, or a compressed index); the high-precision values corresponding to the abnormal weights (FP16 / INT8, etc.); and optional compression encoding (such as row sorting, differential encoding, or run-length encoding) to reduce index storage overhead.
[0040] In the embodiments described above in this application, during low-bit quantization, the quantization interval is determined by statistics of the target weight matrix $\tilde{W}_l$ (the weight matrix with outliers removed), rather than by statistics that include the outlier weights. This allows finer quantization steps for the backbone network layer weights. By using a structured representation that quantizes the backbone after removing outliers and accurately compensates for the exceptions, this application achieves a smaller quantization error for the main weights at the same backbone bit count.
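The two-part representation (a low-bit backbone plus an exception table) can be sketched as below. The sketch assumes symmetric uniform quantization with one scaling factor per layer and an uncompressed coordinate-to-value table; the function name `split_and_quantize` and the variable `omega` (the abnormal weight index set) are illustrative choices, and plain Python floats stand in for FP16 storage.

```python
def split_and_quantize(W, omega, bits=4):
    """Return (Q, scale, table): low-bit integers, per-layer scale, exception table."""
    qmax = 2 ** (bits - 1) - 1
    # Target (backbone) matrix: zero out the positions listed in omega.
    target = [[0.0 if (i, j) in omega else w for j, w in enumerate(row)]
              for i, row in enumerate(W)]
    # The quantization range is estimated from the target matrix only.
    max_abs = max(abs(w) for row in target for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    Q = [[max(-qmax, min(qmax, round(w / scale))) for w in row] for row in target]
    # Exception table: coordinates plus the original high-precision value.
    table = {(i, j): W[i][j] for (i, j) in sorted(omega)}
    return Q, scale, table


W = [[0.07, -0.7, 0.02],
     [0.01, 0.03, -0.05]]

Q, scale, table = split_and_quantize(W, omega={(0, 1)})
print(Q, scale, table)
```

Because the entry at (0, 1) is excluded from the range estimate, the scale is set by 0.07 instead of 0.7, so the remaining weights fall on a much finer grid.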
[0041] DiT super-resolution models are highly sensitive to fine-grained textures and edge structures under complex real-world degradations such as blur, noise, and compression artifacts, and low-bit quantization easily leads to over-smoothing, pseudo-textures, and structural shifts. To address this, this application aims to improve detail fidelity and structural consistency during quantized inference, solving the problem that details and structures are easily destroyed by quantization errors in real degraded image super-resolution tasks. This application separates backbone low-bit quantization from outlier exception table storage: uniform low-bit quantization is performed on the backbone weights after outlier removal, while an exception table stores the outlier set with high precision, optionally with index compression. This improves weight representation accuracy and detail fidelity while keeping the backbone at a low bit width, solving the technical problems of texture detail loss, edge structure deformation, and obvious artifacts under low-bit quantization in existing technologies.
[0042] In some specific embodiments of this application, defining the inference configuration of the DiT model based on the low-bit weight matrix and exception table may further include: S51. Use the equivalent linear calculation result of the low-bit weight matrix and the input features of the backbone network layer as the initial output value of the corresponding backbone network layer. S52. Based on the abnormal weights in the exception table, correct the initial output of the backbone network layer, as follows: traverse each abnormal weight $w_{ij}$ in the exception table, where $i$ is the output channel index and $j$ is the input component index; obtain the input component $x_j$ corresponding to the j-th input dimension in the input features of the backbone network layer; accumulate the product of the input component $x_j$ and the abnormal weight $w_{ij}$ onto output channel $i$ of the backbone network layer. This completes the correction of the output values of the backbone network layer.
[0043] For example, the above steps configure the DiT model for the inference phase, in which the model performs a linear computation that combines the backbone output with exception compensation. For each backbone network layer, the output $y$ for input features $x$ is computed as $y = y_{\text{main}} + y_{\text{exc}}$, where the first term $y_{\text{main}} = \hat{W} x$ is the linear output of the backbone low-bit weight matrix $\hat{W}$ and the second term $y_{\text{exc}}$ is the exception table compensation term. The output of the backbone network layer is obtained as follows: Step 5.1: Linear calculation with the low-bit backbone matrix: apply the low-bit weight matrix $\hat{W}$ to the input features $x$ by matrix multiplication, convolution, or another equivalent linear calculation to obtain the backbone output term $y_{\text{main}}$.
[0044] Step 5.2: Sparse compensation from the exception table: use the indices and values stored in the exception table $E$ to perform a sparse accumulation correction. For each abnormal weight $w_{ij}$ (located at output channel $i$ and input dimension $j$), accumulate $w_{ij} x_j$ onto output channel $i$, where $x_j$ is the input component corresponding to the $j$-th input dimension of the input features of the backbone network layer. To improve efficiency, abnormal weight entries can be aggregated by output channel, or multiple abnormal weight entries of the same output channel can be computed in vectorized form.
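Steps 5.1 and 5.2 can be illustrated with a minimal NumPy sketch. This is not the application's implementation: the function name `linear_with_exceptions`, the coordinate-list layout of the exception table (`exc_rows`, `exc_cols`, `exc_vals`), per-output-channel scaling, and the convention that outlier positions hold code 0 in the backbone are all assumptions made for illustration.

```python
import numpy as np


def linear_with_exceptions(x, w_q, scale, exc_rows, exc_cols, exc_vals):
    """Low-bit backbone output plus sparse exception-table correction.

    x        : (in_dim,) float32 input features
    w_q      : (out_dim, in_dim) integer codes of the backbone weights
               (assumption: outlier positions hold code 0)
    scale    : (out_dim,) per-output-channel scaling factors
    exc_rows : output-channel index i of each abnormal weight
    exc_cols : input-dimension index j of each abnormal weight
    exc_vals : high-precision values w_ij of the abnormal weights
    """
    # Step 5.1: dequantize the backbone codes and apply them to the input.
    y = (w_q.astype(np.float32) * scale[:, None]) @ x
    # Step 5.2: add w_ij * x_j onto output channel i for every table entry.
    # np.add.at accumulates correctly when a channel appears more than once.
    np.add.at(y, exc_rows, exc_vals.astype(np.float32) * x[exc_cols])
    return y


# Hypothetical toy layer: 2 output channels, 3 input dimensions, 3 outliers.
x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
w_q = np.array([[1, 0, -2], [0, 3, 0]], dtype=np.int8)
scale = np.array([0.5, 0.25], dtype=np.float32)
exc_rows = np.array([0, 1, 1])
exc_cols = np.array([1, 0, 2])
exc_vals = np.array([4.0, -8.0, 2.0], dtype=np.float16)

y = linear_with_exceptions(x, w_q, scale, exc_rows, exc_cols, exc_vals)
print(y)
```

In this toy case the result matches the dense product with the reassembled matrix `[[0.5, 4, -1], [-8, 0.75, 2]]`, which is the property the compensation term is meant to provide. A production kernel would more likely group entries by output channel, as the paragraph above suggests, rather than use a scatter-add.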
[0045] The embodiments described above in this application perform sparsity compensation only on a very small number of outlier connections, significantly improving detail and stability while maintaining a low bit depth in the backbone. Existing methods often rely on calibration data, layer-by-layer search, or multiple rounds of experimentation to determine the quantization configuration, resulting in complex engineering processes that are difficult to reuse under different models / data distributions. This application aims to provide a quantization deployment mechanism that can be implemented without or with minimal additional calibration and parameter tuning, improving engineering usability and portability.
[0046] This application employs a sparse compensation superposition computation technique during the inference stage (superimposing the output of the low-bit matrix multiplication of the backbone with the sparse correction terms corresponding to the exception table, and supporting aggregation or vectorization by output channel to reduce additional overhead). This technique can restore the contribution of key connections and improve inference stability with relatively small additional computational overhead, solving the technical problems in the prior art where it is necessary to increase the overall bit count or rely on complex mixed precision search to avoid the influence of outliers, resulting in high deployment costs.
[0047] In some specific embodiments of this application, based on a low-bit weight matrix, exception table, and inference configuration, a DiT model is used to perform super-resolution reconstruction of real degraded low-resolution images and output a high-resolution reconstructed image. This may further include: The low-bit weight matrix, scaling factor, exception table, and inference configuration of each backbone network layer are encapsulated into a model deployment package; Load the model deployment package into the inference framework of the pre-trained DiT model; The real degraded low-resolution image is input into the DiT model loaded with the model deployment package, and the high-resolution reconstructed image is output through the diffusion inference process of the DiT model.
[0048] The embodiments described above in this application output a deployment package for super-resolution inference of real degraded images. Specifically, the low-bit weight matrix, scaling factor, exception table (index, high-precision value), and inference computation configuration of each backbone network layer are encapsulated into a deployment package. In the application of super-resolution of real degraded images, a low-resolution image is input, and a high-resolution reconstructed image is output through the DiT model's diffusion inference process, achieving high-quality detail restoration with low storage and low computational cost.
[0049] This application relates to model compression and inference deployment for deep neural networks, specifically a low-bit weight quantization and inference acceleration method for DiT models used in real-world degraded image super-resolution. This application deploys the DiT model for real-world degraded image super-resolution under low-bit conditions, balancing inference efficiency and detail fidelity. The core principle is to use uniform low-bit quantization for the vast majority of weights, while establishing an exception table for a small number of high-amplitude / high-impact outlier weights and storing and compensating them separately with high precision. This avoids outliers widening the overall quantization range and reducing the accuracy of the main weight representation.
[0050] Furthermore, this application addresses a small number of outliers in the weights through specialized processing, thereby reducing storage and computational overhead while maintaining inference stability and high-quality detail reconstruction. This application enables low-bit, high-stability quantization deployment of DiT models for super-resolution of real degraded images, significantly reducing model storage and inference costs while maintaining good detail reconstruction and structural consistency. It overcomes the problems of outliers dominating the quantization range, impaired main quantization accuracy, and instability of low-bit inference in existing technologies.
[0051] The present application will be further described below with reference to specific embodiments in order to better understand the above technical solutions of the present application. It should be understood that the following are only some examples and are not intended to limit the present application.
[0052] Example 1: Low-bit deployment of DiT model for super-resolution of real degraded images 1) Input data and experimental setup Model input: A sequence of low-resolution images of real-world scenes, magnified to [value missing]. The input image contains real degradation factors, such as blur, noise, compression artifacts, and non-ideal downsampling caused by complex imaging.
[0053] Model: DiT model for super-resolution of real degraded images (loaded with pre-trained full-precision weights).
[0054] Deployment goal: To significantly reduce weight storage and inference overhead while maintaining texture detail, edge structure, and overall stability.
[0055] Quantization configuration: Target number of bits: 4-bit symmetric uniform quantization (INT4); Exception table weights: FP16 storage (INT8 storage is also acceptable); Exception selection strategy: each layer selects its exceptions under a "Top-$k$ per output channel" constraint or a "top ratio over the whole layer" constraint.
[0056] Hardware environment: Single A6000 GPU inference (48GB VRAM), batch size 1.
[0057] 2) Implementation steps. This embodiment achieves the objective of this application through the following steps: Step 1: Obtain the weights of the backbone network layers and initialize the quantization parameters. Read the weight matrix $W$ of each linear layer that participates in the forward computation of the DiT model; set the target number of bits $b = 4$; set the exception table precision to FP16; and set the exception table size strategy, for example: Strategy A (channel-balanced Top-$k$): in each output channel, the $k$ weights with the largest absolute values are treated as anomalies; Strategy B (whole-layer ratio $r$): in each layer, the fraction $r$ of weights with the largest absolute values is treated as anomalies.
[0058] (Default value: [value missing] or [value missing].) Unlike conventional unified low-bit quantization, this step explicitly introduces an exception channel, and the abnormal weights are later separated from the backbone quantization.
[0059] Step 2: Identify abnormal weights and generate the abnormal weight index set. For each layer's weight matrix $W$, perform anomaly detection: compute the absolute value matrix of the weights, $|W|$; select the abnormal weight index set $S$ according to Strategy A or Strategy B; the abnormal weight index set $S$ stores the coordinates $(i, j)$ of the abnormal weight elements, or their linear indices; if the number of anomalies exceeds the budget, the set is truncated in descending order of absolute value so that only the largest entries within the budget are kept.
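The selection in Step 2 can be sketched in NumPy as follows. This is an illustrative sketch only: the function name `select_outliers` and the parameter names `k`, `ratio`, and `budget` are hypothetical, and the rounding of the whole-layer count is an assumption the source does not specify.

```python
import numpy as np


def select_outliers(w, k=None, ratio=None, budget=None):
    """Return a boolean mask marking the abnormal weights of one layer.

    w      : (out_dim, in_dim) weight matrix
    k      : Strategy A, number of outliers kept per output channel
    ratio  : Strategy B, fraction of the whole layer treated as outliers
    budget : optional cap on the total number of outliers in the layer
    """
    a = np.abs(w)                       # absolute value matrix |W|
    mask = np.zeros(w.shape, dtype=bool)
    if k is not None:
        # Strategy A: the k largest |w| in every row (output channel).
        idx = np.argpartition(a, -k, axis=1)[:, -k:]
        np.put_along_axis(mask, idx, True, axis=1)
    elif ratio is not None:
        # Strategy B: the largest |w| over the whole layer.
        n = max(1, int(round(ratio * w.size)))
        mask.flat[np.argpartition(a.ravel(), -n)[-n:]] = True
    else:
        raise ValueError("set either k (Strategy A) or ratio (Strategy B)")
    if budget is not None and mask.sum() > budget:
        # Over budget: keep only the `budget` largest candidates.
        cand = np.flatnonzero(mask)
        keep = cand[np.argsort(a.ravel()[cand])[-budget:]]
        mask[:] = False
        mask.flat[keep] = True
    return mask


# Hypothetical 2 x 4 layer with two large-magnitude weights.
w = np.array([[0.1, -5.0, 0.2, 0.3],
              [2.0, 0.1, -0.4, 0.05]])
mask_a = select_outliers(w, k=1)             # Strategy A
mask_b = select_outliers(w, ratio=0.25)      # Strategy B
mask_c = select_outliers(w, k=1, budget=1)   # Strategy A with truncation
coords = np.argwhere(mask_a).tolist()        # index set as (i, j) pairs
print(coords)
```

The index set is returned here as a boolean mask because the mask is reused when the backbone weights are formed; `np.argwhere` converts it to the coordinate form described in the step.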
[0060] The purpose of this step is to prevent outliers from widening the main quantization range, so that the main weights can obtain a finer quantization step size.
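A small worked example may make the effect on the step size concrete. It assumes the common symmetric convention in which a $b$-bit quantizer has $2^{b-1}-1$ positive levels; the source does not state the exact grid, and the weight magnitudes below are hypothetical.

```latex
\begin{align*}
\Delta &= \frac{\max_{i,j} |w_{ij}|}{2^{b-1}-1},
  \qquad b = 4 \;\Rightarrow\; 2^{b-1}-1 = 7, \\
\text{outlier kept } (\max |w| = 3.5):&\quad \Delta = 3.5 / 7 = 0.5, \\
\text{outlier removed } (\max |w| = 0.35):&\quad \Delta = 0.35 / 7 = 0.05 .
\end{align*}
```

Because the rounding error of a uniform quantizer is at most $\Delta / 2$, the bound drops from 0.25 to 0.025 in this example. With the outlier kept, every weight with magnitude below 0.25 would round to zero.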
[0061] Step 3: Construct the backbone weights and perform low-bit quantization. The abnormal positions are removed in each layer to form the backbone weights: $W_b = W \odot (1 - M)$, where $M$ is the mask of abnormal positions (1 at abnormal positions, 0 elsewhere).
[0062] Subsequently, 4-bit symmetric uniform quantization $Q[\cdot]$ is applied to the target weight matrix $W_b$ (the backbone weights with the abnormal positions removed) to obtain the low-bit weight matrix $\hat{W} = Q[W_b]$ and a set of scaling factors $s$ (which can be recorded per layer, per output channel, or per group). For example, with scaling by output channel, each row (output channel) records its own scaling factor to adapt to the dynamic range of different channels.
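Step 3 can be sketched as below, assuming per-output-channel scaling and a symmetric grid of integer codes in $[-7, 7]$ for 4 bits. The function name `quantize_backbone` and the choice to zero out the abnormal positions before quantization are illustrative assumptions, not details confirmed by the source.

```python
import numpy as np


def quantize_backbone(w, mask, bits=4):
    """Symmetric uniform quantization of the backbone weights, per row.

    w    : (out_dim, in_dim) full-precision weight matrix
    mask : boolean matrix, True at abnormal (outlier) positions
    Returns integer codes (int8 container) and one scale per output channel.
    """
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit symmetric
    w_b = np.where(mask, 0.0, w)               # remove abnormal positions
    max_abs = np.abs(w_b).max(axis=1)
    # Guard against all-zero rows so the division below is well defined.
    scale = np.where(max_abs > 0, max_abs / qmax, 1.0)
    codes = np.rint(w_b / scale[:, None])
    w_q = np.clip(codes, -qmax, qmax).astype(np.int8)
    return w_q, scale.astype(np.float32)


# Hypothetical layer: one outlier (40.0) in the first output channel.
w = np.array([[3.5, -1.5, 40.0],
              [0.875, 0.25, -0.5]])
mask = np.zeros(w.shape, dtype=bool)
mask[0, 2] = True

w_q, scale = quantize_backbone(w, mask)
w_q_naive, scale_naive = quantize_backbone(w, np.zeros_like(mask))
print(w_q.tolist(), scale.tolist())
print(w_q_naive.tolist())
```

Comparing `w_q` with `w_q_naive` shows the motivation for the split: when the outlier stays in the row, its range dominates the scale and the two ordinary weights in that row collapse to codes 1 and 0.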
[0063] Step 4: Create the outlier exception table and store it in compressed form. Build the exception table $E$, which stores: the anomaly indices (a list of input indices grouped by output channel); the anomaly weight values (FP16 or INT8); and, optionally, a compressed index representation, in which differential encoding is applied to the sorted indices to reduce storage.
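One possible layout for such a table is sketched below. The dictionary keyed by output channel, the `uint16` delta type (which assumes fewer than 65536 input dimensions), and the names `build_exception_table` and `decode_indices` are hypothetical choices for illustration; the source specifies only grouping by output channel, high-precision values, and optional differential encoding.

```python
import numpy as np


def build_exception_table(w, mask):
    """Group outliers by output channel and delta-encode their input indices.

    Returns {output_channel: (deltas, values)}, where `deltas` holds the first
    input index followed by the gaps between consecutive sorted indices, and
    `values` holds the FP16 outlier weights in the same order.
    """
    table = {}
    for i in range(w.shape[0]):
        cols = np.flatnonzero(mask[i])          # sorted input indices j
        if cols.size == 0:
            continue                            # channel has no outliers
        deltas = np.diff(cols, prepend=0)       # first entry is cols[0] itself
        table[i] = (deltas.astype(np.uint16), w[i, cols].astype(np.float16))
    return table


def decode_indices(deltas):
    """Recover the absolute input indices from their delta encoding."""
    return np.cumsum(deltas.astype(np.int64))


# Hypothetical layer; a simple magnitude threshold stands in for Step 2 here.
w = np.array([[0.1, 9.0, 0.2, 0.3, -7.5, 6.0],
              [0.1, 0.2, 0.3, 0.1, 0.2, 12.0]])
mask = np.abs(w) > 5.0

table = build_exception_table(w, mask)
for channel, (deltas, values) in table.items():
    print(channel, decode_indices(deltas).tolist(), values.tolist())
```

Delta encoding pays off when indices within a channel are close together, since the gaps then fit in fewer bits than the absolute indices; the fixed-width `uint16` used here only demonstrates the round trip and does not itself save space.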
[0064] Unlike existing mixed-precision "whole-layer bit incrementing", this application retains high precision only for a very small number of critical connections, resulting in lower overall cost.
[0065] As shown in Figure 2, the weights $W$ of each backbone network layer are represented by the exception table $E$ together with the quantization result of the target weight matrix $W_b$ (i.e., the low-bit weight matrix $\hat{W}$).
[0066] Step 5: As shown in Figure 3, the inference phase executes "main branch output plus exception compensation". For input features $x$, the linear computation result $y$ is obtained as follows: Main branch: use the low-bit weight matrix $\hat{W}$ (after dequantization, or through a low-bit operator) to compute the first term of the backbone output, $y_{\text{main}} = \hat{W} x$; Exception branch: traverse the exception table $E$ and perform sparse compensation: for each outlier weight $w_{ij}$, add $w_{ij} x_j$ to output channel $i$, which yields the second term $y_{\text{exc}}$; Fusion: $y = y_{\text{main}} + y_{\text{exc}}$.
[0067] This process is repeated in all relevant linear layers of the model to complete diffusion inference and output a high-resolution reconstructed image.
[0068] 3) Experimental Results. To verify the feasibility of this application, this embodiment performs comparative tests on the real degradation test set DrealSR. The compared schemes are: Scheme S1: full-precision FP model; Scheme S2: uniform 4-bit quantization (no exception table); Scheme S3: the method of this application (4-bit backbone + exception table compensation).
[0069] (1) Image quality and perceptual quality: On the real degradation test set DrealSR (93 images), the following results were obtained:
Scheme S1 (FP): MUSIQ = 64.69, MANIQA = 0.4483, ClipIQA = 0.5555, LIQE = 4.031
Scheme S2 (unified W4): MUSIQ = 64.05, MANIQA = 0.4274, ClipIQA = 0.5360, LIQE = 3.817 (significant quantization noise appears)
Scheme S3 (this application): MUSIQ = 64.37, MANIQA = 0.4379, ClipIQA = 0.5496, LIQE = 3.947 (close to FP, with significantly better detail retention than S2)
MUSIQ, MANIQA, ClipIQA, and LIQE are all image quality evaluation metrics; for each of them, a higher value indicates higher image quality.
[0070] In Figure 3, the label RealLQ250:027 denotes image number 27 of RealLQ250, a real-world low-resolution image dataset containing 250 images. Referring to Figure 4, it can be seen that Scheme S2 exhibits obvious over-smoothing and pseudo-texture, while the detail preservation of this application is significantly better than that of Scheme S2; in real degraded image super-resolution scenarios, this application can effectively alleviate the detail loss and structural instability caused by low-bit quantization.
[0071] (2) Resource overhead. Model weight storage: Scheme S3 (this application) significantly reduces weight storage compared with Scheme S1 (FP); compared with Scheme S2 (unified W4), it adds only a small amount of exception table storage (e.g., roughly 1% to 3% additional weight volume, depending on the exception ratio setting).
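How the exception ratio translates into extra storage can be estimated with simple arithmetic. The sketch below is a back-of-envelope model, not a measurement from the application: the 0.2% outlier ratio, the 16-bit index per entry, and the omission of scaling-factor storage are all assumptions, and the function names are hypothetical.

```python
def exception_overhead(ratio, backbone_bits=4, value_bits=16, index_bits=16):
    """Exception-table storage as a fraction of the low-bit backbone storage.

    ratio      : fraction of weights stored in the exception table
    value_bits : bits per stored outlier value (16 for FP16)
    index_bits : bits per stored outlier index (assumed, uncompressed)
    """
    return ratio * (value_bits + index_bits) / backbone_bits


def bits_per_weight(ratio, backbone_bits=4, value_bits=16, index_bits=16):
    """Average storage per weight for backbone plus exception table."""
    return backbone_bits + ratio * (value_bits + index_bits)


ratio = 0.002                                   # hypothetical: 0.2% outliers
overhead = exception_overhead(ratio)            # relative to the 4-bit backbone
compression = 16 / bits_per_weight(ratio)       # relative to FP16 weights
print(f"overhead {overhead:.1%}, compression vs FP16 {compression:.2f}x")
```

Under these assumptions the table adds about 1.6% on top of the 4-bit backbone, which falls inside the range quoted above; differential index encoding would lower `index_bits` and therefore the overhead.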
[0072] Inference time: Scheme S3 (in this application) only adds sparse compensation overhead (usually much less computational cost than increasing bits across the entire layer) on the basis of the low-bit backbone operator, and still maintains a significant speedup effect overall.
[0073] The above experimental results demonstrate that this application, through its "low-bit backbone and outlier exception table compensation" structure, can effectively alleviate the detail loss and structural instability caused by low-bit quantization in real-world degraded image super-resolution scenarios, while also balancing storage and inference efficiency.
[0074] As shown in Figure 5, based on the same inventive concept, another embodiment of this application provides an image super-resolution reconstruction system based on an outlier exception table. The image super-resolution reconstruction system 100 includes:
The acquisition module 110, used to acquire the real degraded low-resolution image to be reconstructed and to determine the pre-trained DiT model;
The weight extraction module 120, used to obtain the weight matrix of each backbone network layer in the DiT model and define the exception table;
The anomaly construction module 130, used to identify abnormal weights in each weight matrix and generate the abnormal weight index set;
The exception table generation module 140, used to generate the low-bit weight matrix and the exception table of each backbone network layer based on the abnormal weight index set;
The inference configuration module 150, used to define the inference configuration of the DiT model based on the low-bit weight matrix and the exception table;
The reconstruction module 160, used to perform super-resolution reconstruction of the real degraded low-resolution image with the DiT model, based on the low-bit weight matrix, the exception table, and the inference configuration, and to output a high-resolution reconstructed image.
[0075] It should be noted that the modules in the image super-resolution reconstruction system based on outlier exception tables provided in the above embodiments of this application correspond to the steps of the image super-resolution reconstruction method based on outlier exception tables in any of the above embodiments. Those skilled in the art can refer to the step features of the image super-resolution reconstruction method based on outlier exception tables to implement the corresponding modules in the image super-resolution reconstruction system based on outlier exception tables, which will not be elaborated here.
[0076] In another embodiment of this application, an electronic device is also provided, including a memory and a processor; the memory is used to store program instructions; the processor is used to call the program instructions stored in the memory and execute the steps of the above-described image super-resolution reconstruction method based on an outlier exception table according to the obtained program instructions.
[0077] Optionally, the memory is used to store programs. The memory may include volatile memory, for example random-access memory (RAM) such as static random-access memory (SRAM) or double data rate synchronous dynamic random-access memory (DDR SDRAM); the memory may also include non-volatile memory, such as flash memory. The memory is used to store computer programs (such as application programs and functional modules that implement the above methods), computer instructions, and the like.
[0078] The aforementioned computer programs, computer instructions, etc., can be stored in partitions within one or more memory locations. Furthermore, the aforementioned computer programs, computer instructions, data, etc., can be accessed by a processor.
[0079] A processor is used to execute a computer program stored in memory to implement the various steps of the methods involved in the above embodiments. For details, please refer to the relevant descriptions in the preceding method embodiments.
[0080] The processor and memory can be separate structures or integrated structures. When the processor and memory are separate structures, they can be coupled together via a bus.
[0081] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0082] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and / or block of the flowchart illustrations and / or block diagrams, and combinations of flows and / or blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce an apparatus for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0083] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0084] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0085] The preferred features in the above embodiments can be used individually in any embodiment, or in any combination thereof, provided they do not conflict with each other. Furthermore, parts not described in detail in the embodiments can be implemented using existing technologies.
[0086] The foregoing has described some specific embodiments of this application. It should be understood that this application is not limited to the specific embodiments described above, and those skilled in the art can make various modifications or variations within the scope of the claims, which do not affect the substantive content of this application. The above-described preferred features can be used in any combination without conflict.
Claims
1. An image super-resolution reconstruction method based on an outlier exception table, characterized in that it comprises: obtaining the real degraded low-resolution image to be reconstructed and determining the pre-trained DiT model; obtaining the weight matrix of each backbone network layer in the DiT model and defining the exception table; identifying abnormal weights in each weight matrix and generating an abnormal weight index set; generating a low-bit weight matrix and an exception table for each backbone network layer based on the abnormal weight index set, the exception table being used to compensate the output of the backbone network layer; defining the inference configuration of the DiT model based on the low-bit weight matrix and the exception table; and, based on the low-bit weight matrix, the exception table, and the inference configuration, using the DiT model to perform super-resolution reconstruction of the real degraded low-resolution image and output a high-resolution reconstructed image.
2. The image super-resolution reconstruction method based on an outlier exception table according to claim 1, characterized in that, The process of obtaining the weight matrix of the backbone network layer in the DiT model and defining the exception table includes: Obtain the weight matrix of the backbone network layer in the DiT model; Set the target number of bits for quantization in the backbone network layer; Define an exception table, which is used to store the abnormal weights in the weight matrix corresponding to each backbone network layer; The exception table defines a selection strategy for exception weights, and the selection strategy is as follows: For each backbone network layer, its weight matrix is grouped, and weight elements whose absolute weight value is greater than a preset amplitude threshold are selected from each group and used as abnormal weights of that backbone network layer. Alternatively, for each backbone network layer, its weight matrix can be grouped, and a predetermined number of weight elements with the largest absolute weight value can be selected from each group as the abnormal weights of that backbone network layer.
3. The image super-resolution reconstruction method based on an outlier exception table according to claim 2, characterized in that, For each backbone network layer, its weight matrix is grouped according to any of the following methods: Output channel grouping: Using the output channels of the backbone network layer weight matrix as the basic unit, the weight elements corresponding to a single output channel are divided into a group; Grouping by input dimension: Using the input dimension of the backbone network layer weight matrix as the basic unit, the weight elements corresponding to a single input dimension are divided into a group; Block grouping: The weight matrix of the backbone network layer is divided into several non-overlapping block regions, and the weight elements in a single block region are used as a group.
4. The image super-resolution reconstruction method based on an outlier exception table according to claim 2, characterized in that, The step of identifying anomalous weights for each weight matrix and generating an anomalous index set includes: For each backbone network layer, take the absolute value of each weight element in the weight matrix to obtain the absolute weight matrix. Based on the absolute value matrix of weights, abnormal weights are selected from each backbone network layer according to the selection strategy of the abnormal weights. Count the number of abnormal weights in each backbone network layer; If the number of abnormal weights exceeds the preset abnormal number threshold, then the preset number of abnormal weights with the largest absolute weight value are selected from all abnormal weights as the final abnormal weights. If the number of abnormal weights does not exceed the preset abnormal number threshold, then the abnormal weights of the backbone network layer will not be processed. Generate an abnormal weight index set for each backbone network layer, wherein the coordinates of each abnormal weight are stored in the abnormal weight index set.
5. The image super-resolution reconstruction method based on an outlier exception table according to claim 1, characterized in that generating the low-bit weight matrix and the exception table for each backbone network layer based on the abnormal weight index set includes: for each backbone network layer, removing from the weight matrix the abnormal weights indicated by the abnormal weight index set to obtain the target weight matrix; performing low-bit quantization on each target weight matrix to obtain a low-bit weight matrix and the scaling factor corresponding to the low-bit quantization, the quantization bit width of the low-bit quantization being configured to the target number of bits; and generating an exception table for each backbone network layer, the exception table storing the values of the abnormal weights indicated by the abnormal weight index set together with the coordinates of those abnormal weights.
6. The image super-resolution reconstruction method based on an outlier exception table according to claim 1, characterized in that defining the inference configuration of the DiT model based on the low-bit weight matrix and the exception table includes: taking the result of the equivalent linear calculation between the low-bit weight matrix and the input features of the backbone network layer as the initial output value of the corresponding backbone network layer; and correcting the initial output value of the backbone network layer based on the abnormal weights in the exception table, the specific correction method being as follows: traverse each abnormal weight $w_{ij}$ in the exception table, where $i$ is the output channel index and $j$ is the input component index; obtain the input component $x_j$ corresponding to the $j$-th input dimension of the input features of the backbone network layer; accumulate the product $w_{ij} x_j$ of the abnormal weight and the input component onto output channel $i$ of the backbone network layer, thereby completing the correction of the output values of the backbone network layer.
7. The image super-resolution reconstruction method based on an outlier exception table according to claim 1, characterized in that, The method, based on a low-bit weight matrix, exception table, and inference configuration, employs a DiT model to perform super-resolution reconstruction of real degraded low-resolution images, outputting high-resolution reconstructed images, including: The low-bit weight matrix, scaling factor, exception table, and inference configuration of each backbone network layer are encapsulated into a model deployment package; Load the model deployment package into the inference framework of the pre-trained DiT model; The real degraded low-resolution image is input into the DiT model loaded with the model deployment package, and the high-resolution reconstructed image is output through the diffusion inference process of the DiT model.
8. An image super-resolution reconstruction system based on an outlier exception table, characterized in that it comprises: an acquisition module, used to acquire the real degraded low-resolution image to be reconstructed and to determine the pre-trained DiT model; a weight extraction module, used to obtain the weight matrix of each backbone network layer in the DiT model and define the exception table; an anomaly construction module, used to identify abnormal weights in each weight matrix and generate an abnormal weight index set; an exception table generation module, used to generate the low-bit weight matrix and the exception table of each backbone network layer based on the abnormal weight index set; an inference configuration module, used to define the inference configuration of the DiT model based on the low-bit weight matrix and the exception table; and a reconstruction module, used to perform super-resolution reconstruction of the real degraded low-resolution image with the DiT model, based on the low-bit weight matrix, the exception table, and the inference configuration, and to output a high-resolution reconstructed image.
9. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, When the program is executed by the processor, it implements the steps of the method as described in any one of claims 1-7.
10. An electronic device, characterized in that it comprises: at least one memory for storing program instructions; and at least one processor configured to invoke the program instructions stored in the memory and, according to the obtained program instructions, execute the steps of the method as described in any one of claims 1-7.