Image super-resolution reconstruction method based on propagation geometry redesign

By redesigning the propagation geometry, using diagonally complementary oblique propagation and a position-aligned adaptive fusion mechanism, the method addresses the axial path dependence of existing techniques. It improves the accuracy and applicability of image super-resolution reconstruction, particularly for oblique structures and high-frequency textures.

CN122243748A · Pending · Publication Date: 2026-06-19 · JIANGXI NORMAL UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
JIANGXI NORMAL UNIV
Filing Date
2026-05-25
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing image super-resolution methods based on state-space models suffer from axial path dependence in their information propagation geometry, resulting in poor reconstruction of oblique structures and high-frequency textures. Furthermore, lightweight models struggle to balance the efficiency of long-range dependency modeling against reconstruction accuracy.

Method used

The method redesigns the propagation geometry: efficient two-dimensional scanning units and auxiliary enhancement units provide diagonally complementary oblique propagation and position-aligned, content-adaptive gated fusion. Combined with cross-scale information interaction and local high-frequency texture restoration, this improves context aggregation for non-axis-aligned structures.

Benefits of technology

While maintaining efficient long-range dependency modeling capabilities, it significantly improves the accuracy and scene adaptability of image super-resolution reconstruction, especially in oblique structure and high-frequency texture restoration.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses an image super-resolution reconstruction method based on propagation geometry redesign, belonging to the field of image processing technology. The method acquires a low-resolution image and extracts initial features, which are input into deep feature reconstruction modules stacked in series for sequence modeling. In each module, an efficient two-dimensional scanning unit extracts dynamic parameters, alternately selects a complementary propagation configuration along the main diagonal or the secondary diagonal, performs diagonally complementary state propagation through a shared two-dimensional scanning backend, and performs content-adaptive feature fusion through a position-aligned gated fusion mechanism. An auxiliary enhancement unit then outputs locally enhanced features based on cross-scale interaction and local high-frequency texture recovery. Finally, a high-resolution image is output through patch inverse embedding, global residual fusion, and upsampling reconstruction. The invention addresses the anisotropic context aggregation and long interaction paths of existing models, improving the reconstruction accuracy of oblique structures and high-frequency textures.

Description

Technical Field

[0001] This invention belongs to the field of computer vision and image processing technology, and specifically relates to an image super-resolution reconstruction method based on propagation geometry redesign.

Background Technology

[0002] Image super-resolution (SR) aims to reconstruct high-resolution images from low-resolution images and is a fundamental task in computer vision. By backbone network, existing methods fall mainly into three architectures: convolutional neural networks (CNNs), vision Transformers (ViTs), and state-space models (SSMs).

[0003] Convolutional neural networks (CNNs) have achieved widespread success thanks to their strong local modeling capability, but their receptive field is limited by kernel size and network depth, making it difficult to capture global dependencies efficiently. Vision Transformers overcome this local receptive field limitation through self-attention, but their computational complexity grows quadratically with image size. In practice, attention is therefore often restricted to local windows, which weakens global dependency modeling. In recent years, selective state-space models (Mamba), with linear complexity and input-dependent selective state transitions, have become an efficient alternative for modeling long-range dependencies and have been introduced into image restoration tasks.

[0004] However, existing Mamba-based image super-resolution methods have an inherent limitation in their information propagation geometry. Most rely on one-dimensional sequence unfolding or axis-dominated multi-directional scanning, which makes information propagation strongly path-dependent. Under scanning paths dominated by the horizontal and vertical directions, the effective receptive field (ERF) of the model exhibits a pronounced cross-shaped anisotropic distribution, and effective interaction paths between pixels are restricted to the axial directions. Diagonally related pixels must therefore interact through long detours along axial paths, which severely lengthens the effective interaction paths of non-axis-aligned structures. Some existing methods fuse the results of multi-directional one-dimensional scans by fixed summation, but this is essentially post-processing and cannot preserve the isotropic spatial structure of high-resolution images at the operator level. As a result, it is difficult to accurately model the oblique contours, repeating patterns, and high-frequency line textures that occur frequently in super-resolution tasks. Furthermore, in lightweight settings with limited model capacity, existing solutions cannot achieve both efficient long-range dependency modeling and high reconstruction accuracy.
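
The path elongation described above can be illustrated with a toy hop-count model. The sketch below is not the scanning operator of any Mamba-based method; it only counts, by breadth-first search, the fewest single-pixel steps between two grid positions when steps are restricted to axial moves or to diagonal moves. The names `AXIAL`, `DIAGONAL`, and `fewest_hops` are illustrative assumptions.

```python
from collections import deque

# Allowed single-pixel steps (row offset, column offset). Both sets are assumptions
# made for this toy model only.
AXIAL = [(0, 1), (1, 0), (0, -1), (-1, 0)]
DIAGONAL = [(1, 1), (-1, -1), (1, -1), (-1, 1)]


def fewest_hops(start, goal, moves, size):
    """Return the fewest steps from start to goal on a size x size grid.

    Uses breadth-first search restricted to the given move set.
    Returns None when the goal cannot be reached with those moves.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (row, col), dist = queue.popleft()
        if (row, col) == goal:
            return dist
        for d_row, d_col in moves:
            nxt = (row + d_row, col + d_col)
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


axial_steps = fewest_hops((0, 0), (3, 3), AXIAL, size=8)
diagonal_steps = fewest_hops((0, 0), (3, 3), DIAGONAL, size=8)
```

In this toy model the diagonally related pair (0, 0) and (3, 3) needs 6 axial steps but only 3 diagonal steps. Diagonal steps alone cannot connect pixels of opposite parity, such as (0, 0) and (0, 3), so the sketch says something about path length only, not about coverage.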

[0005] In summary, existing state-space model-based super-resolution techniques suffer from severe anisotropy in context aggregation due to the axis-dominated propagation path. A key challenge in this field is how to fundamentally address the core issues of long interaction paths for non-axis-aligned structures and poor reconstruction of oblique structures and high-frequency textures without introducing complex scan branches, thereby achieving more uniform information propagation geometry that better meets the geometric requirements of super-resolution tasks.

Summary of the Invention

[0006] To address the technical problem in existing technologies where anisotropy in context aggregation caused by axial propagation paths leads to poor reconstruction results for oblique structures, repetitive textures, and non-axis-aligned details, this invention provides an image super-resolution reconstruction method based on propagation geometry redesign. This method redesigns the network at the propagation geometry level in a task-oriented manner, effectively improving the context aggregation effect for non-axis-aligned structures while maintaining efficient long-range dependency modeling capabilities.

[0007] In a first aspect, the present invention provides an image super-resolution reconstruction method based on propagation geometry redesign, comprising the following steps:
Acquiring a low-resolution image, extracting features from the low-resolution image, and outputting initial features.
Inputting the initial features into multiple deep feature reconstruction modules connected in series, which perform feature sequence modeling and reconstruction information propagation and output deep features after layer-by-layer processing. Each deep feature reconstruction module includes an efficient two-dimensional scanning unit and an auxiliary enhancement unit. The efficient two-dimensional scanning unit extracts dynamic parameters from the input features, alternately selects a main diagonal complementary propagation configuration or a secondary diagonal complementary propagation configuration according to the module's position in the network hierarchy, and obtains two complementary propagation results through diagonally complementary state propagation based on view reorganization, using a shared two-dimensional scanning backend. A position-aligned gated fusion mechanism then performs content-adaptive gated fusion of the two complementary propagation results and outputs a global propagation fusion feature. The auxiliary enhancement unit refines the global propagation fusion feature based on cross-scale information interaction and local high-frequency texture restoration, and outputs locally enhanced features that serve as the input to the next deep feature reconstruction module or as the final deep features.
Sequentially applying layer normalization, patch inverse embedding, and channel adjustment to the deep features to obtain adjusted features in the two-dimensional feature space, then weighting the adjusted features and fusing them with the initial features through a global residual connection to output deep fused features.
Inputting the deep fused features into the reconstruction network to obtain a feature upsampling result, applying bicubic upsampling to the low-resolution image to obtain a base residual image, and adding the feature upsampling result to the base residual image to output the high-resolution reconstructed image.

[0008] In a second aspect, embodiments of this application provide an electronic device, which includes a processor, a memory, and a program or instructions stored in the memory and executable on the processor. When the program or instructions are executed by the processor, they implement the steps of the method described in the first aspect.

[0009] In a third aspect, embodiments of this application provide a readable storage medium on which a program or instructions are stored, which, when executed by a processor, implement the steps of the method described in the first aspect.

[0010] Compared with the prior art, the present invention has the following beneficial effects: (1) Diagonally complementary oblique propagation mechanism: This invention reconstructs the core propagation logic of Mamba-type image super-resolution models from the perspective of propagation geometry. For state propagation over the points of the two-dimensional feature grid, the efficient two-dimensional scanning unit no longer uses the traditional axial propagation geometry dominated by the horizontal and vertical directions. Instead, it organizes the propagation paths into a diagonally complementary oblique form through view reorganization.

[0011] (2) Position-aligned content-adaptive gating fusion mechanism: For the two propagation results obtained through diagonal complementary propagation in the current module, this invention further designs a position-aligned content-adaptive gating fusion mechanism. This mechanism dynamically predicts the fusion weights of the two complementary propagation results at each spatial location based on the propagation input features, and completes adaptive weighting through gating fusion, rather than using the fixed summation fusion method in traditional multi-directional scanning methods. With this mechanism, the model can adaptively adjust the contribution ratio of information from different propagation directions according to the local image structure: in oblique structures or regions with repetitive textures, propagation results more conducive to structure recovery are strengthened; in other regions, more suitable propagation information is retained. Therefore, this invention can achieve fine-grained control of features in different spatial directions, improving the matching degree between the propagation results and the reconstruction target.
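
The difference between fixed summation and position-wise gated fusion can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not give the exact gate formula, so the sigmoid gate and the convex combination `w * a + (1 - w) * b` are assumptions, as are the names `gated_fusion`, `fixed_sum_fusion`, and `gate_logits`. The gate logits stand in for values that a predictor would derive from the propagation input features at the same spatial position; that predictor is not modeled here.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued logit to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(result_a, result_b, gate_logits):
    """Fuse two H x W propagation results position by position.

    gate_logits[r][c] is assumed to come from the input features at (r, c),
    so every position gets its own mixing weight.
    """
    fused = []
    for row_a, row_b, row_g in zip(result_a, result_b, gate_logits):
        fused_row = []
        for a, b, g in zip(row_a, row_b, row_g):
            weight = sigmoid(g)  # share given to result_a at this position
            fused_row.append(weight * a + (1.0 - weight) * b)
        fused.append(fused_row)
    return fused


def fixed_sum_fusion(result_a, result_b):
    """Baseline for comparison: the same unweighted sum at every position."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(result_a, result_b)]


# One row, two positions: the first mixes both results equally,
# the second relies almost entirely on result_a.
fused = gated_fusion([[1.0, 1.0]], [[3.0, 3.0]], [[0.0, 50.0]])
```

With a logit of 0 the two results are averaged, while a large positive logit selects the first result almost exclusively; a fixed sum cannot express that per-position preference.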

[0012] (3) Auxiliary Enhancement Mechanism: After the efficient 2D scanning unit completes the main propagation modeling, this invention further introduces an auxiliary enhancement unit to supplement and optimize the propagation features. The auxiliary enhancement unit consists of a cross-scale enhancement module and a texture-aware local branch module. The cross-scale enhancement module explicitly constructs multi-scale features and combines them with lightweight 3D convolution to enhance the direct interaction between features of different scales, improving the structure restoration effect in high-magnification super-resolution tasks. The texture-aware local branch module, through local texture extraction, channel attention adjustment, and local texture compensation, specifically enhances fine lines, high-frequency textures, and repetitive microstructures, thereby compensating for the shortcomings of global propagation in restoring extremely fine-grained details. Through the above auxiliary enhancement process, this invention can simultaneously address the needs of cross-scale structure modeling and local texture restoration, further improving the quality of reconstructed features.

[0013] Through the above technical solutions, this invention can effectively address the problems of strong propagation path dependence, significant anisotropy in context aggregation, and insufficient recovery of oblique structures and high-frequency textures in existing image super-resolution methods, while retaining the linear computational complexity advantage of the Mamba model. It achieves propagation geometry redesign and feature co-enhancement for super-resolution tasks. Specifically, at the global propagation level, this invention achieves more balanced spatial information modeling through diagonally complementary oblique propagation; at the feature fusion level, it achieves finer direction adjustment through position-aligned content-adaptive gated fusion; and at the detail enhancement level, it achieves cross-scale information supplementation and local high-frequency texture recovery through auxiliary enhancement units. This significantly improves the accuracy, scene adaptability, and lightweight deployment capability of image super-resolution reconstruction.

Attached Figure Description

[0014] Figure 1 is a diagram of the overall network architecture of the image super-resolution reconstruction method based on propagation geometry redesign, according to an embodiment of the present invention.

[0015] Figure 2 is a schematic diagram of the internal structure of the efficient two-dimensional scanning unit in an embodiment of the present invention.

[0016] Figure 3 is a schematic diagram of the internal structure of the auxiliary enhancement unit in an embodiment of the present invention.

[0017] Figure 4 is a comparison of the effective receptive field (ERF) distributions of the backbone networks of the prior art and of the present invention.

[0018] Figure 5 is a comparison of the Local Attribution Maps (LAM) produced in super-resolution reconstruction by the present invention and by existing technologies.

[0019] Figure 6 is a comparison of the subjective visual quality of this invention and other mainstream technologies on classic super-resolution tasks.

Detailed Implementation

[0020] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0021] The terms "first," "second," etc., used in the specification and claims of this application are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that embodiments of this application can be implemented in orders other than those illustrated or described herein. Furthermore, in the specification and claims, "and / or" indicates at least one of the connected objects, and the character " / " generally indicates that the preceding and following objects are in an "or" relationship. In the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified.

[0022] Example: This invention proposes an image super-resolution reconstruction network based on propagation geometry redesign (as shown in Figure 1). It mainly addresses the anisotropy of context aggregation that arises when current state-space models (such as Mamba networks) rely on one-dimensional unfolding and axial (horizontal / vertical) scanning in super-resolution tasks.

[0023] As shown in Figure 1, the image super-resolution reconstruction network of the present invention takes a low-resolution image as input and follows the main pipeline "head 3×3 convolution, N stacked deep feature reconstruction modules, normalization, patch inverse embedding, adjustment convolution, reconstruction head", combined with a bicubic upsampling residual branch, to generate the high-resolution reconstruction result.

[0024] Specifically, given a low-resolution input image, initial features are first extracted by the "head 3×3 convolution" shown in the figure. This is the network's head feature extraction operation, which maps the input image into the feature space required for subsequent deep reconstruction. The initial features are then fed into a deep feature reconstruction backbone composed of N deep feature reconstruction modules connected in series. Deep feature modeling and reconstruction information propagation are completed step by step across the stacked modules, yielding the final deep token output. The output features then undergo "normalization" and "patch inverse embedding" as shown in the figure, which restores the token-form deep representation to the two-dimensional feature space. Next, the "adjustment convolution" adjusts the channels and feature form, and a weighted global residual connection (with weighting parameter α) fuses the result with the initial features from the "head 3×3 convolution", giving the deep fused features used for reconstruction. This weighted residual design retains shallow structural information while introducing the high-level reconstruction information obtained from deep propagation, which benefits training stability and final reconstruction performance. In the reconstruction phase, the fused features are input into the "reconstruction head" in the figure, and the target high-resolution image is restored through convolutional reconstruction and upsampling. In parallel, the input low-resolution image is processed by the "bicubic upsampling" branch in the figure to generate a base upsampling result, which is added to the output of the "reconstruction head" to obtain the final high-resolution reconstructed image.

[0025] The core idea of the network backbone shown in Figure 1 is to use the deep feature reconstruction module as the basic unit of deep feature reconstruction and to design it specifically around the propagation geometry of the super-resolution task. It replaces traditional axis-dominated propagation with complementary oblique propagation, so that long-range dependency modeling better matches the recovery needs of oblique structures, repetitive textures, and non-axis-aligned details.

[0026] The efficient two-dimensional scanning unit and the auxiliary enhancement unit are the main components of the deep feature reconstruction module. See Figures 2 and 3 for details.

[0027] As shown in Figure 2, the efficient 2D scanning unit takes "propagation input features" as input and is composed of a "1×1 convolution", a "3×3 depthwise convolution", a "shared 2D scanning backend", and "gated fusion". The function of this unit is to perform efficient state propagation and orientation-adaptive feature fusion for super-resolution tasks while maintaining the 2D spatial structure.

[0028] Specifically, the input features are first processed by a 1×1 convolution followed by a 3×3 depthwise convolution. This forms the input representation for subsequent propagation and generates the dynamic parameters of the propagation process. The 1×1 convolution is used mainly for channel mapping and feature projection, while the 3×3 depthwise convolution introduces local spatial awareness, making the subsequent two-dimensional propagation content-adaptive. After this front-end transformation, the propagation input features are not sent through both diagonal propagation configurations in parallel. Instead, the configuration alternates according to the module's position in the network: the red modules in Figure 1 use the main diagonal complementary propagation configuration, the yellow modules in Figure 1 use the secondary diagonal complementary propagation configuration, and the two alternate between adjacent modules. In any given module only one diagonal propagation configuration is active, and the same "shared 2D scanning backend" is called to complete the two-dimensional scanning propagation for that configuration; the next module switches to the other diagonal configuration. With this alternating design, multiple sets of scanning operators need not be placed in parallel within a single module, which improves the uniformity of overall directional coverage while keeping the computational overhead of each module under control, balancing scanning efficiency and propagation quality. The two complementary propagation results of the active configuration are then adaptively fused at the "gated fusion" module in the figure. This step is not a simple addition: it dynamically adjusts the contribution of the two propagation results according to the input content, so that the model can select the more suitable directional information for each local structure. For oblique edges, repetitive textures, and complex geometric regions, gated fusion can make fuller use of the effective information in the complementary propagation views, improving the quality of the deep feature representation. Finally, the gated fusion result is output as the feature of this unit and passed to subsequent network layers for further processing.
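
One way to picture "alternating diagonal configurations with a shared backend" is the sketch below. It rests on several assumptions that are not stated in the text: the backend is modeled as a fixed-decay recurrence along the top-left to bottom-right diagonal (the actual unit uses input-dependent dynamic parameters), view reorganization is modeled as flipping the feature map before and after calling that single backend, and even-indexed modules are taken to use the main diagonal. The names `diagonal_scan`, `complementary_pair`, and `decay` are hypothetical.

```python
def diagonal_scan(x, decay=0.5):
    """Shared backend (assumed form): h[r][c] = x[r][c] + decay * h[r-1][c-1].

    Information flows from the top-left toward the bottom-right only.
    """
    rows, cols = len(x), len(x[0])
    h = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            previous = h[r - 1][c - 1] if r > 0 and c > 0 else 0.0
            h[r][c] = x[r][c] + decay * previous
    return h


def flip_rows(x):
    """Mirror the map top to bottom."""
    return x[::-1]


def flip_cols(x):
    """Mirror the map left to right."""
    return [row[::-1] for row in x]


def complementary_pair(x, module_index, decay=0.5):
    """Return the two complementary results for the configuration of this module.

    Even modules use the main diagonal, odd modules the secondary (anti-) diagonal.
    Every direction reuses the same diagonal_scan on a flipped view.
    """
    if module_index % 2 == 0:
        # Main diagonal: top-left to bottom-right, and the reverse direction.
        forward = diagonal_scan(x, decay)
        flipped = flip_rows(flip_cols(x))
        backward = flip_rows(flip_cols(diagonal_scan(flipped, decay)))
    else:
        # Secondary diagonal: top-right to bottom-left, and the reverse direction.
        forward = flip_cols(diagonal_scan(flip_cols(x), decay))
        backward = flip_rows(diagonal_scan(flip_rows(x), decay))
    return forward, backward


# A single impulse at the center of a 3 x 3 map shows where information spreads.
impulse = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
main_forward, main_backward = complementary_pair(impulse, module_index=0)
anti_forward, anti_backward = complementary_pair(impulse, module_index=1)
```

Under these assumptions the impulse reaches the bottom-right and top-left corners in an even module and the bottom-left and top-right corners in an odd module, so two adjacent modules together cover all four oblique directions while each module runs only one configuration.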

[0029] Overall, the core idea of the efficient 2D scanning unit shown in Figure 2 is to redesign the propagation geometry for super-resolution tasks through "front-end dynamic parameter generation + complementary dual-branch propagation + shared 2D scanning backend + gated fusion". Long-range dependency modeling is thus no longer limited to traditional axial propagation and can more effectively cover oblique structures, repeating patterns, and non-axis-aligned details.

[0030] As shown in Figure 3, the auxiliary enhancement unit takes "propagation features" as input and consists of a two-level structure, a "cross-scale enhancement module" followed by a "texture-aware local branch module", and outputs "locally enhanced features". This unit supplements cross-scale information interaction and local high-frequency texture recovery outside the main propagation path, thereby enhancing the reconstruction effect of the deep features.

[0031] Specifically, the input propagation features first enter the "cross-scale enhancement module" in the lower half of the figure. Within this module, the features pass in sequence through a "1×1 convolution", a "3×3 depthwise convolution", and an "activation", which perform feature mapping, local spatial modeling, and nonlinear enhancement in turn. The figure also shows a residual branch connecting the input of this stage directly to its output, indicating that the cross-scale enhancement module works by residual enhancement: it preserves the basic structural information of the original propagation features while adding the enhancement information obtained through convolution and activation, giving a more robust intermediate representation. The enhanced features then enter the "texture-aware local branch module" in the upper half of the figure. This module first uses "channel attention" to adaptively adjust the response strength of different channels, highlighting the feature components most relevant to texture recovery; it then uses "local texture compensation" to supplement fine-grained texture information and local high-frequency details. A residual connection is also present here, so the texture-aware local branch module can selectively enhance local details without destroying the main structural information. After residual fusion, the locally enhanced features of the auxiliary enhancement unit are output. Functionally, the auxiliary enhancement unit shown in Figure 3 does not undertake the dominant global propagation task; rather, it serves as a supplementary optimization unit for the efficient 2D scanning unit of Figure 2. The cross-scale enhancement module focuses on strengthening the information interaction between representations at different scales, improving structural recovery in high-magnification super-resolution scenes, while the texture-aware local branch module focuses on compensating for local texture details that may be insufficient after global propagation, improving the reconstruction quality of fine lines, repetitive patterns, and high-frequency regions. The two parts work together so that the output locally enhanced features better serve subsequent high-resolution image reconstruction.
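
The "channel attention plus residual connection" pattern in the texture-aware local branch can be sketched in a few lines. This is a simplified illustration, not the module itself: the gate here is a sigmoid of each channel's global average scaled by a hypothetical `gain`, whereas a learned attention layer would normally sit in between, and the local texture compensation step is omitted. All function names are assumptions.

```python
import math


def channel_attention(features, gain=1.0):
    """Scale every channel by a weight derived from its global average.

    features: list of C channels, each a list of H rows with W values.
    """
    attended = []
    for channel in features:
        count = len(channel) * len(channel[0])
        mean = sum(sum(row) for row in channel) / count   # global average pooling
        scale = 1.0 / (1.0 + math.exp(-gain * mean))      # per-channel weight in (0, 1)
        attended.append([[value * scale for value in row] for row in channel])
    return attended


def residual_add(x, y):
    """Element-wise sum of two feature stacks with identical shapes."""
    return [[[a + b for a, b in zip(row_x, row_y)]
             for row_x, row_y in zip(chan_x, chan_y)]
            for chan_x, chan_y in zip(x, y)]


def texture_branch(features, gain=1.0):
    """Residual enhancement: keep the input and add its attention-weighted copy."""
    return residual_add(features, channel_attention(features, gain))


# Two channels of size 1 x 2: a silent channel and an active one.
enhanced = texture_branch([[[0.0, 0.0]], [[2.0, 2.0]]])
```

Because of the residual path, the input values are always preserved and the attention branch only adds to them, which mirrors the statement that local details are enhanced without destroying the main structural information.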

[0032] In summary, Figures 1 to 3 explain the technical solution of this invention at three levels: overall architecture, core propagation unit, and auxiliary enhancement unit. Figure 1 describes the overall network reconstruction process. Figure 2 explains the efficient two-dimensional scanning mechanism based on propagation geometry redesign within the deep feature reconstruction module, in which the main diagonal and secondary diagonal propagation configurations alternate along the network hierarchy to achieve more balanced coverage of spatial directions while controlling the computational complexity of a single module. Figure 3 describes the auxiliary enhancement unit used together with the propagation mechanism, which further improves the quality of the reconstructed features through cross-scale feature enhancement and local texture compensation. Working together, these three elements enable the present invention to enhance the recovery of non-axis-aligned structures, repeating patterns, and local high-frequency details while maintaining efficient long-range dependency modeling, thereby improving the overall performance of image super-resolution reconstruction.

[0033] Based on the image super-resolution reconstruction network described above, an image super-resolution reconstruction method based on propagation geometry redesign is proposed. The method uses a deep learning network architecture comprising feature extraction, a deep feature reconstruction backbone, and an upsampling reconstruction head. The specific implementation process is as follows:
1. Definition of the Single-Image Super-Resolution Reconstruction Task
This invention addresses single-image super-resolution reconstruction, a low-level vision task that aims to recover a high-resolution image with higher spatial resolution and richer detail from an input low-resolution image. The input is a low-resolution image with three channels (red, green, and blue), whose dimensions are given by its height, width, and number of channels. The output is the corresponding high-resolution reconstructed image. Low-resolution images are typically obtained by downsampling and degrading high-resolution images; the core of super-resolution reconstruction therefore lies in recovering the structural information, texture details, and high-frequency responses lost during degradation.
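
For reference, the task can be written compactly. The notation below is an assumption introduced for illustration: the symbols $I_{LR}$, $I_{HR}$, $I_{SR}$, the upscaling factor $s$, and the degradation operator $\mathcal{D}$ are not taken from the original text.

$$
I_{LR} \in \mathbb{R}^{H \times W \times 3}, \qquad I_{LR} = \mathcal{D}(I_{HR}), \qquad I_{HR},\, I_{SR} \in \mathbb{R}^{sH \times sW \times 3}
$$

Here $H$ and $W$ are the height and width of the low-resolution input, $\mathcal{D}$ stands for the downsampling and degradation process, and the reconstruction $I_{SR}$ is meant to approximate $I_{HR}$.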

[0034] Existing Mamba-based image super-resolution methods mostly rely on one-dimensional sequence unfolding or axis-dominated multi-directional scanning, resulting in strong path dependence in information propagation and significant anisotropy in context aggregation. This makes it difficult to effectively adapt to the recovery requirements of oblique contours, repetitive patterns, and non-axis-aligned high-frequency textures. To address this issue, this invention proposes an image super-resolution reconstruction method based on propagation geometry redesign. Through the synergistic effect of efficient two-dimensional scanning units and auxiliary enhancement units, it achieves more balanced spatial information propagation and more effective local detail compensation.

[0035] 2. Overall Network Architecture
As shown in Figure 1, the image super-resolution reconstruction network of this invention follows the main pipeline "head 3×3 convolution, multiple deep feature reconstruction modules in series, normalization, patch inverse embedding, adjustment convolution, reconstruction head", combined with a bicubic upsampling residual branch, to generate the high-resolution reconstruction result. The input low-resolution image first passes through the head 3×3 convolution to extract initial features; the initial features then enter the deep feature reconstruction modules, which complete deep feature modeling and reconstruction information propagation layer by layer; the output features are then normalized, patch inverse embedded, and passed through the adjustment convolution before being fused with the initial features through a weighted global residual connection; finally, the fused features are sent to the reconstruction head, and its output is added to the output of the bicubic upsampling branch to obtain the final high-resolution reconstructed image.

[0036] Let the input low-resolution image be The initial features extracted by the 3×3 convolution in the head can be represented as follows: in This indicates a 3×3 convolution operation in the header.

[0037] Subsequently, the initial features are mapped to a deep feature space and input into multiple deep feature reconstruction modules for layer-by-layer processing. Let the input sequence features of the i-th deep feature reconstruction module be... Then its output can be expressed as: in Indicates the first Feature transformation operations of each deep feature reconstruction module The number of stacked deep feature reconstruction modules can be flexibly adjusted according to the model size and task requirements.

[0038] After processing by multiple deep feature reconstruction modules, the final sequence feature output is denoted as: The output is first normalized, then restored to a two-dimensional feature space through patch inversion, and the channels are adjusted by adjusting the convolution. Finally, it is fused with the initial front-end features through a weighted global residual connection to obtain deep fused features. in This indicates adjusting the convolution to unify the feature channel dimensions; It is a learnable residual scaling factor used to balance the contribution ratio of shallow and deep features; For layer normalization operations, This is for tile inversion operations. Weighted global residual connections effectively alleviate the gradient vanishing problem in deep networks while ensuring that low-level image structural information is not lost during depth propagation.

[0039] In the reconstruction phase, the deep fused features $F_{fuse}$ are fed into the reconstruction head for upsampling reconstruction, while the input low-resolution image $I_{LR}$ undergoes bicubic upsampling to form the basic residual branch. The high-resolution output is $I_{SR} = H_{rec}(F_{fuse}) + \mathrm{Bicubic}(I_{LR})$, where $H_{rec}(\cdot)$ denotes the reconstruction head, which maps the deep features to the target high-resolution space, and $\mathrm{Bicubic}(\cdot)$ denotes bicubic upsampling, which provides a stable structural prior for the reconstruction process.
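The data flow of the overall architecture can be sketched structurally as follows. This is only an illustration under stated assumptions: every stage (`head`, `to_seq`, `norm`, `to_grid`, `adjust`, `rec_head`, `bicubic`) is a hypothetical placeholder callable passed in by the caller, features are plain nested lists, and the residual scaling factor is applied to the adjusted deep features as stated in claim 1.

```python
# Structural sketch of the overall data flow; every callable is a placeholder.

def add(a, b):
    """Element-wise sum of two equally shaped nested lists."""
    if isinstance(a, list):
        return [add(x, y) for x, y in zip(a, b)]
    return a + b


def scale(a, s):
    """Multiply every element of a nested list by the scalar s."""
    if isinstance(a, list):
        return [scale(x, s) for x in a]
    return a * s


def super_resolve(lr, head, blocks, to_seq, norm, to_grid, adjust,
                  alpha, rec_head, bicubic):
    f0 = head(lr)                        # initial features (head 3x3 convolution)
    x = to_seq(f0)                       # map to the sequence feature space
    for block in blocks:                 # deep feature reconstruction modules in series
        x = block(x)
    deep = adjust(to_grid(norm(x)))      # normalization -> patch inversion -> adjustment conv
    fused = add(f0, scale(deep, alpha))  # weighted global residual connection
    return add(rec_head(fused), bicubic(lr))  # reconstruction head + bicubic branch
```

With identity placeholders the function reduces to simple arithmetic, which makes the two residual paths easy to check in isolation before real layers are substituted.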

[0040] The core idea of the network of this invention is to use deep feature reconstruction modules as the basic unit of deep modeling and to introduce a propagation geometry designed for super-resolution within each module, so that long-range dependency modeling better suits the recovery of oblique structures, repetitive textures, and non-axis-aligned details.

[0041] 3. Efficient two-dimensional scanning unit. The efficient two-dimensional scanning unit is the core propagation unit of the deep feature reconstruction module. It performs efficient state propagation and orientation-adaptive feature fusion while preserving the two-dimensional spatial structure. Unlike traditional axis-dominated propagation, this invention redesigns the propagation geometry so that information propagation better matches the recovery requirements of oblique contours, repeating patterns, and complex structures in image super-resolution.

[0042] (1) Propagation input features and dynamic parameter generation. The core of selective state transition is input-dependent dynamic parameter generation. To keep the spatial layout of the two-dimensional feature grid intact, this invention designs a parameter generation path that combines pointwise projection with depthwise convolution, avoiding the loss of spatial structure information caused by unfolding the features into a one-dimensional sequence.

[0043] Given an input feature map $X \in \mathbb{R}^{B \times C \times H \times W}$, where $B$ is the batch size, $C$ is the number of feature channels, and $H$ and $W$ are the height and width of the feature map, fused features are first generated through pointwise projection and depthwise convolution: $F_p = \mathrm{DWConv}(\mathrm{Conv}_{1\times1}(X))$, where $\mathrm{Conv}_{1\times1}(\cdot)$ denotes a 1×1 pointwise convolution that performs feature mapping along the channel dimension; $\mathrm{DWConv}(\cdot)$ denotes a depthwise convolution that fuses spatially local features while preserving channel independence; and $F_p$ denotes the parameter generation stage features. $F_p$ is then split along the channel dimension into the propagation input features and the raw dynamic parameter tensor: $[U, \Theta_{raw}] = \mathrm{Split}(F_p)$, where $U$ denotes the propagation input features used in the subsequent two-dimensional scanning propagation and $\Theta_{raw}$ denotes the raw dynamic parameter tensor. To ensure numerical stability, the raw dynamic parameters are further mapped to a valid range: $\Theta = \mathrm{Softplus}(\Theta_{raw})$, where $\mathrm{Softplus}(\cdot)$ denotes the softplus activation function and $\Theta$ denotes the target dynamic parameters ultimately used for two-dimensional state propagation.

[0044] Through the above steps, the propagation input features and dynamic parameters are generated while the spatial layout of the two-dimensional feature grid is preserved, laying the foundation for the subsequent efficient two-dimensional scanning.
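The parameter generation path described above (pointwise projection, depthwise convolution, channel split, softplus) can be sketched in plain Python for a single sample. The sketch rests on assumptions not stated in the text: the depthwise kernel is taken to be 3×3 with zero padding, the split point `num_input_channels` is a hypothetical argument, and all function names are illustrative.

```python
import math


def pointwise_conv(x, weight):
    """1x1 convolution: x is [C_in][H][W], weight is [C_out][C_in]."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(weight[o][c] * x[c][i][j] for c in range(len(x)))
              for j in range(w)] for i in range(h)]
            for o in range(len(weight))]


def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 convolution with zero padding; kernels is [C][3][3]."""
    h, w = len(x[0]), len(x[0][0])
    out = []
    for c, k in enumerate(kernels):
        plane = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += k[di + 1][dj + 1] * x[c][ii][jj]
                plane[i][j] = acc
        out.append(plane)
    return out


def softplus(v):
    """Numerically stable softplus, log(1 + exp(v))."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def generate_parameters(x, pw_weight, dw_kernels, num_input_channels):
    fused = depthwise_conv3x3(pointwise_conv(x, pw_weight), dw_kernels)
    u = fused[:num_input_channels]      # propagation input features
    raw = fused[num_input_channels:]    # raw dynamic parameter tensor
    theta = [[[softplus(v) for v in row] for row in plane] for plane in raw]
    return u, theta                     # theta is strictly positive
```

Because every operation acts on the `[C][H][W]` grid directly, the spatial layout is never flattened, which is the property the paragraph emphasizes.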

[0045] (2) Shared 2D scanning backend. Many existing Mamba-based super-resolution methods flatten the 2D feature map into a 1D sequence before scan propagation, which disrupts the original 2D spatial continuity of the visual data. To address this, this invention introduces a shared 2D scanning backend that performs state propagation directly on the 2D feature grid. Its basic propagation operator can be expressed as $Y = \mathcal{S}_{2D}(U, \Theta)$, where $\mathcal{S}_{2D}(\cdot)$ denotes the shared 2D scanning backend, $U$ and $\Theta$ are the propagation input features and the target dynamic parameters, and $Y$ denotes the two-dimensional propagation result obtained in the basic propagation coordinate system. The same backend is reused across modules, which avoids constructing independent large-scale propagation operators for different directions and keeps propagation efficient and structurally simple.

[0046] (3) Diagonal complementary propagation based on view reorganization. Although the shared 2D scanning backend performs state propagation directly in 2D space, if the traditional axis-dominated propagation geometry is retained, the effective interaction path between diagonally related pixels remains relatively long, which hinders the reconstruction of oblique structures and repetitive textures. This invention therefore introduces a diagonal complementary propagation mechanism based on view reorganization outside the shared 2D scanning backend. Let $\Phi$ denote a geometric transformation operator that spatially rearranges the input features. The corresponding transformed propagation operator is defined as $\mathcal{T}_{\Phi}(U, \Theta) = \Phi^{-1}(\mathcal{S}_{2D}(\Phi(U), \Phi(\Theta)))$, where $\mathcal{S}_{2D}(\cdot)$ is the shared 2D scanning backend and the inverse transform $\Phi^{-1}$ restores the features to their original spatial layout after propagation. To achieve more balanced directional coverage, this invention sets up two diagonal propagation configurations, each consisting of two complementary view transformations: $\mathcal{P}_{main} = \{\Phi_1^{main}, \Phi_2^{main}\}$ and $\mathcal{P}_{sec} = \{\Phi_1^{sec}, \Phi_2^{sec}\}$, where $\mathcal{P}_{main}$ denotes the complementary propagation configuration along the main diagonal direction and $\mathcal{P}_{sec}$ denotes the complementary propagation configuration along the secondary diagonal direction. The $k$-th deep feature reconstruction module selects its configuration $\mathcal{P}^{(k)}$ according to its position in the network, alternating between $\mathcal{P}_{main}$ and $\mathcal{P}_{sec}$ in adjacent modules. In Figure 1, the red modules correspond to the main diagonal complementary propagation configuration and the yellow modules to the secondary diagonal complementary propagation configuration.
Let the configuration selected for the current module be $\mathcal{P}^{(k)} = \{\Phi_1^{(k)}, \Phi_2^{(k)}\}$. The outputs of the two complementary propagation views within this module are then $Y_m = \mathcal{T}_{\Phi_m^{(k)}}(U, \Theta)$, $m \in \{1, 2\}$, where $Y_1$ and $Y_2$ denote the two complementary propagation results and $m$ is the index of the complementary propagation view. With this organization, alternating between modules and complementary within modules, there is no need to run multiple independent scanning operators in parallel within a single module, and the overall directional coverage of the network expands gradually without significantly increasing the per-module computational overhead.
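The view-reorganization idea (transform, run one shared backend, invert the transform) can be illustrated with a toy example. Everything concrete here is an assumption made for illustration: the backend is a simple linear recurrence along the main diagonal with a constant `decay` instead of input-dependent dynamic parameters, the four view transforms are flips and a 180° rotation, and the rule that odd-numbered modules use the main diagonal configuration is hypothetical. The text does not specify these details.

```python
def scan_main_diagonal(u, decay):
    """Toy shared backend: h[i][j] = decay * h[i-1][j-1] + u[i][j]."""
    rows, cols = len(u), len(u[0])
    h = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            prev = h[i - 1][j - 1] if i > 0 and j > 0 else 0.0
            h[i][j] = decay * prev + u[i][j]
    return h


def identity(x):
    return [row[:] for row in x]


def hflip(x):
    return [row[::-1] for row in x]


def vflip(x):
    return [row[:] for row in x[::-1]]


def rot180(x):
    return [row[::-1] for row in x[::-1]]


# Each transform below is its own inverse, so phi doubles as phi^-1.
MAIN_DIAGONAL = (identity, rot180)     # flows down-right and up-left
SECONDARY_DIAGONAL = (hflip, vflip)    # flows down-left and up-right


def config_for_block(k):
    """Alternate configurations by module index (odd -> main is assumed)."""
    return MAIN_DIAGONAL if k % 2 == 1 else SECONDARY_DIAGONAL


def complementary_views(u, decay, k):
    """Two propagation results for module k using the single shared backend."""
    return [phi(scan_main_diagonal(phi(u), decay)) for phi in config_for_block(k)]
```

Only one scan routine exists; the direction of propagation is changed entirely by the rearrangement applied before and after it, which is the point of sharing the backend.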

[0047] (4) Position-aligned gated fusion mechanism. Existing multi-directional scanning methods often fuse different propagation results by fixed summation and cannot adjust the contribution of each direction according to local image structure. To address this, this invention designs a position-aligned gated fusion mechanism that adaptively fuses the two complementary propagation results within the current module. Given the propagation input features $U$, the fusion logits are first predicted by a lightweight 1×1 convolution: $G = \mathrm{Conv}_{gate}(U)$, where $\mathrm{Conv}_{gate}(\cdot)$ denotes the gating convolution and $G$ is the fusion logit tensor generated for the current module. Softmax normalization is then applied to $G$ along the propagation view dimension to obtain the position-wise fusion weights for the two complementary propagation results: $[W_1, W_2] = \mathrm{Softmax}(G)$, where $W_1$ and $W_2$ denote the position-wise weights of the two complementary propagation results and satisfy $W_1 + W_2 = 1$ at every position. Finally, the output features of the current efficient 2D scanning unit are obtained by element-wise weighted summation: $Y_{fuse} = W_1 \odot Y_1 + W_2 \odot Y_2$, where $\odot$ denotes element-wise multiplication, $Y_1$ and $Y_2$ are the two complementary propagation results, and $Y_{fuse}$ denotes the global propagation fusion features output by the current efficient 2D scanning unit. This fusion mechanism adjusts the contribution of each complementary propagation result according to local image structure, improving the match between the propagation geometry and the reconstruction target.
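The position-wise fusion step can be sketched as a two-way softmax followed by a weighted sum. In this minimal illustration the logits are passed in directly as two grids (`logit1`, `logit2`) rather than being predicted by a 1×1 convolution, and the function name is hypothetical.

```python
import math


def fuse_views(y1, y2, logit1, logit2):
    """Position-wise two-way softmax over the view axis, then weighted sum."""
    fused, weights = [], []
    for r1, r2, g1, g2 in zip(y1, y2, logit1, logit2):
        out_row, w_row = [], []
        for a, b, p, q in zip(r1, r2, g1, g2):
            m = max(p, q)                      # subtract the max for stability
            e1, e2 = math.exp(p - m), math.exp(q - m)
            w1 = e1 / (e1 + e2)
            w2 = e2 / (e1 + e2)
            out_row.append(w1 * a + w2 * b)    # weighted sum at this position
            w_row.append((w1, w2))             # weights sum to 1
        fused.append(out_row)
        weights.append(w_row)
    return fused, weights
```

Equal logits reproduce plain averaging, while a large logit gap lets one view dominate at that position, which is the content-adaptive behaviour that a fixed summation cannot provide.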

[0048] 4. Auxiliary enhancement unit. The auxiliary enhancement unit follows the efficient 2D scanning unit. It supplements the main propagation path with cross-scale information interaction and local high-frequency texture recovery. The unit consists of two parts: a cross-scale enhancement module and a texture-aware local branch module.

[0049] (1) Cross-scale enhancement module. In high-magnification super-resolution tasks, strong dependencies exist between different resolution levels, and a single-scale propagation path cannot efficiently establish direct interactions between cross-resolution features. To address this, this invention designs a cross-scale enhancement module that achieves cross-scale information fusion through explicit multi-scale feature construction and lightweight 3D convolution. Given the input features $Z$ of the auxiliary enhancement unit, multi-scale features with aligned spatial dimensions are first constructed: $Z_1 = Z$, $Z_2 = \mathrm{Up}_2(\mathrm{Pool}_2(Z))$, $Z_3 = \mathrm{Up}_4(\mathrm{Pool}_4(Z))$, where $\mathrm{Pool}_2$ and $\mathrm{Pool}_4$ denote 2× and 4× average pooling and $\mathrm{Up}_2$ and $\mathrm{Up}_4$ denote 2× and 4× upsampling. The features from the three scales are then stacked along a virtual scale dimension to obtain the three-dimensional interaction features: $V = \mathrm{Stack}(Z_1, Z_2, Z_3)$, where $\mathrm{Stack}(\cdot)$ denotes stacking along the specified dimension. Cross-scale feature interaction is then performed by a 3D pointwise convolution and a 3D depthwise convolution: $\hat{V} = \mathrm{DWConv}_{3D}(\mathrm{PWConv}_{3D}(V))$, where $\mathrm{PWConv}_{3D}(\cdot)$ denotes the 3D pointwise convolution used for channel fusion and $\mathrm{DWConv}_{3D}(\cdot)$ denotes the 3D depthwise convolution used for local interaction along the scale dimension. Finally, the feature slices at the different scales are summed, and the cross-scale enhanced features are obtained through a pointwise convolutional projection: $F_{cs} = \mathrm{Conv}_{proj}(\sum_{s=1}^{3} \hat{V}_s)$, where $\hat{V}_s$ denotes the feature slice at the $s$-th scale position, $\mathrm{Conv}_{proj}(\cdot)$ denotes the projection convolution, and $F_{cs}$ denotes the cross-scale enhanced features output by the cross-scale enhancement module.
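The multi-scale construction and the final summation over scale slices can be sketched for a single channel as follows. Several choices are assumptions for illustration only: upsampling is nearest-neighbour (the text does not name the interpolation), the original-resolution map is taken as the first scale, the height and width are assumed divisible by 4, and the 3D pointwise and depthwise convolutions are omitted entirely.

```python
def avg_pool(x, s):
    """Non-overlapping s x s average pooling (H and W assumed divisible by s)."""
    return [[sum(x[i * s + a][j * s + b] for a in range(s) for b in range(s)) / (s * s)
             for j in range(len(x[0]) // s)]
            for i in range(len(x) // s)]


def upsample_nearest(x, s):
    """Nearest-neighbour upsampling by an integer factor s."""
    return [[x[i // s][j // s] for j in range(len(x[0]) * s)]
            for i in range(len(x) * s)]


def build_scale_stack(x):
    """Stack the original map with its 2x and 4x pooled-then-upsampled versions."""
    return [x,
            upsample_nearest(avg_pool(x, 2), 2),
            upsample_nearest(avg_pool(x, 4), 4)]


def sum_over_scales(stack):
    """Sum the scale slices position by position."""
    return [[sum(level[i][j] for level in stack)
             for j in range(len(stack[0][0]))]
            for i in range(len(stack[0]))]
```

Pooling followed by upsampling keeps all three slices at the same spatial size, so they can be stacked along a new scale axis and mixed by convolutions whose kernels span that axis.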

[0050] (2) Texture-aware local branch module. Although the efficient 2D scanning unit excels at modeling long-range dependencies and global structure, its recovery of extremely fine-grained high-frequency textures can be further improved. To address this, this invention designs a texture-aware local branch module as a supplement to the main propagation path, dedicated to extracting and enhancing local texture details. Given the enhanced features output by the cross-scale enhancement stage, local texture features $T$ are first extracted using a bottleneck convolutional structure built from a layer normalization, two 1×1 convolutions that respectively compress and restore the channel dimension, a depthwise convolution, and a Gaussian error linear unit (GELU) activation. The local texture features are then adaptively recalibrated by a channel attention mechanism: $A = \sigma(\mathrm{FC}_2(\delta(\mathrm{FC}_1(\mathrm{GAP}(T)))))$, where $\mathrm{GAP}(\cdot)$ denotes global average pooling, $\mathrm{FC}_1$ and $\mathrm{FC}_2$ denote two fully connected layers, $\delta(\cdot)$ denotes the rectified linear unit (ReLU) activation, $\sigma(\cdot)$ denotes the sigmoid activation, and $A$ denotes the channel attention weights. Finally, the output of the texture-aware local branch module is obtained through attention weighting: $F_{ref} = A \odot T$, where $\odot$ denotes element-wise multiplication (with $A$ broadcast over the spatial positions) and $F_{ref}$ denotes the enhanced local texture features, i.e., the refined features. This branch compensates for and enhances fine lines, local high-frequency textures, and repetitive microstructures.
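The channel attention recalibration (global average pooling, two fully connected layers with ReLU and sigmoid, then channel-wise scaling) can be sketched as below. The weight matrices `fc1` and `fc2` are hypothetical inputs, biases are omitted for brevity, and the preceding bottleneck convolution is not included.

```python
import math


def channel_attention(t, fc1, fc2):
    """t is [C][H][W]; fc1 is [C_mid][C]; fc2 is [C][C_mid]; biases omitted."""
    # Global average pooling: one scalar per channel.
    pooled = [sum(sum(row) for row in plane) / (len(plane) * len(plane[0]))
              for plane in t]
    # First fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * p for w, p in zip(ws, pooled))) for ws in fc1]
    # Second fully connected layer followed by sigmoid.
    logits = [sum(w * h for w, h in zip(ws, hidden)) for ws in fc2]
    attn = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # Scale every spatial position of a channel by that channel's weight.
    refined = [[[a * v for v in row] for row in plane]
               for a, plane in zip(attn, t)]
    return refined, attn
```

Each attention weight lies strictly between 0 and 1, so the branch can only attenuate or pass a channel; any amplification has to come from the convolutions that precede it.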

[0051] 5. Overall integration of the deep feature reconstruction module. Each deep feature reconstruction module consists of an efficient 2D scanning unit and an auxiliary enhancement unit. The former performs efficient propagation modeling for the super-resolution task, and the latter performs cross-scale enhancement and local texture compensation on the propagation results. Stacking multiple deep feature reconstruction modules in series improves the feature representation layer by layer, which improves the detail recovery quality of the reconstructed image while maintaining overall structural consistency.

[0052] Specifically, let the input features of the $k$-th deep feature reconstruction module be $X_{k-1}$.

[0053] First, in the global propagation stage, the input features are layer-normalized and fed into the efficient 2D scanning unit to obtain the global propagation fusion features: $Y_k = H_{scan}(\mathrm{LN}(X_{k-1}))$, where $H_{scan}(\cdot)$ denotes the efficient 2D scanning unit, which performs diagonal complementary propagation and gated fusion in the current module, and $Y_k$ denotes the global propagation fusion features.

[0054] Subsequently, the global propagation fusion features are further enhanced by the cross-scale enhancement module to obtain the global enhanced features: $G_k = Y_k + \beta_1 \cdot H_{cs}(Y_k)$, where $H_{cs}(Y_k)$ denotes the cross-scale enhanced features output by the cross-scale enhancement module and $\beta_1$ is the first learnable residual scaling factor, which balances the contributions of the backbone propagation features and the cross-scale enhanced features.

[0055] In the local enhancement stage, the global enhanced features undergo local detail compensation through the texture-aware local branch module, giving the final output of the current deep feature reconstruction module, i.e., the locally enhanced features: $X_k = G_k + \beta_2 \cdot H_{tex}(G_k)$, where $H_{tex}(G_k)$ denotes the refined features output by the texture-aware local branch module and $\beta_2$ is the second learnable residual scaling factor, which balances the contributions of the backbone features and the local texture enhancement features. This two-stage integration structure combines global propagation modeling, cross-scale interactive enhancement, and local texture compensation, and stabilizes training through residual connections.
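The three-step composition of one deep feature reconstruction module can be sketched on a single feature vector. The sub-units (`scan_unit`, `cross_scale`, `texture_branch`) are hypothetical placeholder callables, the scaling factors are fixed numbers rather than learned parameters, and the layer normalization has no affine terms; the residual form follows the wording of claim 2.

```python
def layer_norm(v, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance (no affine)."""
    mean = sum(v) / len(v)
    var = sum((a - mean) ** 2 for a in v) / len(v)
    return [(a - mean) / (var + eps) ** 0.5 for a in v]


def reconstruction_block(x, scan_unit, cross_scale, texture_branch, beta1, beta2):
    # Global propagation stage: layer norm, then the 2D scanning unit.
    y = scan_unit(layer_norm(x))
    # Cross-scale enhancement added back through a scaled residual.
    g = [a + beta1 * b for a, b in zip(y, cross_scale(y))]
    # Local texture compensation added back through a second scaled residual.
    return [a + beta2 * b for a, b in zip(g, texture_branch(g))]
```

Setting `beta1` and `beta2` to zero reduces the block to the scanning unit alone, which is a convenient way to reproduce the "both auxiliary modules removed" style of configuration when experimenting with such a sketch.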

[0056] Example verification. Experimental results show that the present invention outperforms most of the compared methods under both the classical and the lightweight super-resolution settings, as detailed below. 1. Stable performance improvements in classical super-resolution. For the ×2, ×3, and ×4 super-resolution tasks, this invention was compared with mainstream methods (including EDSR, RCAN, SwinIR, and MambaIR) on five standard datasets: Set5, Set14, BSD100, Urban100, and Manga109. The experimental data are shown in Table 1.
Table 1. Quantitative comparison on classical image super-resolution tasks (×2, ×3, ×4)
This invention achieves the best or near-best PSNR/SSIM at multiple scales. In the ×4 task, it reaches a PSNR of 27.41 dB on Urban100, an improvement of 0.23 dB over MambaIR, and 32.02 dB on Manga109, an improvement of 0.16 dB over MambaIR, demonstrating stronger reconstruction of repetitive textures and structural patterns.

[0057] In the ×2 super-resolution task, the present invention achieves a PSNR of 38.46 dB on Set5 and 33.83 dB on Urban100, both higher than comparable Mamba-based methods and mainstream Transformer methods, supporting its global modeling capability in low-magnification reconstruction.

[0058] 2. Better results than most compared methods in lightweight super-resolution. With a lightweight configuration of approximately 1.6M parameters, this invention was evaluated on the ×4 super-resolution task and achieves competitive performance on multiple standard datasets (see Table 2 for details).
Table 2. Quantitative comparison on lightweight image super-resolution tasks (×2, ×3, ×4)
On the five datasets Set5, Set14, BSD100, Urban100, and Manga109, the PSNR of this invention reaches 32.54 dB, 28.91 dB, 27.78 dB, 26.73 dB, and 31.33 dB, respectively, each higher than mainstream lightweight models such as SwinIR-light, MambaIR-light, and SRFormer-light.

[0059] Compared with MambaIR-light, this invention improves PSNR by 0.39 dB on Manga109 (31.33 dB vs 30.94 dB) and by 0.21 dB on Urban100 (26.73 dB vs 26.52 dB), demonstrating a clear advantage in structured image reconstruction.
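For reference, the PSNR metric quoted in these comparisons follows the standard definition $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The sketch below computes it for two grayscale images stored as nested lists and checks the gains quoted above. The peak value of 255 and the absence of any border cropping or color-space conversion are assumptions, since the evaluation protocol is not described in this section.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (nested lists)."""
    diffs = [(r - e) ** 2
             for ref_row, est_row in zip(reference, estimate)
             for r, e in zip(ref_row, est_row)]
    mse = sum(diffs) / len(diffs)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def gain_db(ours, baseline):
    """Difference between two PSNR values, rounded to two decimals."""
    return round(ours - baseline, 2)
```

Because PSNR is logarithmic, a gain of a few tenths of a dB corresponds to a modest reduction in mean squared error, and small differences between datasets should be read in that light.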

[0060] 3. Ablation experiment on efficient two-dimensional scanning. To verify the effectiveness of the core technique of this invention, efficient two-dimensional scanning, this section compares different propagation geometry configurations through ablation experiments. The experiments were conducted under the lightweight ×4 super-resolution setting, keeping all other modules fixed and changing only the propagation geometry configuration:
Single-direction propagation along the main diagonal and single-direction propagation along the secondary diagonal: two different single-direction propagation settings. In the traditional horizontal-vertical scanning geometry, information propagation is mainly limited to the axial directions and the effective receptive field has a cross-shaped distribution;
Main diagonal complementary propagation pair and secondary diagonal complementary propagation pair: two fixed diagonal complementary propagation settings that introduce diagonal complementary paths with different orientations (the main diagonal direction and the secondary diagonal direction) on top of axial propagation, so that information can propagate more directly along the diagonal direction;
Alternating main/secondary diagonal propagation: the main diagonal complementary propagation pair and the secondary diagonal complementary propagation pair are used alternately between network layers to further improve the diversity and uniformity of propagation directions.
The ablation results are shown in Table 3.
Table 3. Propagation geometry ablation experiment (lightweight ×4 setting)
Fixed diagonal complementary propagation outperforms single-direction propagation on all five datasets, indicating that the shift from single-direction propagation to diagonal complementary propagation is a core source of the performance improvement. Taking Urban100 as an example, main diagonal complementary propagation improves PSNR by 0.1 dB over the best single-direction propagation.

[0061] Alternating main/secondary diagonal propagation further outperforms fixed diagonal complementary propagation and achieves the best performance on all datasets. Compared with the best single-direction propagation, it improves PSNR by 0.08 dB, 0.08 dB, 0.04 dB, 0.13 dB, and 0.13 dB on Set5, Set14, BSD100, Urban100, and Manga109, respectively.

[0062] The improvement is particularly significant on structure-dense datasets such as Urban100 and Manga109, indicating that the propagation geometry designed in this invention has a targeted advantage for the reconstruction of repetitive textures and oblique structures.

[0063] The above ablation experiment results show that the redesign of propagation geometry (from axial propagation to diagonal complementary propagation and the use of hierarchical alternating fusion) is the core source of the performance improvement of this invention.

[0064] 4. Ablation experiment on the auxiliary enhancement unit. To evaluate the contributions of the two sub-modules of the auxiliary enhancement unit, the cross-scale enhancement module and the texture-aware local branch module, ablation experiments were conducted with alternating main/secondary diagonal propagation as the scanning configuration, comparing four configurations: the complete model, the model without the texture-aware local branch module, the model without the cross-scale enhancement module, and a baseline with both modules removed. The ablation results are shown in Table 4.
Table 4. Auxiliary module ablation experiment (lightweight ×4 setting)
The complete model achieved the best performance among these configurations on four datasets: Set5, Set14, BSD100, and Manga109. Compared with the baseline model (with both the texture-aware local branch module and the cross-scale enhancement module removed), it improved PSNR by 0.08 dB, 0.03 dB, 0.02 dB, and 0.19 dB, respectively, verifying the synergistic gain of the two auxiliary modules.

[0065] Removing only the texture-aware local branch module resulted in a decrease of 0.06 dB and 0.02 dB compared to the complete model on Set5 and Set14, respectively, and a slight improvement on Urban100. This indicates that the texture-aware local branch module is sensitive to texture structure and still has auxiliary value overall, but its contribution is related to the image content.

[0066] Removing only the cross-scale enhancement module degraded performance on Set5, Set14, BSD100, and Manga109 by 0.01 dB, 0.02 dB, 0.01 dB, and 0.03 dB, respectively, compared with the complete model, indicating that cross-scale information interaction makes a stable positive contribution.

[0067] The ablation experiments above show that the two auxiliary modules work together to bring stable gains, with the cross-scale enhancement module contributing more robustly and the texture-aware local branch module supplementing it by further enhancing detail recovery.

[0068] 5. Visualization of propagation uniformity and structural modeling capability. (1) Visual analysis of the effective receptive field. The effective receptive field (ERF) reflects how sensitive an output pixel is to the pixels at each location of the input image, and is an important indicator of a model's ability to aggregate information. Figure 4 compares the effective receptive field distributions of different super-resolution backbone networks. Transformer methods that use window attention (such as SwinIR and HAT) have effective receptive fields limited by the window size, which restricts coverage and makes it difficult to exploit long-range context for high-resolution reconstruction.

[0069] Traditional Mamba methods (such as MambaIR) employ axially dominant scan propagation, resulting in a distinct "cross-shaped" distribution of the effective receptive field—strong responses in the horizontal and vertical directions, but significantly attenuated responses in the diagonal direction. This anisotropic receptive field means that for oblique structures (such as 45° edges or diagonal textures), the model needs to indirectly transmit information via an axial path, artificially lengthening the effective interaction path and making it difficult to accurately recover oblique details during reconstruction.
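The path-length argument can be made concrete with a toy calculation. Assume, purely for illustration, that a signal is multiplied by a constant `decay` at every propagation step; real state-space models use input-dependent parameters, so the numbers below only convey the trend. With axis-only moves, covering an offset of `(di, dj)` takes `|di| + |dj|` steps, whereas allowing diagonal moves reduces this to `max(|di|, |dj|)`.

```python
def path_steps(di, dj, diagonal_moves):
    """Fewest propagation steps needed to cover an offset of (di, dj)."""
    di, dj = abs(di), abs(dj)
    if diagonal_moves:
        return max(di, dj)    # one diagonal step covers a row and a column
    return di + dj            # axis-only: rows and columns covered separately


def attenuation(di, dj, decay, diagonal_moves):
    """Weight remaining after multiplying by `decay` at every step."""
    return decay ** path_steps(di, dj, diagonal_moves)
```

For a pixel 8 rows and 8 columns away, the axis-only route needs 16 steps and the diagonal route 8, so under this toy model the diagonal route retains the square root of the axis-only weight's attenuation factor, for example `0.9 ** 8` instead of `0.9 ** 16`.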

[0070] Through propagation geometry redesign, the method of this invention introduces diagonal complementary paths while retaining axial propagation, yielding a more uniform effective receptive field with wider coverage. This indicates that the invention achieves more isotropic context aggregation, allowing the model to draw information more evenly from all directions and providing more comprehensive and balanced contextual support for high-resolution (HR) image reconstruction, which particularly benefits the accurate recovery of oblique edges and complex textures.

[0071] (2) Visual analysis of the local attribution map (LAM). The LAM reflects the evidence utilization pattern of a model by measuring the contribution of each pixel in the low-resolution (LR) image to the target super-resolution reconstruction region. Figure 5 shows the reconstruction attribution distributions of different methods in the ×2 super-resolution task. As a CNN method, EDSR concentrates its attribution on the local neighborhood of the target region, indicating that it relies mainly on local information for reconstruction and has difficulty exploiting long-range context.

[0072] SwinIR is limited by the window attention mechanism. Although the scope of attribution has been expanded, it is still limited by the window boundaries and cannot fully aggregate cross-window information.

[0073] The attribution range of MambaIR exhibits a "cross-shaped" distribution, consistent with the characteristics of the effective receptive field, indicating that its information aggregation is still mainly based on the axial direction, with insufficient utilization of the context in the diagonal direction.

[0074] The method of this invention has a significantly wider attribution range and a higher diffusion index, and the attribution region exhibits a geometric distribution consistent with the target structure. This means that the model of this invention can aggregate effective information from a wider and more structurally relevant spatial region, providing richer evidence for HR reconstruction. In particular, for oblique structures, the model can directly obtain information from distant pixels in the diagonal direction, avoiding indirect path loss from axial propagation, thereby improving reconstruction accuracy.

[0075] (3) Subjective image comparison. Figure 6 presents a subjective visual comparison of the reconstruction results of different methods in the ×4 super-resolution task. CNN methods such as EDSR and RCAN are prone to blurring and artifacts along diagonal lines and in areas with repetitive textures, and have difficulty recovering high-frequency details.

[0076] SwinIR, as a Transformer method, performs well in areas with regular structures, but still suffers from some loss of detail in complex textures and diagonal edges.

[0077] MambaIR is an improvement over CNN methods, but due to the limitation of its cross-shaped receptive field, it still has problems with inaccurate reconstruction in areas with oblique structures (such as 45° tilted eaves and diagonal textures) and repetitive patterns (such as dense brick walls and grid structures), which manifests as jagged edges and texture breaks.

[0078] The method of this invention produces clearer reconstruction results with fewer artifacts in complex areas such as diagonal lines, repetitive textures, and edge structures. Specifically: in diagonal edge areas, the method of this invention can maintain the continuity and sharpness of the edges, avoiding the stepped jaggedness commonly found in traditional methods; in repetitive texture areas, the method of this invention can accurately restore the periodic structure of the texture, avoiding texture blurring or misalignment; in high-frequency detail areas, the method of this invention can recover richer texture information, and the visual effect is closer to that of a real high-resolution image.

[0079] In summary, through propagation geometry redesign and without introducing additional scan branches, this invention significantly improves the isotropic modeling capability and structural reconstruction quality of Mamba-type super-resolution models. The experimental data and visualization results consistently show that the technique performs well in both classical and lightweight tasks and is particularly suited to image super-resolution scenarios with complex structures and repetitive textures.

[0080] Optionally, embodiments of this application also provide an electronic device, including a processor, a memory, and a program or instructions stored in the memory and executable on the processor. When the program or instructions are executed by the processor, they implement the various processes of the above-described embodiment of an image super-resolution reconstruction method based on propagation geometric redesign and achieve the same technical effect. To avoid repetition, they will not be described again here.

[0081] This application also provides a readable storage medium storing a program or instructions. When the program or instructions are executed by a processor, they implement the various processes of the above-described embodiment of an image super-resolution reconstruction method based on propagation geometric redesign and achieve the same technical effect. To avoid repetition, they will not be described again here.

[0082] The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes computer-readable storage media, such as computer read-only memory (ROM), random access memory (RAM), magnetic disk, or optical disk.

[0083] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element. Furthermore, it should be noted that the scope of the methods and apparatuses in the embodiments of this application is not limited to performing functions in the order shown or discussed, but may also include performing functions substantially simultaneously or in the reverse order, depending on the functions involved. For example, the described methods may be performed in a different order than described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.

[0084] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0085] The embodiments of this application have been described above with reference to the accompanying drawings. However, this application is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of this application without departing from the spirit and scope of the claims, and all of these forms are within the protection scope of this application.

Claims

1. An image super-resolution reconstruction method based on propagation geometry redesign, characterized in that it includes the following steps: A low-resolution image is acquired, features are extracted from the low-resolution image, and initial features are output. The initial features are input into multiple deep feature reconstruction modules stacked in series for feature sequence modeling and reconstruction information propagation, and deep features are output after layer-by-layer processing. Each deep feature reconstruction module includes an efficient two-dimensional scanning unit and an auxiliary enhancement unit, and the auxiliary enhancement unit includes a cross-scale enhancement module and a texture-aware local branch module connected in cascade. The efficient two-dimensional scanning unit extracts dynamic parameters from the input features, alternately selects the main diagonal complementary propagation configuration or the sub-diagonal complementary propagation configuration according to the hierarchical order of the module in the network, and obtains two complementary propagation results by performing diagonal complementary state propagation based on view reorganization through a shared two-dimensional scanning backend. The two complementary propagation results are then fused by content-adaptive gating using a position-aligned gating fusion mechanism to output global propagation fusion features. The auxiliary enhancement unit refines the global propagation fusion features based on cross-scale information interaction and local high-frequency texture restoration, and outputs local enhanced features, which serve as the input to the next deep feature reconstruction module or as the final deep features. The deep features are sequentially subjected to layer normalization, patch inversion, and channel adjustment to obtain adjusted features in the two-dimensional feature space. The adjusted features are then weighted and fused with the initial features through a global residual connection to output deep fusion features. The deep fusion features are input into the reconstruction network to obtain a feature upsampling result. The low-resolution image is subjected to bicubic upsampling to obtain a basic residual image. The feature upsampling result is added to the basic residual image to output a high-resolution reconstructed image.
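As a non-limiting illustration, the overall dataflow of claim 1 can be sketched as follows. Every learned stage is passed in as a caller-supplied function, so the names `extract`, `blocks`, `tail`, `reconstruct`, `upsample`, and the scalar weight `alpha` are hypothetical placeholders rather than parts of the claimed method, and the demo uses nearest-neighbour upsampling only as a short stand-in for bicubic upsampling.

```python
import numpy as np


def super_resolve(lr, extract, blocks, tail, reconstruct, upsample, alpha=1.0):
    """Dataflow of claim 1 with every learned stage supplied as a callable."""
    f0 = extract(lr)                   # initial features
    f = f0
    for block in blocks:               # serially stacked deep feature reconstruction modules
        f = block(f)
    adjusted = tail(f)                 # layer norm -> patch inversion -> channel adjustment
    fused = f0 + alpha * adjusted      # weighted global residual fusion (alpha is assumed)
    # Feature upsampling result plus the upsampled low-resolution image.
    return reconstruct(fused) + upsample(lr)


def nearest_upsample(img, scale=2):
    """Nearest-neighbour stand-in for bicubic upsampling (demo only)."""
    return img.repeat(scale, axis=-2).repeat(scale, axis=-1)


# Toy run: the reconstruction branch returns zeros, so the output equals
# the upsampled input image.
lr = np.arange(12, dtype=float).reshape(3, 2, 2)   # (C, H, W)
sr = super_resolve(
    lr,
    extract=lambda x: x,
    blocks=[lambda f: 0.5 * f] * 2,
    tail=lambda f: f,
    reconstruct=lambda f: np.zeros((3, 4, 4)),
    upsample=nearest_upsample,
)
print(sr.shape)
```

Adding the upsampled input at the end means the network branch only has to predict the missing high-frequency detail.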

2. The method of claim 1, wherein the data processing within each deep feature reconstruction module includes: The input features are layer-normalized and then fed into the efficient two-dimensional scanning unit to obtain the global propagation fusion features; The global propagation fusion features are input into the cross-scale enhancement module, and the module output is multiplied by a first learnable residual scaling factor and then added to the global propagation fusion features to output global enhancement features; The global enhancement features are input into the texture-aware local branch module, and the module output is multiplied by a second learnable residual scaling factor and then added to the global enhancement features to output the local enhanced features.
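As a non-limiting illustration, the two scaled residual connections of claim 2 can be sketched as follows. The function names, the `(tokens, channels)` layout, and the affine-free layer normalization are assumptions made for the example; the three sub-modules are passed in as callables.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize each token over its channel axis (no learned affine, for brevity)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def block_forward(x, scan_unit, cross_scale, texture_branch, s1, s2):
    """One deep feature reconstruction module; s1 and s2 are the residual scaling factors."""
    g = scan_unit(layer_norm(x))          # global propagation fusion features
    e = g + s1 * cross_scale(g)           # global enhancement features
    return e + s2 * texture_branch(e)     # local enhanced features


x = np.random.default_rng(0).normal(size=(16, 8))   # 16 tokens, 8 channels
identity = lambda t: t
out = block_forward(x, identity, identity, identity, s1=0.5, s2=0.25)
print(out.shape)
```

With both scaling factors set to zero the module reduces to the scanning unit alone, which is one reason small learnable residual scales are a common way to stabilize early training.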

3. The method of claim 1, wherein the extraction of dynamic parameters from the input features by the efficient two-dimensional scanning unit specifically includes: The input features of the current deep feature reconstruction module are obtained and input into the front end, where a pointwise convolution performs feature mapping along the channel dimension. The mapping result is input into a depthwise convolution to perform spatial local feature fusion, and the output includes content-aware parameter generation stage features. The parameter generation stage features are split along the channel dimension into the propagation input features used for backbone state propagation and an original dynamic parameter tensor. The original dynamic parameter tensor is mapped to a numerically stable range by a soft positive activation function to output target dynamic parameters, which control the two-dimensional state transitions at the feature grid points.
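As a non-limiting illustration, the front end of claim 3 can be sketched in NumPy. The 3x3 kernel size, the zero padding, the split point `c_prop`, and the choice of softplus as the soft positive activation are assumptions made for the example; the claim does not fix them.

```python
import numpy as np


def pointwise_conv(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)


def depthwise_conv3x3(x, k):
    """Per-channel 3x3 convolution with zero padding: x is (C, H, W), k is (C, 3, 3)."""
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += k[:, dy, dx, None, None] * xp[:, dy:dy + h, dx:dx + w]
    return out


def softplus(z):
    """log(1 + exp(z)), computed stably; assumed form of the soft positive activation."""
    return np.logaddexp(0.0, z)


def front_end(x, w_pw, k_dw, c_prop):
    """Return (propagation input features, target dynamic parameters)."""
    feat = depthwise_conv3x3(pointwise_conv(x, w_pw), k_dw)
    u, raw = feat[:c_prop], feat[c_prop:]     # split along the channel dimension
    return u, softplus(raw)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 6, 6))
w_pw = rng.normal(size=(10, 4))
k_dw = rng.normal(size=(10, 3, 3))
u, params = front_end(x, w_pw, k_dw, c_prop=6)
print(u.shape, params.shape)
```

Because softplus is strictly positive for finite inputs, the resulting parameters can be used directly as decay or step-size terms without a separate clamp.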

4. The method of claim 3, wherein the process of alternately selecting the main diagonal complementary propagation configuration or the sub-diagonal complementary propagation configuration according to the hierarchical order of the module in the network includes: The main diagonal complementary propagation configuration and the sub-diagonal complementary propagation configuration are predefined. The main diagonal complementary propagation configuration includes a first set of spatially rearranged views extending forward and backward along the main diagonal of the image, and the sub-diagonal complementary propagation configuration includes a second set of spatially rearranged views extending forward and backward along the sub-diagonal of the image. The network layer sequence number of the current deep feature reconstruction module among the multiple deep feature reconstruction modules stacked in series is obtained; When the network layer sequence number is odd, the main diagonal complementary propagation configuration is selected as the current working path; when the network layer sequence number is even, the sub-diagonal complementary propagation configuration is selected as the current working path.
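As a non-limiting illustration, the parity rule of claim 4 can be written in a few lines. The function name, the string labels, and the 1-based numbering of the layer sequence number are assumptions made for the example.

```python
def select_configuration(layer_index):
    """Pick the working path from the 1-based layer sequence number."""
    if layer_index < 1:
        raise ValueError("layer_index is assumed to start at 1")
    # Odd layers use the main diagonal, even layers the sub-diagonal.
    return "main_diagonal" if layer_index % 2 == 1 else "sub_diagonal"


print([select_configuration(i) for i in range(1, 5)])
# prints ['main_diagonal', 'sub_diagonal', 'main_diagonal', 'sub_diagonal']
```

Alternating by depth lets two consecutive modules together cover both diagonal orientations while each module still runs only two propagation passes.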

5. The method of claim 4, wherein the process of obtaining two complementary propagation results by performing diagonal complementary state propagation based on view reorganization through the shared two-dimensional scanning backend includes: For the first diagonal propagation direction under the current working path, a first spatial rearrangement operator is used to rearrange the propagation input features into first reorganized view features. The first reorganized view features, together with the target dynamic parameters, are input into the shared two-dimensional scanning backend, which performs the state propagation operation on the two-dimensional feature grid to obtain first intermediate propagation features. The inverse of the first spatial rearrangement operator is applied to the first intermediate propagation features for spatial restoration, and the first complementary propagation result is output. For the second diagonal propagation direction under the current working path, a second spatial rearrangement operator is used to rearrange the propagation input features into second reorganized view features. The second reorganized view features, together with the target dynamic parameters, are input into the shared two-dimensional scanning backend, which performs the state propagation operation on the two-dimensional feature grid to obtain second intermediate propagation features. The inverse of the second spatial rearrangement operator is applied to the second intermediate propagation features for spatial restoration, and the second complementary propagation result is output. The first complementary propagation result and the second complementary propagation result together form the two complementary propagation results of the current module.
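As a non-limiting illustration, the rearrange, propagate, restore pattern of claim 5 can be sketched as follows. Several things here are assumptions made for the example and are not stated in the claim: the rearrangement operators are taken to be spatial flips (which are their own inverses), the dynamic parameters are rearranged together with the features so they stay aligned with their grid points, and `toy_backend` is a simple causal recurrence standing in for the shared two-dimensional scanning backend.

```python
import numpy as np

# Assumed rearrangement operators: each pair gives the forward and backward
# direction along one diagonal. Flips are involutions, so op is its own inverse.
VIEWS = {
    "main_diagonal": (lambda a: a, lambda a: a[..., ::-1, ::-1]),
    "sub_diagonal": (lambda a: a[..., :, ::-1], lambda a: a[..., ::-1, :]),
}


def toy_backend(x, gate):
    """Stand-in scan that carries state from the top-left toward the bottom-right:
    h[i, j] = x[i, j] + gate[i, j] * 0.5 * (h[i-1, j] + h[i, j-1])."""
    h = np.zeros_like(x, dtype=float)
    rows, cols = x.shape[-2:]
    for i in range(rows):
        for j in range(cols):
            up = h[..., i - 1, j] if i > 0 else 0.0
            left = h[..., i, j - 1] if j > 0 else 0.0
            h[..., i, j] = x[..., i, j] + gate[..., i, j] * 0.5 * (up + left)
    return h


def complementary_propagation(u, gate, config, backend=toy_backend):
    """Run the same backend on two rearranged views and map each result back."""
    results = []
    for op in VIEWS[config]:
        mid = backend(op(u), op(gate))   # intermediate propagation features
        results.append(op(mid))          # spatial restoration with the inverse operator
    return results


u = np.zeros((3, 3))
u[1, 1] = 1.0                            # single impulse at the centre
gate = np.ones((3, 3))
fwd, bwd = complementary_propagation(u, gate, "main_diagonal")
print(fwd[2, 2], bwd[0, 0])
```

In this toy setup the impulse reaches the bottom-right corner in the first result and the top-left corner in the second, which is the complementary behaviour the claim describes; only the views change, while the backend is shared.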

6. The method of claim 5, wherein using the position-aligned gating fusion mechanism to perform content-adaptive gating fusion on the two complementary propagation results and output the global propagation fusion features includes the following specific steps: The propagation input features are input into a gated convolutional layer for feature extraction, which predicts a fusion log-weight tensor reflecting the local image structure. The fusion log-weight tensor is normalized along the propagation view dimension to obtain a first position-wise fusion weight corresponding to the first complementary propagation result and a second position-wise fusion weight corresponding to the second complementary propagation result. The first complementary propagation result is multiplied element-wise by the first position-wise fusion weight, and the second complementary propagation result is multiplied element-wise by the second position-wise fusion weight. The two products are summed position-wise, and the global propagation fusion features are output.
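As a non-limiting illustration, the gating fusion of claim 6 can be sketched as follows. The example assumes that the normalization along the view dimension is a softmax, that the gating layer is a pointwise projection to two channels (`w_gate` is a hypothetical weight), and that one weight per position is shared across channels.

```python
import numpy as np


def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def gated_fusion(u, y1, y2, w_gate):
    """u, y1, y2 are (C, H, W); w_gate is (2, C). Returns (fused, weights)."""
    logits = np.einsum("vc,chw->vhw", w_gate, u)   # fusion log-weights, one map per view
    weights = softmax(logits, axis=0)              # normalize over the view dimension
    fused = weights[0] * y1 + weights[1] * y2      # position-wise weighted sum
    return fused, weights


rng = np.random.default_rng(0)
u = rng.normal(size=(4, 5, 5))
y1 = rng.normal(size=(4, 5, 5))
y2 = rng.normal(size=(4, 5, 5))
fused, weights = gated_fusion(u, y1, y2, rng.normal(size=(2, 4)))
print(fused.shape, weights.shape)
```

Because the two weights sum to one at every position, the fusion is a convex combination: each pixel can lean toward whichever propagation direction suits its local structure.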

7. The method of claim 1 or 2, wherein the first part of the cross-scale enhancement module includes constructing scale-aligned features and generating a three-dimensional interactive feature: The acquired global propagation fusion features are subjected to 2x average pooling and 4x average pooling respectively to generate 2x downsampled features and 4x downsampled features; The 2x downsampled features are upsampled by 2x interpolation to obtain first scale-aligned features, and the 4x downsampled features are upsampled by 4x interpolation to obtain second scale-aligned features; The global propagation fusion features, the first scale-aligned features, and the second scale-aligned features are stacked along a virtual scale dimension to output a three-dimensional interactive feature.
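As a non-limiting illustration, the scale stack of claim 7 can be built as follows. The example assumes a `(C, H, W)` layout with `H` and `W` divisible by 4, and uses nearest-neighbour repetition as the upsampling interpolation; the claim does not name the interpolation kernel.

```python
import numpy as np


def avg_pool(x, k):
    """Non-overlapping k x k average pooling on a (C, H, W) array."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))


def upsample_nearest(x, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return x.repeat(k, axis=1).repeat(k, axis=2)


def build_scale_stack(x):
    """Stack the input and its 2x and 4x smoothed versions along a new scale axis."""
    aligned2 = upsample_nearest(avg_pool(x, 2), 2)   # first scale-aligned features
    aligned4 = upsample_nearest(avg_pool(x, 4), 4)   # second scale-aligned features
    return np.stack([x, aligned2, aligned4], axis=1)  # (C, 3, H, W)


x = np.random.default_rng(0).normal(size=(2, 8, 8))
stack = build_scale_stack(x)
print(stack.shape)
```

After alignment all three slices share the same spatial grid, so a later convolution along the scale axis compares the same location at three levels of smoothing.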

8. The method of claim 7, wherein the second part of the cross-scale enhancement module includes performing inter-scale information transfer to extract cross-scale enhanced features: The three-dimensional feature tensor is input into a three-dimensional pointwise convolution to perform channel dimension fusion and obtain three-dimensional channel fusion features, and the three-dimensional channel fusion features are input into a three-dimensional depthwise convolution to perform feature interaction along the virtual scale dimension and obtain three-dimensional interactive features. The three-dimensional interactive features are sliced along the virtual scale dimension into three feature slice matrices of different scales. The three feature slice matrices are summed element by element, the summation result is input into a projection convolution layer for feature dimension restoration mapping, and the cross-scale enhanced features are output.
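As a non-limiting illustration, the inter-scale mixing of claim 8 can be sketched as follows. The example assumes a depthwise kernel of size 3 acting only along the scale axis with zero padding, and a pointwise projection layer; the weight names `w_pw`, `k_scale`, and `w_proj` are hypothetical.

```python
import numpy as np


def cross_scale_mix(stack, w_pw, k_scale, w_proj):
    """stack: (C, S, H, W); w_pw: (C, C); k_scale: (C, 3); w_proj: (C_out, C)."""
    # 3D pointwise convolution: fuse channels at every (scale, y, x) location.
    z = np.einsum("oc,cshw->oshw", w_pw, stack)
    # 3D depthwise convolution restricted to the scale axis (assumed kernel 3x1x1).
    n_scales = z.shape[1]
    zp = np.pad(z, ((0, 0), (1, 1), (0, 0), (0, 0)))
    mixed = sum(k_scale[:, t, None, None, None] * zp[:, t:t + n_scales] for t in range(3))
    # Slice along the scale axis and sum the slices element by element.
    summed = mixed.sum(axis=1)
    # Projection convolution restores the feature dimension.
    return np.einsum("oc,chw->ohw", w_proj, summed)


rng = np.random.default_rng(0)
stack = rng.normal(size=(4, 3, 6, 6))
out = cross_scale_mix(stack, rng.normal(size=(4, 4)), rng.normal(size=(4, 3)), rng.normal(size=(4, 4)))
print(out.shape)
```

With identity channel weights and a kernel that only keeps the centre tap, the function reduces to a plain sum of the three scale slices, which is a convenient sanity check.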

9. The method of claim 1 or 2, wherein the first part of the texture-aware local branch module includes extracting local texture features using a bottleneck structure: The obtained cross-scale enhanced features are sequentially subjected to layer normalization and a front-end pointwise convolution that compresses the feature channels, to obtain channel-compressed features. A two-dimensional depthwise convolution extracts spatial local receptive field information from the channel-compressed features, and a Gaussian error linear unit (GELU) activation function applies nonlinear activation to obtain local activation features. A back-end pointwise convolution that restores the original number of feature channels maps the local activation features, and the local texture features to be recalibrated are output.
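As a non-limiting illustration, the bottleneck of claim 9 can be sketched as follows. The example assumes a `(C, H, W)` layout, layer normalization over the channel axis without learned affine terms, a 3x3 depthwise kernel with zero padding, and the tanh approximation of GELU; the weight names are hypothetical.

```python
import numpy as np


def layer_norm_channels(x, eps=1e-6):
    """Normalize every spatial position over its channels; x is (C, H, W)."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def gelu(x):
    """Tanh approximation of the Gaussian error linear unit."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))


def depthwise_conv3x3(x, k):
    """Per-channel 3x3 convolution with zero padding; k is (C, 3, 3)."""
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += k[:, dy, dx, None, None] * xp[:, dy:dy + h, dx:dx + w]
    return out


def texture_bottleneck(x, w_down, k_dw, w_up):
    """Compress channels, filter locally, activate, then restore channels."""
    z = np.einsum("oc,chw->ohw", w_down, layer_norm_channels(x))   # channel-compressed
    z = gelu(depthwise_conv3x3(z, k_dw))                            # local activation
    return np.einsum("oc,chw->ohw", w_up, z)                        # local texture features


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5, 5))
tex = texture_bottleneck(x, rng.normal(size=(2, 8)), rng.normal(size=(2, 3, 3)), rng.normal(size=(8, 2)))
print(tex.shape)
```

Running the spatial filter on the compressed channels keeps the cost of the local branch low relative to the scanning unit.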

10. The method of claim 9, wherein the second part of the texture-aware local branch module includes adaptively recalibrating the local texture features along the channel dimension: Global average pooling is performed on the local texture features to compress the spatial dimensions and obtain a global spatial context vector; The global spatial context vector is sequentially input into a first fully connected layer, a rectified linear unit (ReLU) activation function, a second fully connected layer, and a sigmoid (S-shaped) activation function, whereby the network learns the inter-channel dependencies and calculates a channel attention weight for each feature channel. The local texture features are multiplied channel by channel with the channel attention weights, so that the feature representation of texture-related channels is enhanced and the response of irrelevant channels is suppressed, and the local enhanced detail features are output.
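As a non-limiting illustration, the channel recalibration of claim 10 can be sketched as follows. The example assumes a `(C, H, W)` layout and a sigmoid as the final activation; the weight and bias names and the hidden width are hypothetical.

```python
import numpy as np


def channel_recalibrate(x, w1, b1, w2, b2):
    """x: (C, H, W); w1: (R, C); b1: (R,); w2: (C, R); b2: (C,)."""
    context = x.mean(axis=(1, 2))                     # global spatial context vector, (C,)
    hidden = np.maximum(0.0, w1 @ context + b1)       # first fully connected layer + ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))   # second layer + sigmoid
    return x * weights[:, None, None], weights        # channel-wise rescaling


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5, 5))
out, weights = channel_recalibrate(
    x,
    rng.normal(size=(2, 8)), np.zeros(2),
    rng.normal(size=(8, 2)), np.zeros(8),
)
print(out.shape, weights.shape)
```

Each channel receives a single weight between 0 and 1, so the step can only attenuate channels relative to one another; it does not change the spatial pattern inside a channel.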