A grayscale image prediction method based on a lightweight neural network
The image prediction method built with a lightweight neural network utilizes contextual pixels and multi-directional gradient features to perform image prediction, solving the challenges of high accuracy and lightweight deployment, and achieving efficient image compression in resource-constrained scenarios.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GUANGXI NORMAL UNIV
- Filing Date
- 2026-04-14
- Publication Date
- 2026-06-26
AI Technical Summary
Existing image compression models cannot simultaneously meet the requirements of high-precision prediction and lightweight deployment. Traditional models have poor adaptability, and deep learning-based models are computationally complex and resource-intensive, making them difficult to adapt to resource-constrained scenarios.
Lightweight Neural Networks (LNNs) are constructed using fully connected layers and activation functions to perform image prediction based on contextual pixels and multi-directional gradient features. High-precision prediction is achieved by utilizing feature projection, fusion, and prediction weight calculation. The model structure is lightweight and does not require deep learning framework support.
It achieves high-precision image prediction in resource-constrained scenarios, reduces computation and storage requirements, and improves image compression efficiency and adaptability.
Figure CN122289408A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to image processing and video processing technologies, specifically to a grayscale image prediction method based on a lightweight neural network.
Background Technology
[0002] Images and videos, as core carriers of information transmission, have seen their data volume grow rapidly with the booming development of the digital information industry. Due to the high data redundancy in original images, compression can significantly reduce data volume, playing a crucial role in alleviating storage pressure, reducing storage and transmission costs, and improving transmission efficiency. As a core supporting technology for image compression, pixel prediction typically builds a prediction model based on the spatial correlation between pixels. In natural images, adjacent pixels often exhibit strong correlations. High-quality prediction models can generate accurate prediction values, resulting in prediction residuals concentrated near zero. Compared to encoding the original data, encoding the prediction residuals significantly improves compression efficiency; therefore, the performance of the prediction model plays a decisive role in image compression efficiency.
[0003] In the field of image compression, prediction models can be broadly divided into traditional prediction models and deep learning-based prediction models. Traditional prediction models typically contain only a few manually designed prediction patterns. They have low computational complexity and low deployment cost, which makes them suitable for resource-constrained scenarios such as embedded devices. Their drawbacks are also significant: the fixed prediction patterns adapt poorly, and they struggle to capture nonlinear relationships between pixels in texture-dense scenes, which lowers prediction accuracy. Deep learning-based prediction models, by contrast, use neural networks to learn complex relationships between pixels and generally achieve higher prediction accuracy and stronger generalization. However, these models usually have complex structures and many parameters, require substantial computing and storage resources for inference, depend heavily on deep learning frameworks for deployment, and place stringent demands on hardware, so they are difficult to adapt to resource-constrained scenarios such as terminal devices and IoT nodes. In summary, existing prediction models struggle to meet the dual demands of high-precision prediction and lightweight deployment at the same time. Developing a high-precision image prediction model based on a lightweight neural network is therefore a pressing technical challenge with significant industrial value and broad application prospects.
Summary of the Invention
[0004] The purpose of this invention is to address the shortcomings of existing technologies by providing a grayscale image prediction method based on lightweight neural networks (LNNs). This method uses neural networks to learn and model the complex relationships between image pixels, achieving high-precision image prediction.
[0005] The technical solution that achieves this objective is a grayscale image prediction method based on a lightweight neural network, comprising the following steps:
1) Training the LNN network: The LNN network is built from four types of fully connected layers, denoted T1, T2, T3, and T4, combined with activation functions. Using two kinds of information from the input image, context pixels and multi-directional gradient features, the LNN network predicts the target image through an end-to-end pipeline of feature projection, feature fusion, prediction weight calculation, and pixel prediction. Fully connected layers T1 and T2 linearly project the context pixels and the multi-directional gradient features, respectively, so that both have the same dimension after projection. Feature fusion then produces a fused feature F that combines pixel spatial correlation with image texture gradient information. F is fed to the prediction weight calculation module CCB, composed of fully connected layers T3 and T4, which outputs the prediction weights for the reference pixels. Finally, the predicted value of the target pixel is obtained as the weighted sum of the reference pixels under the prediction weights. The fully connected layers of the LNN network have no bias terms. The steps are as follows:
1-1) Image preprocessing: To improve generalization and speed up training convergence, each training image is randomly cropped into n×n image blocks, and random horizontal or vertical flipping is applied for data augmentation. The input tensor of the LNN network has dimensions (B, 1, n, n), where B is the batch size, "1" is the single channel of the grayscale image, and n×n is the size of the input image block.
To handle missing context pixels at the block boundary, the input image block is padded by replication: 2 rows of pixels are added above the block, 2 columns on its left side, and 1 column on its right side. Each padding pixel is copied from the existing pixel at the smallest Euclidean distance. After padding, the n×n image block becomes an (n+2)×(n+3) image block.
1-2) Calculating the multi-directional gradient features of the image block: The features are the block gradients in four directions: horizontal, vertical, main diagonal, and secondary diagonal. The gradient in each direction is the sum of the absolute differences between pixel pairs in that direction, as shown in formula (1):
$$g_1=\sum_{i,j}\lvert I(i,j)-I(i,j+1)\rvert,\quad g_2=\sum_{i,j}\lvert I(i,j)-I(i+1,j)\rvert,\quad g_3=\sum_{i,j}\lvert I(i,j)-I(i+1,j+1)\rvert,\quad g_4=\sum_{i,j}\lvert I(i,j+1)-I(i+1,j)\rvert \tag{1}$$
where i and j are pixel coordinates, $I(i,j)$ is the pixel value at row i and column j of the image block, each sum runs over the pixel pairs available in that direction, and $g_1$, $g_2$, $g_3$, and $g_4$ are the gradient features of the image block in the horizontal, vertical, main diagonal, and secondary diagonal directions, respectively. To remove the effect of differing numerical magnitudes among the features on the prediction result, the gradient features are normalized as shown in formula (2):
$$g_k \leftarrow \frac{g_k}{g_1+g_2+g_3+g_4},\quad k=1,2,3,4 \tag{2}$$
In applications such as lossless compression and reversible information hiding, which require complete reconstruction of the original image, the normalized gradient features must be stored as auxiliary information together with the compressed data or the image itself. Because the normalized gradient features satisfy the constraints $0 \le g_k \le 1$ and $g_1+g_2+g_3+g_4=1$, the gradient feature of the secondary diagonal direction, $g_4$, can be derived from the first three, as shown in formula (3):
$$g_4 = 1 - g_1 - g_2 - g_3 \tag{3}$$
Therefore only the three gradient features $g_1$, $g_2$, and $g_3$ need to be stored, each retained to two decimal places, i.e., a precision of 0.01.
Each gradient feature is encoded with 7 bits, so storing the multi-directional gradient features of one image block takes 21 bits.
1-3) Extracting context pixels: For the pixel to be predicted, i.e., the target pixel $x_t$, the 10 neighboring pixels $x_1, x_2, \ldots, x_{10}$ are used to predict $x_t$. These neighboring pixels are called context pixels and lie to the left of or above the target pixel.
1-4) Feature projection and feature fusion: Let the context pixels be $X = (x_1, x_2, \ldots, x_{10})$ and the multi-directional gradient features of the image block be $G = (g_1, g_2, g_3, g_4)$. Fully connected layers T1 and T2 linearly project X and G, respectively, and a ReLU (rectified linear unit) activation function is applied to introduce nonlinearity, as shown in formula (4):
$$\hat{X} = \mathrm{ReLU}(X W_c),\qquad \hat{G} = \mathrm{ReLU}(G W_g) \tag{4}$$
where $\hat{X}$ and $\hat{G}$ are the projected context pixels and multi-directional gradient features, both of dimension 1×4D, with D a preset parameter; $W_c$ is the weight matrix of fully connected layer T1, of dimension 10×4D; and $W_g$ is the weight matrix of fully connected layer T2, of dimension 4×4D. $\hat{X}$ and $\hat{G}$ are added element by element to obtain the fused feature F, as shown in formula (5):
$$F = \hat{X} + \hat{G} \tag{5}$$
where the operator "+" adds the elements at corresponding positions of the two vectors to give a new vector of the same dimension.
1-5) Prediction weight calculation: The prediction weights are computed in parallel by four weight calculation modules (CCBs) that share the same structure but have independent parameters. Each CCB consists of a fully connected layer T3, a ReLU activation function, and a fully connected layer T4 connected in series. It takes the 1×4D fused feature F as input and outputs the prediction weight of the corresponding reference pixel. Let the prediction weight output by the i-th CCB be $c_i$, i = 1, 2, 3, 4. $c_i$ is obtained by passing F through T3, the ReLU activation function, and T4 of that module in turn, as shown in formula (6):
$$c_i = \mathrm{ReLU}(F W_{i,1})\, W_{i,2},\quad i=1,2,3,4 \tag{6}$$
where $W_{i,1}$ is the weight matrix of fully connected layer T3 in the i-th CCB, of dimension 4D×D, and $W_{i,2}$ is the weight matrix of fully connected layer T4, of dimension D×1.
1-6) Pixel prediction: All pixels of the input image block are predicted to obtain the corresponding predicted image block, which is the output of the LNN network and has dimensions (B, 1, n, n). For each pixel of the input image block, the prediction weight vector $C = (c_1, c_2, c_3, c_4)$ is computed. With the reference pixel vector $P = (x_{10}, x_7, x_6, x_8)^T$, the predicted value $\hat{x}_t$ of the target pixel is computed as shown in formula (7):
$$\hat{x}_t = C P = c_1 x_{10} + c_2 x_7 + c_3 x_6 + c_4 x_8 \tag{7}$$
The predicted values of all pixels form the predicted image block of the input image block.
The LNN network is trained iteratively on the BOSSBase dataset, which contains 10,000 grayscale images of 512×512 pixels. The validation set consists of 10 standard test images of the same size and is used to evaluate the model's prediction performance. Training minimizes the difference between the input image block and the predicted image block. The loss function is defined in formula (8), whose terms are the parameter set of the LNN network model, the input image blocks and the corresponding predicted image blocks, the total number N of image blocks used during training, and a weight decay coefficient that suppresses overfitting. The training hyperparameters are: 400 training epochs in total, a batch size of 32, and an initial learning rate of 0.0004.
A dynamic learning rate schedule is used during training: the learning rate is reduced to 50% of its current value at epochs 30, 60, and 160.
2) Grayscale image prediction: After training, the LNN network parameters are fixed and the network is used to predict grayscale images. Prediction follows the same procedure as training except for image preprocessing. The grayscale image is first preprocessed so that its height and width are integer multiples of n, and is then divided into non-overlapping n×n image blocks. Unlike in training, the pixels in the first row and first column of each image block are not predicted, so the padding differs: one row of pixels is added above the block and one column on each of its left and right sides, which expands the block to (n+1)×(n+2). Each preprocessed image block is fed to the LNN network to obtain the corresponding predicted image block, and all predicted image blocks are stitched together at their original positions to form the predicted image of the grayscale image. The predicted value of a target pixel is computed directly from the LNN network parameters, the gradient features of the image block, and the context pixels of the target pixel, without involving a deep learning framework.
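The framework-free prediction of a single pixel described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`init_lnn`, `predict_pixel`), the random initialization, and the ordering of the context list `X` (so that `X[9]` is x10, `X[6]` is x7, and so on) are hypothetical, and trained weights would have to be loaded in place of the random ones.

```python
import random

def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]

def vecmat(x, W):
    """Row vector x (length m) times matrix W (m rows, k columns), no bias."""
    return [sum(x[r] * W[r][c] for r in range(len(x))) for c in range(len(W[0]))]

def init_matrix(rows, cols, rng):
    # Hypothetical initialization; real use would load trained weights.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def init_lnn(D, seed=0):
    """Build the weight matrices with the dimensions given in the text."""
    rng = random.Random(seed)
    return {
        "Wc": init_matrix(10, 4 * D, rng),            # T1: 10 x 4D
        "Wg": init_matrix(4, 4 * D, rng),             # T2: 4 x 4D
        "ccb": [(init_matrix(4 * D, D, rng),          # T3: 4D x D
                 init_matrix(D, 1, rng))              # T4: D x 1
                for _ in range(4)],                   # four independent CCBs
    }

def predict_pixel(params, X, G):
    """X: 10 context pixels (x1..x10); G: 4 normalized gradient features."""
    X_proj = relu(vecmat(X, params["Wc"]))            # projection of X
    G_proj = relu(vecmat(G, params["Wg"]))            # projection of G
    F = [a + b for a, b in zip(X_proj, G_proj)]       # element-wise fusion
    C = [vecmat(relu(vecmat(F, W1)), W2)[0]           # one weight per CCB
         for W1, W2 in params["ccb"]]
    P = [X[9], X[6], X[5], X[7]]                      # (x10, x7, x6, x8)
    return sum(c * p for c, p in zip(C, P)), C

def count_parameters(params):
    """Total number of scalar weights held in params."""
    mats = [params["Wc"], params["Wg"]]
    for W1, W2 in params["ccb"]:
        mats += [W1, W2]
    return sum(len(M) * len(M[0]) for M in mats)
```

Calling `predict_pixel(init_lnn(64), X, G)` returns the predicted value together with the four prediction weights. Whether pixel values are scaled before entering the network is not stated in the text, so the sketch uses them as given.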
[0006] This method uses neural networks to learn and model the complex relationships between image pixels, enabling high-precision image prediction.
Description of the Drawings
[0007] Figure 1 is a schematic diagram of the LNN network configuration in the embodiment; Figure 2 is a schematic diagram of the spatial locations of the target pixel and the context pixels in the embodiment.
Detailed Implementation
[0008] The present invention will be further described below with reference to the accompanying drawings and embodiments, but this is not intended to limit the scope of the invention.
[0009] Example: A grayscale image prediction method based on a lightweight neural network includes the following steps:
1) Training the LNN network: As shown in Figure 1, the LNN network is built from four types of fully connected layers, denoted T1, T2, T3, and T4, combined with activation functions. Using two kinds of information from the input image, context pixels and multi-directional gradient features, the LNN network predicts the target image through an end-to-end pipeline of feature projection, feature fusion, prediction weight calculation, and pixel prediction. Fully connected layers T1 and T2 linearly project the context pixels and the multi-directional gradient features, respectively, so that both have the same dimension after projection. Feature fusion then produces a fused feature F that combines pixel spatial correlation with image texture gradient information. F is fed to the prediction weight calculation module CCB, composed of fully connected layers T3 and T4, which outputs the prediction weights for the reference pixels. Finally, the predicted value of the target pixel is obtained as the weighted sum of the reference pixels under the prediction weights. The fully connected layers of the LNN network have no bias terms. The steps are as follows:
1-1) Image preprocessing: To improve generalization and speed up training convergence, each training image is randomly cropped into n×n image blocks, and random horizontal or vertical flipping is applied for data augmentation. The input tensor of the LNN network has dimensions (B, 1, n, n), where B is the batch size, "1" is the single channel of the grayscale image, and n×n is the size of the input image block.
To handle missing context pixels at the block boundary, the input image block is padded by replication: 2 rows of pixels are added above the block, 2 columns on its left side, and 1 column on its right side. Each padding pixel is copied from the existing pixel at the smallest Euclidean distance. After padding, the n×n image block becomes an (n+2)×(n+3) image block.
1-2) Calculating the multi-directional gradient features of the image block: The features are the block gradients in four directions: horizontal, vertical, main diagonal, and secondary diagonal. The gradient in each direction is the sum of the absolute differences between pixel pairs in that direction:
$$g_1=\sum_{i,j}\lvert I(i,j)-I(i,j+1)\rvert,\quad g_2=\sum_{i,j}\lvert I(i,j)-I(i+1,j)\rvert,\quad g_3=\sum_{i,j}\lvert I(i,j)-I(i+1,j+1)\rvert,\quad g_4=\sum_{i,j}\lvert I(i,j+1)-I(i+1,j)\rvert \tag{1}$$
where i and j are pixel coordinates, $I(i,j)$ is the pixel value at row i and column j of the image block, each sum runs over the pixel pairs available in that direction, and $g_1$, $g_2$, $g_3$, and $g_4$ are the gradient features of the image block in the horizontal, vertical, main diagonal, and secondary diagonal directions, respectively. To remove the effect of differing numerical magnitudes among the features on the prediction result, the gradient features are normalized:
$$g_k \leftarrow \frac{g_k}{g_1+g_2+g_3+g_4},\quad k=1,2,3,4 \tag{2}$$
In applications such as lossless compression and reversible information hiding, which require complete reconstruction of the original image, the normalized gradient features must be stored as auxiliary information together with the compressed data or the image itself. Because the normalized gradient features satisfy the constraints $0 \le g_k \le 1$ and $g_1+g_2+g_3+g_4=1$, the gradient feature of the secondary diagonal direction, $g_4$, can be derived from the first three:
$$g_4 = 1 - g_1 - g_2 - g_3 \tag{3}$$
Therefore only the three gradient features $g_1$, $g_2$, and $g_3$ need to be stored, each retained to two decimal places, i.e., a precision of 0.01.
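The gradient computation, normalization, and storage of three features described in step 1-2) can be sketched as follows. This is an illustration under assumptions rather than the patent's exact procedure: the summation ranges (all pixel pairs inside the block), the handling of a completely flat block, and the function names are hypothetical choices.

```python
def directional_gradients(block):
    """Sum of absolute differences in four directions for a 2-D list of pixels."""
    h, w = len(block), len(block[0])
    g1 = sum(abs(block[i][j] - block[i][j + 1])          # horizontal pairs
             for i in range(h) for j in range(w - 1))
    g2 = sum(abs(block[i][j] - block[i + 1][j])          # vertical pairs
             for i in range(h - 1) for j in range(w))
    g3 = sum(abs(block[i][j] - block[i + 1][j + 1])      # main-diagonal pairs
             for i in range(h - 1) for j in range(w - 1))
    g4 = sum(abs(block[i][j + 1] - block[i + 1][j])      # secondary-diagonal pairs
             for i in range(h - 1) for j in range(w - 1))
    return [g1, g2, g3, g4]

def normalize(gradients):
    """Scale the four gradients so that they sum to 1."""
    total = sum(gradients)
    if total == 0:
        # Assumption: a flat block gets equal shares; the text does not cover this case.
        return [0.25] * 4
    return [g / total for g in gradients]

def stored_features(normalized):
    """Keep only the first three features, rounded to a precision of 0.01."""
    return [round(g, 2) for g in normalized[:3]]

def restore_fourth(stored):
    """Recover the secondary-diagonal feature from the three stored ones."""
    return round(1.0 - sum(stored), 2)

block = [[0, 1],
         [2, 3]]
g = normalize(directional_gradients(block))
kept = stored_features(g)
print(kept, restore_fourth(kept))
```

A feature rounded to two decimals and multiplied by 100 is an integer between 0 and 100, which fits in 7 bits. Note that the fourth value recovered from rounded features can differ slightly from the directly rounded one.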
Each gradient feature is encoded with 7 bits, so storing the multi-directional gradient features of one image block takes 21 bits.
1-3) Extracting context pixels: For the pixel to be predicted, i.e., the target pixel $x_t$, the 10 neighboring pixels $x_1, x_2, \ldots, x_{10}$ are used to predict $x_t$. These neighboring pixels are called context pixels and lie to the left of or above the target pixel, as shown in Figure 2.
1-4) Feature projection and feature fusion: Let the context pixels of the target pixel be $X = (x_1, x_2, \ldots, x_{10})$ and the multi-directional gradient features of the image block be $G = (g_1, g_2, g_3, g_4)$. Fully connected layers T1 and T2 linearly project X and G, respectively, and a ReLU activation function is applied to introduce nonlinearity:
$$\hat{X} = \mathrm{ReLU}(X W_c),\qquad \hat{G} = \mathrm{ReLU}(G W_g) \tag{4}$$
where $\hat{X}$ and $\hat{G}$ are the projected context pixels and multi-directional gradient features, both of dimension 1×4D, with D a preset parameter; $W_c$ is the weight matrix of fully connected layer T1, of dimension 10×4D; and $W_g$ is the weight matrix of fully connected layer T2, of dimension 4×4D. $\hat{X}$ and $\hat{G}$ are added element by element to obtain the fused feature F:
$$F = \hat{X} + \hat{G} \tag{5}$$
where the operator "+" adds the elements at corresponding positions of the two vectors to give a new vector of the same dimension.
1-5) Prediction weight calculation: The prediction weights are computed in parallel by four weight calculation modules (CCBs) that share the same structure but have independent parameters. Each CCB consists of a fully connected layer T3, a ReLU activation function, and a fully connected layer T4 connected in series. It takes the 1×4D fused feature F as input and outputs the prediction weight of the corresponding reference pixel. Let the prediction weight output by the i-th CCB be $c_i$, i = 1, 2, 3, 4. $c_i$ is obtained by passing F through T3, the ReLU activation function, and T4 of that module in turn:
$$c_i = \mathrm{ReLU}(F W_{i,1})\, W_{i,2},\quad i=1,2,3,4 \tag{6}$$
where $W_{i,1}$ is the weight matrix of fully connected layer T3 in the i-th CCB, of dimension 4D×D, and $W_{i,2}$ is the weight matrix of fully connected layer T4, of dimension D×1.
1-6) Pixel prediction: All pixels of the input image block are predicted to obtain the corresponding predicted image block, which is the output of the LNN network and has dimensions (B, 1, n, n). For each pixel of the input image block, the prediction weight vector $C = (c_1, c_2, c_3, c_4)$ is computed. With the reference pixel vector $P = (x_{10}, x_7, x_6, x_8)^T$, the predicted value $\hat{x}_t$ of the target pixel is computed as follows:
$$\hat{x}_t = C P = c_1 x_{10} + c_2 x_7 + c_3 x_6 + c_4 x_8 \tag{7}$$
The predicted values of all pixels form the predicted image block of the input image block.
The LNN network is trained iteratively on the BOSSBase dataset, which contains 10,000 grayscale images of 512×512 pixels. The validation set consists of 10 standard test images of the same size and is used to evaluate the model's prediction performance. Training minimizes the difference between the input image block and the predicted image block. The loss function is defined in formula (8), whose terms are the parameter set of the LNN network model, the input image blocks and the corresponding predicted image blocks, the total number N of image blocks used during training, and a weight decay coefficient that suppresses overfitting. The training hyperparameters are: 400 training epochs in total, a batch size of 32, and an initial learning rate of 0.0004. A dynamic learning rate schedule is used during training: the learning rate is reduced to 50% of its current value at epochs 30, 60, and 160.
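The learning rate schedule stated above amounts to a step decay, which can be written as a small function. The sketch assumes that each halving takes effect from the named epoch onward and that epochs are counted from 0; the text does not specify either convention, and the function name is hypothetical.

```python
def learning_rate(epoch, base_lr=0.0004, milestones=(30, 60, 160), factor=0.5):
    """Learning rate in effect at a given epoch under step decay."""
    # Count how many milestones have been reached (assumption: inclusive).
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops

for epoch in (0, 29, 30, 60, 160, 399):
    print(epoch, learning_rate(epoch))
```

With these settings the learning rate takes four distinct values over the 400 epochs: 0.0004, 0.0002, 0.0001, and 0.00005.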
2) Grayscale image prediction: After training, the LNN network parameters are fixed and the network is used to predict grayscale images. Prediction follows the same procedure as training except for image preprocessing. The grayscale image is first preprocessed so that its height and width are integer multiples of n, and is then divided into non-overlapping n×n image blocks. Unlike in training, the pixels in the first row and first column of each image block are not predicted, so the padding differs: one row of pixels is added above the block and one column on each of its left and right sides, which expands the block to (n+1)×(n+2). Each preprocessed image block is fed to the LNN network to obtain the corresponding predicted image block, and all predicted image blocks are stitched together at their original positions to form the predicted image of the grayscale image. The predicted value of a target pixel is computed directly from the LNN network parameters, the gradient features of the image block, and the context pixels of the target pixel, without involving a deep learning framework.
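The block splitting and the two padding layouts (training and prediction) can be illustrated with a short sketch. It assumes that padding pixels are replicated from inside the block itself and that the image dimensions are already multiples of n; the function names are hypothetical.

```python
def replicate_pad(block, top, left, right):
    """Pad a 2-D list by copying the nearest existing pixel of the block."""
    # Extend each row to the left and right with its edge pixels.
    rows = [[row[0]] * left + list(row) + [row[-1]] * right for row in block]
    # Copy the (already widened) first row upward; corners thus take block[0][0]
    # or block[0][-1], the nearest pixels by Euclidean distance.
    return [list(rows[0]) for _ in range(top)] + rows

def split_blocks(image, n):
    """Split an image into non-overlapping n x n blocks, row by row."""
    h, w = len(image), len(image[0])
    if h % n or w % n:
        raise ValueError("height and width must be integer multiples of n")
    return [(r, c, [row[c:c + n] for row in image[r:r + n]])
            for r in range(0, h, n) for c in range(0, w, n)]

image = [[r * 4 + c for c in range(4)] for r in range(4)]
for r, c, block in split_blocks(image, 2):
    train = replicate_pad(block, top=2, left=2, right=1)   # (n+2) x (n+3)
    infer = replicate_pad(block, top=1, left=1, right=1)   # (n+1) x (n+2)
    print((r, c), len(train), len(train[0]), len(infer), len(infer[0]))
```

The positions `(r, c)` returned with each block are what allow the predicted blocks to be stitched back at their original locations.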
[0010] Experimental comparison with other prediction methods: In this example, the LNN network is trained to determine the network parameters. After training, the LNN prediction method can run in a conventional mode, i.e., it completes the prediction task without a deep learning software framework. The main body of the LNN network contains only a few fully connected layers. When the dimension parameter D of the fully connected layers is 64, the LNN network has only 69,376 model parameters in total, and the uncompressed network model is 276KB in size, a clear advantage in terms of lightness. These characteristics allow the LNN prediction method to be deployed flexibly in resource-constrained scenarios such as terminal devices, embedded systems, and high-resolution cameras. To evaluate the prediction performance of the LNN method in this example, it was compared experimentally with current mainstream prediction methods. The four classic comparison methods are MED (WEINBERGER M, SEROUSSI G, SAPIRO G. The LOCO-I lossless image compression algorithm: principles and standardization into JPEG-LS[J]. IEEE Transactions on Image Processing, 2000, 9(8): 1309-1324.), GAP (WU X, MEMON N. Context-based, adaptive, lossless image coding[J]. IEEE Transactions on Communications, 1997, 45(4): 437-444.), MDGP (ZHANG ... 35596-35609.), and RR (ZHANG ...). The information entropy of the prediction difference is used as the core evaluation index: the lower the information entropy, the more compressible the data and the higher the corresponding prediction accuracy. The experiment used 10 standard grayscale test images of size 512×512, covering smooth images (Airplane, Jetplane) and textured images (Baboon), to test the adaptability of each method across image types.
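The parameter count quoted above follows directly from the layer dimensions given in the method (T1: 10×4D, T2: 4×4D, and four CCBs each with T3: 4D×D and T4: D×1, all without bias terms). A short check, with a hypothetical function name:

```python
def lnn_parameter_count(D):
    """Number of weights in the LNN network for a given dimension parameter D."""
    t1 = 10 * 4 * D          # T1 projects the 10 context pixels to 4D
    t2 = 4 * 4 * D           # T2 projects the 4 gradient features to 4D
    ccb = 4 * D * D + D * 1  # one CCB: T3 (4D x D) followed by T4 (D x 1)
    return t1 + t2 + 4 * ccb # four independent CCBs

print(lnn_parameter_count(64))  # 69376
```

Assuming 4 bytes per parameter (32-bit floats, which the text does not state), this corresponds to 277,504 bytes, close to the 276KB figure quoted; the exact size depends on the storage format.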
The information entropy results of each method on the test images are shown in Table 1.
Table 1. Information entropy of each prediction method on the test images.
Looking at individual images, the LNN prediction method in this example achieved the lowest information entropy on 6 of the 10 test images, indicating that its prediction accuracy is better than that of the comparison methods in most scenarios. The comparison methods achieve the best result only in isolated cases: the RR prediction method has the lowest entropy only on the Barbara image, and the MDGP prediction method only on the Lake and Peppers images. Even where the LNN prediction method does not achieve the best value, the gap between its information entropy and the best value is very small, which shows stable performance. In terms of overall performance, the average information entropy of the LNN prediction method is 4.552, lower than that of all four comparison methods (MED: 4.776, GAP: 4.749, MDGP: 4.672, RR: 4.592). It also performs consistently on both smooth and textured images, indicating that the LNN method in this example adapts well to different scenes and that its overall prediction performance is better than that of the current mainstream MED, GAP, MDGP, and RR prediction methods.
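The evaluation index used above, the information entropy of the prediction difference, can be computed as the first-order Shannon entropy of the residuals. The sketch below assumes that predictions are rounded to integers before subtraction and that entropy is measured in bits per pixel; neither detail is stated in the text, and the function name is hypothetical.

```python
import math
from collections import Counter

def residual_entropy(original, predicted):
    """First-order entropy (bits per pixel) of original minus rounded prediction."""
    # Assumption: predictions are rounded with Python's built-in round().
    residuals = [o - round(p) for o, p in zip(original, predicted)]
    counts = Counter(residuals)
    total = len(residuals)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

pixels = [10, 11, 10, 11]
print(residual_entropy(pixels, [10.0, 10.0, 10.0, 10.0]))  # 1.0
print(residual_entropy(pixels, [10.0, 11.0, 10.0, 11.0]))  # perfect prediction, zero entropy
```

In the first call the residuals are 0, 1, 0, 1, two equally likely values, which gives exactly 1 bit per pixel; a perfect prediction leaves a single residual value and zero entropy.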
Claims
1. A grayscale image prediction method based on a lightweight neural network, characterized in that it includes the following steps:
1) Training the LNN network: The LNN network is built from four types of fully connected layers, denoted T1, T2, T3, and T4, combined with activation functions. Using two kinds of information from the input image, context pixels and multi-directional gradient features, the LNN network predicts the target image through an end-to-end pipeline of feature projection, feature fusion, prediction weight calculation, and pixel prediction. Fully connected layers T1 and T2 linearly project the context pixels and the multi-directional gradient features, respectively, so that both have the same dimension after projection. Feature fusion then produces a fused feature F that combines pixel spatial correlation with image texture gradient information. F is fed to the prediction weight calculation module CCB, composed of fully connected layers T3 and T4, which outputs the prediction weights for the reference pixels. Finally, the predicted value of the target pixel is obtained as the weighted sum of the reference pixels under the prediction weights. The fully connected layers of the LNN network have no bias terms. The steps are as follows:
1-1) Image preprocessing: During the training phase, the image is randomly cropped into n×n image blocks, and random horizontal or vertical flipping is applied for data augmentation. The input tensor of the LNN network has dimensions (B, 1, n, n), where B is the batch size, "1" is the single channel of the grayscale image, and n×n is the size of the input image block.
The input image block is padded by replication: 2 rows of pixels are added above the block, 2 columns on its left side, and 1 column on its right side. Each padding pixel is copied from the existing pixel at the smallest Euclidean distance. After padding, the n×n image block becomes an (n+2)×(n+3) image block.
1-2) Calculating the multi-directional gradient features of the image block: The features are the block gradients in four directions: horizontal, vertical, main diagonal, and secondary diagonal. The gradient in each direction is the sum of the absolute differences between pixel pairs in that direction, as shown in formula (1):
$$g_1=\sum_{i,j}\lvert I(i,j)-I(i,j+1)\rvert,\quad g_2=\sum_{i,j}\lvert I(i,j)-I(i+1,j)\rvert,\quad g_3=\sum_{i,j}\lvert I(i,j)-I(i+1,j+1)\rvert,\quad g_4=\sum_{i,j}\lvert I(i,j+1)-I(i+1,j)\rvert \tag{1}$$
where i and j are pixel coordinates, $I(i,j)$ is the pixel value at row i and column j of the image block, each sum runs over the pixel pairs available in that direction, and $g_1$, $g_2$, $g_3$, and $g_4$ are the gradient features of the image block in the horizontal, vertical, main diagonal, and secondary diagonal directions, respectively. The gradient features are normalized as shown in formula (2):
$$g_k \leftarrow \frac{g_k}{g_1+g_2+g_3+g_4},\quad k=1,2,3,4 \tag{2}$$
In applications such as lossless compression and reversible information hiding, which require complete reconstruction of the original image, the normalized gradient features are stored as auxiliary information together with the compressed data or the image itself. Because the normalized gradient features satisfy the constraints $0 \le g_k \le 1$ and $g_1+g_2+g_3+g_4=1$, the gradient feature of the secondary diagonal direction, $g_4$, is derived from the first three, as shown in formula (3):
$$g_4 = 1 - g_1 - g_2 - g_3 \tag{3}$$
Therefore only the three gradient features $g_1$, $g_2$, and $g_3$ need to be stored, each retained to two decimal places, i.e., a precision of 0.01. Each gradient feature is encoded with 7 bits, so storing the multi-directional gradient features of one image block takes 21 bits.
1-3) Extracting context pixels: For the pixel to be predicted, i.e., the target pixel $x_t$, the 10 neighboring pixels $x_1, x_2, \ldots, x_{10}$ are used to predict $x_t$. These neighboring pixels are called context pixels and lie to the left of or above the target pixel.
1-4) Feature projection and feature fusion: Let the context pixels be $X = (x_1, x_2, \ldots, x_{10})$ and the multi-directional gradient features of the image block be $G = (g_1, g_2, g_3, g_4)$. Fully connected layers T1 and T2 linearly project X and G, respectively, and a ReLU activation function is applied to introduce nonlinearity, as shown in formula (4):
$$\hat{X} = \mathrm{ReLU}(X W_c),\qquad \hat{G} = \mathrm{ReLU}(G W_g) \tag{4}$$
where $\hat{X}$ and $\hat{G}$ are the projected context pixels and multi-directional gradient features, both of dimension 1×4D, with D a preset parameter; $W_c$ is the weight matrix of fully connected layer T1, of dimension 10×4D; and $W_g$ is the weight matrix of fully connected layer T2, of dimension 4×4D. $\hat{X}$ and $\hat{G}$ are added element by element to obtain the fused feature F, as shown in formula (5):
$$F = \hat{X} + \hat{G} \tag{5}$$
where the operator "+" adds the elements at corresponding positions of the two vectors to give a new vector of the same dimension.
1-5) Prediction weight calculation: The prediction weights are computed in parallel by four weight calculation modules (CCBs) that share the same structure but have independent parameters. Each CCB consists of a fully connected layer T3, a ReLU activation function, and a fully connected layer T4 connected in series. It takes the 1×4D fused feature F as input and outputs the prediction weight of the corresponding reference pixel. Let the prediction weight output by the i-th CCB be $c_i$, i = 1, 2, 3, 4. $c_i$ is obtained by passing F through T3, the ReLU activation function, and T4 of that module in turn, as shown in formula (6):
$$c_i = \mathrm{ReLU}(F W_{i,1})\, W_{i,2},\quad i=1,2,3,4 \tag{6}$$
where $W_{i,1}$ is the weight matrix of fully connected layer T3 in the i-th CCB, of dimension 4D×D, and $W_{i,2}$ is the weight matrix of fully connected layer T4, of dimension D×1.
1-6) Pixel prediction: All pixels of the input image block are predicted to obtain the corresponding predicted image block, which is the output of the LNN network and has dimensions (B, 1, n, n). For each pixel of the input image block, the prediction weight vector $C = (c_1, c_2, c_3, c_4)$ is computed. With the reference pixel vector $P = (x_{10}, x_7, x_6, x_8)^T$, the predicted value $\hat{x}_t$ of the target pixel is computed as shown in formula (7):
$$\hat{x}_t = C P = c_1 x_{10} + c_2 x_7 + c_3 x_6 + c_4 x_8 \tag{7}$$
The predicted values of all pixels form the predicted image block of the input image block.
The LNN network is trained iteratively on the BOSSBase dataset, which contains 10,000 grayscale images of 512×512 pixels. The validation set consists of 10 standard test images of the same size and is used to evaluate the model's prediction performance. Training minimizes the difference between the input image block and the predicted image block. The loss function is defined in formula (8), whose terms are the parameter set of the LNN network model, the input image blocks and the corresponding predicted image blocks, the total number N of image blocks used during training, and a weight decay coefficient. The training hyperparameters are: 400 training epochs in total, a batch size of 32, and an initial learning rate of 0.0004. A dynamic learning rate schedule is used during training: the learning rate is reduced to 50% of its current value at epochs 30, 60, and 160.
2) Grayscale image prediction: After training, the LNN network parameters are fixed and the network is used to predict grayscale images. The grayscale image is preprocessed so that its height and width are integer multiples of n, and is then divided into non-overlapping n×n image blocks. Unlike in training, the pixels in the first row and first column of each image block are not predicted, and the padding differs: one row of pixels is added above the block and one column on each of its left and right sides, which expands the block to (n+1)×(n+2). Each preprocessed image block is fed to the LNN network to obtain the corresponding predicted image block, and all predicted image blocks are stitched together at their original positions to form the predicted image of the grayscale image. The predicted value of a target pixel is computed directly from the LNN network parameters, the gradient features of the image block, and the context pixels of the target pixel, without involving a deep learning framework.